Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

I want to get the href value:

<span class="title">
  <a href="https://www.example.com"></a>
</span>

I tried this:

Link = Link1.css('span[class=title] a::text').extract()[0]

But that only returns the text inside the <a> element. How can I get the URL stored in its href attribute?

Question from: https://stackoverflow.com/questions/21181628/get-href-using-css-selector-with-scrapy

He who fights dragons for too long becomes a dragon himself; gaze into the abyss for too long, and the abyss will gaze back…

1 Answer

What you're looking for is:

Link = Link1.css('span[class=title] a::attr(href)').extract()[0]

Since you're matching on the span's class attribute, you can also use the class selector shorthand, which also matches when the element carries additional classes:

Link = Link1.css('span.title a::attr(href)').extract()[0]

Please note that the ::text pseudo-element and the ::attr(attributename) functional pseudo-element are NOT standard CSS3 selectors. They are Scrapy-specific extensions to CSS selectors, available as of Scrapy 0.20.


Edit (2017-07-20): starting from Scrapy 1.0, you can use .extract_first() instead of .extract()[0]. It returns None when nothing matches, rather than raising an IndexError:

Link = Link1.css('span[class=title] a::attr(href)').extract_first()
Link = Link1.css('span.title a::attr(href)').extract_first()
