Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

I need to scrape the price of this page: https://www.asos.com/monki/monki-lisa-cropped-vest-top-with-ruched-side-in-black/prd/23590636?colourwayid=60495910&cid=2623

However, it always returns null.

My code:

'price': response.xpath('//*[contains(@class, "current-price")]').get()

Can someone help please?

Thanks!

When extracted using XHR: [screenshot]

How can I retrieve the price?

He who fights dragons for too long becomes a dragon himself; gaze too long into the abyss, and the abyss will gaze back…

1 Answer

Your problem is not the XPath; it's that the price is loaded separately via XHR, so it is not in the HTML that Scrapy downloads.

If you open scrapy shell and type view(response), you can see that the price is not rendered: [screenshot]

Look at the source of the original webpage and search for the price; the URL of the price API is assigned to window.asos.pdp.config.stockPriceApiUrl: [screenshot]

Then use this URL to scrape the price:

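    def parse(self, response):
        import re
        # The price API path is assigned in an inline script on the product page.
        # The pattern is double-quoted so the single quotes inside it are literal.
        match = re.search(r"window\.asos\.pdp\.config\.stockPriceApiUrl = '([^']+)'", response.text)
        if match is None:
            return  # assignment not found; the page layout may have changed
        price_url = 'https://www.asos.com' + match.group(1)
        yield scrapy.Request(url=price_url,
                             method='GET',
                             callback=self.parse_price,
                             headers=self.headers)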

    def parse_price(self, response):
        import json
        jsonresponse = json.loads(response.text)
        ...............
        ...............
        ...............
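
The URL-extraction step in parse can be checked offline before wiring it into a spider. The sketch below is only an illustration: the SAMPLE_HTML snippet and the API path inside it are made up for the example (the real page's inline script may differ), and extract_price_url is a hypothetical helper name. It joins the path with urljoin rather than string concatenation and returns None when the assignment is missing instead of raising.

```python
import re
from urllib.parse import urljoin

# Matches: window.asos.pdp.config.stockPriceApiUrl = '<path>'
# Dots are escaped so they match literally; [^']+ stops at the closing quote.
PRICE_URL_RE = re.compile(
    r"window\.asos\.pdp\.config\.stockPriceApiUrl\s*=\s*'([^']+)'"
)

# Made-up stand-in for the page source; the path below is NOT the real API path.
SAMPLE_HTML = """
<script>
window.asos.pdp.config.stockPriceApiUrl = '/api/example/stockprice?productIds=123&store=COM';
</script>
"""


def extract_price_url(html, base="https://www.asos.com"):
    """Return the absolute price-API URL found in html, or None if absent."""
    match = PRICE_URL_RE.search(html)
    if match is None:
        return None
    return urljoin(base, match.group(1))


print(extract_price_url(SAMPLE_HTML))
print(extract_price_url("<html>no inline config here</html>"))
```

Inside the spider, the same helper could be called as extract_price_url(response.text), skipping the request when it returns None.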

I couldn't get around the 403 error with the headers I provided, but maybe you'll have more luck.

Edit:

To get the price from the JSON response, there's actually no need for json.loads; response.json() (available since Scrapy 2.2) parses the body directly:

    def parse_price(self, response):
        jsonresponse = response.json()[0]
        price = jsonresponse['productPrice']['current']['text']
        # You can also use jsonresponse.get() if you prefer
        print(price)

Output:

£10.00
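
The lookup above assumes every key is present and raises if one is not. A slightly more defensive version is sketched below using only the standard library. SAMPLE_BODY is a hand-written stand-in that mirrors just the keys used in the answer (productPrice, current, text), and extract_price_text is a hypothetical helper, not part of Scrapy.

```python
import json

# Hand-written stand-in for the API response; only the keys used above are modelled.
SAMPLE_BODY = '[{"productPrice": {"current": {"text": "£10.00"}}}]'


def extract_price_text(body):
    """Return the display price of the first product, or None if it is missing."""
    data = json.loads(body)
    # The answer indexes [0], so the payload is expected to be a non-empty list.
    if not isinstance(data, list) or not data:
        return None
    product = data[0]
    if not isinstance(product, dict):
        return None
    # Fall back to empty dicts so a missing level yields None instead of KeyError.
    current = (product.get("productPrice") or {}).get("current") or {}
    return current.get("text")


price = extract_price_text(SAMPLE_BODY)
```

In the spider, parse_price could call extract_price_text(response.text) and skip the item when the result is None.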
