Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I need to continuously get the data on next button <1 2 3 ... 5> but there's no provided href link in the source also there's also elipsis. any idea please? here's my code

def start_requests(self):
    urls = (
        (self.parse_2, 'https://www.forever21.com/us/shop/catalog/category/f21/sale'),
    )
    for cb, url in urls:
        yield scrapy.Request(url, callback=cb)


def parse_2(self, response):
    for product_item_forever in response.css('div.pi_container'):
        forever_item = {
            'forever-title': product_item_forever.css('p.p_name::text').extract_first(),
            'forever-regular-price': product_item_forever.css('span.p_old_price::text').extract_first(),
            'forever-sale-price': product_item_forever.css('span.p_sale.t_pink::text').extract_first(),
            'forever-photo-url': product_item_forever.css('img::attr(data-original)').extract_first(),
            'forever-description-url': product_item_forever.css('a.item_slider.product_link::attr(href)').extract_first(),
        }
        yield forever_item

Please help me thank you

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
168 views
Welcome To Ask or Share your Answers For Others

1 Answer

It seems this pagination uses additional request to API. So, there are two ways:

  1. Use Splash/Selenium to render pages by pattern of QHarr;
  2. Make same calls to API. Check developer tools, you will find POST-request https://www.forever21.com/us/shop/Catalog/GetProducts will all proper params (they are too long, so I will not post full list here).

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...