Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I'm trying to scrape data from a trading website. I started with the Python 'requests' library, but the HTML it returned was different from the page shown in my browser.

I noticed that the web page loads the missing information after a short delay. From what I read, this can be handled with the 'requests-html' package. However, 'requests-html' returns the same HTML as 'requests'.

I am aware that this can be solved with Selenium, but is there a way to do it using the libraries mentioned above?

This is my code:

from bs4 import BeautifulSoup
import requests
import time
from requests_html import HTMLSession

with HTMLSession() as s:
    login_url = 'https://www.screener.in/login/'
    USERNAME = "username"
    PASSWORD = "password"

    s.get(login_url)
    csrftoken = s.cookies['csrftoken']

    login_data = dict(csrfmiddlewaretoken=csrftoken, next='', username=USERNAME, password=PASSWORD)
    s.post(login_url, data=login_data, headers={"Referer": "https://www.screener.in/"})

    r = s.get('https://www.screener.in/company/ABBOTINDIA/')
    r.html.render(timeout=10, sleep=10)
    print(r.html.html)

Where am I going wrong? Is something wrong with the headers?

I am new to web scraping and would really appreciate the help.

question from:https://stackoverflow.com/questions/65626696/python-web-scraping-using-requests-html-not-working


1 Answer

csrftoken and csrfmiddlewaretoken are not the same.

csrfmiddlewaretoken needs to be sent in the POST request's form data, while csrftoken needs to be sent as a cookie.

They also have (for me at least) different values.
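
As a rough sketch of what that could look like: instead of copying the csrftoken cookie into the form, read the form token from the login page itself. This assumes the login page contains a hidden `<input name="csrfmiddlewaretoken">` field (the usual Django layout, not verified for this site). The helper names `extract_form_token` and `login` are made up for illustration, and the Referer value is a guess.

```python
from html.parser import HTMLParser


class _CsrfInputFinder(HTMLParser):
    """Remembers the value of the first <input name="csrfmiddlewaretoken">."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("name") == "csrfmiddlewaretoken":
            if self.token is None:
                self.token = attrs.get("value")


def extract_form_token(html):
    """Return the hidden form token from the page HTML, or None if absent."""
    finder = _CsrfInputFinder()
    finder.feed(html)
    return finder.token


def login(session, login_url, username, password):
    """Log in with a requests-compatible session (e.g. HTMLSession).

    Hypothetical helper: assumes the form fields match the question's code.
    """
    page = session.get(login_url)  # this GET also sets the csrftoken cookie
    token = extract_form_token(page.text)
    if token is None:
        raise ValueError("no csrfmiddlewaretoken input found on the login page")
    form = {
        "csrfmiddlewaretoken": token,  # form token, not the cookie value
        "next": "",
        "username": username,
        "password": password,
    }
    # The session sends the csrftoken cookie automatically with this POST.
    return session.post(login_url, data=form, headers={"Referer": login_url})
```

With the code from the question, this would be called as `login(s, login_url, USERNAME, PASSWORD)` inside the `with HTMLSession() as s:` block, before requesting the company page.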

