python - HTTPS preventing website scraping in Python3

Question

asked Jan 31, 2022 in Technique by 深蓝 (71.8m points)

I am trying to scrape a website with Python code from a tutorial. However, the website has since been secured with HTTPS, and the code below now fails with an error when I run it.

# -*- coding: utf-8 -*-
# import libraries
import urllib.request as urllib2
from bs4 import BeautifulSoup

# specify the url
quote_page = 'https://www.bloomberg.com/quote/SPX:IND'

# query the website and return the html to the variable `page`
page = urllib2.urlopen(quote_page)

# parse the html using Beautiful Soup and store it in the variable `soup`
soup = BeautifulSoup(page, 'html.parser')

# find the <h1> that holds the name and get its text
name_box = soup.find('h1', attrs={'class': 'companyName'})

name = name_box.text.strip()  # strip() removes leading and trailing whitespace
print(name)

# get the index price
price_box = soup.find('div', attrs={'class': 'price__c3a38e1d'})
price = price_box.text
print(price)


1 Answer

深蓝 · Answer 1 · 2022-01-31T07:13:32+0000

Try adding this to your code before the call to urlopen. It turns off SSL certificate verification for HTTPS requests made through urllib.

import ssl
ssl._create_default_https_context = ssl._create_unverified_context
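
If changing the default for the whole process is broader than you need, the same thing can probably be scoped to a single request by passing an SSL context to urlopen. This is only a minimal sketch: make_context and fetch are hypothetical helper names, not part of the original code, and it assumes the failure really is a certificate verification error.

```python
import ssl
import urllib.request


def make_context(verify=True):
    """Build an SSLContext; verify=False switches certificate checks off."""
    context = ssl.create_default_context()
    if not verify:
        # check_hostname has to be disabled before verify_mode can be CERT_NONE
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


def fetch(url, verify=True, timeout=10):
    """Fetch url with the chosen context and return the body as bytes."""
    context = make_context(verify)
    with urllib.request.urlopen(url, context=context, timeout=timeout) as response:
        return response.read()


# Example call (needs network access), using the URL from the question:
# page = fetch('https://www.bloomberg.com/quote/SPX:IND', verify=False)
```

The bytes returned by fetch can be passed to BeautifulSoup in place of the page object, and other HTTPS requests in the same program keep normal verification.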
