How can I remove all HTML from a string in Python? For example, how can I turn:
blah blah <a href="blah">link</a>
into
blah blah link
Thanks!
When your regular expression solution hits a wall, try this super easy (and reliable) BeautifulSoup program.
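# BeautifulSoup 4 (pip install beautifulsoup4); the old BeautifulSoup 3
# import was "from BeautifulSoup import BeautifulSoup"
from bs4 import BeautifulSoup
html = "<a> Keep me </a>"
soup = BeautifulSoup(html, "html.parser")
# Collect every text node, leaving the surrounding tags behind
text_parts = soup.find_all(string=True)
text = ''.join(text_parts)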
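
If installing BeautifulSoup is not an option, roughly the same idea can be sketched with the standard library's html.parser module instead. This is a minimal sketch rather than a drop-in replacement: TextExtractor and strip_tags are hypothetical names made up for illustration, and the parser keeps all text content, including anything inside script or style elements.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text between tags and ignores the tags themselves."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain text
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for each run of text found outside of a tag
        self.parts.append(data)


def strip_tags(markup):
    # strip_tags is a hypothetical helper name, not part of any library
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return ''.join(parser.parts)


# The example from the question
print(strip_tags('blah blah <a href="blah">link</a>'))  # prints: blah blah link
```

Like the BeautifulSoup version, this joins the text nodes as they are, so whitespace inside the tags is preserved.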