Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

How can I scrape (get) the data from a website?

Example: I have a site, say www.getfinancialdata.com.

Now I want to grab the data by running a script/URL from my system against this website, then sort the data and save it in a spreadsheet.

I have done this for a simple website where I can see the HTML content in the body of the page (when I view the source). But my problem here is a bit more complex: when I view the source of this site, there is no simple HTML content. The data lives in the DOM and is populated by jQuery functions. How can I grab data that jQuery populates into the DOM?

Fight the dragon too long and you become the dragon yourself; gaze into the abyss too long and the abyss will gaze back…

1 Answer

I've had success using Selenium to scrape sites that rely heavily on JavaScript. If it shows up in a browser, you can get it with Selenium. Selenium itself is Java, but there are bindings to drive it from your favorite scripting language; I use Python.
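As a rough sketch of what this can look like with the Python bindings: the code below opens the page in a real browser, waits until the page's scripts have inserted at least one row, reads the cell text, then sorts the rows and writes a CSV file that a spreadsheet can open. The selector `table#quotes tbody tr`, the column names, and the output file name are hypothetical; the real ones depend on the markup of the target site. It assumes the `selenium` package and Firefox are installed.

```python
import csv


def sort_rows(rows, key_index=0):
    """Return the rows sorted by one column (plain string comparison)."""
    return sorted(rows, key=lambda row: row[key_index])


def save_csv(rows, header, path):
    """Write a header line and the rows to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def scrape_rows(url, row_selector, timeout=15):
    """Load url in Firefox and return the cell text of every matching row."""
    # Imported here so the two helpers above work without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        # Wait until JavaScript has inserted at least one matching row.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, row_selector))
        )
        rows = []
        for tr in driver.find_elements(By.CSS_SELECTOR, row_selector):
            cells = tr.find_elements(By.CSS_SELECTOR, "td")
            rows.append([cell.text.strip() for cell in cells])
        return rows
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()


# Example call (hypothetical selector and column names, adjust to the real page):
# data = scrape_rows("http://www.getfinancialdata.com", "table#quotes tbody tr")
# save_csv(sort_rows(data), ["symbol", "price"], "financial_data.csv")
```

The explicit wait is the part that matters for jQuery-driven pages: reading the DOM immediately after `driver.get` may return nothing, because the data has not been inserted yet. Note that `sort_rows` compares strings, so numeric columns would need converting first.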

You may also want to look into headless browsers like Crowbar and PhantomJS. What I like about Selenium is that being able to watch it drive the browser helps with debugging. There is also a Firefox plugin (Selenium IDE) that can generate some basic code to get you started: you just click along and it records what you've done. That code will always need heavy editing, but it's helpful while you're learning how to do this.
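For what it's worth, Selenium can also run the browser without a visible window, which gives some of the benefit of a headless browser while keeping the option to watch it when debugging. This is a substitute for Crowbar or PhantomJS, not an example of either. A minimal sketch, assuming the `selenium` package and Firefox are installed; the helper names `firefox_arguments` and `make_driver` are made up for illustration:

```python
from typing import List


def firefox_arguments(headless: bool = True) -> List[str]:
    """Command-line flags for Firefox; '-headless' suppresses the window."""
    return ["-headless"] if headless else []


def make_driver(headless: bool = True):
    """Start Firefox through Selenium, optionally without a visible window."""
    # Imported here so firefox_arguments() works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    for arg in firefox_arguments(headless):
        options.add_argument(arg)
    return webdriver.Firefox(options=options)


# While debugging, make_driver(headless=False) shows the browser window;
# for unattended runs, make_driver() keeps it hidden.
```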

Note that this is a surprisingly hard thing to do, especially on a large scale. Websites are messy, they differ from one another, and they change over time. This makes scraping either infuriating or a fun challenge, depending on your attitude.


...