Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

I am trying to get the HTML source of a web URL as a string using CEFPython. I want to crawl the MainFrame's source content and get the string in

def save_screenshot(browser):    
    # Browser object provides GetUserData/SetUserData methods
    # for storing custom data associated with browser. The
    # "OnPaint.buffer_string" data is set in RenderHandler.OnPaint.
    buffer_string = browser.GetUserData("OnPaint.buffer_string")
    if not buffer_string:
        raise Exception("buffer_string is empty, OnPaint never called?")
    mainFrame = browser.GetMainFrame()
    print("Main frame is ", mainFrame)
    # print("buffer string" ,buffer_string)

    # visitor object
    visitorObj = cef_string()
    temp = mainFrame.GetSource(visitorObj).GetString()
    print("temp : ", temp)

    visitorText = mainFrame.GetText(temp)
    siteHTML = mainFrame.GetSource(visitorText)
    print("siteHTML is ", siteHTML)

Problem: The code is returning nothing for siteHTML


1 Answer

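mainFrame.GetSource(visitor) is asynchronous: the call returns right away, and the source is delivered later to the visitor's Visit() method. Therefore you cannot call GetString() on its return value.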

Here is how to do it. Unfortunately, you need to think asynchronously:

class Visitor(object):
    def Visit(self, value):
        # Called later by CEF with the frame's HTML source.
        print("This is the HTML source:")
        print(value)

myvisitor = Visitor()
mainFrame = browser.GetMainFrame()
mainFrame.GetSource(myvisitor)

One more thing to beware of: the visitor object myvisitor in the above example is passed to GetSource() as a weak reference. In other words, you must keep that object alive until the source is passed back. If you put the last three lines of the above snippet in a function, you have to make sure the function does not return until the job is done, or keep a reference to the visitor somewhere that outlives the function.
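
To make the weak-reference pitfall concrete, here is a small simulation that uses only the standard library. It is a sketch, not cefpython itself: FakeFrame and its deliver() method are hypothetical stand-ins I made up to mimic a frame that holds the visitor weakly and calls Visit() later, and SourceCollector is an illustrative visitor that stores the source instead of printing it. In a real application the callback would be driven by the CEF message loop rather than by deliver().

```python
import gc
import weakref


class SourceCollector(object):
    """Visitor that stores the HTML source instead of printing it."""

    def __init__(self):
        self.html = None

    def Visit(self, value):
        # Invoked later, once the source is available.
        self.html = value


class FakeFrame(object):
    """Hypothetical stand-in for a frame; NOT part of cefpython."""

    def __init__(self, source):
        self._source = source
        self._pending = []

    def GetSource(self, visitor):
        # Keep only a weak reference and return nothing,
        # mirroring the behaviour described in the answer.
        self._pending.append(weakref.ref(visitor))

    def deliver(self):
        # Simulates the message loop handing results back later.
        delivered = 0
        for ref in self._pending:
            visitor = ref()
            if visitor is not None:  # visitor is still alive
                visitor.Visit(self._source)
                delivered += 1
        self._pending = []
        return delivered


frame = FakeFrame("<html><body>hello</body></html>")

# Case 1: a strong reference is kept, so Visit() is called.
collector = SourceCollector()
frame.GetSource(collector)
kept_count = frame.deliver()

# Case 2: the visitor is created inline and nothing else refers to it,
# so it is gone before the result arrives and Visit() never runs.
frame.GetSource(SourceCollector())
gc.collect()
lost_count = frame.deliver()

print(collector.html)
print(kept_count, lost_count)
```

With cefpython, the same idea would mean storing the visitor somewhere long-lived (for example as an attribute on an object that lives as long as the browser) and reading its html attribute only after Visit() has been called.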

