I'm trying to use the HtmlAgilityPack to pull all of the links from a page that are contained within a div declared as <div class='content'> However, when I use the code below I simply get ALL links on the entire page. This doesn't really make sense to me since I am calling SelectNodes from the sub-node I selected earlier (which when viewed in the debugger only shows the HTML from that specific div). So, it's like it's going back to the very root node every time I call SelectNodes. The code I use is below:

HtmlWeb hw = new HtmlWeb();
HtmlDocument doc = hw.Load(@"http://example.com");
HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[@class='content']");
foreach(HtmlNode link in node.SelectNodes("//a[@href]"))
{
    Console.WriteLine(link.Value);
}

Is this the expected behavior? And if so, how do I get it to do what I'm expecting?

1 Answer

Yes, this is the expected behavior: an XPath expression that begins with // is evaluated from the document root, even when you call SelectNodes on a sub-node. Make the expression relative to the context node instead, and this will work:

node.SelectNodes("a[@href]")

Also, you can do it in a single selector:

doc.DocumentNode.SelectSingleNode("//div[@class='content']//a[@href]")

Note also that link.Value isn't defined on HtmlNode, so the code as posted won't compile; read the href attribute (for example with GetAttributeValue) or use InnerText for the link text.
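
For completeness, a minimal corrected sketch of the loop (assuming the same example.com URL from the question; GetAttributeValue and InnerText are the standard HtmlAgilityPack members for reading an attribute value and the link text):

HtmlWeb hw = new HtmlWeb();
HtmlDocument doc = hw.Load(@"http://example.com");

// Select the content div first; SelectSingleNode returns null if it isn't found.
HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[@class='content']");
if (node == null) return;

// "a[@href]" is relative to node (use ".//a[@href]" to also include links nested deeper in the div).
// SelectNodes returns null when nothing matches, so guard against that too.
HtmlNodeCollection links = node.SelectNodes("a[@href]");
if (links == null) return;

foreach (HtmlNode link in links)
{
    string href = link.GetAttributeValue("href", string.Empty);
    Console.WriteLine(href + " -> " + link.InnerText);
}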

