Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I'm parsing html of a website with JSoup. I want to parse this part:

<td class="lastpost">
This is a text 1<br>
<a href="post/13594">Website Page - 1</a>
</td>

I want like this:

String text = "This is a text 1";
String textNo = "Website Page - 1";
String link = "post/13594";

How can I get the parts like this?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
195 views
Welcome To Ask or Share your Answers For Others

1 Answer

Your code would only get all the text that is in the td elements that you are selecting. If you want to store the text in separate variables, you should grab the parts separately like the following code. Extra comments added so you can understand how/why it is getting each piece.

// Get the first td element that has class="lastpost"
Element lastPost = document.select("td.lastpost").first();
// Get the first a element that is a child of the td
Element linkElement = lastPost.getElementsByTag("a").first();

// This text is the first child node of td, get that node and call toString
String text = lastPost.childNode(0).toString();
// This is the text within the a (link) element
String textNo = linkElement.text();
// This text is the href attribute value of the a (link) element
String link = linkElement.attr("href");

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...