I'm parsing an HTML fragment with the jsoup HTML parser, without knowing in advance that the input is a fragment. For example:
String html = "<script>document.location = \"http://example.com/\";</script>";
Document document = Jsoup.parse(html);
System.out.println(document.html());
Output:
<html>
<head>
<script>document.location = "http://example.com/";</script>
</head>
<body></body>
</html>
Question: Is there a way to know that the <html>, <head> and <body> tags were added by jsoup and were not in the original HTML fragment?
Update:
I also tried enabling error tracking:
Parser parser = Parser.htmlParser();
parser.setTrackErrors(500);
Document document = parser.parseInput(html, "example.com");
ParseErrorList errors = parser.getErrors();
But I get an empty list of errors.
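For illustration, here is a minimal sketch of a fallback heuristic that does not rely on jsoup at all: it scans the raw input string for literal <html>, <head> and <body> start tags before parsing. The class name FragmentCheck and the method structuralTagsInSource are hypothetical names made up for this example, and the regular expression is an assumption about what counts as a start tag. It can be fooled by tags that appear inside comments or script strings, so it is a rough check rather than a reliable answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class FragmentCheck {

    // Hypothetical helper (not part of jsoup): reports, for each structural tag,
    // whether a literal start tag for it occurs anywhere in the raw input string.
    static Map<String, Boolean> structuralTagsInSource(String html) {
        Map<String, Boolean> found = new LinkedHashMap<>();
        for (String tag : new String[] {"html", "head", "body"}) {
            // Matches "<head>" or "<head attr=...>", but not "<header>".
            Pattern startTag = Pattern.compile("<" + tag + "(\\s[^>]*)?>", Pattern.CASE_INSENSITIVE);
            found.put(tag, startTag.matcher(html).find());
        }
        return found;
    }

    public static void main(String[] args) {
        String html = "<script>document.location = \"http://example.com/\";</script>";
        // Any tag reported as false here is absent from the source text, so if it
        // shows up in the parsed document it must have been added during parsing.
        System.out.println(structuralTagsInSource(html));
    }
}
```

With the fragment from the example above, all three entries come back false, which would suggest that the <html>, <head> and <body> elements in the output were generated by the parser.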