Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I'm trying to parse some HTML with DOM in PHP, but I'm having some problems. First, in case this changes the solution: the HTML that I have is not a full page, only a fragment of one.

<!-- This is the HTML that I have --><a href='/games/'>
<div id='game'>
<img src='http://images.example.com/games.gif' width='300' height='137' border='0'>
<br><b> Game </b>
</div>
<div id='double'>
<img src='http://images.example.com/double.gif' width='300' height='27' border='0' alt='' title=''>
</div>
</a>

Now I'm trying to get only the div with the id double. I've tried the following code, but it doesn't seem to be working properly. What might I be doing wrong?

//The HTML has been loaded into the variable $html
$dom=new domDocument;
$dom->loadHTML($html);
$dom->preserveWhiteSpace = false; 
$keepme = $dom->getElementById('double'); 

$contents = '<div style="text-align:center">'.$keepme.'</a></div>';
echo $contents;

1 Answer

I think DOMDocument::getElementById will not work in your case. Quoting the manual:

For this function to work, you will need either to set some ID attributes with DOMElement::setIdAttribute or a DTD which defines an attribute to be of type ID.
In the later case, you will need to validate your document with DOMDocument::validate or DOMDocument->validateOnParse before using this function.
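
To make the quoted requirement concrete, here is a minimal sketch of the setIdAttribute route. It assumes a small, made-up XML fragment loaded with loadXML() (not the HTML from the question), where "id" is just an ordinary attribute until it is flagged as an ID.

```php
<?php
// Assumption: a hypothetical XML fragment; "id" is a plain attribute here.
$xml = '<root><div id="game"/><div id="double"/></root>';

$dom = new DOMDocument();
$dom->loadXML($xml);

// No DTD declares "id" as type ID, so this lookup finds nothing yet.
$before = $dom->getElementById('double');

// Flag every "id" attribute as an ID attribute.
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//*[@id]') as $element) {
    $element->setIdAttribute('id', true);
}

// The same lookup now has ID attributes to work with.
$after = $dom->getElementById('double');

var_dump($before === null);
var_dump($after instanceof DOMElement);
```

Since this already needs an XPath query to find the elements, querying for the element directly is usually the shorter path.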


A solution that might work is using some XPath query to extract the element you are looking for.

First of all, let's load the HTML portion, like you first did:

$dom = new DOMDocument();
$dom->loadHTML($html);
var_dump($dom->saveHTML());

The var_dump is here only to prove that the HTML portion has been loaded successfully -- judging from its output, it has.


Then, instantiate the DOMXPath class, and use it to query for the element you want to get:

$xpath = new DOMXPath($dom);
$result = $xpath->query("//*[@id = 'double']");
$keepme = $result->item(0);

We now have the element you want ;-)
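
The lookup can also be written defensively. The following is only a sketch: the $html string is a shortened stand-in for the fragment from the question, and the libxml_use_internal_errors() call is an assumption added to keep parser warnings about the partial markup out of the output.

```php
<?php
// Assumption: a shortened stand-in for the HTML fragment from the question.
$html = "<a href='/games/'><div id='game'><b> Game </b></div>"
      . "<div id='double'><img src='http://images.example.com/double.gif' alt=''></div></a>";

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xpath  = new DOMXPath($dom);
$result = $xpath->query("//*[@id = 'double']");

// query() returns false for an invalid expression and an empty list when nothing matches.
$keepme = ($result !== false && $result->length > 0) ? $result->item(0) : null;

if ($keepme === null) {
    echo "No element with id 'double' was found\n";
} else {
    echo "Found a <", $keepme->nodeName, "> element\n";
}
```

Checking the result before calling item(0) avoids passing null to later DOM calls when the id is missing.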


But, in order to inject its HTML content in another HTML segment, we must first get its HTML content.

I don't remember any "easy" way to do that, but something like this should do the trick:

$tempDom = new DOMDocument();
$tempImported = $tempDom->importNode($keepme, true);
$tempDom->appendChild($tempImported);
$newHtml = $tempDom->saveHTML();
var_dump($newHtml);

And... we have the HTML content of your double <div>:

string '<div id="double">
<img src="http://images.example.com/double.gif" width="300" height="27" border="0" alt="" title="">
</div>
' (length=125)
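
On PHP 5.3.6 and later, DOMDocument::saveHTML() also accepts a node argument, which may be a shorter route than the temporary document. The sketch below assumes that PHP version and a shortened copy of the question's fragment. It also builds the wrapper string from the question: a DOMElement cannot be concatenated into a string directly, so the serialized HTML is used instead.

```php
<?php
// Assumption: a shortened stand-in for the HTML fragment from the question.
$html = "<a href='/games/'><div id='game'><b> Game </b></div>"
      . "<div id='double'><img src='http://images.example.com/double.gif' width='300' height='27' alt=''></div></a>";

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xpath  = new DOMXPath($dom);
$keepme = $xpath->query("//*[@id = 'double']")->item(0);

// Passing a node serializes only that subtree (PHP 5.3.6+).
$newHtml = $keepme !== null ? $dom->saveHTML($keepme) : '';

// Wrap the serialized markup, as the question intended.
$contents = '<div style="text-align:center">' . $newHtml . '</div>';
echo $contents, "\n";
```

The stray closing </a> from the question's wrapper string is left out here, since the extracted <div> no longer carries the surrounding link.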


Now, you just have to do whatever you want with it ;-)

