Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I would like to extract the URLs that start with "../category/" from the anchor tags of a webpage, such as the ones below:

<a href="../category/product/pc.html" target="_blank">PC</a><br>
<a href="../category/product/carpet.html" target="_blank">Carpet</a><br>

Any suggestions would be very much appreciated.

Thanks!

1 Answer

No regular expression is required. A simple XPath query with DOM will suffice:

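// $html holds the markup of the page to search.
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Select every <a> element whose href attribute starts with "../category/".
$nodes = $xpath->query('//a[starts-with(@href, "../category/")]');
foreach ($nodes as $node) {
    // Print the link text followed by its href.
    echo $node->nodeValue.' = '.$node->getAttribute('href').PHP_EOL;
}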

This will print:

PC = ../category/product/pc.html
Carpet = ../category/product/carpet.html
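
If the page comes from the wild, the markup is often malformed and loadHTML() may emit warnings. The following is a minimal, self-contained sketch of the same XPath approach wrapped in a reusable function. The function name extractLinksByPrefix, the sample $html string, and the extra "About" link are assumptions made for illustration and are not part of the original answer; the sketch also assumes the prefix contains no double quote, because it is placed inside the XPath string literal.

```php
<?php
// Hypothetical helper (not from the original answer): returns a list of
// ['text' => ..., 'href' => ...] pairs for every <a> element whose href
// starts with $prefix.
function extractLinksByPrefix(string $html, string $prefix): array
{
    // Collect libxml parse errors internally instead of emitting warnings,
    // then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Assumption: $prefix contains no double quote character.
    $nodes = $xpath->query('//a[starts-with(@href, "' . $prefix . '")]');

    $links = [];
    foreach ($nodes as $node) {
        $links[] = [
            'text' => $node->nodeValue,
            'href' => $node->getAttribute('href'),
        ];
    }
    return $links;
}

// Sample input: the two links from the question plus one that should be skipped.
$html = '<a href="../category/product/pc.html" target="_blank">PC</a><br>'
      . '<a href="../category/product/carpet.html" target="_blank">Carpet</a><br>'
      . '<a href="../about.html">About</a>';

foreach (extractLinksByPrefix($html, '../category/') as $link) {
    echo $link['text'] . ' = ' . $link['href'] . PHP_EOL;
}
```

With the sample input, the loop prints the same two lines as shown above, and the "About" link is left out because its href does not start with the prefix. Returning an array rather than echoing inside the function keeps the extraction separate from the output, so the caller can filter or resolve the relative URLs afterwards.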

...