Sunday, 15 August 2010

php - How to get all text node values between span nodes



I have the following HTML structure:

<span class="x">a</span> <br> • first <br> • sec <br> • sec <br> • 3rd <br> <br> <span class="x">b</span>

I need the text values (comma separated) that occur between the span nodes, i.e. first,second,second,third (for the sample above: first,sec,sec,3rd).

How can this be done using XPath and DOM?

You can query these text nodes using XPath, but you need to do the "cleanup" of the bullet points in PHP, because PHP's SimpleXML and DOM extensions only support XPath 1.0, which has no extended string editing capabilities.

The most important part is the XPath expression, which I'll explain in detail:

//span[text()='a']/following::text()
    fetches all text nodes after the span with content "a"

[. = //span[text()='b']/preceding::text()]
    compares each of them to the set of text nodes before the span with content "b" and keeps only the ones that match

And here's the complete code. You might want to invest more effort in removing the bullet point. Make sure PHP is handling the markup as UTF-8, otherwise you will get mojibake instead of the bullet point.
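One detail of the predicate is worth spelling out: in XPath 1.0 the = operator compares string values, not node identity, so a text node after span "b" whose value also occurs before span "b" is matched as well. The following is a minimal sketch of that difference, not part of the original answer. The sample markup and the variable names are assumptions made for illustration, and the second query uses the well-known count()-based node-set intersection idiom in place of the value comparison.

```php
<?php
// Assumed sample markup: the "first" after span "b" repeats a value
// that also occurs between the two spans.
$html = '<div><span class="x">a</span><br>first<br>sec<br>'
      . '<span class="x">b</span><br>first</div>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$after  = "//span[text()='a']/following::text()";
$before = "//span[text()='b']/preceding::text()";

// Value comparison: also matches the trailing "first" after span "b".
$byValue = $xpath->query("{$after}[. = {$before}]");

// Identity comparison: a node belongs to the "before" set only if
// adding it to that set does not change the set's size.
$byIdentity = $xpath->query("{$after}[count(. | {$before}) = count({$before})]");

echo $byValue->length, "\n";    // 3
echo $byIdentity->length, "\n"; // 2
```

For the markup in the question both queries select the same bullet lines; the identity form only matters when values repeat outside the two spans.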

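<?php
// Sample markup from the question. The meta tag tells the HTML parser
// that the input is UTF-8; without it the bullet turns into mojibake.
$html = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<span class="x">a</span> <br> • first <br> • sec <br> • sec <br> • 3rd <br> <br> <span class="x">b</span>
';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->strictErrorChecking = false;
$dom->recover = true;
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);
$results = $xpath->query("//span[text()='a']/following::text()[. = //span[text()='b']/preceding::text()]");

$tokens = array();
foreach ($results as $result) {
    // Strip the bullet and the surrounding whitespace.
    $token = trim(str_replace('•', '', $result->nodeValue));
    // Skip text nodes that contained only whitespace.
    if ($token !== '') {
        $tokens[] = $token;
    }
}

echo implode(',', $tokens); // first,sec,sec,3rd
?>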
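The answer notes that removing the bullet point may deserve more effort. One possible approach is sketched below, assuming the text node values are valid UTF-8. The helper name clean_token is hypothetical and not part of the original answer. It uses preg_replace with the /u modifier so the bullet (U+2022) is matched as a single character regardless of the source file's own encoding.

```php
<?php
// Hypothetical helper (not from the original answer): strips leading
// bullets and whitespace, plus trailing whitespace, from a text node value.
function clean_token(string $text): string
{
    // /u makes PCRE treat pattern and subject as UTF-8;
    // \x{2022} is the bullet character.
    $cleaned = preg_replace('/^[\s\x{2022}]+|\s+$/u', '', $text);

    // preg_replace returns null on failure, e.g. for invalid UTF-8 input.
    return $cleaned ?? '';
}

// Assumed example values, shaped like the text nodes in the question.
$values = [" \u{2022} first ", "  ", " \u{2022} 3rd\n"];

$tokens = array_values(array_filter(
    array_map('clean_token', $values),
    fn ($t) => $t !== ''
));

echo implode(',', $tokens); // first,3rd
```

Comparing against '' instead of relying on truthiness keeps a token such as "0" from being dropped.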

php dom xpath
