More Splatters of Hpricot
Element#siblings_at, Element#nodes_at
>> doc.at(:h3)
=> {elem <h3> {text "bottles and cans"} </h3>}
>> doc.at(:h3).siblings_at(0..2)
=> #<Hpricot::Elements[
{elem <h3> {text "bottles and cans"} </h3>},
{elem <p> {text "just clap your hands"} </p>},
{elem <ul> {elem <li> {text "2 turntables"} </li>}
{elem <li> {text "a microphone"} </li>} </ul>}]>
Basically: grab this element, as well as the two sibling elements below it. The nodes_at method counts text nodes and comments as well.
>> doc.at(:h3).nodes_at(0..2)
=> #<Hpricot::Elements[
{elem <h3> {text "bottles and cans"} </h3>},
{text "\n"},
{elem <p> {text "just clap your hands"} </p>}]>
You can also do stuff like nodes_at(-2, 2, 5) to grab specific nodes: the ones positioned two places above, two places below and five places below the selected element.
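To make the counting concrete, here is a toy model in plain Ruby. It is only a sketch, not Hpricot's code: the Node struct, the CHILDREN list and the toy_* helpers are all made up for illustration, and the "\n" between the p and the ul is an assumption about the sample document. The idea is that nodes_at walks every child of the parent, while siblings_at walks only the elements.

```ruby
# Toy model, NOT Hpricot's implementation: a parent's children as a flat list.
Node = Struct.new(:kind, :value)

# Hypothetical children of the sample document's parent element.
CHILDREN = [
  Node.new(:elem, "h3"),
  Node.new(:text, "\n"),
  Node.new(:elem, "p"),
  Node.new(:text, "\n"),   # assumed whitespace between <p> and <ul>
  Node.new(:elem, "ul"),
]

# Expand ranges and integers into a flat list of relative offsets.
def expand_offsets(positions)
  positions.flat_map { |pos| pos.is_a?(Range) ? pos.to_a : [pos] }
end

# Pick the entries of `list` found at the given offsets from `node`.
# Offsets that land outside the list are skipped.
def pick_relative(list, node, positions)
  start = list.index { |item| item.equal?(node) }
  expand_offsets(positions)
    .map { |offset| start + offset }
    .select { |index| index >= 0 && index < list.size }
    .map { |index| list[index] }
end

# Like nodes_at: every node counts, text included.
def toy_nodes_at(children, node, *positions)
  pick_relative(children, node, positions)
end

# Like siblings_at: only element nodes count.
def toy_siblings_at(children, node, *positions)
  elems = children.select { |child| child.kind == :elem }
  pick_relative(elems, node, positions)
end

h3 = CHILDREN.first
p toy_nodes_at(CHILDREN, h3, 0..2).map(&:value)     # h3, newline, p
p toy_siblings_at(CHILDREN, h3, 0..2).map(&:value)  # h3, p, ul
```

Same range, different haul: the text node eats one of the three slots in the first call, which is why the ul only shows up in the second.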
text() and comment()
>> doc.search("p/text()")
=> #<Hpricot::Elements[{text "just clap your hands"}]>
>> doc.at("//comment()")
=> {comment "<!-- insert mp3 of applause here -->"}
Element#to_original_html
>> doc = Hpricot("<p>a bunch of <b>messy <i>messy</b> html that " +
"doesn't</u> match up<!_ egg _!>")
=> #<Hpricot::Doc {elem <p> {text "a bunch of "} {elem <b> {text "messy "}
{elem <i> {text "messy"}} </b>} {text " html that doesn't"}
{bogusetag </u>} {text " match up<!_ egg _!>"}}>
>> puts doc.to_original_html
<p>a bunch of <b>messy <i>messy</b> html that doesn't</u> match up<!_ egg _!>
XPath indices
>> doc.at("li[1]")
=> {elem <li> {text "a microphone"} </li>}
Those indices work like E:nth-of-type.
Version 0.5 of Hpricot is nearing. Please test the latest gem to help me figure out any subtleties. Also, rdoc is now included. So, yeah: HELP!
gem install hpricot --source code.whytheluckystiff.net

