Posted over 11 years ago. Visible to the public.

Parse XML or HTML with Nokogiri

To parse XML-documents, I recommend the gem nokogiri Archive .

A few hints:

  • xml = Nokogiri::XML("<list><item>foo</item><item>bar</item></list>") parses an xml string. You can also call Nokogiri::HTML to be more liberal about accepting invalid XML.
  • xml / 'list item' returns all matching nodes; list item is used like a CSS selector
  • xml / './/list/item' also returns all matching nodes, but .//list/item is now an XPath selector
    • XPath seems to be triggered by a leading . or /
  • xml % 'item' returns the first matching node
  • node.attribute('foo') returns the attribute named foo
  • node.attribute('foo').value returns its value
  • node.content returns the content

Careful with XPath:

Whenever an XML document declares a namespace, like

<list xmlns="'> <item /> </list>

xml % './/list' will not match any more (since there is no list tag any more, just a {}:list tag).

You may use xml % './/xmlns:list' instead.

XPath examples Archive

XPath sandbox Archive

Once an application no longer requires constant development, it needs periodic maintenance for stable and secure operation. makandra offers monthly maintenance contracts that let you focus on your business while we make sure the lights stay on.

Owner of this card:

Tobias Kraze
Last edit:
about 9 years ago
About this deck:
We are makandra and do test-driven, agile Ruby on Rails software development.
License for source code
Posted by Tobias Kraze to makandra dev
This website uses short-lived cookies to improve usability.
Accept or learn more