Posted almost 9 years ago. Visible to the public.

Parse XML or HTML with Nokogiri

To parse XML documents, I recommend the nokogiri gem.

A few hints:

  • xml = Nokogiri::XML("<list><item>foo</item><item>bar</item></list>") parses an XML string. Use Nokogiri::HTML instead if you need a parser that is more forgiving of invalid markup.
  • xml / 'list item' returns all matching nodes; list item is used like a CSS selector
  • xml / './/list/item' also returns all matching nodes, but .//list/item is now an XPath selector
    • XPath seems to be triggered by a leading . or /
  • xml % 'item' returns the first matching node
  • node.attribute('foo') returns the attribute named foo
  • node.attribute('foo').value returns its value
  • node.content returns the node's text content

Careful with XPath:

Whenever an XML document declares a default namespace, like

<list xmlns="..."> <item /> </list>

xml % './/list' will no longer match (there is no plain list tag any more, only a list tag inside the declared namespace).

You may use xml % './/xmlns:list' instead. Nokogiri registers the root element's default namespace under the prefix xmlns.

XPath examples

XPath sandbox


Owner of this card:

Tobias Kraze
Last edit:
almost 7 years ago
Posted by Tobias Kraze to makandra dev