Posted almost 9 years ago. Visible to the public.

Parse XML or HTML with Nokogiri

To parse XML-documents, I recommend the gem nokogiri.

A few hints:

  • xml = Nokogiri::XML("<list><item>foo</item><item>bar</item></list>") parses an xml string. You can also call Nokogiri::HTML to be more liberal about accepting invalid XML.
  • xml / 'list item' returns all matching nodes; list item is used like a CSS selector
  • xml / './/list/item' also returns all matching nodes, but .//list/item is now an XPath selector
    • XPath seems to be triggered by a leading . or /
  • xml % 'item' returns the first matching node
  • node.attribute('foo') returns the attribute named foo
  • node.attribute('foo').value returns its value
  • node.content returns the content

Careful with XPath:

Whenever an XML document declares a namespace, like

Copy
<list xmlns="http://mylist.org'> <item /> </list>

xml % './/list' will not match any more (since there is no list tag any more, just a {http://mylist.org}:list tag).

You may use xml % './/xmlns:list' instead.

XPath examples

XPath sandbox

Growing Rails Applications in Practice
Check out our new e-book:
Learn to structure large Ruby on Rails codebases with the tools you already know and love.

Owner of this card:

Avatar
Tobias Kraze
Last edit:
almost 7 years ago
About this deck:
We are makandra and do test-driven, agile Ruby on Rails software development.
License for source code
Posted by Tobias Kraze to makandra dev
This website uses cookies to improve usability and analyze traffic.
Accept or learn more