Posted almost 10 years ago. Visible to the public.

Parse XML or HTML with Nokogiri

To parse XML documents, I recommend the nokogiri gem.

A few hints:

  • xml = Nokogiri::XML("<list><item>foo</item><item>bar</item></list>") parses an XML string. You can also call Nokogiri::HTML, which uses the HTML parser and is more liberal about accepting invalid markup.
  • xml / 'list item' returns all matching nodes; list item is used like a CSS selector
  • xml / './/list/item' also returns all matching nodes, but .//list/item is now an XPath selector
    • Nokogiri seems to treat a selector as XPath when it starts with . or /
  • xml % 'item' returns the first matching node
  • node.attribute('foo') returns the attribute named foo
  • node.attribute('foo').value returns its value
  • node.content returns the content

Careful with XPath:

Whenever an XML document declares a namespace, like

<list xmlns="http://mylist.org"> <item /> </list>

xml % './/list' will no longer match, since there is no plain list tag anymore, just a {http://mylist.org}:list tag.

You may use xml % './/xmlns:list' instead.

XPath examples

XPath sandbox


Owner of this card:

Tobias Kraze
Last edit:
almost 8 years ago
About this deck:
We are makandra and do test-driven, agile Ruby on Rails software development.
Posted by Tobias Kraze to makandra dev