How to inspect RSS feeds with Spreewald, XPath, and Selenium


Spreewald gives you the <step> within <selector> meta step, which constrains page inspection to a given scope.

Unfortunately, this does not work with RSS feeds: they are XML documents, and they are not valid when viewed from Capybara's internal browser (e.g. in HTML, a <link> tag cannot have content).

Inspecting XML

If you're inspecting XML that is invalid in HTML, you need to inspect the page source instead of the DOM. You may use Spreewald's "... in the HTML" meta step, or add this proxy step for better semantics:

Then /^I should( not)? see "(.+?)" in the page source$/ do |negate, text|
  step %(I should#{ negate } see "#{ text }" in the HTML)
end
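
To see what the proxy step hands on to Spreewald, here is a minimal plain-Ruby sketch of the same matching and interpolation, outside of Cucumber. The constant PAGE_SOURCE_STEP and the helper delegated_step are hypothetical names used only for illustration; they are not part of Spreewald or Cucumber.

```ruby
# Hypothetical constant: the same pattern the proxy step uses.
PAGE_SOURCE_STEP = /^I should( not)? see "(.+?)" in the page source$/

# Hypothetical helper: returns the step text the proxy step would delegate to,
# or nil if the given text does not match the pattern.
def delegated_step(step_text)
  match = PAGE_SOURCE_STEP.match(step_text)
  return nil unless match

  negate, text = match.captures
  # negate is " not" or nil; nil interpolates as an empty string.
  %(I should#{negate} see "#{text}" in the HTML)
end

puts delegated_step('I should see "Application title" in the page source')
puts delegated_step('I should not see "Draft" in the page source')
```

The optional group captures the leading space together with "not", which is why the interpolated string needs no extra whitespace handling.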

Now you can say Then I should see "Application title" in the page source within "channel > title" whenever a simple "within ..." does not work.

Note that you need to write all selectors in lowercase letters in your tests. Capybara talks HTML, after all, and in HTML, tag and attribute names are case-insensitive (as opposed to XML).

Using XPath

If you need XPath to express things that CSS selectors cannot, add the following selector to selectors.rb:

    when /^"(.+?)" \(xpath\)$/
      [:xpath, $1]

Then use it like this: Then I should see "Application title" in the page source within "//channel/title" (xpath)
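
As a rough sketch of where that branch sits, the following standalone method mimics a selector lookup like the one in selectors.rb. The method name selector_for and the surrounding branches are assumptions for illustration; your generated selectors.rb may be structured differently.

```ruby
# Hypothetical stand-in for the selector lookup in selectors.rb.
def selector_for(locator)
  case locator
  when /^"(.+?)" \(xpath\)$/
    # Quoted text followed by " (xpath)" is returned as an XPath selector.
    [:xpath, $1]
  when /^"(.+?)"$/
    # Plain quoted text is treated as a CSS selector.
    $1
  else
    raise ArgumentError, "Can't find mapping from #{locator.inspect} to a selector"
  end
end

p selector_for('"//channel/title" (xpath)')
p selector_for('"channel > title"')
```

Because both patterns are anchored at the end of the string, a locator with the (xpath) suffix cannot be mistaken for a plain CSS selector.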

Using Selenium and Firefox

Selenium doesn't seem to be able to inspect RSS when you follow a link to the feed. Firefox (at least) seems to intercept those RSS links, so Capybara internally remains on the previous page.
You can work around this by visiting the link's target directly with the following step:

# Firefox seems to intercept RSS links. Use this step to circumvent that.
When /^I open the RSS feed behind "(.+?)"$/ do |link_label|
  link = find_link(link_label)
  visit link[:href]
end
License
Source code in this card is licensed under the MIT License.
Posted by Dominik Schöler to makandra dev (2016-06-06 11:51)