A flaky test is a test that is often green, but sometimes red. It may only fail on some machines, or only when the entire test suite is run.
There are many causes for flaky tests. This card focuses on a specific class of features with heavy side effects, mostly on the UI. Features like the following can amplify your flakiness issues by unexpectedly changing elements, causing excessive requests or other timing issues:

- Polling (periodically reloading an element from the server)
- Animations
- Lazy loading of images
- Live form validations
- Locks on form content
- Background processing of uploaded photos
- Autocomplete suggestions
There are very few tests that actually observe features like lazy loading or polling. But when such features are always enabled, every test that crosses an affected component suffers from the randomness and slowdown that they introduce.
While you can fix each of these issues by carefully controlling concurrency and timing in your tests, this requires a lot of work and sustained diligence. A better approach is to use feature flags and enable each of these features only for the one test that needs it.
Below you can find some real-world examples of always-on features breaking E2E tests.
This is from a real project: an autocomplete feature caused flaky tests, even though the affected scenario never looked at autocomplete suggestions. 🤷
This is from a real project: lazily loaded images caused flaky tests, even though the affected scenario did not care about any images. 🤷
This is from a real project: a <select> automatically validates against the server when changed. The test changed the <select>, then immediately scrolled down many pages and submitted the form. This was an actual timing issue in the code, not the test. However, it took a day to find a bug that could not realistically affect a user. 🤷 Also, the affected scenario never looked at the validation results. 🤷
Note

The issue above can no longer occur with Unpoly 2's [up-validate].
This is from a real project: a form acquires a lock on its content while being edited. When the test suite ran in parallel processes (parallel_tests), acquiring the lock took so long that the client failed. This was something that we fixed in the code, not the test. The code is better now. However, it took a day to find a bug that no user had seen in 6 years. 🤷
This is the earlier example from a real project: uploaded photos are processed into several versions in the background. The affected scenario did not care about any processed file versions. 🤷
This is from a real project: a <div> is periodically reloaded to keep it in sync with the latest server-side data. A check that there should be no JavaScript error fails the scenario. The affected scenario did not care about the periodic server updates. 🤷
Testing an animated UI causes countless problems. Here are only a few: clicks can miss an element that is still moving, and assertions can observe a half-transitioned state.
For each problematic feature, add a configuration option that enables it. Then enable it by default for all environments:
# config/application.rb
config.feature_polling = true
config.feature_animations = true
config.feature_lazy_load_images = true
config.feature_live_validations = true
config.feature_form_content_locks = true
config.feature_photo_processing = true
config.feature_autocomplete = true
Only for the test environment do we disable each feature by default:
# config/environments/test.rb
config.feature_polling = false
config.feature_animations = false
config.feature_lazy_load_images = false
config.feature_live_validations = false
config.feature_form_content_locks = false
config.feature_photo_processing = false
config.feature_autocomplete = false
Now change your code for polling, animations, etc. to honor these feature flags. When a component sees that its feature flag is disabled, it should disable itself.
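The pattern can be sketched in plain Ruby, independent of any framework. The Poller class and the config object here are hypothetical; in a Rails app the check would read Rails.configuration.feature_polling instead:

```ruby
require 'ostruct'

# Hypothetical stand-in for Rails.configuration
config = OpenStruct.new(feature_polling: false)

class Poller
  def initialize(config)
    @config = config
  end

  # Disable ourselves when the feature flag is off.
  def start
    return :disabled unless @config.feature_polling
    # ... here we would start the actual polling timer ...
    :started
  end
end

puts Poller.new(config).start                                 # prints "disabled"
puts Poller.new(OpenStruct.new(feature_polling: true)).start  # prints "started"
```

The component checks its flag once and otherwise does no work, so a disabled feature cannot introduce timing issues into a test.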
For this example we will use an Unpoly compiler to build a [poll] attribute. This attribute causes an element to be reloaded from the server every 5 seconds:
<ul id="users" poll>
<li>zofletcher</li>
<li>monicaphelps</li>
<li>johnharris</li>
</ul>
A simple, feature-flag-aware polling implementation could look like this:
up.compiler('[poll]', function(element) {
// Don't poll if polling is disabled.
if (document.body.dataset.featurePolling === 'false') return
// Reload element every 5 seconds.
var timer = setInterval(function() { up.reload(element) }, 5_000)
// Stop reloading when the element is removed from the DOM.
return function() { clearInterval(timer) }
})
Note how the early return lets us disable polling using a [data-feature-polling=false]
attribute on the <body>
element:
<html>
<body data-feature-polling='false'>
...
</body>
</html>
We enable polling for all environments by default, but disable polling for the test
environment:
# config/application.rb
config.feature_polling = true
# config/environments/test.rb
config.feature_polling = false
We echo the environment setting in our application layout:
<html>
<body data-feature-polling='<%= Rails.configuration.feature_polling %>'>
...
</body>
</html>
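Note that ERB interpolation stringifies the Ruby boolean, so the attribute value arrives in the DOM as the string 'false' rather than a boolean, which is why the compiler above compares against the string 'false'. A quick plain-Ruby illustration:

```ruby
feature_polling = false

# ERB's <%= %> calls #to_s on the value, so the rendered attribute
# contains the string "false":
rendered = "data-feature-polling='#{feature_polling}'"
puts rendered  # prints "data-feature-polling='false'"
```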
Now polling is disabled by default for all tests.
Our test suite has immediately become faster and more stable.
We now enable the polling feature for the one E2E test that tests polling.
In an RSpec feature spec this would look like this:
scenario 'The project list is updated periodically' do
# Enable polling for this test
allow(Rails.configuration).to receive(:feature_polling).and_return(true)
# Go to the projects index and see an empty list.
visit projects_path
expect(page).to have_text('No projects yet')
# When another user creates a project it automatically appears in our list.
create(:project, name: 'Superproject')
expect(page).to have_text('Superproject')
end
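If several specs need the same flag, you could also enable it through RSpec metadata instead of repeating the allow line. This is a sketch; it assumes rspec-mocks and a support file that is loaded by your spec_helper:

```ruby
# spec/support/feature_flags.rb (hypothetical path)
RSpec.configure do |config|
  config.before(:each, :polling) do
    allow(Rails.configuration).to receive(:feature_polling).and_return(true)
  end
end

# Usage: tag the scenario with :polling
# scenario 'The project list is updated periodically', :polling do
#   ...
# end
```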
If you're using Cucumber, it would look like this:
Scenario: The project list is updated periodically
Given the polling feature is enabled
When I go to the list of projects
Then I should see "No projects yet"
When another user creates a project "Superproject"
Then I should see "Superproject"
Here is the step that mocks Rails.configuration.feature_polling
for that one scenario:
Given /^the (.*?) feature is enabled$/ do |feature|
allow(Rails.configuration).to receive("feature_#{feature}").and_return(true)
end
Note
You may want to also make the polling frequency configurable. This way the test for polling does not need to wait 5 seconds to observe a change.
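One way to do this (a sketch; the option name is hypothetical) is a second config value next to the feature flag:

```ruby
# config/application.rb
config.feature_polling_interval = 5_000

# config/environments/test.rb
# Poll fast so the test observes a change without waiting 5 seconds.
config.feature_polling_interval = 100
```

The layout would then echo this value in another data attribute, e.g. data-polling-interval, and the compiler would read it instead of the hard-coded 5 seconds.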