By default parallel_tests will spawn as many test processes as you have CPUs. If you have issues with flaky tests, reducing the number of parallel processes may help.
Important
Flaky test suites can and should be fixed. This card is only relevant if you need to run a flaky test suite that you cannot fix for some reason. If you have no issues with flaky tests you should run as many parallel test processes as possible.
Test case
In my case halving the number of processes from 8 to 4 reduced test failures by roughly 80% while only increasing test runtime by roughly 10% (percentages are relative to the first run with 8 processes):
Processes | Test runtime | Test runtime (%) | Failures | Failures (%) |
---|---|---|---|---|
8 | 308 | 100% | 14 | 100% |
8 | 304 | 99% | 10 | 71% |
6 | 315 | 102% | 6 | 43% |
4 | 343 | 111% | 1 | 7% |
4 | 346 | 112% | 6 | 43% |
4 | 333 | 108% | 2 | 14% |
4 | 340 | 110% | 3 | 21% |
3 | 378 | 123% | 2 | 14% |
3 | 370 | 120% | 2 | 14% |
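Because individual runs vary a lot (the four runs with 4 processes range from 1 to 6 failures), it can help to compare averages per process count. The following is a minimal sketch that only re-uses the numbers from the table above; the constant `RUNS` and the helper `averages_by_process_count` are names made up for this illustration, and the runtime is kept in whatever unit the table uses.

```ruby
# Measurements copied from the table above: [processes, runtime, failures]
RUNS = [
  [8, 308, 14], [8, 304, 10],
  [6, 315, 6],
  [4, 343, 1], [4, 346, 6], [4, 333, 2], [4, 340, 3],
  [3, 378, 2], [3, 370, 2],
].freeze

# Hypothetical helper: groups the runs by process count and returns
# { processes => { runtime: mean_runtime, failures: mean_failures } }
def averages_by_process_count(runs)
  runs.group_by(&:first).transform_values do |group|
    {
      runtime: group.sum { |_, runtime, _| runtime }.to_f / group.size,
      failures: group.sum { |_, _, failures| failures }.to_f / group.size,
    }
  end
end

averages_by_process_count(RUNS).each do |processes, avg|
  puts format('%d processes: runtime %.1f, failures %.1f', processes, avg[:runtime], avg[:failures])
end
```

With so few runs per process count these averages are only a rough indication, not a statistically sound result.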
How to start fewer processes
When you're using the parallel_tests gem you can use the PARALLEL_TEST_PROCESSORS environment variable:
PARALLEL_TEST_PROCESSORS=4 geordi cucumber features
PARALLEL_TEST_PROCESSORS=4 parallel_cucumber features
To set a default process count you can add this to your ~/.bashrc or ~/.zshrc:
export PARALLEL_TEST_PROCESSORS=4
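The precedence described here (an explicit PARALLEL_TEST_PROCESSORS wins, otherwise the CPU count is used) can be sketched in Ruby as follows. This is an illustration only: `effective_process_count` is a hypothetical helper, not a parallel_tests API, and the use of `Etc.nprocessors` for the fallback is an assumption; the gem's actual lookup may differ in detail.

```ruby
require 'etc'

# Hypothetical helper (not part of parallel_tests): returns the process count
# from PARALLEL_TEST_PROCESSORS if it is a positive integer, otherwise falls
# back to the number of CPUs reported by the operating system.
def effective_process_count(env = ENV)
  configured = env['PARALLEL_TEST_PROCESSORS'].to_s
  if configured.match?(/\A\d+\z/) && configured.to_i.positive?
    configured.to_i
  else
    # Assumption: the logical CPU count stands in for the gem's default
    Etc.nprocessors
  end
end

puts effective_process_count({ 'PARALLEL_TEST_PROCESSORS' => '4' })
puts effective_process_count({})
```

The first call prints the configured value; the second prints whatever CPU count your machine reports.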
What's a good number of processes?
I'm going to experiment with 4, since that's the number of physical CPUs on my PC. If your CPU has hyperthreading, Linux may report a higher number of CPUs. In my case Linux reports 8 and parallel_tests defaults to that:
$ lscpu
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
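The relationship between these fields is simple arithmetic: physical cores = sockets × cores per socket, and logical CPUs = physical cores × threads per core. Here is a small sketch that works on the sample output above. The names `LSCPU_SAMPLE`, `lscpu_field` and `cpu_counts` are made up for this example, and it assumes all three fields are present in the text.

```ruby
# Sample lscpu output, copied from the listing above
LSCPU_SAMPLE = <<~OUTPUT
  CPU(s): 8
  On-line CPU(s) list: 0-7
  Thread(s) per core: 2
  Core(s) per socket: 4
  Socket(s): 1
OUTPUT

# Hypothetical helper: reads one integer field from lscpu-style text
def lscpu_field(output, label)
  match = output.match(/^#{Regexp.escape(label)}:\s*(\d+)/)
  raise ArgumentError, "#{label} not found" unless match
  match[1].to_i
end

# Returns the physical core count and the logical CPU count
def cpu_counts(output)
  sockets = lscpu_field(output, 'Socket(s)')
  cores_per_socket = lscpu_field(output, 'Core(s) per socket')
  threads_per_core = lscpu_field(output, 'Thread(s) per core')
  physical = sockets * cores_per_socket
  { physical: physical, logical: physical * threads_per_core }
end

counts = cpu_counts(LSCPU_SAMPLE)
puts "physical cores: #{counts[:physical]}, logical CPUs: #{counts[:logical]}"
```

For the sample this yields 4 physical cores and 8 logical CPUs, matching the `CPU(s): 8` line that parallel_tests picks up as its default.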
Note that your system needs to handle many more processes than just your tests while your test suite is running:
- The Rails server booted by each test process (in a separate process)
- The Chrome browser started by each test process
- Your IDE
- Your window environment
- Background services
Fixing flaky tests
Running fewer test processes is only a band-aid. Your test suite has uncontrolled timing issues, and reducing the number of test processes just makes the resulting race conditions occur less frequently.
We have a separate card for fixing flaky integration tests.
Informing other developers
If you cannot fix your test suite, you may suggest to your colleagues that they run fewer processes.
The following script will print a yellow message to the console if the user is running more test processes than physical CPUs:
You are running more test processes than your PC has physical CPUs (8). If you encounter flaky tests, consider running tests with PARALLEL_TEST_PROCESSORS=8.
Run the script while starting your test suite (e.g. in Cucumber, copy it to features/support/suggest_fewer_test_processes.rb):
require 'open3'

class SuggestFewerTestProcesses

  class CannotReadCPUCount < StandardError; end

  def check_process_count
    cpus = physical_cpu_count
    if process_count > cpus
      warn("You are running more test processes than your PC has physical CPUs (#{cpus}). If you encounter flaky tests, consider running tests with PARALLEL_TEST_PROCESSORS=#{cpus}.")
    end
  rescue CannotReadCPUCount => e
    warn('Cannot read CPU count: ' + e.message)
  end

  private

  # parallel_tests sets TEST_ENV_NUMBER to the number of the current test
  # process ('' for the first process, then '2', '3', ...). This is the
  # process number rather than the total count, so the warning is printed
  # by every process whose number exceeds the physical CPU count.
  def process_count
    if (env_value = ENV['TEST_ENV_NUMBER'])
      env_value.to_i
    else
      1
    end
  end

  def physical_cpu_count
    stdout_str, error_str, status = Open3.capture3('lscpu')
    if status.success?
      # lscpu output looks like this:
      #
      # ...
      # Core(s) per socket: 4
      # Socket(s): 1
      # ...
      if stdout_str =~ /Socket\(s\):\s*(\d+)/
        sockets = Regexp.last_match(1).to_i
      else
        raise CannotReadCPUCount, 'Cannot parse socket count from lscpu output'
      end
      # \d+ (not \d*) so that an empty value does not silently count as 0 cores
      if stdout_str =~ /Core\(s\) per socket:\s*(\d+)/
        cores_per_socket = Regexp.last_match(1).to_i
      else
        raise CannotReadCPUCount, 'Cannot parse core count from lscpu output'
      end
      sockets * cores_per_socket
    else
      raise CannotReadCPUCount, error_str
    end
  rescue Errno::ENOENT => e
    # lscpu is not installed (e.g. on non-Linux systems)
    raise CannotReadCPUCount, e.message
  end

  # Shadows Kernel#warn within this class to print in yellow to stdout
  def warn(message)
    puts yellow(message)
  end

  def yellow(string)
    "\e[33m#{string}\e[0m"
  end

end

SuggestFewerTestProcesses.new.check_process_count