GTAC 2016: How Flaky Tests in Continuous Integration

Testing

Learn how Google's continuous integration system handles flaky tests, a common problem that affects around 16% of all tests at Google, and discover strategies for identifying and fixing them to improve test reliability.

Key takeaways

Flaky tests are a common problem in continuous integration, with around 16% of all tests at Google being flaky.
Flaky tests can be caused by various factors such as code changes, resource contention, and concurrency issues.
A flaky test is one that fails intermittently rather than consistently: it can both pass and fail without any change to the code under test.
Flaky tests can be added to a test database, along with their history, to track and analyze their behavior.
By analyzing the history of flaky tests, it is possible to identify patterns and correlations that can inform the development of better tests.
Flakes (flaky tests) are more likely to occur in tests that are run frequently, and in tests that are complex or have many dependencies.
A UI test with Selenium WebDriver may be more likely to be flaky than a unit test.
Integration tests and web tests tend to be more flaky than smaller, more isolated tests.
Code modification frequency is a good signal for predicting which test targets are likely to be flaky.
Files modified by a single author are less likely to cause flakiness than files modified by multiple authors.
Code review is important for identifying and fixing flaky tests.
Continuous integration systems have to deal with a certain level of flakiness in tests, and Google’s CI system handles this by running tests in parallel and rescheduling them until they pass or fail.
Google has a large developer population organized into many small teams, which makes it easier to manage flaky tests.
Google’s CI system has a high volume of tests and a fast turnaround time, which makes it critical to prioritize and improve test reliability.
Google’s process is to monitor test results, identify flaky tests, and prioritize their improvement.
Google prioritizes fixing flaky tests over debugging code changes.
Google’s CI system has a high level of autonomy, which allows developers to decide how to fix flaky tests.
Google uses machine learning to predict which tests are likely to be flaky and to detect real failures.
Google publishes its data and methodology for detecting flaky tests.
Google is open to collaboration and willing to share its data and insights with other companies.
Google believes that testing is important for ensuring the quality of its software.
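
The definition of a flaky test and the idea of keeping a test history, both mentioned in the takeaways, can be sketched in a few lines. This is only an illustration, not Google's implementation: the `RunRecord` schema, its field names, and the rule "a test is flaky if it has both passed and failed at the same revision" are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    """One recorded execution of a test (hypothetical schema)."""
    test_name: str
    revision: str  # identifier of the code version under test
    passed: bool


def find_flaky_tests(history):
    """Return names of tests that both passed and failed at one revision."""
    outcomes = defaultdict(set)  # (test_name, revision) -> set of results seen
    for run in history:
        outcomes[(run.test_name, run.revision)].add(run.passed)
    # Mixed results on identical code cannot be explained by a code change.
    return sorted({name for (name, _), seen in outcomes.items() if len(seen) == 2})


history = [
    RunRecord("login_ui_test", "r100", True),
    RunRecord("login_ui_test", "r100", False),
    RunRecord("parser_unit_test", "r100", True),
    RunRecord("parser_unit_test", "r101", False),
    RunRecord("parser_unit_test", "r101", False),
]
flaky = find_flaky_tests(history)
print(flaky)
```

Here `parser_unit_test` is not reported: it fails consistently at `r101`, which points to a real regression rather than a flake.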

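Rescheduling a failing test, as described in the takeaways, might look like the sketch below. The summary does not say how many attempts Google's CI system makes or how it labels results, so `run_with_retries`, `max_attempts`, and the three result labels are hypothetical.

```python
def run_with_retries(test_fn, max_attempts=3):
    """Run test_fn until it passes or the attempts are exhausted.

    Returns "pass", "flaky" (failed at least once, then passed), or "fail".
    """
    failures = 0
    for _ in range(max_attempts):
        try:
            test_fn()
        except AssertionError:
            failures += 1
            continue
        return "flaky" if failures else "pass"
    return "fail"


def make_intermittent_test(fail_first_n):
    """Build a deterministic stand-in for a flaky test: fails N times, then passes."""
    state = {"calls": 0}

    def test():
        state["calls"] += 1
        assert state["calls"] > fail_first_n, "simulated intermittent failure"

    return test


results = [
    run_with_retries(make_intermittent_test(0)),  # passes on the first attempt
    run_with_retries(make_intermittent_test(1)),  # fails once, then passes
    run_with_retries(make_intermittent_test(5)),  # never passes within 3 attempts
]
print(results)
```

Recording the "flaky" outcome separately, rather than folding it into "pass", keeps the intermittent failure visible so it can still be tracked and prioritized for fixing.
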