Mutation Testing
Chapter 10: Mutation Testing
You have 100% code coverage. All tests pass. But are your tests actually good? Mutation testing answers this by deliberately breaking your code ("mutating" it) and checking if tests catch the bugs. If a test suite doesn't fail when code is broken, the tests aren't testing effectively. This chapter introduces mutation testing and shows you how to measure and improve test quality.
What is Mutation Testing?
Mutation testing modifies your source code in small ways (mutations) and runs your test suite. If tests still pass after the mutation, your tests missed a bug—they're not strong enough.
Example:
Mutation (change > to >=):
If your tests still pass with this mutation, you don't have a test for n == 0. A strong test suite would fail, catching the mutation.
How Mutation Testing Works
- Run Tests: Verify all tests pass on original code
- Create Mutation: Modify code (change
+to-,>to>=, etc.) - Run Tests on Mutant: Do tests fail?
- Killed: Tests failed (good! Tests caught the bug)
- Survived: Tests passed (bad! Tests missed the bug)
- Repeat: Try all possible mutations
- Calculate Score:
killed / (killed + survived) * 100%
Mutation Score: Percentage of mutations your tests catch. Higher is better. 80%+ is strong.
Installing mutmut
pip install mutmutmutmut is the most popular Python mutation testing tool.
Running Mutation Tests
# Run mutation testing
mutmut run
# View results
mutmut results
# Show specific mutations
mutmut show 1mutmut automatically finds your source code and tests, creates mutations, and runs tests on each mutation.
Types of Mutations
Arithmetic Operators:
+→-,*,/-→+*→/,+
Comparison Operators:
>→>=,<,====→!=,>=,<=<→<=,>,==
Boolean Operators:
and→oror→andnot→ removed
Constants:
True→False0→1"string"→""
Return Values:
return x→return Nonereturn True→return False
Interpreting Mutation Results
High Mutation Score (80%+): Strong test suite. Most mutations are caught.
Medium Score (60-80%): Decent coverage but room for improvement. Some edge cases untested.
Low Score (<60%): Weak tests. Many mutations survive. Tests verify execution but not correctness.
Example Output:
Total mutations: 100
Killed: 85
Survived: 10
Timeout: 5
Mutation Score: 85%85% is strong. The 10 survivors indicate specific gaps in test coverage.
Improving Your Mutation Score
Step 1: Identify Survivors. Run mutmut show <id> to see mutations that survived.
Step 2: Understand Why. Why did tests pass? Missing edge case? Weak assertion?
Step 3: Add Tests. Write tests that would kill the mutation.
Example: Mutation survived because you test add(2, 3) but not add(0, 5).
Step 4: Re-run. Verify new tests kill the mutation.
Mutation Testing Best Practices
Don't Chase 100%: Some mutations are equivalent (don't change behavior) or impossible to kill. 85-90% is excellent.
Focus on Critical Code: Run mutation testing on business logic, not trivial getters or configuration.
Integrate with CI: Run mutation testing periodically, not on every commit (it's slow).
Use Results to Guide Testing: Survivors show where tests are weak. Add tests for those cases.
Set Minimum Score: Fail builds if mutation score drops below threshold (e.g., 80%).
Mutation Testing vs Code Coverage
Code coverage measures execution. Mutation testing measures quality.
High Coverage, Low Mutation Score:
def calculate_discount(price, is_member):
if is_member:
return price * 0.9
return price
def test_discount():
result = calculate_discount(100, True)
# No assertion! Just executes code.100% code coverage, 0% mutation score. The test verifies nothing.
High Coverage, High Mutation Score:
def test_discount_strong():
# Member gets discount
assert calculate_discount(100, True) == 90
# Non-member pays full price
assert calculate_discount(100, False) == 100100% coverage, 100% mutation score. Tests verify correctness.
Coverage measures "did this run?" Mutation testing measures "did this work correctly?"
Equivalent Mutations
Some mutations don't change behavior:
def is_adult(age):
return age >= 18
# Mutation: age >= 18 to age > 17
# These are equivalent! Can't distinguish with tests.Equivalent mutations can't be killed. Don't worry about them—they're unavoidable. Aim for 85-90%, not 100%.
Performance Considerations
Mutation testing is slow. Each mutation requires running the full test suite.
Optimize:
- Run mutation testing on changed files only
- Use parallel execution:
mutmut run --threads-to-use=4 - Run nightly, not on every commit
- Focus on critical modules
Example CI Integration:
# Run weekly mutation testing
schedule:
- cron: '0 2 * * 1' # Monday 2 AM
jobs:
mutation-test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: Run mutation testing
run: |
pip install mutmut
mutmut run --paths-to-mutate=src/core
mutmut resultsWhen to Use Mutation Testing
Use Mutation Testing For:
- Critical business logic (payment processing, financial calculations, data transformations)
- Security-sensitive code (authentication, authorization, encryption, input validation)
- Complex algorithms (sorting, searching, optimization, graph algorithms)
- Code with high test coverage that still has bugs (mutation testing reveals test quality issues)
- Stable APIs and libraries (where correctness is paramount and code changes infrequently)
Skip Mutation Testing For:
- Simple CRUD operations (basic create, read, update, delete with minimal logic)
- Trivial utilities (simple getters, setters, data structure wrappers)
- UI code (visual components change frequently, mutation testing adds little value)
- Code that changes frequently (tests will change too, making mutation testing impractical)
- Prototype or experimental code (focus on functionality first, quality later)
Mutation testing is expensive—it multiplies test execution time by the number of mutations. Use it strategically on code that matters most: code that handles money, security, or critical business rules. For a typical application, apply mutation testing to 10-20% of your codebase: the core logic that absolutely must be correct.
Mutation Testing Workflow
- Achieve High Coverage: Get to 80%+ code coverage first with traditional testing
- Run Mutation Testing: Identify weak tests
- Improve Tests: Kill surviving mutations by adding better assertions and edge cases
- Maintain Score: Set minimum mutation score and prevent regression
- Focus on ROI: Don't chase 100%—focus on critical code
Mutation testing is the final quality check after coverage, code review, and property-based testing.
Course Recommendations
Advanced Python Testing
- Mutation testing in depth
- Test quality metrics beyond coverage
- Combining testing strategies
- Enroll at paiml.com
Software Quality Engineering
- Comprehensive quality metrics
- Testing strategies comparison
- Building quality into processes
- Enroll at paiml.com
Test-Driven Development Mastery
- Writing tests that catch real bugs
- TDD with mutation testing
- Test quality from day one
- Enroll at paiml.com
Quiz
Final Thoughts
Mutation testing is the ultimate test quality metric. It answers the critical question: "Do my tests actually catch bugs, or do they just execute code?" Use it strategically on critical code to build confidence in your test suite. Combined with TDD, property-based testing, and code review, mutation testing ensures your tests truly protect against regressions.
Congratulations on completing this comprehensive guide to Testing in Python! You now have the knowledge to write tests that are fast, reliable, maintainable, and effective at catching real bugs.
📝 Test Your Knowledge: Mutation Testing
Take this quiz to reinforce what you've learned in this chapter.