Advanced Testing with PMAT
Chapter 11: Advanced Testing with PMAT
You've mastered unit tests, mocks, coverage, and mutation testing. But how do you enforce these standards across your entire project? How do you ensure test quality doesn't regress over time? This chapter introduces PMAT (Pragmatic AI MCP Agent Toolkit), a comprehensive quality enforcement tool that takes your Python testing to the next level.
What is PMAT?
PMAT is a code quality analysis and enforcement tool designed for extreme testing rigor. Originally built for Rust, it now supports Python testing workflows with mutation testing, quality gates, semantic search, and AI-powered analysis.
Key Capabilities:
- Mutation Testing: Property-based mutation generation with real test execution
- Quality Gates: Enforce coverage, complexity, and test quality thresholds
- Semantic Search: Find untested code paths using natural language
- **M
CP Integration**: Works with Claude Code and AI assistants
- Multi-language: Python, Rust, TypeScript, Go, C++
Install PMAT:
# Via cargo (Rust package manager)
cargo install pmat
# Verify installation
pmat --versionPython Mutation Testing with PMAT
Mutation testing measures test quality by introducing bugs and checking if tests catch them. PMAT generates mutations from Python AST and runs pytest on each one.
Example:
PMAT generates mutations:
price * 0.9→price * 0.8(different discount)if is_member→if not is_member(inverted logic)return price * 0.9→return price(removed operation)
Then runs pytest on each mutation. If tests pass with a mutation, it survives—revealing a test gap.
Running PMAT Mutation Tests
# Generate mutations for Python file
pmat mutate generate calculator.py
# Run tests on mutations (requires pytest)
pmat mutate test calculator.py
# View mutation score
pmat mutate reportOutput example:
Mutation Testing Results
========================
File: calculator.py
Total mutations: 15
Killed: 13
Survived: 2
Mutation Score: 86.7%
Survivors:
Line 5: `if is_member` → `if not is_member`
Line 10: `price * 0.9` → `price * 0.95`86.7% is strong. The 2 survivors show specific test gaps.
Quality Gates with PMAT
Quality gates enforce minimum standards. PMAT fails builds if thresholds aren't met.
Configuration (.pmat-gates.toml):
[gates]
min_coverage = 80.0
max_cyclomatic_complexity = 15
max_cognitive_complexity = 20
min_mutation_score = 75.0
zero_satd = true # No TODO/FIXME allowedRun quality gates:
pmat quality-gate
# Strict mode (fail on warnings)
pmat quality-gate --strictOutput example:
✅ Coverage: 85% (threshold: 80%)
✅ Cyclomatic Complexity: 12 (max: 15)
✅ Cognitive Complexity: 18 (max: 20)
❌ Mutation Score: 67% (min: 75%)
❌ SATD: 3 instances found (threshold: 0)
FAILED: 2/5 gates passedFix the failures before deploying.
Semantic Code Search
PMAT's semantic search finds code by meaning, not just keywords. Useful for finding untested edge cases.
Example:
# Find authentication-related code
pmat semantic search "user authentication logic"
# Find error handling
pmat semantic search "exception handling patterns"
# Find similar code (potential duplicates)
pmat analyze similar --file auth.pyThis helps identify code that needs test coverage but might have been missed by keyword searches.
Integrating PMAT with CI/CD
Add PMAT to your GitHub Actions or GitLab CI:
GitHub Actions (.github/workflows/quality.yml):
name: Quality Gates
on: [push, pull_request]
jobs:
pmat-quality:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Install PMAT
run: cargo install pmat
- name: Run Quality Gates
run: pmat quality-gate --strict
- name: Run Mutation Tests
run: pmat mutate test src/
- name: Upload Reports
uses: actions/upload-artifact@v3
with:
name: pmat-reports
path: reports/This ensures every commit meets quality standards.
PMAT vs Other Tools
PMAT vs pytest-cov:
- pytest-cov measures execution (what ran)
- PMAT measures quality (what worked correctly)
- Use both: pytest-cov for coverage, PMAT for mutation score
PMAT vs mutmut:
- mutmut: Pure mutation testing
- PMAT: Mutation testing + quality gates + semantic search + MCP integration
- PMAT is faster and more comprehensive
PMAT vs pylint:
- pylint: Static analysis (finds style issues)
- PMAT: Dynamic analysis (runs tests, measures quality)
- Complementary tools—use both
MCP Integration for AI-Powered Testing
PMAT provides an MCP (Model Context Protocol) server that integrates with Claude Code and other AI assistants.
Setup (~/.config/Claude/claude_desktop_config.json):
{
"mcpServers": {
"pmat": {
"command": "pmat",
"args": ["agent", "mcp-server"]
}
}
}Available MCP Tools:
pmat_quality_check- Run quality gatespmat_mutation_test- Execute mutation testingpmat_semantic_search- Search code semanticallypmat_analyze_cluster- Find code patterns
AI assistants can now analyze your code quality and suggest improvements.
Advanced Mutation Operators
PMAT supports Python-specific mutation operators:
Binary Operators (AOR):
+→-,*,/-→+
Relational Operators (ROR):
>→>=,<,====→!=
Logical Operators (LOR):
and→oror→and
Identity Operators:
is→is notis not→is
Membership Operators:
in→not innot in→in
These operators generate realistic bugs that tests should catch.
Real-World PMAT Workflow
Step 1: Write Code with Type Hints
def process_payment(amount: float, currency: str = "USD") -> dict:
"""Process payment and return receipt."""
if amount <= 0:
raise ValueError("Amount must be positive")
fee = amount * 0.029 # 2.9% processing fee
total = amount + fee
return {
"amount": amount,
"fee": fee,
"total": total,
"currency": currency
}Type hints help PMAT generate better mutations and provide better error messages.
Step 2: Write Tests (80%+ Coverage)
import pytest
def test_process_payment_valid():
result = process_payment(100.0)
assert result["amount"] == 100.0
assert result["total"] == 102.90
def test_process_payment_invalid_amount():
with pytest.raises(ValueError):
process_payment(0)These tests give 100% coverage but don't test edge cases thoroughly.
Step 3: Run Mutation Tests
pmat mutate test src/payment.pyOutput shows survivors:
Survived: amount * 0.029 → amount * 0.030
Survived: amount + fee → amount - feeTests don't verify the exact fee calculation!
Step 4: Improve Tests to Kill Mutations
def test_process_payment_fee_calculation():
result = process_payment(100.0)
assert result["fee"] == 2.90 # Exact fee check
def test_process_payment_total_calculation():
result = process_payment(100.0)
assert result["total"] == result["amount"] + result["fee"]Re-run: All mutations killed. 100% mutation score.
Step 5: Enforce Gates
pmat quality-gate --strictAll gates pass. Ready to commit.
Step 6: CI Integration
Add to GitHub Actions. Every commit runs PMAT.
Step 7: Maintain Quality
Monitor mutation score trends. Don't let it regress.
This workflow ensures test quality never regresses. It catches bugs before they reach production.
Performance Considerations
Mutation testing is slow. Optimize:
- Parallel Execution: PMAT runs mutations concurrently
- Smart Sampling: Test representative mutations, not all
- Incremental: Only mutate changed files
- CI Scheduling: Run full mutation tests nightly, not per commit
Example:
# Fast: Test only changed files
pmat mutate test --changed
# Fast: Parallel execution
pmat mutate test --jobs 8
# Slow but thorough: Full project
pmat mutate test .PMAT Configuration Best Practices
Start Conservative (New Projects):
[gates]
min_coverage = 70.0
min_mutation_score = 60.0
max_cyclomatic_complexity = 20
zero_satd = falseUse these settings when starting with PMAT. They're achievable and build momentum. Teams need time to adapt to mutation testing workflows.
Graduate to Strict (Mature Projects):
[gates]
min_coverage = 85.0
min_mutation_score = 80.0
max_cyclomatic_complexity = 15
zero_satd = trueAfter 2-3 months, increase standards. By now, the team understands mutation testing and can write high-quality tests from the start.
Production Critical (Financial, Healthcare, Security):
[gates]
min_coverage = 90.0
min_mutation_score = 85.0
max_cyclomatic_complexity = 10
max_cognitive_complexity = 15
zero_satd = trueFor code that handles money, health data, or security, use extreme standards. 85%+ mutation score means tests catch nearly all bugs. Complexity limits keep code maintainable.
Per-Module Overrides:
[[overrides]]
path = "src/experimental/**"
min_mutation_score = 50.0 # Lower for experimental code
[[overrides]]
path = "src/auth/**"
min_mutation_score = 95.0 # Higher for auth logicApply different standards to different parts of your codebase. Experimental features get slack; security-critical code gets scrutiny.
Gradually increase standards as team matures and testing skills improve.
Common PMAT Pitfalls and Solutions
Pitfall 1: Mutation Testing Takes Too Long
Problem: Full mutation testing takes hours on large codebases.
Solutions:
- Run incremental tests:
pmat mutate test --changed - Use sampling: Test 10% of mutations randomly
- Schedule nightly runs instead of per-commit
- Focus on critical modules first
Pitfall 2: Too Many False Positives
Problem: Quality gates fail on trivial issues.
Solutions:
- Start with conservative thresholds (60% mutation score)
- Exclude generated code and migrations
- Use
[[overrides]]for different module standards - Gradually tighten over 3-6 months
Pitfall 3: Equivalent Mutations
Problem: Some mutations can't be killed (e.g., x > 5 vs x >= 6).
Solutions:
- Accept 85-90% mutation score, not 100%
- Mark equivalent mutations as ignored
- Focus on killing high-value mutations
- Don't waste time on mathematical equivalents
Pitfall 4: Team Resistance
Problem: Developers see PMAT as bureaucratic overhead.
Solutions:
- Show real bugs caught by mutation testing
- Start with one critical module
- Make CI feedback fast (<5 minutes)
- Celebrate improved mutation scores
- Train team on mutation testing principles
Pitfall 5: Over-Optimization
Problem: Chasing 100% mutation score wastes time.
Solutions:
- Diminishing returns above 85%
- Focus on business-critical code
- Skip trivial getters/setters
- Balance quality with delivery speed
Troubleshooting PMAT
"No mutations generated":
- Check file has type annotations
- Verify PMAT supports your Python version (3.8+)
- Run
pmat --debug mutate generate file.py
"All tests timeout":
- Reduce timeout:
--timeout 30 - Check for infinite loops in code
- Run with fewer parallel jobs:
--jobs 2
"Quality gates fail unexpectedly":
- Check
.pmat-gates.tomlsyntax - Review thresholds are realistic
- Run
pmat quality-gate --verbosefor details
"MCP integration doesn't work":
- Verify Claude Desktop config path
- Check
pmat agent mcp-serverruns - Review Claude logs:
~/.config/Claude/logs/
Course Recommendations
Advanced Python Testing Automation
- PMAT deep dive and configuration
- Mutation testing strategies
- Quality gate design patterns
- Enroll at paiml.com
DevOps Quality Engineering
- CI/CD quality integration
- Automated quality enforcement
- Building quality pipelines
- Enroll at paiml.com
AI-Powered Development Tools
- MCP integration patterns
- AI-assisted testing workflows
- Building AI development tools
- Enroll at paiml.com
Quiz
Final Thoughts
PMAT elevates Python testing from "good enough" to "production-grade quality enforcement." By combining mutation testing, quality gates, semantic search, and AI integration, PMAT ensures your test suite genuinely protects against bugs. Use it to enforce standards, catch test gaps, and maintain quality as your project grows. Combined with pytest, coverage tools, and the techniques from earlier chapters, PMAT completes your Python testing toolkit.
📝 Test Your Knowledge: Advanced Testing with PMAT
Take this quiz to reinforce what you've learned in this chapter.