Advanced Testing with PMAT

Chapter 11: Advanced Testing with PMAT

You've mastered unit tests, mocks, coverage, and mutation testing. But how do you enforce these standards across your entire project? How do you ensure test quality doesn't regress over time? This chapter introduces PMAT (Pragmatic AI MCP Agent Toolkit), a comprehensive quality enforcement tool that takes your Python testing to the next level.

What is PMAT?

PMAT is a code quality analysis and enforcement tool designed for extreme testing rigor. Originally built for Rust, it now supports Python testing workflows with mutation testing, quality gates, semantic search, and AI-powered analysis.

Key Capabilities:

Mutation Testing: Property-based mutation generation with real test execution
Quality Gates: Enforce coverage, complexity, and test quality thresholds
Semantic Search: Find untested code paths using natural language
**M

CP Integration**: Works with Claude Code and AI assistants

Multi-language: Python, Rust, TypeScript, Go, C++

Install PMAT:

# Via cargo (Rust package manager)
cargo install pmat

# Verify installation
pmat --version

Python Mutation Testing with PMAT

Mutation testing measures test quality by introducing bugs and checking if tests catch them. PMAT generates mutations from Python AST and runs pytest on each one.

Example:

PMAT generates mutations:

price * 0.9 → price * 0.8 (different discount)
if is_member → if not is_member (inverted logic)
return price * 0.9 → return price (removed operation)

Then runs pytest on each mutation. If tests pass with a mutation, it survives—revealing a test gap.

Running PMAT Mutation Tests

# Generate mutations for Python file
pmat mutate generate calculator.py

# Run tests on mutations (requires pytest)
pmat mutate test calculator.py

# View mutation score
pmat mutate report

Output example:

Mutation Testing Results
========================
File: calculator.py
Total mutations: 15
Killed: 13
Survived: 2
Mutation Score: 86.7%

Survivors:
  Line 5: `if is_member` → `if not is_member`
  Line 10: `price * 0.9` → `price * 0.95`

86.7% is strong. The 2 survivors show specific test gaps.

Quality Gates with PMAT

Quality gates enforce minimum standards. PMAT fails builds if thresholds aren't met.

Configuration (.pmat-gates.toml):

[gates]
min_coverage = 80.0
max_cyclomatic_complexity = 15
max_cognitive_complexity = 20
min_mutation_score = 75.0
zero_satd = true  # No TODO/FIXME allowed

Run quality gates:

pmat quality-gate

# Strict mode (fail on warnings)
pmat quality-gate --strict

Output example:

✅ Coverage: 85% (threshold: 80%)
✅ Cyclomatic Complexity: 12 (max: 15)
✅ Cognitive Complexity: 18 (max: 20)
❌ Mutation Score: 67% (min: 75%)
❌ SATD: 3 instances found (threshold: 0)

FAILED: 2/5 gates passed

Fix the failures before deploying.

Semantic Code Search

PMAT's semantic search finds code by meaning, not just keywords. Useful for finding untested edge cases.

Example:

# Find authentication-related code
pmat semantic search "user authentication logic"

# Find error handling
pmat semantic search "exception handling patterns"

# Find similar code (potential duplicates)
pmat analyze similar --file auth.py

This helps identify code that needs test coverage but might have been missed by keyword searches.

Integrating PMAT with CI/CD

Add PMAT to your GitHub Actions or GitLab CI:

GitHub Actions (.github/workflows/quality.yml):

name: Quality Gates

on: [push, pull_request]

jobs:
  pmat-quality:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Install PMAT
        run: cargo install pmat

      - name: Run Quality Gates
        run: pmat quality-gate --strict

      - name: Run Mutation Tests
        run: pmat mutate test src/

      - name: Upload Reports
        uses: actions/upload-artifact@v3
        with:
          name: pmat-reports
          path: reports/

This ensures every commit meets quality standards.

PMAT vs Other Tools

PMAT vs pytest-cov:

pytest-cov measures execution (what ran)
PMAT measures quality (what worked correctly)
Use both: pytest-cov for coverage, PMAT for mutation score

PMAT vs mutmut:

mutmut: Pure mutation testing
PMAT: Mutation testing + quality gates + semantic search + MCP integration
PMAT is faster and more comprehensive

PMAT vs pylint:

pylint: Static analysis (finds style issues)
PMAT: Dynamic analysis (runs tests, measures quality)
Complementary tools—use both

MCP Integration for AI-Powered Testing

PMAT provides an MCP (Model Context Protocol) server that integrates with Claude Code and other AI assistants.

Setup (~/.config/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "pmat": {
      "command": "pmat",
      "args": ["agent", "mcp-server"]
    }
  }
}

Available MCP Tools:

pmat_quality_check - Run quality gates
pmat_mutation_test - Execute mutation testing
pmat_semantic_search - Search code semantically
pmat_analyze_cluster - Find code patterns

AI assistants can now analyze your code quality and suggest improvements.

Advanced Mutation Operators

PMAT supports Python-specific mutation operators:

Binary Operators (AOR):

+ → -, *, /
- → +

Relational Operators (ROR):

> → >=, <, ==
== → !=

Logical Operators (LOR):

and → or
or → and

Identity Operators:

is → is not
is not → is

Membership Operators:

in → not in
not in → in

These operators generate realistic bugs that tests should catch.

Real-World PMAT Workflow

Step 1: Write Code with Type Hints

def process_payment(amount: float, currency: str = "USD") -> dict:
    """Process payment and return receipt."""
    if amount <= 0:
        raise ValueError("Amount must be positive")

    fee = amount * 0.029  # 2.9% processing fee
    total = amount + fee

    return {
        "amount": amount,
        "fee": fee,
        "total": total,
        "currency": currency
    }

Type hints help PMAT generate better mutations and provide better error messages.

Step 2: Write Tests (80%+ Coverage)

import pytest

def test_process_payment_valid():
    result = process_payment(100.0)
    assert result["amount"] == 100.0
    assert result["total"] == 102.90

def test_process_payment_invalid_amount():
    with pytest.raises(ValueError):
        process_payment(0)

These tests give 100% coverage but don't test edge cases thoroughly.

Step 3: Run Mutation Tests

pmat mutate test src/payment.py

Output shows survivors:

Survived: amount * 0.029 → amount * 0.030
Survived: amount + fee → amount - fee

Tests don't verify the exact fee calculation!

Step 4: Improve Tests to Kill Mutations

def test_process_payment_fee_calculation():
    result = process_payment(100.0)
    assert result["fee"] == 2.90  # Exact fee check

def test_process_payment_total_calculation():
    result = process_payment(100.0)
    assert result["total"] == result["amount"] + result["fee"]

Re-run: All mutations killed. 100% mutation score.

Step 5: Enforce Gates

pmat quality-gate --strict

All gates pass. Ready to commit.

Step 6: CI Integration
Add to GitHub Actions. Every commit runs PMAT.

Step 7: Maintain Quality
Monitor mutation score trends. Don't let it regress.

This workflow ensures test quality never regresses. It catches bugs before they reach production.

Performance Considerations

Mutation testing is slow. Optimize:

Parallel Execution: PMAT runs mutations concurrently
Smart Sampling: Test representative mutations, not all
Incremental: Only mutate changed files
CI Scheduling: Run full mutation tests nightly, not per commit

Example:

# Fast: Test only changed files
pmat mutate test --changed

# Fast: Parallel execution
pmat mutate test --jobs 8

# Slow but thorough: Full project
pmat mutate test .

PMAT Configuration Best Practices

Start Conservative (New Projects):

[gates]
min_coverage = 70.0
min_mutation_score = 60.0
max_cyclomatic_complexity = 20
zero_satd = false

Use these settings when starting with PMAT. They're achievable and build momentum. Teams need time to adapt to mutation testing workflows.

Graduate to Strict (Mature Projects):

[gates]
min_coverage = 85.0
min_mutation_score = 80.0
max_cyclomatic_complexity = 15
zero_satd = true

After 2-3 months, increase standards. By now, the team understands mutation testing and can write high-quality tests from the start.

Production Critical (Financial, Healthcare, Security):

[gates]
min_coverage = 90.0
min_mutation_score = 85.0
max_cyclomatic_complexity = 10
max_cognitive_complexity = 15
zero_satd = true

For code that handles money, health data, or security, use extreme standards. 85%+ mutation score means tests catch nearly all bugs. Complexity limits keep code maintainable.

Per-Module Overrides:

[[overrides]]
path = "src/experimental/**"
min_mutation_score = 50.0  # Lower for experimental code

[[overrides]]
path = "src/auth/**"
min_mutation_score = 95.0  # Higher for auth logic

Apply different standards to different parts of your codebase. Experimental features get slack; security-critical code gets scrutiny.

Gradually increase standards as team matures and testing skills improve.

Common PMAT Pitfalls and Solutions

Pitfall 1: Mutation Testing Takes Too Long

Problem: Full mutation testing takes hours on large codebases.

Solutions:

Run incremental tests: pmat mutate test --changed
Use sampling: Test 10% of mutations randomly
Schedule nightly runs instead of per-commit
Focus on critical modules first

Pitfall 2: Too Many False Positives

Problem: Quality gates fail on trivial issues.

Solutions:

Start with conservative thresholds (60% mutation score)
Exclude generated code and migrations
Use [[overrides]] for different module standards
Gradually tighten over 3-6 months

Pitfall 3: Equivalent Mutations

Problem: Some mutations can't be killed (e.g., x > 5 vs x >= 6).

Solutions:

Accept 85-90% mutation score, not 100%
Mark equivalent mutations as ignored
Focus on killing high-value mutations
Don't waste time on mathematical equivalents

Pitfall 4: Team Resistance

Problem: Developers see PMAT as bureaucratic overhead.

Solutions:

Show real bugs caught by mutation testing
Start with one critical module
Make CI feedback fast (<5 minutes)
Celebrate improved mutation scores
Train team on mutation testing principles

Pitfall 5: Over-Optimization

Problem: Chasing 100% mutation score wastes time.

Solutions:

Diminishing returns above 85%
Focus on business-critical code
Skip trivial getters/setters
Balance quality with delivery speed

Troubleshooting PMAT

"No mutations generated":

Check file has type annotations
Verify PMAT supports your Python version (3.8+)
Run pmat --debug mutate generate file.py

"All tests timeout":

Reduce timeout: --timeout 30
Check for infinite loops in code
Run with fewer parallel jobs: --jobs 2

"Quality gates fail unexpectedly":

Check .pmat-gates.toml syntax
Review thresholds are realistic
Run pmat quality-gate --verbose for details

"MCP integration doesn't work":

Verify Claude Desktop config path
Check pmat agent mcp-server runs
Review Claude logs: ~/.config/Claude/logs/

Course Recommendations

Advanced Python Testing Automation

PMAT deep dive and configuration
Mutation testing strategies
Quality gate design patterns
Enroll at paiml.com

DevOps Quality Engineering

CI/CD quality integration
Automated quality enforcement
Building quality pipelines
Enroll at paiml.com

AI-Powered Development Tools

MCP integration patterns
AI-assisted testing workflows
Building AI development tools
Enroll at paiml.com

Quiz

Final Thoughts

PMAT elevates Python testing from "good enough" to "production-grade quality enforcement." By combining mutation testing, quality gates, semantic search, and AI integration, PMAT ensures your test suite genuinely protects against bugs. Use it to enforce standards, catch test gaps, and maintain quality as your project grows. Combined with pytest, coverage tools, and the techniques from earlier chapters, PMAT completes your Python testing toolkit.

📝 Test Your Knowledge: Advanced Testing with PMAT

Take this quiz to reinforce what you've learned in this chapter.