Property-Based Testing
Chapter 9: Property-Based Testing with Hypothesis
Traditional tests check specific examples: "Does sort([3, 1, 2]) return [1, 2, 3]?" Property-based testing checks universal properties: "Does sort(x) always return a sorted list for any input x?" Instead of writing dozens of examples manually, you define properties and let Hypothesis generate hundreds of test cases automatically. This chapter introduces property-based testing and shows you how to find bugs you'd never think to test for manually.
What is Property-Based Testing?
Property-based testing defines properties that should hold for all inputs, then generates random inputs to verify those properties.
Example-Based Test (Traditional):
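A minimal sketch of such a test, assuming a hypothetical `reverse` helper (not part of any library) as the code under test:

```python
def reverse(lst):
    """Hypothetical function under test: return a reversed copy of lst."""
    return lst[::-1]


def test_reverse_examples():
    # Three hand-picked inputs with their expected outputs
    assert reverse([1, 2, 3]) == [3, 2, 1]
    assert reverse([]) == []
    assert reverse([42]) == [42]
```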
You manually pick three examples and hope they're representative.
Property-Based Test:
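A sketch of the same check written as a property. The `reverse` helper is again a hypothetical stand-in for the code under test:

```python
from hypothesis import given
import hypothesis.strategies as st


def reverse(lst):
    """Hypothetical function under test: return a reversed copy of lst."""
    return lst[::-1]


@given(st.lists(st.integers()))
def test_reverse_twice_is_identity(lst):
    # Property: reversing twice returns the original list
    assert reverse(reverse(lst)) == lst
```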
Hypothesis generates many lists automatically (100 per test by default): empty lists, single elements, long lists, lists with duplicates, negatives, and so on. You define the property ("reversing twice returns the original"); Hypothesis searches for edge cases that break it.
Installing Hypothesis
pip install hypothesis
Hypothesis integrates with pytest seamlessly.
Defining Properties
Good properties are universal truths about your code's behavior.
Property Examples:
For Sorting:
- Output length equals input length
- Output is sorted (each element <= next element)
- Output contains same elements as input
For Reversal:
- Reversing twice returns original
- Length unchanged
- First element becomes last
For Addition:
- Commutative: a + b == b + a
- Associative: (a + b) + c == a + (b + c)
- Identity: a + 0 == a
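As a sketch, the addition properties above can be written directly as Hypothesis tests. Integers are assumed here; with floats, associativity can fail because of rounding.

```python
from hypothesis import given
import hypothesis.strategies as st


@given(st.integers(), st.integers())
def test_addition_is_commutative(a, b):
    assert a + b == b + a


@given(st.integers(), st.integers(), st.integers())
def test_addition_is_associative(a, b, c):
    assert (a + b) + c == a + (b + c)


@given(st.integers())
def test_zero_is_identity(a):
    assert a + 0 == a
```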
Hypothesis Strategies
Strategies tell Hypothesis what kind of data to generate.
Built-in Strategies:
import hypothesis.strategies as st

st.integers()                            # Any integer (unbounded)
st.integers(min_value=0, max_value=100)  # Integers from 0 to 100, inclusive
st.floats()                              # Any float, including NaN and infinities
st.text()                                # Any Unicode string
st.booleans()                            # True or False
st.lists(st.integers())                  # Lists of integers
st.dictionaries(keys=st.text(), values=st.integers())  # Dicts mapping str to int
st.tuples(st.integers(), st.text())      # Tuples of (int, str)
Composite Strategies:
# Lists of 1-10 positive integers
st.lists(st.integers(min_value=1), min_size=1, max_size=10)

# Email-like strings (fullmatch=True makes the whole string match the pattern)
st.from_regex(r"[a-z]+@[a-z]+\.(com|org)", fullmatch=True)

# Custom objects (User is assumed to be a class from your own code)
@st.composite
def users(draw):
    name = draw(st.text(min_size=1))
    age = draw(st.integers(min_value=0, max_value=120))
    return User(name=name, age=age)
Finding Bugs with Hypothesis
Hypothesis excels at finding edge cases you'd never think to test.
Example: Buggy Median Function:
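from hypothesis import given
import hypothesis.strategies as st

def median(numbers):
    sorted_nums = sorted(numbers)
    n = len(sorted_nums)
    return sorted_nums[n // 2]  # Bug: wrong for even-length lists

@given(st.lists(st.integers(), min_size=1, unique=True))
def test_median_property(numbers):
    result = median(numbers)
    # Property: as many elements lie below the median as above it
    below = sum(1 for x in numbers if x < result)
    above = sum(1 for x in numbers if x > result)
    assert below == above
Hypothesis quickly finds an even-length counterexample. For instance, median([1, 2]) returns 2 (the higher value), not the true median 1.5, so one element lies below the result and none above it. A weaker property such as "the result is in the list" would not catch the bug, because the buggy function always returns an element of its input.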
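One possible fix, sketched under the assumption that the median of an even-length list is the mean of its two middle values (the convention used by Python's `statistics.median`, which serves as the reference here). The integer bounds are an arbitrary choice for illustration:

```python
import statistics

from hypothesis import given
import hypothesis.strategies as st


def median(numbers):
    """Return the median; average the two middle values for even lengths."""
    sorted_nums = sorted(numbers)
    n = len(sorted_nums)
    mid = n // 2
    if n % 2 == 1:
        return sorted_nums[mid]
    return (sorted_nums[mid - 1] + sorted_nums[mid]) / 2


@given(st.lists(st.integers(min_value=-10**6, max_value=10**6), min_size=1))
def test_median_matches_statistics(numbers):
    # Compare against the standard library as a trusted reference
    assert median(numbers) == statistics.median(numbers)
```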
Example: Encoding/Decoding:
def encode(text):
    return text.encode('utf-8')

def decode(data):
    return data.decode('utf-8')

@given(st.text())
def test_encode_decode_roundtrip(text):
    # Property: encoding then decoding returns original
    assert decode(encode(text)) == text
This exercises Unicode edge cases automatically, including the empty string and non-ASCII characters.
Shrinking: Finding Minimal Failing Examples
When Hypothesis finds a failure, it "shrinks" the input to find the smallest example that still fails.
Example:
def buggy_sort(lst):
    if len(lst) > 5:
        return sorted(lst)[:-1]  # Bug: drops last element on long lists
    return sorted(lst)

@given(st.lists(st.integers()))
def test_sort_preserves_length(lst):
    assert len(buggy_sort(lst)) == len(lst)
Hypothesis might find this with input [5, 2, 8, 1, 9, 3, 7], then shrink it to [0, 0, 0, 0, 0, 0], the minimal example that triggers the bug: six elements is the shortest list longer than five, and 0 is the simplest integer. Shrinking makes debugging easier.
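Shrinking can also be observed outside a test. The sketch below uses Hypothesis's `find` function, which searches a strategy for a value satisfying a predicate and returns the smallest one it can reach. The `loses_elements` helper is introduced here only for illustration:

```python
from hypothesis import find
import hypothesis.strategies as st


def buggy_sort(lst):
    if len(lst) > 5:
        return sorted(lst)[:-1]  # Bug: drops last element on long lists
    return sorted(lst)


def loses_elements(lst):
    """True when buggy_sort returns a list of a different length."""
    return len(buggy_sort(lst)) != len(lst)


# find() looks for an input that satisfies the predicate, then shrinks it
minimal = find(st.lists(st.integers()), loses_elements)
print(minimal)
```

Whatever list is printed, it is guaranteed to trigger the bug, so it can be pasted straight into a debugger session or an example test.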
Property-Based Testing Best Practices
Start with Simple Properties: Don't overcomplicate. "Output length equals input length" is a great first property.
Combine with Example Tests: Use property tests for general behavior, example tests for specific known edge cases.
Use Realistic Data: Generate data that matches your domain. For emails, use email-like strings, not random text.
Test Invariants: Look for things that should always be true—sorted output, no data loss, reversibility.
Set Bounds: Unbounded generation can be slow. Use min_size, max_size, min_value, max_value to keep tests fast.
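A minimal sketch of bounded, domain-shaped generation. The `total_cents` function, the price range, and the cart size limit are assumptions chosen for illustration:

```python
from hypothesis import given
import hypothesis.strategies as st

MAX_PRICE_CENTS = 1_000_000

# Prices in cents, and carts holding at most 20 items
prices = st.integers(min_value=1, max_value=MAX_PRICE_CENTS)
carts = st.lists(prices, max_size=20)


def total_cents(cart):
    """Hypothetical function under test: sum the item prices of a cart."""
    return sum(cart)


@given(carts)
def test_total_is_bounded(cart):
    total = total_cents(cart)
    # Invariant: the total lies between 0 and (item count * maximum price)
    assert 0 <= total <= len(cart) * MAX_PRICE_CENTS
```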
Real-World Property-Based Testing Examples
Testing a JSON Serializer:
import json

@given(st.dictionaries(st.text(), st.integers()))
def test_json_roundtrip(data):
    # Property: JSON serialization is reversible
    serialized = json.dumps(data)
    deserialized = json.loads(serialized)
    assert deserialized == data
Testing Password Validation:
# validate_password is assumed to return an object with is_valid and errors
@given(st.text(min_size=8, alphabet=st.characters()))
def test_password_validator_reports_outcome(password):
    # Property: a password of 8+ chars is either accepted or rejected with at least one error
    result = validate_password(password)
    assert result.is_valid or len(result.errors) > 0
Testing Database Operations:
# db is assumed to be a test database handle. Reset it between examples,
# because Hypothesis calls the test body many times within a single test.
@given(st.lists(st.integers(min_value=1, max_value=100), unique=True))
def test_database_insert_retrieve(user_ids):
    # Property: inserted IDs can all be retrieved
    for user_id in user_ids:
        db.insert_user(user_id)
    for user_id in user_ids:
        user = db.get_user(user_id)
        assert user is not None
        assert user.id == user_id
When to Use Property-Based Testing
Use Property-Based Testing For:
- Parsers and serializers (roundtrip properties)
- Sorting and data transformation
- Mathematical operations (commutativity, associativity)
- Encoding/decoding
- Data validation
- API contracts
Stick with Example-Based Tests For:
- Specific business rules
- UI behavior
- Known edge cases you want to document
- Simple CRUD operations
Best Strategy: Combine both. Use example tests for specific cases, property tests for general behavior.
Hypothesis Configuration
Control Hypothesis behavior with settings:
from hypothesis import given, settings
import hypothesis.strategies as st

@given(st.lists(st.integers()))
@settings(
    max_examples=1000,  # Run 1000 test cases (default: 100)
    deadline=None,      # Disable the per-example time limit
)
def test_expensive_operation(data):
    result = expensive_operation(data)
    assert verify_result(result)
Common Pitfalls
Pitfall 1: Overly Specific Properties
Don't just reimplement the function in your property test:
# Bad: reimplements sort
@given(st.lists(st.integers()))
def test_sort_bad(lst):
    result = my_sort(lst)
    expected = sorted(lst)  # Just testing against sorted()
    assert result == expected
Better: Test properties:
@given(st.lists(st.integers()))
def test_sort_properties(lst):
    result = my_sort(lst)
    # Property 1: sorted
    assert all(result[i] <= result[i+1] for i in range(len(result)-1))
    # Property 2: same elements
    assert sorted(result) == sorted(lst)
Pitfall 2: Non-Deterministic Code
Property tests assume determinism. Random behavior breaks property testing unless you control the seed.
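One way to keep randomized code testable, sketched here with a hypothetical `shuffled` function, is to pass the random number generator in as a parameter and let Hypothesis supply it through `st.randoms()`. Hypothesis then controls the seed, so a failing run can be replayed:

```python
from hypothesis import given
import hypothesis.strategies as st


def shuffled(items, rng):
    """Hypothetical function under test: return a shuffled copy of items."""
    result = list(items)
    rng.shuffle(result)
    return result


# use_true_random=True yields a real random.Random seeded by Hypothesis
@given(st.lists(st.integers()), st.randoms(use_true_random=True))
def test_shuffle_keeps_elements(items, rng):
    # Property: shuffling changes order only, never the contents
    assert sorted(shuffled(items, rng)) == sorted(items)
```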
Pitfall 3: Slow Properties
Keep property tests fast. Hypothesis runs hundreds of examples—each must be quick.
Hypothesis and Fuzzing
Property-based testing is related to fuzzing: both generate inputs to find bugs. Hypothesis is more structured than naive random fuzzing because it:
- Understands data types: Generates valid integers, not random bytes
- Shrinks failures: Finds minimal failing examples
- Guides generation: Favors boundary values and other edge cases, and can steer toward inputs that maximize a score reported through target()
This makes Hypothesis more effective than naive fuzzing for finding bugs.
Advanced Hypothesis Features
Stateful Testing: Test sequences of operations, not just single function calls.
from hypothesis.stateful import RuleBasedStateMachine, rule
import hypothesis.strategies as st

class BankAccountMachine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.balance = 0

    @rule(amount=st.integers(min_value=1, max_value=1000))
    def deposit(self, amount):
        self.balance += amount
        assert self.balance >= 0

    @rule(amount=st.integers(min_value=1, max_value=100))
    def withdraw(self, amount):
        if amount <= self.balance:
            self.balance -= amount
        assert self.balance >= 0

TestBankAccount = BankAccountMachine.TestCase
Hypothesis generates random sequences of rules (deposit, withdraw, deposit, deposit, withdraw) and tests state transitions automatically.
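The machine above tracks only its own counter. In practice the rules usually drive real code and compare it with a simple model. A sketch of that pattern follows, where the `BankAccount` class is a hypothetical implementation under test and `@invariant()` marks a check that runs after every step:

```python
from hypothesis.stateful import RuleBasedStateMachine, invariant, rule
import hypothesis.strategies as st


class BankAccount:
    """Hypothetical class under test."""

    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


class BankAccountComparison(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.account = BankAccount()  # real implementation
        self.expected = 0             # simple model of the balance

    @rule(amount=st.integers(min_value=1, max_value=1000))
    def deposit(self, amount):
        self.account.deposit(amount)
        self.expected += amount

    @rule(amount=st.integers(min_value=1, max_value=1000))
    def withdraw(self, amount):
        # Only withdraw what the model says is available
        if amount <= self.expected:
            self.account.withdraw(amount)
            self.expected -= amount

    @invariant()
    def balance_matches_model(self):
        assert self.account.balance == self.expected
        assert self.account.balance >= 0


TestBankAccountComparison = BankAccountComparison.TestCase
```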
Custom Strategies: Build complex data generators for your domain.
@st.composite
def valid_email(draw):
    # Code points 97-122 are the lowercase letters a-z
    username = draw(st.text(alphabet=st.characters(min_codepoint=97, max_codepoint=122), min_size=1, max_size=20))
    domain = draw(st.sampled_from(["gmail.com", "yahoo.com", "example.com"]))
    return f"{username}@{domain}"

@given(valid_email())
def test_email_validator(email):
    assert validate_email(email)
Filtering Examples: Exclude invalid inputs.
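from hypothesis import assume, given
import hypothesis.strategies as st

@given(st.integers())
def test_division(n):
    assume(n != 0)  # Skip n=0
    # Integer arithmetic keeps the check exact; 100 / n * n can differ from 100 in floating point
    quotient, remainder = divmod(100, n)
    assert quotient * n + remainder == 100
Use assume() to filter out invalid inputs, but don't filter too much: if most generated inputs are rejected, Hypothesis struggles to find valid examples and reports a failed health check.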
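When a filter would discard many inputs, one alternative, sketched below, is to construct only valid values in the first place. Here nonzero integers are generated as the union of the negative and the positive ranges, so nothing is rejected:

```python
from hypothesis import given
import hypothesis.strategies as st

# Union of two strategies: integers <= -1 or integers >= 1 (never 0)
nonzero_integers = st.integers(max_value=-1) | st.integers(min_value=1)


@given(nonzero_integers)
def test_divmod_reconstructs_dividend(n):
    quotient, remainder = divmod(100, n)
    # divmod guarantees: dividend == quotient * divisor + remainder
    assert quotient * n + remainder == 100
    # The remainder is always smaller in magnitude than the divisor
    assert abs(remainder) < abs(n)
```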
Property Discovery Techniques
Finding good properties requires practice. Here are techniques:
Inverse Operations: If you have encode and decode, test decode(encode(x)) == x.
Invariants: Properties that never change. For sorted lists, all(lst[i] <= lst[i+1] for i in range(len(lst) - 1)). For a set built from a list, len(set(lst)) never exceeds len(lst).
Idempotence: Operations that can be repeated without changing the result. abs(abs(x)) == abs(x). dedupe(dedupe(lst)) == dedupe(lst).
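A sketch of the idempotence checks as Hypothesis tests. The `dedupe` implementation is a hypothetical one that keeps the first occurrence of each item:

```python
from hypothesis import given
import hypothesis.strategies as st


def dedupe(lst):
    """Hypothetical helper: drop repeated items, keeping first occurrences."""
    return list(dict.fromkeys(lst))


@given(st.lists(st.integers()))
def test_dedupe_is_idempotent(lst):
    once = dedupe(lst)
    # Applying dedupe a second time must change nothing
    assert dedupe(once) == once


@given(st.integers())
def test_abs_is_idempotent(x):
    assert abs(abs(x)) == abs(x)
```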
Comparison with Alternative Implementation: If you have a simple but slow implementation and a fast complex one, verify they produce the same results.
@given(st.lists(st.integers()))
def test_fast_sort_matches_slow_sort(lst):
    assert fast_sort(lst) == slow_but_simple_sort(lst)
Metamorphic Properties: Changing input in predictable ways produces predictable output changes.
@given(st.lists(st.integers()))
def test_sort_preserves_reversal(lst):
    # Sorting then reversing should equal reverse-sorting
    sorted_then_reversed = list(reversed(sorted(lst)))
    reverse_sorted = sorted(lst, reverse=True)
    assert sorted_then_reversed == reverse_sorted
Debugging Property Test Failures
When Hypothesis finds a failure, it provides the minimal failing example:
Falsifying example: test_function(lst=[0, 0])
Step 1: Reproduce Locally. Hypothesis prints the exact input that failed. Use it to write a focused example test.
Step 2: Understand the Failure. Why does this input break your property? Is the property wrong or is the code buggy?
Step 3: Fix and Re-test. Fix the bug, then run Hypothesis again to verify the fix handles all cases.
Step 4: Add Example Test. Convert the failing case to an example test to prevent regression and document the edge case.
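As a sketch of Step 4, the reported input can be pinned with Hypothesis's `@example` decorator so that it is tried on every run alongside the generated inputs, and also recorded as a plain example test. The `normalize` function is a hypothetical stand-in for the code that was fixed:

```python
from hypothesis import example, given
import hypothesis.strategies as st


def normalize(lst):
    """Hypothetical function under test: return a sorted copy of lst."""
    return sorted(lst)


@given(lst=st.lists(st.integers()))
@example(lst=[0, 0])  # the reported falsifying example, kept as a regression case
def test_normalize_preserves_length(lst):
    assert len(normalize(lst)) == len(lst)


def test_normalize_regression_duplicate_zeros():
    # Plain example test documenting the same edge case
    assert normalize([0, 0]) == [0, 0]
```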
Integration with pytest
Hypothesis integrates seamlessly with pytest:
# Run property tests
pytest tests/
# Show statistics about the generated examples
pytest --hypothesis-show-statistics
# Seed for reproducibility
pytest --hypothesis-seed=12345
Property tests appear as regular pytest tests in output. Failed properties show the minimal failing example in the error message.
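To run more examples without editing each test, one option is a settings profile, which the pytest plugin can select with its `--hypothesis-profile` option. A minimal sketch for a `conftest.py` follows. The profile names `dev` and `ci`, their example counts, and the `HYPOTHESIS_PROFILE` environment variable are assumptions chosen for illustration:

```python
# conftest.py
import os

from hypothesis import settings

# Fast feedback locally, a more thorough search in CI
settings.register_profile("dev", max_examples=50)
settings.register_profile("ci", max_examples=1000)

# Pick a profile from the environment, falling back to "dev"
settings.load_profile(os.getenv("HYPOTHESIS_PROFILE", "dev"))
```

With these profiles registered, `pytest --hypothesis-profile=ci` runs the suite with the larger example count.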
Getting Started with Hypothesis in Your Project
Step 1: Identify Candidates. Look for functions with clear mathematical properties, parsers, encoders, or data transformations. These benefit most from property testing.
Step 2: Start Simple. Begin with one simple property. Don't try to test everything with properties immediately.
Step 3: Add Incrementally. As you gain confidence, add more property tests. Combine them with your existing example tests.
Step 4: Learn from Failures. When Hypothesis finds a bug, understand why your property failed. This teaches you about your code's behavior.
Step 5: Share with Team. Property testing has a learning curve. Share examples with your team and demonstrate the bugs Hypothesis finds.
Common First Properties to Try:
- Serialization roundtrips: deserialize(serialize(x)) == x
- Reversibility: undo(do(x)) == x
- Length preservation: len(transform(lst)) == len(lst)
- Sorting properties: output is sorted, contains same elements
- Idempotence: f(f(x)) == f(x)
Start with these patterns and expand as you discover more properties in your codebase. Property-based testing complements traditional testing—use both for comprehensive coverage.
Property-based testing represents a paradigm shift from example-based testing. It finds bugs in edge cases you never thought to test manually, making your code more robust and reliable in production environments.