Chapter 3: Python Data Structures
Overview
Python's built-in data structures - lists, tuples, dictionaries, and sets - are the foundation of data manipulation. This chapter also introduces NumPy arrays and Pandas DataFrames, essential tools for data science. Mastering these structures enables efficient data processing, analysis, and transformation.
Learning Objectives
By the end of this chapter, you will be able to:
- Create and manipulate lists and tuples with indexing, slicing, and methods
- Work with dictionaries for key-value data storage
- Use sets for unique collections and set operations
- Understand NumPy arrays for numerical computing
- Leverage Pandas DataFrames and Series for data analysis
- Choose the right data structure for different tasks
3.1 Use Lists and Tuples
Creating Lists
Lists are ordered, mutable collections:
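A minimal sketch of a few common ways to build a list; the variable names and values are illustrative, not taken from the chapter:

```python
# Illustrative values only.
numbers = [1, 2, 3, 4, 5]            # list literal
mixed = [1, "two", 3.0, True]        # a list may hold mixed types
empty = []                           # empty list
from_range = list(range(5))          # [0, 1, 2, 3, 4]
squares = [n * n for n in numbers]   # list comprehension: [1, 4, 9, 16, 25]
```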
List Indexing
Access list elements by position:
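For instance, assuming a small hypothetical list named `fruits`, positive, negative, and slice indexes behave roughly like this:

```python
fruits = ["apple", "banana", "cherry", "date"]  # hypothetical sample data

first = fruits[0]          # "apple" (indexes start at 0)
last = fruits[-1]          # "date" (negative indexes count from the end)
middle = fruits[1:3]       # ["banana", "cherry"] (stop index is excluded)
every_other = fruits[::2]  # ["apple", "cherry"] (step of 2)
```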
Adding to Lists
Lists are mutable - you can modify them:
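A short sketch of the three usual ways to grow a list, using made-up sample values:

```python
fruits = ["apple", "banana"]            # hypothetical sample data

fruits.append("cherry")                 # add one item at the end
fruits.insert(1, "blueberry")           # insert before index 1
fruits.extend(["date", "elderberry"])   # append every item of another iterable
# fruits is now:
# ["apple", "blueberry", "banana", "cherry", "date", "elderberry"]
```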
Modifying Lists
Change and swap list elements:
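One possible illustration, with placeholder values, of assigning by index, swapping two positions, and replacing a slice:

```python
letters = ["a", "b", "c", "d"]                   # placeholder values

letters[0] = "A"                                 # replace a single element
letters[1], letters[2] = letters[2], letters[1]  # swap via tuple assignment
letters[2:] = ["x", "y"]                         # replace a slice in place
# letters is now ["A", "c", "x", "y"]
```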
Removing from Lists
Multiple ways to remove items:
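The sketch below walks through the common removal tools on a throwaway list; the names are illustrative:

```python
items = ["a", "b", "c", "d", "b"]   # throwaway sample data

items.remove("b")      # remove the first matching value -> ["a", "c", "d", "b"]
popped = items.pop()   # remove and return the last item -> "b"
second = items.pop(1)  # remove and return the item at index 1 -> "c"
del items[0]           # delete by index -> ["d"]
remaining = list(items)
items.clear()          # empty the list in place
```

Note that `remove()` raises `ValueError` when the value is absent, and `pop()` raises `IndexError` on an empty list.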
Creating Tuples
Tuples are ordered, immutable collections:
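A brief sketch, with illustrative names, showing tuple creation and what happens when code tries to modify one:

```python
point = (3, 4)       # tuple literal
single = (42,)       # a one-element tuple needs the trailing comma
packed = 1, 2, 3     # parentheses are optional when packing

assignment_rejected = False
try:
    point[0] = 10    # tuples do not support item assignment
except TypeError:
    assignment_rejected = True
```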
Sequence Operations
Lists and tuples share many operations:
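As a rough illustration (sample values are made up), the same operators and methods work on both types:

```python
nums_list = [1, 2, 3]
nums_tuple = (1, 2, 3)

same_length = len(nums_list) == len(nums_tuple)  # len() works on both
combined = nums_list + [4, 5]                    # concatenation -> [1, 2, 3, 4, 5]
repeated = nums_tuple * 2                        # repetition -> (1, 2, 3, 1, 2, 3)
has_two = 2 in nums_tuple                        # membership test
position = nums_list.index(3)                    # index of first match -> 2
ones = repeated.count(1)                         # number of occurrences -> 2
```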
Unpacking
Extract values from sequences:
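A small sketch of basic and starred unpacking; all names here are illustrative:

```python
x, y = (3, 4)                   # one name per element
first, *rest = [1, 2, 3, 4]     # starred name collects the remainder as a list
head, *middle, tail = "abcde"   # unpacking works on any iterable, including strings
x, y = y, x                     # swap two variables without a temporary
```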
List as Stack (LIFO)
Use lists as Last-In-First-Out stacks:
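One way to sketch a stack with a plain list is to push with `append()` and pop with `pop()`; the task names below are placeholders:

```python
stack = []

stack.append("task1")   # push
stack.append("task2")
stack.append("task3")

top = stack[-1]         # peek at the top without removing it -> "task3"
popped = stack.pop()    # pop the most recently pushed item -> "task3"
next_popped = stack.pop()  # -> "task2"
```

Both operations work at the end of the list, so they avoid shifting the other elements.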
3.2 Explore Dictionaries
Creating Dictionaries
Dictionaries map keys to values:
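A few common construction styles, sketched with invented sample data:

```python
person = {"name": "Ada", "age": 36}        # literal syntax
from_pairs = dict([("a", 1), ("b", 2)])   # from an iterable of key-value pairs
from_kwargs = dict(x=1, y=2)              # from keyword arguments (string keys)
squares = {n: n * n for n in range(4)}    # dict comprehension
```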
Dictionary Operations
Access and modify dictionaries:
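The following sketch uses a hypothetical `inventory` dictionary to show lookup, update, insertion, deletion, and membership:

```python
inventory = {"apples": 3, "pears": 5}   # hypothetical sample data

apples = inventory["apples"]      # look up by key -> 3
inventory["apples"] = 4           # update an existing key
inventory["plums"] = 7            # add a new key
del inventory["pears"]            # remove a key
has_plums = "plums" in inventory  # membership tests check keys
```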
Dictionary Methods
Work with keys, values, and items:
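A minimal sketch with made-up scores; `keys()`, `values()`, and `items()` return view objects, which are converted to lists here for display:

```python
scores = {"alice": 90, "bob": 85}   # made-up sample data

names = list(scores.keys())         # ["alice", "bob"]
values = list(scores.values())      # [90, 85]
pairs = list(scores.items())        # [("alice", 90), ("bob", 85)]

# Iterate over key-value pairs together.
report = [f"{name}: {score}" for name, score in scores.items()]

# update() merges another mapping, overwriting existing keys.
scores.update({"carol": 77, "bob": 88})
```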
Safe Dictionary Access
Use get() to avoid KeyError:
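As an illustration, assuming a hypothetical `config` dictionary, `get()` returns a default instead of raising when the key is missing:

```python
config = {"debug": True}                     # hypothetical settings

debug = config.get("debug")                  # key exists -> True
timeout = config.get("timeout")              # missing key -> None
retries = config.get("retries", 3)           # missing key with explicit default -> 3
level = config.setdefault("level", "info")   # returns the default and stores it
```

`get()` leaves the dictionary unchanged, while `setdefault()` inserts the key when it is absent.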
Dictionary Keys Must Be Immutable
Dictionary keys must be hashable, which in practice means immutable built-in types such as strings, numbers, and tuples of hashable values:
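A short sketch of which keys are accepted; the names are illustrative:

```python
grid = {(0, 0): "origin", (1, 2): "point"}   # tuples of numbers work as keys
groups = {frozenset({1, 2}): "pair"}         # frozenset is the immutable set type

list_key_rejected = False
try:
    bad = {[0, 0]: "origin"}                 # lists are mutable, so not hashable
except TypeError:
    list_key_rejected = True
```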
3.3 Dive into Sets
Creating Sets
Sets store unique, unordered elements:
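A brief sketch of set creation with sample values; note that `{}` creates an empty dictionary, so an empty set needs `set()`:

```python
unique = {1, 2, 2, 3, 3, 3}          # duplicates collapse -> {1, 2, 3}
from_list = set(["a", "b", "a"])     # build from any iterable -> {"a", "b"}
letters = set("hello")               # unique characters -> {"h", "e", "l", "o"}
empty = set()                        # empty set ({} would be a dict)
```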
Set Operations
Add, remove, and check membership:
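For example, using a hypothetical set of tags:

```python
tags = {"python", "data"}   # hypothetical sample data

tags.add("numpy")           # add an element
tags.add("python")          # adding an existing element has no effect
tags.remove("data")         # remove; raises KeyError if the element is missing
tags.discard("missing")     # remove if present; never raises
has_numpy = "numpy" in tags # membership test
```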
Mathematical Set Operations
Perform union, intersection, difference:
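The operators below are one way to express these operations (the equivalent methods are `union()`, `intersection()`, `difference()`, and `symmetric_difference()`); the sample sets are made up:

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b          # elements in either set -> {1, 2, 3, 4, 5}
intersection = a & b   # elements in both sets -> {3, 4}
difference = a - b     # elements in a but not b -> {1, 2}
symmetric = a ^ b      # elements in exactly one set -> {1, 2, 5}
is_subset = {1, 2} <= a
```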
3.4 Work with NumPy Arrays
NumPy provides efficient multi-dimensional arrays for numerical computing:
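A minimal sketch, assuming NumPy is installed and imported under the conventional alias `np`:

```python
import numpy as np

vector = np.array([1, 2, 3])                  # 1D array from a list
matrix = np.array([[1, 2, 3], [4, 5, 6]])     # 2D array from nested lists
doubled = vector * 2                          # arithmetic applies to every element
```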
NumPy Array Creation
Use arange and reshape:
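A small sketch, assuming NumPy is available as `np`; the shapes are arbitrary examples:

```python
import numpy as np

flat = np.arange(12)         # integers 0 through 11
grid = flat.reshape(3, 4)    # same data arranged as 3 rows by 4 columns
zeros = np.zeros((2, 3))     # 2x3 array filled with 0.0
```

`reshape` requires the new shape to hold the same number of elements as the original (here 3 x 4 = 12).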
NumPy Array Introspection
Examine array properties:
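The attributes below are the usual starting points for inspecting an array; the array itself is an arbitrary example:

```python
import numpy as np

arr = np.zeros((3, 4))   # arbitrary example array of float64 values

shape = arr.shape        # (3, 4): size along each dimension
ndim = arr.ndim          # 2: number of dimensions
size = arr.size          # 12: total number of elements
dtype = arr.dtype        # float64: element type
nbytes = arr.nbytes      # 96: total bytes (12 elements x 8 bytes each)
```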
NumPy Data Types
Control memory usage with data types:
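As a rough illustration of the memory difference, the same three values stored with different `dtype` settings:

```python
import numpy as np

small = np.array([1, 2, 3], dtype=np.int8)    # 1 byte per element
large = np.array([1, 2, 3], dtype=np.int64)   # 8 bytes per element

small_bytes = small.nbytes                    # 3
large_bytes = large.nbytes                    # 24

as_float = small.astype(np.float32)           # astype() returns a converted copy
```

Smaller integer types save memory but overflow sooner; `int8` only covers -128 through 127.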
NumPy Array Slicing
Basic slicing returns a view that shares memory with the original array, not a copy:
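A sketch of the view behavior, assuming NumPy as `np`; writing through a slice changes the original array, while `.copy()` produces independent data:

```python
import numpy as np

arr = np.arange(10)

window = arr[2:5]                 # view onto elements 2, 3, 4
window[0] = 99                    # writes through to arr[2]
shares = np.shares_memory(arr, window)

independent = arr[2:5].copy()     # explicit copy
independent[1] = -1               # arr is not affected
```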
NumPy Math Operations
Element-wise operations on arrays:
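For example, with two small made-up arrays of the same shape:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])

total = a + b                               # [11.0, 22.0, 33.0]
product = a * b                             # [10.0, 40.0, 90.0]
squared = a ** 2                            # [1.0, 4.0, 9.0]
scaled = a * 2                              # a scalar applies to every element
roots = np.sqrt(np.array([1.0, 4.0, 9.0]))  # [1.0, 2.0, 3.0]
```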
NumPy Matrix Operations
Linear algebra operations:
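A minimal sketch using a 2x2 example matrix; note that `@` is matrix multiplication, whereas `*` multiplies element by element:

```python
import numpy as np

m = np.array([[1.0, 2.0], [3.0, 4.0]])   # example matrix
identity = np.eye(2)                     # 2x2 identity matrix

matmul = m @ identity                    # matrix product (equals m)
transposed = m.T                         # rows and columns swapped
det = np.linalg.det(m)                   # determinant, approximately -2.0
inverse = np.linalg.inv(m)               # matrix inverse
check = m @ inverse                      # approximately the identity matrix
```

Floating-point results are approximate, so comparisons are best done with `np.allclose` or `np.isclose`.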
3.5 Use Pandas DataFrames
Pandas DataFrames are table-like 2D structures - essential for data science:
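One common way to build a DataFrame is from a dictionary of columns. The sketch below assumes pandas is installed and imported as `pd`; the data is invented:

```python
import pandas as pd

# Invented sample data: each key becomes a column.
df = pd.DataFrame({
    "name": ["Ada", "Bo", "Cy"],
    "age": [25, 35, 45],
    "city": ["Oslo", "Lima", "Oslo"],
})

shape = df.shape                  # (3, 3): rows, columns
columns = list(df.columns)        # ["name", "age", "city"]
```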
DataFrame Exploration
Inspect DataFrames with head, tail, describe:
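A short sketch on an invented DataFrame, assuming pandas as `pd`:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Ada", "Bo", "Cy"],   # invented sample data
    "age": [25, 35, 45],
})

first_two = df.head(2)     # first 2 rows (default is 5)
last_two = df.tail(2)      # last 2 rows
summary = df.describe()    # count, mean, std, min, quartiles, max for numeric columns
mean_age = summary.loc["mean", "age"]   # 35.0
```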
DataFrame Column Access
Select and slice columns:
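The sketch below shows a few selection styles on invented data; single brackets return a Series, and a list of names returns a DataFrame:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Ada", "Bo", "Cy"],   # invented sample data
    "age": [25, 35, 45],
    "city": ["Oslo", "Lima", "Oslo"],
})

ages = df["age"]                      # one column -> Series
subset = df[["name", "city"]]         # list of columns -> DataFrame
by_label = df.loc[:, "name":"age"]    # label slice, end label included
by_position = df.iloc[:, 0:2]         # position slice, end position excluded
```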
DataFrame Filtering
Use conditions to filter rows:
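A sketch of boolean filtering on invented data; when combining conditions, each one needs its own parentheses and the operators are `&` and `|` rather than `and` and `or`:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Ada", "Bo", "Cy"],   # invented sample data
    "age": [25, 35, 45],
    "city": ["Oslo", "Lima", "Oslo"],
})

over_30 = df[df["age"] > 30]                                   # single condition
oslo_over_30 = df[(df["age"] > 30) & (df["city"] == "Oslo")]   # combined conditions
in_lima = df[df["city"].isin(["Lima"])]                        # membership filter
```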
3.6 Use Pandas Series
A Series is a 1D labeled array - like a single DataFrame column:
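For instance, a Series with a custom index (labels and values are made up), assuming pandas as `pd`:

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=["a", "b", "c"], name="points")

by_label = s["b"]          # access by index label -> 20
by_position = s.iloc[0]    # access by integer position -> 10
labels = list(s.index)     # ["a", "b", "c"]
```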
Series Methods
Useful statistical and data methods:
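A few frequently used methods, sketched on a small made-up Series:

```python
import pandas as pd

s = pd.Series([3, 1, 2, 3])            # made-up sample data

mean = s.mean()                        # 2.25
largest = s.max()                      # 3
distinct = s.nunique()                 # 3 distinct values
counts = s.value_counts()              # occurrences of each value
threes = counts.loc[3]                 # the value 3 appears 2 times
ordered = s.sort_values().tolist()     # [1, 2, 3, 3]
```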
DataFrame Columns are Series
Access DataFrame columns as Series:
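The sketch below, using invented data, confirms that selecting one column yields a Series that supports the Series methods:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Ada", "Bo", "Cy"],   # invented sample data
    "age": [25, 35, 45],
})

ages = df["age"]                         # a single column is a Series
is_series = isinstance(ages, pd.Series)  # True
average_age = ages.mean()                # 35.0
same_index = ages.index.equals(df.index) # the column shares the DataFrame's index
```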
Summary
In this chapter, you mastered Python's core data structures:
- Lists: Mutable, ordered collections with indexing and slicing
- Tuples: Immutable, ordered collections perfect for fixed data
- Dictionaries: Key-value mappings for fast lookups
- Sets: Unique elements with mathematical set operations
- NumPy arrays: Efficient multi-dimensional numerical arrays
- Pandas DataFrames/Series: Powerful structures for data analysis
These structures form the foundation of data science workflows.
Next Steps
Now that you've mastered data structures, continue with:
- Chapter 4: Data Conversion Recipes - Transform between types
- Chapter 5: Execution Control - Conditional logic and advanced iteration
- Chapter 6: Functions - Create reusable, testable code
Data structures are the foundation of all data science workflows - you'll use these daily!