Chapter 3: Data Structures

Overview

Python's built-in data structures - lists, tuples, dictionaries, and sets - are the foundation of data manipulation. This chapter also introduces NumPy arrays and Pandas DataFrames, essential tools for data science. Mastering these structures enables efficient data processing, analysis, and transformation.

Learning Objectives

By the end of this chapter, you will be able to:

  • Create and manipulate lists and tuples with indexing, slicing, and methods
  • Work with dictionaries for key-value data storage
  • Use sets for unique collections and set operations
  • Understand NumPy arrays for numerical computing
  • Leverage Pandas DataFrames and Series for data analysis
  • Choose the right data structure for different tasks

3.1 Use Lists and Tuples

Creating Lists

Lists are ordered, mutable collections:
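A minimal sketch of common ways to build a list. The variable names and values here are illustrative and do not come from the chapter.

```python
# Illustrative values only.
empty = []                         # empty list literal
numbers = [1, 2, 3, 4, 5]          # list of integers
mixed = [1, "two", 3.0, [4, 5]]    # a list can hold mixed types, even other lists
letters = list("abc")              # build a list from any iterable

print(len(numbers))   # 5
print(letters)        # ['a', 'b', 'c']
```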

List Indexing

Access list elements by position:
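The following sketch shows positive indices, negative indices, and slices on a small example list (the `fruits` list is an assumption made for illustration).

```python
fruits = ["apple", "banana", "cherry", "date"]

first = fruits[0]          # indexing starts at 0
last = fruits[-1]          # negative indices count from the end
middle = fruits[1:3]       # slice: start is inclusive, stop is exclusive
every_other = fruits[::2]  # a step of 2 takes every second item

print(first, last)         # apple date
print(middle)              # ['banana', 'cherry']
print(every_other)         # ['apple', 'cherry']
```

Indexing past the end (for example `fruits[10]`) raises `IndexError`, whereas an out-of-range slice simply returns a shorter list.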

Adding to Lists

Lists are mutable - you can modify them:
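As an illustration, the three usual ways to grow a list are `append`, `extend`, and `insert`. The `colors` list below is a made-up example.

```python
colors = ["red"]
colors.append("green")            # add one item at the end
colors.extend(["blue", "cyan"])   # add every item from another iterable
colors.insert(1, "orange")        # insert before index 1

print(colors)  # ['red', 'orange', 'green', 'blue', 'cyan']
```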

Modifying Lists

Change and swap list elements:
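One possible sketch of in-place changes: assigning to an index, swapping two positions with tuple assignment, and replacing a slice. The `scores` values are illustrative.

```python
scores = [70, 85, 90, 60]

scores[0] = 75                               # replace a single element
scores[1], scores[2] = scores[2], scores[1]  # swap two elements
scores[-1:] = [65, 100]                      # slice assignment may change the length

print(scores)  # [75, 90, 85, 65, 100]
```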

Removing from Lists

Multiple ways to remove items:
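The sketch below contrasts removal by value, by position, and wholesale. The list contents and the `remaining` name are assumptions for the example.

```python
items = ["a", "b", "c", "d", "b"]

items.remove("b")         # removes only the first matching value
popped = items.pop()      # removes and returns the last item
del items[0]              # removes by index and returns nothing
remaining = items.copy()  # snapshot before clearing

print(remaining, popped)  # ['c', 'd'] b

items.clear()             # empties the list in place
```

Note that `remove` raises `ValueError` if the value is absent, and `pop` raises `IndexError` on an empty list.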

Creating Tuples

Tuples are ordered, immutable collections:
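A short sketch of tuple creation and immutability, using made-up values. The attempted assignment is wrapped in `try`/`except` so the example runs to completion.

```python
point = (3, 4)
single = (42,)        # the trailing comma makes a one-element tuple
packed = 1, 2, 3      # parentheses are optional

try:
    point[0] = 10     # tuples do not support item assignment
except TypeError as exc:
    error = str(exc)

print(error)
```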

Sequence Operations

Lists and tuples share many operations:
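To illustrate, the same read-only operations work on both types. The sample data below is an assumption chosen for the example.

```python
nums_list = [3, 1, 2]
nums_tuple = (3, 1, 2)

for seq in (nums_list, nums_tuple):
    # length, min, max, membership, position of a value, occurrence count
    print(len(seq), min(seq), max(seq), 2 in seq, seq.index(1), seq.count(3))
    # prints: 3 1 3 True 1 1

combined = nums_tuple + (4,)   # concatenation returns a new tuple
repeated = nums_list * 2       # repetition returns a new list
ordered = sorted(nums_tuple)   # sorted() always returns a new list
```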

Unpacking

Extract values from sequences:
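A brief sketch of unpacking, including the starred form that collects leftover items into a list. All names here are illustrative.

```python
x, y = (3, 4)                   # one name per element

first, *rest = [1, 2, 3, 4]     # rest == [2, 3, 4]
head, *middle, tail = "abcde"   # middle == ['b', 'c', 'd']

a, b = 1, 2
a, b = b, a                     # swap without a temporary variable

print(x, y, rest, middle, a, b)
```

The number of names must match the number of elements unless a starred name is present; otherwise Python raises `ValueError`.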

List as Stack (LIFO)

Use lists as Last-In-First-Out stacks:
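As a sketch, `append` pushes onto the top of the stack and `pop` removes from it. The `stack` contents are made up for the example.

```python
stack = []

stack.append("first")    # push
stack.append("second")
stack.append("third")

top = stack.pop()        # pop returns the most recently added item
peek = stack[-1]         # look at the new top without removing it

print(top, peek, stack)  # third second ['first', 'second']
```

Both operations work at the end of the list, which is where a Python list adds and removes items most cheaply.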

3.2 Explore Dictionaries

Creating Dictionaries

Dictionaries map keys to values:
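The sketch below shows four common ways to create a dictionary. The keys and values are illustrative assumptions.

```python
person = {"name": "Ada", "age": 36}              # literal syntax
from_pairs = dict([("x", 1), ("y", 2)])          # from (key, value) pairs
from_kwargs = dict(host="localhost", port=8080)  # from keyword arguments
squares = {n: n * n for n in range(4)}           # dictionary comprehension

print(squares)  # {0: 0, 1: 1, 2: 4, 3: 9}
```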

Dictionary Operations

Access and modify dictionaries:
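A minimal sketch of reading, updating, adding, and deleting entries, using a hypothetical `inventory` dictionary.

```python
inventory = {"apples": 3, "pears": 5}

count = inventory["apples"]       # read a value by key
inventory["apples"] += 2          # update an existing key
inventory["kiwis"] = 7            # assigning to a new key adds it
del inventory["pears"]            # delete a key
has_kiwis = "kiwis" in inventory  # membership tests check keys

print(inventory)  # {'apples': 5, 'kiwis': 7}
```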

Dictionary Methods

Work with keys, values, and items:
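One way to sketch the main dictionary methods, using a made-up `prices` dictionary:

```python
prices = {"tea": 2.5, "coffee": 3.0}

names = list(prices.keys())       # ['tea', 'coffee']
amounts = list(prices.values())   # [2.5, 3.0]

for name, price in prices.items():   # iterate over (key, value) pairs
    print(f"{name}: {price}")

prices.update({"tea": 2.75, "juice": 4.0})  # overwrite and add in one call
removed = prices.pop("coffee")              # remove a key and return its value

print(prices)  # {'tea': 2.75, 'juice': 4.0}
```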

Safe Dictionary Access

Use get() to avoid KeyError:
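As an illustration, `get` returns `None` or a supplied default when the key is missing, where square-bracket access would raise `KeyError`. The `config` dictionary is a hypothetical example; `setdefault` is shown as a related method that also stores the default.

```python
config = {"debug": True}

level = config.get("level")                 # missing key: returns None
level_or_default = config.get("level", "INFO")  # missing key: returns the default
timeout = config.setdefault("timeout", 30)  # returns 30 and stores it under "timeout"

print(level, level_or_default, config)
# None INFO {'debug': True, 'timeout': 30}
```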

Dictionary Keys Must Be Immutable

Dictionary keys must be hashable, which in practice means immutable built-in types such as strings, numbers, and tuples of immutable values:
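A short sketch of this rule: a tuple works as a key, while a list raises `TypeError`. The coordinates and city names are illustrative.

```python
locations = {(40.7, -74.0): "New York"}   # a tuple of floats is a valid key

try:
    locations[[51.5, -0.1]] = "London"    # a list cannot be used as a key
except TypeError as exc:
    message = str(exc)

locations[tuple([51.5, -0.1])] = "London"  # convert the list to a tuple first

print(message)
print(len(locations))  # 2
```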

3.3 Dive into Sets

Creating Sets

Sets store unique, unordered elements:
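The sketch below creates sets from a literal and from a list, using made-up values. Note that empty braces create a dictionary, so an empty set needs `set()`.

```python
tags = {"python", "data", "python"}   # the duplicate "python" is stored once
unique = set([3, 1, 3, 2])            # build a set from any iterable
empty = set()                         # {} would create an empty dict instead

print(len(tags))           # 2
print(unique == {1, 2, 3}) # True
```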

Set Operations

Add, remove, and check membership:
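A minimal sketch of the basic set methods on a hypothetical `seen` set:

```python
seen = {1, 2}

seen.add(3)        # add a new element
seen.add(2)        # adding an existing element has no effect
seen.discard(99)   # discard ignores missing elements
seen.remove(1)     # remove raises KeyError if the element is missing

print(2 in seen)   # True
print(1 in seen)   # False
```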

Mathematical Set Operations

Perform union, intersection, difference:
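To illustrate, the operators below compute the standard set operations on two example sets (the values are assumptions). Each operator also has a method form, such as `a.union(b)`.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b            # elements in either set: {1, 2, 3, 4, 5}
intersection = a & b     # elements in both sets: {3, 4}
difference = a - b       # elements in a but not in b: {1, 2}
symmetric = a ^ b        # elements in exactly one set: {1, 2, 5}
is_subset = {3, 4} <= a  # True

print(union, intersection, difference, symmetric, is_subset)
```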

3.4 Work with NumPy Arrays

NumPy provides efficient multi-dimensional arrays for numerical computing:

NumPy Array Creation

Use arange and reshape:
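A sketch of array creation, assuming NumPy is installed and imported under its conventional alias `np`. The shapes and values are illustrative.

```python
import numpy as np

flat = np.arange(12)                     # integers 0 through 11
grid = flat.reshape(3, 4)                # 3 rows, 4 columns
auto = flat.reshape(2, -1)               # -1 lets NumPy infer the size: (2, 6)
from_list = np.array([[1, 2], [3, 4]])   # build from nested lists
zeros = np.zeros((2, 3))                 # 2x3 array filled with 0.0

print(grid.shape, auto.shape)  # (3, 4) (2, 6)
print(grid[1, 0])              # 4
```

`reshape` requires the total number of elements to stay the same; otherwise it raises `ValueError`.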

NumPy Array Introspection

Examine array properties:
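The attributes below are a reasonable starting set for inspecting an array. The example array is an assumption, and its dtype is set explicitly so the byte counts are the same on every platform.

```python
import numpy as np

arr = np.arange(6, dtype=np.int64).reshape(2, 3)

print(arr.ndim)      # 2   number of dimensions
print(arr.shape)     # (2, 3)
print(arr.size)      # 6   total number of elements
print(arr.dtype)     # int64
print(arr.itemsize)  # 8   bytes per element
print(arr.nbytes)    # 48  total bytes of data
```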

NumPy Data Types

Choose an explicit data type (dtype) to control how much memory an array uses:
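As a sketch, the same 1,000 values take a quarter of the memory when stored as 16-bit rather than 64-bit integers. The array names are illustrative.

```python
import numpy as np

wide = np.arange(1000, dtype=np.int64)
narrow = wide.astype(np.int16)     # astype returns a converted copy

print(wide.nbytes)    # 8000
print(narrow.nbytes)  # 2000

floats = np.array([1, 2, 3], dtype=np.float32)  # set the dtype at creation
print(floats.dtype)   # float32
```

This conversion is safe here because every value from 0 to 999 fits in an `int16`. Values outside the narrower type's range are not preserved, so check the range before downcasting.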

NumPy Array Slicing

Basic slicing returns a view that shares memory with the original array rather than a copy:
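The sketch below demonstrates the consequence: writing through a slice changes the original array, while an explicit `.copy()` does not. The names `base`, `window`, and `independent` are made up for the example.

```python
import numpy as np

base = np.arange(10)
window = base[2:5]        # a view onto elements 2, 3, 4

window[0] = 99            # writes through to base
print(base[2])            # 99
print(np.shares_memory(base, window))  # True

independent = base[2:5].copy()   # an independent copy
independent[1] = -1
print(base[3])            # 3 (unchanged)
```

This differs from Python lists, where a slice always produces a new list.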

NumPy Math Operations

Element-wise operations on arrays:
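A brief illustration with two small example arrays: arithmetic operators apply to matching elements, and a scalar is applied to every element.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])

total = a + b        # values: 11, 22, 33
product = a * b      # values: 10, 40, 90 (element-wise, not a dot product)
squared = a ** 2     # values: 1, 4, 9
doubled = a * 2      # the scalar is applied to every element
roots = np.sqrt(squared)  # universal functions also work element-wise

print(a.sum(), a.mean())  # 6.0 2.0
```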

NumPy Matrix Operations

Linear algebra operations:
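A sketch of a few common linear algebra calls, using an illustrative 2x2 matrix. The `@` operator performs matrix multiplication, in contrast to the element-wise `*`.

```python
import numpy as np

m = np.array([[1.0, 2.0],
              [3.0, 4.0]])
v = np.array([1.0, 1.0])

mm = m @ m                   # matrix product: [[7, 10], [15, 22]]
mv = m @ v                   # matrix-vector product: [3, 7]
transposed = m.T             # swap rows and columns
det = np.linalg.det(m)       # determinant, approximately -2.0
inverse = np.linalg.inv(m)   # matrix inverse

# Multiplying a matrix by its inverse gives the identity (up to rounding).
print(np.allclose(m @ inverse, np.eye(2)))  # True
```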

3.5 Use Pandas DataFrames

Pandas DataFrames are table-like 2D structures - essential for data science:
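One common way to build a DataFrame is from a dictionary that maps column names to lists of values. The sketch assumes pandas is installed and imported as `pd`; the data is invented for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Ada", "Grace", "Linus"],
    "age": [36, 45, 29],
    "city": ["London", "New York", "Helsinki"],
})

print(df.shape)          # (3, 3)
print(list(df.columns))  # ['name', 'age', 'city']
print(df)
```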

DataFrame Exploration

Inspect DataFrames with head, tail, describe:
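As a sketch, the calls below give a quick first look at an unfamiliar table. The ten-row DataFrame is an assumption made for illustration.

```python
import pandas as pd

df = pd.DataFrame({"x": range(10), "y": [v * v for v in range(10)]})

top = df.head(3)         # first 3 rows (default is 5)
bottom = df.tail(2)      # last 2 rows
summary = df.describe()  # count, mean, std, min, quartiles, max per numeric column

print(top)
print(bottom)
print(summary.loc["mean", "x"])  # 4.5
print(df.dtypes)                 # data type of each column
```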

DataFrame Column Access

Select and slice columns:
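The following sketch contrasts single-column, multi-column, label-based, and position-based selection on an invented DataFrame.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Ada", "Grace", "Linus"],
    "age": [36, 45, 29],
    "city": ["London", "New York", "Helsinki"],
})

ages = df["age"]                    # one column: returns a Series
subset = df[["name", "age"]]        # list of columns: returns a DataFrame
by_label = df.loc[0:1, ["name"]]    # label-based; the end label is included
by_position = df.iloc[0:1, 0:2]     # position-based; the end is excluded

print(by_label.shape)     # (2, 1)
print(by_position.shape)  # (1, 2)
```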

DataFrame Filtering

Use conditions to filter rows:
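A minimal sketch of boolean filtering on illustrative data. A comparison on a column yields a Series of True/False values, which then selects the matching rows.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Ada", "Grace", "Linus"],
    "age": [36, 45, 29],
    "city": ["London", "New York", "Helsinki"],
})

over_30 = df[df["age"] >= 30]

# Combine conditions with & (and) or | (or); each condition needs parentheses.
london_over_30 = df[(df["age"] >= 30) & (df["city"] == "London")]

# query() expresses the same kind of filter as a string.
under_30 = df.query("age < 30")

print(over_30["name"].tolist())   # ['Ada', 'Grace']
print(under_30["name"].tolist())  # ['Linus']
```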

3.6 Use Pandas Series

A Series is a 1D labeled array - like a single DataFrame column:
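To illustrate, the sketch below builds a Series with string labels and accesses it by label, by position, and by condition. The labels and values are made up.

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=["a", "b", "c"], name="score")

by_label = s["b"]          # 20
by_position = s.iloc[0]    # 10
filtered = s[s > 15]       # keeps labels 'b' and 'c'
doubled = s * 2            # arithmetic is element-wise and keeps the labels

print(filtered.index.tolist())  # ['b', 'c']
print(doubled.tolist())         # [20, 40, 60]
```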

Series Methods

Useful statistical and data methods:
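A sketch of a few frequently used Series methods, applied to example data that includes one missing value. By default, the statistical methods skip missing values.

```python
import pandas as pd

s = pd.Series([3, 1, 2, 3, None])   # None is stored as NaN

print(s.count())         # 4     non-missing values
print(s.mean())          # 2.25  mean of 3, 1, 2, 3
print(s.max())           # 3.0
print(s.nunique())       # 3     distinct non-missing values
print(s.isna().sum())    # 1     number of missing values

counts = s.value_counts()   # frequency of each value, most common first
filled = s.fillna(0)        # replace missing values with 0
ordered = s.sort_values()   # ascending; NaN is placed last by default
```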

DataFrame Columns are Series

Access DataFrame columns as Series:
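As a short sketch with invented data, selecting one column returns a Series named after that column, and Series arithmetic can define a new column.

```python
import pandas as pd

df = pd.DataFrame({"price": [2.0, 3.0], "qty": [5, 4]})

col = df["price"]
print(isinstance(col, pd.Series))  # True
print(col.name)                    # price

# Multiply two columns element-wise and store the result as a new column.
df["total"] = df["price"] * df["qty"]
print(df["total"].tolist())        # [10.0, 12.0]
print(df["total"].sum())           # 22.0
```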

Summary

In this chapter, you mastered Python's core data structures:

  • Lists: Mutable, ordered collections with indexing and slicing
  • Tuples: Immutable, ordered collections perfect for fixed data
  • Dictionaries: Key-value mappings for fast lookups
  • Sets: Unique elements with mathematical set operations
  • NumPy arrays: Efficient multi-dimensional numerical arrays
  • Pandas DataFrames/Series: Powerful structures for data analysis

These structures form the foundation of data science workflows.

Next Steps

Now that you've mastered data structures, continue with:

  • Chapter 4: Data Conversion Recipes - Transform between types
  • Chapter 5: Execution Control - Conditional logic and advanced iteration
  • Chapter 6: Functions - Create reusable, testable code

Data structures are the foundation of all data science workflows - you'll use these daily!
