The Art of Choosing Data Structures
Welcome to our exploration of selecting appropriate data structures! This skill is fundamental to effective programming and can dramatically impact the efficiency and clarity of your code. In our previous sessions, we've examined various Python data structures: lists, tuples, dictionaries, and sets. Now, we'll focus on when and why to use each one.
Think of data structures as containers for your information. Just as you wouldn't store soup in a colander or try to organize paperwork in a fishing net, selecting the right data structure for your specific needs is crucial. The right choice can make your code faster, more memory-efficient, and easier to understand.
Throughout this session, we'll examine common programming scenarios and identify which data structures best address each situation. We'll also explore the trade-offs between different options, helping you develop intuition for making these important decisions in your own projects.
Python Data Structures at a Glance
Before we dive into specific use cases, let's briefly recap Python's core data structures and their key characteristics:
| Data Structure | Ordered? | Mutable? | Indexable? | Allows Duplicates? | Key Feature |
|---|---|---|---|---|---|
| List | Yes | Yes | Yes | Yes | Sequence of items with flexible manipulation |
| Tuple | Yes | No | Yes | Yes | Immutable sequence of items |
| Dictionary | Yes* | Yes | No (key-based) | No (keys) | Key-value pairs with fast lookup |
| Set | No | Yes | No | No | Unique items with fast membership testing |
* In Python 3.7+, dictionaries maintain insertion order, but this is an implementation detail rather than a guarantee of the abstract data type.
These characteristics serve as a starting point for our selection process. Next, we'll explore how they translate to real-world programming situations.
When to Use Lists
Lists are Python's most versatile and commonly used data structure. They're ideal when you need:
Ordered Collection of Items
When the sequence of elements matters, lists preserve the order you define. This is essential for:
- Processing items in a specific sequence
- Representing steps in a procedure
- Maintaining a ranking or priority order
# File: top_movies.py
# Location: ~/python_examples/week2/
# Order matters - these are ranked from #1 to #5
top_movies = [
"The Shawshank Redemption",
"The Godfather",
"The Dark Knight",
"The Godfather Part II",
"12 Angry Men"
]
# Accessing by position (index)
print(f"The #1 movie is: {top_movies[0]}")
print(f"The #3 movie is: {top_movies[2]}")
Collection That Will Change Size
Lists shine when you need to add or remove elements during execution:
- Building a collection dynamically
- Implementing a queue or stack
- Filtering and transforming data
# File: task_manager.py
# Location: ~/python_examples/week2/
# A simple task queue
tasks = []
# Adding tasks
tasks.append("Answer emails")
tasks.append("Prepare presentation")
tasks.append("Review code")
print(f"Current tasks: {tasks}")
# Completing a task (removing from beginning)
completed_task = tasks.pop(0)
print(f"Completed: {completed_task}")
print(f"Remaining tasks: {tasks}")
Need for Multiple Operations on the Same Collection
Lists provide a rich set of methods for manipulation:
- Sorting and reversing
- Slicing and dicing
- Extending with other collections
# File: data_processing.py
# Location: ~/python_examples/week2/
temperatures = [72, 65, 83, 70, 59, 88, 75]
# Multiple operations on the same list
average = sum(temperatures) / len(temperatures)
print(f"Average temperature: {average:.1f}")
# Sort in place
temperatures.sort()
print(f"Temperatures in ascending order: {temperatures}")
# Find median (middle value)
middle_idx = len(temperatures) // 2
median = temperatures[middle_idx]
print(f"Median temperature: {median}")
Real-world example: Consider a web application's event log. Events arrive in chronological order and need to be processed sequentially. New events are constantly being added, and sometimes older events need to be archived (removed). A list is perfect here, as it maintains order and allows for efficient additions and removals.
When Not to Use Lists
Despite their versatility, lists aren't always the best choice:
- When you frequently need to check if an item exists in a large collection (sets or dictionaries are faster)
- When you need to associate values with keys (use dictionaries)
- When you need to ensure all elements are unique (use sets)
- When the data should never change (use tuples)
When to Use Tuples
Tuples are immutable ordered sequences. They're the right choice when:
The Data Should Not Change
Tuples enforce immutability, making them ideal for:
- Constants and fixed values
- Configuration settings
- Ensuring data integrity
# File: coordinates.py
# Location: ~/python_examples/week2/
# Geographical coordinates (latitude, longitude)
new_york = (40.7128, -74.0060)
tokyo = (35.6895, 139.6917)
paris = (48.8566, 2.3522)
# These cannot be accidentally changed
# new_york[0] = 41.0 # This would raise an error
# Calculate distance (simplified)
def distance(point1, point2):
return ((point1[0] - point2[0])**2 + (point1[1] - point2[1])**2)**0.5
ny_to_paris = distance(new_york, paris)
print(f"Distance from New York to Paris: {ny_to_paris:.2f} coordinate units")
Heterogeneous Data that Forms a Single Logical Unit
Tuples excel at representing "records" with different types of data:
- Database records
- Named coordinates or points
- Person's information (name, age, address, etc.)
# File: student_records.py
# Location: ~/python_examples/week2/
# Student records as tuples: (name, age, gpa, major)
students = [
("Alice Smith", 20, 3.8, "Computer Science"),
("Bob Johnson", 22, 3.2, "Mathematics"),
("Charlie Brown", 19, 3.9, "Physics"),
("Diana Miller", 21, 3.5, "Biology")
]
# Process student records
for student in students:
name, age, gpa, major = student # Unpacking
if gpa >= 3.5:
print(f"{name}, a {major} major, made the Dean's List!")
Dictionary Keys
Because tuples are immutable, they can be used as dictionary keys (unlike lists):
# File: grid_data.py
# Location: ~/python_examples/week2/
# Using coordinate tuples as dictionary keys
grid_values = {
(0, 0): 100,
(0, 1): 75,
(1, 0): 50,
(1, 1): 25
}
# Retrieve value at a specific coordinate
position = (1, 0)
print(f"Value at position {position}: {grid_values[position]}")
Analogy: Think of tuples as sealed packages that come from a factory with their contents verified and secured. Once sealed, the contents cannot be modified, ensuring that what was packed is exactly what will be delivered.
When Not to Use Tuples
Tuples aren't the best choice when:
- You need to modify the collection after creation
- The collection size will change
- You need to add or remove elements
- You need set operations like unions and intersections
When to Use Dictionaries
Dictionaries store key-value pairs and offer extremely fast lookups. They're perfect when:
You Need Fast Lookups by Key
This is the dictionary's superpower—retrieving values by their associated keys:
- Looking up information by ID, name, or other identifier
- Implementing caches or memoization
- Counting occurrences (frequency counting)
# File: user_database.py
# Location: ~/python_examples/week2/
# User information accessible by username
users = {
"alice_smith": {
"name": "Alice Smith",
"email": "alice@example.com",
"role": "admin"
},
"bob_jones": {
"name": "Bob Jones",
"email": "bob@example.com",
"role": "user"
},
"charlie_davis": {
"name": "Charlie Davis",
"email": "charlie@example.com",
"role": "user"
}
}
# Fast lookup by key
username = "alice_smith"
if username in users:
user_info = users[username]
print(f"User found: {user_info['name']}, Role: {user_info['role']}")
else:
print(f"User {username} not found")
You Need to Map Keys to Values
When you have natural pairs of related information:
- Configuration settings (name → value)
- Language translations (word → translation)
- Inventory tracking (item → quantity)
# File: language_translator.py
# Location: ~/python_examples/week2/
# Simple English to Spanish dictionary
translations = {
"hello": "hola",
"goodbye": "adiós",
"thank you": "gracias",
"yes": "sí",
"no": "no",
"please": "por favor"
}
# Translate a sentence
def translate(text, dictionary):
words = text.lower().split()
translated_words = []
for word in words:
if word in dictionary:
translated_words.append(dictionary[word])
else:
translated_words.append(word) # Keep untranslated
return " ".join(translated_words)
english_text = "Hello please thank you goodbye"
spanish_text = translate(english_text, translations)
print(f"English: {english_text}")
print(f"Spanish: {spanish_text}")
You Need to Count Occurrences
Dictionaries are superb for keeping track of how many times items appear:
# File: word_counter.py
# Location: ~/python_examples/week2/
text = "the quick brown fox jumps over the lazy dog the fox was quick"
words = text.lower().split()
# Count word frequencies
word_counts = {}
for word in words:
if word in word_counts:
word_counts[word] += 1
else:
word_counts[word] = 1
# Alternative using get() method
word_counts_alt = {}
for word in words:
word_counts_alt[word] = word_counts_alt.get(word, 0) + 1
# Print the most frequent words
sorted_words = sorted(word_counts.items(), key=lambda x: x[1], reverse=True)
for word, count in sorted_words[:3]: # Top 3
print(f"'{word}' appears {count} times")
Real-world application: Consider an e-commerce platform that needs to check product prices quickly. With thousands of products, searching through a list would be inefficient. A dictionary with product codes as keys and prices as values allows for instant price lookups, crucial for real-time operations.
When Not to Use Dictionaries
Despite their versatility, dictionaries aren't ideal when:
- You need to maintain a specific order (though modern Python dictionaries do preserve insertion order)
- You need to perform operations on all values frequently
- Memory efficiency is a critical concern (dictionaries have some overhead)
- You only care about the presence of items, not associated values (use sets)
When to Use Sets
Sets are unordered collections of unique elements with fast membership testing. They excel when:
You Need to Ensure Uniqueness
Sets automatically eliminate duplicates, making them perfect for:
- Removing duplicates from a collection
- Tracking distinct values
- Implementing "have we seen this before?" logic
# File: unique_visitors.py
# Location: ~/python_examples/week2/
# Log of website visitors by IP address
visitor_ips = [
"192.168.1.1", "10.0.0.2", "192.168.1.1", "192.168.1.3",
"10.0.0.2", "172.16.0.1", "192.168.1.1", "10.0.0.4"
]
# Get unique visitors
unique_visitors = set(visitor_ips)
print(f"Total visits: {len(visitor_ips)}")
print(f"Unique visitors: {len(unique_visitors)}")
print(f"Unique IPs: {unique_visitors}")
You Need Fast Membership Testing
Checking if an item exists in a set is extremely fast, even for large collections:
# File: word_filter.py
# Location: ~/python_examples/week2/
# A set of words to filter (e.g., profanity filter)
banned_words = {"bad", "inappropriate", "offensive", "rude"}
def is_acceptable(text):
"""Check if text contains any banned words."""
words = set(text.lower().split())
# Fast check for intersection
if words & banned_words: # Set intersection
return False
return True
message1 = "This is a perfectly fine message"
message2 = "This message contains something inappropriate"
print(f"Message 1 acceptable: {is_acceptable(message1)}")
print(f"Message 2 acceptable: {is_acceptable(message2)}")
You Need Set Operations
Sets support powerful mathematical operations like unions, intersections, and differences:
# File: student_groups.py
# Location: ~/python_examples/week2/
# Students in different courses
math_students = {"Alice", "Bob", "Charlie", "Diana", "Eve"}
physics_students = {"Bob", "Diana", "Frank", "Grace"}
cs_students = {"Alice", "Charlie", "Eve", "Frank"}
# Students in both Math and CS (intersection)
math_and_cs = math_students & cs_students
print(f"Taking both Math and CS: {math_and_cs}")
# Students in any science course (union)
science_students = physics_students | cs_students
print(f"Taking at least one science course: {science_students}")
# Math students not taking Physics (difference)
math_not_physics = math_students - physics_students
print(f"Taking Math but not Physics: {math_not_physics}")
# Students taking exactly one of Math or Physics (symmetric difference)
one_subject_only = math_students ^ physics_students
print(f"Taking either Math or Physics (but not both): {one_subject_only}")
Analogy: Think of sets as a VIP list at an exclusive event. The bouncer can instantly check if someone's name is on the list (membership testing), and each person can only be on the list once (uniqueness).
When Not to Use Sets
Sets aren't the right choice when:
- The order of elements matters
- You need to store duplicate items
- You need to associate values with keys
- You need to access elements by position (index)
Combining Data Structures for Complex Problems
Real-world problems often require multiple data structures working together. Let's examine some composite examples:
A Simple Library Management System
This example shows how different data structures handle different aspects of the problem:
# File: library_system.py
# Location: ~/python_examples/week2/
# Book database as a dictionary with ISBN as key
books = {
"978-1451673319": {
"title": "Fahrenheit 451",
"author": "Ray Bradbury",
"year": 1953,
"copies": 3
},
"978-0061120084": {
"title": "To Kill a Mockingbird",
"author": "Harper Lee",
"year": 1960,
"copies": 5
},
"978-0451524935": {
"title": "1984",
"author": "George Orwell",
"year": 1949,
"copies": 4
}
}
# Checked out books with due dates (dictionary of sets)
checked_out = {
"user1": {("978-1451673319", "2023-06-15")},
"user2": {("978-0061120084", "2023-06-10"), ("978-0451524935", "2023-06-20")}
}
# Available books (calculated using sets)
def get_available_books():
all_isbns = set(books.keys())
checked_out_isbns = set()
for user_checkouts in checked_out.values():
for isbn, _ in user_checkouts:
checked_out_isbns.add(isbn)
available_isbns = all_isbns - checked_out_isbns
# Create a list of available book details
available_books = []
for isbn in available_isbns:
book = books[isbn].copy() # Copy the book details
book["isbn"] = isbn # Add the ISBN to the details
available_books.append(book)
return available_books
# Check out a book
def checkout_book(user_id, isbn, due_date):
if isbn not in books:
return False, "Book not found"
# Check if book is already checked out
for user_checkouts in checked_out.values():
for checkout_isbn, _ in user_checkouts:
if checkout_isbn == isbn:
return False, "Book already checked out"
# Add to user's checkouts
if user_id not in checked_out:
checked_out[user_id] = set()
checked_out[user_id].add((isbn, due_date))
return True, "Book checked out successfully"
# Print available books
print("Available Books:")
for book in get_available_books():
print(f"- {book['title']} by {book['author']} (ISBN: {book['isbn']})")
# Demonstrate checkout
status, message = checkout_book("user3", "978-0451524935", "2023-07-01")
print(f"\nCheckout result: {message}")
In this example:
- A dictionary stores the book database, allowing fast lookups by ISBN
- Nested dictionaries store book details
- A dictionary of sets tracks checked-out books
- Set operations find available books efficiently
Customer Order Processing System
Let's examine another composite example:
# File: order_processing.py
# Location: ~/python_examples/week2/
# Product catalog (dictionary for fast lookup)
products = {
"SKU001": {"name": "Desk Lamp", "price": 19.99, "weight": 2.5},
"SKU002": {"name": "Office Chair", "price": 149.99, "weight": 15.0},
"SKU003": {"name": "Notebook", "price": 4.99, "weight": 0.5},
"SKU004": {"name": "Pen Set", "price": 12.99, "weight": 0.3},
"SKU005": {"name": "Desk Organizer", "price": 24.99, "weight": 1.8}
}
# Orders as a list (to maintain chronological order)
orders = [
{
"order_id": "ORD-001",
"customer_id": "CUST-1234",
"date": "2023-05-15",
"items": [("SKU002", 1), ("SKU003", 2)]
},
{
"order_id": "ORD-002",
"customer_id": "CUST-5678",
"date": "2023-05-16",
"items": [("SKU001", 1), ("SKU004", 1), ("SKU005", 1)]
}
]
# Process orders and calculate totals
def process_orders():
for order in orders:
print(f"\nProcessing Order: {order['order_id']}")
print(f"Customer: {order['customer_id']}")
print(f"Date: {order['date']}")
print("Items:")
total_price = 0
total_weight = 0
for sku, quantity in order["items"]:
if sku in products:
product = products[sku]
item_price = product["price"] * quantity
item_weight = product["weight"] * quantity
print(f" - {product['name']} × {quantity}: ${item_price:.2f}")
total_price += item_price
total_weight += item_weight
else:
print(f" - Unknown product: {sku}")
print(f"Total Price: ${total_price:.2f}")
print(f"Total Weight: {total_weight:.1f} lbs")
# Find popular products (using a dictionary for counting)
def analyze_popular_products():
product_counts = {}
for order in orders:
for sku, quantity in order["items"]:
if sku in products:
if sku not in product_counts:
product_counts[sku] = 0
product_counts[sku] += quantity
# Sort by popularity
popular_products = sorted(
product_counts.items(),
key=lambda x: x[1],
reverse=True
)
return popular_products
# Process all orders
process_orders()
# Show popular products
print("\nPopular Products:")
for sku, count in analyze_popular_products():
product = products[sku]
print(f"- {product['name']}: {count} units sold")
This example demonstrates:
- A dictionary for the product catalog, enabling fast lookups
- A list for orders, maintaining chronological sequence
- Tuples for order items (SKU and quantity)
- Another dictionary for counting product popularity
Real-world insight: In professional software development, these composite patterns are extremely common. Database records often become dictionaries, collections of unique identifiers become sets, and ordered operations use lists. Mastering when to use each structure and how to combine them effectively is a hallmark of experienced programmers.
Performance Considerations
When selecting a data structure, performance characteristics can be critical, especially for larger datasets:
| Operation | List | Tuple | Dictionary | Set |
|---|---|---|---|---|
| Access by index | O(1) | O(1) | N/A | N/A |
| Search (in/contains) | O(n) | O(n) | O(1) | O(1) |
| Insert/Add | O(1)* | N/A | O(1) | O(1) |
| Delete | O(n) | N/A | O(1) | O(1) |
* Amortized - resizing may occasionally take longer
These time complexities tell an important story:
- If you need to frequently check if an item exists in a large collection, a list or tuple will be slow (O(n)), while dictionaries and sets are fast (O(1))
- If you need to access items by position, lists and tuples are optimal
- If you need to delete items frequently, dictionaries and sets outperform lists
Let's demonstrate the practical impact with a simple benchmark:
# File: data_structure_benchmark.py
# Location: ~/python_examples/week2/
import time
import random
# Generate test data
data_size = 10000
test_data = list(range(data_size))
random.shuffle(test_data)
# Convert to different data structures
test_list = test_data.copy()
test_set = set(test_data)
test_dict = {x: x for x in test_data}
# Test lookup performance
lookup_items = random.sample(test_data, 1000) # 1000 random lookups
def benchmark_lookup(description, operation, iterations=1000):
start_time = time.time()
for _ in range(iterations):
for item in lookup_items:
operation(item)
end_time = time.time()
elapsed = end_time - start_time
print(f"{description}: {elapsed:.6f} seconds")
# Run benchmarks
benchmark_lookup("List lookup", lambda x: x in test_list)
benchmark_lookup("Set lookup", lambda x: x in test_set)
benchmark_lookup("Dict lookup", lambda x: x in test_dict)
# Output will show significant differences in performance
# For example:
# List lookup: 0.123456 seconds
# Set lookup: 0.000123 seconds
# Dict lookup: 0.000135 seconds
For large datasets, the difference can be dramatic—the set and dictionary lookups might be hundreds or thousands of times faster than list lookups.
Real-world application: In a social network application, checking if a user is connected to another user must be extremely fast. Using sets to store connections allows for O(1) lookup times, enabling features like "friend suggestions" or "people you may know" to work efficiently even with millions of users.
A Decision-Making Framework
Here's a practical framework for selecting the appropriate data structure for your specific needs:
Step 1: Understand Your Data Access Patterns
Start by answering these questions:
- How will you most frequently access the data? (By index, by key, sequentially, randomly?)
- Will you need to add or remove items frequently?
- Does order matter?
- Are duplicates allowed or meaningful?
- Will you need to check membership frequently?
Step 2: Consider Size and Performance Requirements
- How large will the dataset be?
- Are there tight performance constraints?
- Is memory usage a concern?
Step 3: Apply This Decision Tree
Based on your answers, use this simplified decision tree:
- If you need key-value mapping → Use a dictionary
- If you need to ensure uniqueness or do set operations → Use a set
- If you need a sequence that won't change → Use a tuple
- If you need a modifiable, ordered sequence → Use a list
- If you need fast lookups by content, not position → Use a set or dictionary
Step 4: Combine as Needed
Don't hesitate to create composite structures for complex problems, such as:
- Lists of dictionaries (e.g., database records)
- Dictionaries of sets (e.g., user to connections mapping)
- Dictionaries of lists (e.g., category to items mapping)
Practical example: Let's apply this framework to a real problem—tracking employee attendance:
# File: attendance_tracker.py
# Location: ~/python_examples/week2/
# Requirements analysis:
# 1. Need to look up attendance by employee ID (key-value mapping) → Dictionary
# 2. Need to track dates chronologically (ordered sequence) → List
# 3. For each date, need to know if present/absent (no duplicates) → Set
# 4. Employee details won't change (immutable data) → Tuple
# Result: A dictionary of employee records (tuples) with attendance as sets
# Employee records: (name, department, start_date)
employees = {
"E001": ("Alice Smith", "Engineering", "2020-03-15"),
"E002": ("Bob Johnson", "Marketing", "2021-07-22"),
"E003": ("Charlie Davis", "Finance", "2019-11-08"),
"E004": ("Diana Wilson", "Engineering", "2022-01-10")
}
# Attendance records: {employee_id: {present_dates}}
attendance = {
"E001": {"2023-05-01", "2023-05-02", "2023-05-03", "2023-05-05"},
"E002": {"2023-05-01", "2023-05-02", "2023-05-03", "2023-05-04", "2023-05-05"},
"E003": {"2023-05-01", "2023-05-03", "2023-05-05"},
"E004": {"2023-05-01", "2023-05-02", "2023-05-04", "2023-05-05"}
}
# Track all workdays
workdays = ["2023-05-01", "2023-05-02", "2023-05-03", "2023-05-04", "2023-05-05"]
# Generate attendance report
def generate_attendance_report():
print("Attendance Report: May 1-5, 2023\n")
for emp_id, emp_details in employees.items():
name, department, _ = emp_details
present_days = attendance.get(emp_id, set())
absent_days = set(workdays) - present_days
attendance_rate = len(present_days) / len(workdays) * 100
print(f"Employee: {name} ({emp_id}), Department: {department}")
print(f"Attendance Rate: {attendance_rate:.1f}%")
if absent_days:
print(f"Absent on: {', '.join(sorted(absent_days))}")
else:
print("Perfect attendance!")
print()
# Find employees with perfect attendance
def find_perfect_attendance():
perfect_attendance = []
for emp_id, present_days in attendance.items():
if len(present_days) == len(workdays):
name = employees[emp_id][0] # Get employee name
perfect_attendance.append(name)
return perfect_attendance
# Generate the report
generate_attendance_report()
# List employees with perfect attendance
perfect = find_perfect_attendance()
print(f"Employees with perfect attendance: {', '.join(perfect)}")
This example demonstrates how a thoughtful combination of data structures leads to clean, efficient code that matches the problem's requirements.
Hands-On Exercises
Let's practice selecting appropriate data structures with some exercises. For each scenario, consider which data structure(s) would be most appropriate and implement a solution.
Exercise 1: Contact Management System
Implement a simple contact management system that allows adding, searching, and categorizing contacts. Think about how to efficiently look up contacts by different criteria (name, email, phone) and how to organize contacts into groups.
# File: contact_manager.py
# Location: ~/python_examples/week2/exercises/
# Your solution here
# Example approach:
contacts = {} # Dictionary with email as key for unique identification
def add_contact(name, email, phone, categories=None):
if email in contacts:
print(f"Contact with email {email} already exists")
return False
categories = set(categories) if categories else set()
contacts[email] = {"name": name, "phone": phone, "categories": categories}
return True
def search_by_name(name):
return [contacts[email] for email in contacts if name.lower() in contacts[email]["name"].lower()]
def add_to_category(email, category):
if email in contacts:
contacts[email]["categories"].add(category)
return True
return False
def get_contacts_in_category(category):
return [contacts[email] for email in contacts if category in contacts[email]["categories"]]
# Test the system
add_contact("Alice Smith", "alice@example.com", "555-123-4567", ["work", "friends"])
add_contact("Bob Johnson", "bob@example.com", "555-987-6543", ["family"])
add_contact("Charlie Brown", "charlie@example.com", "555-567-8901", ["work"])
print("Contacts in 'work' category:")
for contact in get_contacts_in_category("work"):
print(f"- {contact['name']}: {contact['phone']}")
search_result = search_by_name("Alice")
if search_result:
print(f"\nFound contact: {search_result[0]['name']}")
Exercise 2: Shopping Cart System
Design a shopping cart system for an online store. Consider how to track products, quantities, calculate totals, and handle operations like adding, removing, and updating quantities.
# File: shopping_cart.py
# Location: ~/python_examples/week2/exercises/
# Your solution here
# Example approach:
# Product catalog as a dictionary
products = {
"P001": {"name": "T-Shirt", "price": 19.99, "category": "Clothing"},
"P002": {"name": "Jeans", "price": 49.99, "category": "Clothing"},
"P003": {"name": "Sneakers", "price": 79.99, "category": "Footwear"},
"P004": {"name": "Backpack", "price": 39.99, "category": "Accessories"},
"P005": {"name": "Water Bottle", "price": 9.99, "category": "Accessories"}
}
# Shopping cart as a dictionary with product_id -> quantity mapping
class ShoppingCart:
def __init__(self):
self.items = {} # product_id -> quantity
def add_item(self, product_id, quantity=1):
if product_id not in products:
print(f"Product {product_id} not found")
return False
if product_id in self.items:
self.items[product_id] += quantity
else:
self.items[product_id] = quantity
return True
def remove_item(self, product_id):
if product_id in self.items:
del self.items[product_id]
return True
return False
def update_quantity(self, product_id, quantity):
if product_id not in self.items:
return False
if quantity <= 0:
return self.remove_item(product_id)
self.items[product_id] = quantity
return True
def get_total(self):
total = 0
for product_id, quantity in self.items.items():
total += products[product_id]["price"] * quantity
return total
def view_cart(self):
if not self.items:
print("Your cart is empty")
return
print("Your Shopping Cart:")
total = 0
for product_id, quantity in self.items.items():
product = products[product_id]
item_total = product["price"] * quantity
total += item_total
print(f"- {product['name']} × {quantity}: ${item_total:.2f}")
print(f"\nTotal: ${total:.2f}")
# Test the cart
cart = ShoppingCart()
cart.add_item("P001", 2) # 2 T-shirts
cart.add_item("P003", 1) # 1 pair of sneakers
cart.add_item("P005", 3) # 3 water bottles
cart.view_cart()
print("\nUpdating quantities...")
cart.update_quantity("P001", 1) # Reduce to 1 T-shirt
cart.remove_item("P005") # Remove water bottles
cart.view_cart()
Work through these exercises to practice selecting appropriate data structures for different scenarios. Try implementing your own solutions before looking at the examples.
Choosing Wisely: The Art of Data Structure Selection
Selecting the right data structure is both a science and an art. It requires understanding the strengths and weaknesses of each option, analyzing your specific requirements, and sometimes making trade-offs.
Remember these key principles:
- Match the structure to your access patterns – How you'll use the data is often more important than what the data is
- Consider performance implications – For large datasets, the difference between O(1) and O(n) operations can be enormous
- Don't sacrifice clarity for optimization – Readable code with slightly suboptimal data structures is usually better than confusing code with perfectly optimized structures
- Combine structures when necessary – Complex problems often require composite solutions
- Revise as requirements evolve – Be willing to refactor when you discover new requirements or patterns
With practice, selecting the right data structure will become intuitive. You'll develop a feel for which tool fits each problem, just as an experienced carpenter knows instinctively whether to reach for a hammer, a saw, or a drill.
Next Steps
As we continue our Python journey, we'll explore how these fundamental data structures become even more powerful when combined with:
- Functions – Creating reusable operations that work with your data
- Classes – Building custom data structures that encapsulate both data and behavior
- Algorithms – Implementing efficient solutions to complex problems
- External libraries – Leveraging specialized data structures for specific domains
In our next session, we'll focus on functions – the building blocks of reusable code that operate on these data structures. We'll see how well-designed functions can make your data structures even more powerful and your code more modular and maintainable.