Selecting Appropriate Data Structures for Different Problems

The Art of Choosing Data Structures

Welcome to our exploration of selecting appropriate data structures! This skill is fundamental to effective programming and can dramatically impact the efficiency and clarity of your code. In our previous sessions, we've examined various Python data structures: lists, tuples, dictionaries, and sets. Now, we'll focus on when and why to use each one.

Think of data structures as containers for your information. Just as you wouldn't store soup in a colander or try to organize paperwork in a fishing net, selecting the right data structure for your specific needs is crucial. The right choice can make your code faster, more memory-efficient, and easier to understand.

Throughout this session, we'll examine common programming scenarios and identify which data structures best address each situation. We'll also explore the trade-offs between different options, helping you develop intuition for making these important decisions in your own projects.

Python Data Structures at a Glance

Before we dive into specific use cases, let's briefly recap Python's core data structures and their key characteristics:

Data Structure	Ordered?	Mutable?	Indexable?	Allows Duplicates?	Key Feature
List	Yes	Yes	Yes	Yes	Sequence of items with flexible manipulation
Tuple	Yes	No	Yes	Yes	Immutable sequence of items
Dictionary	Yes*	Yes	No (key-based)	No (keys)	Key-value pairs with fast lookup
Set	No	Yes	No	No	Unique items with fast membership testing

* In Python 3.7+, dictionaries maintain insertion order, but this is an implementation detail rather than a guarantee of the abstract data type.

These characteristics serve as a starting point for our selection process. Next, we'll explore how they translate to real-world programming situations.

When to Use Lists

Lists are Python's most versatile and commonly used data structure. They're ideal when you need:

Ordered Collection of Items

When the sequence of elements matters, lists preserve the order you define. This is essential for:

Processing items in a specific sequence
Representing steps in a procedure
Maintaining a ranking or priority order


# File: top_movies.py
# Location: ~/python_examples/week2/

# Order matters - these are ranked from #1 to #5
top_movies = [
    "The Shawshank Redemption",
    "The Godfather",
    "The Dark Knight",
    "The Godfather Part II",
    "12 Angry Men"
]

# Accessing by position (index)
print(f"The #1 movie is: {top_movies[0]}")
print(f"The #3 movie is: {top_movies[2]}")

Collection That Will Change Size

Lists shine when you need to add or remove elements during execution:

Building a collection dynamically
Implementing a queue or stack
Filtering and transforming data


# File: task_manager.py
# Location: ~/python_examples/week2/

# A simple task queue
tasks = []

# Adding tasks
tasks.append("Answer emails")
tasks.append("Prepare presentation")
tasks.append("Review code")

print(f"Current tasks: {tasks}")

# Completing a task (removing from beginning)
completed_task = tasks.pop(0)
print(f"Completed: {completed_task}")
print(f"Remaining tasks: {tasks}")

Need for Multiple Operations on the Same Collection

Lists provide a rich set of methods for manipulation:

Sorting and reversing
Slicing and dicing
Extending with other collections


# File: data_processing.py
# Location: ~/python_examples/week2/

temperatures = [72, 65, 83, 70, 59, 88, 75]

# Multiple operations on the same list
average = sum(temperatures) / len(temperatures)
print(f"Average temperature: {average:.1f}")

# Sort in place
temperatures.sort()
print(f"Temperatures in ascending order: {temperatures}")

# Find median (middle value)
middle_idx = len(temperatures) // 2
median = temperatures[middle_idx]
print(f"Median temperature: {median}")

Real-world example: Consider a web application's event log. Events arrive in chronological order and need to be processed sequentially. New events are constantly being added, and sometimes older events need to be archived (removed). A list is perfect here, as it maintains order and allows for efficient additions and removals.

When Not to Use Lists

Despite their versatility, lists aren't always the best choice:

When you frequently need to check if an item exists in a large collection (sets or dictionaries are faster)
When you need to associate values with keys (use dictionaries)
When you need to ensure all elements are unique (use sets)
When the data should never change (use tuples)

When to Use Tuples

Tuples are immutable ordered sequences. They're the right choice when:

The Data Should Not Change

Tuples enforce immutability, making them ideal for:

Constants and fixed values
Configuration settings
Ensuring data integrity


# File: coordinates.py
# Location: ~/python_examples/week2/

# Geographical coordinates (latitude, longitude)
new_york = (40.7128, -74.0060)
tokyo = (35.6895, 139.6917)
paris = (48.8566, 2.3522)

# These cannot be accidentally changed
# new_york[0] = 41.0  # This would raise an error

# Calculate distance (simplified)
def distance(point1, point2):
    return ((point1[0] - point2[0])**2 + (point1[1] - point2[1])**2)**0.5

ny_to_paris = distance(new_york, paris)
print(f"Distance from New York to Paris: {ny_to_paris:.2f} coordinate units")

Heterogeneous Data that Forms a Single Logical Unit

Tuples excel at representing "records" with different types of data:

Database records
Named coordinates or points
Person's information (name, age, address, etc.)


# File: student_records.py
# Location: ~/python_examples/week2/

# Student records as tuples: (name, age, gpa, major)
students = [
    ("Alice Smith", 20, 3.8, "Computer Science"),
    ("Bob Johnson", 22, 3.2, "Mathematics"),
    ("Charlie Brown", 19, 3.9, "Physics"),
    ("Diana Miller", 21, 3.5, "Biology")
]

# Process student records
for student in students:
    name, age, gpa, major = student  # Unpacking
    
    if gpa >= 3.5:
        print(f"{name}, a {major} major, made the Dean's List!")

Dictionary Keys

Because tuples are immutable, they can be used as dictionary keys (unlike lists):


# File: grid_data.py
# Location: ~/python_examples/week2/

# Using coordinate tuples as dictionary keys
grid_values = {
    (0, 0): 100,
    (0, 1): 75,
    (1, 0): 50,
    (1, 1): 25
}

# Retrieve value at a specific coordinate
position = (1, 0)
print(f"Value at position {position}: {grid_values[position]}")

Analogy: Think of tuples as sealed packages that come from a factory with their contents verified and secured. Once sealed, the contents cannot be modified, ensuring that what was packed is exactly what will be delivered.

When Not to Use Tuples

Tuples aren't the best choice when:

You need to modify the collection after creation
The collection size will change
You need to add or remove elements
You need set operations like unions and intersections

When to Use Dictionaries

Dictionaries store key-value pairs and offer extremely fast lookups. They're perfect when:

You Need Fast Lookups by Key

This is the dictionary's superpower—retrieving values by their associated keys:

Looking up information by ID, name, or other identifier
Implementing caches or memoization
Counting occurrences (frequency counting)


# File: user_database.py
# Location: ~/python_examples/week2/

# User information accessible by username
users = {
    "alice_smith": {
        "name": "Alice Smith",
        "email": "alice@example.com",
        "role": "admin"
    },
    "bob_jones": {
        "name": "Bob Jones",
        "email": "bob@example.com",
        "role": "user"
    },
    "charlie_davis": {
        "name": "Charlie Davis",
        "email": "charlie@example.com",
        "role": "user"
    }
}

# Fast lookup by key
username = "alice_smith"
if username in users:
    user_info = users[username]
    print(f"User found: {user_info['name']}, Role: {user_info['role']}")
else:
    print(f"User {username} not found")

You Need to Map Keys to Values

When you have natural pairs of related information:

Configuration settings (name → value)
Language translations (word → translation)
Inventory tracking (item → quantity)


# File: language_translator.py
# Location: ~/python_examples/week2/

# Simple English to Spanish dictionary
translations = {
    "hello": "hola",
    "goodbye": "adiós",
    "thank you": "gracias",
    "yes": "sí",
    "no": "no",
    "please": "por favor"
}

# Translate a sentence
def translate(text, dictionary):
    words = text.lower().split()
    translated_words = []
    
    for word in words:
        if word in dictionary:
            translated_words.append(dictionary[word])
        else:
            translated_words.append(word)  # Keep untranslated
    
    return " ".join(translated_words)

english_text = "Hello please thank you goodbye"
spanish_text = translate(english_text, translations)
print(f"English: {english_text}")
print(f"Spanish: {spanish_text}")

You Need to Count Occurrences

Dictionaries are superb for keeping track of how many times items appear:


# File: word_counter.py
# Location: ~/python_examples/week2/

text = "the quick brown fox jumps over the lazy dog the fox was quick"
words = text.lower().split()

# Count word frequencies
word_counts = {}
for word in words:
    if word in word_counts:
        word_counts[word] += 1
    else:
        word_counts[word] = 1

# Alternative using get() method
word_counts_alt = {}
for word in words:
    word_counts_alt[word] = word_counts_alt.get(word, 0) + 1

# Print the most frequent words
sorted_words = sorted(word_counts.items(), key=lambda x: x[1], reverse=True)
for word, count in sorted_words[:3]:  # Top 3
    print(f"'{word}' appears {count} times")

Real-world application: Consider an e-commerce platform that needs to check product prices quickly. With thousands of products, searching through a list would be inefficient. A dictionary with product codes as keys and prices as values allows for instant price lookups, crucial for real-time operations.

When Not to Use Dictionaries

Despite their versatility, dictionaries aren't ideal when:

You need to maintain a specific order (though modern Python dictionaries do preserve insertion order)
You need to perform operations on all values frequently
Memory efficiency is a critical concern (dictionaries have some overhead)
You only care about the presence of items, not associated values (use sets)

When to Use Sets

Sets are unordered collections of unique elements with fast membership testing. They excel when:

You Need to Ensure Uniqueness

Sets automatically eliminate duplicates, making them perfect for:

Removing duplicates from a collection
Tracking distinct values
Implementing "have we seen this before?" logic


# File: unique_visitors.py
# Location: ~/python_examples/week2/

# Log of website visitors by IP address
visitor_ips = [
    "192.168.1.1", "10.0.0.2", "192.168.1.1", "192.168.1.3",
    "10.0.0.2", "172.16.0.1", "192.168.1.1", "10.0.0.4"
]

# Get unique visitors
unique_visitors = set(visitor_ips)
print(f"Total visits: {len(visitor_ips)}")
print(f"Unique visitors: {len(unique_visitors)}")
print(f"Unique IPs: {unique_visitors}")

You Need Fast Membership Testing

Checking if an item exists in a set is extremely fast, even for large collections:


# File: word_filter.py
# Location: ~/python_examples/week2/

# A set of words to filter (e.g., profanity filter)
banned_words = {"bad", "inappropriate", "offensive", "rude"}

def is_acceptable(text):
    """Check if text contains any banned words."""
    words = set(text.lower().split())
    
    # Fast check for intersection
    if words & banned_words:  # Set intersection
        return False
    return True

message1 = "This is a perfectly fine message"
message2 = "This message contains something inappropriate"

print(f"Message 1 acceptable: {is_acceptable(message1)}")
print(f"Message 2 acceptable: {is_acceptable(message2)}")

You Need Set Operations

Sets support powerful mathematical operations like unions, intersections, and differences:


# File: student_groups.py
# Location: ~/python_examples/week2/

# Students in different courses
math_students = {"Alice", "Bob", "Charlie", "Diana", "Eve"}
physics_students = {"Bob", "Diana", "Frank", "Grace"}
cs_students = {"Alice", "Charlie", "Eve", "Frank"}

# Students in both Math and CS (intersection)
math_and_cs = math_students & cs_students
print(f"Taking both Math and CS: {math_and_cs}")

# Students in any science course (union)
science_students = physics_students | cs_students
print(f"Taking at least one science course: {science_students}")

# Math students not taking Physics (difference)
math_not_physics = math_students - physics_students
print(f"Taking Math but not Physics: {math_not_physics}")

# Students taking exactly one of Math or Physics (symmetric difference)
one_subject_only = math_students ^ physics_students
print(f"Taking either Math or Physics (but not both): {one_subject_only}")

Analogy: Think of sets as a VIP list at an exclusive event. The bouncer can instantly check if someone's name is on the list (membership testing), and each person can only be on the list once (uniqueness).

When Not to Use Sets

Sets aren't the right choice when:

The order of elements matters
You need to store duplicate items
You need to associate values with keys
You need to access elements by position (index)

Combining Data Structures for Complex Problems

Real-world problems often require multiple data structures working together. Let's examine some composite examples:

A Simple Library Management System

This example shows how different data structures handle different aspects of the problem:


# File: library_system.py
# Location: ~/python_examples/week2/

# Book database as a dictionary with ISBN as key
books = {
    "978-1451673319": {
        "title": "Fahrenheit 451",
        "author": "Ray Bradbury",
        "year": 1953,
        "copies": 3
    },
    "978-0061120084": {
        "title": "To Kill a Mockingbird",
        "author": "Harper Lee",
        "year": 1960,
        "copies": 5
    },
    "978-0451524935": {
        "title": "1984",
        "author": "George Orwell",
        "year": 1949,
        "copies": 4
    }
}

# Checked out books with due dates (dictionary of sets)
checked_out = {
    "user1": {("978-1451673319", "2023-06-15")},
    "user2": {("978-0061120084", "2023-06-10"), ("978-0451524935", "2023-06-20")}
}

# Available books (calculated using sets)
def get_available_books():
    all_isbns = set(books.keys())
    checked_out_isbns = set()
    
    for user_checkouts in checked_out.values():
        for isbn, _ in user_checkouts:
            checked_out_isbns.add(isbn)
    
    available_isbns = all_isbns - checked_out_isbns
    
    # Create a list of available book details
    available_books = []
    for isbn in available_isbns:
        book = books[isbn].copy()  # Copy the book details
        book["isbn"] = isbn  # Add the ISBN to the details
        available_books.append(book)
    
    return available_books

# Check out a book
def checkout_book(user_id, isbn, due_date):
    if isbn not in books:
        return False, "Book not found"
    
    # Check if book is already checked out
    for user_checkouts in checked_out.values():
        for checkout_isbn, _ in user_checkouts:
            if checkout_isbn == isbn:
                return False, "Book already checked out"
    
    # Add to user's checkouts
    if user_id not in checked_out:
        checked_out[user_id] = set()
    
    checked_out[user_id].add((isbn, due_date))
    return True, "Book checked out successfully"

# Print available books
print("Available Books:")
for book in get_available_books():
    print(f"- {book['title']} by {book['author']} (ISBN: {book['isbn']})")

# Demonstrate checkout
status, message = checkout_book("user3", "978-0451524935", "2023-07-01")
print(f"\nCheckout result: {message}")

In this example:

A dictionary stores the book database, allowing fast lookups by ISBN
Nested dictionaries store book details
A dictionary of sets tracks checked-out books
Set operations find available books efficiently

Customer Order Processing System

Let's examine another composite example:


# File: order_processing.py
# Location: ~/python_examples/week2/

# Product catalog (dictionary for fast lookup)
products = {
    "SKU001": {"name": "Desk Lamp", "price": 19.99, "weight": 2.5},
    "SKU002": {"name": "Office Chair", "price": 149.99, "weight": 15.0},
    "SKU003": {"name": "Notebook", "price": 4.99, "weight": 0.5},
    "SKU004": {"name": "Pen Set", "price": 12.99, "weight": 0.3},
    "SKU005": {"name": "Desk Organizer", "price": 24.99, "weight": 1.8}
}

# Orders as a list (to maintain chronological order)
orders = [
    {
        "order_id": "ORD-001",
        "customer_id": "CUST-1234",
        "date": "2023-05-15",
        "items": [("SKU002", 1), ("SKU003", 2)]
    },
    {
        "order_id": "ORD-002",
        "customer_id": "CUST-5678",
        "date": "2023-05-16",
        "items": [("SKU001", 1), ("SKU004", 1), ("SKU005", 1)]
    }
]

# Process orders and calculate totals
def process_orders():
    for order in orders:
        print(f"\nProcessing Order: {order['order_id']}")
        print(f"Customer: {order['customer_id']}")
        print(f"Date: {order['date']}")
        print("Items:")
        
        total_price = 0
        total_weight = 0
        
        for sku, quantity in order["items"]:
            if sku in products:
                product = products[sku]
                item_price = product["price"] * quantity
                item_weight = product["weight"] * quantity
                
                print(f"  - {product['name']} × {quantity}: ${item_price:.2f}")
                
                total_price += item_price
                total_weight += item_weight
            else:
                print(f"  - Unknown product: {sku}")
        
        print(f"Total Price: ${total_price:.2f}")
        print(f"Total Weight: {total_weight:.1f} lbs")

# Find popular products (using a dictionary for counting)
def analyze_popular_products():
    product_counts = {}
    
    for order in orders:
        for sku, quantity in order["items"]:
            if sku in products:
                if sku not in product_counts:
                    product_counts[sku] = 0
                product_counts[sku] += quantity
    
    # Sort by popularity
    popular_products = sorted(
        product_counts.items(),
        key=lambda x: x[1],
        reverse=True
    )
    
    return popular_products

# Process all orders
process_orders()

# Show popular products
print("\nPopular Products:")
for sku, count in analyze_popular_products():
    product = products[sku]
    print(f"- {product['name']}: {count} units sold")

This example demonstrates:

A dictionary for the product catalog, enabling fast lookups
A list for orders, maintaining chronological sequence
Tuples for order items (SKU and quantity)
Another dictionary for counting product popularity

Real-world insight: In professional software development, these composite patterns are extremely common. Database records often become dictionaries, collections of unique identifiers become sets, and ordered operations use lists. Mastering when to use each structure and how to combine them effectively is a hallmark of experienced programmers.

Performance Considerations

When selecting a data structure, performance characteristics can be critical, especially for larger datasets:

Operation	List	Tuple	Dictionary	Set
Access by index	O(1)	O(1)	N/A	N/A
Search (in/contains)	O(n)	O(n)	O(1)	O(1)
Insert/Add	O(1)*	N/A	O(1)	O(1)
Delete	O(n)	N/A	O(1)	O(1)

* Amortized - resizing may occasionally take longer

These time complexities tell an important story:

If you need to frequently check if an item exists in a large collection, a list or tuple will be slow (O(n)), while dictionaries and sets are fast (O(1))
If you need to access items by position, lists and tuples are optimal
If you need to delete items frequently, dictionaries and sets outperform lists

Let's demonstrate the practical impact with a simple benchmark:


# File: data_structure_benchmark.py
# Location: ~/python_examples/week2/

import time
import random

# Generate test data
data_size = 10000
test_data = list(range(data_size))
random.shuffle(test_data)

# Convert to different data structures
test_list = test_data.copy()
test_set = set(test_data)
test_dict = {x: x for x in test_data}

# Test lookup performance
lookup_items = random.sample(test_data, 1000)  # 1000 random lookups

def benchmark_lookup(description, operation, iterations=1000):
    start_time = time.time()
    
    for _ in range(iterations):
        for item in lookup_items:
            operation(item)
    
    end_time = time.time()
    elapsed = end_time - start_time
    print(f"{description}: {elapsed:.6f} seconds")

# Run benchmarks
benchmark_lookup("List lookup", lambda x: x in test_list)
benchmark_lookup("Set lookup", lambda x: x in test_set)
benchmark_lookup("Dict lookup", lambda x: x in test_dict)

# Output will show significant differences in performance
# For example:
# List lookup: 0.123456 seconds
# Set lookup: 0.000123 seconds
# Dict lookup: 0.000135 seconds

For large datasets, the difference can be dramatic—the set and dictionary lookups might be hundreds or thousands of times faster than list lookups.

Real-world application: In a social network application, checking if a user is connected to another user must be extremely fast. Using sets to store connections allows for O(1) lookup times, enabling features like "friend suggestions" or "people you may know" to work efficiently even with millions of users.

A Decision-Making Framework

Here's a practical framework for selecting the appropriate data structure for your specific needs:

Step 1: Understand Your Data Access Patterns

Start by answering these questions:

How will you most frequently access the data? (By index, by key, sequentially, randomly?)
Will you need to add or remove items frequently?
Does order matter?
Are duplicates allowed or meaningful?
Will you need to check membership frequently?

Step 2: Consider Size and Performance Requirements

How large will the dataset be?
Are there tight performance constraints?
Is memory usage a concern?

Step 3: Apply This Decision Tree

Based on your answers, use this simplified decision tree:

If you need key-value mapping → Use a dictionary
If you need to ensure uniqueness or do set operations → Use a set
If you need a sequence that won't change → Use a tuple
If you need a modifiable, ordered sequence → Use a list
If you need fast lookups by content, not position → Use a set or dictionary

Step 4: Combine as Needed

Don't hesitate to create composite structures for complex problems, such as:

Lists of dictionaries (e.g., database records)
Dictionaries of sets (e.g., user to connections mapping)
Dictionaries of lists (e.g., category to items mapping)

Practical example: Let's apply this framework to a real problem—tracking employee attendance:


# File: attendance_tracker.py
# Location: ~/python_examples/week2/

# Requirements analysis:
# 1. Need to look up attendance by employee ID (key-value mapping) → Dictionary
# 2. Need to track dates chronologically (ordered sequence) → List
# 3. For each date, need to know if present/absent (no duplicates) → Set
# 4. Employee details won't change (immutable data) → Tuple

# Result: A dictionary of employee records (tuples) with attendance as sets

# Employee records: (name, department, start_date)
employees = {
    "E001": ("Alice Smith", "Engineering", "2020-03-15"),
    "E002": ("Bob Johnson", "Marketing", "2021-07-22"),
    "E003": ("Charlie Davis", "Finance", "2019-11-08"),
    "E004": ("Diana Wilson", "Engineering", "2022-01-10")
}

# Attendance records: {employee_id: {present_dates}}
attendance = {
    "E001": {"2023-05-01", "2023-05-02", "2023-05-03", "2023-05-05"},
    "E002": {"2023-05-01", "2023-05-02", "2023-05-03", "2023-05-04", "2023-05-05"},
    "E003": {"2023-05-01", "2023-05-03", "2023-05-05"},
    "E004": {"2023-05-01", "2023-05-02", "2023-05-04", "2023-05-05"}
}

# Track all workdays
workdays = ["2023-05-01", "2023-05-02", "2023-05-03", "2023-05-04", "2023-05-05"]

# Generate attendance report
def generate_attendance_report():
    print("Attendance Report: May 1-5, 2023\n")
    
    for emp_id, emp_details in employees.items():
        name, department, _ = emp_details
        present_days = attendance.get(emp_id, set())
        absent_days = set(workdays) - present_days
        
        attendance_rate = len(present_days) / len(workdays) * 100
        
        print(f"Employee: {name} ({emp_id}), Department: {department}")
        print(f"Attendance Rate: {attendance_rate:.1f}%")
        
        if absent_days:
            print(f"Absent on: {', '.join(sorted(absent_days))}")
        else:
            print("Perfect attendance!")
        print()

# Find employees with perfect attendance
def find_perfect_attendance():
    perfect_attendance = []
    
    for emp_id, present_days in attendance.items():
        if len(present_days) == len(workdays):
            name = employees[emp_id][0]  # Get employee name
            perfect_attendance.append(name)
    
    return perfect_attendance

# Generate the report
generate_attendance_report()

# List employees with perfect attendance
perfect = find_perfect_attendance()
print(f"Employees with perfect attendance: {', '.join(perfect)}")

This example demonstrates how a thoughtful combination of data structures leads to clean, efficient code that matches the problem's requirements.

Hands-On Exercises

Let's practice selecting appropriate data structures with some exercises. For each scenario, consider which data structure(s) would be most appropriate and implement a solution.

Exercise 1: Contact Management System

Implement a simple contact management system that allows adding, searching, and categorizing contacts. Think about how to efficiently look up contacts by different criteria (name, email, phone) and how to organize contacts into groups.


# File: contact_manager.py
# Location: ~/python_examples/week2/exercises/

# Your solution here

# Example approach:
contacts = {}  # Dictionary with email as key for unique identification

def add_contact(name, email, phone, categories=None):
    if email in contacts:
        print(f"Contact with email {email} already exists")
        return False
    
    categories = set(categories) if categories else set()
    contacts[email] = {"name": name, "phone": phone, "categories": categories}
    return True

def search_by_name(name):
    return [contacts[email] for email in contacts if name.lower() in contacts[email]["name"].lower()]

def add_to_category(email, category):
    if email in contacts:
        contacts[email]["categories"].add(category)
        return True
    return False

def get_contacts_in_category(category):
    return [contacts[email] for email in contacts if category in contacts[email]["categories"]]

# Test the system
add_contact("Alice Smith", "alice@example.com", "555-123-4567", ["work", "friends"])
add_contact("Bob Johnson", "bob@example.com", "555-987-6543", ["family"])
add_contact("Charlie Brown", "charlie@example.com", "555-567-8901", ["work"])

print("Contacts in 'work' category:")
for contact in get_contacts_in_category("work"):
    print(f"- {contact['name']}: {contact['phone']}")

search_result = search_by_name("Alice")
if search_result:
    print(f"\nFound contact: {search_result[0]['name']}")

Exercise 2: Shopping Cart System

Design a shopping cart system for an online store. Consider how to track products, quantities, calculate totals, and handle operations like adding, removing, and updating quantities.


# File: shopping_cart.py
# Location: ~/python_examples/week2/exercises/

# Your solution here

# Example approach:
# Product catalog as a dictionary
products = {
    "P001": {"name": "T-Shirt", "price": 19.99, "category": "Clothing"},
    "P002": {"name": "Jeans", "price": 49.99, "category": "Clothing"},
    "P003": {"name": "Sneakers", "price": 79.99, "category": "Footwear"},
    "P004": {"name": "Backpack", "price": 39.99, "category": "Accessories"},
    "P005": {"name": "Water Bottle", "price": 9.99, "category": "Accessories"}
}

# Shopping cart as a dictionary with product_id -> quantity mapping
class ShoppingCart:
    def __init__(self):
        self.items = {}  # product_id -> quantity
    
    def add_item(self, product_id, quantity=1):
        if product_id not in products:
            print(f"Product {product_id} not found")
            return False
        
        if product_id in self.items:
            self.items[product_id] += quantity
        else:
            self.items[product_id] = quantity
        return True
    
    def remove_item(self, product_id):
        if product_id in self.items:
            del self.items[product_id]
            return True
        return False
    
    def update_quantity(self, product_id, quantity):
        if product_id not in self.items:
            return False
        
        if quantity <= 0:
            return self.remove_item(product_id)
        
        self.items[product_id] = quantity
        return True
    
    def get_total(self):
        total = 0
        for product_id, quantity in self.items.items():
            total += products[product_id]["price"] * quantity
        return total
    
    def view_cart(self):
        if not self.items:
            print("Your cart is empty")
            return
        
        print("Your Shopping Cart:")
        total = 0
        
        for product_id, quantity in self.items.items():
            product = products[product_id]
            item_total = product["price"] * quantity
            total += item_total
            
            print(f"- {product['name']} × {quantity}: ${item_total:.2f}")
        
        print(f"\nTotal: ${total:.2f}")

# Test the cart
cart = ShoppingCart()
cart.add_item("P001", 2)  # 2 T-shirts
cart.add_item("P003", 1)  # 1 pair of sneakers
cart.add_item("P005", 3)  # 3 water bottles

cart.view_cart()

print("\nUpdating quantities...")
cart.update_quantity("P001", 1)  # Reduce to 1 T-shirt
cart.remove_item("P005")  # Remove water bottles

cart.view_cart()

Work through these exercises to practice selecting appropriate data structures for different scenarios. Try implementing your own solutions before looking at the examples.

Choosing Wisely: The Art of Data Structure Selection

Selecting the right data structure is both a science and an art. It requires understanding the strengths and weaknesses of each option, analyzing your specific requirements, and sometimes making trade-offs.

Remember these key principles:

Match the structure to your access patterns – How you'll use the data is often more important than what the data is
Consider performance implications – For large datasets, the difference between O(1) and O(n) operations can be enormous
Don't sacrifice clarity for optimization – Readable code with slightly suboptimal data structures is usually better than confusing code with perfectly optimized structures
Combine structures when necessary – Complex problems often require composite solutions
Revise as requirements evolve – Be willing to refactor when you discover new requirements or patterns

With practice, selecting the right data structure will become intuitive. You'll develop a feel for which tool fits each problem, just as an experienced carpenter knows instinctively whether to reach for a hammer, a saw, or a drill.

Next Steps

As we continue our Python journey, we'll explore how these fundamental data structures become even more powerful when combined with:

Functions – Creating reusable operations that work with your data
Classes – Building custom data structures that encapsulate both data and behavior
Algorithms – Implementing efficient solutions to complex problems
External libraries – Leveraging specialized data structures for specific domains

In our next session, we'll focus on functions – the building blocks of reusable code that operate on these data structures. We'll see how well-designed functions can make your data structures even more powerful and your code more modular and maintainable.