Week 14 Lecture: Intermediate Data Structures & Pythonic Coding

1. Sets

A Set is a built-in data structure in Python that represents an unordered collection of unique elements.

Unordered: You cannot access items by index (e.g., you cannot do my_set[0]). The order in which you insert items is not guaranteed to be preserved.
Unique: Sets automatically remove duplicates. If you add the same item twice, it only appears once.
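
Both properties can be checked directly. The following is a minimal sketch; the variable name tags and its contents are illustrative and not part of the lecture material.

```python
# Illustrative data; "red" appears twice on purpose.
tags = {"red", "green", "red", "blue"}

# Unique: the duplicate "red" is stored only once.
print(len(tags))  # 3

# Unordered: sets do not support indexing, so tags[0] raises TypeError.
try:
    tags[0]
    indexing_worked = True
except TypeError:
    indexing_worked = False

print(indexing_worked)  # False
```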

Creating Sets

Just like dictionaries, sets use curly braces {}.

my_set = {1, 2, 3}

The “Empty Set” Trap: You might think you can create an empty set using empty curly braces {}. However, Python interprets {} as an empty dictionary. To create an empty set, you must use the set() constructor:

empty_dict = {}      # This is a dictionary
empty_set = set()    # This is a set
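
If in doubt, the type can be inspected at runtime. A short sketch of that check:

```python
empty_dict = {}
empty_set = set()

print(type(empty_dict).__name__)  # dict
print(type(empty_set).__name__)   # set

# An empty set is displayed as set(), never as {}.
print(repr(empty_set))  # set()
```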

Converting to Sets: You can convert any iterable (lists, tuples, strings) into a set, provided its elements are hashable. This is a quick way to remove duplicates from a list, although the original order of the items is not preserved.

# From a list (removes duplicates)
list_to_set = set([1, 2, 2])   # Result: {1, 2}

# From a string (breaks into unique characters)
chars = set('hello')           # Result: {'h', 'e', 'l', 'o'}
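
Two caveats are worth seeing in code: set() does not keep the original order, and every element must be hashable. The sketch below uses hypothetical sample data (readings) and shows dict.fromkeys() as one possible alternative when order matters; it is not part of the lecture's own examples.

```python
readings = [3, 1, 3, 2, 1]  # hypothetical sample data

# set() removes duplicates but promises no particular order.
unique_unordered = set(readings)
print(unique_unordered == {1, 2, 3})  # True

# dict.fromkeys() keeps the first occurrence of each item, in order
# (dicts preserve insertion order in Python 3.7+).
unique_in_order = list(dict.fromkeys(readings))
print(unique_in_order)  # [3, 1, 2]

# Elements must be hashable: lists cannot be stored in a set.
try:
    set([[1, 2], [3, 4]])
    nested_ok = True
except TypeError:
    nested_ok = False
print(nested_ok)  # False
```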

Modifying Sets

Adding: Use .add(item).
Removing: There are two ways to remove items, and the difference is important:
1. .remove(item): Removes the item; raises a KeyError if the item does not exist.
2. .discard(item): Removes the item if it exists; does nothing if it doesn’t. No error is raised.

s = {1, 2, 3}
s.add(4)        # {1, 2, 3, 4}
s.remove(4)     # {1, 2, 3}
s.discard(99)   # No error, even though 99 isn't there
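
The example above never triggers the error case. A minimal sketch of what happens when remove() is called on a missing item (the outcome variable is only there to record the result):

```python
s = {1, 2, 3}

# discard() is silent when the item is missing.
s.discard(99)

# remove() raises KeyError when the item is missing.
try:
    s.remove(99)
    outcome = "removed"
except KeyError:
    outcome = "KeyError"

print(outcome)         # KeyError
print(s == {1, 2, 3})  # True (the set is unchanged)
```

A reasonable rule of thumb: use remove() when a missing item indicates a bug you want to hear about, and discard() when absence is an acceptable state.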

Set Operations (Math)

Sets allow us to compare groups of data efficiently.

Operation	Operator	Method	Analogy
Union	`|`	`.union()`	The Guest List. You make a guest list, your friend makes one. Combining them gives you all unique guests.
Intersection	`&`	`.intersection()`	Meeting Friends. You are free some days, your friend is free some days. The intersection is the days you both are free.
Difference	`-`	`.difference()`	To-Do List. List 1 is all assignments. List 2 is finished assignments. The difference is what you have left to do.
Symmetric Difference	`^`	`.symmetric_difference()`	Sports Teams. Students who play only football or only basketball, excluding those who play both.

Code Example:

set_a = {1, 2, 3}
set_b = {3, 4, 5}

# Union (All unique items: 1, 2, 3, 4, 5)
print(set_a | set_b) 

# Intersection (Shared items: 3)
print(set_a & set_b)

# Difference (In A but NOT in B: 1, 2)
# Note: Order matters here! (set_b - set_a would be {4, 5})
print(set_a - set_b)

# Symmetric Difference (Unique to A or B, not both: 1, 2, 4, 5)
print(set_a ^ set_b)
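
The method forms from the table give the same results as the operators, with one practical difference: the methods accept any iterable, while the operators need a set on both sides. A short sketch, using a list (list_b) as the second operand for illustration:

```python
set_a = {1, 2, 3}
list_b = [3, 4, 5]  # a list, not a set

# The method forms accept any iterable as their argument.
print(set_a.union(list_b) == {1, 2, 3, 4, 5})              # True
print(set_a.intersection(list_b) == {3})                   # True
print(set_a.difference(list_b) == {1, 2})                  # True
print(set_a.symmetric_difference(list_b) == {1, 2, 4, 5})  # True

# The operator forms require sets on both sides.
try:
    set_a | list_b
    operator_ok = True
except TypeError:
    operator_ok = False
print(operator_ok)  # False
```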

Practice Case 1: Access Control System

Scenario: You need to identify security risks and marketing targets by comparing a messy list of all employees against a list of gym members.

def analyze_access(employee_list, member_list):
    # Convert lists to sets to remove duplicates immediately
    emp_set = set(employee_list)
    mem_set = set(member_list)
    
    # 1. Invalid Members: In gym list, but NOT in employee list
    invalid_members = mem_set - emp_set
    
    # 2. Verified Members: In BOTH gym list AND employee list
    verified_members = mem_set & emp_set
    
    # 3. Marketing Targets: Employees who are NOT gym members
    missing_employees = emp_set - verified_members
    
    return verified_members, missing_employees, invalid_members

# Example usage:
employee_list = ["Alice", "Bob", "Charlie", "David", "Eve", "Alice"]
member_list = ["Alice", "Charlie", "Frank", "Alice", "Bob"]

verified, missing, invalid = analyze_access(employee_list, member_list)
print("Verified Members:", verified)
print("Missing Employees:", missing)
print("Invalid Members:", invalid)

Expected Output (sets are unordered, so the order of names inside the braces may differ):

Verified Members: {'Alice', 'Charlie', 'Bob'}
Missing Employees: {'Eve', 'David'}
Invalid Members: {'Frank'}

Practice Case 2: The Playlist Merger

Scenario: Merging two music libraries to find a “Master Playlist” and a “Discovery Playlist.”

def blend_playlists(list_a, list_b):
    set_a = set(list_a)
    set_b = set(list_b)
    
    # Union: Combine all songs
    master_playlist = set_a | set_b
    
    # Symmetric Difference: Songs unique to only one person (XOR)
    discovery_playlist = set_a ^ set_b
    
    return master_playlist, discovery_playlist

# Example usage:
list_a = ["Song1", "Song2", "Song3", "Song4"]
list_b = ["Song3", "Song4", "Song5", "Song6"]

master, discovery = blend_playlists(list_a, list_b)
print("Master Playlist:", master)
print("Discovery Playlist:", discovery)

Expected Output (sets are unordered, so the order of songs inside the braces may differ):

Master Playlist: {'Song1', 'Song5', 'Song6', 'Song2', 'Song3', 'Song4'}
Discovery Playlist: {'Song1', 'Song5', 'Song6', 'Song2'}

2. Advanced Iteration (`enumerate` & `zip`)

Python provides cleaner ways to loop than using range(len(list)).

Enumerate

When you need both the item and its index (position), use enumerate().

It returns a tuple (index, value) on each iteration.
You can specify a start parameter to begin counting from a number other than 0.

names = ["Alice", "Bob", "Charlie"]

# Bad way:
# for i in range(len(names)):
#     print(i, names[i])

# Pythonic way:
for i, name in enumerate(names, start=1):
    print(f"Rank {i}: {name}")
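
To see the (index, value) tuples themselves, the enumerate object can be converted to a list. A small sketch:

```python
names = ["Alice", "Bob", "Charlie"]

# Default: counting starts at 0.
pairs = list(enumerate(names))
print(pairs)  # [(0, 'Alice'), (1, 'Bob'), (2, 'Charlie')]

# With start=1 only the counter changes; the items stay the same.
ranked = list(enumerate(names, start=1))
print(ranked)  # [(1, 'Alice'), (2, 'Bob'), (3, 'Charlie')]
```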

Zip

When you need to loop over multiple lists simultaneously, use zip().

It pairs up elements at the same index from each list.
It stops automatically when the shortest list runs out.

names = ["Alice", "Bob", "Charlie"]
scores = [85, 90, 78]

for name, score in zip(names, scores):
    print(f"{name} scored {score}")
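
The "stops at the shortest list" rule can silently drop data when the lists differ in length. The sketch below shortens scores on purpose to show this, and uses itertools.zip_longest from the standard library as one way to keep the unmatched items.

```python
from itertools import zip_longest

names = ["Alice", "Bob", "Charlie"]
scores = [85, 90]  # one score is missing on purpose

# zip() stops at the shorter list, so Charlie is dropped without warning.
truncated = list(zip(names, scores))
print(truncated)  # [('Alice', 85), ('Bob', 90)]

# zip_longest() pads the shorter input with fillvalue instead.
padded = list(zip_longest(names, scores, fillvalue=None))
print(padded)  # [('Alice', 85), ('Bob', 90), ('Charlie', None)]
```

On Python 3.10 and later, zip(names, scores, strict=True) is another option: it raises a ValueError instead of truncating.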

Combining Both: You can use enumerate on a zip object if you need the index and items from multiple lists.

Practice Case: The Quiz Grader

Scenario: Compare a student’s answers to an answer key and provide specific feedback using the question number.

def grade_quiz(answer_key, student_answers):
    score = 0
    feedback = []
    
    # We unpack the index (q_num) from enumerate...
    # ...and the pair (correct, actual) from zip
    for q_num, (correct, actual) in enumerate(zip(answer_key, student_answers), start=1):
        if correct == actual:
            score += 1
        else:
            feedback.append(f"Question {q_num}: Expected {correct}, got {actual}")
            
    return score, feedback

# Example usage:
answer_key = ["A", "B", "C", "D", "A"]
student_answers = ["A", "C", "C", "D", "B"]

score, feedback = grade_quiz(answer_key, student_answers)
print("Score:", score)
print("Feedback:")
for item in feedback:
    print(" ", item)

Expected Output:

Score: 3
Feedback:
  Question 2: Expected B, got C
  Question 5: Expected A, got B

3. List & Dictionary Comprehensions

Comprehensions allow you to create new lists or dictionaries based on existing ones in a single, readable line. They generally follow this pattern:

[expression for item in iterable if condition]

The Logic Steps

Imagine we want squares of even numbers from [1, 2, 3, 4, 5].

Loop: for n in numbers
Filter: if n % 2 == 0
Expression: n ** 2
Collection Type: Wrap in [] for list, {} for set/dict.

numbers = [1, 2, 3, 4, 5]

# List Comprehension
squares = [n ** 2 for n in numbers if n % 2 == 0] # [4, 16]

# Set Comprehension
unique_squares = {n ** 2 for n in numbers}

# Dictionary Comprehension (Key: Value)
square_dict = {n: n**2 for n in numbers}          # {1: 1, 2: 4, ...}
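
It can help to read the list comprehension as a compact form of an ordinary loop. The sketch below writes out the loop that corresponds to the three logic steps and checks that both versions agree.

```python
numbers = [1, 2, 3, 4, 5]

# Explicit loop version of the same logic.
squares_loop = []
for n in numbers:                    # Loop
    if n % 2 == 0:                   # Filter
        squares_loop.append(n ** 2)  # Expression

# Comprehension version.
squares_comp = [n ** 2 for n in numbers if n % 2 == 0]

print(squares_comp)                  # [4, 16]
print(squares_loop == squares_comp)  # True
```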

Note on Tuples: Putting parentheses around a comprehension (n for n in numbers) does not create a tuple. It creates a Generator (see Section 4 for more on Iterators). To create a tuple, you must cast it explicitly: tuple(n for n in numbers).
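
A quick sketch that confirms the difference:

```python
numbers = [1, 2, 3, 4, 5]

# Parentheses alone produce a generator, not a tuple.
gen = (n for n in numbers)
print(type(gen).__name__)  # generator

# Passing the generator expression to tuple() builds the tuple.
as_tuple = tuple(n for n in numbers)
print(as_tuple)  # (1, 2, 3, 4, 5)
```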

Practice Case: Text Analyzer

Scenario: Extract keywords longer than 3 letters from a sentence, uppercase them, and map them to their lengths.

text = "python is an amazing programming language for beginners"

# 1. List Comprehension: Filter > 3 chars, Transform to UPPER
keywords = [word.upper() for word in text.split() if len(word) > 3]

# 2. Dictionary Comprehension: Map word -> length
word_metrics = {word: len(word) for word in keywords}

print("Keywords:", keywords)
print("Word Metrics:", word_metrics)

Expected Output:

Keywords: ['PYTHON', 'AMAZING', 'PROGRAMMING', 'LANGUAGE', 'BEGINNERS']
Word Metrics: {'PYTHON': 6, 'AMAZING': 7, 'PROGRAMMING': 11, 'LANGUAGE': 8, 'BEGINNERS': 9}

4. Functional Programming (`lambda`, `map`, `filter`)

Lambda Functions

A lambda is an anonymous (nameless) function whose body is a single expression. It is useful when you need a short function for a quick operation, like a custom sorting rule.

Syntax: lambda arguments: expression (No return keyword needed).

# Standard function
def add_five(x):
    return x + 5

# Equivalent Lambda
add_five_lambda = lambda x: x + 5

Practical Use: Custom Sorting. The most common use of lambda is as the key argument of sort(), sorted(), min(), or max().

prices = [("apple", 15), ("banana", 5), ("cherry", 10)]

# Sort by price (index 1), descending (using negative value)
prices.sort(key=lambda item: -item[1])
# Result: [('apple', 15), ('cherry', 10), ('banana', 5)]
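
The same key idea works with sorted(), min(), and max(). The sketch below reuses the prices data and uses reverse=True as an alternative to negating the value, which also works when the key is not a number.

```python
prices = [("apple", 15), ("banana", 5), ("cherry", 10)]

# sorted() returns a new list and leaves the original untouched.
by_price_desc = sorted(prices, key=lambda item: item[1], reverse=True)
print(by_price_desc)  # [('apple', 15), ('cherry', 10), ('banana', 5)]

# min() and max() compare items by the value the key function returns.
cheapest = min(prices, key=lambda item: item[1])
priciest = max(prices, key=lambda item: item[1])
print(cheapest)  # ('banana', 5)
print(priciest)  # ('apple', 15)
```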

Map and Filter (Lazy Evaluation)

These functions apply a rule to a sequence. Crucially, they return Iterators, not lists.

map(func, iterable): Applies func to every item.
filter(func, iterable): Keeps only the items for which func returns a truthy value.

Understanding Iterators (Lazy Evaluation): Iterators do not store all values in memory at once. They generate the next value only when asked (e.g., in a loop).

They are “one-time use”: once you loop through an iterator, it is exhausted and yields nothing more.
You can use next(iterator) to get the next item manually.
To see all items at once, convert to a list: list(my_iterator).
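
The three points above can be traced step by step. A minimal sketch using a map object (the variable names are illustrative):

```python
nums = [1, 2, 3]
doubled = map(lambda x: x * 2, nums)

# next() pulls one value at a time.
first = next(doubled)
print(first)  # 2

# list() consumes whatever is left.
rest = list(doubled)
print(rest)  # [4, 6]

# The iterator is now exhausted, so a second pass yields nothing.
leftover = list(doubled)
print(leftover)  # []

# next() on an exhausted iterator raises StopIteration
# unless a default value is supplied as the second argument.
sentinel = next(doubled, None)
print(sentinel)  # None
```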

nums = [1, 2, 3, 4]

# Map: Double the numbers
doubled = map(lambda x: x * 2, nums)

# Filter: Keep evens
evens = filter(lambda x: x % 2 == 0, nums)

# Must convert to list to print/view results
print(list(doubled))  # [2, 4, 6, 8]
print(list(evens))    # [2, 4]
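
For comparison, the same two results can be written with the comprehensions from Section 3. This is a sketch of the equivalence, not a recommendation of one style over the other; a generator expression keeps the lazy behaviour of map().

```python
nums = [1, 2, 3, 4]

# Same result as list(map(lambda x: x * 2, nums))
doubled = [x * 2 for x in nums]
print(doubled)  # [2, 4, 6, 8]

# Same result as list(filter(lambda x: x % 2 == 0, nums))
evens = [x for x in nums if x % 2 == 0]
print(evens)  # [2, 4]

# A generator expression is lazy, like map(): values are produced on demand.
lazy_doubled = (x * 2 for x in nums)
total = sum(lazy_doubled)
print(total)  # 20
```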

Practice Case: Inventory Sorter

Scenario: Filter items that need restocking (< 10 qty) and sort inventory by total value (Qty * Price).

inventory = [
    ("Widget A", 50, 2.0),
    ("Widget B", 5, 10.0),
    ("Widget C", 8, 100.0),
    ("Widget D", 2, 5.0)
]

# 1. Filter restock items
restock_list = list(filter(lambda item: item[1] < 10, inventory))

# 2. Sort by Total Value (Quantity * Price) using lambda
inventory.sort(key=lambda item: item[1] * item[2], reverse=True)

print("Restock List:", restock_list)
print("Sorted Inventory by Total Value:", inventory)

Expected Output:

Restock List: [('Widget B', 5, 10.0), ('Widget C', 8, 100.0), ('Widget D', 2, 5.0)]
Sorted Inventory by Total Value: [('Widget C', 8, 100.0), ('Widget A', 50, 2.0), ('Widget B', 5, 10.0), ('Widget D', 2, 5.0)]
