Week 6 Assignments

Variant 1: Student Gradebook Analyzer

You are a Teaching Assistant for a programming course. You have the gradebook data stored as a list of tuples. Each tuple contains a student’s name and a list of their quiz scores. Your task is to build a Python script that analyzes this data.

You must implement the following functions:

calculate_average(scores_list):
- Input: A list of numbers (a student’s scores).
- Output: The average of the scores in the list. Return 0.0 for an empty list.
find_top_student(gradebook):
- Input: The main gradebook list.
- Output: The name of the student with the highest average score. If there’s a tie, return the name that comes first alphabetically.
get_grades_for_student(gradebook, student_name):
- Input: The gradebook list and a string student_name.
- Output: A list of scores for the specified student. If the student is not found, return an empty list [].
get_hardest_quiz_index(gradebook):
- Input: The gradebook list.
- Logic: Calculate the average score for each quiz (i.e., the average of all scores at index 0, then index 1, etc.).
- Output: The index of the quiz with the lowest average score. If there’s a tie, return the smaller index. Assume all students have taken the same number of quizzes.

Finally, create a main function analyze_gradebook(gradebook) that uses your helper functions to return a summary tuple containing: (top_student_name, grades_of_top_student, hardest_quiz_idx)

Testing Inputs:

gradebook = [
    ('Alice', [85, 80, 92]),
    ('Bob', [88, 74, 85]),
    ('Charlie', [92, 85, 88]),
    ('David', [90, 78, 94])
]

Expected Output:

('Charlie', [92, 85, 88], 1)
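
A minimal sketch of one way to approach get_hardest_quiz_index, offered as an illustration rather than a reference solution. It uses zip(*...) to regroup the per-student score lists into per-quiz columns, and relies on min() keeping the first lowest value it meets, which matches the smaller-index tie rule. It assumes a non-empty gradebook in which every student has the same number of scores, as the task states.

```python
def get_hardest_quiz_index(gradebook):
    """Return the index of the quiz with the lowest class average."""
    # zip(*rows) transposes per-student score lists into per-quiz columns.
    columns = list(zip(*(scores for _name, scores in gradebook)))
    averages = [sum(col) / len(col) for col in columns]
    # min() returns the first minimal item, so ties resolve to the smaller index.
    return min(range(len(averages)), key=lambda i: averages[i])


gradebook = [
    ('Alice', [85, 80, 92]),
    ('Bob', [88, 74, 85]),
    ('Charlie', [92, 85, 88]),
    ('David', [90, 78, 94]),
]
print(get_hardest_quiz_index(gradebook))  # 1
```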

Variant 2: E-commerce Product Analyzer

You are a data analyst for an online store. You have a dataset of products, represented as a list of tuples. Each tuple has the form (product_id, category, price, units_sold). Your task is to write a Python script to extract key business metrics from this data.

You must implement the following functions:

calculate_product_revenue(product_tuple):
- Input: A single product tuple (id, cat, price, sold).
- Output: The total revenue for that product (price * units_sold).
find_top_revenue_product(products):
- Input: The main list of product tuples.
- Output: The product_id of the product that generated the most revenue. If there’s a tie, return the product_id that comes first alphabetically.
get_products_in_category(products, category_name):
- Input: The products list and a string category_name.
- Output: A list of product_ids that belong to the specified category. The list should be sorted alphabetically.
get_category_sales_summary(products):
- Input: The products list.
- Logic: Find all unique categories. For each category, calculate the total number of units sold.
- Output: A list of tuples ('category_name', total_units_sold). This list must be sorted alphabetically by category name.

Finally, create a main function analyze_products(products) that uses your helper functions to return a summary tuple containing: (top_revenue_product_id, electronics_product_ids, category_summary) where electronics_product_ids is the result of calling get_products_in_category() with 'Electronics'.

Testing Inputs:

products = [
    ('P101', 'Electronics', 799.99, 150),
    ('P205', 'Books', 24.50, 500),
    ('P102', 'Electronics', 499.50, 200),
    ('P301', 'Home Goods', 120.00, 800),
    ('P206', 'Books', 19.99, 650)
]

Expected Output:

('P101', ['P101', 'P102'], [('Books', 1150), ('Electronics', 350), ('Home Goods', 800)])
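
The grouping step in get_category_sales_summary can be sketched with a plain dictionary that accumulates units per category. This is one possible approach, not the only acceptable one. Because category names are unique dictionary keys, sorting the (key, value) pairs orders the result alphabetically by category.

```python
def get_category_sales_summary(products):
    """Return [(category, total_units_sold), ...] sorted by category name."""
    totals = {}
    for _product_id, category, _price, units_sold in products:
        # dict.get supplies 0 the first time a category is seen.
        totals[category] = totals.get(category, 0) + units_sold
    return sorted(totals.items())


products = [
    ('P101', 'Electronics', 799.99, 150),
    ('P205', 'Books', 24.50, 500),
    ('P102', 'Electronics', 499.50, 200),
    ('P301', 'Home Goods', 120.00, 800),
    ('P206', 'Books', 19.99, 650),
]
print(get_category_sales_summary(products))
# [('Books', 1150), ('Electronics', 350), ('Home Goods', 800)]
```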

Variant 3: Weather Data Analyzer

You are a meteorologist analyzing historical weather data. The data is provided as a list of tuples, where each tuple has the form ('YYYY-MM-DD', max_temp_celsius, min_temp_celsius, precipitation_mm). Your job is to write a Python script to analyze this weather data.

You must implement the following functions:

get_daily_temp_swing(weather_day_tuple):
- Input: A single weather tuple (date, max_t, min_t, precip).
- Output: The difference between the max and min temperature for that day.
find_day_with_largest_swing(weather_data):
- Input: The main list of weather data.
- Output: The date string ('YYYY-MM-DD') of the day with the largest temperature swing. If there’s a tie, return the date that occurs earliest in the list.
count_days_above_precip(weather_data, threshold):
- Input: The weather data list and a numerical threshold.
- Output: The number of days where precipitation was strictly greater than the threshold.
get_monthly_summary(weather_data):
- Input: The weather data list.
- Logic: Find all unique months (e.g., '2023-10'). For each month, calculate the average maximum temperature and the total precipitation.
- Output: A list of tuples ('YYYY-MM', avg_max_temp, total_precip). This list must be sorted by month.

Finally, create a main function analyze_weather(weather_data) that uses your helper functions to return a summary tuple containing: (day_with_largest_swing, num_heavy_rain_days, monthly_summary) where num_heavy_rain_days is the count for a precipitation threshold of 10.0.

Testing Inputs:

weather_data = [
    ('2023-10-01', 22, 10, 5.5),   # Swing: 12
    ('2023-10-02', 25, 11, 0.0),   # Swing: 14
    ('2023-10-03', 24, 15, 12.0),  # Swing: 9
    ('2023-11-01', 18, 7, 2.5),    # Swing: 11
    ('2023-11-02', 15, 6, 15.5),   # Swing: 9
    ('2023-11-03', 16, 9, 8.0)    # Swing: 7
]

Expected Output:

('2023-10-02', 2, [('2023-10', 23.666666666666668, 17.5), ('2023-11', 16.333333333333332, 26.0)])
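
Two points in this variant differ from the others: ties are broken by position in the list rather than alphabetically, and the month key has to be derived from the date string. The sketch below shows one way to handle both. It assumes dates are always well-formed 'YYYY-MM-DD' strings, so the first seven characters give the month, and that the input list is non-empty when looking for the largest swing.

```python
def find_day_with_largest_swing(weather_data):
    """Return the date with the largest max-min difference (earliest wins ties)."""
    # max() returns the first maximal item it meets, i.e. the earliest in the list.
    return max(weather_data, key=lambda day: day[1] - day[2])[0]


def get_monthly_summary(weather_data):
    """Return [(month, avg_max_temp, total_precip), ...] sorted by month."""
    by_month = {}
    for date, max_t, _min_t, precip in weather_data:
        month = date[:7]  # 'YYYY-MM-DD' -> 'YYYY-MM'
        by_month.setdefault(month, []).append((max_t, precip))
    summary = []
    for month in sorted(by_month):
        days = by_month[month]
        avg_max = sum(max_t for max_t, _precip in days) / len(days)
        total_precip = sum(precip for _max_t, precip in days)
        summary.append((month, avg_max, total_precip))
    return summary


weather_data = [
    ('2023-10-01', 22, 10, 5.5),
    ('2023-10-02', 25, 11, 0.0),
    ('2023-10-03', 24, 15, 12.0),
    ('2023-11-01', 18, 7, 2.5),
    ('2023-11-02', 15, 6, 15.5),
    ('2023-11-03', 16, 9, 8.0),
]
print(find_day_with_largest_swing(weather_data))  # 2023-10-02
print(get_monthly_summary(weather_data))
```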

Variant 4: Basketball Player Stats Analyzer

You are a sports analyst for a basketball league. You have season data for several players stored as a list of tuples. Each tuple contains a player’s name, their team, and a list of points scored in each game. Your task is to build a Python script that analyzes this performance data.

You must implement the following functions:

calculate_ppg(points_list):
- Input: A list of numbers (a player’s points per game).
- Output: The average points per game (PPG) for the player. Return 0.0 for an empty list.
find_mvp(player_stats):
- Input: The main list of player stats.
- Output: The name of the player with the highest PPG. If there’s a tie, return the name that comes first alphabetically.
get_players_from_team(player_stats, team_name):
- Input: The player stats list and a string team_name.
- Output: A list of player names who play for the specified team. The list should be sorted alphabetically.
get_team_scoring_summary(player_stats):
- Input: The player stats list.
- Logic: Find all unique teams. For each team, calculate the total points scored by all its players across all games.
- Output: A list of tuples ('team_name', total_points). This list must be sorted alphabetically by team name.

Finally, create a main function analyze_player_stats(player_stats) that uses your helper functions to return a summary tuple containing: (mvp_name, lakers_players, team_summary) where lakers_players is the result of calling get_players_from_team() with 'Lakers'.

Testing Inputs:

player_stats = [
    ('LeBron', 'Lakers', [30, 25, 35]),
    ('Curry', 'Warriors', [32, 28, 27]),
    ('Davis', 'Lakers', [22, 26, 20]),
    ('Tatum', 'Celtics', [28, 29, 31])
]

Expected Output:

('LeBron', ['Davis', 'LeBron'], [('Celtics', 88), ('Lakers', 158), ('Warriors', 87)])
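
The "highest value, then alphabetical" tie rule can be expressed as a single sort key. The sketch below is one possible way to write find_mvp: it negates the PPG so that min() picks the highest scorer and, among equal scorers, the name that sorts first. Note that Python compares strings by code point, so this is case-sensitive; the sample names are not affected. It assumes player_stats is non-empty.

```python
def calculate_ppg(points_list):
    """Average points per game; 0.0 for an empty list."""
    if not points_list:
        return 0.0
    return sum(points_list) / len(points_list)


def find_mvp(player_stats):
    """Name of the player with the highest PPG (alphabetical on ties)."""
    # Key: highest PPG first (negated), then name ascending.
    best = min(player_stats, key=lambda p: (-calculate_ppg(p[2]), p[0]))
    return best[0]


player_stats = [
    ('LeBron', 'Lakers', [30, 25, 35]),
    ('Curry', 'Warriors', [32, 28, 27]),
    ('Davis', 'Lakers', [22, 26, 20]),
    ('Tatum', 'Celtics', [28, 29, 31]),
]
print(find_mvp(player_stats))  # LeBron
```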

Variant 5: Investment Portfolio Analyzer

You are a financial analyst managing a client’s stock portfolio. The portfolio data is represented as a list of tuples. Each tuple has the form (ticker_symbol, sector, purchase_price, current_price, num_shares). Your goal is to write a Python script to evaluate the portfolio’s performance.

You must implement the following functions:

calculate_holding_profit(stock_tuple):
- Input: A single stock tuple (ticker, sector, p_price, c_price, shares).
- Output: The total profit or loss for that holding ((current_price - purchase_price) * num_shares).
find_top_performer(portfolio):
- Input: The main list of portfolio holdings.
- Output: The ticker_symbol of the stock that has generated the most profit. If there’s a tie, return the ticker that comes first alphabetically.
get_tickers_in_sector(portfolio, sector_name):
- Input: The portfolio list and a string sector_name.
- Output: A list of ticker_symbols that belong to the specified sector. The list should be sorted alphabetically.
get_sector_value_summary(portfolio):
- Input: The portfolio list.
- Logic: Find all unique sectors. For each sector, calculate the total current market value of all holdings (current_price * num_shares).
- Output: A list of tuples ('sector_name', total_market_value). This list must be sorted alphabetically by sector name.

Finally, create a main function analyze_portfolio(portfolio) that uses your helper functions to return a summary tuple containing: (top_performer_ticker, tech_tickers, sector_summary) where tech_tickers is the result of calling get_tickers_in_sector() with 'Technology'.

Testing Inputs:

portfolio = [
    ('AAPL', 'Technology', 150.00, 175.00, 100),  # Profit: 2500
    ('JPM', 'Finance', 160.00, 165.00, 200),     # Profit: 1000
    ('GOOG', 'Technology', 2800.00, 2750.00, 10), # Profit: -500
    ('PFE', 'Healthcare', 40.00, 55.00, 500)     # Profit: 7500
]

Expected Output:

('PFE', ['AAPL', 'GOOG'], [('Finance', 33000.0), ('Healthcare', 27500.0), ('Technology', 45000.0)])
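
To illustrate how the main function is meant to compose the helpers, here is a compact sketch of the whole variant. It is one possible arrangement under the assumptions in the task text (a non-empty portfolio, five-field tuples), not a model answer. The main function contains no logic of its own; it only calls the helpers and packs their results into a tuple.

```python
def calculate_holding_profit(stock_tuple):
    _ticker, _sector, p_price, c_price, shares = stock_tuple
    return (c_price - p_price) * shares


def find_top_performer(portfolio):
    # Highest profit first (negated), then ticker ascending for ties.
    best = min(portfolio, key=lambda s: (-calculate_holding_profit(s), s[0]))
    return best[0]


def get_tickers_in_sector(portfolio, sector_name):
    return sorted(s[0] for s in portfolio if s[1] == sector_name)


def get_sector_value_summary(portfolio):
    totals = {}
    for _ticker, sector, _p_price, c_price, shares in portfolio:
        totals[sector] = totals.get(sector, 0) + c_price * shares
    return sorted(totals.items())


def analyze_portfolio(portfolio):
    # Pure composition: each element comes from one helper.
    return (
        find_top_performer(portfolio),
        get_tickers_in_sector(portfolio, 'Technology'),
        get_sector_value_summary(portfolio),
    )


portfolio = [
    ('AAPL', 'Technology', 150.00, 175.00, 100),
    ('JPM', 'Finance', 160.00, 165.00, 200),
    ('GOOG', 'Technology', 2800.00, 2750.00, 10),
    ('PFE', 'Healthcare', 40.00, 55.00, 500),
]
print(analyze_portfolio(portfolio))
```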

Variant 6: Movie Box Office Analyzer

You are a film industry analyst studying box office performance. You have a dataset of movies, represented as a list of tuples. Each tuple has the form (movie_id, genre, budget_millions, revenue_millions). Your task is to write a Python script to analyze the financial success of these movies.

You must implement the following functions:

calculate_roi(movie_tuple):
- Input: A single movie tuple (id, genre, budget, revenue).
- Output: The movie’s return on investment (ROI) ratio (revenue / budget). Assume budget is never zero.
find_most_profitable_movie(movies):
- Input: The main list of movie tuples.
- Output: The movie_id of the movie with the highest ROI. If there’s a tie, return the movie_id that comes first alphabetically.
get_movies_in_genre(movies, genre_name):
- Input: The movies list and a string genre_name.
- Output: A list of movie_ids that belong to the specified genre. The list should be sorted alphabetically.
get_genre_revenue_summary(movies):
- Input: The movies list.
- Logic: Find all unique genres. For each genre, calculate the total box office revenue.
- Output: A list of tuples ('genre_name', total_revenue). This list must be sorted alphabetically by genre name.

Finally, create a main function analyze_movie_data(movies) that uses your helper functions to return a summary tuple containing: (most_profitable_movie_id, sci_fi_movie_ids, genre_summary) where sci_fi_movie_ids is the result of calling get_movies_in_genre() with 'Sci-Fi'.

Testing Inputs:

movies = [
    ('M01', 'Sci-Fi', 200, 800),  # ROI: 4.0
    ('M02', 'Comedy', 50, 250),   # ROI: 5.0
    ('M03', 'Sci-Fi', 150, 500),  # ROI: 3.33
    ('M04', 'Action', 250, 700),  # ROI: 2.8
    ('M05', 'Comedy', 30, 150)    # ROI: 5.0
]

Expected Output:

('M02', ['M01', 'M03'], [('Action', 700), ('Comedy', 400), ('Sci-Fi', 1300)])
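
The testing inputs for this variant contain a real tie: M02 and M05 both have an ROI of 5.0. A minimal sketch of one way to resolve it is to order the movies by ID first and then take max(), which returns the first maximal item it meets. This assumes movie IDs are unique and that tied ROI values compare exactly equal as floats, which holds for the sample data but is not guaranteed for arbitrary ratios.

```python
def calculate_roi(movie_tuple):
    _movie_id, _genre, budget, revenue = movie_tuple
    return revenue / budget


def find_most_profitable_movie(movies):
    # Sort by ID so that, among equal ROIs, the alphabetically first ID is met first.
    by_id = sorted(movies, key=lambda m: m[0])
    return max(by_id, key=calculate_roi)[0]


movies = [
    ('M01', 'Sci-Fi', 200, 800),
    ('M02', 'Comedy', 50, 250),
    ('M03', 'Sci-Fi', 150, 500),
    ('M04', 'Action', 250, 700),
    ('M05', 'Comedy', 30, 150),
]
print(find_most_profitable_movie(movies))  # M02
```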

Variant 7: Restaurant Menu Analyzer

You are the manager of a restaurant and need to analyze the profitability and popularity of your menu items. The data is stored as a list of tuples, where each tuple has the form (dish_name, category, cost_to_make, sale_price, times_ordered). Your task is to write a Python script to analyze this menu data.

You must implement the following functions:

calculate_total_profit(dish_tuple):
- Input: A single dish tuple (name, cat, cost, price, ordered).
- Output: The total profit generated by that dish ((sale_price - cost_to_make) * times_ordered).
find_most_profitable_dish(menu_data):
- Input: The main list of menu items.
- Output: The dish_name of the item that has generated the most total profit. If there’s a tie, return the name that comes first alphabetically.
get_dishes_in_category(menu_data, category_name):
- Input: The menu data list and a string category_name.
- Output: A list of dish_names that belong to the specified category. The list should be sorted alphabetically.
get_category_popularity(menu_data):
- Input: The menu data list.
- Logic: Find all unique categories. For each category, calculate the total number of times dishes in that category were ordered.
- Output: A list of tuples ('category_name', total_orders). This list must be sorted alphabetically by category name.

Finally, create a main function analyze_menu(menu_data) that uses your helper functions to return a summary tuple containing: (most_profitable_dish, main_course_dishes, category_summary) where main_course_dishes is the result of calling get_dishes_in_category() with 'Main Course'.

Testing Inputs:

menu_data = [
    ('Bruschetta', 'Appetizer', 2.50, 8.50, 150),          # Profit: 900
    ('Steak Frites', 'Main Course', 9.00, 24.00, 200),     # Profit: 3000
    ('Caesar Salad', 'Appetizer', 3.00, 10.00, 250),       # Profit: 1750
    ('Pasta Carbonara', 'Main Course', 5.50, 18.50, 300),  # Profit: 3900
    ('Tiramisu', 'Dessert', 4.00, 9.00, 400)               # Profit: 2000
]

Expected Output:

('Pasta Carbonara', ['Pasta Carbonara', 'Steak Frites'], [('Appetizer', 400), ('Dessert', 400), ('Main Course', 500)])
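
The per-dish profit and the category filter both come down to tuple unpacking. The sketch below is one possible way to write them, assuming every tuple has exactly the five fields listed in the task. Unpacking into named variables keeps the formula readable compared with indexing by position.

```python
def calculate_total_profit(dish_tuple):
    _name, _category, cost, price, ordered = dish_tuple
    return (price - cost) * ordered


def get_dishes_in_category(menu_data, category_name):
    # sorted() returns a new alphabetically ordered list; no match gives [].
    return sorted(
        name
        for name, category, _cost, _price, _ordered in menu_data
        if category == category_name
    )


menu_data = [
    ('Bruschetta', 'Appetizer', 2.50, 8.50, 150),
    ('Steak Frites', 'Main Course', 9.00, 24.00, 200),
    ('Caesar Salad', 'Appetizer', 3.00, 10.00, 250),
    ('Pasta Carbonara', 'Main Course', 5.50, 18.50, 300),
    ('Tiramisu', 'Dessert', 4.00, 9.00, 400),
]
print(calculate_total_profit(menu_data[0]))              # 900.0
print(get_dishes_in_category(menu_data, 'Main Course'))  # ['Pasta Carbonara', 'Steak Frites']
```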

Variant 8: Library Circulation Analyzer

You are a librarian analyzing book circulation data to understand reading trends. The data is a list of tuples, where each tuple has the form (book_id, genre, publication_year, total_checkouts). Your task is to write a Python script to analyze this data. Assume the current year is 2024 for all calculations.

You must implement the following functions:

calculate_annualized_checkouts(book_tuple):
- Input: A single book tuple (id, genre, pub_year, checkouts).
- Output: The average number of checkouts per year since publication (total_checkouts / (2024 - publication_year)). If publication_year is 2024, return total_checkouts to avoid dividing by zero.
find_most_popular_book(library_data):
- Input: The main list of library book data.
- Output: The book_id of the book with the highest annualized checkouts. If there’s a tie, return the book_id that comes first alphabetically.
get_books_in_genre(library_data, genre_name):
- Input: The library data list and a string genre_name.
- Output: A list of book_ids that belong to the specified genre. The list should be sorted alphabetically.
get_genre_circulation_summary(library_data):
- Input: The library data list.
- Logic: Find all unique genres. For each genre, calculate the total number of checkouts.
- Output: A list of tuples ('genre_name', total_checkouts). This list must be sorted alphabetically by genre name.

Finally, create a main function analyze_library(library_data) that uses your helper functions to return a summary tuple containing: (most_popular_book_id, fantasy_book_ids, genre_summary) where fantasy_book_ids is the result of calling get_books_in_genre() with 'Fantasy'.

Testing Inputs:

# Assume current year is 2024
library_data = [
    ('B101', 'Fantasy', 2001, 506),    # Annualized: 506 / 23 = 22.0
    ('B205', 'Sci-Fi', 1999, 600),     # Annualized: 600 / 25 = 24.0
    ('B102', 'Fantasy', 2015, 207),    # Annualized: 207 / 9 = 23.0
    ('B301', 'Mystery', 2020, 100),    # Annualized: 100 / 4 = 25.0
    ('B206', 'Sci-Fi', 2018, 144)      # Annualized: 144 / 6 = 24.0
]

Expected Output:

('B301', ['B101', 'B102'], [('Fantasy', 713), ('Mystery', 100), ('Sci-Fi', 744)])
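
The special case for books published in 2024 exists because the divisor would otherwise be zero. A minimal sketch of the guard follows. The constant name CURRENT_YEAR is an illustrative choice, not part of the assignment, and the behaviour for publication years after 2024 is left undefined because the task does not specify it.

```python
CURRENT_YEAR = 2024  # fixed by the assignment; the constant name is illustrative


def calculate_annualized_checkouts(book_tuple):
    _book_id, _genre, pub_year, checkouts = book_tuple
    years = CURRENT_YEAR - pub_year
    if years == 0:
        # Published this year: return the raw count instead of dividing by zero.
        return checkouts
    return checkouts / years


print(calculate_annualized_checkouts(('B101', 'Fantasy', 2001, 506)))  # 22.0
print(calculate_annualized_checkouts(('B999', 'Fantasy', 2024, 7)))    # 7
```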

Variant 9: Sales Team Performance Analyzer

You are a sales manager evaluating the performance of your sales team. The data is organized as a list of tuples, where each tuple has the form (employee_id, region, quarterly_sales_list). Your assignment is to write a Python script to analyze this sales data.

You must implement the following functions:

calculate_average_sales(sales_list):
- Input: A list of numbers (an employee’s quarterly sales).
- Output: The average of the sales figures in the list. Return 0.0 for an empty list.
find_top_salesperson(sales_data):
- Input: The main list of sales data.
- Output: The employee_id of the salesperson with the highest average quarterly sales. If there’s a tie, return the ID that comes first alphabetically.
get_employees_in_region(sales_data, region_name):
- Input: The sales data list and a string region_name.
- Output: A list of employee_ids for salespeople in that region. The list should be sorted alphabetically.
get_regional_sales_total(sales_data):
- Input: The sales data list.
- Logic: Find all unique regions. For each region, calculate the total combined sales from all quarters for all employees in that region.
- Output: A list of tuples ('region_name', total_sales). This list must be sorted alphabetically by region name.

Finally, create a main function analyze_sales_data(sales_data) that uses your helper functions to return a summary tuple containing: (top_salesperson_id, north_region_employees, regional_summary) where north_region_employees is the result of calling get_employees_in_region() with 'North'.

Testing Inputs:

sales_data = [
    ('E101', 'North', [50000, 60000, 55000]), # Avg: 55000
    ('E201', 'South', [70000, 75000, 80000]), # Avg: 75000
    ('E102', 'North', [85000, 90000, 95000]), # Avg: 90000
    ('E301', 'West', [65000, 60000, 58000])  # Avg: 61000
]

Expected Output:

('E102', ['E101', 'E102'], [('North', 435000), ('South', 225000), ('West', 183000)])
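
Here each record carries a list of figures, so the regional total needs two levels of summing: across quarters for one employee, then across employees in a region. One possible sketch uses collections.defaultdict so that a region's running total starts at 0 automatically.

```python
from collections import defaultdict


def get_regional_sales_total(sales_data):
    """Return [(region, total_sales), ...] sorted by region name."""
    totals = defaultdict(int)  # missing regions start at 0
    for _employee_id, region, quarterly_sales in sales_data:
        totals[region] += sum(quarterly_sales)
    return sorted(totals.items())


sales_data = [
    ('E101', 'North', [50000, 60000, 55000]),
    ('E201', 'South', [70000, 75000, 80000]),
    ('E102', 'North', [85000, 90000, 95000]),
    ('E301', 'West', [65000, 60000, 58000]),
]
print(get_regional_sales_total(sales_data))
# [('North', 435000), ('South', 225000), ('West', 183000)]
```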

Variant 10: University Course Analyzer

You are a university administrator analyzing course enrollment data. The data is stored as a list of tuples, where each tuple has the form (course_code, department, list_of_semesterly_enrollments). Your task is to write a Python script to analyze this information.

You must implement the following functions:

calculate_average_enrollment(enrollment_list):
- Input: A list of numbers (a course’s semester enrollments).
- Output: The average enrollment for the course. Return 0.0 for an empty list.
find_most_popular_course(course_data):
- Input: The main list of course data.
- Output: The course_code of the course with the highest average enrollment. If there’s a tie, return the code that comes first alphabetically.
get_courses_in_department(course_data, department_name):
- Input: The course data list and a string department_name.
- Output: A list of course_codes for courses in that department. The list should be sorted alphabetically.
get_department_enrollment_summary(course_data):
- Input: The course data list.
- Logic: Find all unique departments. For each department, calculate the total number of student enrollments across all its courses and semesters (i.e., the sum of all numbers in all enrollment lists for that department).
- Output: A list of tuples ('department_name', total_enrollments). This list must be sorted alphabetically by department name.

Finally, create a main function analyze_courses(course_data) that uses your helper functions to return a summary tuple containing: (most_popular_course_code, math_course_codes, department_summary) where math_course_codes is the result of calling get_courses_in_department() with 'Mathematics'.

Testing Inputs:

course_data = [
    ('CS101', 'Computer Science', [300, 310, 305]), # Avg: 305
    ('MATH201', 'Mathematics', [250, 240, 260]),    # Avg: 250
    ('ENG101', 'English', [400, 410, 390]),         # Avg: 400
    ('CS205', 'Computer Science', [280, 290, 300]), # Avg: 290
    ('MATH150', 'Mathematics', [350, 360, 340])     # Avg: 350
]

Expected Output:

('ENG101', ['MATH150', 'MATH201'], [('Computer Science', 1785), ('English', 1200), ('Mathematics', 1800)])
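
As an alternative to computing sum() / len() by hand, the average can be sketched with statistics.fmean from the standard library (Python 3.8 or later). fmean raises StatisticsError on an empty list, so the empty-list rule still needs an explicit guard. The ranking step below sorts by (negated average, course code) and takes the first entry. It is one possible approach and assumes course_data is non-empty.

```python
from statistics import fmean


def calculate_average_enrollment(enrollment_list):
    # Guard first: fmean() raises StatisticsError on empty input.
    return fmean(enrollment_list) if enrollment_list else 0.0


def find_most_popular_course(course_data):
    ranked = sorted(
        course_data,
        key=lambda c: (-calculate_average_enrollment(c[2]), c[0]),
    )
    return ranked[0][0]


course_data = [
    ('CS101', 'Computer Science', [300, 310, 305]),
    ('MATH201', 'Mathematics', [250, 240, 260]),
    ('ENG101', 'English', [400, 410, 390]),
    ('CS205', 'Computer Science', [280, 290, 300]),
    ('MATH150', 'Mathematics', [350, 360, 340]),
]
print(find_most_popular_course(course_data))  # ENG101
```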

Variant 11: Real Estate Listing Analyzer

You are a real estate analyst comparing property listings. The data is available as a list of tuples, where each tuple has the form (listing_id, neighborhood, price, area_in_sqft). Your objective is to create a Python script to analyze the real estate market.

You must implement the following functions:

calculate_price_per_sqft(listing_tuple):
- Input: A single listing tuple (id, neighborhood, price, area).
- Output: The price per square foot for the listing (price / area). Assume area is never zero.
find_best_value_listing(listings):
- Input: The main list of property listings.
- Output: The listing_id of the property with the lowest price per square foot. If there’s a tie, return the ID that comes first alphabetically.
get_listings_in_neighborhood(listings, neighborhood_name):
- Input: The listings list and a string neighborhood_name.
- Output: A list of listing_ids in the specified neighborhood. The list should be sorted alphabetically.
get_neighborhood_value_summary(listings):
- Input: The listings list.
- Logic: Find all unique neighborhoods. For each neighborhood, calculate the total market value of all listings in it (i.e., the sum of their prices).
- Output: A list of tuples ('neighborhood_name', total_market_value). This list must be sorted alphabetically by neighborhood name.

Finally, create a main function analyze_listings(listings) that uses your helper functions to return a summary tuple containing: (best_value_listing_id, riverside_listings, neighborhood_summary) where riverside_listings is the result of calling get_listings_in_neighborhood() with 'Riverside'.

Testing Inputs:

listings = [
    ('L101', 'Downtown', 500000, 800),    # PPSF: 625.0
    ('L205', 'Riverside', 450000, 1000),   # PPSF: 450.0
    ('L102', 'Downtown', 750000, 1200),   # PPSF: 625.0
    ('L301', 'Northwood', 350000, 900),    # PPSF: 388.89 (rounded)
    ('L206', 'Riverside', 600000, 1250)    # PPSF: 480.0
]

Expected Output:

('L301', ['L205', 'L206'], [('Downtown', 1250000), ('Northwood', 350000), ('Riverside', 1050000)])
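
Unlike the other variants, the "best" item here is the one with the lowest value, so the key does not need to be negated. A minimal sketch, assuming a non-empty list and a non-zero area as stated in the task:

```python
def calculate_price_per_sqft(listing_tuple):
    _listing_id, _neighborhood, price, area = listing_tuple
    return price / area


def find_best_value_listing(listings):
    # Lowest price per square foot first, then listing ID ascending for ties.
    best = min(listings, key=lambda item: (calculate_price_per_sqft(item), item[0]))
    return best[0]


listings = [
    ('L101', 'Downtown', 500000, 800),
    ('L205', 'Riverside', 450000, 1000),
    ('L102', 'Downtown', 750000, 1200),
    ('L301', 'Northwood', 350000, 900),
    ('L206', 'Riverside', 600000, 1250),
]
print(find_best_value_listing(listings))  # L301
```

L101 and L102 both come to 625.0 per square foot, so restricting the list to the Downtown listings is a quick way to exercise the tie rule.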

Variant 12: Event Ticket Sales Analyzer

You are an event manager analyzing ticket sales for various concerts. The data is a list of tuples, where each tuple has the form (event_id, artist_name, tickets_sold, revenue_generated). Your task is to write a Python script to process this sales data.

You must implement the following functions:

calculate_price_per_ticket(event_tuple):
- Input: A single event tuple (id, artist, sold, revenue).
- Output: The average price per ticket for the event (revenue / tickets_sold). Assume tickets_sold is never zero.
find_top_earning_event(events):
- Input: The main list of event tuples.
- Output: The event_id of the event that generated the most total revenue. If there’s a tie, return the ID that comes first alphabetically.
get_events_by_artist(events, artist_name):
- Input: The events list and a string artist_name.
- Output: A list of event_ids for the specified artist. The list should be sorted alphabetically.
get_artist_sales_summary(events):
- Input: The events list.
- Logic: Find all unique artists. For each artist, calculate the total number of tickets they sold across all their events.
- Output: A list of tuples ('artist_name', total_tickets_sold). This list must be sorted alphabetically by artist name.

Finally, create a main function analyze_ticket_sales(events) that uses your helper functions to return a summary tuple containing: (top_earning_event_id, imagine_dragons_events, artist_summary) where imagine_dragons_events is the result of calling get_events_by_artist() with 'Imagine Dragons'.

Testing Inputs:

events = [
    ('EV101', 'The Killers', 5000, 375000),       # Revenue: 375000
    ('EV205', 'Imagine Dragons', 8000, 600000),   # Revenue: 600000
    ('EV102', 'The Killers', 4500, 360000),       # Revenue: 360000
    ('EV301', 'Coldplay', 10000, 950000),         # Revenue: 950000
    ('EV206', 'Imagine Dragons', 8500, 680000)    # Revenue: 680000
]

Expected Output:

('EV301', ['EV205', 'EV206'], [('Coldplay', 10000), ('Imagine Dragons', 16500), ('The Killers', 9500)])
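
The Logic line for get_artist_sales_summary (find the unique artists, then total each one) can be followed literally, as in the sketch below. It collects the artist names in a set, sorts them, and sums the tickets for each name in turn. This rescans the list once per artist, which is fine for small inputs; a single-pass dictionary accumulation would be the usual choice for larger ones.

```python
def get_artist_sales_summary(events):
    """Return [(artist, total_tickets_sold), ...] sorted by artist name."""
    # Step 1: unique artist names, in alphabetical order.
    artists = sorted({artist for _event_id, artist, _sold, _revenue in events})
    # Step 2: total tickets per artist.
    return [
        (name, sum(sold for _event_id, artist, sold, _revenue in events if artist == name))
        for name in artists
    ]


events = [
    ('EV101', 'The Killers', 5000, 375000),
    ('EV205', 'Imagine Dragons', 8000, 600000),
    ('EV102', 'The Killers', 4500, 360000),
    ('EV301', 'Coldplay', 10000, 950000),
    ('EV206', 'Imagine Dragons', 8500, 680000),
]
print(get_artist_sales_summary(events))
# [('Coldplay', 10000), ('Imagine Dragons', 16500), ('The Killers', 9500)]
```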