Week 12 Tutorial: File I/O
Problem 1: The Expense Calculator
Problem Statement:
You are helping a small business owner calculate their daily expenses. They have a text file called expenses.txt where every line contains a category and a dollar amount separated by a comma (e.g., Lunch,12.50).
Write a program that:
- Opens and reads expenses.txt.
- Extracts the cost from each line.
- Calculates the total sum of all expenses.
- Counts how many individual transactions (lines) there were.
- Calculates the average expense per transaction.
- Prints a summary to the console.
Note: You do not need to write to a new file, just print the results.
Input Data Setup (Run this first to create the file):
data = """Lunch,12.50
Coffee,5.00
Office Supplies,23.75
Taxi,10.00
Coffee,8.25
Dinner,50.00"""
with open("expenses.txt", "w") as f:
    f.write(data)
Expected Output:
--- Expense Report ---
Total Transactions: 6
Total Spent: $109.50
Average Expense: $18.25
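One possible approach to Problem 1 is sketched below. The function names summarize_expenses and format_report are illustrative choices, not part of the tutorial, and the sketch assumes every non-blank line holds exactly one amount after the last comma.

```python
from pathlib import Path


def summarize_expenses(path):
    """Return (transaction_count, total) for a file of 'Category,Amount' lines."""
    count = 0
    total = 0.0
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:  # skip blank lines, e.g. a trailing newline
                continue
            # rsplit on the last comma keeps commas inside a category name intact
            _category, amount = line.rsplit(",", 1)
            total += float(amount)
            count += 1
    return count, total


def format_report(count, total):
    """Build the summary text; the average is guarded against an empty file."""
    average = total / count if count else 0.0
    return (
        "--- Expense Report ---\n"
        f"Total Transactions: {count}\n"
        f"Total Spent: ${total:.2f}\n"
        f"Average Expense: ${average:.2f}"
    )


# Only runs once the setup step has created the file
if Path("expenses.txt").exists():
    print(format_report(*summarize_expenses("expenses.txt")))
```

Separating the reading from the formatting keeps the file handling in one small function, which makes it easier to check the numbers before worrying about the layout.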
Problem 2: The Log Filter
Problem Statement:
You have a messy server log file called server_log.txt. It contains general info messages, warnings, and critical errors mixed together. Your job is to extract only the lines containing the word “ERROR” and save them into a separate file called urgent_alerts.txt.
Write a program that:
- Reads server_log.txt.
- Checks each line to see if it contains the substring "ERROR" (case-sensitive).
- If it does, writes that line to a new file called urgent_alerts.txt.
- At the end, prints how many errors were found.
Input Data Setup (Run this first to create the file):
log_data = """[INFO] System started successfully
[WARNING] Memory usage high
[ERROR] Database connection failed
[INFO] User logged in
[ERROR] Payment gateway timeout
[INFO] Scheduled backup complete
[ERROR] Disk space critical"""
with open("server_log.txt", "w") as f:
    f.write(log_data)
Expected Output (Console):
Scan complete. Found 3 errors.
Please check urgent_alerts.txt.
Expected Output (File: urgent_alerts.txt):
[ERROR] Database connection failed
[ERROR] Payment gateway timeout
[ERROR] Disk space critical
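A minimal sketch of one way to approach Problem 2 follows. The function name filter_errors and its keyword parameter are assumptions made for illustration; the tutorial does not prescribe them.

```python
from pathlib import Path


def filter_errors(log_path, out_path, keyword="ERROR"):
    """Copy lines containing keyword from log_path to out_path; return the count."""
    found = 0
    # Opening both files in one with-statement closes each of them on exit
    with open(log_path) as src, open(out_path, "w") as dst:
        for line in src:
            if keyword in line:  # 'in' performs a case-sensitive substring test
                # The last line of the log may lack a newline, so normalise it
                dst.write(line.rstrip("\n") + "\n")
                found += 1
    return found


# Only runs once the setup step has created the log file
if Path("server_log.txt").exists():
    errors = filter_errors("server_log.txt", "urgent_alerts.txt")
    print(f"Scan complete. Found {errors} errors.")
    print("Please check urgent_alerts.txt.")
```

Reading and writing in the same loop means the log is never held in memory as a whole, which matters for large log files.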
Problem 3: The Inventory Restocker
Problem Statement:
You manage a warehouse. You have a file called inventory.csv where each line represents a product in the format: Product Name,Current Stock,Minimum Required.
Write a program that:
- Reads the inventory.csv file.
- Identifies which items are below their minimum required level.
- Creates a new file called reorder_list.txt.
- Writes the names of the items that need to be reordered, and how many need to be bought to reach the minimum level.
Input Data Setup (Run this first):
# Format: Product, Stock, Minimum
data = """Apples,50,100
Bananas,120,100
Cherries,5,20
Dates,50,50
Eggs,10,24"""
with open("inventory.csv", "w") as f:
    f.write(data)
Expected Logic Example:
- Apples: Have 50, need 100. (50 < 100). Order 50.
- Bananas: Have 120, need 100. (120 >= 100). Do nothing.
Expected Output (File: reorder_list.txt):
Item: Apples | Order Amount: 50
Item: Cherries | Order Amount: 15
Item: Eggs | Order Amount: 14
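The restocking logic can be sketched roughly as follows. The name build_reorder_list is hypothetical, and the sketch assumes every non-blank line has exactly three comma-separated fields with whole-number stock values.

```python
from pathlib import Path


def build_reorder_list(inventory_path, reorder_path):
    """Write one line per under-stocked item; return the (name, amount) pairs."""
    shortages = []
    with open(inventory_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            name, stock, minimum = line.split(",")
            shortfall = int(minimum) - int(stock)
            if shortfall > 0:  # stock equal to the minimum needs no order
                shortages.append((name, shortfall))
    with open(reorder_path, "w") as out:
        for name, amount in shortages:
            out.write(f"Item: {name} | Order Amount: {amount}\n")
    return shortages


# Only runs once the setup step has created the inventory file
if Path("inventory.csv").exists():
    build_reorder_list("inventory.csv", "reorder_list.txt")
```

Note that Dates (50 in stock, 50 required) is not reordered, because the condition is strictly "below the minimum".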
Problem 4: The Formatting Fixer
Problem Statement:
You have received a file raw_users.txt containing user signup data. However, the users typed their names in messy ways (weird capitalization, extra spaces) and provided their birth year instead of their age.
Data Format: Full Name - BirthYear
Write a program that:
- Reads the file.
- Formats the name to be “Title Case” (e.g., “jOhN dOE” becomes “John Doe”).
- Calculates their approximate age (assume the current year is 2025).
- Writes a clean file clean_profiles.txt in the format: Name: [Name] (Age: [Age]).
Input Data Setup (Run this first):
# Note the messy spacing and casing
data = """ john smith - 1990
SARAH CONNOR - 1984
kylo REN - 1995
LARA croft - 1992"""
with open("raw_users.txt", "w") as f:
    f.write(data)
Expected Output (File: clean_profiles.txt):
Name: John Smith (Age: 35)
Name: Sarah Connor (Age: 41)
Name: Kylo Ren (Age: 30)
Name: Lara Croft (Age: 33)
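The clean-up steps might look like the sketch below. The helper names clean_profile and clean_file are illustrative only. The sketch relies on str.title(), which capitalises the letter after any non-letter character, so names containing apostrophes may need extra handling that is not shown here.

```python
from pathlib import Path

CURRENT_YEAR = 2025  # fixed by the problem statement, not read from the system clock


def clean_profile(raw_line):
    """Turn one 'Full Name - BirthYear' line into 'Name: ... (Age: ...)'."""
    # rpartition splits on the LAST hyphen, so a hyphenated surname survives
    name_part, _, year_part = raw_line.rpartition("-")
    # split() with no arguments drops leading, trailing and repeated whitespace
    name = " ".join(name_part.split()).title()
    age = CURRENT_YEAR - int(year_part)  # int() tolerates surrounding whitespace
    return f"Name: {name} (Age: {age})"


def clean_file(raw_path, clean_path):
    """Write one cleaned profile per non-blank input line; return the line count."""
    written = 0
    with open(raw_path) as src, open(clean_path, "w") as dst:
        for line in src:
            if line.strip():
                dst.write(clean_profile(line) + "\n")
                written += 1
    return written


# Only runs once the setup step has created the raw file
if Path("raw_users.txt").exists():
    clean_file("raw_users.txt", "clean_profiles.txt")
```

Joining the result of split() with single spaces handles both the padding around a name and any doubled spaces inside it in one step.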
Problem 5: The Election Auditor
Problem Statement:
You are auditing an election. You have a file called votes.txt.
Each line represents a ballot in the format: VoterID:CandidateName.
However, the machine that generated the file was glitchy:
- Some lines are incomplete (missing a name or ID).
- Some lines have extra whitespace.
You need to count only the valid votes for each candidate.
Write a program that:
- Reads votes.txt.
- Skips invalid lines (where the format isn’t ID:Name).
- Counts the votes for each candidate using a dictionary.
- Calculates the percentage of the total vote each candidate received.
- Writes a results.txt file that lists the candidates, their vote counts, their percentages, and declares a winner.
Input Data Setup:
data = """1001:Alice
1002:Bob
1003:Alice
ERROR_READING_LINE
1004: Charlie
1005:Alice
1006:Bob
1007:
1008:David"""
with open("votes.txt", "w") as f:
    f.write(data)
Expected Output (File: results.txt):
OFFICIAL ELECTION RESULTS
-------------------------
Alice: 3 votes (42.9%)
Bob: 2 votes (28.6%)
Charlie: 1 votes (14.3%)
David: 1 votes (14.3%)
-------------------------
Total Valid Votes: 7
WINNER: Alice
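One way to structure the audit is sketched here. The function names count_votes and write_results are assumptions for illustration. The sketch treats a line as valid only if it splits into exactly two non-empty parts around a colon, assumes at least one valid vote exists, and breaks ties by first appearance in the file, which the problem statement does not specify.

```python
from pathlib import Path


def count_votes(path):
    """Return a dict mapping candidate name to number of valid votes."""
    tally = {}
    with open(path) as f:
        for line in f:
            parts = line.strip().split(":")
            if len(parts) != 2:  # no colon, or more than one
                continue
            voter_id, candidate = parts[0].strip(), parts[1].strip()
            if not voter_id or not candidate:  # missing ID or missing name
                continue
            tally[candidate] = tally.get(candidate, 0) + 1
    return tally


def write_results(tally, path):
    """Write the results file; assumes tally holds at least one candidate."""
    total = sum(tally.values())
    # sorted() is stable, so candidates with equal votes keep their file order
    ranked = sorted(tally.items(), key=lambda item: item[1], reverse=True)
    with open(path, "w") as out:
        out.write("OFFICIAL ELECTION RESULTS\n")
        out.write("-" * 25 + "\n")
        for name, votes in ranked:
            # The .1% format multiplies by 100 and appends the percent sign
            out.write(f"{name}: {votes} votes ({votes / total:.1%})\n")
        out.write("-" * 25 + "\n")
        out.write(f"Total Valid Votes: {total}\n")
        out.write(f"WINNER: {ranked[0][0]}\n")


# Only runs once the setup step has created the votes file
if Path("votes.txt").exists():
    write_results(count_votes("votes.txt"), "results.txt")
```

Stripping each part after the split is what lets "1004: Charlie" count as a vote for Charlie rather than for a separate candidate named " Charlie".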
Problem 6: The Cross-Referencing Text Analyzer
Problem Statement: You are building a tool to analyze the keyword density of a text file, but you need to ignore common “stop words” (like “the”, “is”, “at”) so they don’t clutter the results.
You have two files:
- stopwords.txt: A list of words to ignore (one per line).
- story.txt: A paragraph of text.
Write a program that:
- Loads the stopwords into a list.
- Reads story.txt.
- Processes the story word-by-word. You must:
  - Convert to lowercase.
  - Remove punctuation (periods, commas, question marks).
  - Ignore the word if it is in your stopword list.
- Counts the frequency of the remaining “interesting” words.
- Writes the valid words and their counts to analysis.txt.
Input Data Setup:
# File 1: Words to ignore
stops = """the
is
at
on
a
and"""
with open("stopwords.txt", "w") as f:
    f.write(stops)
# File 2: The text to analyze
story = """The cat sat on the mat.
The cat is a good cat.
Is the dog on the mat? No, the dog is at the park."""
with open("story.txt", "w") as f:
    f.write(story)
Expected Output (File: analysis.txt):
WORD FREQUENCY REPORT
---------------------
cat: 3
sat: 1
mat: 2
good: 1
dog: 2
no: 1
park: 1
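A rough sketch of the analysis follows. The function names are hypothetical, and the sketch strips every character in string.punctuation from the ends of each word rather than only periods and commas, which is a broader rule than the problem strictly requires. Punctuation inside a word is left alone.

```python
import string
from pathlib import Path


def load_stopwords(path):
    """Return the stopwords as a list of lowercase strings."""
    with open(path) as f:
        return [line.strip().lower() for line in f if line.strip()]


def count_words(story_path, stopwords):
    """Count non-stopword words; the dict keeps order of first appearance."""
    counts = {}
    with open(story_path) as f:
        for line in f:
            for token in line.lower().split():
                # Remove leading and trailing punctuation such as . , ?
                word = token.strip(string.punctuation)
                if word and word not in stopwords:
                    counts[word] = counts.get(word, 0) + 1
    return counts


def write_analysis(counts, path):
    """Write the report header followed by one 'word: count' line per word."""
    with open(path, "w") as out:
        out.write("WORD FREQUENCY REPORT\n")
        out.write("-" * 21 + "\n")
        for word, n in counts.items():
            out.write(f"{word}: {n}\n")


# Only runs once the setup step has created both input files
if Path("stopwords.txt").exists() and Path("story.txt").exists():
    write_analysis(count_words("story.txt", load_stopwords("stopwords.txt")), "analysis.txt")
```

Because dictionaries preserve insertion order, the words come out in the order they first appear in the story, which matches the expected output without any sorting.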
Problem 7: The Multi-Store Sales Consolidator
Problem Statement: You are the regional manager for a chain of three stores. Each store manager sends you a daily sales report as a separate CSV file. Your job is to consolidate all three files into a single master report.
Each store file has the format: Product,UnitsSold,PricePerUnit
Write a program that:
- Reads all three store files (store_a.csv, store_b.csv, store_c.csv).
- Consolidates the data by calculating:
  - Total Units Sold for each product across all stores.
  - Total Revenue for each product across all stores (Units × Price).
- Identifies which store sold the most units overall.
- Writes a consolidated report to regional_report.txt.
Hint: You’ll need a dictionary where keys are product names and values are another dictionary (or a list) storing totals.
Input Data Setup (Run this first to create the files):
store_a = """Laptop,5,999.99
Mouse,20,25.00
Keyboard,15,75.00
Monitor,8,300.00"""
store_b = """Laptop,3,999.99
Mouse,35,25.00
Headphones,12,150.00
Keyboard,10,75.00"""
store_c = """Mouse,25,25.00
Monitor,5,300.00
Headphones,8,150.00
Laptop,7,999.99"""
with open("store_a.csv", "w") as f:
    f.write(store_a)
with open("store_b.csv", "w") as f:
    f.write(store_b)
with open("store_c.csv", "w") as f:
    f.write(store_c)
Expected Output (File: regional_report.txt):
============================================
REGIONAL SALES CONSOLIDATION
============================================
Product           Units Sold   Total Revenue
--------------------------------------------
Laptop                    15      $14,999.85
Mouse                     80       $2,000.00
Keyboard                  25       $1,875.00
Monitor                   13       $3,900.00
Headphones                20       $3,000.00
--------------------------------------------
GRAND TOTAL REVENUE: $25,774.85
TOP SELLING STORE: Store B (60 units sold)
============================================
Hints:
- Create a function to process a single store file and return its data.
- Use a nested dictionary like: {"Laptop": {"units": 0, "revenue": 0.0}, ...}
- Track each store’s total units separately to find the top seller.
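The hints above can be sketched as follows. The function names read_store, consolidate and write_report are illustrative, the store labels ("Store A" and so on) are supplied by the caller, and the column widths in the report (16, 12 and 16 characters) are an assumption, since the exact spacing is not specified.

```python
from pathlib import Path


def read_store(path):
    """Return a list of (product, units, price) tuples from one store file."""
    rows = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                product, units, price = line.split(",")
                rows.append((product, int(units), float(price)))
    return rows


def consolidate(store_files):
    """store_files maps a store label to its CSV path; return (totals, store_units)."""
    totals = {}       # product -> {"units": int, "revenue": float}
    store_units = {}  # store label -> total units sold by that store
    for label, path in store_files.items():
        rows = read_store(path)
        store_units[label] = sum(units for _, units, _ in rows)
        for product, units, price in rows:
            entry = totals.setdefault(product, {"units": 0, "revenue": 0.0})
            entry["units"] += units
            entry["revenue"] += units * price
    return totals, store_units


def write_report(totals, store_units, path):
    """Write the consolidated report; column widths are an assumed layout."""
    rule, dash = "=" * 44, "-" * 44
    top_store = max(store_units, key=store_units.get)
    grand_total = sum(entry["revenue"] for entry in totals.values())
    with open(path, "w") as out:
        out.write(f"{rule}\nREGIONAL SALES CONSOLIDATION\n{rule}\n")
        out.write(f"{'Product':<16}{'Units Sold':>12}{'Total Revenue':>16}\n{dash}\n")
        for product, entry in totals.items():
            money = f"${entry['revenue']:,.2f}"  # comma as thousands separator
            out.write(f"{product:<16}{entry['units']:>12}{money:>16}\n")
        out.write(f"{dash}\nGRAND TOTAL REVENUE: ${grand_total:,.2f}\n")
        out.write(f"TOP SELLING STORE: {top_store} ({store_units[top_store]} units sold)\n{rule}\n")


files = {"Store A": "store_a.csv", "Store B": "store_b.csv", "Store C": "store_c.csv"}
# Only runs once the setup step has created all three store files
if all(Path(p).exists() for p in files.values()):
    write_report(*consolidate(files), "regional_report.txt")
```

Using setdefault means a product that first appears in a later store (Headphones, in this data) gets a fresh entry without a separate existence check.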
Problem 8: The Student Grade Processor with Validation
Problem Statement:
You are building a grade processing system for a school. The raw input file grades_raw.txt contains student records, but the data is messy and contains various errors that must be handled gracefully.
Data Format: StudentID,Name,Assignment1,Assignment2,Assignment3,Exam
The data has these potential issues:
- Some lines have missing fields (fewer than 6 columns).
- Some scores are not valid numbers (typos like “eighty” or empty).
- Some scores are out of the valid range (scores must be 0-100).
- Some lines are completely empty.
Write a program that:
- Reads grades_raw.txt and validates each line.
- For valid students:
  - Calculates their average score (all 4 assessments weighted equally).
  - Assigns a letter grade (A: 90+, B: 80-89, C: 70-79, D: 60-69, F: below 60).
- For invalid lines:
  - Logs the line number, the original data, and a description of the error.
- Writes two output files:
  - final_grades.txt: Clean list of students with their averages and letter grades.
  - processing_errors.txt: Log of all errors encountered.
- Prints a summary to the console showing how many records were processed successfully vs. how many had errors.
Input Data Setup (Run this first to create the file):
data = """S001,Alice Smith,85,90,88,92
S002,Bob Jones,78,82,eighty,75
S003,Charlie Brown,95,91,89,94
S004,Diana Prince,70,65
S005,Eve Wilson,88,105,90,85
S006,Frank Miller,60,58,62,55

S007,Grace Lee,72,78,75,80
S008,Henry Ford,,85,80,78
S009,Ivy Chen,90,88,92,95
S010,Jack Black,45,50,48,52"""
with open("grades_raw.txt", "w") as f:
    f.write(data)
Expected Output (File: final_grades.txt):
FINAL GRADE REPORT
==========================================
ID    Name                 Average   Grade
------------------------------------------
S001  Alice Smith             88.8       B
S003  Charlie Brown           92.2       A
S006  Frank Miller            58.8       F
S007  Grace Lee               76.2       C
S009  Ivy Chen                91.2       A
S010  Jack Black              48.8       F
==========================================
Total Students Processed: 6
Class Average: 76.0
Expected Output (File: processing_errors.txt):
PROCESSING ERROR LOG
==========================================
Line 2: S002,Bob Jones,78,82,eighty,75
-> Error: Invalid score format (non-numeric value)
Line 4: S004,Diana Prince,70,65
-> Error: Missing fields (expected 6, found 4)
Line 5: S005,Eve Wilson,88,105,90,85
-> Error: Score out of range (must be 0-100)
Line 7:
-> Error: Empty line detected
Line 9: S008,Henry Ford,,85,80,78
-> Error: Invalid score format (empty value)
==========================================
Total Errors: 5
Expected Console Output:
Processing complete!
- Successfully processed: 6 students
- Errors encountered: 5 records
Check 'final_grades.txt' and 'processing_errors.txt' for details.
Hints:
- Use try/except to catch conversion errors when parsing scores.
- Check len(parts) after splitting to detect missing fields.
- Validate each score is between 0 and 100 after successful conversion.
- Keep track of line numbers using enumerate() or a counter variable.
- Process the file once, writing to both output files as you go.
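The validation flow described in these hints might be sketched as below. The names parse_record, letter_grade and process_grades are hypothetical, the column widths are assumed, and the report headers and footers are left out to keep the sketch short. Lines with more than six fields are not covered by the problem statement, so this sketch simply ignores the extra fields.

```python
from pathlib import Path


def parse_record(line):
    """Return (student_id, name, average) or raise ValueError naming the problem."""
    if not line.strip():
        raise ValueError("Empty line detected")
    parts = [field.strip() for field in line.split(",")]
    if len(parts) < 6:
        raise ValueError(f"Missing fields (expected 6, found {len(parts)})")
    scores = []
    for raw in parts[2:6]:
        if raw == "":
            raise ValueError("Invalid score format (empty value)")
        try:
            score = float(raw)
        except ValueError:
            # Re-raise with a message suited to the error log
            raise ValueError("Invalid score format (non-numeric value)") from None
        if not 0 <= score <= 100:
            raise ValueError("Score out of range (must be 0-100)")
        scores.append(score)
    return parts[0], parts[1], sum(scores) / len(scores)


def letter_grade(average):
    """Map an average to a letter using the cutoffs from the problem statement."""
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if average >= cutoff:
            return letter
    return "F"


def process_grades(raw_path, grades_path, errors_path):
    """Single pass over the input; return (valid_count, error_count)."""
    valid = errors = 0
    with open(raw_path) as src, open(grades_path, "w") as good, open(errors_path, "w") as bad:
        for line_no, line in enumerate(src, start=1):
            try:
                student_id, name, average = parse_record(line)
            except ValueError as err:
                bad.write(f"Line {line_no}: {line.strip()}\n  -> Error: {err}\n")
                errors += 1
                continue
            good.write(f"{student_id:<6}{name:<20}{average:>8.1f}{letter_grade(average):>8}\n")
            valid += 1
    return valid, errors


# Only runs once the setup step has created the raw grades file
if Path("grades_raw.txt").exists():
    ok, failed = process_grades("grades_raw.txt", "final_grades.txt", "processing_errors.txt")
    print(f"Processing complete!\n- Successfully processed: {ok} students\n- Errors encountered: {failed} records")
```

Raising ValueError from the parser and catching it in one place keeps the main loop short: every kind of bad line follows the same path into the error log.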