Databases | Practical class 10
Creating complex queries. Combining multiple tables and working with “Joins”.
1. Setting Up The Environment (pgAdmin 4)
- Launch pgAdmin 4: Open the pgAdmin 4 application.
- Connect to Server: Connect to your PostgreSQL server instance. You should have this configured from P1.
- Select Database: Navigate to your designated database for this course under
Servers -> [Your Server Name] -> Databases
. Right-click the database name and select “Query Tool”. - Sample Schema: For today’s examples, we will use a hypothetical university database schema. Execute the following SQL code in the Query Tool (press F5 or click the “Execute/Refresh” button (the triangle/play icon)) to create and populate the necessary tables:
-- Drop tables if they exist (to start fresh)
DROP TABLE IF EXISTS enrollments;
DROP TABLE IF EXISTS courses;
DROP TABLE IF EXISTS instructors;
DROP TABLE IF EXISTS students;
DROP TABLE IF EXISTS departments;
-- Create Departments table
CREATE TABLE departments (
dept_id SERIAL PRIMARY KEY,
dept_name VARCHAR(100) NOT NULL UNIQUE,
building VARCHAR(50)
);
-- Create Instructors table
CREATE TABLE instructors (
instructor_id SERIAL PRIMARY KEY,
instructor_name VARCHAR(100) NOT NULL,
dept_id INT REFERENCES departments(dept_id)
);
-- Create Courses table
CREATE TABLE courses (
course_id SERIAL PRIMARY KEY,
course_name VARCHAR(100) NOT NULL,
credits INT CHECK (credits > 0),
dept_id INT REFERENCES departments(dept_id),
instructor_id INT REFERENCES instructors(instructor_id)
);
-- Create Students table
CREATE TABLE students (
student_id SERIAL PRIMARY KEY,
student_name VARCHAR(100) NOT NULL,
major_dept_id INT REFERENCES departments(dept_id),
enrollment_year INT
);
-- Create Enrollments table (Junction table)
CREATE TABLE enrollments (
enrollment_id SERIAL PRIMARY KEY,
student_id INT REFERENCES students(student_id),
course_id INT REFERENCES courses(course_id),
grade CHAR(1) CHECK (grade IN ('A', 'B', 'C', 'D', 'F')),
UNIQUE(student_id, course_id) -- A student can enroll in a course only once
);
-- Insert sample data
INSERT INTO departments (dept_name, building) VALUES
('Computer Science', 'Building A'),
('Mathematics', 'Building B'),
('Physics', 'Building A'),
('History', 'Building C');
INSERT INTO instructors (instructor_name, dept_id) VALUES
('Dr. Alan Turing', 1),
('Dr. Ada Lovelace', 1),
('Dr. Carl Gauss', 2),
('Dr. Marie Curie', 3),
('Dr. Herodotus', 4),
('Dr. Isaac Newton', NULL); -- Instructor not assigned to a department yet
INSERT INTO courses (course_name, credits, dept_id, instructor_id) VALUES
('Introduction to Programming', 4, 1, 1),
('Data Structures', 4, 1, 2),
('Calculus I', 4, 2, 3),
('Linear Algebra', 3, 2, 3),
('Quantum Mechanics', 4, 3, 4),
('Classical Mechanics', 4, 3, 6), -- Taught by Newton
('World History I', 3, 4, 5),
('Database Systems', 3, 1, 2); -- Taught by Lovelace
INSERT INTO students (student_name, major_dept_id, enrollment_year) VALUES
('Alice Smith', 1, 2022),
('Bob Johnson', 2, 2021),
('Charlie Brown', 1, 2022),
('Diana Prince', 3, 2023),
('Eve Adams', NULL, 2021); -- Undecided major
INSERT INTO enrollments (student_id, course_id, grade) VALUES
(1, 1, 'A'), -- Alice, Intro to Prog
(1, 2, 'B'), -- Alice, Data Structures
(1, 8, 'A'), -- Alice, Database Systems
(2, 3, 'A'), -- Bob, Calculus I
(2, 4, 'B'), -- Bob, Linear Algebra
(3, 1, 'B'), -- Charlie, Intro to Prog
(3, 8, 'C'), -- Charlie, Database Systems
(4, 5, 'A'); -- Diana, Quantum Mechanics
-- Note: Student Eve Adams is not enrolled in any course yet.
-- Note: Course World History I has no enrollments yet.
2. Joining Multiple Tables (Three or More)
The fundamental principle remains the same as joining two tables: you chain JOIN
clauses. The output of the first join (conceptually, an intermediate table) becomes the left table for the second join, and so on.
Syntax Pattern:
SELECT
t1.column1,
t2.column2,
t3.column3,
-- ... other columns
FROM
table1 AS t1
JOIN
table2 AS t2 ON t1.joining_column = t2.joining_column
JOIN
table3 AS t3 ON t2.another_joining_column = t3.another_joining_column
-- Additional JOIN clauses can be added here
WHERE
-- Optional conditions
ORDER BY
-- Optional sorting;
Key Points:
- Aliases are Crucial: Use table aliases (
t1
,t2
,t3
, or more descriptive names likes
,c
,d
) to keep your query readable and unambiguous, especially when tables have columns with the same name (e.g.,dept_id
). - Join Order Matters (Logically): Ensure the
ON
condition correctly links the tables being joined at that step. You typically join tables based on foreign key relationships. - Default is INNER JOIN: If you just write
JOIN
, PostgreSQL assumesINNER JOIN
.
Example 1: List students, the courses they are enrolled in, and the instructors teaching those courses.
This requires joining students
, enrollments
, courses
, and instructors
.
students
links toenrollments
viastudent_id
.enrollments
links tocourses
viacourse_id
.courses
links toinstructors
viainstructor_id
.
SELECT
s.student_name,
c.course_name,
i.instructor_name
FROM
students AS s
INNER JOIN
enrollments AS e ON s.student_id = e.student_id
INNER JOIN
courses AS c ON e.course_id = c.course_id
INNER JOIN
instructors AS i ON c.instructor_id = i.instructor_id
ORDER BY
s.student_name,
c.course_name;
Execution in pgAdmin 4:
- Enter the query into the Query Tool.
- Press F5 or click the Execute button.
- Observe the results in the Data Output pane. You should see student names paired with the courses they’re taking and the respective instructor names.
3. Combining Different Join Types
Sometimes, you need to combine INNER JOIN
with LEFT JOIN
(or RIGHT JOIN
, FULL OUTER JOIN
) to achieve the desired result. For instance, you might want to list all items from one table but only matching items from subsequent joins.
Example 2: List all students and, if they are enrolled in any courses, list those courses and their departments.
We need all students, so we start with students
and use LEFT JOIN
for enrollments and courses. We also want the course department.
- Start with
students
. LEFT JOIN
toenrollments
(to keep all students, even those not enrolled).LEFT JOIN
tocourses
(to keep students enrolled in courses that might somehow be missing details, though less likely with foreign keys enforced. Also, conceptually keeps the “all students” focus).LEFT JOIN
todepartments
(to get the department name for the course).
SELECT
s.student_name,
c.course_name,
d.dept_name AS course_department
FROM
students AS s
LEFT JOIN
enrollments AS e ON s.student_id = e.student_id
LEFT JOIN
courses AS c ON e.course_id = c.course_id
LEFT JOIN
departments AS d ON c.dept_id = d.dept_id -- Join based on the course's department
ORDER BY
s.student_name,
c.course_name;
Explanation:
students LEFT JOIN enrollments
: Ensures all students (like ‘Eve Adams’) appear, even if they have no matching rows inenrollments
. For these students, columns fromenrollments
(and subsequent joins) will beNULL
.... LEFT JOIN courses
: If a student has enrollments, we try to find the matching course.... LEFT JOIN departments
: If we found a course, we try to find its department.
Observation: You will see ‘Eve Adams’ listed with NULL
values for course_name
and course_department
. Students enrolled in courses will show the course name and the department that offers the course.
Example 3: List all departments and the names of instructors in those departments. Include departments even if they currently have no instructors assigned.
- Start with
departments
. LEFT JOIN
toinstructors
.
SELECT
d.dept_name,
d.building,
i.instructor_name
FROM
departments AS d
LEFT JOIN
instructors AS i ON d.dept_id = i.dept_id
ORDER BY
d.dept_name,
i.instructor_name;
Observation: All departments are listed. If a department has instructors, their names are listed. If a department has no instructors assigned in the instructors
table (or if an instructor has dept_id = NULL
, like Dr. Isaac Newton), that department will still appear, potentially with NULL
for the instructor name, or the instructor might not be linked to any displayed department.
4. Self-Joins
A self-join is used to join a table to itself. This is useful when rows within a single table have a relationship with other rows in the same table. A classic example is an employee table where one column stores the ID of the employee’s manager, who is also an employee in the same table.
Key Requirement: You must use table aliases to distinguish between the two conceptual copies of the table being joined.
Example 4: List instructors and the name of the department they belong to. (Using a self-join on the department table is not typical here, but let’s imagine a scenario where we model prerequisites within the courses
table).
Modification for a better self-join example: Let’s add a prerequisite_course_id
to the courses
table.
First, update the table and data (Run this in pgAdmin Query Tool):
-- Add a prerequisite column to courses
ALTER TABLE courses
ADD COLUMN prerequisite_course_id INT REFERENCES courses(course_id) NULL; -- Allow NULL for courses with no prerequisite
-- Update some courses to have prerequisites
UPDATE courses SET prerequisite_course_id = 1 WHERE course_name = 'Data Structures'; -- Data Structures requires Intro to Prog
UPDATE courses SET prerequisite_course_id = 3 WHERE course_name = 'Linear Algebra'; -- Linear Algebra requires Calculus I
UPDATE courses SET prerequisite_course_id = 1 WHERE course_name = 'Database Systems'; -- Database Systems requires Intro to Prog
Now, the self-join query: List courses and their prerequisites.
SELECT
c1.course_name AS course,
c2.course_name AS prerequisite
FROM
courses AS c1 -- The main course
LEFT JOIN
courses AS c2 ON c1.prerequisite_course_id = c2.course_id -- Join back to courses table for the prerequisite
ORDER BY
c1.course_name;
Explanation:
- We reference the
courses
table twice, aliasing it asc1
(representing the course) andc2
(representing the prerequisite). - The
LEFT JOIN
ensures that courses without prerequisites (whereprerequisite_course_id
isNULL
) are still listed, showingNULL
in theprerequisite
column. - The
ON
condition links theprerequisite_course_id
of the main course (c1
) to thecourse_id
of the prerequisite course (c2
).
Observation: The output will show each course and, if it has one defined, the name of its prerequisite course.
5. Combining Joins with Aggregation and Grouping
Complex queries often involve combining joins with aggregate functions (COUNT
, AVG
, etc. - P10) and GROUP BY
/ HAVING
clauses (P6).
Example 5: List departments and the number of courses offered by each. Only include departments that offer more than one course.
This requires joining departments
and courses
, then grouping by department and filtering the groups.
SELECT
d.dept_name,
COUNT(c.course_id) AS number_of_courses
FROM
departments AS d
INNER JOIN -- Or LEFT JOIN if you want to see departments with 0 courses (before filtering)
courses AS c ON d.dept_id = c.dept_id
GROUP BY
d.dept_id, d.dept_name -- Group by primary key and any other selected non-aggregated columns
HAVING
COUNT(c.course_id) > 1 -- Filter groups based on the aggregate count
ORDER BY
number_of_courses DESC,
d.dept_name;
Explanation:
departments
is joined withcourses
ondept_id
. AnINNER JOIN
only includes departments that have at least one course. ALEFT JOIN
would include all departments initially.GROUP BY d.dept_id, d.dept_name
: Groups the rows so that all courses belonging to the same department are together. We group bydept_id
(the unique identifier) and includedept_name
because it’s in theSELECT
list and isn’t an aggregate.COUNT(c.course_id)
: Counts the number of courses within each department group.HAVING COUNT(c.course_id) > 1
: Filters out the groups (departments) where the course count is not greater than 1.ORDER BY
: Sorts the final result.
Example 6: Find the average grade for each course.
This requires joining courses
and enrollments
, then grouping. Note: We need to handle letter grades numerically for averaging.
-- We need a way to convert grades to numerical values for AVG.
-- For simplicity here, we'll just count enrollments per course instead of averaging grades.
-- A true average grade would require a CASE statement or a separate mapping table (potentially beyond P11 scope simplicity).
-- Let's find the number of students enrolled in each course.
SELECT
c.course_name,
COUNT(e.student_id) AS number_of_students_enrolled
FROM
courses AS c
LEFT JOIN -- Use LEFT JOIN to include courses with zero enrollments
enrollments AS e ON c.course_id = e.course_id
GROUP BY
c.course_id, c.course_name -- Group by course
ORDER BY
number_of_students_enrolled DESC,
c.course_name;
Observation: This query lists all courses and shows how many students are enrolled in each, including courses with 0 enrollments like ‘World History I’.
6. Best Practices for Complex Joins
- Use Meaningful Aliases:
e
foremployees
,d
fordepartments
is better thant1
,t2
. - Format Your SQL: Use indentation and line breaks to make complex queries readable. Align
JOIN
,ON
,WHERE
clauses. - Understand Join Types: Choose
INNER
,LEFT
,RIGHT
, orFULL OUTER
deliberately based on whether you need to preserve all rows from the left, right, or both tables, even when there’s no match. - Specify Join Columns: Always use the
ON
clause with the correct joining columns. Avoid the older comma-separated join syntax in theFROM
clause. - Test Incrementally: Build complex queries step-by-step. Start with two tables, verify the result, then add the next join, and so on.