Databases | Practical class 5
Creating queries using the GROUP BY
and ORDER BY
clauses. Creating queries with the HAVING
clause.
We will be using PostgreSQL and pgAdmin 4 throughout this tutorial. Please ensure you have pgAdmin 4 open and are connected to your PostgreSQL server.
1. The ORDER BY
Clause: Structuring Your Output
The ORDER BY
clause is used to sort the results of your SQL queries. By default, the order of rows returned by a SELECT
statement is not guaranteed. ORDER BY
allows you to specify one or more columns to sort the result set, either in ascending or descending order.
Syntax:
SELECT column1, column2, ...
FROM table_name
ORDER BY column_name [ASC | DESC];
ORDER BY column_name
: Specifies the column to sort by. You can order by multiple columns by separating them with commas.ASC
(Ascending): Sorts the results in ascending order (A-Z, 1-9). This is the default if you don’t specifyASC
orDESC
.DESC
(Descending): Sorts the results in descending order (Z-A, 9-1).
Example:
Let’s assume we have a table named students
with the following structure and data:
CREATE TABLE students (
student_id INTEGER PRIMARY KEY,
student_name VARCHAR(255) NOT NULL,
course VARCHAR(255) NOT NULL,
grade INTEGER NOT NULL
);
INSERT INTO students (student_id, student_name, course, grade) VALUES
(1, 'DAVLATNAZAROV ISLOMBEK BAXROMBEK O‘G‘LI', 'Mathematics', 85),
(2, 'BADALYAN SAMVEL SURENOVICH', 'Physics', 92),
(3, 'ISMAILOV AZIZBEK O‘KTAM O‘G‘LI', 'Computer Science', 78),
(4, 'DJUMANIYAZOV AZAMAT BAXROM O‘G‘LI', 'Mathematics', 90),
(5, 'KOMILOV UMRBEK AZAMAT O‘G‘LI', 'Physics', 88),
(6, 'KOMILOV KAMRONBEK OG‘ABOY O‘G‘LI', 'Computer Science', 95),
(7, 'KURBANOV AMIRBEK BAXTIYAROVICH', 'Linear Algebra', 70),
(8, 'OLLABERGANOV KAMOLADDIN RASULBEK O‘G‘LI', 'Calculus I', 82),
(9, 'NAZARGELDIYEV NURBOL TULEGEN O‘G‘LI', 'Probability', 98),
(10, 'MATCHANOVA SEVARA ADILBEKOVNA', 'Statistics', 85),
(11, 'SAMANDAROV IBRATBEK RASHID O‘G‘LI', 'Database Systems', 79),
(12, 'SABIROV ZOIRJON SHAKIR O‘G‘LI', 'Operating Systems', 91),
(13, 'OZADOV KAMOLADDIN RAVSHONBEK O‘G‘LI', 'Data Structures', 86),
(14, 'XUDARGANOV ASADBEK G‘AYRATJON O‘G‘LI', 'Algorithms', 93),
(15, 'SHONAZAROV AMIRBEK ALIBEK O‘G‘LI', 'Discrete Mathematics', 75);
Scenario 1: Ordering students by name in ascending order.
SQL Query:
SELECT student_name, course, grade
FROM students
ORDER BY student_name ASC;
Result:
Scenario 2: Ordering students by grade in descending order.
SQL Query:
SELECT student_name, course, grade
FROM students
ORDER BY grade DESC;
Result:
Scenario 3: Ordering by multiple columns (first by course, then by grade within each course in descending order).
SQL Query:
SELECT student_name, course, grade
FROM students
ORDER BY course ASC, grade DESC;
Result:
In this case, the results are first ordered alphabetically by course
. For students in the same course, they are then ordered by grade
in descending order.
2. The GROUP BY
Clause: Aggregating Data into Groups
The GROUP BY
clause is used to group rows that have the same values in one or more columns. It is often used in conjunction with aggregate functions (which we will explore in detail in a later topic - P10) to calculate summary information for each group. For now, we will focus on understanding how GROUP BY
organizes your data into groups.
Syntax:
SELECT column1, column2, ...
FROM table_name
WHERE condition -- (Optional WHERE clause from P5)
GROUP BY column_name1, column_name2, ...;
GROUP BY column_name1, column_name2, ...
: Specifies the columns by which you want to group the rows. Rows with the same combination of values in these columns will be grouped together.WHERE condition
: You can still use theWHERE
clause (from P5) to filter rows before they are grouped.
Important Note: When you use GROUP BY
, any column in your SELECT
list that is not part of the GROUP BY
clause must be used within an aggregate function. For this tutorial, we will focus on simply understanding the grouping mechanism without explicitly using aggregate functions in the SELECT
list for now. We will primarily select the grouping columns themselves.
Example using the students
table:
Scenario 1: Grouping students by course.
SQL Query:
SELECT course
FROM students
GROUP BY course;
pgAdmin 4 Instructions:
Follow the same steps as in the ORDER BY
example in pgAdmin 4 to execute this query.
Result:
This query returns a list of unique courses present in the students
table. The GROUP BY course
clause has grouped all rows with the same course
value into a single group, and then selected the course
value from each group.
Scenario 2: Let’s add student_name
to the SELECT
clause in the previous query.
SQL Query (Incorrect):
SELECT course, student_name -- student_name is not in GROUP BY and not aggregated
FROM students
GROUP BY course;
Result (PostgreSQL Error):
This query results in an error because student_name
is in the SELECT
list but is not part of the GROUP BY
clause and is not used with an aggregate function. PostgreSQL requires that non-aggregated columns in the SELECT
list must also be in the GROUP BY
clause.
Scenario 3: Grouping by course and then also selecting student names (demonstrating the grouping, not intended to be a practical query without aggregation yet).
To demonstrate the grouping concept, we can modify the query to include student_name
in the GROUP BY
clause. This would create groups based on unique combinations of course
and student_name
.
SQL Query (Illustrative for grouping concept):
SELECT course, student_name
FROM students
GROUP BY course, student_name;
Result:
This result shows that each unique combination of course
and student_name
is treated as a separate group. While this specific query might not be immediately useful without aggregation, it illustrates how GROUP BY
works with multiple columns.
In summary, GROUP BY
is crucial for categorizing your data and preparing it for aggregate calculations (which are coming in P10). It allows you to move from individual rows to sets of rows sharing common characteristics.
3. The HAVING
Clause: Filtering Groups
The HAVING
clause is used to filter groups created by the GROUP BY
clause. It is similar to the WHERE
clause, but WHERE
filters individual rows before grouping, while HAVING
filters groups after they are formed. HAVING
is always used in conjunction with GROUP BY
.
Syntax:
SELECT column1, column2, ...
FROM table_name
WHERE condition -- (Optional WHERE clause)
GROUP BY column_name1, column_name2, ...
HAVING group_condition;
HAVING group_condition
: Specifies the condition that groups must meet to be included in the final result. This condition is evaluated for each group.
Important Note: Similar to the SELECT
clause with GROUP BY
, conditions in the HAVING
clause often involve aggregate functions (which we’ll cover in P10). However, for this introductory tutorial, we will illustrate the concept of filtering groups based on a simple condition related to the grouping column itself.
Example using the students
table:
Let’s assume we want to find courses that have names starting with ‘D’. We can use HAVING
to filter the groups of courses.
SQL Query:
SELECT course
FROM students
GROUP BY course
HAVING course LIKE 'D%';
pgAdmin 4 Instructions:
Execute this query in pgAdmin 4 as before.
Result:
Explanation:
GROUP BY course
: Groups the rows by course, resulting in groups for ‘Computer Sci’, ‘Mathematics’, and ‘Physics’.HAVING course LIKE 'P%'
: Filters these groups. It checks if thecourse
name for each group starts with ‘P’. Only the group where thecourse
is ‘Physics’ satisfies this condition.SELECT course
: Selects thecourse
name from the groups that passed theHAVING
condition.
Distinction between WHERE
and HAVING
:
WHERE
: Filters rows before grouping. It operates on individual rows and is used to select which rows will be part of the grouping process.HAVING
: Filters groups after grouping. It operates on the groups formed byGROUP BY
and is used to select which groups will be included in the final result.
Example demonstrating both WHERE
and HAVING
(without complex aggregate functions yet):
Let’s say we want to find courses that contains character ‘m’ and only consider students with grades above 90.
SQL Query:
SELECT course
FROM students
WHERE grade > 90 -- WHERE clause to filter rows before grouping
GROUP BY course
HAVING course LIKE '%m%'; -- HAVING clause to filter groups after grouping
Result:
Explanation:
WHERE grade > 90
: Filters thestudents
table, selecting only rows wheregrade
is greater than 80. This results in the following temporary data set:
GROUP BY course
: Groups the rows in this temporary data set bycourse
.HAVING course LIKE '%m%'
: Filters the groups formed in the previous step, keeping only the group where thecourse
name contains character ‘m’ (which is ‘Computer Sci’).SELECT course
: Selects thecourse
name from the remaining group.
4. Combining ORDER BY
, GROUP BY
, and HAVING
You can use all three clauses (ORDER BY
, GROUP BY
, and HAVING
) in a single query to achieve complex data manipulation. The order of execution in a SQL query with these clauses is typically:
FROM
: Specifies the table(s) to retrieve data from.WHERE
: Filters rows based on specified conditions.GROUP BY
: Groups rows based on specified columns.HAVING
: Filters groups based on specified conditions.SELECT
: Selects the columns to be displayed.ORDER BY
: Sorts the final result set.
Example:
Let’s say we want to find courses ending with ‘s’, considering only students with grades above 80, and then order the resulting courses alphabetically.
SQL Query:
SELECT course
FROM students
WHERE grade > 80
GROUP BY course
HAVING course LIKE '%s'
ORDER BY course ASC;
Result:
Practical Exercise:
Using the students
table, write SQL queries to achieve the following:
- List all courses in descending alphabetical order.
- Find all courses that contain the letter ‘s’ in their name.
- List the courses that contain the letter ‘s’ in their name, ordered alphabetically.
Solutions (Try to solve them yourself first!)
SELECT course
FROM students
GROUP BY course
ORDER BY course DESC;
SELECT course
FROM students
GROUP BY course
HAVING course LIKE '%s%';
SELECT course
FROM students
GROUP BY course
HAVING course LIKE '%s%'
ORDER BY course ASC;