Creating queries using the GROUP BY and ORDER BY clauses. Creating queries with the HAVING clause.

We will be using PostgreSQL and pgAdmin 4 throughout this tutorial. Please ensure you have pgAdmin 4 open and are connected to your PostgreSQL server.

1. The ORDER BY Clause: Structuring Your Output

The ORDER BY clause is used to sort the results of your SQL queries. By default, the order of rows returned by a SELECT statement is not guaranteed. ORDER BY allows you to specify one or more columns to sort the result set, either in ascending or descending order.

Syntax:

SELECT column1, column2, ...
FROM table_name
ORDER BY column_name [ASC | DESC];
  • ORDER BY column_name: Specifies the column to sort by. You can order by multiple columns by separating them with commas.
  • ASC (Ascending): Sorts the results in ascending order (A-Z, 1-9). This is the default if you don’t specify ASC or DESC.
  • DESC (Descending): Sorts the results in descending order (Z-A, 9-1).

Example:

Let’s assume we have a table named students with the following structure and data:

CREATE TABLE students (
    student_id INTEGER PRIMARY KEY,
    student_name VARCHAR(255) NOT NULL,
    course VARCHAR(255) NOT NULL,
    grade INTEGER NOT NULL
);

INSERT INTO students (student_id, student_name, course, grade) VALUES
(1, 'DAVLATNAZAROV ISLOMBEK BAXROMBEK O‘G‘LI', 'Mathematics', 85),
(2, 'BADALYAN SAMVEL SURENOVICH', 'Physics', 92),
(3, 'ISMAILOV AZIZBEK O‘KTAM O‘G‘LI', 'Computer Science', 78),
(4, 'DJUMANIYAZOV AZAMAT BAXROM O‘G‘LI', 'Mathematics', 90),
(5, 'KOMILOV UMRBEK AZAMAT O‘G‘LI', 'Physics', 88),
(6, 'KOMILOV KAMRONBEK OG‘ABOY O‘G‘LI', 'Computer Science', 95),
(7, 'KURBANOV AMIRBEK BAXTIYAROVICH', 'Linear Algebra', 70),
(8, 'OLLABERGANOV KAMOLADDIN RASULBEK O‘G‘LI', 'Calculus I', 82),
(9, 'NAZARGELDIYEV NURBOL TULEGEN O‘G‘LI', 'Probability', 98),
(10, 'MATCHANOVA SEVARA ADILBEKOVNA', 'Statistics', 85),
(11, 'SAMANDAROV IBRATBEK RASHID O‘G‘LI', 'Database Systems', 79),
(12, 'SABIROV ZOIRJON SHAKIR O‘G‘LI', 'Operating Systems', 91),
(13, 'OZADOV KAMOLADDIN RAVSHONBEK O‘G‘LI', 'Data Structures', 86),
(14, 'XUDARGANOV ASADBEK G‘AYRATJON O‘G‘LI', 'Algorithms', 93),
(15, 'SHONAZAROV AMIRBEK ALIBEK O‘G‘LI', 'Discrete Mathematics', 75);

Scenario 1: Ordering students by name in ascending order.

SQL Query:

SELECT student_name, course, grade
FROM students
ORDER BY student_name ASC;

Result:

Scenario 2: Ordering students by grade in descending order.

SQL Query:

SELECT student_name, course, grade
FROM students
ORDER BY grade DESC;

Result:

Scenario 3: Ordering by multiple columns (first by course, then by grade within each course in descending order).

SQL Query:

SELECT student_name, course, grade
FROM students
ORDER BY course ASC, grade DESC;

Result:

In this case, the results are first ordered alphabetically by course. For students in the same course, they are then ordered by grade in descending order.

2. The GROUP BY Clause: Aggregating Data into Groups

The GROUP BY clause is used to group rows that have the same values in one or more columns. It is often used in conjunction with aggregate functions (which we will explore in detail in a later topic - P10) to calculate summary information for each group. For now, we will focus on understanding how GROUP BY organizes your data into groups.

Syntax:

SELECT column1, column2, ...
FROM table_name
WHERE condition  -- (Optional WHERE clause from P5)
GROUP BY column_name1, column_name2, ...;
  • GROUP BY column_name1, column_name2, ...: Specifies the columns by which you want to group the rows. Rows with the same combination of values in these columns will be grouped together.
  • WHERE condition: You can still use the WHERE clause (from P5) to filter rows before they are grouped.

Important Note: When you use GROUP BY, any column in your SELECT list that is not part of the GROUP BY clause must be used within an aggregate function. For this tutorial, we will focus on simply understanding the grouping mechanism without explicitly using aggregate functions in the SELECT list for now. We will primarily select the grouping columns themselves.

Example using the students table:

Scenario 1: Grouping students by course.

SQL Query:

SELECT course
FROM students
GROUP BY course;

pgAdmin 4 Instructions:

Follow the same steps as in the ORDER BY example in pgAdmin 4 to execute this query.

Result:

This query returns a list of unique courses present in the students table. The GROUP BY course clause has grouped all rows with the same course value into a single group, and then selected the course value from each group.

Scenario 2: Let’s add student_name to the SELECT clause in the previous query.

SQL Query (Incorrect):

SELECT course, student_name  -- student_name is not in GROUP BY and not aggregated
FROM students
GROUP BY course;

Result (PostgreSQL Error):

This query results in an error because student_name is in the SELECT list but is not part of the GROUP BY clause and is not used with an aggregate function. PostgreSQL requires that non-aggregated columns in the SELECT list must also be in the GROUP BY clause.

Scenario 3: Grouping by course and then also selecting student names (demonstrating the grouping, not intended to be a practical query without aggregation yet).

To demonstrate the grouping concept, we can modify the query to include student_name in the GROUP BY clause. This would create groups based on unique combinations of course and student_name.

SQL Query (Illustrative for grouping concept):

SELECT course, student_name
FROM students
GROUP BY course, student_name;

Result:

This result shows that each unique combination of course and student_name is treated as a separate group. While this specific query might not be immediately useful without aggregation, it illustrates how GROUP BY works with multiple columns.

In summary, GROUP BY is crucial for categorizing your data and preparing it for aggregate calculations (which are coming in P10). It allows you to move from individual rows to sets of rows sharing common characteristics.

3. The HAVING Clause: Filtering Groups

The HAVING clause is used to filter groups created by the GROUP BY clause. It is similar to the WHERE clause, but WHERE filters individual rows before grouping, while HAVING filters groups after they are formed. HAVING is always used in conjunction with GROUP BY.

Syntax:

SELECT column1, column2, ...
FROM table_name
WHERE condition  -- (Optional WHERE clause)
GROUP BY column_name1, column_name2, ...
HAVING group_condition;
  • HAVING group_condition: Specifies the condition that groups must meet to be included in the final result. This condition is evaluated for each group.

Important Note: Similar to the SELECT clause with GROUP BY, conditions in the HAVING clause often involve aggregate functions (which we’ll cover in P10). However, for this introductory tutorial, we will illustrate the concept of filtering groups based on a simple condition related to the grouping column itself.

Example using the students table:

Let’s assume we want to find courses that have names starting with ‘D’. We can use HAVING to filter the groups of courses.

SQL Query:

SELECT course
FROM students
GROUP BY course
HAVING course LIKE 'D%';

pgAdmin 4 Instructions:

Execute this query in pgAdmin 4 as before.

Result:

Explanation:

  1. GROUP BY course: Groups the rows by course, resulting in groups for ‘Computer Sci’, ‘Mathematics’, and ‘Physics’.
  2. HAVING course LIKE 'P%': Filters these groups. It checks if the course name for each group starts with ‘P’. Only the group where the course is ‘Physics’ satisfies this condition.
  3. SELECT course: Selects the course name from the groups that passed the HAVING condition.

Distinction between WHERE and HAVING:

  • WHERE: Filters rows before grouping. It operates on individual rows and is used to select which rows will be part of the grouping process.
  • HAVING: Filters groups after grouping. It operates on the groups formed by GROUP BY and is used to select which groups will be included in the final result.

Example demonstrating both WHERE and HAVING (without complex aggregate functions yet):

Let’s say we want to find courses that contains character ‘m’ and only consider students with grades above 90.

SQL Query:

SELECT course
FROM students
WHERE grade > 90  -- WHERE clause to filter rows before grouping
GROUP BY course
HAVING course LIKE '%m%'; -- HAVING clause to filter groups after grouping

Result:

Explanation:

  1. WHERE grade > 90: Filters the students table, selecting only rows where grade is greater than 80. This results in the following temporary data set:

  1. GROUP BY course: Groups the rows in this temporary data set by course.
  2. HAVING course LIKE '%m%': Filters the groups formed in the previous step, keeping only the group where the course name contains character ‘m’ (which is ‘Computer Sci’).
  3. SELECT course: Selects the course name from the remaining group.

4. Combining ORDER BY, GROUP BY, and HAVING

You can use all three clauses (ORDER BY, GROUP BY, and HAVING) in a single query to achieve complex data manipulation. The order of execution in a SQL query with these clauses is typically:

  1. FROM: Specifies the table(s) to retrieve data from.
  2. WHERE: Filters rows based on specified conditions.
  3. GROUP BY: Groups rows based on specified columns.
  4. HAVING: Filters groups based on specified conditions.
  5. SELECT: Selects the columns to be displayed.
  6. ORDER BY: Sorts the final result set.

Example:

Let’s say we want to find courses ending with ‘s’, considering only students with grades above 80, and then order the resulting courses alphabetically.

SQL Query:

SELECT course
FROM students
WHERE grade > 80
GROUP BY course
HAVING course LIKE '%s'
ORDER BY course ASC;

Result:

Practical Exercise:

Using the students table, write SQL queries to achieve the following:

  1. List all courses in descending alphabetical order.
  2. Find all courses that contain the letter ‘s’ in their name.
  3. List the courses that contain the letter ‘s’ in their name, ordered alphabetically.
Solutions (Try to solve them yourself first!)
SELECT course
FROM students
GROUP BY course
ORDER BY course DESC;
SELECT course
FROM students
GROUP BY course
HAVING course LIKE '%s%';
SELECT course
FROM students
GROUP BY course
HAVING course LIKE '%s%'
ORDER BY course ASC;