Creating Queries Using Standard Functions in PostgreSQL

1. Introduction to Standard Functions

Standard functions (also known as built-in functions) are pre-defined operations within the SQL language that perform specific tasks on data values. They take zero or more input arguments and return a single value. Using these functions can significantly simplify data manipulation and retrieval, allowing you to format output, perform calculations, handle nulls, and extract specific pieces of information directly in your query.

In PostgreSQL, there is a rich library of standard functions covering various data types: strings, numbers, dates/times, and more. We will explore some of the most commonly used ones.

Why use standard functions?

  • Data Transformation: Change data from one format to another (e.g., uppercase text, rounded numbers).
  • Data Extraction: Pull specific parts out of a data value (e.g., the year from a date, a substring from text).
  • Calculation: Perform mathematical operations.
  • Conditional Logic: Implement simple if-then-else logic within a query.
  • Efficiency: Often performs operations more efficiently within the database engine than processing raw data in an external application.

2. Common Standard Function Categories

Let’s explore functions by category, using hypothetical tables like employees (with columns employee_id, first_name, last_name, hire_date, salary) and products (with columns product_id, product_name, price, stock_date).

-- Drop tables if they exist to start fresh (optional, use with caution)
-- DROP TABLE IF EXISTS products;
-- DROP TABLE IF EXISTS employees;

-- 1. Create the 'employees' table
CREATE TABLE employees (
    employee_id SERIAL PRIMARY KEY,  -- SERIAL automatically creates an auto-incrementing integer
    first_name VARCHAR(50) NOT NULL,
    middle_name VARCHAR(50),         -- Allows NULLs, as used in COALESCE example
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE,       -- Added for the LOWER() example, UNIQUE ensures no duplicates
    hire_date DATE,
    salary NUMERIC(10, 2)            -- Precision 10, Scale 2 (e.g., 12345678.99)
);

-- 2. Insert sample data into the 'employees' table
INSERT INTO employees (first_name, middle_name, last_name, email, hire_date, salary) VALUES
('Alice', 'Jane', 'Smith', 'alice.j.smith@company.com', '2020-05-15', 60000.00),
('Bob', NULL, 'Johnson', 'B.Johnson@company.com', '2019-11-01', 75000.50),
('Charlie', 'Robert', 'Williams', 'charlie.w@company.com', '2021-01-20', 55000.00),
('Diana', NULL, 'Brown', 'diana.brown@company.com', '2020-05-15', 62000.75),
('Ethan', 'Michael', 'Davis', 'Ethan.M.Davis@company.com', '2022-08-30', 80000.00),
('Fiona', 'Grace', 'Miller', 'fiona.miller@company.com', '2023-12-10', 50000.00); -- Hired in December

-- 3. Create the 'products' table
CREATE TABLE products (
    product_id SERIAL PRIMARY KEY,
    product_name VARCHAR(100) NOT NULL,
    product_code VARCHAR(20) UNIQUE,      -- Added for the TRIM() example
    price NUMERIC(8, 2),                -- Precision 8, Scale 2 (e.g., 123456.78)
    stock_date DATE
);

-- 4. Insert sample data into the 'products' table
INSERT INTO products (product_name, product_code, price, stock_date) VALUES
('Laptop Pro 15', ' LP15-001 ', 1499.99, '2023-01-20'), -- Note leading/trailing spaces in code
('Wireless Mouse', 'WM-100A', 25.50, '2023-02-15'),
('Mechanical Keyboard', ' MK-X2000', 120.00, '2023-01-20'), -- Note leading space in code
('USB-C Hub', 'UCH-05B', 45.95, '2023-03-01'),
('Monitor 27 Inch', 'MON27-QHD ', 350.00, '2022-11-10'), -- Note trailing space in code
('Webcam HD', 'WC-1080P', 75.25, '2023-02-15'),
('External SSD 1TB', 'SSD-EXT1T', 99.99, '2023-04-05');

2.1 String Functions

These functions operate on text data (types like VARCHAR, TEXT, CHAR).

  • LENGTH(string): Returns the number of characters in a string.
    -- Find the length of each employee's last name
    SELECT last_name, LENGTH(last_name) AS name_length
    FROM employees;
    
  • UPPER(string) / LOWER(string): Converts a string to all uppercase or all lowercase.
    -- Display product names in uppercase
    SELECT UPPER(product_name) AS upper_product_name
    FROM products;
    
    -- Display email addresses (assuming an 'email' column) in lowercase
    SELECT LOWER(email) AS lower_email
    FROM employees;
    
  • SUBSTRING(string FROM start [FOR length]): Extracts a substring. The position count starts at 1.
    -- Get the first 3 characters of employee first names
    SELECT first_name, SUBSTRING(first_name FROM 1 FOR 3) AS first_initials
    FROM employees;
    
  • CONCAT(str1, str2, ...) or the || operator: Concatenates (joins) two or more strings together. The || operator is often preferred in PostgreSQL for its simplicity.
    -- Create a full name string from first and last names
    SELECT first_name, last_name, CONCAT(first_name, ' ', last_name) AS full_name_concat
    FROM employees;
    
    -- Using the || operator (recommended in PostgreSQL)
    SELECT first_name, last_name, first_name || ' ' || last_name AS full_name_operator
    FROM employees;
    
  • REPLACE(string, from_substring, to_substring): Replaces all occurrences of a substring within a string with another substring.
    -- Replace spaces with underscores in product names (for example usage)
    SELECT product_name, REPLACE(product_name, ' ', '_') AS underscored_name
    FROM products;
    
  • TRIM([LEADING | TRAILING | BOTH] [characters] FROM string): Removes leading, trailing, or both leading and trailing characters (default is whitespace) from a string.
    -- Assume a column 'product_code' might have leading/trailing spaces
    SELECT product_code, TRIM(product_code) AS trimmed_code
    FROM products;
    
    -- Remove leading zeros from a code
    SELECT '00123', TRIM(LEADING '0' FROM '00123'); -- Returns '123'
    

2.2 Numeric Functions

These functions perform calculations on numeric data types (INTEGER, NUMERIC, DECIMAL, REAL, DOUBLE PRECISION, etc.).

  • ABS(number): Returns the absolute (non-negative) value of a number.
    -- Calculate the difference between each employee's salary and a target of $65,000.
    -- Then, show the absolute value of that difference using ABS().
    SELECT
        employee_id,
        first_name,
        salary,
        salary - 65000.00 AS difference_from_target, -- This might be negative
        ABS(salary - 65000.00) AS absolute_difference -- This will always be positive
    FROM
        employees;
    
  • ROUND(number [, decimal_places]): Rounds a number to a specified number of decimal places. If decimal_places is omitted, it rounds to the nearest integer.
    -- Round product prices to 1 decimal place
    SELECT product_name, price, ROUND(price, 1) AS rounded_price
    FROM products;
    
    -- Round salaries to the nearest 1000 (using a negative decimal_places value)
    SELECT salary, ROUND(salary, -3) AS rounded_salary_k
    FROM employees;
    
  • CEIL(number) or CEILING(number): Returns the smallest integer greater than or equal to the number (rounds up).
    -- Demonstrate CEIL: Show product price rounded UP to the nearest whole dollar amount
    SELECT
    product_name,
    price,
    CEIL(price) AS price_rounded_up_dollar
    -- You can also use CEILING(price) which does the same thing
    FROM products;
    
  • FLOOR(number): Returns the largest integer less than or equal to the number (rounds down).
    -- Find the whole dollar amount of a price
    SELECT price, FLOOR(price) AS whole_dollars
    FROM products;
    
  • MOD(dividend, divisor) or the % operator: Returns the remainder of a division (modulo operation).
    -- Find if an employee ID is odd or even
    SELECT employee_id, MOD(employee_id, 2) AS remainder -- 0 for even, 1 for odd
    FROM employees;
    
    -- Using the % operator
    SELECT employee_id, employee_id % 2 AS remainder
    FROM employees;
    

2.3 Date/Time Functions

These functions operate on date and time data types (DATE, TIMESTAMP, TIME, INTERVAL).

  • NOW(): Returns the current date and time (as a timestamp with time zone).
  • CURRENT_DATE: Returns the current date (as a date).
  • CURRENT_TIME: Returns the current time (as a time with time zone).
    -- Select the current date and time
    SELECT NOW(), CURRENT_DATE, CURRENT_TIME;
    
  • EXTRACT(field FROM source): Extracts a specific part (like year, month, day, hour, minute) from a date/time value. source can be a date/time column or value, and field can be YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, DOW (day of week, 0=Sunday), DOY (day of year), etc.
    -- Extract the year and month of hire for employees
    SELECT hire_date,
           EXTRACT(YEAR FROM hire_date) AS hire_year,
           EXTRACT(MONTH FROM hire_date) AS hire_month
    FROM employees;
    
    -- Find employees hired on a Friday (DOW = 5 in PostgreSQL)
    SELECT first_name, last_name, hire_date
    FROM employees
    WHERE EXTRACT(DOW FROM hire_date) = 5;
    

    Note: DATE_PART('field', source) is a PostgreSQL-specific equivalent to EXTRACT.

  • AGE(timestamp, timestamp) or AGE(timestamp): Calculates the interval between two timestamps. If only one argument is given, it calculates the interval between the argument and CURRENT_DATE. The result is an INTERVAL type, often displayed in years, months, and days.
    -- Calculate the tenure of each employee as of today
    SELECT first_name, last_name, hire_date, AGE(hire_date) AS tenure
    FROM employees;
    
  • Date Arithmetic: You can perform arithmetic using INTERVAL.
    -- Find the date 90 days after the hire date
    SELECT hire_date, hire_date + INTERVAL '90 day' AS review_date
    FROM employees;
    
    -- Find employees hired within the last 3 years
    SELECT first_name, last_name, hire_date
    FROM employees
    WHERE hire_date >= CURRENT_DATE - INTERVAL '3 year';
    

2.4 Conversion Functions

These functions convert data from one type to another.

  • CAST(expression AS type): The standard SQL way to perform type conversions.
  • expression::type: PostgreSQL’s shorthand syntax for type casting.
    -- Convert a numeric salary to text for concatenation
    SELECT first_name, last_name, 'Salary: ' || CAST(salary AS VARCHAR) AS salary_text
    FROM employees;
    
    -- Using PostgreSQL shorthand
    SELECT first_name, last_name, 'Salary: ' || salary::VARCHAR AS salary_text
    FROM employees;
    
    -- Convert a text date string to a DATE type (if format is standard)
    SELECT CAST('2023-10-26' AS DATE);
        
    SELECT '2023-10-26'::DATE;
    

    Caution: Ensure the data is compatible with the target type, or an error will occur.

2.5 Conditional Expressions (often used like functions)

These allow for conditional logic within a query.

  • COALESCE(value1, value2, ...): Returns the first non-NULL value in the list of arguments. Very useful for providing default values for NULL columns.
    -- Display 'N/A' if the middle_name column is NULL
    SELECT first_name, COALESCE(middle_name, 'N/A') AS middle_name_display, last_name
    FROM employees; -- Assuming a 'middle_name' column that allows NULLs
    
  • NULLIF(value1, value2): Returns NULL if value1 equals value2, otherwise returns value1. Useful for preventing division-by-zero errors or treating specific values as NULL.
    -- Demonstrate NULLIF by returning NULL if a product price matches a specific value.
    -- Select the product name, original price, and the price modified by NULLIF.
    SELECT
        product_name,
        price,
        NULLIF(price, 75.25) AS price_excluding_webcam_value
    FROM
        products;
    
    -- Demonstrate NULLIF by returning NULL if an employee's salary matches a specific value.
    SELECT
        first_name,
        last_name,
        salary,
        NULLIF(salary, 55000.00) AS salary_excluding_charlie_value
    FROM
        employees;
    
  • CASE WHEN condition1 THEN result1 [WHEN condition2 THEN result2 ...] [ELSE else_result] END: The standard SQL way to implement if-then-else logic.
    -- Categorize products based on price
    SELECT product_name, price,
           CASE
               WHEN price < 50.00 THEN 'Budget'
               WHEN price >= 50.00 AND price < 200.00 THEN 'Mid-Range'
               ELSE 'Premium'
           END AS price_category
    FROM products;
    

3. Combining Functions

Functions can be nested or used together within expressions.

-- Get the first initial in uppercase for all employees
SELECT first_name, UPPER(SUBSTRING(first_name FROM 1 FOR 1)) AS initial
FROM employees;

-- Display hire year and month, padded with leading zero if month < 10
SELECT hire_date,
       EXTRACT(YEAR FROM hire_date) AS hire_year,
       LPAD(CAST(EXTRACT(MONTH FROM hire_date) AS TEXT), 2, '0') AS hire_month_padded -- LPAD adds padding
FROM employees;
-- Note: LPAD is another useful string function for formatting.

4. Using Functions in Other Clauses

Functions are not limited to the SELECT list. They can be used in WHERE, ORDER BY, and sometimes GROUP BY (though be careful not to use aggregate functions here yet).

  • In WHERE:
    -- Find employees hired in December (Month = 12)
    SELECT first_name, last_name, hire_date
    FROM employees
    WHERE EXTRACT(MONTH FROM hire_date) = 12;
    
    -- Find products whose names (in lowercase) start with 'w'
    SELECT product_name
    FROM products
    WHERE LOWER(product_name) LIKE 'w%';
    
  • In ORDER BY:
    -- Order employees by the length of their last name
    SELECT first_name, last_name
    FROM employees
    ORDER BY LENGTH(last_name) DESC; -- Longest names first
    
    -- Order products by the last 3 characters of their name
    SELECT product_name
    FROM products
    ORDER BY SUBSTRING(product_name FROM LENGTH(product_name) - 2);