SQL Aggregate Functions — SUM, AVG, MIN, MAX
Aggregate functions collapse multiple rows into a single value. SUM totals numbers, AVG computes the mean, MIN finds the smallest value, MAX finds the largest. Every SQL SELECT statement that produces summary reports relies on at least one of these four functions.
This guide covers all four, their NULL behavior, GROUP BY usage, and the common pitfalls that produce wrong results.
SQL aggregate functions (SUM, AVG, MIN, MAX) reduce rows to a single result. All four ignore NULL values — AVG divides by the count of non-NULL rows, not total rows. Pair them with GROUP BY for per-category summaries and HAVING to filter groups by aggregate values. Use CAST to avoid integer division in AVG. MIN and MAX work on dates and strings, not just numbers. For row-level aggregates without collapsing rows, use window functions instead.
How Aggregate Functions Work
An aggregate function takes a set of input values — one per row — and returns a single output value. If a SELECT statement returns 10,000 rows, SUM(amount) reads all 10,000 values and produces one total. That is the defining behavior: many rows in, one value out.
The critical rule that trips up beginners is NULL handling. All four functions silently skip NULL values. For SUM, MIN, and MAX, this is usually harmless — you get the sum, minimum, or maximum of whatever exists. For AVG, it changes the denominator.
Consider a table with five rows where the score column contains [80, NULL, 90, NULL, 70]. AVG(score) returns 80, not 48. The function divides by COUNT(score) — which is 3 — not by COUNT(*) — which is 5. If you need the average across all rows including NULLs as zeros, use AVG(COALESCE(score, 0)) instead.
SUM and AVG
SUM adds all non-NULL values in a column. It works on numeric types only. The most common use is totaling revenue, quantities, or costs:
SELECT SUM(sale_amount) AS total_sales FROM sales WHERE sale_date BETWEEN '2024-01-01' AND '2024-03-31';
The trap with AVG is integer division. If quantity is an integer column, SUM(quantity) / COUNT(*) performs integer division — 7 / 2 returns 3, not 3.5. The built-in AVG() function handles this correctly on most databases, but if you compute averages manually, cast first:
SELECT AVG(CAST(quantity AS DECIMAL(10,2))) AS avg_quantity FROM order_items;
A weighted average requires a manual formula because SQL has no built-in function for it:
SELECT SUM(price * quantity) / SUM(quantity) AS weighted_avg_price FROM order_items;
For running totals and moving averages that keep every row visible, aggregate functions alone are not enough. You need the window function variant: SUM() OVER. With PARTITION BY, you can compute running totals per department, customer, or any other grouping — without collapsing rows.
MIN and MAX
Unlike SUM and AVG, MIN and MAX work on any comparable data type — numbers, dates, and strings. This makes them useful beyond simple numeric analysis.
Date ranges are the most common non-numeric use case. Find the first and last event in a table:
SELECT MIN(hire_date) AS first_hire, MAX(hire_date) AS latest_hire FROM employees;
Finding the row that holds the min or max is a frequent requirement. The aggregate function returns the value, but not the row. Two approaches work:
-- Subquery approach SELECT employee_name, salary FROM employees WHERE salary = (SELECT MIN(salary) FROM employees);
-- Window function approach (avoids scanning the table twice) SELECT employee_name, salary FROM ( SELECT employee_name, salary, ROW_NUMBER() OVER (ORDER BY salary ASC) AS rn FROM employees ) ranked WHERE rn = 1;
Combining MIN and MAX in one query gives you the range — the spread between the smallest and largest value:
SELECT MIN(price) AS cheapest, MAX(price) AS most_expensive, MAX(price) - MIN(price) AS price_range FROM products;
Combining All Four
The most practical use of aggregate functions is combining all four in a single SELECT for exploratory data analysis. This is the pattern analysts run first when they encounter a new dataset:
SELECT COUNT(*) AS total_orders, SUM(order_total) AS revenue, ROUND(AVG(order_total), 2) AS avg_order, MIN(order_total) AS smallest_order, MAX(order_total) AS largest_order, MAX(order_total) - MIN(order_total) AS order_range FROM orders WHERE order_date >= '2024-01-01';
Add ROUND to AVG output. Raw averages with 15 decimal places are unreadable in reports. The ROUND(value, 2) call costs nothing and makes output usable.
Descriptive statistics — count, sum, mean, minimum, maximum, and range — form the foundation of exploratory data analysis. Computing all six in a single SQL query is the fastest way to profile an unfamiliar dataset.
— Wes McKinney, Python for Data Analysis, 3rd Edition
A department payroll summary combines all four with GROUP BY:
SELECT department, COUNT(*) AS headcount, SUM(salary) AS total_payroll, ROUND(AVG(salary), 0) AS avg_salary, MIN(salary) AS lowest_salary, MAX(salary) AS highest_salary FROM employees GROUP BY department ORDER BY total_payroll DESC;
GROUP BY and HAVING
Aggregate functions without GROUP BY produce a single row. GROUP BY produces one row per group — per department, per month, per region. Every non-aggregated column in the SELECT list must appear in the GROUP BY clause, or the query is invalid (MySQL in permissive mode is the exception, and it produces unpredictable results).
HAVING filters groups after aggregation. This is the key distinction from WHERE, which filters rows before aggregation:
A practical example — find departments where the average salary exceeds 60,000 and the team has at least 5 people:
SELECT department, COUNT(*) AS headcount, ROUND(AVG(salary), 0) AS avg_salary FROM employees GROUP BY department HAVING AVG(salary) > 60000 AND COUNT(*) >= 5;
Multi-level grouping lets you break data down by two or more dimensions. A monthly revenue report groups by year and month:
SELECT EXTRACT(YEAR FROM order_date) AS yr, EXTRACT(MONTH FROM order_date) AS mo, SUM(order_total) AS monthly_revenue FROM orders GROUP BY EXTRACT(YEAR FROM order_date), EXTRACT(MONTH FROM order_date) ORDER BY yr, mo;
Common Mistakes
NULL-unaware AVG. As described above, AVG ignores NULLs in the denominator. If your table has rows where the value is conceptually zero but stored as NULL, AVG will overstate the result. Use COALESCE to convert NULLs to zero before aggregating.
Integer division. In PostgreSQL, SQL Server, and MySQL, dividing two integers returns an integer. SELECT 7 / 2 returns 3. When computing averages manually, cast one operand: CAST(total AS DECIMAL(10,2)) / count.
Missing GROUP BY columns. Every column in the SELECT list that is not inside an aggregate function must appear in GROUP BY. MySQL's ONLY_FULL_GROUP_BY mode enforces this; older MySQL configs silently return arbitrary values for non-grouped columns.
Division by zero. Ratio calculations like SUM(revenue) / SUM(costs) blow up if any group has zero costs. Protect with CASE:
SELECT category, CASE WHEN SUM(costs) > 0 THEN ROUND(SUM(revenue) / SUM(costs), 2) ELSE NULL END AS roi FROM financial_data GROUP BY category;
Floating-point types (FLOAT, DOUBLE) accumulate rounding errors when summed over thousands of rows. For financial calculations, always use DECIMAL or NUMERIC types to guarantee exact arithmetic.
— Oracle, SQL Language Reference — Data Types
Floating-point accumulation. SUM on a FLOAT column may produce results like 999999.9999999997 instead of 1000000.00. Use DECIMAL or NUMERIC types for financial data.
Real-World Patterns
Financial reporting — monthly income, expenses, and net profit in one query:
SELECT EXTRACT(YEAR FROM txn_date) AS yr, EXTRACT(MONTH FROM txn_date) AS mo, SUM(CASE WHEN txn_type = 'income' THEN amount ELSE 0 END) AS income, SUM(CASE WHEN txn_type = 'expense' THEN amount ELSE 0 END) AS expenses, SUM(CASE WHEN txn_type = 'income' THEN amount ELSE -amount END) AS net_profit FROM transactions GROUP BY EXTRACT(YEAR FROM txn_date), EXTRACT(MONTH FROM txn_date) ORDER BY yr, mo;
Website analytics — page performance metrics over the last 30 days:
SELECT page_url, COUNT(*) AS visits, ROUND(AVG(load_time_ms), 0) AS avg_load_ms, MIN(load_time_ms) AS fastest_ms, MAX(load_time_ms) AS slowest_ms FROM page_views WHERE view_date >= CURRENT_DATE - INTERVAL '30 days' GROUP BY page_url HAVING COUNT(*) >= 100 ORDER BY visits DESC;
Inventory analysis — total value, average price, and price range per category:
SELECT category, COUNT(*) AS sku_count, CAST(SUM(price * stock_qty) AS DECIMAL(15,2)) AS total_value, ROUND(AVG(price), 2) AS avg_price, MIN(price) AS cheapest, MAX(price) AS most_expensive FROM products GROUP BY category ORDER BY total_value DESC;
Aggregate queries summarize data, but the summary only has meaning when column definitions are shared. If "revenue" is calculated differently across departments, SUM(revenue) produces a meaningless total. A business glossary in a data catalog — or a semantic layer — standardizes these definitions so aggregate results are comparable across teams and reports.