| Lesson 6 | Advanced group operators: DISTINCT, UNIQUE, HAVING |
| Objective | Correctly place group functions DISTINCT, UNIQUE, or HAVING clauses |
In earlier lessons, you learned how group functions such as SUM,
AVG, and COUNT summarize data and how
GROUP BY defines which rows belong to each group. This lesson focuses
on where to place three related elements in your SELECT statements:
DISTINCT and UNIQUE in the SELECT listDISTINCT inside group functions such as COUNTHAVING clause after GROUP BY
Correct placement affects both the result set and performance. Misplacing
DISTINCT or HAVING is a common source of subtle bugs in
production queries.
A basic grouped query uses GROUP BY with one or more aggregate
functions:
SELECT FIRSTNAME,
LASTNAME,
SUM(TOTAL_SALE_AMOUNT) AS customer_total
FROM CUSTOMER C
JOIN CUSTOMER_SALE CS
ON C.CUST_ID = CS.CUST_ID
GROUP BY FIRSTNAME, LASTNAME
ORDER BY FIRSTNAME, LASTNAME;
Here, GROUP BY FIRSTNAME, LASTNAME defines the grouping. For each
distinct combination of first and last name, Oracle returns one summary row. The
placement of GROUP BY is fixed in the SELECT statement syntax and does
not change when you add DISTINCT, UNIQUE, or
HAVING.
The DISTINCT and UNIQUE keywords are
set modifiers for the SELECT list. In Oracle, they are
synonyms: UNIQUE behaves the same as DISTINCT in a SELECT
statement.
When you place DISTINCT immediately after SELECT, it
removes duplicate rows from the final result based on all selected columns:
SELECT DISTINCT PRODUCT_ID
FROM SALE_ITEM;
This query returns each PRODUCT_ID at most once, even if it appears in
many rows in SALE_ITEM.
You can also apply DISTINCT to multiple columns:
SELECT DISTINCT CUSTOMER_ID,
PRODUCT_ID
FROM SALE_ITEM;
In this case, duplicates are removed based on the combination of
CUSTOMER_ID and PRODUCT_ID. Rows that share the same pair
of values collapse into a single row in the result.
Key points about placement:
DISTINCT or UNIQUE appears directly after the SELECT keyword.SELECT list, not just a
single column (unless you only select one column).
DISTINCT together with GROUP BY on the same
columns is usually redundant; GROUP BY already collapses rows by
group.
Group functions themselves can take DISTINCT or ALL as an
option. The general form is:
COUNT([DISTINCT | ALL] expr)
When you place DISTINCT inside a group function, it affects
only that function, not the entire row:
SELECT COUNT(DISTINCT PRODUCT_ID) AS distinct_products_sold
FROM SALE_ITEM;
Here, COUNT(DISTINCT PRODUCT_ID) returns the number of unique products
sold. Contrast that with:
SELECT COUNT(PRODUCT_ID) AS non_null_product_rows
FROM SALE_ITEM;
which returns the number of rows where PRODUCT_ID is not
NULL, counting duplicates.
You can also use DISTINCT with other group functions:
SUM(DISTINCT AMOUNT) – sum each distinct non-null value once.AVG(DISTINCT RATING) – average unique ratings.Correct placement patterns to remember:
SELECT DISTINCT col1, col2, ...
SELECT COUNT(DISTINCT col1) ...
The WHERE clause filters individual rows before grouping. The
HAVING clause filters groups after group
functions have been computed. You use HAVING when your condition depends
on an aggregate value.
A typical pattern is:
SELECT FIRSTNAME,
LASTNAME,
SUM(TOTAL_SALE_AMOUNT) AS customer_total
FROM CUSTOMER C
JOIN CUSTOMER_SALE CS
ON C.CUST_ID = CS.CUST_ID
GROUP BY FIRSTNAME, LASTNAME
HAVING SUM(TOTAL_SALE_AMOUNT) > 100
ORDER BY FIRSTNAME, LASTNAME;
In this example:
GROUP BY defines one row per customer.SUM(TOTAL_SALE_AMOUNT) calculates each customer’s total.HAVING keeps only customers with totals greater than 100.Correct placement in the SELECT statement:
SELECT ...
FROM ...
[WHERE row_condition]
GROUP BY group_expr
HAVING group_condition
[ORDER BY ...];
Common guidelines:
WHERE for conditions on individual rows (non-aggregated
expressions).
HAVING for conditions on groups (expressions that involve group
functions).
WHERE and HAVING in the same query for
efficient filtering before and after aggregation.
For example, to find customers in a specific city who have spent more than $500:
SELECT FIRSTNAME,
LASTNAME,
SUM(TOTAL_SALE_AMOUNT) AS customer_total
FROM CUSTOMER C
JOIN CUSTOMER_SALE CS
ON C.CUST_ID = CS.CUST_ID
WHERE C.CITY = 'Chicago' -- row filter
GROUP BY FIRSTNAME, LASTNAME
HAVING SUM(TOTAL_SALE_AMOUNT) > 500; -- group filter
This lesson focused on placement rather than introducing new group functions. When you work with grouped queries in Oracle, remember:
DISTINCT / UNIQUE immediately after SELECT
remove duplicate rows from the final result.
DISTINCT inside group functions such as COUNT deduplicates
values only for that function.
HAVING appears after GROUP BY and filters groups based on
aggregate values, while WHERE filters individual rows before grouping.
Correctly placing these elements makes your queries easier to read, easier to maintain, and less likely to produce subtle errors in production systems.
Click the Exercise link below to practice placing group functions in queries.