Lesson 6	Advanced group operators: DISTINCT, UNIQUE, HAVING
Objective	Correctly place group functions DISTINCT, UNIQUE, or HAVING clauses

Advanced Group Operators: DISTINCT, UNIQUE, HAVING

In earlier lessons, you learned how group functions such as SUM, AVG, and COUNT summarize data and how GROUP BY defines which rows belong to each group. This lesson focuses on where to place three related elements in your SELECT statements:

DISTINCT and UNIQUE in the SELECT list
DISTINCT inside group functions such as COUNT
The HAVING clause after GROUP BY

Correct placement affects both the result set and performance. Misplacing DISTINCT or HAVING is a common source of subtle bugs in production queries.

Review: GROUP BY and group functions

A basic grouped query uses GROUP BY with one or more aggregate functions:


SELECT   FIRSTNAME,
         LASTNAME,
         SUM(TOTAL_SALE_AMOUNT) AS customer_total
FROM     CUSTOMER       C
         JOIN CUSTOMER_SALE CS
           ON C.CUST_ID = CS.CUST_ID
GROUP BY FIRSTNAME, LASTNAME
ORDER BY FIRSTNAME, LASTNAME;

Here, GROUP BY FIRSTNAME, LASTNAME defines the grouping. For each distinct combination of first and last name, Oracle returns one summary row. The placement of GROUP BY is fixed in the SELECT statement syntax and does not change when you add DISTINCT, UNIQUE, or HAVING.

Oracle Database SQL

DISTINCT and UNIQUE in the SELECT list

The DISTINCT and UNIQUE keywords are set modifiers for the SELECT list. In Oracle, they are synonyms: UNIQUE behaves the same as DISTINCT in a SELECT statement.

When you place DISTINCT immediately after SELECT, it removes duplicate rows from the final result based on all selected columns:


SELECT DISTINCT PRODUCT_ID
FROM   SALE_ITEM;

This query returns each PRODUCT_ID at most once, even if it appears in many rows in SALE_ITEM.

You can also apply DISTINCT to multiple columns:


SELECT DISTINCT CUSTOMER_ID,
                PRODUCT_ID
FROM   SALE_ITEM;

In this case, duplicates are removed based on the combination of CUSTOMER_ID and PRODUCT_ID. Rows that share the same pair of values collapse into a single row in the result.

Key points about placement:

DISTINCT or UNIQUE appears directly after the SELECT keyword.
It applies to the entire row defined by the SELECT list, not just a single column (unless you only select one column).
Using DISTINCT together with GROUP BY on the same columns is usually redundant; GROUP BY already collapses rows by group.

DISTINCT inside group functions

Group functions themselves can take DISTINCT or ALL as an option. The general form is:


COUNT([DISTINCT | ALL] expr)

When you place DISTINCT inside a group function, it affects only that function, not the entire row:


SELECT COUNT(DISTINCT PRODUCT_ID) AS distinct_products_sold
FROM   SALE_ITEM;

Here, COUNT(DISTINCT PRODUCT_ID) returns the number of unique products sold. Contrast that with:


SELECT COUNT(PRODUCT_ID) AS non_null_product_rows
FROM   SALE_ITEM;

which returns the number of rows where PRODUCT_ID is not NULL, counting duplicates.

You can also use DISTINCT with other group functions:

SUM(DISTINCT AMOUNT) – sum each distinct non-null value once.
AVG(DISTINCT RATING) – average unique ratings.

Correct placement patterns to remember:

Row-level deduplication: SELECT DISTINCT col1, col2, ...
Value-level deduplication inside an aggregate: SELECT COUNT(DISTINCT col1) ...

Filtering groups with HAVING

The WHERE clause filters individual rows before grouping. The HAVING clause filters groups after group functions have been computed. You use HAVING when your condition depends on an aggregate value.

A typical pattern is:


SELECT   FIRSTNAME,
         LASTNAME,
         SUM(TOTAL_SALE_AMOUNT) AS customer_total
FROM     CUSTOMER       C
         JOIN CUSTOMER_SALE CS
           ON C.CUST_ID = CS.CUST_ID
GROUP BY FIRSTNAME, LASTNAME
HAVING   SUM(TOTAL_SALE_AMOUNT) > 100
ORDER BY FIRSTNAME, LASTNAME;

In this example:

GROUP BY defines one row per customer.
SUM(TOTAL_SALE_AMOUNT) calculates each customer’s total.
HAVING keeps only customers with totals greater than 100.

Correct placement in the SELECT statement:


SELECT ...
FROM   ...
[WHERE row_condition]
GROUP BY group_expr
HAVING group_condition
[ORDER BY ...];

Common guidelines:

Use WHERE for conditions on individual rows (non-aggregated expressions).
Use HAVING for conditions on groups (expressions that involve group functions).
You can combine WHERE and HAVING in the same query for efficient filtering before and after aggregation.

For example, to find customers in a specific city who have spent more than $500:


SELECT   FIRSTNAME,
         LASTNAME,
         SUM(TOTAL_SALE_AMOUNT) AS customer_total
FROM     CUSTOMER       C
         JOIN CUSTOMER_SALE CS
           ON C.CUST_ID = CS.CUST_ID
WHERE    C.CITY = 'Chicago'              -- row filter
GROUP BY FIRSTNAME, LASTNAME
HAVING   SUM(TOTAL_SALE_AMOUNT) > 500;   -- group filter

Putting it all together

This lesson focused on placement rather than introducing new group functions. When you work with grouped queries in Oracle, remember:

DISTINCT / UNIQUE immediately after SELECT remove duplicate rows from the final result.
DISTINCT inside group functions such as COUNT deduplicates values only for that function.
HAVING appears after GROUP BY and filters groups based on aggregate values, while WHERE filters individual rows before grouping.

Correctly placing these elements makes your queries easier to read, easier to maintain, and less likely to produce subtle errors in production systems.

Advanced Group Operators – Exercise

Click the Exercise link below to practice placing group functions in queries.

Advanced Group Operators – Exercise