The best way to use the EXISTS clause is to substitute it for an IN clause.
The link below converts a query with an IN clause to a query with an EXISTS clause.
Changing an `IN` clause to an `EXISTS` clause in Oracle SQL is often done for performance reasons, particularly when dealing with large datasets. The two constructs are logically similar but work differently under the hood.
Key Differences:
- `IN`: Checks if a value is within a set of values returned by a subquery.
- `EXISTS`: Checks if a subquery returns any rows (more efficient when dealing with large datasets, as it stops processing as soon as a match is found).
Example: Using `IN`:
SELECT EMP_ID, EMP_NAME, SALARY
FROM EMPLOYEES
WHERE DEPT_ID IN (SELECT DEPT_ID FROM DEPARTMENTS WHERE LOCATION = 'New York');
How it Works:
- The subquery retrieves all
DEPT_ID
s from the DEPARTMENTS
table where LOCATION = 'New York'
.
- The main query filters employees whose
DEPT_ID
is in that list.
Changing to `EXISTS`:
SELECT E.EMP_ID, E.EMP_NAME, E.SALARY
FROM EMPLOYEES E
WHERE EXISTS (
SELECT 1
FROM DEPARTMENTS D
WHERE D.DEPT_ID = E.DEPT_ID
AND D.LOCATION = 'New York'
);
How `EXISTS` Works:
- The subquery is correlated with the outer query (`D.DEPT_ID = E.DEPT_ID`).
- It checks if any matching row exists in `DEPARTMENTS`.
- If at least one row exists, the main query includes the corresponding employee.
When to Use `EXISTS` Over `IN`?:
Use `EXISTS`: when:
- The subquery returns a large number of values (better performance).
- There’s a need for correlation between tables.
- The subquery has complex joins.
Use `IN`: when:
- The subquery returns a small, finite set of values.
- No correlation is needed between the outer and inner query.
EXISTS is a test for existence.
It is placed the way IN might be placed with a subquery, but it differs in that it is a logical test for the return of rows from a query, not for the rows themselves. How many authors have written more than one book on the bookshelf?
select AuthorName, COUNT(*)
from BOOKSHELF_AUTHOR
group by AuthorName
having COUNT(*) > 1;
group by AuthorName
| AUTHORNAME | COUNT(*)|
|----------------------|---------|
| DAVID MCCULLOUGH | 2 |
| DIETRICH BONHOEFFER | 2 |
| E. B. WHITE | 2 |
| SOREN KIERKEGAARD | 2 |
| STEPHEN JAY GOULD | 2 |
| W. P. KINSELLA | 2 |
| WILTON BARNHARDT | 2 |
This table represents the extracted data from the image, where each author appears twice in the dataset.
Attempting to find both AuthorName and Title fails, however, because the group by made necessary by the COUNT(*) is on the primary key of the BOOKSHELF_AUTHOR table (AuthorName, Title).
Because each primary key, by definition, uniquely identifies only one row, the count of titles for that one row can never be greater than 1, so the having clause always tests false since it
does not find any rows:
select AuthorName, Title, COUNT(*)
from BOOKSHELF_AUTHOR
group by AuthorName, Title
having COUNT(*) > 1;
no rows selected.
EXISTS provides a solution. The following subquery asks, for each AuthorName selected in the outer query, whether an AuthorName exists in the BOOKSHELF_AUTHOR table with a count of Titles greater than one.
If the answer for a given name is yes, the EXISTS test is true, and the outer query selects an AuthorName and Title. The author names are correlated by the "BA" alias given to the first BOOKSHELF_AUTHOR table.
column AuthorName format a25
column Title format a30
select AuthorName, Title
from BOOKSHELF_AUTHOR BA
where EXISTS
(select 'x'
from BOOKSHELF_AUTHOR BA2
where BA.AuthorName = BA2.AuthorName
group by BA2.AuthorName
having COUNT(BA2.Title) > 1)
order by AuthorName, Title;
In the next lesson, you will learn the unusual case of embedding a subquery in the FROM clause of your query.