Database Engine support for Subquery Statement
The Subquery approach to working with information works in many, but not all, SQL engines. You will need to do some investigation to determine whether your specific engine supports it. Microsoft's SQL Server 2000/2005, Microsoft Access, Oracle 9i/10G/11g, and many other engines do support it.
It seems that the capability becomes more of an issue in PC-based engines, but this is sure to change as more and more of the server-based engines also begin working on your workstation.
SQL Subquery
A very powerful feature of SQL is the subquery statement. As the name implies, subquery refers to a SELECT statement that contains a subordinate SELECT statement. This can be a very flexible way of selecting data. Let us assume that we want to list all employees who work in sales. The challenge is that the DEPARTMENTS table in the sample HR schema contains several sales departments, including
- Sales,
- Government Sales, and
- Retail Sales.
We could place literals for those three department names or their corresponding department IDs in the WHERE clause of our
SELECT statement. However, the problem we then face is maintenance of the query if a sales-related department is subsequently
added or eliminated. A safer approach is to use an SQL query to find the applicable department IDs when the query is run and then use that list of IDs to find the employees. The query to find the department IDs is simple enough:
SELECT DEPARTMENT_ID
FROM DEPARTMENTS
WHERE DEPARTMENT_NAME LIKE '%Sales%';
If we place the preceding SELECT statement in the WHERE clause of a query that lists the employee information of interest.
Note that SQL syntax requires the subselect to be enclosed in a pair of parentheses:
SELECT LAST_NAME, FIRST_NAME, HIRE_DATE, SALARY
FROM EMPLOYEES
WHERE DEPARTMENT_ID IN
(SELECT DEPARTMENT_ID
FROM DEPARTMENTS
WHERE DEPARTMENT_NAME LIKE '%Sales%')
ORDER BY LAST_NAME, FIRST_NAME;
The statement used in this example is said to contain a noncorrelated subselect because the inner SELECT (that is, the one inside the WHERE clause) can be run first and the results used when the outer SELECT is run. There also is such a thing as a correlated subselect (or subquery), where the outer query must be invoked multiple times, once for each row found in the inner query. Consider this example:
SELECT LAST_NAME, FIRST_NAME, SALARY, DEPARTMENT_ID
FROM EMPLOYEES A
WHERE SALARY >
(SELECT AVG(SALARY)
FROM EMPLOYEES B
WHERE A.DEPARTMENT_ID = B.DEPARTMENT_ID);