Both inner joins and outer joins are used in SQL to combine data from two tables based on a shared field, but they differ in how they handle rows that don't have a match in the other table.
Inner Join:
- Focuses on Matching Records: An inner join only returns rows where there's a match between the two tables on the join condition (usually common columns).
- Excludes Unmatched Rows: If a row in one table doesn't have a corresponding match in the other table based on the join condition, it's excluded from the result set.
- Simpler and Faster: Generally faster to execute compared to outer joins because it deals with a smaller dataset (matching rows only).
Outer Join:
- Preserves All Rows (or Specific Table's Rows): There are three types of outer joins (left, right, and full) that determine which rows are included in the result set, even if there's no match in the other table.
- Left outer join: Includes all rows from the left table and matching rows from the right table. Unmatched rows in the right table have null values for columns from the right table.
- Right outer join: Includes all rows from the right table and matching rows from the left table. Unmatched rows in the left table have null values for columns from the left table.
- Full outer join: Includes all rows from both tables, regardless of whether there's a match. Unmatched rows in either table have null values for the corresponding columns.
- More Complex: Outer joins can be slightly slower than inner joins because they need to handle unmatched rows and potentially fill them with null values.
Here's an analogy: Imagine matching students with their test scores from two separate lists.
- Inner join: Only students with scores on both lists are included.
- Left outer join: All students are included, even if their score is missing (null value) on the list.
- Right outer join: Includes all scores, even if the student is missing (null value) on the student list.
- Full outer join: Includes all students and all scores, filling in null values where there's no match.
Choosing the right join type depends on what information you need in your final result set. Inner joins are ideal when you only care about data with matches in both tables. Outer joins are useful when you want to see all data from one or both tables, even if there's no corresponding match in the other table.
You have seen plenty of subqueries in this module so far. Now you will learn about an advanced form of subquery called the
correlated subquery[1]. A "correlated subquery" is a subquery in which part of the query refers back to data in the outer query. The advantage of using a correlated subquery is that you can write a query that isolates the subquery portion while still combining the subquery with the main query as if it were a join. This form of query is sometimes easier to understand than a complicated multi-table join.
Correlated subquery syntax
Here is the general syntax:
SELECT col1, col2, ...FROM table alias1WHERE col3 =
(SELECT col4 FROM table alias2
WHERE alias1.col5 = alias2.col6)
Notice that alias1 (the alias for the table in the outer query) appears in the WHERE clause of the subquery. This connection between a column FROM the outer query and a column FROM the subquery makes this a correlated subquery. Here is an example, using the Pet Store schema. The query in the following diagram answers the question:
Question: Which PRODUCT record has been updated after the last time an entry was added to the Pet Care Log for that PRODUCT?
A correlated subquery is a subquery that relies on values from the outer query to execute. In Oracle SQL, a correlated subquery can be used to filter or manipulate data based on values in the outer query. Here's an example:
Consider two tables: employees and departments.
employees table:
+-----------+---------+----------+-------------+
| employee_id | name | salary | department_id |
+-----------+---------+----------+-------------+
| 1 | John | 5000 | 10 |
| 2 | Jane | 6000 | 10 |
| 3 | Mark | 5500 | 20 |
| 4 | Lucy | 4500 | 20 |
+-----------+---------+----------+-------------+
Departments table:
+---------------+----------+
| department_id | name |
+---------------+----------+
| 10 | HR |
| 20 | IT |
+---------------+----------+
Now, let's say we want to find all employees who earn the maximum salary in their department. We can use a correlated subquery to achieve this:
SELECT e1.employee_id, e1.name, e1.salary, e1.department_id
FROM employees e1
WHERE e1.salary = (
SELECT MAX(e2.salary)
FROM employees e2
WHERE e1.department_id = e2.department_id
);
In this query, the outer query is iterating over each employee (aliased as e1), and the correlated subquery (aliased as e2) calculates the maximum salary for each department. The outer query then filters the results by matching the employee's salary with the maximum salary in their department. The output will be:
+-----------+------+--------+--------------+
| employee_id | name | salary | department_id |
+-----------+------+--------+--------------+
| 2 | Jane | 6000 | 10 |
| 3 | Mark | 5500 | 20 |
+-----------+------+--------+--------------+