Lesson 3 | Transitive Dependencies: |
Objective | Define Transitive Dependencies |
Define Transitive Dependencies
Transitive dependencies occur when a
determinant[1] affects the values of more than one business object. Consider again the relation from the preceding lesson, which stores information about items stocked by Stories on CD. Again, we are assuming each CD belongs to exactly one category.
Item (ItemNo, Title, Price, CategoryID, CategoryName, CategoryManager)
The functional dependency for the relation is written:
ItemNo -> (Title, Price, CategoryID, CategoryName, CategoryManager)
The problem is that there are actually two functional dependencies masquerading as one. It is equally correct to write the dependency as:
ItemNo -> (Title, Price, CategoryID)
CategoryID -> (CategoryName, CategoryManager)
Or, to put it another way:
ItemNo -> Title, Price, CategoryID -> CategoryName, CategoryManager
The functional dependency follows the mathematical property of
transitivity[2] , which states that if A=B and B=C, then A=C. Because ItemNo determines CategoryID, which in turn determines CategoryName and CategoryManager, the relation contains a transitive dependency. The next lesson explains how to eliminate transitive dependencies.
Data Analysis for DB Design
Data Modeler considers "Transitive dependencies" during the Normalization Process
A data modeler does consider transitive dependencies during the normalization process, particularly when aiming to achieve third normal form (3NF) or higher levels of normalization in database design. Here's how:
- Transitive Dependency: This occurs when a non-prime attribute (an attribute that is not part of any candidate key) depends on another non-prime attribute. In simpler terms, if A → B and B → C, then A transitively determines C.
- Normalization Process:
- First Normal Form (1NF): Ensures that all attributes contain atomic values, and there are no repeating groups.
- Second Normal Form (2NF): Applies to tables that are already in 1NF. It requires that all attributes must depend on the whole key, not just part of it. This addresses partial dependencies but not transitive dependencies.
- Third Normal Form (3NF): Builds on 2NF. It requires that all non-prime attributes (attributes that are not part of any candidate key) must be non-transitively dependent on the primary key. This means no transitive dependencies are allowed.
When a data modeler identifies transitive dependencies, they will typically:
- Decompose the Table: Break down the existing table into multiple tables such that each non-prime attribute is only dependent on the key, the whole key, and nothing but the key.
- Create New Tables: For each transitive dependency found, create a new table where the determinant (the attribute on which the other attributes are dependent) becomes a primary key or part of a composite key in the new table.
For example, if you have an Employee table with 4 columns:
Employee (EmployeeID, Name, ProjectID, ProjectName)
If `ProjectID` determines `ProjectName` independently of `EmployeeID`, this is a transitive dependency. The data modeler would then:
- Split this into two tables:
- Employee (EmployeeID, Name, ProjectID)
- Project (ProjectID, ProjectName)
By doing this, each table is now in 3NF because `ProjectName` depends only on `ProjectID` in the `Project` table, and `Employee` table attributes only depend on `EmployeeID`. Considering and resolving transitive dependencies ensures data integrity, reduces data redundancy, and makes the database design more efficient and maintainable.
Example of Transitive Dependencies
A transitive dependency occurs in database design when a non-prime attribute depends on another non-prime attribute indirectly through a primary key. It violates the Third Normal Form (3NF) and needs to be addressed for a more optimal database structure.
Example:
Consider a table `Employees` with the following attributes:
Employee_ID |
Department_ID |
Department_Name |
Department_Head |
1 |
101 |
Sales |
Alice |
2 |
101 |
Sales |
Alice |
3 |
102 |
Marketing |
Bob |
4 |
103 |
IT |
Charlie |
- Employee_ID is the primary key.
- Department_ID is a foreign key that identifies which department an employee belongs to.
- Department_Name and "Department_Head" are attributes related to the department.
Functional Dependencies:
- Employee_ID → Department_ID: (Each employee belongs to a single department, so Employee_ID determines Department_ID.)
- Department_ID → Department_Name, Department_Head: (Each department has a unique name and head, so Department_ID determines Department_Name and Department_Head.)
Transitive Dependency:
There's an indirect relationship here:
Employee_ID → Department_ID → Department_Name, Department_Head
This means that "Department_Name" and "Department_Head" are transitively dependent on "Employee_ID" through "Department_ID".
This violates 3NF because "Department_Name" and "Department_Head" depend on "Department_ID", not directly on "Employee_ID". To remove this transitive dependency, we would need to separate this data into two tables:
Revised Structure:
1. Employees Table:
Employee_ID |
Department_ID |
1 |
101 |
2 |
101 |
3 |
102 |
4 |
103 |
2.
Departments Table:
Department_ID |
Department_Name |
Department_Head |
101 |
Sales |
Alice |
102 |
Marketing |
Bob |
103 |
IT |
Charlie |
Now, the transitive dependency is removed because each table handles only the direct dependencies.
Transitive Dependencies
A transitive dependency occurs when the value of a non-key attribute is functionally dependent on the value of another non-key attribute. In order for a transitive dependency to exist, a table must have at least two non-key attributes. Consider the following table (FareCode is a code like C for Coach or FC for First Class):
Passenger = { PassengerId + Name + FareCode + FarePrice }
Here we have three non-key attributes so we might have a transitive dependency. We need to examine the non-key attributes to see if any functional dependencies exist. Start with passenger name and fare code. If we know a passenger's name, do we know the fare code?
If the answer is yes, we have found a
transitive dependency.
However, in this case the answer is no. There might be two people on a plane with the same name but one is flying first class and the other is flying coach. The specific value of FareCode is dependent on the unique identifier of Passenger (PassengerId). Next we will check fare code and fare price. If we know the fare code, do we know the fare price? Assuming that there is only one price for each code, then the answer is yes.
This is an example of a
transitive dependency. We correct this problem by splitting the original table into two tables.
The first is a table with the original key and all values determined directly by it. A separate table is created with the transitive dependency. For our example, we have:
Passenger = { PassengerId + Name + FareCode }
Fare = { FareCode + FarePrice }
[1]determinant: An attribute upon which another attribute is functionally dependent.
[2]transitivity: In mathematics, a binary relation R over a set X is transitive if whenever an element a is related to an element b, and b is in turn related to an element c, then a is also related to c. Transitivity is a key property of both partial order relations and equivalence relations.