Lesson 9	Functional dependencies involved in normalization
Objective	Identify functional dependencies

Identify Functional Dependencies during Normalization

Identifying functional dependencies is a critical step in database normalization, as it helps ensure the database structure eliminates redundancy and maintains data integrity. Here are some commonly used techniques to identify functional dependencies during normalization:

Analyze Business Rules and Requirements
- Review Business Processes: Understand the business rules and processes governing the data to identify how attributes are related.
- Ask Questions: Collaborate with domain experts or stakeholders to clarify relationships between attributes.
- Document Constraints: Note any explicit constraints, such as "Each employee has a unique employee ID."
Examine Sample Data
- Review Data Patterns: Look at a sample dataset to observe how attribute values relate to one another.
- Identify Unique Values: Check if certain attributes or combinations of attributes uniquely determine other attributes.
Use Dependency Diagrams
- Draw Attribute Relationships: Create a dependency diagram or schema diagram that visually maps attributes and their dependencies.
- Highlight Determinants and Dependents: Clearly indicate which attributes determine the values of others.
Decompose into Candidate Keys
- Identify Primary Identifiers: Determine which attributes or combinations of attributes serve as candidate keys.
- Verify Uniqueness: Ensure that these keys uniquely identify rows in a table.
Evaluate Attribute Semantics
- Understand Attribute Meaning: Analyze the meaning of each attribute and its relation to others.
- Check Contextual Dependencies: Determine if an attribute always depends on a specific determinant.
Use Functional Dependency Testing
- Testing Combinations: For any attribute \( A \), test if the value of \( A \) is consistently determined by another attribute or set of attributes \( B \).
- Apply Logical Rules: Use inference rules (e.g., Armstrong's axioms) to derive additional dependencies.
Look for Repeated Data Patterns
- Identify Redundancies: Spot columns with repeating groups of data values.
- Correlate with Determinants: Determine if redundancy indicates a missing functional dependency.
Use Normalization Steps
- First Normal Form (1NF): Start by ensuring atomicity and identifying repeating groups.
- Second Normal Form (2NF): Identify dependencies on partial keys for composite keys.
- Third Normal Form (3NF): Check for transitive dependencies where non-key attributes depend on other non-key attributes.
Conduct Interviews or Workshops
- Gather Stakeholder Input: Discuss with users or developers familiar with the data relationships.
- Validate Dependencies: Use stakeholder feedback to confirm or refine identified dependencies.
Utilize Database Design Tools
- Modeling Tools: Use ERD (Entity-Relationship Diagram) tools to identify potential dependencies visually.
- Automated Analysis: Leverage software that can analyze schema and data for potential functional dependencies.
Consider Historical or Domain Knowledge
- Reference Standards: Use domain-specific standards or rules to infer dependencies.
- Leverage Existing Documentation: Review existing database documentation, schemas, or legacy systems for insights.
Verify with Logical Consistency
- Check for Contradictions: Test the identified dependencies by querying the data to ensure consistency.
- Simulate Updates: Verify how updates in one attribute affect dependent attributes to confirm dependencies.

These techniques, used in combination, provide a robust framework for identifying functional dependencies and achieving a well-normalized database. Unfortunately, there is no "cut-and-dried" way to identify functional dependencies. One will have to rely on the knowledge you gain during requirements analysis^[1] to make those determinations. You may design a database for an unfamiliar company, or need to account for data on subjects about which you have little prior knowledge. The interviews you conduct and the outside reading you do are absolutely invaluable in identifying the business objects to be represented in the database and the functional dependencies in the resulting relations.

Database Functional Dependency

One strategy you can use to determine whether a field is functionally dependent on the entire primary key is to ask:

1) Is the field an attribute of the business object represented by the table or
2) Is it a foreign key field used to establish a link with another table?

If neither of these statements is true, there is a good possibility the field is not functionally dependent. Consequently, the table is not in (2NF) second normal form.

Representing Functional Dependencies in DBMS:
In relational notation, a functional dependency is represented with the -> symbol. For example:
```
CustID -> CustFirst, CustLast, Address, Phone
```
The -> (arrow) character is read as "determines,” so the functional dependency above is read as "CustID determines CustFirst, CustLast, Address, and Phone." If a functional dependency has a multi-field determinant^[2], separate the fields to the left of the arrow by a comma. An automobile insurance policy, where each policy could cover more than one car under different terms, is an example of a multi-field determinant:
```
PolicyNo, CarID -> Deductible, CoveragePlan
```

Process of Identifying Functional Dependencies

1. First Step: The first step is to identify the Entity Types, the sets of columns that identify them (there might be multiple) and their attributes. In this case:

Student: {StudName}, {StudNR}
attributes: StudAddr
Course: {CourseNr}, {CourseName}
Course Edition: {CourseNr, Semester, Year}
Professor: {ProfName}

From this we can derive FDs with the following rules:
(ER2FD1) If an Entity Type has keys K1 and K2, then the FD K1 -> K2 holds
(ER2FD2) If an Entity Type has a key K and single-valued attribute A then FD K -> A holds
So in this case we derive:

StudName -> StudNR (ER2FD1)
StudNr -> StudName (ER2FD1)
StudNR -> StudAddr (ER2FD2)
CourseNr -> CourseName (ER2FD1)
CourseName -> CourseNr (ER2FD1)

2. Second Step: The second step is to identify the Relationships, between which Entity Types they hold, and what attributes they have:

Student-followed-Course-edition: (Student, Course Edition)
attributes: Grade
Prof-teaches-Course-edition: (Professor, Course Edition)

From this we derive FDs with the following rules:
(ER2FD3) If a Relationship is one-to-many or one-to-one, then there holds an FD from one of the keys of the many-Entity to one of the keys of the one-Entity. (ER2FD4) If a Relationship has a single-valued attribute, then there holds an FD from a combination of the keys of the participating Entity Types to the attribute.
So in this case we derive:

StudNr, CourseNr, Semester, Year -> Grade (ER2FD4)
CourseNr, Semester, Year -> ProfName (ER2FD3)

Note: We assume here that each edition only has one professor, and so Prof-teaches-Course-Edition is indeed one-to-many.
For ER2FD3 and ER2FD4 it is sufficient that this is done for only one combination of keys, since the others are implied. So we could, but do not also derive the following FDs:

StudName, CourseNr, Semester, Year -> Grade
StudNr, CourseName, Semester, Year -> Grade
StudName, CourseName, Semester, Year -> Grade

The rule ER2FD3 also has a variant for n-ary relationships where n>2, which is somewhat more complex. Consider, for example, a relationship that holds between Entity Types A, B and C. Then it should be checked if perhaps one of the participating Entity Types, say C, is functionally determined by the two others, i.e., if for every combination of A entity and B entity there is at most one C entity in this relationship with them. If that is the case, then an FD should be derived from a combination of keys of A and keys of B to the keys of C. An example of this would be a relation Supplies(Supplier, Part, Project) where each Project has for each Part at most one Supplier.

Understanding Functional Dependencies: A functional dependency is a one-way relationship between two attributes such that at any given time, for each unique value of attribute A, only one value of attribute B is associated with it through the relation. For example, assume that A is the customer number from the orders relation. Each customer number is associated with one customer first name, one last name, one street address, one city, one state, one zip code, and one phone number. Although the values for those attributes may change, at any moment, there is only one.

Functional Dependencies - Exercise

Before you move on to the next lesson, complete the following exercise to reinforce your understanding of functional dependencies.
Functional Dependencies - Exercise
The next lesson reviews the procedure for putting a relation in 2NF.

[1]Requirements analysis: The stage in the database design cycle when designers find out everything they can about the data the client needs to store in the database and the conditions under which that data needs to be accessed.

[2]multi-field determinant: A determinant consisting of more than one field.