Third Normal Form   «Prev  Next»

Lesson 8 Boyce-Codd Normal Form (BCNF)
Objective Verifies that all data sets are identified and segregated.

Boyce-Codd Normal Form (BCNF)

On a more practical note, just getting to 3NF or BCNF is the goal of many project teams. 3NF is the Third normal form used in relational database normalization. According to the Codd's definition, a table is said to be in (3NF) third normal form, if and only if,
  1. that table is in the second normal form (2NF),
  2. and every attribute in the table that do not belong to a candidate key should directly depend on every candidate key of that table.

BCNF known as 3.5NF

BCNF (also known as 3.5NF) is another normal form used in relational database normalization. It was introduced to capture some the anomalies that are not addressed by the 3NF. A table is said to be in BCNF, if and only if, for each of the dependencies of the form A --> B that are non-trivial, A is a super-key. The following table outlines the similarities and differences between 3NF and BCNF.
Properties 3NF BCNF
Achievability Always achievable Not always Always achievable
Quality of the tables Less More
Non-key Determinants Can have non-key attributes as determinants Cannot have non-key attributes as determinants
Proposed by Edgar F. Codd Raymond F. Boyce and Edgar F. Codd jointly proposed
Decomposition Loss-less join decomposition can be achieved Sometimes loss-less join decomposition cannot be achieved


Columns which have the potential to be NULL

Figure 4-81 shows removal of two values, which have the potential to be NULL values. These valued fields from a table called EDITION, will be used to create the new table called RANK. The result is a zero or one-to-one relationship between the RANK and EDITION tables. This implies that if a RANK record exists, then a corresponding EDITION record must exist as well. In the opposite case, however, an EDITION record can exist where a RANK record does not have to exist. This opposite case, accounts for an edition of a publication having no RANK and INGRAM_UNITS values. A recently published publication will rarely have any statistical information.
3NF and beyond removing NULL fields to new tables.
The image above represents a database normalization process, specifically transforming a database schema beyond the Third Normal Form (3NF) into a higher level of normalization, possibly Boyce-Codd Normal Form (BCNF) or Fourth Normal Form (4NF).
Tables in the Database System: The diagram showcases the transformation of an Edition table by decomposing it into two related tables: Edition and Rank.
1. Original Edition Table (Before Transformation)
  • Table Name: Edition
    • Attributes:
      • ISBN (Primary Key)
      • publisher_id (Foreign Key)
      • publication_id (Foreign Key)
      • print_date
      • pages
      • list_price
      • format
      • rank (Nullable, sometimes not applicable)
      • ingram_units (Nullable, sometimes not applicable)
  • Issues Identified:
    • rank and ingram_units are not applicable to all records, meaning they contain NULL values.
    • The table violates higher normal forms because rank and ingram_units are functionally dependent on ISBN only when a ranking exists.

2. Transformed Schema (Beyond 3NF)
After normalization, the `Edition` table is split into two tables:
(A) Revised `Edition` Table
  • Primary Key: ISBN
  • Attributes:
    • ISBN (Primary Key)
    • publisher_id (Foreign Key)
    • publication_id (Foreign Key)
    • print_date
    • pages
    • list_price
    • format
  • Purpose:
    • Stores general information about the book edition.
    • No longer contains rank and ingram_units, preventing unnecessary NULL values.

(B) New `Rank` Table
  • Primary Key: ISBN (Foreign Key from Edition)
  • Attributes:
    • ISBN (Foreign Key)
    • rank
    • ingram_units

Purpose:
  • Stores ranking information only if the edition has a rank.
  • If an edition does not have ranking data, it does not require a row in the Rank table.

Key Relationships and Constraints
  • One-to-Zero-or-One Relationship:
    • An Edition does not have to be ranked, but if a Rank entry exists, it must reference an existing Edition.
    • Represented by the optional foreign key constraint between Rank and Edition.

Normalization Explanation
This transformation ensures:
  1. Elimination of Partial Dependencies: Every attribute is fully functionally dependent on the primary key (ISBN).
  2. Elimination of Redundant NULLs: The previous table had unnecessary NULLs when no ranking data existed.
  3. Improved Data Integrity: The schema enforces a strict relationship where an entry in Rank cannot exist without an associated entry in Edition.

Conclusion:
For the following list of 3 elements, put the elements in a HTML unordered list .
  • The schema has been refactored beyond 3NF, likely into BCNF or 4NF, by creating a separate Rank table.
  • The ISBN serves as the primary key in Edition and a foreign key in Rank.
  • The optional relationship ensures that ranking information is stored only when needed, reducing storage waste.
Figure 4-81: 3NF and beyond removing NULL fields to new tables.

Boyce-Codd normal form (BCNF)

A stronger definition of 3NF called Boyce-Codd normal form (BCNF) was proposed later by Boyce and Codd. All these normal forms are based on a single analytical tool, which are the functional dependencies among the attributes of a relation. Later, a fourth normal form (4NF) and a fifth normal form (5NF) were proposed, based on the concepts of multivalued dependencies and join dependencies, respectively.
Boyce-Codd Normal Form (BCNF) is based on the functional dependencies (FDs) among the attributes of a relation. It is a stricter version of Third Normal Form (3NF) that further addresses anomalies caused by transitive dependencies.
Here's a breakdown of the relationship between BCNF and functional dependencies:
Functional Dependencies:
  • These are relationships between attributes in a relation, where the value of one attribute (determinant) uniquely determines the value of another (dependent).
  • Example: `OrderID -> CustomerName` (knowing the order ID uniquely tells you the customer's name).

Boyce-Codd Normal Form:
  • A relation is in BCNF if it satisfies two conditions:
    1. It's in 3NF (no partial dependencies or transitive dependencies).
    2. For every FD (X -> Y) in the relation, where X is not a superkey (a minimal set of attributes that uniquely identifies a tuple), X must be a candidate key (a potential superkey).

Why BCNF focuses on FDs:
  • Transitive dependencies occur when an attribute depends on another, which in turn depends on a third. This can lead to redundant data and update anomalies.
  • BCNF eliminates such anomalies by ensuring that determinants (X in X -> Y) are always candidate keys. This guarantees that any updates to the determinant will propagate directly to all dependent attributes, maintaining data consistency.

Visualizing the connection:
Imagine a table with attributes `CustomerID`, `OrderID`, `ProductName`, and `Price`.
  • `CustomerID -> OrderID` and `OrderID -> ProductName` are FDs.
  • However, `CustomerID -> ProductName` is a transitive dependency, derived from the existing FDs.
  • If the table is not in BCNF, updating `CustomerID` might not automatically update `ProductName`, leading to inconsistencies.

By enforcing the BCNF condition, we can restructure the table to eliminate transitive dependencies and ensure proper data flow based on FDs.

SEMrush Software