Third Normal Form   «Prev 

Boyce-Codd Normal Form and Four Important Rules

Boyce-Codd Normal Form (BCNF) is also known as 3.5 Normal Form. It's a stronger version of Third Normal Form (3NF), with additional restrictions to ensure a higher degree of data normalization.
Here's a summary of why it's called 3.5NF:
  • BCNF builds upon 3NF: It incorporates all the rules of 3NF, which address transitive dependencies.
  • BCNF goes a step further: It also addresses a specific type of dependency that can still exist in 3NF tables, known as a "dependency involving a non-prime attribute."
  • Positioning between 3NF and 4NF: In terms of normalization levels, BCNF sits between 3NF and Fourth Normal Form (4NF), addressing more anomalies than 3NF but not as strict as 4NF.
Therefore, the 3.5 designation signifies its position as a refinement of 3NF, but not quite reaching the level of 4NF.
For a table to satisfy the Boyce-Codd Normal Form, it should satisfy the following two conditions:
  1. It should be in the Third Normal Form.
  2. And, for any dependency A → B, A should be a super key.

The second point means, that for a dependency A → B, A cannot be a non-prime attribute, if B is a prime attribute. This will be explained using BCNF in terms of relations.

Relationship Example

Consider the following relationship: R(A,B,C,D) and following dependencies:
A -> BCD 
BC -> AD
D -> B

The relationship shown above is already in 3rd normal form where the keys are A and BC.
Hence, in the functional dependency, A -> BCD, is the super key[1]. In the second relation BC->AD, BC is also a key.
However, in the relation, D -> B, D is not a key. Therefore, we can only break our relationship R into two relationships R1 and R2.
Relationship R broke into two relationships
Relationship R broke into two relationships.


"Super Key" within the context of "Boyce-Codd Normal Form"

In the context of **Boyce-Codd Normal Form (BCNF)** in relational database design, a **superkey** is a set of one or more attributes (columns) that can uniquely identify a tuple (row) in a relation (table).
Key Characteristics of a Superkey:
  1. Uniqueness: The values of a superkey must be unique across all rows in the relation. This means no two rows can have the same values for all the attributes in the superkey.
  2. Redundancy Allowed: A superkey may include additional attributes that are not necessary for uniquely identifying rows. For example, if {A} is a candidate key, then {A, B} is also a superkey but not a candidate key because it includes the redundant attribute B.

Relation to BCNF: BCNF is a stricter form of normalization used to eliminate redundancy and anomalies. A relation is in BCNF if: - For every **functional dependency** (FD) \( X \to Y \), the determinant \( X \) (the left-hand side of the FD) is a **superkey**. This means that any attribute or set of attributes \( X \) that functionally determines other attributes \( Y \) must have the ability to uniquely identify all rows in the relation. Example:
Consider a relation \( R \) with attributes \( \{A, B, C\} \) and the following functional dependencies:
  1. \( A \to B \)
  2. \( B \to C \)
  • \( A \) is a candidate key because it uniquely identifies all tuples.
  • \( \{A, B\} \) and \( \{A, B, C\} \) are superkeys because they also uniquely identify tuples but are not minimal.
  • If \( B \to C \) exists but \( B \) is not a superkey, the relation violates BCNF. To resolve this, the relation must be decomposed into smaller relations.

In summary, superkeys are central to understanding and achieving BCNF, as they define the attribute sets that must be the determinants in all valid functional dependencies.

Database Design for Mere Mortals

Boyce-Codd Normal Form (BCNF)

A superkey is a set of fields that contain unique values. You can use a superkey to uniquely identify the records in a table and recall that a candidate key is a minimal superkey. In other words, if you remove any of the fields from the candidate key, it will not be a superkey anymore. A determinant is a field that at least partly determines the value in another field. Note that the definition of 3NF deals with fields that are dependent on another field that is not part of the primary key. Now we are talking about fields that might be dependent on fields that are part of the primary key (or any candidate key). A table is in BCNF if:
  1. It is in 3NF.
  2. Every determinant is a candidate key.

Four important rules for Codd's Twelve Rule Criteria for a fully-relational DBMS

  1. All information in a relational database is represented explicitly at the logical level[2] in exatly one way, by the values in tables.
    Creating tables to hold data allows users to search the tables, link tables which have fields in common, and use those links to combine those records in one or more tables efficiently.
  2. Each and every datum (atomic value) in a relational database is guaranteed to be logically accessible by resorting to a combination of table name, primary key value, and column name. Being able to find a datum is at the heart of 2NF. Relations in 1NF must have a primary key, though the relation can contain columns that are not functionally determined by the relation's primary key field(s). While it will be theoretically possible to find any value given the datum's table name, primary key value, and column name, it may not be practical to find the information you want if the datum is in a record which is completely unrelated to the datum's source.
  3. Application programs and terminal activities remain logically unimpaired when information preserving changes of any kind that theoretically permit unimpairment are made to the base tables. Theoretically, deleting a record about an item should not also delete information about the company that makes it. If a table is not in 3NF, however, it is possible to wipe out information about a manufacturer by deleting the last record referring to a product made by that company.
  4. Integrity constraints specific to a particular relational database must be definable in the relational data sub-language and storable in the catalog, not in the application programs. A minimum of the following two integrity constraints must be supported:
    1. Entity integrity: No component of a primary key is allowed to be null.
    2. Referential integrity: For each distinct non-null foreign key value in a relational database, there must exist a matching primary key value from the same domain

    Maintaining referential integrity is an absolute requirement for a relational database. Putting all of a database's tables in 3NF, complete with links between tables based on common fields, allows the RDBMS to check each table's contents against related tables to ensure no important information is lost or changed.
[1]superkey: In the relational data model a superkey is a set of attributes that uniquely identifies each tuple of a relation. Because superkey values are unique, tuples with the same superkey value must also have the same non-key attribute values.
[2]logical level: All information in a relational database is represented explicitly by the values in the tables.

Save Time Writing