Third Normal Form   «Prev  Next»

Lesson 6Normalization and Codd's 12 criteria
ObjectiveExplain how Achieving Third Normal Form adheres to Codd's 12 Criteria.

Achieving Third Normal Form and Its Compliance with Codd's 12 Criteria

Third Normal Form (3NF) is a database schema design principle that enhances data integrity and minimizes redundancy. By achieving 3NF, database designers can improve the overall efficiency and organization of a relational database management system. This article will explore how adhering to Third Normal Form aligns with Codd's 12 criteria for relational databases.
Codd's 12 Criteria, Third Normal Form and Its Relation to Codd's Criteria
The 12 criteria proposed by Dr. Edgar F. Codd serve as a guideline for evaluating and ensuring the efficiency of relational database management systems. These criteria focus on aspects such as data integrity, logical data independence, and support for transaction processing. By adhering to these rules, a database management system can deliver optimal performance, scalability, and maintainability.
  1. Information Rule: The Information Rule states that all data in a relational database should be represented as values within tables. Third Normal Form promotes this criterion by eliminating transitive dependencies, ensuring that each non-key attribute is directly dependent on the primary key. This reduces redundancy and guarantees accurate representation of data within tables. All information in a relational database is represented explicitly at the logical level in exactly one way: by values in tables
  2. Guaranteed Access Rule: By adhering to 3NF, a database schema ensures that all data can be accessed using the combination of a table name, primary key, and attribute name. This aligns with the Guaranteed Access Rule, which requires that every data item be uniquely addressable. Each and every datum (individual value) in a relational database is guaranteed to be logically accessible by resorting to a combination of table name, primary-key value, and column name.
  3. Systematic Treatment of Null Values: 3NF contributes to the systematic treatment of null values by minimizing the chances of null values ppearing in a table. Since 3NF eliminates transitive dependencies and ensures proper organization of attributes, null values are less likely to occur. Null values (distinct from the empty character string or a string of blank characters or any other number) represent missing information in the RDBMS in a systematic way, independent of data type.
  4. Dynamic Online Catalog Based on the Relational Model: Achieving Third Normal Form aids in creatingdynamic online catalog based on the relational model. By ensuring that the schema is well-structured, database designers can more easily manage metadata and make changes to the schema as needed. The database description is represented at the logical level in the same way as ordinary data, so that authorized users can apply the same relational language to its interrogation as they apply to regular data.
  5. Comprehensive Data Sublanguage Rule: Third Normal Form, as a part of the relational model, inherently supports a comprehensive data sublanguage. This language includes features for defining, querying, and updating data, enabling efficient management of the database. A relational system may support several languages and various modes of terminal use (for example, fill-in-the-blanks mode). However, there must be at least one language whose statements are expressible, per some well-defined syntax, as character strings and that is comprehensive in supporting all of the following items:
    • Data definition
    • View definition
    • Data manipulation (interactive and by program)
    • Integrity constraints
    • Transaction boundaries (begin, commit, and rollback)
  6. View Updating Rule: The View Updating Rule states that any view that is theoretically updatable should be updatable by the system. Adhering to 3NF simplifies view updating, as it ensures that the underlying base tables have a well-structured schema and are free from anomalies that could complicate updates. All views that are theoretically updateable are also updateable by the system.
  7. High-Level Insert, Update, and Delete: By achieving Third Normal Form, a database schema supports high-level insert, update, and delete operations. This is because 3NF eliminates redundancies, reducing the likelihood of update anomalies and ensuring the integrity of the data. The capability of handling a base relation or a derived relation as a single operand applies not only to the retrieval of data but also to the insertion, update, and deletion of data.
  8. Physical Data Independence: Physical data independence is the capacity to change the physical storage or access methods without affecting the application logic. While 3NF primarily focuses on the logical organization of data, achieving Third Normal Form can indirectly support physical data independence by facilitating the maintenance of a well-structured schema. Applications programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representation or access methods.
  9. Logical Data Independence: Logical data independence refers to the ability to change the logical schema without impacting the application programs. Adhering to Third Normal Form contributes to logical data independence by providing a well-structured, efficient schema that can be modified without causing disruptions. Application programs and terminal activities remain logically unimpaired when information-preserving changes of any kind that theoretically permit unimpairment are made to the base tables.
  10. Integrity Independence: Third Normal Form supports integrity independence by ensuring that data dependencies are appropriately captured n the schema. This means that any constraints, such as primary and foreign keys, can be maintained independently of the application logic. Integrity constraints specific to a particular relational database must be definable in the relational data sublanguage and storable in the catalog, not in the application programs.
  11. Distribution Independence: Distribution independence implies that users are unaware of how the data is physically distributed across multiple locations. Although achieving Third Normal Form primarily impacts the logical organization of data, it indirectly contributes to distribution indeence by providing a well-structured schema that can be effectively partitioned and distributed across various storage systems. An RDBMS has distribution independence: A characteristic of certain Relational Database Managment Systems(RDBMS) in which storing parts of a database on more than one computer does not affect the RDBMS's ability to work with the database.
  12. Non-Subversion Rule: The Non-Subversion Rule states that if the system supports a low-level language, it should not be used to bypass integrity rules or constraints expressed in the high-level language. Adhering to Third Normal Form bolsters the Non-Subversion Rule by ensuring that all data dependencies are captured correctly in the schema. Consequently, the likelihood of violating integrity constraints is minimized, even when using a low-level. If a relational system has a low-level (single-record-at-a-time) language, that low-level language cannot be used to subvert or bypass the integrity rules or constraints expressed in the higher level relational language (multiple-records-at-a-time).

Data Analysis for DB Design

Codd's 12 Criteria for Normalization

E.F. Codd, the theoretician who laid the groundwork for relational databases described 12 rules of fully relational database. The theory went relatively unchallenged, but his comparisons of existing database systems ruffled feathers because, according to Codd's criteria, some were marketed as relational when they were not. Most of Codd's 12 criteria establish rules for how the RDBMS handles things internally and only peripherally influence how users design databases to be managed by the RDBMS. Designing a database in third normal form makes it possible for the RDBMS to follow Codd's rules. There are four rules that stand out.
Four Important Rules
Slogan to Remember Codd's Criteria: If you are having trouble remembering which requirements go with which normal form, there is a slogan you can use to keep them in order.
  • Codd Slogan for Normalization:
    The key, the whole key, and nothing but the key, so help me Codd.

    1. “The key” : Tables may not contain repeating groups, which prevent a table from having a primary key. (1NF)
    2. “the whole key”: Every attribute must be functionally dependent on the entire primary key. (2NF)
    3. “and nothing but the key,” : There may not be any transitive dependencies, which involve a non-key field.(3NF)
    4. “ so help me, Codd.” : The theorist who gave us these rules.

The slogan is based on the oath witnesses take in courts in the United States, which requires them to
tell "the truth, the whole truth, and nothing but the truth, so help me God."

Ted's Codd Slogan for Normalization

Codd produced these rules as part of a personal campaign to prevent his vision of the relational database being destroyed, as database vendors scrambled in the early 1980s to repackage existing products with a relational overlay. Rule 12 was particularly designed to counter such a positioning. Even if such re-packaged non-relational products eventually gave way to SQL DBMS, no popular relational database management systems are actually relational, be it by Codd's twelve rules or by the more formal definitions in his papers, in his books by its coworkers and successors, Christopher J. Date, David McGoveran and Fabian Pascal. Only less known DBMSs like Sybase strive to comply. The only commercial example, as of December 2010 is Dataphor. Some rules are controversial, especially rule three, because of the debate on three-valued logic.
In logic, a three-valued logic (also trinary logic, trivalent, ternary, or trilean, sometimes abbreviated 3VL) is any of several many-valued logic systems in which there are three truth values indicating true, false and some indeterminate third value. This is contrasted with the more commonly known bivalent logics (such as classical sentential or Boolean logic) which provide only for true and false. Conceptual form and basic ideas were initially created by Jan Lukasiewicz and C. I. Lewis. These were then re-formulated by Grigore Moisil in an axiomatic algebraic form, and also extended to n-valued logics in 1945.

Three-Valued Logic and Third Normal Form: Exploring the Connection
Three-valued logic and Third Normal Form (3NF) are concepts from different areas of computer science, but there is an interesting point where their ideas can be tangentially related when dealing with data modeling and database normalization.
  • Three-Valued Logic: Three-valued logic (3VL) is a logical system where, in addition to the traditional Boolean values of `True` and `False`, there is a third value: `Unknown` or `Null`. In the context of databases, this comes into play with the handling of `NULL` values, which represent missing or unknown data.
  • Third Normal Form (3NF): Third Normal Form is a level of database normalization used to organize data in a relational database to reduce redundancy and ensure data integrity. A table is in 3NF if:
    1. It is in Second Normal Form (2NF).
    2. All the attributes (columns) are functionally dependent only on the primary key, not on any other non-prime attribute (i.e., there are no transitive dependencies).
  • The Connection: The relevance of three-valued logic to 3NF primarily relates to how databases handle `NULL` values, especially in queries and constraints:
    1. Data Integrity: When designing tables in 3NF, it is crucial to consider how NULL values might affect functional dependencies and data integrity. For instance, constraints such as NOT NULL can be used to enforce that specific columns should not contain NULL values, making the database more consistent.
    2. Query Evaluation: SQL queries that involve conditions with NULL values are evaluated using three-valued logic. For example, a comparison involving NULL might yield Unknown, which can influence the results of queries and complicate the enforcement of certain dependencies or conditions needed for proper normalization.

Practical Implication: While 3NF focuses on reducing redundancy and improving data organization, three-valued logic introduces complexity in handling data with `NULL` values. This complexity must be managed carefully, as `NULL` values can affect how dependencies are defined and maintained, making it essential to design tables thoughtfully in 3NF to avoid ambiguities or inconsistencies in query results.

DeNormalization Theory - Exercise

Before you move to the next lesson, click the Exercise link below to reinforce your understanding of normalization rules.
DeNormalization Theory - Exercise

SEMrush Software