Sound relational database design means taking the time to ensure your database conforms to certain rules. This module introduces techniques you can use to improve your database designs without worrying about the relational math.
When we design a database for an enterprise, the main objective is to
create an accurate representation of the data,
relationships between the data, and
constraints on the data that is pertinent to the enterprise.
To help achieve this objective, we can use one or more database design techniques. In this module we describe another database design technique called normalization[1], which begins by examining the relationships (called functional dependencies) between attributes. Attributes describe some property of the data, or of the relationships between the data, that is important to the enterprise.
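To make the idea of a functional dependency concrete, here is a minimal Python sketch. The function name `holds`, the sample table, and its column names (`order_id`, `customer_id`, `customer_city`) are all hypothetical and chosen only for illustration. Note that sample rows can only refute a functional dependency; whether one truly holds is a rule of the enterprise, not a property of any particular data set.

```python
def holds(rows, determinant, dependent):
    """Return True if no pair of rows contradicts determinant -> dependent.

    rows is a list of dicts; determinant and dependent are lists of
    column names. The dependency is violated when two rows agree on
    the determinant columns but differ on the dependent columns.
    """
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        value = tuple(row[col] for col in dependent)
        # setdefault stores the first value seen for this key and
        # returns the stored value on every later lookup.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data, for illustration only.
orders = [
    {"order_id": 1, "customer_id": 10, "customer_city": "Austin"},
    {"order_id": 2, "customer_id": 10, "customer_city": "Austin"},
    {"order_id": 3, "customer_id": 11, "customer_city": "Boston"},
]

fd_order_to_customer = holds(orders, ["order_id"], ["customer_id"])
fd_customer_to_city = holds(orders, ["customer_id"], ["customer_city"])
fd_city_to_order = holds(orders, ["customer_city"], ["order_id"])
```

In this sample, `order_id` determines `customer_id` and `customer_id` determines `customer_city`, but `customer_city` does not determine `order_id`, because two different orders share the city Austin.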
Learning Objectives
After completing the lessons in this module, you should be able to:
Explain the requirements for third normal form (3NF)
Identify transitive dependencies
Normalize a relation to 3NF
Describe Codd's 12 criteria for a fully relational RDBMS
Explain how Codd's 12 criteria relate to normalization
Explain what type of dependencies might require normalization beyond 3NF
Define denormalization and identify when it might be useful
Third Normal Form
If you are a data modeler normalizing database tables, understanding Third Normal Form (3NF) is crucial. 3NF is a level of database normalization that aims to reduce the duplication of data and protect data integrity by organizing the data within a relational database.
In essence, Third Normal Form is achieved when a database design meets all the requirements of the Second Normal Form (2NF) and additionally satisfies the following condition: every non-prime attribute of the table (an attribute that is not part of any candidate key) is non-transitively dependent on the primary key. In simpler terms, there should be no transitive dependencies for non-key attributes. A transitive dependency occurs when a non-key column depends on another non-key column, which in turn depends on the primary key. For example, in a hypothetical orders table where the order number determines the customer and the customer determines the customer's city, the city depends on the order number only transitively. 3NF seeks to eliminate these transitive dependencies. By doing so, it ensures that:
The data stored in the database is protected against the insertion, update, and deletion anomalies that transitive dependencies cause. Each fact is stored in one place, so it can be added, changed, or removed without affecting unrelated data.
Every non-key attribute is directly dependent on the primary key, and not on any other non-key attribute. This reduces redundancy, making the database more streamlined and easier to maintain.
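The effect of removing a transitive dependency can be sketched with Python's built-in `sqlite3` module. The table and column names below (`orders_unnormalized`, `orders`, `customers`, `customer_city`) are hypothetical, and the sketch assumes that `customer_id` determines `customer_city`. It splits the table into two relations, checks that joining them reproduces the original rows, and shows that a city change then touches a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Not in 3NF: customer_city depends on customer_id, a non-key column.
CREATE TABLE orders_unnormalized (
    order_id      INTEGER PRIMARY KEY,
    customer_id   INTEGER NOT NULL,
    customer_city TEXT    NOT NULL
);
INSERT INTO orders_unnormalized VALUES
    (1, 10, 'Austin'),
    (2, 10, 'Austin'),
    (3, 11, 'Boston');

-- 3NF decomposition: each non-key column depends only on its table's key.
CREATE TABLE customers (
    customer_id   INTEGER PRIMARY KEY,
    customer_city TEXT    NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
);
INSERT INTO customers
    SELECT DISTINCT customer_id, customer_city FROM orders_unnormalized;
INSERT INTO orders
    SELECT order_id, customer_id FROM orders_unnormalized;
""")

original = conn.execute(
    "SELECT order_id, customer_id, customer_city "
    "FROM orders_unnormalized ORDER BY order_id"
).fetchall()

# Joining the two relations should reproduce the original rows.
rejoined = conn.execute(
    "SELECT o.order_id, o.customer_id, c.customer_city "
    "FROM orders AS o JOIN customers AS c ON c.customer_id = o.customer_id "
    "ORDER BY o.order_id"
).fetchall()

customer_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

# The city is now stored once per customer, so one UPDATE changes one row
# and every order for that customer sees the new value.
conn.execute("UPDATE customers SET customer_city = 'Dallas' WHERE customer_id = 10")
cities_for_customer_10 = conn.execute(
    "SELECT DISTINCT c.customer_city "
    "FROM orders AS o JOIN customers AS c ON c.customer_id = o.customer_id "
    "WHERE o.customer_id = 10"
).fetchall()
```

In the unnormalized table the same change would have to be applied to every order row for customer 10, and missing one of them would leave the data inconsistent. That is the update anomaly the decomposition avoids.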
However, while 3NF can significantly improve a database design, it may not be practical or necessary in every scenario. The decision to normalize a database to third normal form should take into account the specific requirements and constraints of the application, including database performance, complexity, and the nature of the data being stored.
The next lesson discusses limitations of second normal form.
[1] Normalization: Normalization uses a series of tests (described as normal forms) to help identify the optimal grouping for these attributes, and ultimately a set of suitable relations that supports the data requirements of the enterprise.