When you design the tables in your database, you also define the relationships between them. This process defines how the data will be stored in the tables. When you normalize a database, you organize the data in tables in such a way that the data does not repeat.
Typically, you end up with a greater number of tables, each containing fewer columns and a greater number of defined relationships.
Generally, it is a good idea to normalize, because it results in more, narrower tables, and each of those tables can have its own clustered index. Normalized tables usually require less disk space than tables that are not normalized, because the data is not repeated, and they often enable the software to execute queries more quickly. This is not always true, however, because normalizing increases the number of tables, and the more tables you join in a query, the more slowly the query is likely to execute. The basic concept of normalization is to remove repeating columns (as much as possible) and place them in one or more new tables.
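As a rough illustration of moving repeating columns into a new table, the following sketch uses Python's built-in `sqlite3` module rather than SQL Server. The table and column names (`orders_flat`, `orders`, `order_items`, `item1` to `item3`) are hypothetical and chosen only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Unnormalized: the item columns repeat inside one wide table.
    CREATE TABLE orders_flat (
        order_id INTEGER PRIMARY KEY,
        customer TEXT NOT NULL,
        item1 TEXT, item2 TEXT, item3 TEXT
    );
    INSERT INTO orders_flat VALUES (1, 'Ada', 'pen', 'ink', NULL);
    INSERT INTO orders_flat VALUES (2, 'Bob', 'paper', NULL, NULL);

    -- Normalized: the repeating columns move to their own table.
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer TEXT NOT NULL
    );
    CREATE TABLE order_items (
        order_id INTEGER NOT NULL REFERENCES orders(order_id),
        item TEXT NOT NULL,
        PRIMARY KEY (order_id, item)
    );

    INSERT INTO orders SELECT order_id, customer FROM orders_flat;
""")

# Turn each non-empty repeating column into one row of order_items.
# The column names come from this fixed tuple, never from user input.
for column in ("item1", "item2", "item3"):
    conn.execute(
        f"INSERT INTO order_items (order_id, item) "
        f"SELECT order_id, {column} FROM orders_flat WHERE {column} IS NOT NULL"
    )

item_rows = conn.execute(
    "SELECT order_id, item FROM order_items ORDER BY order_id, item"
).fetchall()
print(item_rows)  # [(1, 'ink'), (1, 'pen'), (2, 'paper')]
```

The normalized design stores no empty item slots, and an order is no longer limited to three items.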
When we design a database for an enterprise, the main objective is to create an accurate representation of the data, the relationships between the data, and the constraints on the data that are pertinent to the enterprise. To help achieve this objective, we can use one or more database design techniques. One of those techniques is called Entity-Relationship (ER) modeling. This module discusses the database design technique called normalization.
Normalization is a database design technique that begins by examining the relationships (called functional dependencies) between attributes. Attributes describe some property of the data, or of the relationships between the data, that is important to the enterprise. Normalization uses a series of tests (described as normal forms) to help identify the optimal grouping for these attributes and, ultimately, a set of suitable relations that supports the data requirements of the enterprise. While the main purpose of this module is to introduce the concept of functional dependencies and describe normalization up to Third Normal Form (3NF), later we will take a more formal look at functional dependencies and also consider normal forms that go beyond 3NF.
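A functional dependency A → B means that each value of A is associated with exactly one value of B. The sketch below tests that condition against sample rows. The helper `holds` and the `staff` data are hypothetical, invented for this illustration. Note that sample data can only refute a dependency. Whether it truly holds depends on the rules of the enterprise, not on the rows that happen to be present.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent is not violated by rows."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in determinant)
        value = tuple(row[attr] for attr in dependent)
        # The first value seen for a key is remembered; any later
        # different value for the same key violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data for illustration only.
staff = [
    {"staff_no": "S1", "name": "Ann", "branch_no": "B1", "branch_city": "Leeds"},
    {"staff_no": "S2", "name": "Raj", "branch_no": "B1", "branch_city": "Leeds"},
    {"staff_no": "S3", "name": "Ann", "branch_no": "B2", "branch_city": "York"},
]

print(holds(staff, ["staff_no"], ["name"]))          # True
print(holds(staff, ["branch_no"], ["branch_city"]))  # True
print(holds(staff, ["name"], ["branch_no"]))         # False: two Anns, two branches
```

Here `branch_city` depends on `branch_no` rather than on `staff_no` directly, which is the kind of grouping hint the normal forms test for.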
Normalization is a technique for producing a set of relations with desirable properties, given the data requirements of an enterprise. The purpose of normalization is to identify a suitable set of relations that support the data requirements of an enterprise. The characteristics of a suitable set of relations include the following:
- the minimal number of attributes necessary to support the data requirements of the enterprise;
- attributes with a close logical relationship (described as functional dependency) are found in the same relation;
- minimal redundancy, with each attribute represented only once, with the important exception of attributes that form all or part of foreign keys, which are essential for the joining of related relations.
The benefit of using a database that has a suitable set of relations is that the data will be easier for the user to access and maintain, and the database will take up minimal storage space on the computer.
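To make the maintenance benefit concrete, here is a minimal sketch, again using Python's `sqlite3` module with hypothetical `branch` and `staff` tables. The branch city is stored once, and the only repeated attribute is the foreign key `branch_no`, so a single update is visible for every related staff row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE branch (
        branch_no TEXT PRIMARY KEY,
        city TEXT NOT NULL
    );
    CREATE TABLE staff (
        staff_no TEXT PRIMARY KEY,
        name TEXT NOT NULL,
        branch_no TEXT NOT NULL REFERENCES branch(branch_no)
    );
    INSERT INTO branch VALUES ('B1', 'Leeds');
    INSERT INTO staff VALUES ('S1', 'Ann', 'B1'), ('S2', 'Raj', 'B1');
""")

# The city is stored once, so one UPDATE changes it for all staff at B1.
conn.execute("UPDATE branch SET city = 'York' WHERE branch_no = 'B1'")

cities = conn.execute("""
    SELECT s.staff_no, b.city
    FROM staff AS s
    JOIN branch AS b ON b.branch_no = s.branch_no
    ORDER BY s.staff_no
""").fetchall()
print(cities)  # [('S1', 'York'), ('S2', 'York')]
```

Had the city been copied into every staff row, the same change would need one update per row, and missing one would leave the data inconsistent.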
The following table assesses the potential impact of normalization on performance, maintainability, extensibility, scalability, availability, and security. This table is directly related to one of the Microsoft 70-029 exam objectives.

| Factor | Consideration |
| --- | --- |
| Performance | Normalizing your database can hurt performance if the tables are very large and your queries need to join many tables together. |
| Maintainability | Normalized tables are much easier to maintain than non-normalized tables, because normalized tables have fewer columns. |
| Extensibility | Extensibility generally refers to the ability to upgrade your applications in the future. Microsoft determines whether specific functions and features are available to support your existing database designs. Therefore, normalizing your database will not have any impact on its extensibility. |
| Scalability | Scalability allows you to migrate your database onto other operating systems, such as Windows 98 or 2012. Scalability is affected when you incorporate database features that aren't supported by other operating systems. However, scalability is not impacted by normalization. |
| Availability | Availability is a term that is used to describe SQL Server being online and available to process data. A highly normalized database that contains a lot of data might affect the availability of the server while very intensive queries are being processed. Therefore, if you normalize your database, your server may not be available for other tasks if you have lots of data and lots of normalized tables. However, many different query techniques can be employed to minimize this, such as using stored procedures. |
| Security | Security restrictions can be placed on tables or columns. However, it is easier and more convenient to place security restrictions on entire tables. Therefore, if you normalize your database, you can enhance security by restricting access to a table that represents a complete object, or entity, such as salary information. |

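
The performance consideration above comes down to joins. The sketch below, using Python's `sqlite3` module and a hypothetical four-table order schema, shows how many joins are needed to rebuild one flat report row from normalized tables. It does not measure anything. It only illustrates the query shape whose cost grows with table size.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
    );
    CREATE TABLE order_lines (
        order_id INTEGER NOT NULL REFERENCES orders(order_id),
        product_id INTEGER NOT NULL REFERENCES products(product_id),
        quantity INTEGER NOT NULL,
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO products VALUES (10, 'pen'), (11, 'ink');
    INSERT INTO orders VALUES (100, 1);
    INSERT INTO order_lines VALUES (100, 10, 2), (100, 11, 1);
""")

# Three joins across four tables are needed for one readable report.
report = conn.execute("""
    SELECT c.name, o.order_id, p.title, l.quantity
    FROM order_lines AS l
    JOIN orders AS o ON o.order_id = l.order_id
    JOIN customers AS c ON c.customer_id = o.customer_id
    JOIN products AS p ON p.product_id = l.product_id
    ORDER BY p.title
""").fetchall()
print(report)  # [('Ada', 100, 'ink', 1), ('Ada', 100, 'pen', 2)]
```

A single wide table would answer the same question without any joins, at the price of repeating the customer name and product title on every line.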
Now that you have learned about data normalization, you will learn about data denormalization in the next lesson.