A dimension table consists of a primary key column that uniquely identifies each record (row) of dimension. A dimension is a framework that consists of one or more hierarchies that classify data. Usually dimensions are de-normalized tables and may have redundant data.
Let us take a quick recap of the concepts of normalization and de-normalization, as they will be used in this module.
Normalization is a process of breaking up a larger table into smaller tables free of any possible insertion, updation or deletion anomalies.
Normalized tables have
reduced redundancy of data. In order to get full information, these tables are usually joined.
In de-normalization, smaller tables are merged to form larger tables to reduce joining operations. De-normalization is particularly performed in those cases where retrieval is a major requirement and insert, update, and delete operations are minimal,
as in case of historical data or data warehouse. These de-normalized tables will have redundancy of data.
For example, in case of EMP-DEPT (employee-department) database, there will be two normalized tables as EMP(eno, ename, job, sal, deptno) and DEPT(deptno, dname), while in case of de-normalization we will have a single table EMP_DEPT with attributes [eno, ename, job, sal, deptno, dname]. Let us consider Location and Item dimensions as shown in Figure 3.4 a).
Here, in Location dimension location_id is the primary key with street_name, city, state_id and country_code as its
attributes. Figure 3.4 (b) shows another dimension namely Item having item_code as primary key and item_name, item_type, brand_name, and supplier_id as other attributes.