Relational Constructs   «Prev  Next»

Lesson 10 Keys Column Descriptors
Objective Distinguish between key columns and descriptor columns of a DB Table.

Key Columns and Descriptor Columns in a Relational Database

Every column in a relational database table serves one of two functional roles: it either identifies a row uniquely and establishes relationships between tables, or it describes the entity that the row represents. Columns in the first category are called key columns. Columns in the second category are called descriptor columns. Distinguishing between these two roles is a foundational skill in relational database design - the distinction shapes how primary keys are chosen, how tables are linked, and how data integrity is enforced.

Key Columns

Key columns are used to establish unique row identification, relationships between tables, and referential integrity constraints. They fall into two categories:

  1. Primary Key. A primary key is a column or combination of columns whose values uniquely identify each row in the table. No two rows in the same table can share the same primary key value, and primary key columns cannot contain null values. Primary key values should also be stable - rarely if ever modified - because other tables may reference them through foreign keys, and changing a primary key value requires updating all referencing records to maintain consistency.

    A primary key column is not filled in from customer input or derived from pre-existing information about the entity. Its value is either assigned by a data-entry clerk following a defined scheme or generated automatically by the RDBMS (for example, through an auto-increment or sequence). The source of the value matters less than its guarantee: it will be unique across every row in the table for the lifetime of the database.

  2. Foreign Key. A foreign key is a column or set of columns in one table that references the primary key in another table. Foreign keys establish relationships between tables, enable multi-table queries, and enforce referential integrity - the constraint that every foreign key value must correspond to an existing primary key value in the referenced table. A row in the COMPUTER table with a ManuID value of M001 can only exist if M001 appears as a primary key value in the MANUFACTURING table.

    When designing a table, the database designer identifies key columns based on the table's identification requirements and the relationships that must be maintained between tables. Every table that participates in a relationship with another table requires a primary key.

Descriptor Columns

Descriptor columns - also called non-key columns or attribute columns - provide additional information about the entity that each row represents. They do not uniquely identify rows and are not used to establish relationships between tables. Their purpose is to store the data that applications, reports, and users need to work with the entity.

Descriptor columns can hold any data type - text, numbers, dates, or binary data - depending on the nature of the attribute they represent. The database designer selects descriptor columns based on the information that needs to be stored for each entity, choosing columns that support the application's functionality, reporting needs, and data analysis requirements.

The key characteristic of a descriptor column is that its values are not required to be unique. Many rows in the same table can share the same value for a descriptor column - and in practice, for columns like last name or city, many rows will. This is expected and correct; the uniqueness requirement belongs to the primary key, not to the descriptors.

Entities Are Represented by Tables

As established throughout this module, entities in the real world are represented by tables in a relational database. Each table contains columns that describe different attributes of the entity. A table describing customers includes the customer's name, address, phone number, and other identifying and descriptive information.

The CUSTOMER table below illustrates the column structure for the Stories on CD customer entity:

CUSTOMER
CustId CustLast CustFirst CustStreet CustAptNo CustCity CustState CustZip CustPhone

Identifying Key and Descriptor Columns in the CUSTOMER Table

The CUSTOMER table contains nine columns. Eight of them - CustLast, CustFirst, CustStreet, CustAptNo, CustCity, CustState, CustZip, and CustPhone - are descriptor columns. They describe the customer represented by the record: the customer's name, where they live, and how to contact them. None of these columns is required to be unique across all customer rows.

The ninth column, CustID, is the key column. Its value cannot be derived from customer input or from any pre-existing information about the customer - it is assigned by the data-entry clerk or generated by the RDBMS. Regardless of its source, the CustID column contains a unique value for every customer in the database, distinguishing each record from every other record in the table. That uniqueness is what makes CustID a key column.

Descriptor Columns

Because many records can share the same value in columns such as CustFirst, CustLast, and CustPhone, those columns are descriptor columns. They describe the customer the record refers to but do not necessarily differentiate that customer from every other customer in the table.

An extreme example makes the distinction vivid. Consider a father and son, both named Thomas Rothrauff, living at the same address and sharing the same phone number. The CUSTOMER table contains no suffix column for "Jr." or "III". Looking only at the descriptor columns, the two customers are completely indistinguishable - every descriptor value is identical.

The key column resolves this. Each Rothrauff is assigned a distinct CustID, and the two records become unambiguously identifiable in the database even though their descriptor columns are identical:

CUSTOMER
CustID CustLast CustFirst CustStreet CustAptNo CustCity CustState CustZip CustPhone
001 Rothrauff Thomas 1 Water St. C Hurt VA 24081 540.555.1212
002 Rothrauff Thomas 1 Water St. C Hurt VA 24081 540.555.1212

CUSTOMER table consisting of 2 rows.

CustID 001 and CustID 002 are unambiguously distinct records. Any order placed by either customer, any query that references their record, and any report that aggregates their data will correctly identify which Rothrauff is which - because the key column makes each row uniquely addressable.

The Database Schema

In the SQL environment, a schema is a named logical grouping of related database objects: tables, indexes, and views. A schema typically belongs to a single user or application. A single database can contain multiple schemas belonging to different users or applications, and each schema enforces a first level of access control by restricting visibility to the objects owned by that schema's user.

Think of a schema as the organizational container that groups all the tables, indexes, and views that belong together. In a multi-tenant database, separate schemas keep each tenant's data logically isolated even though they share the same physical database instance.

The ANSI SQL standard defines a command to create a database schema:


CREATE SCHEMA AUTHORIZATION {creator};

If the creator is JONES, the command is:


CREATE SCHEMA AUTHORIZATION JONES;

All database objects created within this schema - tables, indexes, views - are owned by JONES and are accessible to JONES by default. Other users require explicit grants to access objects in the JONES schema.

Column Properties in a Relation

Every column in a relational table has three formal properties that are part of the relational model:

  1. A name that is unique within the table. No two columns in the same table can share the same name. Two or more tables within the same schema may have columns with identical names - and as you will see in later lessons, this is not only permitted but highly desirable when a column in one table is a foreign key referencing the same-named column in another table. When the same column name appears in multiple tables involved in the same query operation, you qualify the name by preceding it with the table name and a period. For example: CUSTOMER.CustID distinguishes the CustID column of the CUSTOMER table from a CustID column that might appear in an ORDERS table.
  2. A domain. The values in a column are drawn from one and only one domain, making relations column homogeneous. Every column is subject to a domain constraint. At minimum, the domain constraint is a data type - integers, dates, text strings. Some DBMS platforms allow you to define custom domains with specific value sets and constraints, which can then be applied to columns across multiple tables to enforce consistent business rules throughout the schema.
  3. No positional concept. Columns in a relation have no inherent order. The columns can be viewed or queried in any sequence without affecting the meaning of the data. The order in which columns appear in a CREATE TABLE statement or a SELECT list is a presentational choice, not a structural one. This is distinct from the behavior of flat files and spreadsheets, where column position often carries implicit meaning.

The next lesson defines primary keys.


SEMrush Software 10 SEMrush Banner 10