Relational Constructs   «Prev  Next»

Lesson 6 Rules for Database Table Rows
Objective List the rules for table rows.

Rules for Database Table Rows

In the relational model, a table represents a set of entities - customers, orders, products, or any other subject about which the database stores information. A row (also called a record or tuple) represents exactly one instance of that entity. A row in the Customers table represents one customer. A row in the Orders table represents one order.

To keep data consistent and reliably queryable, the relational model applies a set of rules to rows. These rules are not arbitrary conventions - they derive from E.F. Codd's foundational work on the relational model and from the mathematical properties of sets. A relation in the mathematical sense is a set of tuples, and sets have well-defined properties: no duplicates, no ordering dependency, and elements that are fully defined by their attributes. The row rules translate those mathematical properties into practical database design requirements.

The Three Core Rules for Table Rows

Every row in a relational table must satisfy three rules:

  1. Atomic values. Each column in a row stores a single, indivisible value - not a list, an array, or a compound structure. This requirement is the foundation of First Normal Form (1NF) and is fundamental to the relational model's ability to query, sort, filter, and index data reliably.

    Consider a CustomerPhone column that contains the value "555-1111, 555-2222". This violates the atomic values rule because the column holds two phone numbers in a single cell. SQL has no standard way to extract, compare, or index individual values within a comma-separated string. Queries that need to find all customers with a specific phone number become fragile and unreliable. The correct approach is to move multiple values to a related table - a CustomerPhones table with one phone number per row and a foreign key referencing the parent Customers table. Each row in CustomerPhones then contains one atomic phone number value that SQL can query, index, and join on reliably.

  2. No duplicate rows. The same real-world entity instance should not appear more than once in the same table. A relation is a set of tuples, and by mathematical definition a set contains no duplicate elements. If two rows in the Customers table contain identical values in every column, the database has no way to distinguish between them and no way to determine which one represents the actual customer.

    In practice, the no-duplicate-rows rule is enforced by the primary key and by unique constraints. The primary key guarantees that no two rows can share the same identifier. Unique constraints on other columns or column combinations (candidate keys) extend this guarantee to other attributes that should be distinct, such as email addresses or national identification numbers. The RDBMS rejects any INSERT or UPDATE that would create a duplicate primary key value or violate a unique constraint.

  3. Each row is uniquely identifiable. Every table - and particularly every table that participates in relationships with other tables - must have a primary key: a column or minimal set of columns whose values uniquely identify each row. Without a reliable mechanism for unique row identification, it is impossible to safely join tables, enforce referential integrity, or produce accurate reports.

    This rule is what drives the requirement for primary key design in every table. A table without a primary key is a table whose rows cannot be reliably addressed, and a database containing such tables cannot guarantee the consistency of its data.


Database Design for Mere Mortals

How Row Rules Influence Primary Keys, Uniqueness, and Atomic Values

The three row rules have direct and specific consequences for how primary keys, uniqueness constraints, and atomic value requirements are implemented in a relational database:

Aspect Row Rule Direct Influence Benefit
Primary Keys Each row must be uniquely identifiable Requires a PK on every table participating in relationships Reliable row identification and safe joins
Uniqueness No duplicate rows allowed Enforced via PK and UNIQUE constraints on candidate keys Prevents ambiguity and data errors
Atomic Values Single value per cell No lists or arrays in columns; 1NF compliance required Queryability and normalization

Why Primary Keys Matter in Relationships

The importance of primary keys becomes most visible when tables are linked through relationships. Consider a Customers table and an Orders table. Each order belongs to one customer, and the Orders table references the Customers table through a foreign key on CustomerID.

If the Customers table lacks a primary key - or if the same CustomerID value appears in more than one row - then an Order row referencing that identifier could match more than one customer row. This breaks three critical database capabilities:

A primary key prevents all of these problems by guaranteeing that each customer row is uniquely identifiable. Even when two customers share the same name, their CustomerID values differ, and every order references exactly one customer record. The primary key is the anchor that makes multi-table queries reliable.

Primary key characteristics:

Rows as Representations of Entity Instances

Each row in a table represents one instance of the entity the table describes. A row in the Customers table represents one customer. A row in the Products table represents one product. A row in the Orders table represents one order. The columns in that row store the attribute values that describe that specific instance - the customer's name and address, the product's description and price, the order's date and status.

This concept has a surface resemblance to objects in object-oriented programming, but the analogy should not be pushed too far. The distinction is important:

  • Database rows store data only - the attributes of one entity instance, organized as column values.
  • OOP objects store data and behavior - attributes (fields) plus the methods that operate on those attributes.

Many applications bridge this gap using an ORM (object-relational mapper) - a software layer that maps database rows to objects in the application's programming language, and maps object field changes back to SQL UPDATE statements. The ORM makes rows behave like objects from the application's perspective, but the underlying database stores only the data. The row is the persisted record of the entity's state; the behavior lives in the application code.

Relational Databases vs. Spreadsheets

At a high level, a relational database is a collection of tables, each containing rows and columns. This structure superficially resembles a spreadsheet workbook where each sheet is a table and each cell holds a value. The resemblance is real but the differences are significant.

In a spreadsheet, a cell will accept almost anything typed into it - a number, a string, a date, a formula, or nothing at all. The spreadsheet imposes no constraints on what a cell contains or whether the value is consistent with neighboring cells. This flexibility makes spreadsheets useful for ad hoc analysis but unreliable for storing data that multiple applications and users depend on.

In a relational table, each column is defined for a specific kind of value, and the RDBMS enforces that definition. A DATE column will reject a string that does not represent a valid date. A NOT NULL column will reject a blank entry. A foreign key column will reject a value that does not exist in the referenced table. These constraints are what make a relational database trustworthy as a shared data store.

Domain vs. Data Type

The concept that governs what values are valid for a column goes by two related but distinct terms: data type and domain[1].

Consider two columns, both defined as VARCHAR(100): one stores street addresses, the other stores color names. Both have the same data type, but they belong to completely different domains. Storing a color name in an address column or an address in a color column would not violate the data type but would violate the domain - the value is technically storable but semantically meaningless in that context.

Domain constraints are covered in detail in the next lesson. The distinction introduced here is that a well-designed column is defined not just by its data type but by the specific domain of values it is intended to hold. The domain is what turns a generic VARCHAR column into a well-defined, meaningful attribute in a business model.

A row exists to keep related column values together. Every column in a row stores an attribute of the same entity instance - the same customer, the same order, the same product. This co-location of related attribute values within a single row is why row uniqueness and primary keys are foundational to relational design. Without unique rows, the values in a row cannot be reliably attributed to a specific entity instance, and the fundamental purpose of the table is undermined.

Entities Attributes Exercise

Before moving on, click the link below to review entities and attributes from the course ER diagram:
Entities Attributes Exercise

[1] Domain: The domain determines the type of data values that are permitted for that attribute. Column domains are covered in detail in the next lesson.

SEMrush Software 6 SEMrush Banner 6