Lesson 8 | Verbalizing an ER diagram |
Objective | Describe the process of verbalizing an ER diagram. |
Verbalizing ER diagram
Verbalizing an Entity-Relationship (ER) diagram is a method used by database designers to describe the contents and relationships within the ER diagram using natural language. This method can facilitate understanding and communication about the database's structure, especially for individuals who might not be accustomed to interpreting graphical database models.
In essence, verbalizing the ER diagram translates the visual representation of entities, attributes, relationships, and constraints into a textual or spoken narrative.
When verbalizing an ER diagram, the designer will typically follow these steps:
- Entities: Begin by identifying the entities in the diagram. Entities are objects or concepts of interest that have data stored about them. They might be verbalized as "There is an entity called 'Customer'."
- Attributes: Next, describe the attributes of each entity. Attributes are characteristics or properties of an entity. For example, "The 'Customer' entity has attributes 'CustomerID', 'FirstName', 'LastName', and 'Email'."
- Relationships: Then, articulate the relationships between entities. These describe how entities are associated with one another. This could be verbalized as "A 'Customer' places an 'Order'", indicating a relationship between the 'Customer' and 'Order' entities.
- Cardinality and Participation: Explain the cardinality and participation of each relationship, which describes the numerical aspects of a relationship between entities. For example, "One 'Customer' can place many 'Orders', but each 'Order' is placed by exactly one 'Customer'." This statement describes a one-to-many relationship (1:N) from 'Customer' to 'Order'.
- Constraints: Finally, discuss any constraints present in the ER diagram. These are rules that limit the type of data that can be stored in the database. For instance, "Each 'Customer' must have a unique 'CustomerID'."
By verbalizing the ER diagram, the database designer offers a comprehensive and easy-to-understand explanation of the database's structure. This can serve as a quick reference for the design team, stakeholders, and any other individuals who need to understand the database's organization and relationships. This practice helps bridge the gap between technical and non-technical individuals, facilitating more effective communication and collaboration in the database design process.
Once an ER diagram is completed and especially if the diagram is huge, as is often the case with a database containing a dozen or more tables, many designers will opt to verbalize the diagram for quick reference and to provide a second source of documentation.
The ER diagram for Stories on CD is translated into relational notation this way, with entity names followed by attributes enclosed in parentheses. Note that the entity name is now plural, reflecting the name you will ultimately give to the table in the database itself.
Customers (CustID PK, CustLast, CustFirst, CustStreet, CustCity, CustState,
CustZip, CustPhone) Orders (OrderNo PK, CustID FK, OrderDate)
Line Items (OrderNo CPK/FK, CDNo CPK/FK, Quantity,
SellingPrice, LineCost, Shipped)
CDs (CDNo PK, CDTitle, DistID FK, RetailPrice, AgeGroup, Description)
Distributors (DistID PK, DistName, DistStreet, DistCity, DistState, DistZip, DistPhone)
Relational Notation
The relational notations retain the information necessary to translate them into basic database tables. However, unlike an ER diagram, this type of notation is not a particularly good tool for communicating the structure of a database to end users. It is principally for the benefit of the database designer, and often comes in handy when standardizing on names for entities(tables) and attributes(fields). As with an ER diagram, there is no standardized notation for verbalizing a diagram; consistency is the only rule of thumb.
The next lesson concludes the module.
Naming fields
When creating a medium-to-large database with a dozen or more tables and dozens of fields, it is especially important to have some sort of naming strategy for fields (attributes) that you apply consistently across the database. Often, you will implement the naming strategy when you verbalize the ER diagram.
One very common practice among database designers is to use the first two, three, or four letters of a tables name to distinguish between similar fields across tables. You have already seen this strategy applied throughout this course to the Stories on CD case study.
The ID (identification) fields for Customers and Distributors are similar in the case study; these were named CustID and DistID respectively. My own preference is to use capital letters to separate the table portion of the fields name (Cust) from the descriptive or identification portion (ID). Other designers prefer to use the underscore character (_) to separate the two portions (for example, cust_ID).
The majority of database designers agree on one thing: Do not use a space to separate the two portions (Cust ID, for example, is not acceptable), even if the RDBMS you are using supports multiple words to identify fields. Not all RDBMSs support multiple words, and this fact alone makes it an unwise practice.
Finally, be careful about using abbreviations to name fields. Do not use abbreviations that are not intuitive.
In our case study, for example, first and last names of customers are specified as CustFirst and CustLast, as I feel this abbreviated format is intuitive. Other designers are not comfortable using abbreviations at all, and would opt for a format such as CustFirstName and CustLastName (or, cust_first_name and cust_last_name). Here, as elsewhere, consistency combined with standard database design practices will serve you well.
Database Modeling