Lesson 4 | Entity identifier rules |
Objective | List two rules for creating entity identifiers. |
Two Rules for Creating Entity Identifiers
The two most important rules for creating an entity identifier are:
- Keep the identifier meaningless
- Keep the identifier unrestricted
When choosing an entity identifier, it is important to use meaningless values, that is, values that carry no information about the entity and therefore never need to change.
For example, combining the first five letters of an employee's last name with his or her phone number produces a meaningful identifier that could present problems.
Question: What if the employee's phone number changes?
It is very important when designing a database to make sure you choose identifiers that will never change.
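Social security numbers are popular identifiers for people, but they are restricted: only people registered with the U.S. Social Security Administration (chiefly U.S. citizens and residents authorized to work) have one.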
Companies with overseas customers or universities where foreign students make up a percentage of the student population must be especially wary of using social security numbers as identifiers.
In other circumstances, however, a social security number can be a good entity identifier. Whenever possible, meaningless values should be assigned as key attributes.
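The difference between a meaningful and a meaningless identifier can be sketched in a few lines of Python. This is only an illustration: the function names, the sequential counter, and the sample employee are assumptions made for the example, not part of any particular DBMS.

```python
import itertools

# Hypothetical counter that hands out meaningless, sequential employee numbers.
_next_employee_id = itertools.count(start=1)


def meaningful_id(last_name: str, phone: str) -> str:
    """Build an identifier from data: first five letters of the last name plus the phone number."""
    return last_name[:5].upper() + phone


def new_employee(last_name: str, phone: str) -> dict:
    """Create an employee record keyed by a meaningless number."""
    return {
        "employee_id": next(_next_employee_id),
        "last_name": last_name,
        "phone": phone,
    }


employee = new_employee("Johnson", "5551234")
old_meaningful = meaningful_id(employee["last_name"], employee["phone"])

# The employee's phone number changes.
employee["phone"] = "5559876"
new_meaningful = meaningful_id(employee["last_name"], employee["phone"])

print(old_meaningful, "->", new_meaningful)  # the meaningful identifier is now different
print(employee["employee_id"])               # the meaningless identifier is unchanged
```

The meaningless number stays valid however the employee's data changes, while the meaningful identifier has to be rebuilt whenever one of its parts changes.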
Question: Which guidelines should be observed when creating entity identifiers in a data model?
In addition to the two rules above, which govern the values an identifier holds, the name given to an identifier in a data model should follow a set of guidelines that promote consistency, readability, and maintainability. Here are the key guidelines to observe when naming entity identifiers in a data model:
- Uniqueness: Each entity identifier should be unique within the data model to avoid confusion and ensure accurate referencing. This prevents ambiguity and ensures the integrity of the relationships between entities.
- Meaningful and Descriptive Names: Choose identifier names that clearly describe the entity they represent. Avoid vague or generic names, and instead opt for terms that convey the purpose or function of the entity. This guideline applies to the name of the identifier only; the values stored in it should still be meaningless. Descriptive names improve the readability of the data model and make it easier for others to understand its structure.
- Consistent Naming Conventions: Use a consistent naming convention throughout the data model. This can include conventions such as camelCase, PascalCase, or using underscores to separate words. Consistency in naming conventions promotes a unified and professional appearance and makes the data model easier to navigate.
- Avoid Abbreviations and Acronyms: While it may be tempting to use abbreviations or acronyms to shorten entity names, doing so can lead to confusion and misinterpretation. Instead, use full, descriptive names that clearly convey the entity's purpose.
- Use Singular Nouns: In general, use singular nouns for entity identifiers. This makes it easier to understand the relationships between entities and helps to maintain consistency across the data model.
- Start with a Letter: Entity identifiers should begin with a letter, preferably in lowercase for consistency. Starting with a letter ensures compatibility with various database management systems and programming languages.
- Avoid Special Characters and Spaces: Refrain from using special characters (e.g., %, &, $) and spaces in entity identifiers, as these can cause issues in certain database systems and programming languages. Stick to alphanumeric characters to ensure compatibility.
- No Reserved Keywords: Do not use reserved keywords or terms from the database management system or programming language you are working with. This will prevent conflicts and potential errors when implementing the data model.
- Maximize Readability: Entity identifiers should be easily readable by humans, not just computers. Strive for clarity and simplicity when naming entities, and avoid overly complex or confusing terms.
- Documentation: Document the purpose and meaning of each entity identifier within the data model. This aids in understanding and maintaining the data model, especially for team members who may not be familiar with the project's intricacies.
By adhering to these guidelines when creating entity identifiers in a data model, you will ensure that your data model is consistent, scalable, and easily understood by both humans and computers. This will, in turn, facilitate a more efficient development process and contribute to the overall success of your project.
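Some of the naming guidelines above (start with a letter, avoid special characters and spaces, avoid reserved keywords) are mechanical enough to check automatically. The following is a minimal sketch, assuming a lowercase-with-underscores convention; the `RESERVED` set is a small illustrative sample, not the keyword list of any real DBMS, and `check_identifier_name` is a hypothetical helper.

```python
import re

# Illustrative sample of reserved words; a real DBMS defines a much longer list.
RESERVED = {"select", "table", "order", "group", "from", "where"}

# Assumed convention: a lowercase letter, then lowercase letters, digits, or underscores.
NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def check_identifier_name(name: str) -> list[str]:
    """Return a list of problems found in a proposed identifier name (empty if none)."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("must start with a letter and use only lowercase letters, digits, and underscores")
    if name.lower() in RESERVED:
        problems.append("is a reserved keyword")
    return problems


for candidate in ["customer_number", "order", "cust #", "2nd_address"]:
    print(candidate, "->", check_identifier_name(candidate) or "ok")
```

Guidelines such as descriptiveness and documentation still need human review; a check like this only catches the syntactic problems.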
Entity Identifiers
The only purpose for putting the data that describe an entity into a database is to retrieve the data at some later date.
This means that we must have some way of distinguishing one entity from another so that we can always be certain that we are retrieving the precise entity we want. We do this by ensuring that each entity has some attribute values that distinguish it from every other entity in the database (an entity identifier).
Assume, for example, that DistributedNetworks has two customers named John Smith. If an employee searches for the items John Smith has ordered, which John Smith will the DBMS retrieve?
In this case, the answer is both of them. Because there is no way to distinguish between the two customers, the result of the query will be inaccurate. DistributedNetworks solved the problem by creating customer numbers that were unique.
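The two-John-Smiths problem can be reproduced with Python's built-in `sqlite3` module. The table layout, column names, and customer numbers below are assumptions chosen for the illustration, not the actual DistributedNetworks schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " customer_number INTEGER PRIMARY KEY,"
    " first_name TEXT,"
    " last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1001, "John", "Smith"), (1002, "John", "Smith")],
)

# Searching by name cannot tell the two customers apart.
by_name = conn.execute(
    "SELECT customer_number FROM customer WHERE first_name = ? AND last_name = ?",
    ("John", "Smith"),
).fetchall()

# Searching by the unique customer number retrieves exactly one customer.
by_number = conn.execute(
    "SELECT first_name, last_name FROM customer WHERE customer_number = ?",
    (1002,),
).fetchall()

print(len(by_name))    # 2
print(len(by_number))  # 1
```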
That is indeed a common solution to identifying instances of entities where there is no simple unique identifier suggested by the data itself.
Another solution would be to pair the customer's first name and last name with his or her telephone number. This combination of data values (a concatenated identifier) would also uniquely identify each customer.
There are, however, two drawbacks to doing this. First, the identifier is long and clumsy; it would be easy to make mistakes when entering any of the parts.
Second, if the phone number of the customer changes, then the identifier must also change.
Changes made in an entity identifier can cause serious problems in a database.
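One such problem is that other data referring to the old identifier no longer matches anything. A rough sketch, using plain Python dictionaries in place of database tables (the `concatenated_id` helper, the `|` separator, and the sample orders are all hypothetical):

```python
def concatenated_id(first_name: str, last_name: str, phone: str) -> str:
    """Build a concatenated identifier from name and phone number."""
    return f"{first_name}|{last_name}|{phone}"


# Orders refer to the customer by the concatenated identifier.
old_id = concatenated_id("John", "Smith", "555-0100")
orders = [
    {"order_number": 1, "customer_id": old_id},
    {"order_number": 2, "customer_id": old_id},
]

# The customer's phone number changes, so the customer is re-keyed,
# but the existing orders are not updated.
new_id = concatenated_id("John", "Smith", "555-0199")
customers = {new_id: {"first_name": "John", "last_name": "Smith", "phone": "555-0199"}}

# Every order that still carries the old identifier now points at no customer.
orphaned = [o["order_number"] for o in orders if o["customer_id"] not in customers]
print(orphaned)  # [1, 2]
```

Keeping the orders consistent would require updating every row that stores the old identifier, which is exactly the work a stable, meaningless identifier avoids.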
Some entities, such as invoices, come with natural identifiers (the invoice number). We assign unique, meaningless numbers to others, especially accounts, people, places and things. Still others require concatenated identifiers.
When we store an instance of an entity in a database, we want the DBMS to ensure that the new instance has a unique identifier. This is an example of a constraint on a database, a rule to which data must adhere.
The enforcement of a variety of database constraints helps us to maintain data consistency and accuracy.
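As an illustration of a DBMS enforcing a uniqueness constraint, the sketch below uses SQLite through Python's `sqlite3` module. The table and column names are assumptions for the example; declaring `customer_number` as the primary key is what asks SQLite to reject duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " customer_number INTEGER PRIMARY KEY,"  # identifier: must be unique
    " name TEXT NOT NULL)"
)
conn.execute("INSERT INTO customer VALUES (1001, 'John Smith')")

# Attempt to store a second customer with the same identifier.
try:
    conn.execute("INSERT INTO customer VALUES (1001, 'Jane Doe')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    # The DBMS refuses the row because it violates the uniqueness constraint.
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(duplicate_rejected, row_count)  # True 1
```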
The next lesson describes instances of entities.