A multi-valued attribute is a single attribute with two distinct data values entered for that attribute.
No attribute is permitted to have multiple data values. If so, this would violate the
one attribute |-> one value rule
of relational databases.
For every instance of an entity, each attribute in its group can have only one value. Attributes with more than one value are considered multi-valued, and they must be resolved.
Using multi-valued attributes in an Entity-Relationship (ER) model can lead to several disadvantages and complications. These include:
Increased Complexity in Implementation
Multi-valued attributes require additional tables or structures for proper storage in a relational database.
For example, a multi-valued attribute like phone_numbers in a Person entity will typically need a separate table to handle the one-to-many relationship, making the schema more complex.
Normalization Issues
Multi-valued attributes violate the principles of database normalization, particularly 1NF (First Normal Form), which requires that each attribute in a relation contains only atomic (indivisible) values.
Denormalized data can lead to redundancy and inconsistency.
Query Complexity
Queries involving multi-valued attributes often become more complex because they require joins with additional tables.
For example, retrieving all phone numbers for a person would require joining the Person table with a PhoneNumbers table.
Data Integrity Challenges
Ensuring data integrity becomes more challenging when handling multi-valued attributes, as relationships between entities and their multi-valued attributes must be carefully maintained.
Redundancy and Anomalies
If multi-valued attributes are not properly normalized, they may introduce redundancy.
Update anomalies can occur if changes to the attribute need to be propagated to multiple places in the database.
Performance Issues
Storing and querying multi-valued attributes can lead to performance overhead, especially in large datasets.
Additional joins or subqueries may impact query performance negatively.
Lack of Portability
Some database systems may not natively support multi-valued attributes, making it difficult to migrate or port the schema to different database platforms.
This can lead to vendor lock-in or compatibility issues.
Limited Flexibility for Advanced Relationships
Representing more complex relationships (e.g., with additional attributes or constraints on the multi-valued attribute) becomes harder.
For instance, if each phone number in phone_numbers needs a type (e.g., mobile, home, work), it requires further restructuring.
Loss of Scalability
As the number of multi-valued attribute instances increases, managing the data becomes less scalable without a proper design approach.
Example
Consider an entity `Employee` with a multi-valued attribute `skills`. Instead of directly modeling `skills` as a multi-valued attribute:
Create an EmployeeSkills table to represent the many-to-many relationship between Employee and Skill.
This normalization ensures better scalability, easier query formulation, and compliance with relational database principles.
Conclusion: While multi-valued attributes can be convenient in early conceptual stages, they should be avoided in physical database design. Instead, normalizing the model and creating separate entities or relationships ensures better database performance, scalability, and maintainability.
Identifying multi-valued Attributes
Question: What exactly constitutes a multi-valued attribute?
Consider an attribute called "Name" for a "Customer" entity.
Question: If Susan Anthony were entered as a data value, would this be considered a multi-valued attribute?
Answer: No. The full name does not violate the one-value-per-attribute rule because it is not multi-valued in the sense of presenting two different values for an attribute. Question: Why, then, do most databases break down the single-attribute name into two attributes:
first name and
last name?
Here is one good reason to do so. Suppose you want to send form letters to your customers and grab the information for the salutation from the database. Do you really want to send a letter that begins with:
Dear Susan Anthony?
Probably not (at least, not in all situations).
What you really want for the salutation is just the customer's first name. In other words, while a full name does not violate the "one attribute |->one value" rule for the Name attribute, there are practical reasons for breaking it down.
Now consider the same "Customer" entity with a "Phone" attribute.
If two telephone numbers are entered as data values (perhaps a business phone number and a home phone number) for the Phone attribute, this constitutes a multi-valued attribute.
Problem with multi-valued attributes
Multi-valued attributes make sorting data in a database extremely difficult, if not impossible. They must be resolved, and there are two ways to do so:
create additional attributes for an entity, or
create an entirely new entity.
Question: What is the problem with multivalued attributes?
Multivalued attributes can cause problems with the meaning of data in the database, significantly slow down searching, and place unnecessary restrictions on the amount of data that can be stored. Assume, for example, that you have an employee entity with attributes for the name and birth dates of dependents. Each attribute is allowed to store multiple values, as in Figure 5-7, where each gray blob represents a single instance of the employee entity. How will you associate the correct birth date with the name of the dependent to which it applies?
Will it be by the position of a value stored in the attribute (i.e., the first name is related to the first birth date, and so on)? If so, how will you ensure that there is a birth date for each name and a name for each birth date?
How will you ensure that the order of the values is never mixed up?
When searching a multivalued attribute[1], a DBMS must search each value in the attribute, most likely scanning the contents of the attribute sequentially. A sequential search is the slowest type of search available. In addition, how many values should a multivalued attribute be able to store? If you specify a maximum number, what will happen when you need to store more than the maximum number of values? For example, what if you allow room for 10 dependents in the employee entity just discussed and you encounter an employee with 11 dependents? Do you create another instance of the employee entity for that person? Consider all the problems that doing so would create, particularly in terms of the unnecessary duplicated data.
Note: Although it is theoretically possible to write a DBMS that will store an unlimited number of values in an attribute, the implementation would be difficult and searching much slower than if the maximum number of values were specified in the database design.
The next lesson describes how to resolve multi-valued attributes by adding more attributes to describe an entity.
[1]multi-valued attribute: An attribute with more than one data value; must be resolved into a single-valued attribute.