Lesson 11

SQL View Module Conclusion

In this module we had a tour of views^[1] and what they are used for. As you build specific queries to support the reports you are generating, consider using views for those queries that your enterprise refers to over and over again. Remember those rules if you are creating views to support future updates to your system. The view must have a unique identifier that it can reference in each of its underlying tables, and security must allow you sufficient rights to the tables, which are associated to the view.
Views apply to the course project by providing you a way to build a SELECT statement and then save it, making it possible to write the statement, save your work, then reference that statement in your report. You can perfect a given SELECT statement, save it, and not have to think about it again.
So far in this module, topics covered have included tables, relationships between tables, and indexes attached to tables. You should understand the basic structure of a table, and that the relationships between tables are determined by primary keys in parent tables linked to foreign keys in child tables. Foreign keys are copies of primary key field values from parent tables.
Indexing is important to understand not directly from a modeling perspective, but that indexes are used to superimpose a different order on top of the order created by the very structure of the relationships between tables, imposed by primary and foreign keys. It is important to know that specialized objects exist as options for expansion to a relational database model, as extensions to both the underlying physical structure of a database and the overlying logical structure (the tables and indexes). Following are a few examples:

Additional Considerations and Best Practices for Views

When thinking about views generated from database tables, several additional considerations and best practices come to mind beyond unique identifiers and security permissions:

Performance Optimization:
- Indexing: Ensure that underlying tables have appropriate indexes to support the queries in the view, as views themselves don’t store data but rely on the base tables’ performance.
- Materialized Views: For complex or frequently accessed views, consider materialized views (if supported by the DBMS) to store precomputed results, improving query performance at the cost of storage and refresh overhead.
- Query Complexity: Keep view definitions as simple as possible to avoid performance bottlenecks, especially if they involve multiple joins, aggregations, or subqueries.
Maintainability:
- Naming Conventions: Use clear, consistent naming for views (e.g., prefix with vw_ or include the report name) to distinguish them from tables and indicate their purpose.
- Documentation: Document the purpose, underlying tables, and logic of each view to aid future maintenance and debugging.
- Modularity: Design views to be reusable across multiple reports or applications, reducing redundancy in query logic.
Data Consistency and Integrity:
- Referential Integrity: Ensure the view’s logic accounts for the relationships between tables (e.g., foreign keys) to avoid inconsistent or misleading results.
- Filtering Logic: Clearly define any filtering conditions (e.g., WHERE clauses) to ensure the view returns accurate and relevant data.
- Handling NULLs: Account for NULL values in the underlying tables to prevent unexpected results in the view.
Security and Access Control:
- Granular Permissions: Grant access to views rather than base tables to enforce least-privilege principles, allowing users to query the view without direct access to sensitive table data.
- Column-Level Security: If certain columns in the underlying tables are sensitive, exclude them from the view or use functions to mask data.
- Schema Ownership: Ensure the view is created in a schema with appropriate ownership and access rights for the intended users.
Updatability:
- Read-Only Nature: Most views are read-only, but if the view needs to support updates (e.g., INSERT, UPDATE, DELETE), ensure it meets the DBMS’s requirements (e.g., single-table views with no aggregations or joins in some systems).
- Triggers: For updatable views, consider using INSTEAD OF triggers (if supported) to define custom update logic.
Portability and Compatibility:
- DBMS-Specific Features: Be cautious with vendor-specific syntax or features in view definitions to ensure compatibility if the system migrates to another database platform.
- Versioning: Track changes to view definitions in a version control system to manage updates and rollbacks effectively.
Use Cases and Design:
- Abstraction Layer: Use views to simplify complex queries for end-users or applications, hiding intricate joins or business logic.
- Data Aggregation: Create views for common aggregations (e.g., sums, counts) needed for reports, ensuring consistency across different reports.
- Temporal Data: If views include time-sensitive data, consider adding parameters or filters for date ranges to make them flexible for historical or current reporting.
Testing and Validation:
- Data Validation: Regularly test views to ensure they return correct results, especially after changes to underlying tables or data.
- Performance Testing: Monitor the execution time of queries against views to identify and address performance issues.
- Edge Cases: Test views with edge cases (e.g., empty tables, extreme values) to ensure robustness.
Refresh and Update Strategy:
- Dependencies: Track dependencies between views and underlying tables to anticipate impacts from schema changes (e.g., adding/removing columns).
- Recompilation: Some DBMSs require views to be recompiled after changes to base tables; automate this process where possible.
- Scheduled Refreshes: For materialized views, define a refresh schedule that balances data freshness with system performance.
Scalability:
- Large Datasets: For views querying large tables, consider partitioning the underlying tables to improve performance.
- Caching: If the DBMS supports it, leverage query caching for frequently accessed views to reduce load on the database.

By addressing these aspects, views can be designed to be efficient, secure, maintainable, and aligned with the enterprise’s reporting and data access needs. If you have a specific database system in mind (e.g., PostgreSQL, SQL Server, Oracle), I can tailor these considerations further!

A view is not a physical copy of data and does not contain any data itself. A view is merely a logical overlay of existing tables. Every execution against a view executes the query contained within the view against all underlying tables. The danger with using views is filtering a query against a view, expecting to read a very small portion of a very large table. Any filtering should be done within the view because any filtering against the view itself is applied after the query in the view has completed execution. While views can simplify development and improve code organization, they can indeed lead to performance issues if not used carefully, especially in complex queries involving joins or large datasets. Views can hide the complexity of underlying tables, making it harder to optimize queries and potentially leading to slower performance, particularly when dealing with large tables or complex joins.
Materialized views: Materialized views are available in some very large capacity type relational databases. A materialized view materializes the underlying physical data by making a physical copy of data from tables. So, unlike a view, when a query is executed against a materialized view, the materialized view is physically accessed rather than the underlying tables. The objective is to free the underlying tables for other uses, effectively creating two separate physical copies. Materialized views are often used to aggregate large data sets down to smaller sized data sets, in data warehouses and data marts.^[2]
The biggest potential problem with materialized views is how often they are refreshed and brought up to date with any changes to their underlying tables. Another attribute of materialized views is the ability of some database engines to allow a query directed at an underlying table to be automatically redirected to a physically much smaller materialized view, sometimes called automated query rewrite. Queries can be automatically rewritten by the query Optimizer if the query rewrite can help to increase query performance.
Clusters: Clusters are used in very few databases and have been somewhat superceded by materialized views. In the past, clusters were used to pre-create physical copies of entire field level sections of heavily accessed tables, especially in SQL joins. Unlike materialized views, clusters do not allow for automatic refresh and are normally manually maintained.
Sequences and auto counters: An auto counter field is a special datatype, sometimes called a non-static internal function, allowing automated generation of sequential number values (thus the term sequence). Typically, auto counters are used for primary key surrogate key generation on insertion of new records into a table.
Partitioning and parallel processing: Some databases allow physical splitting of tables into separate partitions, including parallel processing on multiple partitions and individual operations on individual partitions. One particularly efficient aspect of partitioning is the capability when querying a table to read fewer than all the partitions making up a table, perhaps even a single partition.

SQL Views - Quiz

Take this brief quiz to test your knowledge of views.
SQL Views - Quiz

[1]View:A view is essentially a query definition and does not contain any data.

[2]data mart: A data mart is a subject-oriented database that is often a partitioned segment of an enterprise data warehouse. The subset of data held in a data mart typically aligns with a particular business unit like sales, marketing or real estate.