Lesson 4 | Sizing a cluster |
Objective | Properly size a cluster. |
Properly Sizing Oracle Cluster
The Oracle DBA must specify a size for the data associated with a particular value of the cluster key when implementing a cluster.
Sizing a cluster in Oracle 19c is an important task to ensure that your database performs efficiently and that storage is used effectively. A cluster in Oracle is a schema object that contains one or more tables that share the same data blocks because they are often queried together. Properly sizing a cluster involves estimating the space required for the data and considering the size of the cluster key.
Steps to Properly Size a Cluster in Oracle 19c:
- Understand the Concept of a Cluster Key: The cluster key is a column or a set of columns used to group rows from one or more tables in a cluster. Rows with the same cluster key value are stored together in the same data block.
- Estimate the Average Row Size: Calculate the average size of a row in each table that will be part of the cluster. Include all columns and take into account data types, null values, and any potential overhead.
Example estimation:
SELECT AVG(VSIZE(column1) + VSIZE(column2) + ... + VSIZE(columnN)) AS avg_row_size
FROM table_name;
- Determine the Number of Rows per Cluster Key: Estimate how many rows are associated with each cluster key value. This can be done by querying the tables and grouping by the cluster key.
Example:
SELECT cluster_key_column, COUNT(*)
FROM table_name
GROUP BY cluster_key_column;
- Calculate the Space Required for Each Cluster Key: Multiply the average row size by the average number of rows per cluster key.
Formula:
space_per_key = avg_row_size * avg_rows_per_key
- Include Overhead and Padding: Account for additional space required for overheads such as row headers, block headers, and any padding. It's typically recommended to add 10-20% overhead to your calculations.
- Set the Cluster Size: The cluster size is the size of a data block multiplied by the number of data blocks that should fit the estimated space for each cluster key. You should ensure that the calculated cluster size can hold all rows associated with each cluster key value.
Example for setting a cluster size:
CREATE CLUSTER cluster_name
(cluster_key_column DATA_TYPE)
SIZE cluster_size
TABLESPACE tablespace_name;
Here, `cluster_size` should be set based on your calculations to ensure that all rows associated with a cluster key fit within the cluster's blocks.
- Monitor and Adjust: After implementing the cluster, monitor its performance and space usage. Use the `DBA_CLUSTERS` and `DBA_TABLES` views to monitor the space and adjust if necessary.
- Rebuild the Cluster if Necessary: If performance degrades or you notice excessive space usage, you may need to rebuild the cluster with a different size, especially if the distribution of rows per cluster key changes significantly over time.
Example of Creating a Cluster:
Suppose you have two tables `EMPLOYEES` and `DEPARTMENTS` that share a `DEPARTMENT_ID` column and you want to cluster them together:
-- Create the cluster
CREATE CLUSTER emp_dept_cluster
(DEPARTMENT_ID NUMBER)
SIZE 1024 -- Estimated size for each department's data
TABLESPACE users;
-- Create the tables in the cluster
CREATE TABLE DEPARTMENTS (
DEPARTMENT_ID NUMBER PRIMARY KEY,
DEPARTMENT_NAME VARCHAR2(50)
)
CLUSTER emp_dept_cluster (DEPARTMENT_ID);
CREATE TABLE EMPLOYEES (
EMPLOYEE_ID NUMBER PRIMARY KEY,
EMPLOYEE_NAME VARCHAR2(50),
DEPARTMENT_ID NUMBER
)
CLUSTER emp_dept_cluster (DEPARTMENT_ID);
In this example, `SIZE 1024` represents the estimated size needed for all rows associated with a single `DEPARTMENT_ID`.
Summary:
- Estimate average row size and number of rows per cluster key.
- Calculate the space required per cluster key.
- Include overhead in your calculations.
- Set the cluster size based on these estimates when creating the cluster.
- Monitor performance and adjust as necessary.
Properly sizing a cluster helps to optimize query performance and ensures efficient storage utilization, leading to overall better database management.
In a non-clustered table, the size of the data to be contained in the table is not a great concern.
A DBA always must be sure to have enough disk space to store data, but management of the underlying storage is handled by your Oracle database. When you use a cluster, however, you are explicitly linking data, by value, with the way it is stored on disk. Because of this, you must specify a size for the data associated with a particular value of the cluster key.
How do you size Oracle Cluster?
The CREATE CLUSTER command has a required SIZE parameter.
You can specify this size either in K(ilobytes) or M(egabytes). When you create a cluster, each time you add a value for the cluster key, Oracle allocates the number of data blocks required by the value in the SIZE clause. The size specified in the CREATE CLUSTER should be large enough to store all of the data for all of the rows associated with the cluster value and the cluster value itself.
- Improperly size your cluster?
If you size your cluster too small, the data for the cluster value will extend beyond the pre-allocated space, and the additional chaining will reduce the overall performance advantages of the cluster. If you specify too large a size for your cluster, you will end up wasting disk space, which can also contribute to reduced disk and database performance. The next lesson shows how to create a cluster index.
Sizing Cluster - Exercise