Lesson 14 | Optimizing query statements |
Objective | Understand optimization rules for query statements. |
Understand Optimization Rules for Query Statements.
Once you know what kind of indexing will be used, you can decide whether to modify your query statement.
One of the things you will look at is whether clustered or non-clustered indexes will be used. The performance impact of a query will depend on the type of index used and the type of query.
Clustered versus non-clustered Indexing
First, we will review the basic difference between clustered and non-clustered indexes. For more detailed information, refer to the lesson on indexes.
Clustered index
A clustered index is one which speeds up processing because the data is physically ordered, based on the values in the clustered index.
Non-clustered index
A non-clustered index is one which speeds up processing because there is a pointer to the data values. The values are NOT stored in physical order.
Creating a Table with a Foreign Key
I am going to ignore the ON clause. That leaves a script that looks something like this for the Orders table:
USE Accounting
USE Accounting;
CREATE TABLE Customers
(
CustomerNo int IDENTITY NOT NULL PRIMARY KEY,
CustomerName varchar(30) NOT NULL,
Address1 varchar(30) NOT NULL,
Address2 varchar(30) NOT NULL,
City varchar(20) NOT NULL,
State char(2) NOT NULL,
Zip varchar(10) NOT NULL,
Contact varchar(25) NOT NULL,
Phone char(15) NOT NULL,
FedIDNo varchar(9) NOT NULL,
DateInSystem smalldatetime NOT NULL
);
Note that the actual column being referenced must have either a PRIMARY KEY or a UNIQUE constraint defined on it.
It is also worth noting that primary and foreign keys can exist on the same column. You can see an example of this in the AdventureWorks database with the Sales.SalesOrderDetail table. The primary key is composed of both the SalesOrderID and the SalesOrderDetailID columns.
The former is also the foreign key and references the Sales.SalesOrderHeader table. You will actually create a table later in the chapter that has a column that is both a primary key and a foreign key.
How It Works
Once you have successfully run the preceding code, run sp_help, and you should see your new constraint reported under the constraints section of the sp_help information. If you want to get even more to the point, you can run sp_helpconstraint. The syntax is easy:
EXEC sp_helpconstraint <able name>
Run sp_helpconstraint on your new Orders table, and you will get information back giving you the names, criteria, and status for all the constraints on the table. At this point, your Orders table has one FOREIGN KEY constraint and one PRIMARY KEY constraint.
Note: When you run sp_helpconstraint on this table, the word clustered
will appear right after the reporting of the PRIMARY KEY. This just means it has a clustered index.
Your new foreign key has been referenced in the physical definition of your table, and is now an integral part of your table.
The database is in charge of its own integrity. Your foreign key enforces one constraint on the data and makes sure your database integrity remains intact. Unlike primary keys, foreign keys are not limited to just one per table. You can have between 0 and 253 foreign keys in each table. The only limitation is that a given column can reference only one foreign key. However, you can have more than one column participate in a single foreign key. A given column that is the target of a reference by a foreign key can also be referenced by many tables.
Optimization Rules
Once you know the indexing plan for your query, you should consider the following:
- If many records are being inserted, a clustered index can hurt performance because SQL Server has to physically reorder the rows so that
they are in order.
- If SQL Server statistics are out of date, the query optimizer might select an index that is no longer the best candidate to use.
- If there is a tremendous amount of data that needs to be inserted, it might make more sense to instruct SQL Server not to maintain statistics for given indexes or columns. For example, if you are inserting one million rows, it might make sense to turn off statistics, insert all the data, then turn statistics back on. This prevents recomputing statistics one million times.
The next lesson will review this module.