Obtaining the best performance from the database engine requires a schedule of routine maintenance operations.
Database corruption is rare, but one of the most important tasks in the maintenance schedule is to verify that no corruption has occurred in the databases. SQL Server indexes also perform better if you periodically review their performance and remove any fragmentation that occurs within them.
The database engine depends on the hardware platform that it runs on, and faults in that platform can cause corruption. In particular, issues in the memory and I/O subsystems can lead to corruption within databases. There are two levels of database integrity:
- Physical integrity. Ensures that SQL Server can read and write data extents to the physical media.
- Logical integrity. Ensures that the data within the pages is logically correct. The pages can then refer to each other as required so that SQL Server can fetch related pages.
Without regular integrity checking, a lack of integrity in the database files might lead to bad information being derived from them. A backup does not check logical integrity; it only verifies page checksums, and only when you use the WITH CHECKSUM option on the BACKUP command. Although the CHECKSUM database option is important, SQL Server only validates a checksum when it reads the data; the exception is when it is backing up the database using the WITH CHECKSUM option. Archive data is, by its nature, read infrequently, so corrupt data within the database may not be found for months if it is not checked as it is backed up.
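As a brief sketch of the WITH CHECKSUM option described above (the database name and backup path are illustrative assumptions):

```sql
-- Verify page checksums while writing the backup; by default the backup
-- fails if a bad checksum is found.
BACKUP DATABASE AdventureWorks2012
TO DISK = N'R:\Backups\AdventureWorks2012.bak'
WITH CHECKSUM;
```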
SQL Server provides the DBCC utility, which supports a range of management facilities. The DBCC CHECKDB command makes a thorough check of the structure of a database, to detect almost all forms of potential corruption. The functionality that DBCC CHECKDB contains is also available as a set of more granular commands, described as follows:
- DBCC CHECKALLOC checks the consistency of the disk space allocation structures in the database files.
- DBCC CHECKTABLE checks the pages associated with a specified table and the pointers between pages that are associated with the table. DBCC CHECKDB executes DBCC CHECKTABLE for every user table in the database.
- DBCC CHECKCATALOG checks the database catalog schema by performing logical consistency checks on the metadata tables in the database. These tables hold information that describes both system and user tables in addition to other database objects. DBCC CHECKCATALOG does not check user tables.
[INFO] Note: DBCC CHECKDB also performs checks on other types of objects, such as the links for FILESTREAM objects and consistency checks on the Service Broker objects. [/INFO]
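A minimal example of running the full check described above (assuming a database named AdventureWorks2012):

```sql
-- Full logical and physical integrity check; suppress informational
-- messages so that only errors are reported.
DBCC CHECKDB (N'AdventureWorks2012') WITH NO_INFOMSGS;
```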
DBCC CHECKDB Aspects
Usually, the best method for recovering from corruption is to restore a backup of the database. Synchronize the execution of DBCC CHECKDB with your backup retention policy, so that you always retain at least one backup taken after a successful integrity check and can therefore restore from an uncorrupted copy of the database.
Online Concurrent Operations
DBCC CHECKDB uses an internal database snapshot to obtain a consistent view of the database, which allows the database to remain in use during the operation.
The use of an internal snapshot causes DBCC CHECKDB to need additional disk space. The amount of disk space required on the volumes depends on how much data changes during the execution of DBCC CHECKDB. DBCC CHECKDB also uses space in the tempdb database while executing. To provide an estimate of the space required in tempdb, DBCC CHECKDB offers the ESTIMATEONLY option.
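The ESTIMATEONLY option mentioned above can be used as follows (database name is an assumption):

```sql
-- Estimate the tempdb space that a full check would need, without
-- actually performing the check.
DBCC CHECKDB (N'AdventureWorks2012') WITH ESTIMATEONLY;
```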
Backups and DBCC CHECKDB
It is good practice to run DBCC CHECKDB on a database before performing a backup. This check helps to ensure that the backup will contain a consistent version of the database.
You can reduce the impact of the DBCC utility on other services running on the server by setting the MAXDOP option to a value greater than 0 and less than the number of processors in your system, which limits the degree of parallelism that the check can use.
The DBCC CHECKFILEGROUP command runs checks against the user objects in the specified filegroup only. This can save considerable checking time when only part of the database needs to be verified.
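A hedged sketch combining the two ideas above (the filegroup name is the default PRIMARY; note that the MAXDOP option for DBCC commands is only supported in newer versions of SQL Server):

```sql
USE AdventureWorks2012;

-- Check only the PRIMARY filegroup, limiting parallelism to two processors.
DBCC CHECKFILEGROUP (N'PRIMARY') WITH NO_INFOMSGS, MAXDOP = 2;
```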
DBCC CHECKDB Options
DBCC CHECKDB provides a number of options that alter its behavior while it is executing.

| Option | Description |
|--------|-------------|
| PHYSICAL_ONLY | Only checks the integrity of the physical structure, to reduce overhead. |
| NOINDEX | Does not perform logical checks on nonclustered indexes. |
| EXTENDED_LOGICAL_CHECKS | Performs additional logical checks of indexed views, spatial indexes, and XML indexes. |
| TABLOCK | Uses locks instead of an internal database snapshot. |
| ALL_ERRORMSGS | Returns all error messages, instead of the default action that returns the first 200. |
| NO_INFOMSGS | Returns only error messages, with no informational messages. |
| ESTIMATEONLY | Estimates the amount of tempdb space that is required to run. |
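For example, a lightweight check suitable for frequent runs on a large production database might combine two of these options (database name is an assumption):

```sql
-- Physical-structure check only, with informational messages suppressed.
DBCC CHECKDB (N'AdventureWorks2012') WITH PHYSICAL_ONLY, NO_INFOMSGS;
```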
DBCC CHECKDB Repair Options
You should back up a database before performing any repair option, if you can, and you should also find and resolve the cause of the corruption; otherwise, it could recur soon afterwards. DBCC CHECKDB offers two repair options. For both options, the database needs to be in single-user mode. The options are:
- REPAIR_REBUILD rebuilds indexes and removes corrupt data pages. This option only works with certain mild forms of corruption and does not involve data loss.
- REPAIR_ALLOW_DATA_LOSS will almost certainly lose some data. It deallocates any corrupt pages that it finds and modifies pages that reference the corrupt pages so that they no longer point to them. After the operation completes, the database will be consistent, but only from a physical database integrity point of view; significant loss of data could have occurred. Repair operations do not consider any of the constraints that may exist on or between tables. If a repaired table is involved in one or more constraints, you should execute DBCC CHECKCONSTRAINTS after running the repair operation.
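The repair workflow described above can be sketched as follows (database name is an assumption; back up the database first if at all possible):

```sql
-- Repairs require single-user mode.
ALTER DATABASE AdventureWorks2012 SET SINGLE_USER WITH ROLLBACK IMMEDIATE;

-- Attempt the non-data-loss repair first.
DBCC CHECKDB (N'AdventureWorks2012', REPAIR_REBUILD);

ALTER DATABASE AdventureWorks2012 SET MULTI_USER;

-- If repaired tables are involved in constraints, validate them afterwards.
DBCC CHECKCONSTRAINTS;
```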
Another important aspect of SQL Server that requires ongoing maintenance for optimal performance is the management of indexes. Indexes can help to improve searching, sorting, and join performance, but they can also impede the performance of data modification. Over time, through the manipulation of the data, indexed data becomes fragmented, and this fragmentation reduces the performance of storage access operations at the physical level.
Index fragmentation occurs over time, when data modification causes index pages to split. There are two types of index fragmentation:
- Internal fragmentation occurs when pages are not completely full of data. This often occurs when a page is split during an insert or update operation, in a case where the modified data could not fit back into the space initially allocated to the row.
- External fragmentation occurs when pages get out of sequence during Data Manipulation Language (DML) operations that split existing pages when modifying data. When the index requires and allocates a new page within an extent that is different from the one that contains the original page, extra links are required to point between the pages and extents involved.
You can use the sys.dm_db_index_physical_stats dynamic management function to analyze the level of fragmentation in your indexes and to decide whether to rebuild them. The following code shows how to detect objects with fragmentation of more than 30 percent in the AdventureWorks2012 database:

```sql
SELECT DB_NAME(database_id) AS [Database],
       OBJECT_NAME(object_id, database_id) AS [Object],
       avg_fragmentation_in_percent AS Fragmentation
FROM sys.dm_db_index_physical_stats(DB_ID(N'AdventureWorks2012'), NULL, NULL, NULL, 'LIMITED')
WHERE avg_fragmentation_in_percent > 30
ORDER BY avg_fragmentation_in_percent DESC;
```
FILLFACTOR and PAD_INDEX
You can reduce the performance impact of page splits by leaving empty space on each page when creating an index, including a clustered index. You can use the FILLFACTOR option when you create the index to define how full the index pages should be. The default value of FILLFACTOR is 0, which actually means fill the pages 100 percent; any other value indicates the percentage of each page to fill when the index is created.

By default, the FILLFACTOR option only applies to the leaf-level pages of an index. You can use it in conjunction with the PAD_INDEX = ON option to allocate the same free space in the nonleaf levels of the index as well.

Although using the fill factor to pre-allocate space in index pages can help in some situations, it is not a best practice; it may even impede performance where all the changes occur on the last page anyway. Consider an index on an auto-incrementing primary key: because the key is sequential, it is unlikely that you will insert values in the middle of existing ones, so a fill factor is unlikely to improve performance. Also, if you do not use variable-sized columns, rows can be updated in place, so there is no requirement for empty space to accommodate updates.

Make sure you consider the costs in addition to the benefits of leaving empty space in index pages. Both memory pages and disk space are wasted if the space adds no value; network efficiency is also reduced, and each page carries less useful data to the processor when it needs to process it.
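The FILLFACTOR and PAD_INDEX options described above might be combined as follows (the table and index names are hypothetical, borrowed from the AdventureWorks sample schema):

```sql
-- Leave 20 percent free space on leaf pages, and apply the same
-- fill factor to the intermediate (nonleaf) pages of the index.
CREATE NONCLUSTERED INDEX IX_Person_LastName
ON Person.Person (LastName)
WITH (FILLFACTOR = 80, PAD_INDEX = ON);
```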
SQL Server provides two options for removing fragmentation from clustered and non-clustered indexes.
- REBUILD drops and recreates the index. This removes fragmentation, reclaims disk space by compacting the pages based on the specified or existing fill factor setting, and reorders the index entries in contiguous pages. When you use the ALL option, SQL Server drops all indexes on the table and rebuilds them in a single operation. If any part of that process fails, SQL Server rolls back the entire operation. A single rebuild operation can use a large amount of space in the transaction log. To avoid this, you can change the recovery model of the database to use the BULK_LOGGED or SIMPLE recovery models before performing the rebuild operation, so that it is a minimally logged operation.
- REORGANIZE uses minimal system resources. It defragments the leaf level of clustered and nonclustered indexes by physically reordering the leaf-level pages to match the logical, left to right order of the leaf nodes. Reorganizing an index also compacts the index pages. The compaction uses the existing fill factor value for the index. You can interrupt a reorganize without losing the work performed so far. This means that, on a large index, you could configure partial reorganization to occur each day.
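The two defragmentation options above map to the ALTER INDEX statement (the table and index names are hypothetical):

```sql
-- Fully rebuild one index: removes fragmentation, reclaims space, and
-- applies the specified or existing fill factor.
ALTER INDEX IX_Person_LastName ON Person.Person REBUILD;

-- Reorganize all indexes on the table: lightweight and interruptible
-- without losing the work performed so far.
ALTER INDEX ALL ON Person.Person REORGANIZE;
```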
For heavily fragmented indexes (those with fragmentation greater than 30 percent), rebuilding is usually the most appropriate option. SQL Server maintenance plans include options to rebuild or reorganize indexes. If you do not use maintenance plans, it is important to create a job that regularly defragments the indexes in your databases.

Many organizations have no available maintenance-time windows during which to perform database maintenance operations such as index rebuilds. The Enterprise edition of SQL Server can perform index operations online while users are accessing the database. During an online rebuild operation, the database engine applies schema locks to prevent metadata changes. This means that users cannot change the structure of the table, using commands such as ALTER TABLE, while the online index rebuild operation is occurring.

[INFO] Note: Because of the extra work that needs to be performed, online index rebuild operations are typically slower than offline ones. [/INFO]
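An online rebuild, as described above, is requested with the ONLINE option (Enterprise edition only; the table name is hypothetical):

```sql
-- The table remains available to users during the rebuild.
ALTER INDEX ALL ON Person.Person REBUILD WITH (ONLINE = ON);
```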
One of the main tasks that SQL Server performs when it is optimizing queries is to determine which indexes to use and which ones not to use. To make these decisions, the query optimizer uses statistics about the distribution of the data in the index and data pages. SQL Server automatically creates statistics for indexed and non-indexed columns when you enable the AUTO_CREATE_STATISTICS database option.

SQL Server automatically updates statistics when AUTO_UPDATE_STATISTICS is set. This is the default setting, and you should not disable it unless you really have to; for example, because a packaged solution that you are implementing on top of SQL Server requires it.

For large tables, the AUTO_UPDATE_STATISTICS_ASYNC option instructs SQL Server to update statistics asynchronously instead of delaying query execution, where it would otherwise update an outdated statistic before query compilation.

You can also update statistics on demand, by executing the Transact-SQL UPDATE STATISTICS statement or the sp_updatestats system stored procedure.
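Updating statistics on demand, as mentioned above, might look like this (the table name is hypothetical):

```sql
-- Update all statistics on one table, scanning every row for accuracy.
UPDATE STATISTICS Person.Person WITH FULLSCAN;

-- Or refresh out-of-date statistics across the whole current database.
EXEC sp_updatestats;
```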
SQL Server Maintenance Plans
You can use the SQL Server Maintenance Plan Wizard to create SQL Server Agent jobs that perform routine database maintenance tasks. The wizard creates SQL Server Integration Services (SSIS) packages for SQL Server Agent tasks to execute on the specified schedules. You can schedule many maintenance tasks to run automatically, including:
- Checking database integrity.
- Shrinking databases.
- Rebuilding and reorganizing indexes in a database.
- Updating database object statistics.
- Cleaning up task history.
- Activating related SQL Server Agent jobs and alerts.
You can monitor maintenance plan jobs by using the standard Job Activity Monitor in SSMS. The results of the maintenance tasks are stored in maintenance plan tables in the msdb database; you can query these tables (dbo.sysmaintplan_log and dbo.sysmaintplan_logdetail) directly, or view the entries by using the Log File Viewer. In addition, tasks can generate text-based reports, write them to the file system, and send them automatically to operators defined in the SQL Server Agent service.
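A hedged sketch of querying the history tables mentioned above (the column name start_time is an assumption about the msdb schema; SELECT * avoids relying on further column names):

```sql
-- Inspect the most recent maintenance plan log entries in msdb.
SELECT TOP (20) *
FROM msdb.dbo.sysmaintplan_log
ORDER BY start_time DESC;
```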