Deep down I know just how important it is to schedule the time to do SQL Server health check-up, and I know you are not the only one making up excuses. It’s something DBA needs to think about and prioritize. From experience it’s better to know SQL Server results and be able to take real, actionable steps towards better SQL Server environment rather than not knowing at all.
Looking after SQL Server health is the most important thing you can do not only for yourself, but for your company business continuity as well. Having regular SQL Server heath check-ups are a great way of knowing how SQL Server health is tracking, as it provides you with an opportunity to discuss SQL Server results with an expert consultant, and talk about areas that you may need working on.
Regular checks of SQL Server memory pressure, resource usages, cache plans, index design and database growth can alert you to any potential problems that can be easily identified and can be rectifiable. This allows you to take SQL Server health in your own hands and make changes to any areas where SQL Server may be at risk, through things just as simple as the index fragmentation. Creating this awareness is not only empowering it is making SQL Server health your number one priority!
So here is a quick recap of the SQL Server health risks that are not only damaging to your SQL Server environment , but also to your company.
To maintain smooth and well running SQL Server environment, There is a need of health check process to find out current and potential issues and design rectify plan. The health check process can be done on the surface or deep dive.
Surface (Quick) Health Check: This health check process is very quick and it only covers Operating System errors and SQL Server errors. It is good to make it within the short period of time.
The Event viewer tool is the first place you should look for Operating System level issues. Don’t waste too much time on the Application log, as the SQL Server messages you will find there, are duplicates of the messages you will also see at the SQL Server ERRORLOG. The System log is where you should put the focus on. I usually filter the log so that it shows only the Errors and Warnings – Information messages are useful for troubleshooting specialized issues and not for generic health checks. Error and Warning event sources that usually draw my attention are:
- NTFS (possible NTFS corruption messages)
- Disk (possible Storage issues)
- EventLog (Unexpected shutdowns)
- Microsoft-Windows-WER-SystemErrorReporting (Unexpected shutdowns)
- Microsoft-Windows-Kernel-Power (Unexpected shutdowns)
These SQL Server error log files are usually found here:
C:Program FilesMicrosoft SQL Server
They can be accessed from inside SQL Server Management Studio if you prefer so.
Messages that usually draw my attention are:
- All kinds of Errors (except login errors, which are usually spammed in the ERRORLOG)
- Any enabled trace flags (you will see them appear in this format: -TXXXX, where XXXX is a number)
- Any database corruption messages
Also these messages:
- A significant part of SQL Server memory has been paged out (incorrect memory configuration)
- SQL Server has encountered x occurrence(s) of I/O requests taking longer than 15 seconds to complete on file (storage performance issues)
- SQL Server is starting at high priority base (priority needs to be changed back to normal)
- Deadlock encountered …. Printing deadlock information (deadlock)
- deadlock-list (deadlock)
Deep Dive Health Check:
This health check process is very comprehensive and it covers Operating System, SQL Server errors and SQL Server internals such as tasks, caches, objects, index design, fragmentations, resource usages and etc. This is the long running process and is recommended to be done before any major rectification.
Health Check process suppose to be ran monthly and for heavy workload production weekly to highlight SQL Server’s unusual behavior and predict the potential issues.