Database optimization is the practice of designing and developing database systems to take maximum advantage of the resources available. For example, databases can be optimized to benefit from the large quantity of memory available on a particular server, or the speed of the specific I/O hardware available, or particular features of the processor being used. Even in development environments such as the Microsoft .NET Framework, where many of the low-level details and much of the hardware-specific code have been abstracted away to a runtime engine, there is much that a database designer and developer should do to ensure that queries execute in an optimal manner. An application that does not perform well will not be a success, so the benefits of optimization are as much commercial as they are a simple convenience for end users.
As mentioned above, the reasons for optimizing database queries are essentially commercial; very few consumers will purchase a system that has a reputation for poor database performance, especially if a more viable alternative is available for a similar price. In the world of database development, performance may also be a contractual or even a legal requirement (through Service Level Agreements). For example, the systems used by many clearing banks must be capable of performing large numbers of transactions in a short period of time, often dictated by regulatory authorities such as the Federal Reserve in the United States. Failure to complete processing within a given timeframe can result in a bank incurring interest charges on the funds not transferred (in many cases, the transactions can total well over a billion dollars, so even one night’s interest payments would be considerable), a fine being imposed, or even the loss of a trading license if database performance is consistently poor.
The definition of optimization according to Dictionary.com is "The procedure or procedures used to make a system or design as effective or functional as possible." Optimization is thus a subjective quality, and how well a system is optimized depends upon the expectations and point of view of the user.
It is important to quantify performance so that systems can be optimized within realistic expectations. It is also important to state these performance measures up front, possibly as part of the customer acceptance criteria, so the system can be designed and implemented accordingly.
How well a database is optimized is often measured by the quality of service it delivers. The quality of service is more objective than simple user expectations, and it mandates performance measures in terms of response times, throughput, and so on. The quality of service will also govern aspects such as the reliability and availability of a database, and these features can also have a bearing on the optimizations applied. Therefore, an important question to ask when designing a database is “what are the parameters that define acceptable performance?”
To end users sitting in front of a desktop computer, performance is usually equated with response time. An application that fetches 1000 rows of data from a database before displaying the first row is likely to be perceived by the user as slower than an application that fetches the first 10 rows and displays them while retrieving the remainder either as a lower priority background task, or on demand as and when the user “pages” to that data.
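The "fetch a page at a time" pattern described above can be sketched in a few lines. This is a minimal illustration using Python's built-in sqlite3 module against a hypothetical `orders` table (the table name, column names, and page size are assumptions for the example, not part of any particular system):

```python
import sqlite3

# In-memory demo database standing in for a real server,
# populated with a hypothetical "orders" table of 1000 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
conn.executemany(
    "INSERT INTO orders (item) VALUES (?)",
    [(f"item-{n}",) for n in range(1000)],
)

PAGE_SIZE = 10

def fetch_page(page: int):
    """Fetch one page of rows on demand instead of all 1000 at once."""
    cur = conn.execute(
        "SELECT id, item FROM orders ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE, page * PAGE_SIZE),
    )
    return cur.fetchall()

first_page = fetch_page(0)   # shown to the user immediately
second_page = fetch_page(1)  # fetched only when the user pages forward
```

The user sees the first ten rows almost instantly, while the remaining rows are retrieved only if and when they are needed; the perceived response time improves even though the total work may be the same or slightly greater.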
To data-processing departments, performance means throughput, often measured in terms of the number of transactions processed per second. In this scenario raw processing power coupled with devices that can provide fast data storage is essential. Requests received from potentially thousands of clients must be processed and responded to quickly.
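Throughput of this kind is straightforward to measure: time a batch of committed transactions and divide the count by the elapsed time. The sketch below uses sqlite3 purely as a stand-in for a production database, with a hypothetical `transfers` table; the table and workload are assumptions for illustration only:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transfers (id INTEGER PRIMARY KEY, amount REAL)")

def run_transactions(count: int) -> float:
    """Execute `count` small committed transactions; return throughput in tx/sec."""
    start = time.perf_counter()
    for n in range(count):
        with conn:  # each `with` block commits one transaction
            conn.execute(
                "INSERT INTO transfers (amount) VALUES (?)", (float(n),)
            )
    elapsed = time.perf_counter() - start
    return count / elapsed

tps = run_transactions(1000)
```

A benchmark like this gives a baseline figure that can be compared against the contractual or regulatory targets discussed earlier, and re-run after each tuning change to confirm the change actually helped.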
Reliability and availability are frequently overlooked aspects of high-performance databases, although as requirements they are often critical. (A clearing bank, for example, cannot afford for its databases to go down, even for a few minutes, without expecting to incur some financial penalty.) A database that is not functioning has zero performance. An optimized database must therefore also be extremely reliable.
It should be clear that performance and optimization, like any other form of quality, are not something that can be "bolted on" to a database as an extra once it has been created. Optimization is an ongoing, non-functional requirement that affects all stages in the development of a database, from analysis and design through development and implementation.
Once a database is up and running, optimization does not finish. The database should be monitored for any signs of instability or bottlenecks, and corrective action taken before a potential failure occurs. As the number of client computers accessing a server scales up, optimization may involve adding further hardware (memory, processors, or even entire servers in a clustering environment), and balancing data access across additional disk controllers.
These days, optimization is much more likely to involve identifying the appropriate hardware to use as a platform, and then tuning the database for that hardware. It is far more cost-effective to add another 16GB of memory to a server running a database server than to pay for a consultant or developer to rewrite part of the database queries. Fast hardware is not an excuse for poor design and coding practice, however; if an application stalls while waiting for user input, or while data is locked in the database, it does not matter how fast the processor is.
Optimization is a key requirement of all commercial systems. Failure to optimize can at best mean user-frustration, and at worst the loss of an entire business.