Write Ahead Logging
To understand how the write-ahead log works, it is important for you to know how modified data is written to disk. SQL Server maintains a buffer cache into which it reads data pages when data must be retrieved. Data modifications are not made directly to disk, but are made to the copy of the page in the buffer cache. The modification is not written to disk until a checkpoint occurs in the database, or the modification must be written to disk so the buffer can be used to hold a new page. Writing a modified data page from the buffer cache to disk is called flushing the page. A page modified in the cache, but not yet written to disk, is called a dirty page.
At the time a modification is made to a page in the buffer, a log record is built in the log cache that records the modification. This log record must be written to disk before the associated dirty page is flushed from the buffer cache to disk. If the dirty page is flushed before the log record is written, the dirty page creates a modification on the disk that cannot be rolled back if the server fails before the log record is written to disk. SQL Server has logic that prevents a dirty page from being flushed before the associated log record is written. Log records are written to disk when the transactions are committed.
Transaction Log Physical Architecture
The SQL Server Database Engine divides each physical log file internally into a number of virtual log files. Virtual log files have no fixed size, and there is no fixed number of virtual log files for a physical log file. The Database Engine chooses the size of the virtual log files dynamically while it is creating or extending log files. The Database Engine tries to maintain a small number of virtual files. The size of the virtual files after a log file has been extended is the sum of the size of the existing log and the size of the new file increment. The size or number of virtual log files cannot be configured or set by administrators.
The only time virtual log files affect system performance is if the log files are defined by small size and growth_increment values. If these log files grow to a large size because of many small increments, they will have lots of virtual log files.
The transaction log is a wrap-around file. For example, consider a database with one physical log file divided into four virtual log files. When the database is created, the logical log file begins at the start of the physical log file. New log records are added at the end of the logical log and expand toward the end of the physical log. Log truncation frees any virtual logs whose records all appear in front of the minimum recovery log sequence number (MinLSN). The MinLSN is the log sequence number of the oldest log record that is required for a successful database-wide rollback. The transaction log in the example database would look similar to the one in the following illustration.
When the end of the logical log reaches the end of the physical log file, the new log records wrap around to the start of the physical log file.
This cycle repeats endlessly, as long as the end of the logical log never reaches the beginning of the logical log.
The WAL protocol for SQL Server is the ability to secure/harden the log records to stable media. SQL Server uses the FILE_FLAG_WRITE_THROUGH option when opening the file (
CreateFile) to accomplish this task. by using FILE_FLAG_WRITE_THROUGH, Write operations will not go through any intermediate cache, they will go directly to disk. There is another option called
FILE_FLAG_NO_BUFFERING to disable system caching for data reads and writes whilst the file or device is open.
FILE_FLAG_NO_BUFFERING are independent and may be combined.
FILE_FLAG_WRITE_THROUGH is used but
FILE_FLAG_NO_BUFFERING is not also specified, so that system caching is in effect, then the data is written to the system cache but is flushed to disk without delay.
FILE_FLAG_NO_BUFFERING are both specified, so that system caching is not in effect, then the data is immediately flushed to disk without going through the Windows system cache. The operating system also requests a write-through of the hard disk’s local hardware cache to persistent media.
Note Not all hard disk hardware supports this write-through capability.
What is Hard or Stable Media?
Hard/Stable media is deemed any media that can survive a power failure. This could be the physical storage device or sophisticated, battery backed caching mechanisms. As long as the I/O path returns successful write to the SQL Server it can uphold that guarantee.
Every transaction in any RDBMS needs to provide ACID (Atomic, Consistent, Isolated, Durable) characteristics, therefore SQL Server uses Write Ahead Logging Protocols to provide durability of any transaction. to continue, we do need to understand that Latch is only used to protect the physical integrity of the data page while in memory and lock is more of a virtual concept and can protect data that is no longer or not yet in memory.
Take dbo.Transaction table with about 10 Billions records as an example, we do need to execute the following script to update the entire table records.
--Updates about 1.5 Billions Records. Update dbo.Transactions Set TransStatus = 1 --Means Completed Where TransDate = GetDate();
Simplified internal process for the above script is illustrated as below figure.
The entire table won’t fit into memory so lazy writer will wrote dirty pages to disk. In doing so
LazyWriter will first call the routine
FlushToLSN to flush all log records up to and including the LSN stored in the page header of the dirty page. Then
LazyWriter, and only then, will it issue the write of the data page. in case of power outage at this point the log records can be used to reconstruct the page (rollback the changes).
Notice that the latch only protects the physical access to the in-memory data page only for the small about of time the physical change is being made. The lock protects the associated data until the commit or rollback takes place. This allows
LazyWriter to remove a dirty page while the locking structures continue to maintain the transactional integrity.
Remember that SQL Server needs to fetch back all the data pages in case of
Rollback command is issued instead of
Commit, Because the page could have been removed from buffer pool/page cache by lazy writer after the
FlushToLSN took place. When the rollback occurs the pages changed by the transaction may need to be fetched into buffer pool to undo the change. In SQL Server 2000,the log records used to populate the
DELETED internal/virtual tables in a trigger. From SQL Server 2005 and later versions, Snapshot isolation is used, internally, for any trigger instead of having to scan the log records. This means the version store (located) in TEMPDB is involved in
DELETED table population and row version tracking.