Microsoft SQL Server I/O Basics: Chapter 2



Latch enforcement

SQL Server 2000 and SQL Server 2005 can perform latch enforcement for database pages located in the buffer pool cache. Latch enforcement changes the virtual memory protection (VirtualProtect) as database pages are transitioned between the clean and dirty states. The following table outlines the virtual protection states.

  Page State   Virtual Protection State
  ----------   ------------------------
  Dirty        Read Write during the modification.
  Clean        Read Only; any attempt to modify the page when this protection is set (termed a scribbler) causes a handled exception, generating a mini-dump for additional investigation.

The database page remains in the virtual protection state of Read Only until the modification latch is acquired. When the modification latch is acquired, the page protection is changed to Read Write. As soon as the modification latch is released, the page protection is returned to Read Only.

Note: The default latch enforcement protection setting is disabled. Latch enforcement may be enabled with trace flag –T815. SQL Server 2000 SP4 and 2005 allow for the trace flag to be enabled and disabled without a restart of the SQL Server process by using the DBCC traceon(815, -1) and DBCC traceoff(815,-1) commands. Earlier versions of SQL Server require the trace flag as a startup parameter.

Note: The trace flag should be used only under the direction of Microsoft SQL Server Support because it can have significant performance ramifications, and virtual protection changes may not be supported on certain operating system versions when you are using PAE/AWE.

Note: Windows extended VirtualProtect support in Windows Server™ 2003 SP1 and Windows XP SP2 to allow virtual protection of AWE-allocated memory. This is a very powerful change, but it could affect the performance of SQL Server if it is configured to use AWE or locked pages memory, due to the extended protection capabilities.

Latch enforcement applies only to database pages. Other memory regions remain unchanged and are not protected by latch enforcement actions. For example, a TDS output buffer, a query plan, and any other memory structures remain unprotected by latch enforcement.

To perform a modification, SQL Server must update the page protection of a database page to Read Write. The latch is used to maintain physical stability of the database page, so the modification latch is held only long enough to make the physical change on the page. If the page is damaged during this window (scribbled on), latch enforcement will not trigger an exception.
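The protection transitions above can be sketched with a toy model (Python; the class and method names are invented for illustration and this is not engine code): a clean page rests in Read Only protection, acquiring the modification latch flips it to Read Write, releasing the latch flips it back, and an unlatched write (a scribbler) surfaces as an exception, much as the real handled access violation would.

```python
class PageProtectionError(Exception):
    """Simulates the handled exception raised when a scribbler writes a clean page."""

class BufferPage:
    """Toy model of latch-enforcement protection transitions (illustrative only)."""
    def __init__(self):
        self.protection = "ReadOnly"   # clean pages rest in Read Only
        self.data = bytearray(8192)    # an 8-KB database page

    def acquire_modification_latch(self):
        self.protection = "ReadWrite"  # like VirtualProtect(page, PAGE_READWRITE)

    def release_modification_latch(self):
        self.protection = "ReadOnly"   # like VirtualProtect(page, PAGE_READONLY)

    def write(self, offset, value):
        if self.protection != "ReadWrite":
            # A write without the modification latch is a scribbler; the real
            # implementation generates a mini-dump for additional investigation.
            raise PageProtectionError("scribbler detected")
        self.data[offset] = value

page = BufferPage()
page.acquire_modification_latch()
page.write(0, 0xFF)                    # legal: latch held, page is Read Write
page.release_modification_latch()
try:
    page.write(1, 0xAA)                # scribbler: latch not held
    caught = False
except PageProtectionError:
    caught = True
```

Note that, as in the text, a scribble made while the latch is legitimately held (the Read Write window) would not be caught by this model either.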

In versions earlier than SQL Server 2000 SP4, SQL Server latch enforcement protection involved more protection transitions. The following table outlines the protection transitions performed by SQL Server versions earlier than SQL Server 2000 SP4.




  Page State              Virtual Protection State
  ---------------------   ------------------------
  Dirty                   Read Write during the modification.
  Clean, no references    No Access; any attempt to read from or write to the page causes an exception.
  Clean, with references  Read Only; any attempt to modify the page when this protection is set (termed a scribbler) causes a handled exception, generating a mini-dump for additional investigation.

Because virtual protection transitions are expensive, SQL Server 2000 SP4 and SQL Server 2005 no longer transition the page protection to No Access, thereby reducing the number of transitions significantly. The older implementation could raise an exception for an invalid read attempt, which the newer implementations cannot. The overhead of the No Access protection transitions frequently made latch enforcement too heavy for use in a production environment. Leaving the page with Read Only access reduces the number of protection changes significantly and still helps identify a scribbler.

SQL Server does not return all data pages to Read Write protection as soon as the trace flag is disabled. The pages are returned to Read Write protection as they are modified, so it may take some time to return to a fully non-latch-enforced buffer pool.

Checksum on backup and restore

SQL Server 2005 BACKUP and RESTORE statements provide the CHECKSUM option to include checksum protection on the backup stream and trigger the matching validation operations during restore. To achieve a checksum-enabled backup, the BACKUP command must include the CHECKSUM option.

The backup and restore processes try to work with large blocks of data whenever possible. For example, the backup operation examines the allocation bitmaps in the database to determine what data pages to stream to the backup media. As soon as a block of data is identified, the backup operation issues a large 64 KB to 1 MB read from the data file and a matching write operation to the backup stream. The backup operation avoids touching individual bytes of the data pages or log blocks to maximize its throughput as a high speed copy implementation.

Backup and restore operations that use checksum capabilities increase data integrity protection and also increase CPU usage requirements. A backup or restore with the checksum option requires that each byte be interrogated as it is streamed, thereby increasing CPU usage. The checksum that is used for backup and restore uses the same algorithm to calculate the checksum value for the backup media as is used for data pages and log blocks.

The following rules apply to the BACKUP and RESTORE command CHECKSUM operations.



  • By default, SQL Server 2005 BACKUP and RESTORE operations maintain backward compatibility (NO_CHECKSUM is the default).

  • The database’s PAGE_VERIFY setting has no effect on backup and restore operations; only the CHECKSUM setting on the BACKUP or RESTORE command is relevant.

  • The backup and restore checksum is a single value representing the checksum of the complete stream; it does not represent individual pages or log blocks located in the backup stream. The value is calculated during the backup and stored with the backup. The value is recalculated during the restore and checked against the stored value.

  • Backup with the CHECKSUM option will not change the pages as it saves them to the backup media; a page’s protection state (NONE, CHECKSUM, or TORN) is maintained as read from the database file. If a checksum was already stored on the data page, it is verified before the page is written to the backup stream.

  • Restore and Verify commands can be used to validate the CHECKSUM if the backup was created by using the CHECKSUM option. Trying to restore with the CHECKSUM option on a backup without a checksum returns an error.
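The single, stream-level checksum rule above can be illustrated with a small sketch. The actual BACKUP CHECKSUM algorithm is internal to SQL Server; the rotate-and-XOR function below is only a stand-in used to show the idea that one value covers the entire stream, so damage anywhere in the stream changes the stored value and the restore-time comparison fails.

```python
def rotate_xor_checksum(stream: bytes) -> int:
    """Stand-in checksum (rotate left one bit, then XOR each byte).
    Illustrative only; not the real SQL Server backup checksum."""
    value = 0
    for b in stream:
        value = ((value << 1) | (value >> 31)) & 0xFFFFFFFF  # rotate left 1
        value ^= b
    return value

# One checksum value covers the whole backup stream, not individual pages.
pages = [bytes([i]) * 8192 for i in range(4)]   # four fake 8-KB pages
stream = b"".join(pages)
stored = rotate_xor_checksum(stream)            # computed during BACKUP ... WITH CHECKSUM

# RESTORE ... WITH CHECKSUM recomputes over the stream and compares to the stored value.
assert rotate_xor_checksum(stream) == stored

# Damage anywhere in the stream changes the single stream-level value.
damaged = bytearray(stream)
damaged[10000] ^= 0x01
assert rotate_xor_checksum(bytes(damaged)) != stored
```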

For more information on backup and restore, see SQL Server 2005 Books Online.
Page-level restore

SQL Server 2005 Enterprise Edition introduces page-level restore to repair damaged pages. The database can restore a single page from backup instead of requiring a full database, file group, or file restore. For complete details, see SQL Server 2005 Books Online.
Database available during Undo phase

SQL Server 2005 Enterprise Edition enables access to the database as soon as the Redo phase of recovery is finished. Locking mechanisms are used to protect the rollback operations during the Undo phase. To reduce downtime, page-level restore can be combined with the crash recovery capability of enabling access to the database during the Undo phase of recovery.
Torn page protection

Torn page protection has not significantly changed from SQL Server 7.0 and SQL Server 2000. This section provides details on torn page protection and how it works to help you compare TORN protection and CHECKSUM protection. A torn page commonly indicates that one or more sectors have been damaged.
Common reasons

Following are some common problems found by Microsoft SQL Server Support that cause TORN page error conditions.

  • The subsystem or hardware does not handle the data correctly and returns a mix of sector versions. This has been reported on various controllers and firmware because of hardware read-ahead cache issues.

  • Power outages occur.

  • Bit flips or other damage occurs in the page header. This makes it appear that TORN page detection was enabled on the page when it really was not.
Implementation

Torn page protection toggles a two-bit pattern between 01 and 10 every time the page is written to disk. Write A obtains a bit protection of 01, write B obtains 10, write C obtains 01 again, and so on. The low order (last) two bits of each 512-byte sector are stored in the page header and replaced with the torn bit pattern of 01 or 10.

The relevant members of the SQL Server data page header are shown in the following list together with a TORN bit layout diagram.


  Member       Description
  ----------   -----------
  m_flagBits   Bit field in which TORN, CHECKSUM, or NONE protection is indicated.
  m_tornBits   Contains the TORN or CHECKSUM validation value(s).


Figure 1: TORN bit layout


The torn page toggle bits are established as 01 or 10 and positioned in the low order two bits of the m_tornBits value. For the remaining fifteen 512-byte sectors, the low order two bits of each sector are stored in incrementing bit positions of m_tornBits, and the established bit pattern is stored in their location.

Following are the steps shown in the previous diagram.

Step #1: The original sector bit values are stored in the m_tornBits from low order to high order (like a bit array), incrementing the bit storage positions as sector values are stored.

Step #2: The established torn bit pattern is stored in the low order two bits of each sector, replacing the original values.

When the page is read from disk and PAGE_VERIFY protection is enabled for the database, the torn bits are audited.

Step #1: The low order bits of the m_tornBits are checked for the pattern of either 10 or 01 to make sure that the header is not damaged.

Step #2: The low order two bits in each sector are checked for the matching torn bit pattern as stored in the low order two bits of m_tornBits.

If either of these checks fails, the page is considered TORN. In SQL Server 2000 this raises an 823 error; in SQL Server 2005 it raises an 824 error.

Step #3:

  • SQL Server 2000: Replaces the original sector values as each sector is checked, even if an error was detected. This makes it difficult to investigate which sector was torn.

  • SQL Server 2005: Leaves the bits unchanged when an error is detected, which enhances troubleshooting. Investigate the page data to better determine the torn footprint condition.
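The write and audit steps above can be sketched as follows (Python; the constants and byte layout are simplified for illustration and do not match the real page internals byte for byte — in particular, the sketch takes the low two bits of each sector's final byte as that sector's "low order two bits").

```python
SECTOR = 512
SECTORS_PER_PAGE = 16  # an 8-KB page holds sixteen 512-byte sectors

def write_torn_bits(page: bytearray, toggle: int) -> int:
    """Stamp the torn-bit pattern (0b01 or 0b10) into each sector and return
    the m_tornBits value holding the pattern plus the original sector bits."""
    assert toggle in (0b01, 0b10)
    torn_bits = toggle                          # pattern occupies the low-order 2 bits
    for s in range(1, SECTORS_PER_PAGE):        # sector 0 holds the header itself
        last = s * SECTOR + SECTOR - 1          # final byte of sector s
        original = page[last] & 0b11            # save the original low 2 bits
        torn_bits |= original << (2 * s)        # bit-array style storage (Step #1)
        page[last] = (page[last] & ~0b11) | toggle  # replace with the pattern (Step #2)
    return torn_bits

def audit_torn_bits(page: bytearray, torn_bits: int) -> bool:
    """Read-time audit: header pattern sanity, then per-sector pattern match."""
    pattern = torn_bits & 0b11
    if pattern not in (0b01, 0b10):             # Step #1: header is not damaged
        return False
    for s in range(1, SECTORS_PER_PAGE):
        last = s * SECTOR + SECTOR - 1
        if page[last] & 0b11 != pattern:        # Step #2: sector pattern matches
            return False                        # page is considered TORN
    return True
```

A sector written with a stale image keeps the previous toggle value, so the per-sector comparison in the audit fails and the page is reported torn.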


Stale read protection

Stale reads have become a problem that is frequently reported to Microsoft SQL Server Support. A stale read occurs when a physical read returns an old, fully point-in-time consistent page so that it does not trigger TORN or CHECKSUM audit failures; instead, the read operation returns a previous data image of the page. This is also called a lost write because the most recent data written to stable media is not presented to the read operation.

A common cause of stale reads and lost writes is a component such as a hardware read-ahead cache that is incorrectly returning older cached data instead of the last write information.

This condition indicates serious I/O subsystem problems leading to page linkage corruption, page allocation corruption, logical or physical data loss, crash recovery failures, log restore failures, and a variety of other data integrity and stability issues.

In a SQL Server 2000 SP3-based hot fix (build 8.00.0847), stale read detection was added. This addition is outlined in the Microsoft Knowledge Base article, PRB: Additional SQL Server Diagnostics Added to Detect Unreported I/O Problems (http://support.microsoft.com/default.aspx?scid=kb;en-us;826433).

Enhancements

By changing from a ring buffer to a hash table design, SQL Server 2000 SP4 and SQL Server 2005 provide enhanced, low-overhead stale read checking. The original SQL Server 2000 SP3 implementation checks for a stale condition only if another error was found first (605, 823, and so forth). The hash table design, used in newer builds, allows the page read sanity checking to include a stale read check for any page that is read when trace flag –T818 is enabled, without a noticeable performance effect.

For SQL Server 2005, every time a page is written to disk, a hash table entry is inserted or updated with the DBID, PAGEID, RECOVERY UNIT, and LSN that is being flushed to stable media. When a read is complete, the hash table is searched for a matching entry. The matching DBID and PAGEID entry is located. The hash table LSN value is compared to the LSN value that is stored in the page header. The LSN values must match or the page is considered damaged. Thus, if the most recent LSN that was written was not returned during the subsequent read operation, the page is considered damaged.

To maintain a high level of I/O performance and limit the memory footprint, the hash table size is bounded. It tracks only the recent window of data page writes. The number of I/Os tracked varies between the 32-bit and 64-bit versions of SQL Server 2000 SP4 and SQL Server 2005. To optimize speed, each hash bucket and its associated entries are designed to fit in a single CPU cache line, thereby limiting the hash chain length to five entries per bucket. In 32-bit installations, the total size of the hash table is limited to 64 KB (2,560 total entries, a 20-MB window of data); in 64-bit installations, it is limited to 1 MB (40,960 total entries, a 320-MB window of data).
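A minimal sketch of the LSN-based stale read check (Python; the class and method names are invented for illustration, and a simple LRU bound stands in for the real fixed-size, cache-line-aligned hash buckets):

```python
from collections import OrderedDict

class WriteTracker:
    """Bounded map of recent page writes, keyed by (dbid, pageid), storing the
    LSN flushed to stable media. Sketch of the -T818 check, not engine code."""
    def __init__(self, max_entries=2560):       # 32-bit bound: 2,560 entries
        self.entries = OrderedDict()
        self.max_entries = max_entries

    def record_write(self, dbid, pageid, lsn):
        key = (dbid, pageid)
        self.entries.pop(key, None)
        self.entries[key] = lsn
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)    # drop the oldest tracked write

    def check_read(self, dbid, pageid, header_lsn) -> bool:
        """True if the read looks sane; False signals a stale read / lost write."""
        expected = self.entries.get((dbid, pageid))
        return expected is None or expected == header_lsn

tracker = WriteTracker()
tracker.record_write(dbid=5, pageid=101, lsn=900)
assert tracker.check_read(5, 101, 900)      # freshest LSN came back: OK
assert not tracker.check_read(5, 101, 850)  # older page image returned: stale read
assert tracker.check_read(5, 999, 1)        # untracked page: no verdict possible
```

Because only the recent window of writes is tracked, a page evicted from the table can no longer be checked, which matches the bug profile described below: the stale image typically comes back on a read immediately following or overlapping the write.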

The size restriction is based on the testing of known bugs that caused stale reads or lost writes. The bug characteristics typically involve a hardware memory cache that held the older page data and a read operation that immediately followed or overlapped the write operation.

Stalled I/O detection

Database engine performance can be highly affected by the underlying I/O subsystem performance. Stalls or delays in the I/O subsystem can cause reduced concurrency of your SQL Server applications. Microsoft SQL Server Support has experienced an increase in I/O subsystem delays and stall conditions resulting in decreased SQL Server performance capabilities.

For a SQL Server 2000 installation, an I/O stall or delay is frequently detected by watching sysprocesses for I/O-based log and/or buffer (data page) wait conditions. Whereas small waits might be expected, some filter drivers or hardware issues have caused 30+ second waits to occur, causing severe performance problems for SQL Server-based applications.

Starting with SQL Server 2000 SP4 and SQL Server 2005, SQL Server monitors and detects stalled I/O conditions that exceed 15 seconds in duration for data page and log operations. The following Microsoft Knowledge Base article describes the SQL Server 2000 SP4 implementation: SQL Server 2000 SP4 diagnostics help detect stalled and stuck I/O operations (http://support.microsoft.com/default.aspx?scid=kb;en-us;897284).

SQL Server 2000 SP4 and SQL Server 2005 also increase the visibility of latch operations. A latch is used to guarantee the physical stability of the data page when a read from or a write to stable media is in progress. With the increased latch visibility change, customers are frequently surprised after they apply SQL Server 2000 SP4 when a SPID appears to block itself. The following article describes how the latch information displayed in sysprocesses can be used to determine I/O stall conditions as well as how a SPID can appear to block itself: The blocked column in the sysprocesses table is populated for latch waits after you install SQL Server 2000 SP4 (http://support.microsoft.com/kb/906344/en-us).

SQL Server 2005 contains the stalled I/O monitoring and detection. The stalled I/O warning activity is logged when a stall of 15 seconds or longer is detected. Additionally, latch time-out error messages have been extended to clearly indicate that the buffer is in I/O. This indicates that the I/O has been stalled for 300 seconds (five minutes) or more.

There is a clear difference between reporting and recording. Reporting occurs only at intervals of five minutes or longer, when a new I/O action occurs on the file. Any worker posting an I/O examines the specific file for reporting needs. If an I/O has been recorded as stalled and five minutes have elapsed since the last report, a new report is logged to the SQL Server error log.

Recording occurs in the I/O completion routines, and the lazy writer checks all pending I/Os for stall conditions. Recording occurs when an I/O request is pending (FALSE == HasOverlappedIoCompleted) and 15 seconds or longer has elapsed.

Note: The FALSE return value from a HasOverlappedIoCompleted call indicates that the operating system or I/O subsystem has not completed the I/O request.

sys.dm_io_pending_io_requests (DMV)

SQL Server 2005 provides dynamic access to pending I/O information so that a database administrator can determine the specific database, file, and offset leading to a stall. The dynamic management view (DMV) sys.dm_io_pending_io_requests contains details about the offset and status of each outstanding I/O request. This information can be used by Microsoft Platforms Support and various utilities to track down the root cause. For more information, go to http://support.microsoft.com and search for information related to IRP and ETW event tracing.

The io_pending column is the key for evaluating the result set. The io_pending column indicates whether the I/O request is still pending or whether the operating system and I/O subsystem have completed it. The value is determined by a call to HasOverlappedIoCompleted to determine the status of the I/O request. The following table outlines the possible values of io_pending.




  io_pending Value  Description
  ----------------  -----------
  TRUE              Indicates that the asynchronous I/O request is not finished. SQL Server cannot perform additional actions against the data range until the operating system and I/O subsystem complete the I/O request. To learn more about pending I/O conditions, see HasOverlappedIoCompleted in the SDK documentation. Lengthy asynchronous I/O conditions typically indicate a core I/O subsystem problem that should be addressed to return SQL Server to ordinary operating conditions.

  FALSE             Indicates that the I/O request is ready for additional processing actions by SQL Server. If the pending time of the I/O continues to climb, the issue may be a SQL Server scheduling problem. For a discussion of SQL Server scheduler health and associated troubleshooting, see the white paper How To Diagnose and Correct Errors 17883, 17884, 17887, and 17888 (http://download.microsoft.com/download/4/f/8/4f8f2dc9-a9a7-4b68-98cb-163482c95e0b/DiagandCorrectErrs.doc).

The io_pending_ms_ticks column is the elapsed time, in milliseconds (ms), since the I/O request was posted to the operating system.

The io_handle column is the file HANDLE with which the I/O request is associated. This column can be joined to the file_handle column of the dynamic management function (DMF) sys.dm_io_virtual_file_stats to obtain the specific file and database association for the I/O. The following is an example of a query to obtain this information.
SELECT fileInfo.*, pending.*
FROM sys.dm_io_pending_io_requests AS pending
INNER JOIN (SELECT * FROM sys.dm_io_virtual_file_stats(-1, -1)) AS fileInfo
    ON fileInfo.file_handle = pending.io_handle


This join can be further enhanced by adding information such as the database name, or by using the offset to calculate the actual PAGEID (Offset / 8192 = PAGEID).
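The offset-to-page calculation is integer division by the 8-KB page size; a sketch (Python; the function name is illustrative):

```python
PAGE_SIZE = 8192   # SQL Server data pages are 8 KB

def offset_to_page_id(io_offset_bytes: int) -> int:
    """Convert a pending I/O's file offset to the data page ID it targets
    (Offset / 8192 = PAGEID)."""
    return io_offset_bytes // PAGE_SIZE

assert offset_to_page_id(0) == 0          # first page in the file
assert offset_to_page_id(8192) == 1
assert offset_to_page_id(819200) == 100   # an I/O stalled 800 KB into the file
```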

WARNING: DMVs and DMFs access core system structures to produce the result set. Internal structures must be accessed with thread safety, which may have performance ramifications. The use of DMVs that access core SQL Server components should be limited to avoid possible performance effects.
Read retry

SQL Server 2005 extends the use of read retry logic for data pages to increase read consistency possibilities. A read retry involves performing exactly the same read operation immediately following a read failure in an attempt to successfully complete the read.

Microsoft has successfully used read retries to compensate for intermittent failures of an I/O subsystem. Read retries can mask data corruption issues in the I/O subsystem and should be investigated carefully to determine their root cause. For example, a disk drive that is going bad may intermittently return invalid data. Re-reading the same data may succeed and the read retry has provided runtime consistency. However, this is a clear indicator that the drive is under duress and should be examined carefully to avoid a critical data loss condition.

The Microsoft Exchange Server product added read retry logic and has experienced improved read consistency. This section outlines how and when SQL Server 2000 and SQL Server 2005 perform read retry operations.

Resource-based retries

SQL Server 2000 performs read retries only when the attempt to begin a read operation fails with an operating system error of ERROR_WORKING_SET_QUOTA (1453) or ERROR_NO_SYSTEM_RESOURCES (1450). Unlike SQL Server 2005 with its enhancements, SQL Server 2000 does not attempt any other form of read retry, except for sort failures.

When the error occurs, the SQL Server worker yields for 100ms and tries the read operation again. This loop continues until the I/O is successfully issued. A simplified example demonstrates this behavior.


WHILE (FALSE == ReadFile()
       && (1450 == GetLastError() || 1453 == GetLastError()))
{
    Yield(100);
}
SQL Server 2005 maintains the same logic when it is trying to start a read operation.
Sort retries

SQL Server 7.0, 2000, and 2005 have sort-based retry logic. These retries frequently appear as ‘BobMgr’ entries in the SQL Server error log. When a read of a spooled sort buffer from tempdb fails, SQL Server tries the read again. Only a few retries are attempted before the failure is considered fatal to the sort. Sort retries should be considered a serious I/O stability problem. To correct the problem, try moving tempdb to a different location.
Other read failure retries

SQL Server 2005 has extended the read retry logic to read failures that occur after the read was successfully started. When ReadFile returns TRUE, this indicates that the operating system accepted the application’s request to read from the file. If a subsequent failure is experienced, the result is a SQL Server error such as an 823 or 824.

SQL Server 2005 allows read retry operations to occur continuously when the read finishes with an error caused by a resource shortage. For all non-resource-shortage conditions, up to four (4) additional retries may be attempted.

Each successive retry yields before it tries the read operation again. The yield is based on the following formula: yield time = retry attempt * 250 ms. If the error condition cannot be resolved within four retries (five total attempts: one initial read and four retries), an 823 or 824 error is reported. SQL Server 2005 saves the original error condition details, such as a checksum failure, and includes them with the error report in the SQL Server error log.
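The retry policy above can be sketched as follows (Python; the names are invented, and the real engine yields the worker rather than sleeping and raises error 823 or 824 with the preserved original error details):

```python
def read_with_retry(read_fn, sleep_fn, max_retries=4):
    """Sketch of the SQL Server 2005 post-start read-retry policy:
    one initial read plus up to four retries, yielding attempt * 250 ms
    before each retry, preserving the original error for the final report.
    read_fn returns (ok, error); sleep_fn receives the yield in ms."""
    first_error = None
    for attempt in range(max_retries + 1):
        if attempt:
            sleep_fn(attempt * 250)            # yield time = retry attempt * 250 ms
        ok, err = read_fn()
        if ok:
            return ("success", attempt, first_error)
        if first_error is None:
            first_error = err                  # original details are preserved
    return ("error_823_or_824", max_retries, first_error)

# A read that fails twice (say, with a checksum failure) and then succeeds:
yields = []
outcomes = iter([(False, "checksum failure"), (False, "checksum failure"), (True, None)])
status, retries_used, original = read_with_retry(lambda: next(outcomes), yields.append)
assert status == "success" and retries_used == 2
assert yields == [250, 500]                    # escalating yields before each retry
assert original == "checksum failure"          # logged with the informational message
```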

If the retry succeeds, an informational message is added to the SQL Server error log. This indicates that a retry occurred. The following is an example of the message.

“A read of the file <> at offset <> succeeded after failing <> time(s) with error: <>. Additional messages in the SQL Server error log and system event log may provide more detail. This error condition threatens database integrity and must be corrected. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.”

Read retries signal a serious data stability problem: the I/O subsystem is returning incorrect data to SQL Server. This condition is likely to cause a fatal SQL Server error or even a system-wide failure. Additionally, the retry activity has a performance effect on SQL Server operations, because as soon as a read error is detected, the worker performs the retries until one succeeds or the retry limit is exhausted.


Page audit

SQL Server 2000 SP4 and SQL Server 2005 include additional page audit capabilities. Enabling the dynamic trace flag -T806 causes all physical page reads to run the fundamental DBCC page audit against the page as soon as the read is complete. This check is performed at the same point as the PAGE_AUDIT and other logical page checks are performed.

This is another way to locate data page corruption in areas of the page other than the basic page header fields, which are always checked during the physical read. For example, when trace flag –T806 is enabled, the row layout is audited for appropriate consistency.

Page audit was first added in SQL Server 2000 SP3, hot fix build 8.00.0937. For more information, see the following Microsoft Knowledge Base article: FIX: Additional diagnostics have been added to SQL Server 2000 to detect unreported read operation failures (http://support.microsoft.com/kb/841776/en-us).

Note: Enabling page audit capabilities can increase the CPU load on the server and decrease overall SQL Server performance.

SQL Server 2005 introduces checksum protection, which generally supersedes page audit capabilities. Checksum protection ensures that every bit on the page is the same as that written to stable media.

For SQL Server 2005 installations, checksum is often a better solution than constant data integrity auditing. However, page audit can help catch corruption that was stored to stable media even though physical page consistency was not compromised. Microsoft SQL Server Support has encountered an example of this: a third-party extended stored procedure scribbled on a data page that was already marked dirty. The checksum was calculated on the damaged page and the page was written. Reading this page back in with page audit enabled could reveal an error condition that checksum could not detect. If you believe the server may be experiencing a problem that checksum cannot detect but page audit is detecting, consider in-memory checksumming and latch enforcement to help locate the scribbler.

Log audit

SQL Server 2000 and 2005 include trace flag –T3422, which enables log record auditing. Troubleshooting a system that is experiencing problems with log file corruption may be easier with the additional log record audits this trace flag provides. Use this trace flag with caution, as it introduces overhead for each transaction log record.
Checkpoint

SQL Server 2005 implemented user-controlled I/O target behavior for the manual CHECKPOINT command and improved the I/O load levels during automatic checkpointing. For more information on how to issue a manual CHECKPOINT command and specify a target value (in seconds), see SQL Server 2005 Books Online.

Microsoft has received requests to implement a more dynamic checkpoint algorithm. For example, this would be useful during a SQL Server shutdown. Especially for highly available databases, a more aggressive checkpoint can reduce the amount of work and time that crash recovery needs during a restart.

The amount of transaction log activity determines when to trigger a checkpoint of a database. Transaction log records have a recovery cost value, calculated in milliseconds. Each time a new transaction log record is produced, the accumulated cost is used to estimate the recovery time required since the last checkpoint. When the estimate exceeds the recovery time goal, a checkpoint is triggered. This keeps the crash recovery runtime within the specified recovery interval goal.
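The triggering rule can be sketched as an accumulator (Python; a simplified model of the recovery-interval goal with invented names, not engine code):

```python
class CheckpointTrigger:
    """Accumulates per-log-record recovery cost (ms) and fires a checkpoint
    when the estimated crash-recovery time exceeds the recovery interval goal."""
    def __init__(self, recovery_interval_ms):
        self.goal = recovery_interval_ms
        self.accumulated_ms = 0    # estimated redo time since the last checkpoint
        self.checkpoints = 0

    def log_record(self, recovery_cost_ms):
        self.accumulated_ms += recovery_cost_ms
        if self.accumulated_ms > self.goal:
            self.checkpoints += 1
            self.accumulated_ms = 0    # a checkpoint resets the recovery estimate

trigger = CheckpointTrigger(recovery_interval_ms=60_000)   # one-minute goal
for _ in range(121):
    trigger.log_record(500)            # each record adds 500 ms of estimated redo
assert trigger.checkpoints == 1        # fired when the estimate crossed 60 s
assert trigger.accumulated_ms == 0
```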

The following base rules apply to checkpoint. The term latency as it is used here indicates the elapsed time from when the write was issued until the write is considered complete by the checkpoint process.



Manual checkpoint – target specified

  • I/O latency target is set to the default of 20 ms. The target is set to 100 ms if shutdown is in progress.

  • The maximum number of outstanding I/Os is capped at the larger of the following calculations:

      • Committed Buffer Count / 3750

      • 80 * Number of Schedulers

  • The number of outstanding I/Os is constantly adjusted so that progress through the buffer pool is commensurate with elapsed time and target time.

Manual checkpoint – no target specified – or – automatic checkpoint in response to database activity

  • I/O latency target is set to the default of 20 ms. The target is set to 100 ms if shutdown is in progress.

  • The maximum number of outstanding I/Os is capped at the larger of the following calculations:

      • Committed Buffer Count / 3750

      • 80 * Number of Schedulers

  • The minimum number of outstanding I/Os required is 2.

  • The number of outstanding I/Os is adjusted to keep write response times near the latency target.
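The outstanding I/O cap described above reduces to a single expression (Python sketch; the example numbers assume 8-KB buffers, so a 4-GB buffer pool holds roughly 524,288 committed buffers):

```python
def checkpoint_io_cap(committed_buffer_count: int, schedulers: int) -> int:
    """Maximum outstanding checkpoint I/Os: the larger of
    Committed Buffer Count / 3750 and 80 * Number of Schedulers."""
    return max(committed_buffer_count // 3750, 80 * schedulers)

# A ~4-GB buffer pool (524,288 8-KB buffers) on an 8-scheduler machine:
assert checkpoint_io_cap(524_288, 8) == 640      # 80 * 8 dominates (vs. 139)
# A much larger pool shifts the cap onto the buffer-count term:
assert checkpoint_io_cap(7_500_000, 8) == 2000   # 7,500,000 / 3750 dominates
```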


For any checkpoint invocation

When checkpoint reaches its outstanding I/O target, it yields until one of the outstanding I/Os is finished.

For no target specified or automatic checkpointing

The checkpoint process tracks checkpoint-specific I/O response times and can adjust the number of outstanding I/O requests if the I/O latency of checkpoint writes exceeds the latency target. As checkpoint processing continues, the goal for the number of outstanding I/Os is adjusted to maintain response times that do not exceed the established latency goal. If outstanding I/O levels begin to exceed the tolerable latency goals, checkpoint adjusts its activity to avoid possible effects on the overall system.

For a manual checkpoint with a target specified

When checkpoint processing detects that it is ahead of the specified target, it yields until activities fall within the goal or until all outstanding checkpoint I/Os finish. If all outstanding I/Os are complete, checkpoint issues another I/O and again tries to meet the target goal.



SQL Server 2005 SP1 allows for continuous checkpointing

SQL Server 2005 SP1 alters the checkpoint algorithm slightly. The original SQL Server 2005 release does not add time between I/Os, so the checkpoint process may finish ahead of the target schedule. Service Pack 1 introduces the appropriate delays to honor the target as closely as possible. We do not recommend this, but administrators can disable automatic checkpointing and use manual checkpointing with a specified target; putting the manual, targeted checkpoint in a continuous loop provides a continuous checkpoint operation. Do this with extreme caution because checkpoints are serialized and this could affect other databases and backup operations. It also requires that a checkpoint process be established for all databases.

Notice that SQL Server 2005 Service Pack 1 also contains a fix for a very rare checkpoint bug. The fix is for a very small window where checkpoint could miss flushing a buffer. This could lead to unexpected data damage. To avoid this problem, apply SQL Server 2005 SP1.

WriteMultiple extended

SQL Server 7.0 introduced an internal routine named WriteMultiple. The WriteMultiple routine writes data pages to stable media. For more information, see “Flushing a Data Page To Disk” in SQL Server I/O Basics on MSDN.

SQL Server 7.0 and 2000 could issue a WriteMultiple operation for up to 16 pages (128 KB). SQL Server 2005 extends the capability of WriteMultiple up to 32 pages (256 KB). This may change the block size configurations for your performance goals. For more information about physical database layout, see the article “Physical Database Storage Design” (http://download.microsoft.com/download/4/f/8/4f8f2dc9-a9a7-4b68-98cb-163482c95e0b/PhysDBStor.doc).

SQL Server 2005 varies the WriteMultiple logic. In SQL Server 7.0 and 2000, the function accepts a starting page ID. The starting page and up to 16 subsequent, contiguous dirty pages for the same database are bundled in a single write request. SQL Server does this by using hash table lookups for subsequent contiguous pages. When a page is not found or a page is found that is clean, the I/O request is considered finished.

SQL Server 2005 adds lookup and safety steps to WriteMultiple. The forward page search works the same way as in SQL Server 7.0 and 2000. When the forward search is finished, SQL Server 2005 can perform a backward search if all 32 pages of the I/O request are not yet filled, using the same hash table lookups to find more pages. For example, if WriteMultiple is passed page 1:20, the backward search examines 1:19, 1:18, and so on. The search continues until:



  • A page is not found.

  • A page is found to be clean.

  • All 32 pages for the I/O have been identified.
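The gather logic just described can be sketched as follows. This is an illustrative model, not SQL Server internals: the `dirty_pages` set stands in for the buffer pool hash table lookups.

```python
def gather_write_pages(start_page, dirty_pages, max_pages=32):
    """Sketch of the SQL Server 2005 WriteMultiple page gathering.

    dirty_pages is a hypothetical stand-in for the buffer pool hash
    table: the set of page IDs currently cached in a dirty state.
    """
    pages = [start_page]

    # Forward search: bundle contiguous dirty pages after the start page.
    next_page = start_page + 1
    while len(pages) < max_pages and next_page in dirty_pages:
        pages.append(next_page)
        next_page += 1

    # Backward search (new in SQL Server 2005): if the request is not
    # yet full, examine the pages below the starting page, e.g. 1:19,
    # 1:18, and so on, stopping at a missing or clean page.
    prev_page = start_page - 1
    while len(pages) < max_pages and prev_page in dirty_pages:
        pages.insert(0, prev_page)
        prev_page -= 1

    return pages

# Pages 18-22 are dirty; starting at page 20 bundles all five into
# one write request (page 40 is not contiguous, so it is skipped).
print(gather_write_pages(20, {18, 19, 21, 22, 40}))
```

Either search stops as soon as a page is not found or is clean, so only one contiguous dirty run ever ends up in a single write request.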

SQL Server 2005 also adds page header checks. One such check compares the expected page ID to the page ID stored in the page header. This prevents writing a scribbled or incorrect page to disk and causing permanent database damage.
Read-ahead enhanced

In SQL Server 2005, the read-ahead design is enhanced so that it reduces physical data transfer requirements by trimming the leading and trailing pages from the request if the data page(s) are already in the buffer pool.

For more information on SQL Server read-ahead logic, see SQL Server I/O Basics (http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/sqlIObasics.mspx).

For example, a read-ahead request is to be issued for pages 1 through 128 but pages 1 and 128 are already located in the SQL Server buffer pool. The read-ahead request would be for pages 2 through 127 in SQL Server 2005. In comparison, SQL Server 2000 requests pages 1 through 128 and ignores the data that is returned for pages 1 and 128.
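The trimming behavior can be modeled with a short sketch; the `cached_pages` set here is a hypothetical stand-in for the buffer pool.

```python
def trim_read_ahead(first_page, last_page, cached_pages):
    """Sketch of SQL Server 2005 read-ahead trimming: leading and
    trailing pages already in the buffer pool are dropped from the
    physical read request. Interior cached pages are still read,
    because only the ends of the contiguous request can be trimmed."""
    while first_page <= last_page and first_page in cached_pages:
        first_page += 1
    while last_page >= first_page and last_page in cached_pages:
        last_page -= 1
    if first_page > last_page:
        return None  # everything is already cached; no physical read
    return (first_page, last_page)

# Pages 1 and 128 are already cached: the physical read shrinks to 2-127.
print(trim_read_ahead(1, 128, {1, 128}))
```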

Sparse files / Copy on write / Streams

NTFS sparse file technology is used for database snapshots and online DBCC CHECK* operations. This section provides more detailed information about this technology in SQL Server.

Note: At the time of publication, manufacturer-specific “thin provisioning” implementations have not yet been tested. See the SQL Server Always On Storage Solution Review program (http://www.microsoft.com/sql/AlwaysOn) for newer information about this topic.
Streams

Online DBCC CHECK* uses a transient, sparse file stream for each data file that is checked. The streams are named using the following template: “<>:MSSQL_DBCC<>”. A stream is a secondary data area associated with the original file, provided by the file system. Online DBCC uses the stream to create a transient snapshot of the database while it performs the checks. This snapshot is unavailable to database users. The snapshot stream enables online DBCC to create and test all facts against an exact point-in-time replica of the database while requiring only limited physical storage: during online DBCC, only those pages that are modified after DBCC started are stored in the stream. When online DBCC is finished, the stream is deleted.

Note that the stream enables DBCC to reduce its locking granularity. If the stream cannot be created or runs out of space, DBCC reverts to the older TABLE LOCK behavior. Administrators should review the free space available on each volume to make sure that high database concurrency is maintained.

For more information about online DBCC, see DBCC Internal Database Snapshot Usage in SQL Server 2005 Books Online.

Copy-on-write and sparse files

A snapshot database contains the original images of data pages that are modified in the primary database after the snapshot was created. Establishing a database snapshot includes performing an immediate, secondary rollback of the transactions that are active at the time of creation: active transactions in the primary database must be rolled back in the new snapshot to obtain the correct point in time. Transaction states in the primary database remain unaffected.

To conserve physical disk space, snapshot databases are stored on sparse files. This limits the physical disk space requirement of the snapshot database to that of the modified images. As more data pages are modified in the parent database, the physical storage requirement of the snapshot database increases.

Snapshot database files are named as specified by the CREATE DATABASE command. For example, the snapshot parent database can contain the main.mdf file and the snapshot may use snapshot_main.mdf.

SQL Server 2005 implements copy-on-write for the data pages of a snapshot database. Before the primary database page can be modified, the original data page is written to any associated database snapshot. It is important for administrators to monitor the physical size of the snapshot databases to determine and predict the physical storage requirements. Notice that the smallest allocation unit in a sparse file is 64 KB. Therefore, the space requirement may grow faster than you expect.

For more information about snapshot databases, see How Database Snapshots Work in SQL Server 2005 Books Online.

Recovery redo and undo operations use the transaction log to return a database to a consistent transactional state. However, this logic does not apply to a snapshot database. Notice that snapshot databases do not have transaction logs. Therefore, the copy-on-write activity must be complete before the primary database transaction can continue.

The snapshot uses a combination of API calls to determine the 64 KB allocated regions in the sparse file. It must also implement an in-memory allocation bitmap to track the individual pages that have been stored in the snapshot. During the creation or expansion of the snapshot file, SQL Server sets the appropriate file system attributes so that all unwritten regions return complete zero images for any read request.

When a modification request starts in the primary database, the data page may first be written to any associated snapshot database, and the in-memory allocation bitmap is maintained accordingly. Determining whether a copy of the database page is necessary involves a combination of API calls and possible read requests against the snapshot database, to establish whether the data page has already been copied there. Because this decision can itself introduce I/O on the snapshot database, it is important to consider the additional overhead.
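A simplified model of the copy-on-write decision follows. The names and data structures are illustrative, not SQL Server internals: an in-memory set stands in for the allocation bitmap, and a dictionary stands in for the sparse snapshot file.

```python
class SnapshotSketch:
    """Illustrative model of the copy-on-write decision for one
    database snapshot."""

    def __init__(self):
        self.bitmap = set()        # page IDs known to be copied already
        self.snapshot_pages = {}   # page ID -> original page image
        self.snapshot_reads = 0    # reads issued against the snapshot

    def page_already_copied(self, page_id):
        # Fast path: the in-memory bitmap answers without any I/O.
        if page_id in self.bitmap:
            return True
        # Slow path: read the snapshot; an absent (zeroed) page means
        # the copy has not happened yet. This is the extra read I/O
        # the text warns about.
        self.snapshot_reads += 1
        return page_id in self.snapshot_pages

    def before_modify(self, page_id, original_image):
        # Copy-on-write: preserve the original image before the
        # primary database page is allowed to change.
        if not self.page_already_copied(page_id):
            self.snapshot_pages[page_id] = original_image
            self.bitmap.add(page_id)

snap = SnapshotSketch()
snap.before_modify(101, b"original image of page 101")
snap.before_modify(101, b"second change; no new copy is made")
print(len(snap.snapshot_pages), snap.snapshot_reads)
```

Only the first modification of a page triggers a copy; later modifications of the same page hit the bitmap fast path and cost no snapshot I/O.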

When SQL Server performs the copy-on-write operation into a sparse file, the operating system may perform the write in a synchronous manner when it acquires new physical storage space. To prevent the synchronous nature of these writes from affecting the SQLOS scheduler, the copy-on-write writes may be performed by using secondary workers from the worker pool. In fact, if multiple snapshots exist for the same primary database, the writes can be performed by using multiple workers in a parallel manner. The initiating worker waits for the secondary workers and any writes to complete before continuing with the modification on the original data page. When SQL Server waits for the writes to complete, the originating worker may show a wait status such as replica write or a latch wait for the replica data page which is in I/O.

Even if the transaction is rolled back, the data page has been written to the snapshot. As soon as a write occurs, the sparse file obtains the physical storage. It is no longer possible to fully roll back this operation and reclaim the physical disk space.



Note: If a copy-on-write operation fails, the snapshot database is marked as suspect. SQL Server generates the appropriate error.

Realize that increased copy-on-write operations (actual writes, or reads to determine whether a copy-on-write is necessary) can change the performance dynamics of some queries. For example, the I/O speed of a snapshot database could limit the scalability of certain queries. Therefore, you should locate the snapshot database on a high-speed I/O subsystem.



Note: Utilities such as WinZip, WinRAR, and copy do not maintain the integrity of a sparse file. When a file is copied by using these utilities, all unallocated bytes are read and restored as zeros. This requires actual allocations, and the restored file loses its sparse file attributes. To copy a sparse file, you must use a utility that understands how to capture and restore the metadata structure of the file.
Stream and sparse file visibility

Stream and sparse file sizes are easier to monitor if you are aware of some key visibility limitations.

Secondary file streams are frequently invisible to commands such as ‘dir’. This can make it difficult to determine how much copy-on-write activity has occurred during an online DBCC CHECK*. However, various third-party utilities are available which can be used to view size information about file streams.

Similarly, sparse file sizes that are reported to commands such as ‘dir’ indicate the complete file size as established by the End Of File (EOF) setting and not the actual physical storage on disk. Windows Explorer shows the logical as ‘Size’ and the physical as ‘Size on Disk.’

When SQL Server writes to a sparse database file, it can acquire space in database extent sizes of 64 KB (8 pages * 8 KB each = 64 KB). SQL Server 2005 detects when a segment of the file has not been allocated in the sparse file. It not only copies the page during the copy-on-write operation but also establishes the next seven pages (one extent) with all-zero images. This reduces operating system-level fragmentation by working on the common operating system file system cluster boundaries. It also enables future copy-on-write requests for any one of the other seven pages to finish quickly, because the physical space and file system tracking have already been established. In addition, it enables SQL Server to use the file-level allocation information to determine which extents are physically allocated in the sparse file. The zeroed page images can be used to determine which of the pages enclosed in an allocation region have been copied from the primary database.
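The extent arithmetic described above is simple; this sketch shows which pages are established when a single page is first copied. The page and extent sizes come from the text; the function name is illustrative.

```python
PAGE_SIZE = 8 * 1024   # 8-KB database page
EXTENT_PAGES = 8       # 8 pages * 8 KB = 64 KB, one sparse allocation unit

def extent_pages_for(page_id):
    """Return the eight page IDs in the 64-KB extent that holds
    page_id. Copying any one page establishes the whole extent: the
    copied page gets its image, and the other seven pages get all-zero
    images so later copies into this extent finish quickly."""
    start = (page_id // EXTENT_PAGES) * EXTENT_PAGES
    return list(range(start, start + EXTENT_PAGES))

# Copying page 19 establishes the extent covering pages 16 through 23.
print(extent_pages_for(19))
```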


Snapshot reads

SQL Server 2005 is designed to read data from both the snapshot file and the matching primary database file when data is queried in the snapshot database. By using the in-memory sparse database allocation bitmaps and the zero page images, SQL Server can quickly determine the actual location of data pages that are required by a query.

For example, during query execution SQL Server can elect to perform a large read, retrieving eight or more data pages with a single read request such as a read-ahead. Consider a specific example.

In the primary database, a read-ahead operation for this region creates a single 64-KB read to retrieve eight contiguous pages. However, if the third and sixth pages have been modified and copied (by copy-on-write) to the snapshot database, the read is split. This example requires five separate read requests:


  • Pages 1 and 2 can be read from the primary

  • Page 3 from the snapshot

  • Pages 4 and 5 from the primary

  • Page 6 from the snapshot

  • Pages 7 and 8 from the primary

SQL Server always tries to optimize physical data access requests, but a query against the snapshot database may have to perform more I/Os than the identical query executed against the primary database.
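The split in the example above can be computed by walking the page range and grouping consecutive pages by their source. This is an illustrative sketch, not the actual engine logic.

```python
from itertools import groupby

def split_reads(first_page, last_page, copied_to_snapshot):
    """Group a contiguous page range into the separate read requests a
    snapshot query needs: runs of copied pages come from the snapshot
    file, runs of unmodified pages from the primary database file."""
    pages = range(first_page, last_page + 1)
    requests = []
    for in_snapshot, run in groupby(pages,
                                    key=lambda p: p in copied_to_snapshot):
        run = list(run)
        source = "snapshot" if in_snapshot else "primary"
        requests.append((source, run[0], run[-1]))
    return requests

# Pages 3 and 6 were copied: the single 8-page read becomes 5 requests.
print(split_reads(1, 8, {3, 6}))
```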

WARNING: If you use SQL Server database snapshots or online DBCC CHECK* operations, apply the operating system fix for bug 1320579 to avoid corruption of the snapshot. This is covered in the following Microsoft Knowledge Base article: Error message when you run the DBCC check command in SQL Server 2005: "8909 16 1 Table error: Object ID 0, index ID -1, partition ID 0, alloc unit ID 0 (type unknown)" (909003)
Instant file initialization

Newer operating system versions, including Windows XP and Windows Server 2003, implement instant file initialization capabilities by supplying the API SetFileValidData. This API enables SQL Server to acquire physical disk space without physically zeroing the contents. This enables SQL Server to consume the physical disk space quickly. All versions of SQL Server use the database allocation structures to determine the valid pages in the database; every time that a new data page is allocated, it is formatted and written to stable media.

SQL Server 2000 creates and expands database log and data files by stamping the new section of the file with all-zero values. The algorithm SQL Server uses is more aggressive than the NTFS zero initialization (DeviceIoControl, FSCTL_SET_ZERO_DATA), avoiding the extended NTFS file lock behavior and enabling concurrent access to other sections of the file. However, zero initialization is limited by physical I/O capabilities and can be a lengthy operation. A SQL Server 2005 instance whose service account lacks the required privilege reverts to this SQL Server 2000 behavior.

SQL Server 2005 uses instant file initialization only for data files. Instant file initialization removes the zero stamping during the creation or growth of the data file. This means that SQL Server 2005 can create very large data files in seconds.

The following rules apply to SQL Server 2005 and instant file initialization.



  • The operating system and file system must support instant file initialization.

  • The SQL Server startup account must possess the SE_MANAGE_VOLUME_NAME privilege. This privilege is required to successfully run SetFileValidData.

  • The file must be a SQL Server database data file. SQL Server transaction log files are not eligible for instant file initialization.

  • If trace flag –T1806 is enabled, SQL Server 2005 behavior reverts to SQL Server 2000 behavior.
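The rules above amount to a simple eligibility test, sketched here for illustration; the parameter names are invented for clarity and are not SQL Server settings.

```python
def uses_instant_file_init(is_data_file, has_manage_volume_privilege,
                           os_supports_set_file_valid_data,
                           trace_1806_enabled):
    """Sketch of the SQL Server 2005 eligibility rules listed above.
    Returns True when a file create/grow can skip zero stamping.
    All conditions must hold; otherwise SQL Server falls back to
    SQL Server 2000-style zero initialization."""
    return (is_data_file                        # never transaction logs
            and has_manage_volume_privilege     # SE_MANAGE_VOLUME_NAME
            and os_supports_set_file_valid_data # SetFileValidData exists
            and not trace_1806_enabled)         # -T1806 forces old behavior

# A data file, with the privilege, on a supporting OS: instant init.
print(uses_instant_file_init(True, True, True, False))
# A transaction log file is always zero initialized.
print(uses_instant_file_init(False, True, True, False))
```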

SetFileValidData allows fast allocation of the physical disk space. Because the space is not zeroed, data that already exists on the physical disk could be seen, but only internally, during a SQL Server read operation. Because SQL Server knows which pages have been allocated, this data is not exposed to a user or administrator.

To guarantee transaction log integrity, SQL Server must zero initialize the transaction log files. However, for data files SQL Server formats the data pages as they are allocated so the existing data does not pose a problem for database data files.



Note: To guarantee the physical data file space acquisition during data file creation or expansion, on a thin provisioned subsystem, use trace flag –T1806.

Trace flag –T1806 provides backward compatibility by zero initializing database data files instead of using instant file initialization. You may also remove the SE_MANAGE_VOLUME_NAME privilege to force SQL Server to use zero file initialization. For more information about the SE_MANAGE_VOLUME_NAME privilege, see Database File Initialization in SQL Server 2005 Books Online.



Note: Zero file initialization does not protect against discovery of previously stored data. To fully prevent discovery of previous data, the physical media must undergo a Department of Defense (DOD)-level series of writes, generally seven unique write patterns. Zero file initialization performs only a single, all-zero write to the data pages.

For more information on previously stored data security, see Protect and purge your personal files (http://www.microsoft.com/athome/moredone/protectpurgepersonalfiles.mspx).

For more information about DoD 5015.2-STD, see Design Criteria Standard for Electronic Records Management Software Applications (http://www.dtic.mil/whs/directives/corres/pdf/50152std_061902/p50152s.pdf).

I/O affinity and snapshot backups

I/O affinity dedicates special, hidden schedulers to performing core buffer pool I/O and log writer activities. The I/O affinity configuration option was introduced in SQL Server 2000 SP1. The INF: Understanding How to Set the SQL Server I/O Affinity Option (http://support.microsoft.com/default.aspx?scid=kb;en-us;298402) Microsoft Knowledge Base article outlines the I/O affinity implementation.

Vendors can use Virtual Device (VDI)-based snapshot backups to perform actions such as splitting a mirror. To accomplish this, SQL Server must guarantee point-in-time data stability and avoid torn writes by freezing all new I/O operations and completing all outstanding I/Os for the database. The Windows Volume Shadow Copy Service (VSS) backup is one such VDI application.

For more information on VDI snapshot backups, see Snapshot Backups in SQL Server Books Online.

For more information on VDI Specifications, see SQL Server 2005 Virtual Backup Device Interface (VDI) Specification (http://www.microsoft.com/downloads/details.aspx?FamilyID=416f8a51-65a3-4e8e-a4c8-adfe15e850fc&DisplayLang=en).

The SQL Server 2000 design does not account for frozen I/O in combination with I/O affinity: a single database snapshot backup freezes I/O requests (reads and writes) for all databases. This makes I/O affinity and snapshot backups an unlikely combination for SQL Server 2000.

SQL Server 2005 supports I/O affinity as outlined in the article mentioned earlier (INF: Understanding How to Set the SQL Server I/O Affinity Option). It corrects the condition which could freeze all I/O requests when I/O affinity is enabled.

SQL Server I/O affinity is a very specialized, high-end server configuration option. It should only be used after significant testing indicates that it will result in performance improvements. A successful I/O affinity implementation increases overall throughput. The I/O affinity schedulers maintain reasonable CPU usage levels and effectively allow other schedulers access to increased CPU resources.

Locked memory pages

Both the 32-bit and 64-bit versions of SQL Server use the AWE API set to lock pages in memory, because VirtualLock, unlike AllocateUserPhysicalPages, is not guaranteed to keep all locked pages in memory. This can be confusing because the AWE sp_configure settings exist but are not used on 64-bit SQL Server; instead, the lock pages privilege determines when the AWE API set is used. The Windows “Lock Pages In Memory” privilege is necessary for SQL Server to make AWE allocations.

Lock Pages In Memory can have a positive effect on I/O performance and can prevent the trimming of the SQL Server working set.



Warning: Do not use AWE or locked pages without testing. Forcing strict, physical memory usage can result in undesired physical memory pressure on the system. System-level or hardware components may not perform well when physical memory pressure is present. Use this option with caution.

During an I/O operation, the memory for the I/O should not cause page faults. Therefore, if the memory is not already locked, it will be. Because most applications do not lock I/O memory, the operating system transitions the memory to a locked state. When the I/O finishes, the memory is transitioned back to an unlocked state.

SQL Server 2005 64-bit Enterprise Edition detects if the Lock Pages In Memory privilege is present and establishes a locked pages data cache. SQL Server 32-bit installations require enabling AWE memory options to establish the locked memory behavior. This avoids the lock and unlock transition during I/O, thereby providing improved I/O performance. To disable this behavior, remove the privilege or, for 64-bit installations, enable startup trace flag –T835.

Important: SQL Server always tries to avoid paging by releasing memory back to the system when it is possible. However, certain conditions can make this difficult. When pages are locked they cannot be paged. We recommend careful consideration of locked page usage.

It is not always possible for SQL Server to avoid paging operations. Locked pages can also be used as a troubleshooting technique to avoid paging, if you have reason to believe that memory damage may have occurred during a paging operation.


Idle server

SQL Server 2005 can provide an idle SQL Server process (sqlservr.exe) similar to the suspend/resume operations that are provided by the operating system. SQL Server 2005 introduces a per-node, system task called Resource Monitor. Resource Monitor’s primary responsibility is to watch core memory levels and then trigger cache adjustments accordingly. When the SQL Server 2005 Resource Monitor detects that there are no user-initiated requests remaining to process, it can enter the idle state. When a new user request arrives or a critical event must run, SQL Server wakes and handles the activities. The following table outlines some of the key event/request types and associated actions as they relate to the idle SQL Server process.


Event/Request Type

Description

User Request

A user request includes those actions initiated by a client application that require SQL Server to perform a service.

The following are common examples of user requests.



  • Batch request

  • SQL RPC request

  • DTC request

  • Connection request

  • Disconnect request

  • Attention request

These actions are also called external requests.

Active user requests prevent SQL Server from entering an idle state.

When SQL Server is in an idle state, a user request wakes the SQL Server process.


Internal Tasks

An internal request includes those actions that a user cannot issue a specific client request to trigger or control. For example:

  • Lazy writer

  • Log writer

  • Automatic checkpointing

  • Lock monitor

Internal tasks do not prevent SQL Server from entering an idle state. Only critical internal tasks can wake the SQL Server from an idle state.

Critical Event

Some SKUs of SQL Server respond to events such as memory notifications in order to wake the SQL Server process from an idle state and honor the event.

There are some basic rules the SQL Server process uses to determine whether it can enter an idle state.

The SQL Server idle state is not entered until:



  • No user requests have been active in 15 minutes.

  • The instance is not participating in database mirroring.

  • Service Broker is in an idle state.

SQL Server wakes in response to:

  • Memory pressure (except on SQL Server Express).

  • A critical internal task or event.

  • An external request such as a Tabular Data Stream (TDS) packet arrival.
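The entry and wake rules above can be summarized in a small sketch. The names are illustrative only, not SQL Server internals.

```python
def can_enter_idle(minutes_since_user_request, mirroring_active,
                   service_broker_idle):
    """All three conditions listed above must hold before the
    SQL Server process can enter the idle state."""
    return (minutes_since_user_request >= 15
            and not mirroring_active
            and service_broker_idle)

def wakes_from_idle(memory_pressure, critical_internal_task,
                    external_request, is_express_sku=False):
    """SQL Server wakes for memory pressure (except on SQL Server
    Express), a critical internal task or event, or an external
    request such as a TDS packet arrival."""
    if memory_pressure and not is_express_sku:
        return True
    return critical_internal_task or external_request

print(can_enter_idle(20, False, True))   # eligible for idle
print(wakes_from_idle(False, False, True))  # TDS arrival wakes the process
```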

The idle activity can change the way SQL Server interacts with the system. The following table outlines specific behavior related to the individual SKUs.


SKU

Behavior

SQL Server Express Service

  • By default can declare the server idle.

  • Triggers the operating system to aggressively trim the SQL Server working set memory using API SetProcessWorkingSetSize(…, -1, -1)

  • Does not wake in response to operating system memory pressure notifications.

  • Tries to enter an idle state immediately after service startup.

SQL Express Individual User Instance

  • By default can declare the server idle.

  • May be configured to allow for aggressive working set trimming.

  • Wakes in response to operating system memory pressure notifications.

Workgroup

  • By default can declare the server idle.

  • May be configured to allow for aggressive working set trimming.

  • Wakes in response to operating system memory pressure notifications

Standard
Enterprise
Developer


  • Idle server behavior requires a trace flag.

  • Wakes in response to operating system memory notification.

SQL Server tries to avoid using the page file whenever possible. However, it is not always possible to avoid all page file activity, and an idle SQL Server process may introduce aggressive use of the page file and increased I/O path usage. The paging activities require that memory regions be stored on and retrieved from disk. This opens the possibility of memory corruption if the subsystem does not handle the paging activity correctly.



Note: The Windows operating systems use storage and read-ahead design techniques for the page file that are similar to that which SQL Server uses for data and log files. This means that an idle SQL Server can experience I/O patterns during a paging operation similar to that of its own data file reads and writes. To guarantee data integrity, the page file should uphold I/O specifications suited for SQL Server database and log files.

SQL Server idle server activity can be controlled with the following trace flags.




Trace Flag

Description

8009

Enable idle server actions

8010

Disable idle server actions
Database mirroring (DBM)

SQL Server 2005 SP1 introduces the database mirroring (DBM) feature set. For more information about database mirroring solutions, see Database Mirroring in SQL Server 2005 Books Online.

Database mirroring is not specifically targeted as a remote mirroring solution. However, database mirroring addresses the necessary properties required to maintain the primary and mirror relationship, guarantee data integrity, and allow for both failover and failback to occur as outlined in Remote Mirroring earlier in this paper.

Database mirroring uses CRC checks to guarantee the integrity of data transmissions between the primary and the mirror. For new databases, SQL Server 2005 provides checksum capabilities for data pages and log blocks to strengthen data integrity maintenance. To enhance your ‘Always On’ solution, we recommend using the CRC transmission checks in combination with the database checksum capabilities to provide the highest level of protection against data corruption.

Multiple instance access to read-only databases

SQL Server 2005 introduces Scalable Shared Database support (SSD), enabling multiple SQL Server instances to use the same read-only database from a read-only volume. The setup details and other detailed information about SSD are outlined in the SQL Server 2005 Books Online Web Refresh and in the following Microsoft Knowledge Base article: Scalable Shared Databases are supported by SQL Server 2005 (http://support.microsoft.com/?kbid=910378).

SSD provides various scale out possibilities, including but not limited to the following:



  • Multiple server access

  • Separate server resources

  • Separate tempdb resources

Note: Multiple server, database file access is never supported for SQL Server databases in read/write mode. SQL Server SSD is never supported against writable database files.

Some I/O subsystems support features such as volume-level, copy-on-write (COW) snapshots. These I/O subsystem solutions monitor write operations on the primary (read/write) volume and save the initial data to the volume snapshot. The volume snapshot is presented as a read-only copy when mounted. The read-only, snapshot volume is a supported configuration for SQL Server SSD databases. Such implementations can provide a powerful tool for accessing live production data, read only, from many servers.

Some subsystems use distributed locking mechanisms instead of read-only volume snapshot capabilities. This allows multiple servers to access the same read/write volume. The distributed locking environments are very powerful but do not present a point-in-time image of the data to secondary servers. The live data is presented to all servers connected to the volume. SQL Server SSD is not supported against the writable database files. The SQL Server buffer pool cannot maintain point-in-time synchronization on secondary servers. This would lead to various problems because of the resulting dirty read activities.

Ramp up of local cache

The Enterprise Edition of SQL Server 2005 tries to ramp up a node’s local data cache during the node initialization phase. The initial memory growth committal of a node is considered the ramp-up period. During the ramp-up period, whenever a single page read is requested, the buffer pool turns the request into an eight-page (64-KB) request. The ramp-up helps reduce waits for subsequent single-page reads after a restart, enabling the data cache to be populated and SQL Server to return to cached behavior more quickly.
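As a rough model of the ramp-up behavior: the text specifies only that a single-page read becomes an eight-page request, so this sketch simply starts the larger read at the requested page; the exact placement is an assumption.

```python
PAGES_PER_RAMP_UP_READ = 8  # 8 pages * 8 KB = 64 KB

def ramp_up_read(page_id, ramp_up_active):
    """Sketch: during the ramp-up period a single-page read request is
    expanded into an eight-page (64-KB) read to repopulate the data
    cache more quickly. Outside ramp-up, only the requested page is
    read (ignoring normal read-ahead for simplicity)."""
    if not ramp_up_active:
        return [page_id]
    return list(range(page_id, page_id + PAGES_PER_RAMP_UP_READ))

# A request for page 21 during ramp-up becomes an eight-page request.
print(ramp_up_read(21, True))
print(ramp_up_read(21, False))
```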

A non-NUMA computer is considered to have a single-node implementation. On larger memory installations, the ramp-up can be a significant advantage because it repopulates the data cache more quickly. Each node is assigned to and monitored by a separate Resource Monitor task. SQL Server, including the buffer pool, is aware of the different nodes and the different memory requirements that may exist within them. This makes SQL Server fully NUMA aware.


Encrypted file systems (EFS)

SQL Server databases can be stored in NTFS encrypted files. However, use this feature with caution because encryption disables asynchronous I/O capabilities and leads to performance issues. When performing I/O on an EFS-enabled file, the SQL Server scheduler stalls until the I/O request completes. Actions such as SQL Server read-ahead are disabled for EFS files. We recommend using the built-in SQL Server 2005 encryption capabilities instead of EFS when possible.

Use of EFS for SQL Server should be limited to physical security situations. Use it in laptop installations or other installations where physical data security could be at risk.

If EFS must be deployed in a server environment, consider the following:


  • Use a dedicated SQL Server instance.

  • Test throughput limitations well.

  • Test with and without the I/O affinity mask. Using I/O affinity may provide a pseudo-async capability to SQL Server.
DiskPar.exe

The diskpar utility provides alignment capabilities that prevent misalignment performance problems. Systems running SQL Server should be properly aligned on boundaries to help optimize performance. I/O subsystem manufacturers have established recommendations for proper alignment in a SQL Server environment.

This is the same recommendation as for Microsoft Exchange Server. The following is an excerpt from the Microsoft Exchange documentation and is applicable to SQL Server.

“Even though some storage obfuscates sectors & tracks, using diskpar will still help by preventing misalignment in cache. If the disk is not aligned, every Nth (usually 8th) read or write crosses a boundary, and the physical disk must perform two operations.

At the beginning of all disks is a section reserved for the master boot record (MBR) consuming 63 sectors. This means that your partition will start on the 64th sector, misaligning the entire partition. Most vendors suggest a 64-sector starting offset.

Check with your particular vendor before standardizing this setting on a particular storage array.”
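The effect the excerpt describes can be checked with simple arithmetic. The sector size and stripe-unit size below are assumptions typical of such configurations, not values from the text.

```python
SECTOR_SIZE = 512  # bytes; assumed typical sector size

def boundary_crossings(partition_offset_sectors,
                       io_size_bytes=8 * 1024,
                       stripe_unit_bytes=64 * 1024,
                       io_count=8):
    """Count how many of io_count consecutive I/Os, starting at the
    partition offset, straddle a stripe-unit boundary. A straddling
    I/O costs two physical disk operations instead of one."""
    start = partition_offset_sectors * SECTOR_SIZE
    crossings = 0
    for i in range(io_count):
        begin = start + i * io_size_bytes
        end = begin + io_size_bytes - 1
        if begin // stripe_unit_bytes != end // stripe_unit_bytes:
            crossings += 1
    return crossings

# The default 63-sector MBR offset misaligns the partition: one of
# every eight 8-KB I/Os crosses a 64-KB boundary. A 64-sector offset
# aligns the partition and eliminates the crossings.
print(boundary_crossings(63))
print(boundary_crossings(64))
```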

Always On high-availability data storage

SQL Server 2005 introduces the Always On storage review program for high-availability solutions. Various manufacturers have reviewed their solutions against Microsoft SQL Server requirements and have published detailed white papers about how their solutions can be used with SQL Server to maintain the Always On goal. For more information, see SQL Server Always On Storage Solution Review Program (www.microsoft.com/sql/AlwaysOn).
SQLIOSim

SQLIOSim replaces SQLIOStress. It is used to test SQL Server I/O patterns without requiring that SQL Server be installed. The tests do not use actual SQL Server database and log files but simulate them instead. This utility greatly extends testing capabilities by enabling control over memory footprint, multiple files per database, shrink and grow actions, files larger than 4 GB and other options.

We recommend using SQLIOSim to test the system before you install SQL Server. This helps improve data safety.



Note: If SQL Server reports corruption or other I/O subsystem error conditions, back up your data and then run the SQLIOSim testing utility to test the I/O subsystem, in addition to running other hardware check utilities provided by your hardware manufacturer.

Important: SQLIOSim and SQLIOStress are also used by the Microsoft Hardware Compatibility Labs and by various vendors to make sure that the I/O subsystem is SQL Server I/O pattern compliant. If SQLIOSim or SQLIOStress return errors, this clearly indicates that the I/O subsystem is not performing at an HCL-compliant level. This could lead to severe data damage or loss.

Conclusion


For more information:

http://www.microsoft.com/technet/prodtechnol/sql/default.mspx

Did this paper help you? Please give us your feedback. On a scale of 1 (poor) to 5 (excellent), how would you rate this paper? Send your rating to sqlfback@microsoft.com with the subject "Feedback: [Paper Title]".


References


There are many aspects to consider when you are setting up the I/O subsystem. This section provides a list of documents that you might consider reading.

For updated I/O details, you can also visit the http://www.microsoft.com/sql/support Web page.



SQL Server Always On Storage Solution Review Program

  • http://www.microsoft.com/sql/AlwaysOn

Certification Policy

  • KB913945 - Microsoft does not certify that third-party products will work with Microsoft SQL Server

  • KB841696 - Overview of the Microsoft third-party storage software solutions support policy

  • KB231619 - How to use the SQLIOStress utility to stress a disk subsystem such as SQL Server

Fundamentals and Requirements

  • White paper - SQL Server 2000 I/O Basics (applies to SQL Server versions 7.0, 2000, and 2005)

  • KB230785 - SQL Server 7.0, SQL Server 2000 and SQL Server 2005 logging and data storage algorithms extend data reliability

  • KB917047 - Microsoft SQL Server I/O subsystem requirements for the tempdb database

  • KB231347 - SQL Server databases not supported on compressed volumes (except 2005 read only files)

Subsystems

  • KB917043 - Key factors to consider when evaluating third-party file cache systems with SQL Server

  • KB234656 - Using disk drive caching with SQL Server

  • KB46091 - Using hard disk controller caching with SQL Server

  • KB86903 - Description of caching disk controls in SQL Server

  • KB304261 - Description of support for network database files in SQL Server

  • KB910716 (in progress) - Support for third-party Remote Mirroring solutions used with SQL Server 2000 and 2005

  • KB833770 - Support for SQL Server 2000 on iSCSI technology components (applies to SQL Server 2005)

Design and Configuration

  • White paper - Physical Database Layout and Design

  • KB298402 - Understanding How to Set the SQL Server I/O Affinity Option

  • KB78363 - When Dirty Cache Pages are Flushed to Disk

  • White paper - Database Mirroring in SQL Server 2005

  • White paper - Database Mirroring Best Practices and Performance Considerations

  • KB910378 - Scalable shared databases are supported by SQL Server 2005

  • MSDN article - Read-Only Filegroups

  • KB156932 - Asynchronous Disk I/O Appears as Synchronous on Windows NT, Windows 2000, and Windows XP

Diagnostics

  • KB826433 - Additional SQL Server Diagnostics Added to Detect Unreported I/O Problems

  • KB897284 - SQL Server 2000 SP4 diagnostics help detect stalled and stuck I/O operations (applies to SQL Server 2005)

  • KB828339 - Error message 823 may indicate hardware problems or system problems in SQL Server

  • KB167711 - Understanding Bufwait and Writelog Timeout Messages

  • KB815436 - Use Trace Flag 3505 to Control SQL Server Checkpoint Behavior

  • KB906121 - Checkpoint resumes behavior that it exhibited before you installed SQL Server 2000 SP3 when you enable trace flag 828

  • WebCast - Data Recovery in SQL Server 2005

Known Issues

  • KB909369 - Automatic checkpoints on some SQL Server 2000 databases do not run as expected

  • KB315447 - SQL Server 2000 may be more aggressive with Lazy Writers than SQL Server 7.0

  • KB818767 - Improved CPU Usage for Database Logging When Transaction Log Stalls Occur

  • KB815056 - You receive an "Error: 17883" error message when the checkpoint process executes

  • KB915385 - A snapshot-based database backup restore process may fail, and you may receive an error message in SQL Server 2005

  • Support Assistance (http://www.microsoft.com/sql/support)

Utilities

  • Download - SQLIO Disk Subsystem Benchmark Tool

  • Download - SQLIOStress utility to stress disk subsystem (applies to SQL Server 7.0, 2000, and 2005 - replaced with SQLIOSim)

