Determining the required capacity of an I/O subsystem before deploying the database
It is a good practice to determine the required capacity for a database application before it is deployed on the Production environment. As it speaks the I/O system is "most" important to the performance of SQL Server, in the event of any performance loss or even configuring a new server for SQL Server you would need to assess the disk space requirement and I/O capacity to address the needs.
Using number of tools that are available for performing this type of testing is one way to go, for specific to SQL SErver I would recommend to use SQLIO.exe tool that covers basic I/O configuration best practices for SQL Server 2005. Take into consideration of combining the I/O sizes for READ/WRITE with SEQUENTIAL/RANDOM methods of I/O. By default choosing 128KB sequential I/O is better to larger databases. But there are differences in measuring the performance of sequential I/O versus measuring the performance of random I/O. The important unit of measurement for sequential I/O is throughput (MB/s) whereas for random I/O, the important unit of measure is the number of I/Os per second.
By Default OLTP based applications demands high level of I/O operations for SQL Server, so better to go with random in nature for both reads and writes issued against data. Write operations will always take upper hand than Read operations due to the checkpoint occurrence as the default frequency within SQL Server or backup operations as per the recovery settings. Similarly the log writes that are sequentially default will a differentiaing size depending upon the nature of processes & activities on that OLTP database. This works as opposite within DSS based systems where both Reads & Writes are sequential with a better table indexes scan, but this needs a care during the bulk load operations as it tends to vary in I/O sizes that are larger than 8KB of data size. Use the appropriate RAID technology to ensure that your disk array is fault tolerant. Be as proactive as possible in replacing failed or marginal disks, if in doubt you can take help from Vendors. A disk failure will put a greater load on the remaining disks in an array and may cause other marginal disks to fail.
A number of tools are available to stress and validate the functionality of I/O subsystems. Some of these tools actually simulate SQL Server I/O patterns and others allow you to control the I/O patterns submitted to the subsystem. There are tools that confirm the functional validity of a configuration and others that are used purely to test performance capacity. The following table compares the most common tools currently used to test I/O subsystems in the context of SQL Server. So whenever you perform such testing it is better to size the test files similar to the size of the database being deployed. So in this regard you need to make sure that the total size of the test files used in each test is significantly larger than the amount of cache on the storage array. The issue with a small test files or say 10% of actual file will result in skewed results since the entire file may be cached on the array. Due to the nature these small files could result in only a small portion of the disk being utilized, resulting in lower seek times (commonly referred to as short stroking the disk) and ultimately skewing the results.
One of the best practice is to perform the test that remain memory resident on the array can be a technique that is used to determine the capacity of other components when there are no disk-based bottlenecks. By using this approach, you can quickly determine if you will be able to realize all of the theoretical bandwidth between array and host. This is more high-end methods of obtaining the test results with a theoretical bandwidth, which will differ to environment to environment. When you have started the collection of test data ensure to store all of the benchmark data in order to compare it with the SQL Server I/O throughput numbers. Even better to include the Vendor for your Storage to determine if your results are what would be expected for your particular storage configuration. Simulate the testing phase such as busy times and lesser-busy times. The default nature of the larger storage systems utilize a self-tuning cache and may require that a workload be run for a period of time before an optimal cache performance is reached. Further to this theory refer to the below links:
Storage Top 10 Best Practices.
Deployments and Tests in an iSCSI SAN
SQLIO.EXE is best used tool so far (IMHO) to validate performance and ensure that the system is tuned optimally for SQL Server before deployment helps identify hardware or I/O configuration related issues before the complexity of SQL Server is introduced. Testing a variety of I/O types is necessary because performance problems can be specific to a certain type of I/O or pattern of I/O.
**__________________________________**
SQL Server MVP, Sr. DBA & industry expert.
-
Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it. It is also a power and you will gain by sharing it.