Welcome to

SqlServer-QA.net

Sign in | Join | Help

SQL Server 2008 (SSQA.net)

Trusted, Productive and Intelligent enterprise data platform.
SQL Server 2008 Data Compression, what it is - how does it work?

As you know data compression facility is added onto the SQL Server 2008 version, though this is not a new feature and I would call it as an enhanced feature in SQL Server. This subject of data compression has already been introduced in SQL Server 2005 version which is usage of VARDECIMAL with having table-level option can be used with decimals & numeric fields data, so this has been stretched further in 2008 version to the users.

So this compression is a big leap in 2008 and different to normal compression formats, say if you have enabled data compress on a table or index then underlying row/page format will be different in terms of ROW & PAGE value that is stored in the engine. This value of compression can only be useful/available within SQL 2008 storage engine and that will not be available for previous versions. Going further such page is compressed separately, with the prefixes and dictionaries stored in the page itself.

Compression does not occur until the page is nearly full, as pages are allocated atomically and compressing a half a page to a quarter page wouldn't gain anything. Bear in mind about performance discounts in using this row/page compression, while selecting data from these fillers CPU utilization & I/O will go high along with heavy memory usage. Though I haven't had a chance to compare such utilization with the data compression performance to that of SQL Server 2005, so I can't directly say whether or not it was faster or produced smaller compressed files. In general, though, data compression enables data in a database to be stored more effectively and reduces the storage requirements for your data. Data compression also provides significant performance improvements for large I/O-bound workloads, getting data to and from the database more quickly. 

Always there will be a trade-off between achieving highest performance and sensible usage of data compression. Still I have to study further on this feature to share my experience, meanwhile for further information on this feature follow the below links:

SQL2005-2008_DataCompression by Kimberly Tripp

TypesOfDataCompression-SQL2008 by Sunil Agarwal

WMV - LiveMeeting-Presentation on SQL Server 2008 Data Compression

Data-Capture_with_Data-compression

 

 

Posted: Monday, December 03, 2007 6:54 AM by SQL Master

Comments

SSQA.net - SqlServer-QA.net said:

As you know data compression facility is added onto the SQL Server 2008 version, though this is not a

# December 3, 2007 7:48 AM

Other SQL Server Blogs around the Web said:

As you know data compression facility is added onto the SQL Server 2008 version, though this is not a

# December 3, 2007 3:42 PM

Tom Russ said:

Oh, compression is that kind of technology that a DBA could sometimes call magic. And I really enjoy using SQL server 2008 in my testlab environment. But I'd say it even more that I can implement backup/restore compression right in SQL 2005. It can be surprising but an overwhelming number of my fiends and colleagues don't have any clue that it's possible to save both the space and operation completion time by implementing compression right using SQL 9.  SQL server 2008 supports on-the-fly row-level compression when it compresses the rows as you write them to the database and decompresses back as you retrieve the data. That's really a great feature and has its value but as a person who administers several SQL servers I'd like to note that probably the first thing that an administrator would start thinking if you ask him about his desire to implement some technology in his environment would be data protection. Data protection is considered to be of two types. You are protected if you have implemented a procedure that allows you roll back changes or quickly restore data from every point in your environment. But you are twice as protected if you've implemented encryption in addition to backup procedure. If you don't backup data you are playing with fire. But if you have implemented backup you still haven't solved the tasks because you should also be thinking about taking this in a safe place and making the procedure reliable. A reliable procedure is a procedure that allows you to be able to implement restore seamlessly to your environment. That is you have to worry about redundancy. When you get a redundancy you immediately have task to implement compression. While the cost of disk space goes down the amount of data rises up and it doesn’t mean that having cheap SATA backup arrays you freed yourself from a space problem. Here's where backup compression plays a great value. However, some of us still continues to implement old schemes of data backup that do not work anymore with that amount of data that we host on our SQL servers today. I remember we used something like scripts implementing zip compression over the backup data. That seemed to be reliable those days. Today when I can implement automatic backup, compression and encryption for every database on all instances on my SQL server farm with tools like Scriptlogic's Litespeed http://scriptlogic.com/products/litespeed/ that I use for my data protection procedure I can only smile on how naive I was those days when I used those scripts. As you've said I/O operations goes hi but don't forget the less size you process the lesser you have to read and the faster your I/O operations are proceeded. I'd say that I don't see any dramatic increase in memory usage and this could be because of the compression techniques implemented in Litespeed (or SQL that Litespeed uses). CPU load rises dramatically if you count a relative rise. But if you take absolute percentage and count how many resources are still free when you compress the data you'd call compression CPU load just an approach to use your CPU power effectively. If the most of CPU power goes to a Idle process it's doubted an effective process. Another thing is how quickly you can implement this. If you compress the data pretty quick and transfer it even more quicker but spend a 70-100% more on implementing this it's again not effective. That's why I'd say that data management matters here. With Litespeed I can control backup right from a special snap-in from a remote place so that I don’t  have to worry about how do I script it to run on the DB server and then upload the backed data to a file server because I can build a plan and distribute backup load through the servers selecting different techniques and tuning compression parameters for every single backup job. Then it will be automatically uploaded to a remote storage. If I wish I can then request information about how many backups  I had, which of them failed and so on. The same way I like the easy of performing SQL operations and queries with SQL management studio the same way I like the procedure of making backups with Scriptlogic's Litespeed. Take encryption. That’s so easy to select a needed algorithm and make the backup secure. Then I even can put it on a DVD and send it with a courier so that I am sure nobody would be able to restore it without knowing the password. Having such crypto secure algorithms as AES-256 I can sleep soundly as I am sure my databases will be automatically compressed, encrypted and backed up to a remote location on a nightly basis. Thus you can fine-tune your compression and encryption setting and find that solution that works best with every particular database.

# February 13, 2008 9:46 AM

SQL Master said:

check this comment - awesome

# June 8, 2008 4:41 PM
Anonymous comments are disabled