What does it cost you to compress your data online?

· Uncategorized

Where there is often lots of info on compression available everywhere, it’s often much harder to find info what the cost is related to online vs offline page compression. So I did a test.

I have a 10GB filegroup containing a 10GB heap on it (no indexes), data and log files have autogrowth set to 1 GB. Simple recovery model.
This is my table structure:
c_id int identity
, c_guid uniqueidentifier default newid()
, c_date datetime2  default DATEADD(mi, @@identity, '19900101')
, c_int int default rand()*100000000
, c_bigint bigint  default rand()*10000000000
, c_num numeric(19,4)  default rand()*10000
, c_char char(5) default char(rand()*255)+char(rand()*255)+char(rand()*255)+char(rand()*255)+char(rand()*255)
, c_varchar varchar(300) default REPLICATE(char(rand()*255), rand()*300)

45 million records were inserted using the default values defined in the table definition, this ‘rather well’ designed table has one major drawback: compressionrate will only be 10%.
So I will first perform an offline compression, restore the DB in its original state and perform an online compression. During this online process, no other transaction were active.
USE [My20GbDB]
(DATA_COMPRESSION = PAGE, online=off/on

Here are the results from the Belgian jury:

One advantage, the duplication of the full table takes extra diskspace but allows you to stop the process immediately and not having to wait for hours for a rollback to complete.
Conclusion: you better have good reasons to run a compression online.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: