Defaulting Data Compression on Creating Indexes

Here’s a hypothetical situation:

  • You have a vendor database that is used solely for staging very large tables, and the tables are created and dropped regularly.
  • The tables can exist for a few hours or for days.
  • The database has only one type of table, with a few indexes created on it, none of which are compressed.
  • You cannot change the code.
  • However, when compressed, these tables/indexes can save up to 90% of their space.
  • With disk space at a premium, the idea that we can allocate a disk of a few hundred GB as opposed to several TB is very appealing to all concerned.
  • Compression can possibly even make the process run faster.

So how can we add compression? One solution is to use DDL triggers. DDL triggers fire in response to DDL statements such as CREATE, ALTER, DROP, GRANT, DENY, REVOKE and UPDATE STATISTICS. So by creating a DDL trigger that fires on CREATE INDEX and rebuilds the new index with page compression, we can get optimal compression on the tables.
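
If you’ve not worked with DDL triggers before, a quick way to see what one has to work with is to print the EVENTDATA() XML from inside the trigger. A throwaway sketch (the trigger name is mine) whose output shows the SchemaName, TargetObjectName and ObjectName nodes that the main trigger further down reads:

-- throwaway trigger: prints the raw EVENTDATA() XML so we can see
-- the nodes (SchemaName, TargetObjectName, ObjectName) available to us
CREATE TRIGGER trgr_ShowEventData
ON DATABASE
AFTER CREATE_INDEX
AS
BEGIN
 PRINT CONVERT(NVARCHAR(MAX), EVENTDATA());
END
GO
-- create any index, check the Messages tab, then tidy up:
DROP TRIGGER trgr_ShowEventData ON DATABASE
GO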

Clearly, this is very situational:

  • Depending on the data types in the object, sometimes compression can be added and the space savings can be nothing, resulting in wasted clock cycles.
  • Again subject to the data types in the object, page compression offers greater compression than row compression, but sometimes it compresses no further than row compression would. So the “one size fits all” option of applying page compression indiscriminately can be a bit of a waste (the sketch after this list shows how to estimate the savings of each up front).
  • In fact, there are lots of considerations with regard to data compression.
  • If an index is created on a non-empty table, the CREATE INDEX operation will appear to take longer, because the trigger immediately rebuilds the index with page compression.
  • Adding compression to a non-empty table is a very resource-intensive operation, so on a large table it could throttle the instance. Typically, adding compression in this scenario is best done during an outage or during a very quiet time.
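
On that second point, rather than guessing, SQL Server can estimate the savings for you: sp_estimate_data_compression_savings samples the object into tempdb and reports its projected size under a given compression setting. A minimal sketch against the demo table created below (note that before SQL Server 2016 SP1, data compression was an Enterprise-only feature):

-- estimate how much space page compression would save on the demo table
EXEC sp_estimate_data_compression_savings
 @schema_name = 'dbo',
 @object_name = 'CompressTest',
 @index_id = NULL, -- NULL means all indexes on the table
 @partition_number = NULL, -- NULL means all partitions
 @data_compression = 'PAGE'; -- run again with 'ROW' to compare
GO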

But at any rate, despite all these warnings, the script below includes a trigger that fires on the CREATE_INDEX event and rebuilds the new index with page compression, first checking that the index is either clustered or non-clustered.


--create a dummy table to play around with
CREATE TABLE CompressTest
(
 Col1 INT,
 Col2 VARCHAR(50),
 Col3 INT
);
GO

-- DDL trigger; fires after a "create index" operation,
-- checks that the new index is clustered or non-clustered,
-- and if so immediately rebuilds it with page compression
CREATE TRIGGER trgr_CompressIndexPostCreate
ON DATABASE
AFTER CREATE_INDEX
AS
BEGIN
 SET NOCOUNT ON;
 DECLARE @TableSchema sysname;
 DECLARE @TableName sysname;
 DECLARE @IndexName sysname;
 DECLARE @sql NVARCHAR(1024);
 SELECT @TableSchema = EVENTDATA().value('(/EVENT_INSTANCE/SchemaName)[1]', 'sysname');
 SELECT @TableName = EVENTDATA().value('(/EVENT_INSTANCE/TargetObjectName)[1]', 'sysname');
 SELECT @IndexName = EVENTDATA().value('(/EVENT_INSTANCE/ObjectName)[1]', 'sysname');
 IF EXISTS (SELECT 1
            FROM sys.tables t
            INNER JOIN sys.indexes i ON i.object_id = t.object_id
            WHERE t.name = @TableName
            AND i.name = @IndexName
            AND i.type IN (1, 2)) --clustered or non-clustered
 BEGIN
  -- note the spaces around ' ON ': without them the dynamic SQL is invalid
  SET @sql = 'ALTER INDEX ' + QUOTENAME(@IndexName) + ' ON ' +
             QUOTENAME(@TableSchema) + '.' +
             QUOTENAME(@TableName) + ' REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE);';
  EXEC (@sql);
 END
END
GO

-- create clustered index to test

CREATE CLUSTERED INDEX [ClusteredIndex_col1] ON [dbo].[CompressTest]
(
 [Col1] ASC
)
GO

-- create non-clustered index to test
CREATE NONCLUSTERED INDEX [NonClusteredIndex_Col2] ON [dbo].[CompressTest]
(
 [Col2] ASC
)
GO

-- how we check this: list compressed clustered/non-clustered indexes
SELECT DISTINCT t.name AS table_name, i.name AS index_name, p.data_compression_desc
FROM sys.tables t
INNER JOIN sys.indexes i ON i.object_id = t.object_id
INNER JOIN sys.partitions p ON p.object_id = i.object_id
                           AND p.index_id = i.index_id --join on index_id too, or rows cross-match
WHERE i.type IN (1, 2) --clustered or non-clustered
AND p.data_compression > 0;

--tidy up, because leaving triggers around is rarely a good idea
DROP TABLE CompressTest

DROP TRIGGER trgr_CompressIndexPostCreate ON DATABASE

As I’ve already said, there is a risk in adding this to any and all databases:

  • I would not recommend adding this to every vendor database you have, like SharePoint or any of the TFS databases.
  • Test, test, and test again.
  • It also does not check whether the index was already created with compression. In our scenario this is not a problem, but one way to add that check is to search the CommandText in the EVENTDATA() for “DATA_COMPRESSION”, and if it is present, skip altering the index (sketched after this list).
  • This does not work for heaps, and if the clustered index is dropped, the underlying table is left with no compression.
  • And of course, it is possible to specify data compression when creating indexes, so if you manage the database code, then you can manage compression via source control.
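
For completeness, here is a sketch of that “already compressed?” guard: it reads the statement that fired the trigger out of EVENTDATA() and bails out if compression was already specified. It would sit at the top of the trigger body, before the ALTER INDEX is built:

-- read the CREATE INDEX statement that fired this trigger
DECLARE @cmd NVARCHAR(MAX) =
 EVENTDATA().value('(/EVENT_INSTANCE/TSQLCommand/CommandText)[1]', 'NVARCHAR(MAX)');
-- if the statement already chose its own compression, leave it alone
IF @cmd LIKE '%DATA_COMPRESSION%'
 RETURN;

The heap case would need a different event entirely, since a CREATE_INDEX trigger never sees a heap; there, ALTER TABLE ... REBUILD WITH (DATA_COMPRESSION = PAGE) is the equivalent operation.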

Author: Richie Lee

