Delete Old Tables on a Schedule

Update: turns out this job doesn’t actually execute (d’oh!) so I made a stored proc. Read the updated post by clicking this linky

One of our databases is exclusively used for staging tables of raw data, before it is extracted and transformed by a SSIS package and inserted into our data warehouse. These tables are large and pretty much useless after one run. The database has over 1.5TB of storage, but even after a few days of heavy month end data loading, it can fill up quickly. There’s no real warning for when the database does fill up either, as we have expanded the Files in the Filegroup to fill out the disks, so we don’t slow down the process from autogrow. Because these tables are so throwaway, we have a SQL Agent job that deletes anything older than three days at midnight. There’s pros and cons at running it at this time, but seeing as our data-loading window is 24/7 it seems as good a time as any.

As the DROP TABLE statement won’t accept a parameter, you can’t do DROP TABLE @tablename. You can however build a string of a complete T-SQL statement and EXECUTE it. I use a CURSOR, with READ_ONLY, to loop through the SELECT statement of the tables I want to delete. The READ_ONLY keyword allows for a faster and safer execution as no lock is required; we are only running a select statement. My select is pretty simple; using sys.sysobjects I get a list of the tables that were created more than 72 hours old.

USE <db_name>
GO
DECLARE @tableName VARCHAR(100)
DECLARE @sql VARCHAR(max)
DECLARE db_cursor CURSOR READ_ONLY FOR
SELECT name AS tableName
FROM sys.objects
WHERE create_date > DATEADD (hh, -72,  CAST(GETDATE() AS datetime))

OPEN db_cursor
FETCH NEXT FROM db_cursor INTO @tableName

WHILE @@FETCH_STATUS = 0
BEGIN
SET @sql = 'DROP TABLE ' + @tableName
EXEC (@sql)
--PRINT @sql

FETCH NEXT FROM db_cursor INTO @tableName
END
CLOSE db_cursor
DEALLOCATE db_cursor

We need to be careful when just getting a bunch of names for tables and dropping them that way; if you put the hour at a sufficiently high number you will find all manner of sys tables listed there that you really don’t want to delete. There’s two ways I thought about dealing with this:

sqlAgent6

  • Add an “AND” clause to the “SELECT” to filter out anything you might not want to delete. As these are always staging tables for the same load, I was able to narrow it down by writing the following:

SELECT name AS tableName
FROM sys.objects
WHERE create_date > DATEADD (hh, -72,  CAST(GETDATE() AS datetime))

and name LIKE 'TempRun%'

  • Alternately, narrowing down the search to only tables that are defined as user type tables.

SELECT name AS tableName
FROM sys.objects
WHERE create_date > DATEADD (hh, -72,  CAST(GETDATE() AS datetime))

and type = 'U'

It’s surprising to see how different the Query Execution Plan is for either of these, and I was surprised to see which one was theoretically quicker (even though in the case of either of these queries, speed is not really an issue). There’s a print option commented out which you can switch in and comment out the EXECUTE @SQL line if you want to have a play around at first.

Once you’ve chosen which method you want to do it’s a case of creating a new SQL Agent Job. I’ve added the steps here in a gallery for completeness. Click on the first image to scroll through the gallery.

Author: Richie Lee

Full time computer guy, part time runner. Full time Dad, part time blogger. Pokémon Nut. Writer of fractured sentences. Maker of the best damn macaroni cheese you've ever tasted.

1 thought on “Delete Old Tables on a Schedule”

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s