Moving Tables that use Partitioning and Compression Across File Groups

Intro

Last week there was an article on SQL Server Central about moving large tables across filegroups. This was something that I have had to do a few times over the past few years, however the data warehouse in particular was on Enterprise, so it used page compression and partitioning. For data security compliance, we store the client data on separate filegroups, which in turn have two separate files on two separate disks. For some of the older clients, the data needed moving as older data had been moved out and we had large files within the filegroups that had plenty of space on the disk, which we were fast running out of for other filegroups. Below is how I moved the data:

First I set the trace flag 1117 on so that the two files grow at the same size:


DBCC TRACEON (1117,-1)

Then we add a temporary file group and two files to the file group. Ideally the size for the files should be large enough to hold the data without auto-growth as it will make the query run faster; as these are temporary files it doesn’t really matter about the fragmentation.


USE Sales

ALTER DATABASE Sales ADD FILEGROUP [FGSecondaryTemp]
ALTER DATABASE Sales ADD FILE
(NAME = N'Secondary_Temp_F03', FILENAME = N'P:\SQLData\SecondaryTemp3.ndf', SIZE = 110GB, FILEGROWTH = 10GB)
TO FILEGROUP [FGSecondaryTemp]
ALTER DATABASE Sales ADD FILE
(NAME = N'Secondary_Temp_F04', FILENAME = N'Q:\SQLData\SecondaryTemp4.ndf', SIZE = 110GB, FILEGROWTH = 10GB)
TO FILEGROUP [FGSecondaryTemp]

Several of the tables were heaps as we only read once when loading into the cube. So we create a clustered index with moving the data to the new file group. Clustered indexes define how the data is stored physically, so this is the best way to move the data. As we want to keep the table as a heap we drop the index at the end of the move.


CREATE CLUSTERED INDEX [IX_NorthWindTemp] ON [Sales].[Invoice_NorthWind] ([Id] ASC) ON FGSecondaryTemp;
DROP INDEX [IX_NorthWindTemp] ON [Sales].[Invoice_NorthWind]  WITH ( ONLINE = OFF );

But for tables that do have clustered indexes, obviously you can only have one clustered index on a table. So first you drop the index and specify the WITH MOVE option to the new file group. After which you add the index back on to the table.


ALTER TABLE [Sales].[Product_NorthWind] DROP CONSTRAINT [PK_ProductNorthWind] WITH (MOVE TO FGSecondaryTemp)
ALTER TABLE [Sales].[Product_NorthWind] ADD CONSTRAINT [PK_ProductNorthWind] PRIMARY KEY ([Id] ASC)

So, assuming that we have moved all of the tables of the file group the files can be shrunk and deleted. If there are any tables left then you will be able to shrink and empty one file, but not the other (if you only have one file, then it will fail).


DBCC SHRINKFILE (Secondary_F01, EMPTYFILE)
ALTER DATABASE Sales REMOVE FILE Secondary_F01

DBCC SHRINKFILE (Secondary_F02, EMPTYFILE)
ALTER DATABASE Sales REMOVE FILE Secondary_F02

Now that we have moved over to a temporary file group, really we want to re-create the files and place them on the file group that now has no files associated to it (you might be wondering why, but the move options specified below will clear that up). So we run the queries all over again but with a few changes:


ALTER DATABASE Sales ADD FILE
(NAME = N'Secondary_F01', FILENAME = N'P:\SQLData\Sales_Secondary_F01.ndf', SIZE = 110GB, FILEGROWTH = 10GB)
TO FILEGROUP [FGSecondary]

ALTER DATABASE Sales ADD FILE
(NAME = N'Secondary_F02', FILENAME = N'Q:\SQLData\Sales_Secondary_F02.ndf', SIZE = 110GB, FILEGROWTH = 10GB)
TO FILEGROUP [FGSecondary]

Once again, we create the clustered index, but instead of referencing the file group, we reference the partition schema (which is still on the original file group, hence the reason why we did not drop the file group). And because the file group has the new files created, it will move the data from the FGSecondaryTemp to the original file group, but on the new files. Finally we rebuild the partition schema.


CREATE CLUSTERED INDEX [IX_BCNorthWindTemp] ON [Sales].[Invoice_Secondary] ([Id] ASC) ON [pscNorthWind]([Id]);
DROP INDEX [IX_BCNorthWindTemp] ON [Sales].[Invoice_NorthWind] WITH ( ONLINE = OFF );
ALTER TABLE [Sales].[Invoice_NorthWind] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE)

With regards to the table that already requires a clustered index we drop the index, moving to the file group. Then, add the index and specify the name of the partition schema. Finally, once again, we rebuild the partition schema, specifying the data compression.


ALTER TABLE [Sales].[Product_NorthWind] DROP CONSTRAINT [PK_ProductNorthWind] WITH (MOVE TO [FGSecondary])
ALTER TABLE [Sales].[Product_NorthWind] ADD CONSTRAINT [PK_ProductNorthWind] PRIMARY KEY ([Id] ASC) ON [pscNorthWind]([Id]);
ALTER TABLE [Sales].[Product_NorthWind] REBUILD PARTITION = ALL WITH(DATA_COMPRESSION = PAGE )

Now that all the tables have been moved over we can shrink and drop the files and file group that were used for temporary storage.


DBCC SHRINKFILE (Secondary_Temp_F03, EMPTYFILE)
ALTER DATABASE Sales REMOVE FILE Secondary_Temp_F03
DBCC SHRINKFILE (Secondary_Temp_F04, EMPTYFILE)
ALTER DATABASE Sales REMOVE FILE Secondary_Temp_F04
ALTER DATABASE Sales REMOVE FILEGROUP FGSecondaryTemp

It’s a similar process if there were no compression or partitions, but failure to specify them leaves you with an uncompressed, unpartitioned table which will fail any imports.

Notes

Here are some queries that I’ve used to troubleshoot any issues with the moves as occasionally things don’t go according to plan, particularly if this is a long running query. These are some queries I’ve had saved for a long time; credit to the guys who figured these out!

Check to see if the filegroup is in the partition scheme list. If the schema id is NULL but the file group still appears then chances are a table still exists on the temporary filegroup. Note, if the table has no data stored in it it can still exist on the file group even if all files have deleted. I know that sounds like nonsense but I have seen this happen. I’d hazard a guess that if hte table did have data in it then you would not be able to remove one of the files. (There are situations that you might want a table even if it is empty).


select
partition_scheme_id,destination_id,a.data_space_id,name,b.data_space_id
from sys.destination_data_spaces a right join sys.filegroups b
on a.data_space_id = b.data_space_id

Following on from the query above, this determines which partition function and scheme is being used by a given file group (take the data space id from the query above and replace in the “where” statement.) Again, if the query above did not return a data space id then chances are there are no issues with partitions.


select f.name as [Function Name], f.type_desc,
s.name as [Scheme Name], s.type_desc
from sys.partition_schemes s join sys.partition_functions f
on s.function_id = f.function_id
where s.data_space_id = 65602

This query determines which objects are stored on which file group:


SELECT o.[name], o.[type], i.[name], i.[index_id], f.[name]
FROM sys.indexes i
INNER JOIN sys.filegroups f
ON i.data_space_id = f.data_space_id
INNER JOIN sys.all_objects o
ON i.[object_id] = o.[object_id]
WHERE i.data_space_id = f.data_space_id
AND o.type = 'U'
GO

References for queries:
Solving “The filegroup ” cannot be removed because it is not empty” error

SQL SERVER – List All Objects Created on All Filegroups in Database

Some observations from moving data in SQL:

  • Moving lots of data around in SQL can take a considerable amount of time, and will lock the tables and partition schemas, so be sure to test your process out as best you can and get an idea of time scales.
  • We don’t bother with rebuilding the partitions and adding page compression because it is a temporary move; it’ll only take time and we’ll have to do it all over again when we move back.
  • It does seem wasteful to move it, then move again, but it’s the safest way to move the data.
  • It’s important on the final move to move a table, then reorganise and compress each table as opposed to moving all the data then compressing all the tables. This is because uncompressed tables will take up a lot more space than what will be required at the end of the whole process. So you’ll end up with a load of blank space in the files which makes the moving process pretty much pointless.

Author: Richie Lee

Full time computer guy, part time runner. Full time Dad, part time blogger. Pokémon Nut. Writer of fractured sentences. Maker of the best damn macaroni cheese you've ever tasted.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s