Andrew Thomas

Developer

CTO

Skier

Blog Post

SQL Server UniqueIdentifier column and Row/Page Compression

Jun 15, 2020 SQL

I recently was debating about the use of uniqueidentifier use for SQL Server and a lot of articles mentioned how it didn’t compress at all. Since it takes up 16 bytes this was of concern, but the logical side of my brain was telling me that SQL Server should be able to compress it, provided it wasn’t unique for every row. i.e. if a large table referenced the uniqueidentifier column of another smaller table it could be worthwhile, if it was a many-to-one (and I stress many).

As with anything research on this didn’t come up with a satisfactory answer so I decided to create the test scenario myself which I have below.

The database I used was SQL Server 2012 Enterprise Edition with SP3.

First step was to create test tables. I created four tables, two tables contained a uniqueidentifier which would be the same for each row, but each would have a different primary key one would be an int and the other a uniqueidentifier. The other two tables were control tables which only container the primary key. So I could compare compression.

CREATE TABLE [dbo].[UniqueIdentifierPrimaryKey](
 [UniqueIdentifier] [uniqueidentifier] NOT NULL,
 [SameIdentifier] [uniqueidentifier] NOT NULL,
 CONSTRAINT [PK_UniqueIdentifierPrimaryKey] PRIMARY KEY CLUSTERED 
(
 [UniqueIdentifier] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[UniqueIdentifierPrimaryKeyControl](
 [UniqueIdentifier] [uniqueidentifier] NOT NULL
 CONSTRAINT [PK_UniqueIdentifierPrimaryKeyControl] PRIMARY KEY CLUSTERED 
(
 [UniqueIdentifier] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[IntPrimaryKey](
 [Id] [int] NOT NULL,
 [SameIdentifier] [uniqueidentifier] NOT NULL,
 CONSTRAINT [PK_IntPrimaryKey] PRIMARY KEY CLUSTERED 
(
 [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[IntPrimaryKeyControl](
 [Id] [int] NOT NULL
 CONSTRAINT [PK_IntPrimaryKeyControl] PRIMARY KEY CLUSTERED 
(
 [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

I then created test data in each table, I am sure there is a more efficient way but I just used a naughty loop:

DECLARE @counter AS INT
SET @counter = 1
DECLARE @uniqueId AS UNIQUEIDENTIFIER
SET @uniqueId = NEWID();
WHILE (@counter < 100000)
BEGIN
 INSERT INTO UniqueIdentifierPrimaryKey VALUES (NEWID(), @uniqueId)
 INSERT INTO UniqueIdentifierPrimaryKeyControl VALUES (NEWID())
INSERT INTO IntPrimaryKey VALUES (@counter, @uniqueId)
 INSERT INTO IntPrimaryKeyControl VALUES (@counter)
SET @counter = @counter + 1
END

After that I tested the results using sp_estimate_data_compression_savings which will calculate savings based on the two compression types — Page and Row.

EXEC sp_estimate_data_compression_savings NULL, UniqueIdentifierPrimaryKey, NULL/*index*/, NULL, PAGE ;
EXEC sp_estimate_data_compression_savings NULL, UniqueIdentifierPrimaryKey, NULL/*index*/, NULL, ROW ;

Page: went from 8104 KB down to 5096 KB Row: went from 8104 KB up to 8552 KB

EXEC sp_estimate_data_compression_savings NULL, UniqueIdentifierPrimaryKeyControl, NULL/*index*/, NULL, PAGE ;
EXEC sp_estimate_data_compression_savings NULL, UniqueIdentifierPrimaryKeyControl, NULL/*index*/, NULL, ROW ;

Page: went from 5632 KB up to 5696 KB Row: went from 5632 KB up to 5696 KB

EXEC sp_estimate_data_compression_savings NULL, IntPrimaryKey, NULL/*index*/, NULL, PAGE ;
EXEC sp_estimate_data_compression_savings NULL, IntPrimaryKey, NULL/*index*/, NULL, ROW ;

Page: went from 4272 KB down to 1600 KB Row: went from 4272 KB down to 4120 KB

EXEC sp_estimate_data_compression_savings NULL, IntPrimaryKeyControl, NULL/*index*/, NULL, PAGE ;
EXEC sp_estimate_data_compression_savings NULL, IntPrimaryKeyControl, NULL/*index*/, NULL, ROW ;

Page: went from 2696 KB down to 2136 KB Row: went from 2696 KB up to 2136 KB

Significant compression on uniqueidentifier columns does happen when there are duplicates that allow for compression. As the results show there was very clear compression of uniqueidentifier column for non primary key value i.e. when there was ability for compression across multiple similiar values in a page. But for primary key i.e. unique every row, there was no benefit to compression. Also important to note that row compression had no effect, which is what I would expect since a uniqueidentifier is as compressed as possible within its datatype.