Columnstore indexes in SQL 2012:
Columnstore indexes in the SQL Server Database Engine can significantly speed up the processing of common data warehousing queries.
Ref:
https://msdn.microsoft.com/en-us/library/gg492153(v=sql.110).aspx
It is available from SQL Server 2012 onward.
Sample:
CREATE NONCLUSTERED COLUMNSTORE INDEX csindx_simple
ON SimpleTable
(OrderDateKey, DueDateKey, ShipDateKey);
GO
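As a rough illustration of the kind of data warehousing query such an index is meant to help, the sketch below aggregates over one of the indexed columns. It assumes SimpleTable is a large fact table; whether the optimizer actually picks the columnstore index depends on the data and the resulting plan.

```sql
-- Count rows per order date, reading only a column covered by csindx_simple.
SELECT OrderDateKey,
       COUNT(*) AS OrderCount
FROM SimpleTable
GROUP BY OrderDateKey
ORDER BY OrderDateKey;
GO
```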
The following data types can be included in a columnstore index.
- char and varchar
- nchar and nvarchar (except varchar(max) and nvarchar(max))
- decimal (and numeric) (except with precision greater than 18 digits)
- int, bigint, smallint, and tinyint
- float (and real)
- bit
- money and smallmoney
- All date and time data types (except datetimeoffset with scale greater than 2)
The following data types cannot be included in a columnstore index.
- binary and varbinary
- ntext, text, and image
- varchar(max) and nvarchar(max)
- uniqueidentifier
- rowversion (and timestamp)
- sql_variant
- decimal (and numeric) with precision greater than 18 digits
- datetimeoffset with scale greater than 2
- CLR types (hierarchyid and spatial types)
- xml
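The two lists above can be turned into a quick catalog check. This is only a sketch: it assumes the table lives in the dbo schema and is named SimpleTable, and it matches on type names, so a column declared with a user-defined alias type would show up under the alias name and be missed.

```sql
-- List columns of dbo.SimpleTable whose data type cannot go into a
-- SQL Server 2012 columnstore index, following the lists above.
SELECT c.name AS column_name,
       t.name AS type_name,
       c.max_length,
       c.precision,
       c.scale
FROM sys.columns AS c
JOIN sys.types AS t
    ON t.user_type_id = c.user_type_id
WHERE c.object_id = OBJECT_ID(N'dbo.SimpleTable')
  AND (
        -- rowversion is stored in the catalog under the name timestamp
        t.name IN (N'binary', N'varbinary', N'ntext', N'text', N'image',
                   N'uniqueidentifier', N'timestamp', N'sql_variant', N'xml',
                   N'hierarchyid', N'geometry', N'geography')
        -- max_length = -1 marks the (max) variants
     OR (t.name IN (N'varchar', N'nvarchar') AND c.max_length = -1)
     OR (t.name IN (N'decimal', N'numeric') AND c.precision > 18)
     OR (t.name = N'datetimeoffset' AND c.scale > 2)
      );
GO
```

An empty result suggests every column passes the data type rules; the other restrictions in this post still apply.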
Basic Restrictions
A columnstore index:
- Cannot have more than 1024 columns.
- Cannot be clustered. Only nonclustered columnstore indexes are available.
- Cannot be a unique index.
- Cannot be created on a view or indexed view.
- Cannot include a sparse column.
  (Sparse columns are ordinary columns with storage optimized for null values; they are described in more detail later in this post. For further reading, see:
  http://sqlhints.com/2011/11/12/sparse-columns-in-sql-server-2008/
  http://blog.sqlauthority.com/2008/07/10/sql-server-2008-introduction-to-sparse-columns/)
- Cannot act as a primary key or a foreign key.
- Cannot be changed using the ALTER INDEX statement. Drop and re-create the columnstore index instead. (You can use ALTER INDEX to disable and rebuild a columnstore index.)
- Cannot be created by using the INCLUDE keyword.
- Cannot include the ASC or DESC keywords for sorting the index. Columnstore indexes are ordered according to the compression algorithms. Sorting would eliminate many of the performance benefits.
Columnstore indexes cannot be combined with the following features:
- Page and row compression, and vardecimal storage format (A columnstore index is already compressed in a different format.)
- Replication
- Change tracking
- Change data capture
- Filestream
Tables that have a columnstore index cannot be updated.
To update a table with a columnstore index, drop the columnstore index, perform any required INSERT, DELETE, UPDATE, or MERGE operations, and then rebuild the columnstore index.
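The restrictions list above notes that ALTER INDEX can disable and rebuild a columnstore index, which gives a variant of this workflow that avoids retyping the index definition. A minimal sketch, reusing the csindx_simple index from the sample; the UPDATE statement is a made-up change for illustration only.

```sql
-- Disable the columnstore index so the table accepts writes again.
ALTER INDEX csindx_simple ON SimpleTable DISABLE;
GO

-- Hypothetical data change; any INSERT, DELETE, UPDATE, or MERGE goes here.
UPDATE SimpleTable
SET ShipDateKey = DueDateKey
WHERE ShipDateKey IS NULL;
GO

-- Rebuilding re-enables the index and recompresses the data.
ALTER INDEX csindx_simple ON SimpleTable REBUILD;
GO
```

Dropping the index with DROP INDEX and running the CREATE statement again achieves the same result. Either way the rebuild reads the whole table, so it is usually better to batch changes than to repeat this for every small update.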
Choosing Columns for a Column Store Index
Some of the performance benefit of a columnstore index is derived from the compression techniques that reduce the number of data pages that must be read and manipulated to process the query. Compression works best on character or numeric columns that have large amounts of duplicated values. For example, dimension tables might have columns for postal codes, cities, and sales regions. If many postal codes are located in each city, and if many cities are located in each sales region, then the sales region column would be the most compressed, the city column would have somewhat less compression, and the postal code would have the least compression. Although all columns are good candidates for a columnstore index, adding the sales region code column to the columnstore index will achieve the greatest benefit from columnstore compression, and the postal code will achieve the least.
Dimension table: Postalcodes, Cities, Salesregions
City table: Cityname, Postalcodes
Salesregions table: Salesregions, Cityname
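One way to get a feel for which columns would compress best is to compare distinct-value counts against the row count. The sketch below assumes a hypothetical dimension table DimGeography with columns SalesRegion, City, and PostalCode; none of these names come from the original sample.

```sql
-- Fewer distinct values relative to TotalRows means more repetition,
-- and so more room for columnstore compression.
SELECT COUNT(*)                    AS TotalRows,
       COUNT(DISTINCT SalesRegion) AS DistinctRegions,
       COUNT(DISTINCT City)        AS DistinctCities,
       COUNT(DISTINCT PostalCode)  AS DistinctPostalCodes
FROM DimGeography;
GO
```

With data shaped as described above, DistinctRegions should come out smallest and DistinctPostalCodes largest, matching the expected order of compression benefit.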
Sparse Columns in SQL 2008:
Sparse columns are ordinary columns that have an optimized storage for null values. Sparse columns reduce the space requirements for null values at the cost of more overhead to retrieve non-null values. Consider using sparse columns when the space saved is at least 20 percent to 40 percent.
Sparse columns were introduced in SQL Server 2008.
Storing a null value in a sparse column takes no space, but storing a non-null value costs extra: about 4 bytes more than a non-sparse column for fixed-length types such as int or datetime.
The SPARSE clause optimizes only NULL storage. A zero is an ordinary non-null value, so it gets no saving and pays the extra overhead. A sparse column therefore takes less space than a regular column only when most of its values are NULL.
In SQL Server 2008 a regular table can have at most 1,024 columns. A table that defines a column set over its sparse columns becomes a wide table, and a wide table can have up to 30,000 columns.
In summary, sparse columns combined with a column set raise the per-table column limit from 1,024 to 30,000.
CREATE TABLE SPARSECOLUMNTABLE
(
    col1 int identity(1,1),
    col2 datetime sparse,
    col3 int sparse
)
CREATE TABLE NONSPARSECOLUMNTABLE
(
    col1 int identity(1,1),
    col2 datetime,
    col3 int
)
GO
INSERT INTO SPARSECOLUMNTABLE VALUES(NULL,NULL)
INSERT INTO NONSPARSECOLUMNTABLE VALUES(NULL,NULL)
GO 25000
Now check the space used by these tables by executing the below statements:
EXEC SP_Spaceused SPARSECOLUMNTABLE
EXEC SP_Spaceused NONSPARSECOLUMNTABLE
Result:
name                  rows   reserved  data    index_size  unused
SPARSECOLUMNTABLE     25000  392 KB    344 KB  8 KB        40 KB
NONSPARSECOLUMNTABLE  25000  712 KB    656 KB  8 KB        48 KB
Script:
CREATE TABLE UnSparsed(ID INT IDENTITY(1,1), FirstCol INT, SecondCol VARCHAR(100), ThirdCol SmallDateTime)
GO
CREATE TABLE Sparsed(ID INT IDENTITY(1,1), FirstCol INT SPARSE, SecondCol VARCHAR(100) SPARSE, ThirdCol SmallDateTime SPARSE)
GO
DECLARE @idx INT = 0
WHILE @idx < 50000
BEGIN
    INSERT INTO UnSparsed VALUES (NULL, NULL, NULL)
    INSERT INTO Sparsed VALUES (NULL, NULL, NULL)
    SET @idx += 1
END
GO
sp_spaceused 'UnSparsed'
GO
sp_spaceused 'Sparsed'
GO
DROP TABLE UnSparsed
GO
DROP TABLE Sparsed
GO
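Both scripts above insert only NULLs, which is the best case for sparse columns. To see the other side of the trade-off (the extra overhead for non-null values), a converse test along these lines may help. The table names SparseNonNull and RegularNonNull are made up for this sketch, and `GO 25000` relies on the batch-repeat feature of SSMS and sqlcmd.

```sql
-- Same shape as the earlier test tables, but every row gets non-null values.
CREATE TABLE SparseNonNull
(
    col1 int identity(1,1),
    col2 datetime sparse,
    col3 int sparse
)
CREATE TABLE RegularNonNull
(
    col1 int identity(1,1),
    col2 datetime,
    col3 int
)
GO
INSERT INTO SparseNonNull VALUES (GETDATE(), 1)
INSERT INTO RegularNonNull VALUES (GETDATE(), 1)
GO 25000

-- Compare the space used by the two tables.
EXEC sp_spaceused 'SparseNonNull'
EXEC sp_spaceused 'RegularNonNull'
GO

-- Clean up the test tables.
DROP TABLE SparseNonNull
DROP TABLE RegularNonNull
GO
```

If the sparse table comes out larger here, that is the per-value overhead described earlier showing up, and a sign that SPARSE only pays off for columns that are mostly NULL.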