Efficiently Partitioning Data Across Storage Structures

 Distributing and Partitioning Data

 Table partitioning splits large tables across multiple storage
structures. Previously, objects were restricted to a single filegroup that could
contain multiple files, and the placement of data within a filegroup was
still determined by SQL Server.
 Table partitioning allows tables, indexes, and indexed views to be created
on multiple filegroups while also allowing the database administrator (DBA)
to specify which portion of the object will be stored on a specific filegroup.

Table Partitioning
 To partition a table, index, or indexed view, do the following:
1) Create a partition function.
2) Create a partition scheme mapped to a partition function.
3) Create the table, index, or indexed view on the partition
scheme.
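
The three steps can be sketched end to end as follows. This is only an illustration: the names pfOrderYear, psOrderYear, and dbo.OrdersDemo are hypothetical, and ALL TO ([PRIMARY]) is an assumption made so that the script runs without creating extra filegroups.

```sql
-- Step 1: partition function with three boundary points (four partitions).
CREATE PARTITION FUNCTION pfOrderYear (int)
AS RANGE RIGHT FOR VALUES (2006, 2007, 2008);

-- Step 2: partition scheme mapped to the function.
-- Every partition goes to PRIMARY here purely to keep the demo self-contained.
CREATE PARTITION SCHEME psOrderYear
AS PARTITION pfOrderYear ALL TO ([PRIMARY]);

-- Step 3: table created on the partition scheme, partitioned by OrderYear.
CREATE TABLE dbo.OrdersDemo
(
    OrderID   int NOT NULL,
    OrderYear int NOT NULL
)
ON psOrderYear (OrderYear);
```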

 A partition function defines the set of boundary points by which data
is partitioned; it is used to split data across a partition scheme.
 All data types that are valid for use as index columns can be used as a
partitioning column, except:
 text, ntext, image, varbinary(max), timestamp, xml, varchar(max)
 Computed columns that participate in a partition function must be
explicitly marked PERSISTED.
 Any column used for partitioning must be deterministic.
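
A computed partitioning column might look like the sketch below. The object names (pfSaleYear, psSaleYear, dbo.SalesDemo) are hypothetical, and placing all partitions on PRIMARY is an assumption for brevity.

```sql
CREATE PARTITION FUNCTION pfSaleYear (int)
AS RANGE RIGHT FOR VALUES (2007, 2008);

CREATE PARTITION SCHEME psSaleYear
AS PARTITION pfSaleYear ALL TO ([PRIMARY]);

CREATE TABLE dbo.SalesDemo
(
    SaleID   int      NOT NULL,
    SaleDate datetime NOT NULL,
    -- YEAR() is deterministic, so the computed column can be PERSISTED
    -- and then used as the partitioning column.
    SaleYear AS YEAR(SaleDate) PERSISTED
)
ON psSaleYear (SaleYear);
```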
(1) PARTITION FUNCTION
CREATE PARTITION FUNCTION
partfunctionname (datatype)
AS RANGE LEFT
FOR VALUES (10,20,30,40,50,60)

CREATE PARTITION FUNCTION
partfunctionname (datatype)
AS RANGE RIGHT
FOR VALUES (10,20,30,40,50,60)

(1) PARTITION FUNCTION (continued)
 The AS clause allows you to specify whether the partition function you are
creating is RANGE LEFT or RANGE RIGHT.
 The LEFT and RIGHT parameters define which partition includes a
boundary point.
 CREATE PARTITION FUNCTION myRangePF1 (int) AS RANGE LEFT FOR VALUES (1, 100, 1000);
Partition   Values
1           col1 <= 1
2           col1 > 1 AND col1 <= 100
3           col1 > 100 AND col1 <= 1000
4           col1 > 1000
 CREATE PARTITION FUNCTION myRangePF2 (int) AS RANGE RIGHT FOR VALUES (1, 100, 1000);
Partition   Values
1           col1 < 1
2           col1 >= 1 AND col1 < 100
3           col1 >= 100 AND col1 < 1000
4           col1 >= 1000
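
To see the difference on real values, the built-in $PARTITION function returns the partition number that a value maps to. The sketch assumes myRangePF1 and myRangePF2 have been created exactly as defined above; the sample values are chosen only for illustration.

```sql
-- Compare where the same values land under RANGE LEFT and RANGE RIGHT.
SELECT v.col1,
       $PARTITION.myRangePF1(v.col1) AS range_left_partition,
       $PARTITION.myRangePF2(v.col1) AS range_right_partition
FROM (VALUES (0), (1), (50), (100), (1000), (1001)) AS v(col1);
```

The boundary values themselves (1, 100, and 1000) are the only rows where the two columns differ: under RANGE RIGHT each of them lands one partition higher than under RANGE LEFT.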

(2) PARTITION FUNCTION (Continue…)
 A partition function always maps the entire range of data; no gaps are present.
 You cannot specify duplicate boundary points. This ensures that any value
stored in a column always evaluates to a single partition.
 Null values are always stored in the leftmost partition unless you explicitly
specify NULL as a boundary point and use the RANGE RIGHT syntax, in
which case the leftmost partition is empty and nulls are stored in the second partition.
 A partition function is a stand-alone object that you can apply to multiple
tables, indexes, or indexed views.
 You can partition an existing object after it has been populated with data.
 To partition an existing table, you need to:
 Drop the clustered index and re-create the clustered index on the
partition scheme.
 To partition an existing index or indexed view, you need to:
 Drop the index and re-create the index on a partition scheme.
 In SQL Server 2008, you can have a maximum of 1,000 partitions for an object; therefore, you
are allowed to specify a maximum of 999 boundary points. (SQL Server 2012 and later raise the limit to 15,000 partitions.)
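
Partitioning an already populated table could look like the sketch below. The table dbo.Customer, its clustered index cix_Customer, and the partition scheme psCustomer are all hypothetical names; the scheme is assumed to be mapped to a partition function on int.

```sql
-- Drop the existing clustered index. If the index backs a PRIMARY KEY
-- constraint, the constraint has to be dropped with ALTER TABLE instead.
DROP INDEX cix_Customer ON dbo.Customer;

-- Re-create it on the partition scheme; the table's rows move with it.
CREATE CLUSTERED INDEX cix_Customer
ON dbo.Customer (CustomerID)
ON psCustomer (CustomerID);
```

CREATE CLUSTERED INDEX also accepts WITH (DROP_EXISTING = ON), which performs the drop and re-create in a single statement.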

Creating a PARTITION Scheme
 A partition scheme defines the storage structure and collection of
filegroups that you want to use with a given partition function.
 Partition Schemes provide an alternate definition for storage.
 You define a partition scheme to encompass one or more filegroups.
CREATE PARTITION SCHEME partition_scheme_name
AS PARTITION partition_function_name
[ ALL ] TO ( { file_group_name | [ PRIMARY ] } [ ,...n ] )
 CREATE PARTITION SCHEME mypartscheme AS PARTITION
mypartfunction TO (Filegroup1, Filegroup2, Filegroup3, Filegroup4,
Filegroup5)
 Any filegroup specified in the CREATE PARTITION SCHEME statement
must already exist in the database.
 A partition scheme must be defined in such a way as to contain a
filegroup for each partition that is created by the partition function
mapped to the partition scheme.
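
One way to check which filegroup each partition is mapped to is to query the catalog views, as in this sketch. It only reads metadata and assumes at least one partition scheme exists in the current database.

```sql
-- List every partition scheme with its function and filegroup mapping.
SELECT ps.name            AS partition_scheme,
       pf.name            AS partition_function,
       dds.destination_id AS partition_number,
       fg.name            AS filegroup_name
FROM sys.partition_schemes AS ps
JOIN sys.partition_functions AS pf
    ON pf.function_id = ps.function_id
JOIN sys.destination_data_spaces AS dds
    ON dds.partition_scheme_id = ps.data_space_id
JOIN sys.filegroups AS fg
    ON fg.data_space_id = dds.data_space_id
ORDER BY ps.name, dds.destination_id;
```

A filegroup marked NEXT USED also appears in this list, with a destination_id one higher than the number of partitions.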

Creating a PARTITION Scheme (continued)
 SQL Server 2008 allows the use of the ALL keyword, as shown
previously, which allows you to create all partitions defined by the
partition function within a single filegroup.
 If you do not use the ALL keyword, the partition scheme must contain at
least one filegroup for each partition defined within the partition function.
Creating Partitioned Tables and Indexes
Because a partition scheme is just a definition for storage, you specify the
partition scheme name in the ON clause:
CREATE TABLE Employee (EmployeeID int NOT NULL,
FirstName varchar(50) NOT NULL,
LastName varchar(50) NOT NULL)
ON mypartscheme(EmployeeID);

 Run the following commands to check the results:
SELECT * FROM sys.partition_range_values;
SELECT * FROM sys.partition_schemes;
 The key is the ON clause.
 Instead of specifying a filegroup on which to create the table, you specify a
partition scheme.
 The ON clause specifies the storage structure, either a filegroup or a partition
scheme, on which to store a table or index.
 The partitioning key that is specified must match the data type, length and
precision of the partition function.
 If the partitioning key is a computed column, the computed column must be
PERSISTED.
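
Once rows have been inserted, the distribution across partitions can be inspected through sys.partitions. The sketch assumes the dbo.Employee table created above.

```sql
-- One row per partition of the table's heap (index_id 0) or clustered index (1).
SELECT p.partition_number, p.rows
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID(N'dbo.Employee')
  AND p.index_id IN (0, 1)
ORDER BY p.partition_number;
```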

 When specifying the partitioning key for an index, you are not limited to
the columns on which the index is defined.
 When you create an index on a partitioned table, SQL Server
automatically includes the partitioning key in the definition of each index.
CREATE NONCLUSTERED INDEX idx_employeefirstname
ON dbo.Employee(FirstName) ON mypartscheme(EmployeeID);

Managing Partitions
 With data constantly changing, partitions are rarely static.
 Two operators are available to manage the boundary point definitions:
 SPLIT
 MERGE
 The SPLIT operator introduces a new boundary point into a partition function.
 The MERGE operator eliminates a boundary point from a partition function.
ALTER PARTITION FUNCTION
partition_function_name()
{SPLIT RANGE ( boundary_value )
| MERGE RANGE ( boundary_value ) } [ ; ]
 You must be very careful when using the SPLIT and MERGE operators. You are
either adding or removing an entire partition from the partition function.
 These operators do not remove data from the table; they only add or remove a partition.
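
A short sketch of both operators, reusing myRangePF1 (RANGE LEFT, boundary points 1, 100, 1000) from the earlier slide. To keep it simple, it assumes the function is not yet used by any partition scheme; the boundary value 500 is chosen only for illustration.

```sql
-- Add a boundary point: boundaries become 1, 100, 500, 1000 (five partitions).
ALTER PARTITION FUNCTION myRangePF1() SPLIT RANGE (500);

-- Remove a boundary point: boundaries become 1, 500, 1000 (four partitions).
ALTER PARTITION FUNCTION myRangePF1() MERGE RANGE (100);

-- Inspect the resulting boundary points.
SELECT prv.boundary_id, prv.value
FROM sys.partition_range_values AS prv
JOIN sys.partition_functions AS pf
    ON pf.function_id = prv.function_id
WHERE pf.name = N'myRangePF1'
ORDER BY prv.boundary_id;
```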

Managing Partitions (continued)
 Because a partition can reside only in a single filegroup, a SPLIT or MERGE
could cause a significant amount of disk I/O as SQL Server relocates rows on the
disk.
 ALTER PARTITION SCHEME
 You can add filegroups to an existing partition scheme to create more storage
space for a partitioned table.
ALTER PARTITION SCHEME part_name
NEXT USED [filegroupname] [;]
The NEXT USED clause has two purposes:
1) It adds a new filegroup to the partition scheme, if the specified filegroup is
not already part of the partition scheme.
2) It marks the NEXT USED property for a filegroup.
 The filegroup that is marked with the NEXT USED flag is the filegroup that
contains the next partition that is created when a SPLIT operation is executed.
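
Used together with SPLIT, the sequence might look like this sketch. It reuses mypartscheme and mypartfunction from the earlier slides; Filegroup6 and the boundary value 70 are hypothetical, and the filegroup must already exist in the database.

```sql
-- Mark the filegroup that receives the next new partition.
ALTER PARTITION SCHEME mypartscheme NEXT USED Filegroup6;

-- The partition created by this split is placed on Filegroup6.
ALTER PARTITION FUNCTION mypartfunction() SPLIT RANGE (70);
```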

Index Alignment
 You can partition a table and its associated indexes differently.
 SQL Server cannot store the clustered index in a structure separate from
the table.
 If a table and all its indexes are partitioned using the same partition
function, they are said to be aligned.
 If a table and all its indexes use the same partition function and the same
partition scheme, the storage is aligned as well.
 By aligning the storage, rows in a table along with the indexes dependent
upon the rows are stored in the same filegroups.
 This ensures that if a single partition is backed up or restored, the data
and corresponding indexes are kept together as a single unit.
 The purpose of partitioning is to split a table and its associated indexes
into multiple storage structures.
 Partitioning allows advanced data management features that go well
beyond simply storing a portion of a table in a filegroup.
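
Whether a table and its indexes are aligned can be checked by listing the partition function and scheme behind each index. The sketch assumes the dbo.Employee table from the earlier slides.

```sql
-- Indexes showing the same partition_function are aligned; the same
-- partition_scheme as well means the storage is aligned too.
-- NULL in both columns means the index sits on a plain filegroup.
SELECT i.name  AS index_name,
       i.type_desc,
       ps.name AS partition_scheme,
       pf.name AS partition_function
FROM sys.indexes AS i
LEFT JOIN sys.partition_schemes AS ps
    ON ps.data_space_id = i.data_space_id
LEFT JOIN sys.partition_functions AS pf
    ON pf.function_id = ps.function_id
WHERE i.object_id = OBJECT_ID(N'dbo.Employee');
```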

Index Alignment
 SQL Server stores data on pages in a doubly linked list. To locate and
access data, SQL Server performs the following basic process:
1. Resolve the table name to an object ID.
2. Locate the entry for the object ID in sys.indexes to extract the first
page for the object.
3. Read the first page of the object.
4. Using the Next Page and Previous Page entries on each data page,
walk the page chain to locate the data required.
 The SWITCH operator allows you to exchange partitions between tables as a
metadata-only operation, so it scales regardless of partition size and avoids the
prolonged locking, blocking, and deadlocking of row-by-row data movement.
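
A minimal sketch of a switch, with hypothetical names: dbo.Orders is a partitioned table and dbo.OrdersArchive is an empty, non-partitioned table used as the target.

```sql
-- The target must have the same structure as the source, reside on the same
-- filegroup as the partition being switched, and be empty.
ALTER TABLE dbo.Orders
SWITCH PARTITION 1 TO dbo.OrdersArchive;
```

After the switch, partition 1 of dbo.Orders is empty and its rows belong to dbo.OrdersArchive; no rows are physically moved.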

Review
1) Contoso has a very high-volume transaction system. There is not enough
memory on the database server to hold the active data set, so a very high
number of read and write operations are hitting the disk drives directly. After
adding several additional indexes, the performance still does not meet
expectations. Unfortunately, the DBAs cannot find any more candidates for
additional indexes. There isn’t enough money in the budget for additional
memory, additional servers, or a server with more capacity.
However, a new storage area network (SAN) has recently been
implemented. What technology can Contoso use to increase performance?
A. Log shipping
B. Replication
C. Partitioning
D. Database mirroring

Review (Continue …)
2) Margie’s Travel wants to keep orders in their online transaction processing
database for a maximum of 30 days from the date an order is placed. The
orders table contains a column called OrderDate that contains the date an
order was placed. How can the DBAs at Margie’s Travel move orders that
are older than 30 days from the orders table with the least amount of impact
on user transactions? (Choose two. Each answer represents a part of the
solution.)
A. Use the SWITCH operator to move data partitions containing data that
is older than 30 days.
B. Create a stored procedure that deletes any orders that are older than
30 days.
C. Partition the order table using a partition function defined for the
datetime data type on the OrderDate column.
D. Create a job to delete orders that are older than 30 days.

Review (Continue…)
3) Wide World Importers has a very large and active data warehouse that is
required to be accessible to users 24 hours a day, 7 days a week. The
DBA team needs to load new sets of data on a weekly basis to support
business operations. Inserting large volumes of data would affect users
unacceptably. Which feature should be used to minimize the impact while
still handling the weekly data loads?
A. Transactional replication
B. The SWITCH operator within partitioning
C. Database mirroring
D. Database snapshots

Review (Continue…)
4) Contoso Limited has a very high-volume order entry system. Management
has determined that orders should be maintained in the operational
system for a maximum of six months before being archived. After data is
archived from the table, it is loaded into the data warehouse. The data
load occurs once per month. Which technology is the most appropriate
choice for archiving data from the order entry system?
A. Database mirroring
B. Transactional replication
C. Database snapshots
D. Partitioning

Answers
1) Correct Answer: C
A. Incorrect: Although log shipping allows additional copies of the
database to be created, Contoso does not have any additional
hardware to use.
B. Incorrect: Although replication allows additional copies of the database
to be created, Contoso does not have any additional hardware to use.
C. Correct: You could partition the most heavily used tables, thus
allowing you to spread the data across multiple files, which improves
performance.
D. Incorrect: Although database mirroring creates an additional copy of
the database, Contoso does not have any additional hardware to use.

Answers (continue…)
2) Correct Answers: A and C
A. Correct: The SWITCH operator removes orders that are older than
30 days without causing any blocking.
B. Incorrect: You could execute a DELETE operation, but exclusive
locks would be acquired that would affect the ability of customers to
place orders.
C. Correct: Partitioning on the OrderDate column allows you to use the
SWITCH operator to move data that is older than 30 days off the
table without causing any blocking.
D. Incorrect: You could execute a DELETE operation, but exclusive
locks would be acquired that would affect the ability of customers to
place orders.

3) Correct Answer: B
A. Incorrect: Transactional replication has the capability to move data
into the tables within the data warehouse; however, locks are
acquired during the insert process that would affect users.
B. Correct: The SWITCH operator is designed to move partitions of data
into a table without causing blocking.
C. Incorrect: Database mirroring requires the mirror database to be
offline and would not be a valid technology in this scenario.
D. Incorrect: Database snapshots provide a point-in-time, read-only copy
of a database, and would not reflect any new data that is added.

4) Correct Answer: D
A. Incorrect: Database mirroring keeps a secondary database
synchronized, but the mirror is inaccessible. Therefore, it is
inappropriate for archiving.
B. Incorrect: Transactional replication could be used to move the data to
another system for loading into the data warehouse; however, you
still need to delete data from the order entry system. This affects the
concurrency and performance of the order entry system.
C. Incorrect: Database snapshots maintain a read-only copy of the data
at that point in time and are inappropriate for archiving.
D. Correct: By designing the table with partitioning, you can remove
order data from the order entry system without affecting performance
or concurrency. After you remove the partition from the table, you can
load the data into the data warehouse.
