COLLABORATE 14 – IOUG Forum
Engineered Systems
White Paper
Migrate 10 TB to Exadata X3 – Tips and Tricks
Amin Adatia, KnowTech Solutions Inc.
ABSTRACT
Tips and tricks for migrating 10 TB of data from AIX (2 Nodes, v10.2.0.4) to Exadata X3 (8 Nodes, v11.2.0.3) with only
6 hours of downtime. The database included Oracle Label Security, 40 Partitioned Tables, 4 Tables with Oracle Text Indexes, 5
Tables with a CLOB column, and one Table for which a BLOB column had to be migrated while a CLOB column was dropped
during migration. On the Target Exadata, the indexes had to be in place because data loading was being done just prior
to the migration and immediately afterwards.
TARGET AUDIENCE
The Intermediate-to-Advanced-Level Developer/Database Administrator will find something useful and innovative here for
tackling data migration to Exadata when the application makes use of Partitioned Tables, Sub-Partitioned Tables, Oracle Text,
and Oracle Label Security, especially when limited downtime is available and standard methods like transportable
tablespaces and Data Pump prove impractical within the available downtime window.
EXECUTIVE SUMMARY
The reader will learn:
1. Which techniques worked best for different types of Tables
   a. Non Partitioned Tables
   b. Partitioned Tables
   c. Hash Sub-Partitioned Tables
2. How to deal with Tables with LOB data
3. How to deal with Oracle Text Indexes when the Transportable Tablespaces approach is not practical
4. How to address the Oracle Label Security setup when Custom Label Tags are not used
Migration to Exadata surpassed expectations! We neither expected nor encountered any issues with non-partitioned tables. Of the
Partitioned Tables, our optimistic expectation was that we would migrate 480 of the approximately 2500 partitions of each
table within the 6-hour downtime allowed. We managed to completely migrate 26 of the 30 Partitioned Tables that did not have any
LOB columns. The remaining tables took between 8 and 20 hours to complete, partly because resources were diverted to other
tables once the threshold number of Partitions had been migrated.
Tables with LOB columns did not seem able to take advantage of parallelism. It appeared that while data from the non-LOB
columns moved with the parallelism applied, the LOB column reverted to a single process moving one record at a time!
This behaviour was probably masked by the CLOB size being less than 4000 characters.
When migrating into a table with 32 sub-partitions, the issue became so significant that the approach had to be abandoned for this
Table. Eventually, it was determined that the best approach was not to use Parallel on the Source for data migration but rather
to use up to 45 jobs on each node for the data migration part. Also, instead of Partition Exchange, which locked the Table and
stopped ETL, a much better method was to insert into the table with input rows sorted by the ORA_HASH key. The table below
provides a summary of the data migration for each type of Table.
Table Data Migration Summary

Object Type                                   Estimate (Partitions   Actual Partitions within   Actual Time
                                              Completed/Total)       6 Hour Downtime
--------------------------------------------  ---------------------  -------------------------  -----------
Non Partitioned Tables (65)                   30 minutes                                         12 minutes
Partitioned Tables (33, ~2500 Parts/Table):
  Non LOB Columns (30)                        480/2500               26 Tables completed
                                                                     1 Table:  880              8 Hours
                                                                     2 Tables: 440              11 Hours
                                                                     1 Table:  200              20 Hours
  CLOB Column Tables (2 Tables)               1 Table: 480/2500      1000                       11 Hours
                                              1 Table: 40/1800       160                        15 Hours
BLOB Column and Sub-Partitioned (1 Table)     28/1200                16                         8 Days!!
The approach to Oracle Text was to rebuild the Text Indexes rather than use Transportable Tablespaces to preserve the
Oracle Text Index data. One of the main reasons was that the Transportable Tablespaces method would have taken about 20
days of downtime. Besides, we would need the source document files if any documents were added or deleted. Utilizing
the processing power in Exadata, we were able to tune the Oracle Text indexing so as to SYNC_INDEX a Partition
within about 30 minutes, which was well within the time to regenerate the XML CLOB and the combined documents file.
Sometimes we had to increase the Parallel Degree parameter to 96, but the normal setting was 32.
For the Oracle Label Security deployment, rather than update the Label Tag value after the data was migrated, it was simpler to use
the Target Label Tag values within the migration process to replace the Source Label Tag. The issue that led to this
replacement approach was that updating the Label Tags, once the OLS Policy was applied to the Table, performed extremely
slowly. A pre-requisite of this approach to the OLS setup is that the OLS Policy and Labels be defined on the
Target and that the Labels be synchronized between the Source and the Target prior to data migration. We had a catch-all
Label Tag so that discrepancies could be corrected afterwards.
BACKGROUND
The source and target environment were as below. A month before the migration, Exadata X3 became available.
Source                                    Target
Oracle 10gR2 (10.2.0.4)                   Oracle 11.2.0.3
• AIX                                     • Exadata X3 (Linux)
• Big Endian                              • Little Endian
• 2 Node RAC                              • 8 Node RAC
• 48 CPU per Node                         • 32 CPU per Node
• Tablespaces => 450 (9.8 TB Disk Space Used – measured as segment size)
• Table Partitions distributed across all Tablespaces
• Tables (for Migration)
   o Partitioned => 33 (~2500 Partitions/Table)
   o Non Partitioned => 65 (2 with a large number of rows)
   o Oracle Text Indexes => 4
   o Tables with LOBs => 4 (CLOB and BLOB)
• Oracle Label Security => ~280 Labels
• Network Link => 1 Mbit and 10 Mbit
Migration Constraints and Conditions
• Users mostly query the most recent 120 partitions
• Some queries span 1500 partitions
• XML (for Text Indexing)
   o Built from 17 Tables
   o Took 30–45 minutes
   o Sync Index for a Partition took 20–30 minutes
• Documents stored as Files
   o Took about 45–60 minutes to copy
   o Sync Index took about 30 minutes
Migration Approach
• Migrate at least 240 partitions of data before ETL resumed
• Downtime allowed was 6 hours
• Text Indexing done as an "assembly line" triggered by a control table which was updated as each Partition was
  migrated and loaded on the Target
• Safety Net
   o Dual ETL to Source and Target Environments
   o Keep Users on Source Environment
   o Journal User Actions and Periodically Apply to Target to Synchronize
   o Switch Users when All Data Migrated
      ▪ Stop Feed to both Source and Target
      ▪ Apply User Actions Journal to Target
      ▪ Switch Users to Target
      ▪ Resume Dual Mode ETL for a few Days
TECHNICAL DISCUSSIONS AND EXAMPLES
Target Setup
The application environment consisted of four schemas, one of which acted as a proxy account for users. All the schema
definitions were exported from the Source Environment with the ROWS=N option. On the Target Environment, the schemas
were established, the Oracle Text Preferences and Parameters created, and the Oracle Label Security Policy and Labels established.
Once this was completed, the schema definitions were imported. Invalid objects were reviewed and recompiled, and cross-schema
grants and privileges tested.
Three types of Tables into which data had to be migrated were
1. Non Partitioned â 65 Tables
2. Partitioned Tables
a. Without LOB Columns â 30 Tables
b. With LOB Column â 2 Tables
3. Partitioned Table with 32 Sub Partitions by Hash and LOB Columns â 1 Table
Given the Source Environment, it was not practical to use the Transportable Tablespaces option for migration to
save on rebuilding the Oracle Text Indexes. A test of the method gave an estimate of 20 days of downtime to complete the
migration. An alternate approach was developed whereby we would rebuild the Text Indexes from the source data as it got
migrated.
The Migration
1. Non Partitioned Tables
For the most part, these were migrated in under 10 minutes. Three sessions were invoked, each dealing with about 20 tables.
One of the two larger-volume tables took 25 minutes using parallel 24 at the source. The other took 2 hours, but it was
discovered too late that only parallel 12 had been designated. Appendix A1 shows the script used.
2. Partitioned Tables
There were two parts to migrating data into Partitioned Tables. One was to get the data across from the Source, and the other
was to load the data into the appropriate partition on the Target using the Partition Exchange mechanism. Given that the
Source had 48 CPU on each Node and that one of the Nodes had the 10 Mbit network, tables with more data were set to migrate
using the larger bandwidth. Scheduled Jobs were submitted for each table in a controlled manner such that at most 46
parallel sessions were running on any one of the source nodes. Once the data was migrated over to the Target Exadata, the vastly
greater number of CPUs was utilized for parallelism as required. Sixteen of the tables were required to generate the XML used for Text
Indexing. Four of the 8 Nodes were used exclusively for data migration and Partition Exchange. Three were used for three
Text Indexing Jobs.
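As a sketch, the controlled job submission described above might look like the following PL/SQL. The control table, the MIGRATE_PARTITION wrapper procedure, and the JOB_COUNT_ON_NODE helper are all hypothetical names, not the code actually used:

```sql
-- Hypothetical sketch: one DBMS_SCHEDULER job per pending partition,
-- throttled so a source node never runs more than 46 concurrent sessions.
-- MIGRATION_CONTROL, MIGRATE_PARTITION and JOB_COUNT_ON_NODE are assumed.
BEGIN
  FOR r IN (SELECT partition_name
              FROM migration_control
             WHERE status = 'PENDING'
             ORDER BY partition_name)
  LOOP
    WHILE job_count_on_node('SRC_NODE1') >= 46 LOOP
      DBMS_LOCK.SLEEP(30);             -- back off until a job slot frees up
    END LOOP;
    DBMS_SCHEDULER.CREATE_JOB(
      job_name            => 'MIG_' || r.partition_name,
      job_type            => 'STORED_PROCEDURE',
      job_action          => 'MIGRATE_PARTITION',
      number_of_arguments => 1,
      enabled             => FALSE);
    DBMS_SCHEDULER.SET_JOB_ARGUMENT_VALUE(
      job_name          => 'MIG_' || r.partition_name,
      argument_position => 1,
      argument_value    => r.partition_name);
    DBMS_SCHEDULER.ENABLE('MIG_' || r.partition_name);
  END LOOP;
END;
/
```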
Three Tables in this class of objects failed to complete within the downtime period. The primary reason was resource re-allocation
on the Source in favour of Tables with LOB columns, given that the minimum set of Partitions had already been migrated.
Appendix A2 shows the code snippets used for migration of the data. The deployment was such that each Partition was
migrated to a Queue Table. This was then copied to a Working Table for Partition Exchange. Indexes on the Working Table
were created corresponding to the Target Table Partition. The Working-Table-to-Partition sequence was initiated by the
Partition status updated in the control table established to manage the migration.
Allocating more CPU at the Source increased the data migration speed, so we could divert resources to Tables with more
data to try to get everything to complete at about the same time. The technique was guided purely by "gut feel" and the number of
partitions already migrated. The graph below displays the time (hours) for the migration of each of the Tables.
[Figure: Total Time to Migrate Data for each Table – migration staggered according to the resources available on the AIX Source]
3. Partitioned Tables (LOB Columns)
Tables with LOB columns did not seem able to take advantage of parallelism. It appeared that while data from the non-LOB
columns moved with the parallelism applied, the LOB column reverted to a single process moving one record at a time!
This behaviour was probably masked by the CLOB size being less than 4000 characters. Eventually, it was determined that the
best approach was not to use multiple Parallel processes on the Source for data migration but rather to use up to 45 separate
jobs, one for each Partition, on each node for the data migration part. The time to migrate dropped from 45–75 minutes to
10–15 minutes per Partition. The graph below shows the relative time to migrate data from Partitioned Tables with LOB columns.
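The per-partition job approach amounts to a plain serial pull per partition, with many such jobs in flight at once. A minimal sketch, with assumed table, link, and bind names:

```sql
-- Hypothetical sketch: each job performs one serial INSERT ... SELECT over
-- the database link for a single partition key. No PARALLEL hint: the LOB
-- fetch is effectively serial anyway, so concurrency comes from the jobs.
INSERT /*+ APPEND */ INTO q_table
SELECT *
  FROM source_table@dblink_pipe t1
 WHERE t1.PartKey = :p_partition_key;   -- one partition per job
```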
[Figure: Data Migration of LOB Column Tables – same Partition using different Degree of Parallel on the SELECT from Source]
4. Sub Partition by Hash (LOB Columns)
Instead of Partition Exchange, which locked the Table and thus stopped ETL, a much better method was to insert into the
table with input rows sorted by (ORA_HASH(<Key>, 31, 0) + 1), since we had 32 Hash partitions. The graph below shows the
relative time to migrate data using different degrees of parallel for the data migration and different methods for exchanging data
into the sub-partitions.
[Figure: Sub Partition Table Migrate and Exchange (Relative Times)]
   PX32 – Partitions Migrated: 36
   PX8 – Partitions Migrated: 112
   PX1 – Partitions Migrated: 1800
The graph below shows the relative time per record for inserting into the Table with Sub-Partition by Hash for an input stream
sorted by (ORA_HASH(<Key>, 31, 0) + 1). The data was already on the Exadata.
[Figure: Relative Time/Record for Sorted vs Unsorted Input into Sub-Partitioned Table]
5. Solving for Oracle Text Indexing
Since we could not use the Transportable Tablespaces approach to preserve the Oracle Text Index DR#xxxnnnn$I (etc.)
tables, we were left with the massive memory and CPUs available to speed up the re-generation of the indexes. The Oracle
Text preferences were created prior to importing the schema from the source. We had two types of data to be indexed using
Oracle Text: one was XML generated from 17 tables, and the other was multi-lingual text documents stored on a file system
and generated by recombining individual child detail records. Since both of these data objects had to be in place, we could use
an assembly-line approach to Text Indexing. The steps in the assembly line were
1. Migrate the Data for the Partition in the 17 tables
2. Copy the documents to the file system
3. Generate the Recombined Document
4. Generate the XML
5. Text Index the recombined Document file
6. Text Index the XML
The assembly line throughput was about 90 minutes. All of these steps were controlled via entries in a table to trigger the steps
based on the Partitions migrated.
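A minimal sketch of such a control table, with one status column per assembly-line step. All names are illustrative, not the actual schema:

```sql
-- Hypothetical control table: one row per partition; each assembly-line
-- job stamps its column when done, and the next job polls for it.
CREATE TABLE text_index_control
( partition_name  VARCHAR2(30) PRIMARY KEY
, data_migrated   DATE   -- step 1: data for the 17 tables landed
, docs_copied     DATE   -- step 2: documents copied to the file system
, doc_recombined  DATE   -- step 3: recombined document generated
, xml_generated   DATE   -- step 4: XML generated
, doc_indexed     DATE   -- step 5: document file Text-indexed
, xml_indexed     DATE   -- step 6: XML Text-indexed
);

-- e.g. the XML-generation job picks up partitions whose data has landed:
SELECT partition_name
  FROM text_index_control
 WHERE data_migrated IS NOT NULL
   AND xml_generated IS NULL;
```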
Since we were dealing with partitioned tables, the CTX_DDL.SYNC_INDEX procedure requires passing in the Partition
Name. The view CTXSYS.CTX_USER_PENDING appeared well suited to getting the partition names for the pending
index sync data. However, the query used to get the Partition Name did not perform well, taking anywhere from 10
to 20 minutes. Appendix A4.1 shows the script used. An alternative approach had to be found to determine the pending
Partitions. The CTXSYS tables that provided the data needed are
o CTXSYS.DR$PENDING
o CTXSYS.DR$INDEX
o CTXSYS.DR$INDEX_PARTITION
Appendix A4.2 shows the script for the view created to provide the Partition Names for which records were pending the
SYNC_INDEX operation. The Text Indexes were created with the parameter SYNC (MANUAL), and SYNC_INDEX was
invoked using a DBMS_SCHEDULER Job with a 10-second repeat interval. This was preferred over the
SYNC (AUTO) option because of the large number of Partitions involved and, consequently, the large number of Jobs that
would be invoked and the management overhead of enabling/disabling those Jobs. Also, there was no way to pre-determine
the Partitions for which data would be received. The SYNC_INDEX Parallel Degree parameter settings were as
below, based on the type of data:
o Text default was 8 (but we had to use 16, 24, 32)
o XML default was 16 (we have used 24, 32, 64, 96)
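The manual sync invocation takes the partition name and a parallel degree. A sketch of one such call, with placeholder index and partition names:

```sql
-- Sync one partition of a local Text index with an explicit parallel
-- degree; 16 was the default we used for XML, raised as high as 96.
BEGIN
  CTX_DDL.SYNC_INDEX(
    idx_name        => 'MY_XML_CTX_IDX',   -- placeholder index name
    part_name       => 'P_20140101',       -- placeholder partition name
    parallel_degree => 16);
END;
/
```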
In order to reduce the number of TOKENS generated during the SYNC_INDEX process, as loaded into the DR#xxxnnnn$I
Table, the indexes need to be optimized. However, SYNC_INDEX, especially when running with a Parallel Degree greater
than 1, conflicts and locks when OPTIMIZE_INDEX is also running. The SYNC_INDEX option
CTX_DDL.LOCK_NOWAIT did not resolve the problem. So we now perform OPTIMIZE_INDEX for the Partition previous
to the one currently loading. However, since we can receive data for previous Partitions along with the current Partition, the
Partitions involved in the OPTIMIZE_INDEX had to be removed from the PENDING list. Appendix A4.3 shows the script
used to identify the partitions involved in the OPTIMIZE_INDEX process.
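A sketch of the corresponding optimize call for the previous partition; the names and the FULL optimization level are illustrative:

```sql
-- Optimize the previously loaded partition while the current one syncs.
BEGIN
  CTX_DDL.OPTIMIZE_INDEX(
    idx_name  => 'MY_XML_CTX_IDX',       -- placeholder index name
    optlevel  => CTX_DDL.OPTLEVEL_FULL,  -- full optimization of the partition
    part_name => 'P_20131231');          -- the partition loaded previously
END;
/
```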
An unforeseen impact of OPTIMIZE_INDEX is the large number of redo logs generated. There was a three- to four-fold
increase in the log switch frequency. Increasing the size of the redo log files may solve this particular problem.
6. Dealing with Oracle Label Security
The Label Tags used by Oracle Label Security (OLS) were not custom tags but were generated using the OLS procedures. Thus,
on the Target environment, we had to update the Label Tags coming in from the Source data. In testing, it was found that
updating the Label Tags after the data was migrated was extremely slow when the OLS Policy was enabled on the Table.
So we had to replace the Label Tags during the data migration. The method for finding the matching Label Tag
for a given Label was to create an equivalent OLS environment on the Target Exadata using the Labels and Policy from the
Source. The replacement of the Label Tags was done within the INSERT INTO ... SELECT FROM statement, using the
following construct for the Label Tag conversion:
(CASE
WHEN <Label_Tag> = Source_Label_Tag THEN <Target_Label_Tag>
ELSE <not_matched_tag>
END)
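In context, the conversion looks like the following sketch; the table names, columns, and tag values are hypothetical:

```sql
-- Hypothetical sketch: map source OLS label tags to target tags in-flight,
-- with a catch-all tag so discrepancies can be corrected afterwards.
INSERT /*+ APPEND */ INTO target_table
SELECT t1.col1
     , t1.col2
     , (CASE t1.ols_label_tag
          WHEN 1010 THEN 2010   -- source tag => matching target tag
          WHEN 1020 THEN 2020
          ELSE 9999             -- catch-all tag for unmatched labels
        END) AS ols_label_tag
  FROM source_table@dblink_pipe t1;
```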
APPENDICES
Appendix A1: Non Partition Tables Migrate
INSERT INTO <Table_Name>
SELECT /*+ PARALLEL (T1, <p_Degree>) */ *
FROM <Source_Table>@DbLink_Pipe T1;
Appendix A2: Migrate with Partition Exchange
v_Step := 'Create => '||v_Q_Table||' => '||p_Table_Name;
EXECUTE IMMEDIATE
  'create table '||v_Q_Table
  ||chr(10)||'TABLESPACE '||v_Tablespace_Name
  ||chr(10)||'NOLOGGING'
  ||chr(10)||'as select'
  ||chr(10)||'  /*+'
  ||chr(10)||'      parallel (a,'||p_Parallel_Source||')'
  ||chr(10)||'      no_index ('||p_No_Index_Hint||')'
  ||chr(10)||'  */'
  ||chr(10)||'  * from '||p_Table_Name||'@'||p_Data_Source||' a'
  ||chr(10)||'where a.PartKey'
  ||chr(10)||' BETWEEN '||(p_Start_Number - p_Offset_Start)
  ||chr(10)||'     AND '||(p_Start_Number - p_Offset_End);
v_Step := 'Create Matching Partition Indexes';   -- if RECORDS > 0
WORKING_INDEX
  (p_Table_Name      => p_Table_Name
  ,p_Working_Table   => v_Working_Table
  ,p_Tablespace_Name => v_Tablespace_Name
  ,p_Parallel_Target => p_Parallel_Target
  );
v_Step := 'Create Index ('||v_Q_Table||')';
EXECUTE IMMEDIATE
'create index '||v_Q_Table||'_I on '||v_Q_Table
||chr(10)||'(PartKey,PartKey_Keep)'
||chr(10)||'TABLESPACE '||v_Tablespace_Name
||' NOLOGGING PARALLEL '||p_Parallel_Target ;
v_Step := 'Gather Stats => '||v_Q_Table;
DBMS_STATS.GATHER_TABLE_STATS
( OWNNAME => USER
,TABNAME => v_Q_Table
,GRANULARITY => 'AUTO'
,DEGREE => DBMS_STATS.AUTO_DEGREE
,ESTIMATE_PERCENT => DBMS_STATS.AUTO_SAMPLE_SIZE
,METHOD_OPT => 'FOR ALL COLUMNS SIZE AUTO'
,CASCADE => TRUE
);
v_Step := 'Exchange Partition => '||v_Q_Table;
EXECUTE IMMEDIATE
  'ALTER TABLE '||p_Table_Name
  ||chr(10)||' EXCHANGE PARTITION '||v_Partition_Name
  ||chr(10)||' WITH TABLE '||v_Working_Table
  ||chr(10)||' INCLUDING INDEXES'
  ||chr(10)||' WITHOUT VALIDATION'
  ||chr(10)||' UPDATE GLOBAL INDEXES';
Appendix A3: Migrate into Sub Partition Table
INSERT INTO HASH_LOB_TABLE
SELECT /*+ NO_INDEX(A) */
* FROM Q_TABLE A
ORDER BY
A.PartKey_Keep
,(ORA_HASH(A.<hashkey>,31,0) + 1);
Appendix A4.1: Query for Partitions with Pending Text Index Sync using CTX_USER_PENDING
SELECT
T1.IDX_NAME
,T1.IDX_PARTITION_NAME
,COUNT(*) RECORDS
FROM CTX_USER_PENDING T1
GROUP BY
T1.IDX_NAME
,T1.IDX_PARTITION_NAME
/
Appendix A4.2: View for Partitions with Pending Text Index Sync using CTXSYS.DR$PENDING
CREATE OR REPLACE VIEW PARTS_PENDING_SYNC_V
AS
SELECT
Z1.INDEX_NAME
,Z1.PARTITION_NAME
,Z1.PENDING_RECORDS
FROM
(select /*+ parallel (t2,2) */
(select t3.idx_name from ctxsys.dr$index t3
where t3.idx_id = a1.pnd_cid
) index_name
,t2.ixp_name partition_name
,a1.records pending_records
from ctxsys.dr$index_partition t2
,(select
t1.pnd_cid
,t1.pnd_pid
,t1.records
from
(select /*+ parallel (t0,2) */
t0.pnd_cid
,t0.pnd_pid
,count(*) Records
from ctxsys.dr$pending t0
group by
t0.pnd_cid
,t0.pnd_pid
) t1
) a1
where t2.ixp_idx_id = a1.pnd_cid
and t2.ixp_id = a1.pnd_pid
) Z1
WHERE NOT EXISTS
(SELECT NULL FROM PARTS_OPTIMIZING_V Z2 -- See Appendix A4.3
WHERE Z2.PARTITION_NAME = Z1.PARTITION_NAME
)
/
Appendix A4.3: View for Partitions undergoing OPTIMIZE_INDEX (and DBMS_SCHEDULER Job)
create or replace view PARTS_OPTIMIZING_V
as
select
A1.table_name
,A1.partition_name
from user_tab_partitions A1
,(select
(CASE
WHEN t2.job_name = '<Optimize_index_job>'
THEN '<table_name>'
....
END) TABLE_NAME
,to_char(t2.last_start_date-1,'YYYYMMDD') partdate
from user_scheduler_jobs t2
where t2.job_name in
(<List of Optimize_Index Jobs>)
and t2.state = 'RUNNING'
) A2
where A1.table_name = A2.table_name
and substrb(A1.partition_name,12,8) = A2.partdate
/
REFERENCES
1. Expert Oracle Exadata
Kerry Osborne, Randy Johnson, Tanel Põder – Apress – ISBN-13: 978-1-4302-3392-3 – Published 2011-08-07
2. Oracle Exadata Recipes
John Clarke – Apress – ISBN-13: 978-1-4302-4914-6 – Published 2013-02-05
3. Oracle Database 11g Release 2 Performance Tuning Tips & Techniques
Richard Niemiec – Oracle Press – ISBN-13: 978-0-07-178026-1 – Published 2012-02-27
4. Expert Oracle Database Architecture, Second Edition
Thomas Kyte – Apress – ISBN-13: 978-1-4302-2946-9 – Published 2010-07-25