Paper 106-29
Methods of Storing SAS® Data into Oracle Tables
Lois Levin, Independent Consultant, Bethesda, MD
ABSTRACT
There are several ways to create a DBMS table from a SAS data set. This paper discusses the SAS methods that may be
used to perform loads or updates, compares their elapsed times, and describes the INSERTBUFF and BULKLOAD options,
which perform multiple-row inserts and direct path loads to optimize performance.
INTRODUCTION
You may need to store data into a database to perform a full database load, to update an existing table, or simply to
create a temporary table for joining to an existing one. Whenever you store data into a database, you will need to
consider methods that are fast, and also safe and auditable enough to protect the integrity of the database. We will
discuss the following four methods for creating or updating a table:
1) PROC DBLOAD
2) The DATA step
3) PROC SQL
4) PROC APPEND
Then we will see how these methods can be optimized using
1) Multiple row inserts
2) Bulk loads
All examples use Oracle 9i and the native Oracle loader SQL*Loader, Release 9.0.1.0.0. All text refers to Oracle, but most
examples are compatible with other DBMSs unless noted.
All examples were run using SAS V8.2.
All examples were run on a Compaq Tru64 UNIX V5.1 (Rev. 732).
The sample dataset contains 10 million observations and 20 variables.
The database user, password, and path names have been previously defined in %let statements and are referenced here as
&user, &pass, and &path.
TRANSACTIONAL UPDATING USING PROC DBLOAD
PROC DBLOAD performs updates using the transactional method of update. That is, it executes a standard Oracle insert, one
row at a time. An example of code for loading a sample file of 10 million records and 20 variables is:
*-----------------------*;
* Create a table *;
* with PROC DBLOAD *;
*-----------------------*;
proc dbload dbms=oracle data=testin;
path = &path;
user = &user;
pw = &pass;
table = lltemp;
commit= 1000000;
limit = 0;
load;
run;
SUGI 29 Data Warehousing, Management and Quality
Note that COMMIT is set to 1000000, a very large number for this example. It indicates how many rows to insert before
issuing a save. A small COMMIT value saves often; a large value issues fewer saves. A large COMMIT value results in a
faster load but not necessarily the safest one. You will have to choose an appropriate COMMIT value for your data and
system environment.
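The commit-interval tradeoff is not SAS-specific. Here is a minimal sketch in Python (using sqlite3 rather than SAS and Oracle, with an invented table and row shape, purely to illustrate the mechanism):

```python
import sqlite3

def load_with_commit_interval(rows, commit_every):
    """Insert rows one at a time, issuing a save (COMMIT) after every
    `commit_every` inserts, mirroring PROC DBLOAD's COMMIT= behavior.
    Returns the number of intermediate commits issued."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE lltemp (id INTEGER, val TEXT)")
    commits = 0
    for i, row in enumerate(rows, start=1):
        conn.execute("INSERT INTO lltemp VALUES (?, ?)", row)
        if i % commit_every == 0:
            conn.commit()
            commits += 1
    conn.commit()  # final save for any uncommitted remainder
    return commits

rows = [(i, "x") for i in range(10_000)]
load_with_commit_interval(rows, 1_000)   # commits often: safer, slower
load_with_commit_interval(rows, 10_000)  # one big commit: faster, riskier
```

A failure midway loses at most `commit_every` uncommitted rows, which is the safety/speed tradeoff behind the COMMIT setting.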
The log includes the following statements:
NOTE: Load completed. Examine statistics below.
NOTE: Inserted (10000000) rows into table (LLTEMP)
NOTE: Rejected (0) insert attempts see the log for details.
NOTE: There were 10000000 observations read from the dataset WORK.TESTIN.
NOTE: PROCEDURE DBLOAD used:
real time 4:58:13.51
cpu time 47:47.14
USING THE LIBNAME ENGINE
In SAS Version 7/8, the LIBNAME engine allows the user to specify the database in the LIBNAME statement. The database
can then be referenced by any DATA step or PROC statement as if it were a SAS library, and a table can be referenced as
a SAS data set.
So, just as a data set can be created using the DATA step, a DBMS table can be created in exactly the same way.
Here is an example:
*-----------------------*;
* Create a table *;
* with DATA Step *;
*-----------------------*;
libname cdwdir oracle user = &user
password = &pass
path = &path;
data cdwdir.lltemp;
set testin;
run;
The log for this example includes:
NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
NOTE: There were 10000000 observations read from the data set WORK.TESTIN.
NOTE: The data set CDWDIR.LLTEMP has 10000000 observations and 20 variables.
NOTE: DATA statement used:
real time 46:14.96
cpu time 15:25.09
Loads can also be performed using the LIBNAME statement and PROC SQL.
Here is an example:
*-----------------------*;
* Create a table *;
* with PROC SQL *;
*-----------------------*;
libname cdwdir oracle user = &user
password = &pass
path = &path;
proc sql;
create table cdwdir.lltemp as
select * from testin;
quit;
The log includes:
NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
NOTE: Table CDWDIR.LLTEMP created, with 10000000 rows and 20 columns.
21 quit;
NOTE: PROCEDURE SQL used:
real time 49:40.27
cpu time 16:12.06
A note about SQL pass-through: Normally, using the LIBNAME statement implies that the LIBNAME engine is in effect and
therefore all processing is performed in SAS. Since we are writing to the database, it would seem that we should use the
SQL pass-through method, which moves all processing into the DBMS. In fact, for these loads all processing must happen
in the DBMS anyway, so the two approaches are equivalent. For the sake of simplicity, all examples here use the
LIBNAME statement.
The other PROC that may be used is PROC APPEND. PROC APPEND should be used to append data to an existing table
but it can also be used to load a new table.
Here is an example using PROC APPEND:
*------------------------*;
* Create a table *;
* with PROC APPEND *;
*------------------------*;
libname cdwdir oracle user = &user
password = &pass
path = &path;
proc append base=cdwdir.lltemp
data=testin;
run;
The log includes:
NOTE: Appending WORK.TESTIN to CDWDIR.LLTEMP.
NOTE: BASE data set does not exist. DATA file is being copied to BASE file.
NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
NOTE: There were 10000000 observations read from the data set WORK.TESTIN.
NOTE: The data set CDWDIR.LLTEMP has 10000000 observations and 20 variables.
NOTE: PROCEDURE APPEND used:
real time 43:48.65
cpu time 15:45.76
Note the comment generated by the LIBNAME engine. If your SAS data set includes labels, formats, and lengths, they cannot
be written to the DBMS table. These are SAS constructs and are not understood by any DBMS. If your data set contains
indices, they also cannot be loaded into the DBMS. You will have to remove the indices, load the table, and then rebuild the
indices using the DBMS software.
MULTIPLE ROW INSERTS
The loads we have just seen are all transactional loads and insert one row (observation) at a time. Loads using the LIBNAME
engine can be optimized by including the INSERTBUFF option in the LIBNAME statement. This speeds the update by allowing
you to specify the number of rows to be included in a single Oracle insert.
The INSERTBUFF option uses the transactional insert method and is comparable to using the Oracle SQL*Loader
Conventional Path Load.
Note that this is different from the COMMIT statement in DBLOAD. COMMIT only controls how often a save is executed;
rows are still inserted individually. INSERTBUFF actually inserts multiple observations within one Oracle insert operation.
To use the INSERTBUFF option, just include it in the LIBNAME statement like this:
libname cdwdir oracle user = &user
password = &pass
path = &path
INSERTBUFF=32767;
This example uses 32767, which is the maximum buffer size. In most cases the maximum can actually slow the system, and
a smaller value is suggested. The best buffer size depends on your operating environment, and you will have to experiment
to find it.
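To see why buffering matters, here is a minimal sketch (again in Python with sqlite3, not SAS; the table and row shape are invented) of how grouping rows into batches of INSERTBUFF size reduces the number of insert operations:

```python
import sqlite3

def insert_with_buffer(rows, insertbuff):
    """Send rows to the database in batches of `insertbuff`, roughly the
    way INSERTBUFF= groups observations into one insert operation.
    Returns the number of insert round trips."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE lltemp (id INTEGER, val TEXT)")
    round_trips = 0
    for start in range(0, len(rows), insertbuff):
        conn.executemany("INSERT INTO lltemp VALUES (?, ?)",
                         rows[start:start + insertbuff])
        round_trips += 1
    conn.commit()
    return round_trips

rows = [(i, "x") for i in range(100)]
insert_with_buffer(rows, 1)   # 100 single-row inserts (transactional style)
insert_with_buffer(rows, 32)  # 4 batched inserts with a buffer of 32
```

The cost per round trip is why a buffered load can finish in a fraction of the transactional load's time, and also why an oversized buffer can backfire once memory pressure sets in.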
INSERTBUFF with the DATA STEP
If we use the INSERTBUFF option, and then load with a DATA step, the log is as follows:
18 data cdwdir.lltemp;
19 set testin;
20 run;
NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
NOTE: There were 10000000 observations read from the data set WORK.TESTIN.
NOTE: The data set CDWDIR.LLTEMP has 10000000 observations and 20 variables.
NOTE: DATA statement used:
real time 14:30.71
cpu time 8:33.30
INSERTBUFF with PROC SQL
If we use the INSERTBUFF option, and load with PROC SQL, the log includes:
18 proc sql;
19 create table cdwdir.lltemp as
20 select * from testin;
NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
NOTE: Table CDWDIR.LLTEMP created, with 10000000 rows and 20 columns.
21 quit;
NOTE: PROCEDURE SQL used:
real time 11:41.02
cpu time 8:40.06
INSERTBUFF with PROC APPEND
If we use the INSERTBUFF option and PROC APPEND, the log includes:
18 proc append base=cdwdir.lltemp
19 data=testin;
20 run;
NOTE: Appending WORK.TESTIN to CDWDIR.LLTEMP.
NOTE: BASE data set does not exist. DATA file is being copied to BASE file.
NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
NOTE: There were 10000000 observations read from the data set WORK.TESTIN.
NOTE: The data set CDWDIR.LLTEMP has 10000000 observations and 20 variables.
NOTE: PROCEDURE APPEND used:
real time 19:29.32
cpu time 8:19.73
BULKLOAD
In Version 8, the SAS/ACCESS engine for Oracle can use the BULKLOAD option to call Oracle's native load utility,
SQL*Loader. It activates the DIRECT=TRUE option to execute a direct path load, which optimizes loading performance.
In a direct path (bulk) load, SQL*Loader writes directly to the database rather than executing individual SQL INSERT
statements. Because it bypasses standard SQL statement processing, it results in a significant improvement in
performance.
The BULKLOAD option may also be used with DB2, Sybase, Teradata, ODBC, and OLE DB. In all cases it makes use of the
native load utility for that particular DBMS. Bulk loading is available under the OS/390 operating system in SAS Version 8.2.
BULKLOAD is not available with the DBLOAD procedure (except with Sybase where you may use the BULKCOPY option).
DBLOAD can only use the transactional method of loading and BULKLOAD uses the direct load method. BULKLOAD can be
used with a DATA step, PROC SQL and with PROC APPEND to load or update data into tables.
There is some overhead involved in setting up the loader so BULKLOAD may not be as efficient for small table loads as it is
for very large tables. You will need to test this with your particular application.
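The overhead argument can be made concrete with a toy cost model (the numbers below are purely hypothetical, not measurements from this paper): bulk loading pays a fixed setup cost but a much lower per-row cost, so there is a crossover table size below which the transactional path is cheaper.

```python
def load_seconds(n_rows, per_row, setup=0.0):
    """Toy linear cost model: fixed setup time plus per-row time."""
    return setup + n_rows * per_row

# Hypothetical costs for illustration only.
def transactional(n):
    return load_seconds(n, per_row=2e-4)             # no setup, slow per row

def bulk(n):
    return load_seconds(n, per_row=2e-5, setup=5.0)  # setup cost, fast per row

# Small table: the setup overhead dominates, so transactional wins.
# Large table: the per-row savings dominate, so bulk wins.
```

Where the crossover falls for your data is exactly what the paper suggests testing in your own environment.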
BULKLOAD with the DATA Step
Here is an example using the BULKLOAD option with the DATA step:
*----------------------*;
* BULKLOAD *;
* with DATA Step *;
*----------------------*;
libname cdwdir oracle user = &user
password = &pass
path = &path;
data cdwdir.lltemp(BULKLOAD=YES
BL_DELETE_DATAFILE=YES);
set testin;
run;
The BULKLOAD=YES option invokes the DIRECT=TRUE option for SQL*Loader. Without this, a conventional load will be
performed.
Although there is a LIBNAME statement, we are not actually using the LIBNAME engine in SAS. The BULKLOAD=YES
option directs Oracle SQL*Loader to take over. When this happens, we are no longer in SAS: all of the log statements and
messages come from Oracle. When SQL*Loader is finished, SAS takes over again.
BULKLOAD creates several permanent files: the .dat file, the .ctl file, and the .log file. The .dat file is a flat file
containing all of the data to be loaded into Oracle. The .ctl file contains the Oracle statements needed for the load. The
.log file is the SQL*Loader log. These files remain after the load completes, so if you don't want to keep them, you must
remove them explicitly. By default they are named BL_<tablename>_0.dat, BL_<tablename>_0.log, etc., and they are
located in the current directory. The BULKLOAD options BL_DATAFILE= and BL_LOG= allow you to specify fully qualified
names for these files, so you can use any name and any location you like.
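The default naming convention just described can be captured in a couple of lines of Python (the helper name is my own invention; it simply applies the BL_<tablename>_<n>.<ext> pattern):

```python
def bulkload_work_files(table, sequence=0):
    """Default names of the SQL*Loader work files that BULKLOAD creates
    in the current directory: BL_<tablename>_<n>.dat/.ctl/.log."""
    base = f"BL_{table.upper()}_{sequence}"
    return {ext: f"{base}.{ext}" for ext in ("dat", "ctl", "log")}

bulkload_work_files("lltemp")  # e.g. the BL_LLTEMP_0.log seen in the SAS log
```

Knowing these names in advance is what lets you script their cleanup when you choose not to use BL_DELETE_DATAFILE.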
The BL_DELETE_DATAFILE=YES option is used here because the .dat file can be very large (as large as your data). You
need enough storage space to hold it, but you probably do not want to keep it after the load is complete, so this option
deletes it automatically after loading. (Other DBMSs generate different default data file names (.ixt in DB2) and default
BL_DELETE_DATAFILE to YES, so you don't have to specify it yourself.)
The SAS log file for a bulk load is much longer than a standard SAS log because it will contain the SQL*Loader log statements
as well as the SAS log statements. The log includes the following:
***** Begin: SQL*Loader Log File *****
.
. (statements omitted)
.
***** End: SQL*Loader Log File *****
NOTE:
***************************************
Please look in the SQL*Loader log file for the load results.
SQL*Loader Log File location: -- BL_LLTEMP_0.log --
Note: In a Client/Server environment, the Log File is located on the Server. The log
file is also echoed in the Server SAS log file.
***************************************
NOTE: DATA statement used:
real time 15:26.99
cpu time 2:19.11
BULKLOAD with PROC SQL
Here is an example using BULKLOAD with PROC SQL. The BL_LOG= option is included to specify the name and location for
the BULKLOAD log.
*----------------------*;
* BULKLOAD *;
* with PROC SQL *;
*----------------------*;
libname cdwdir oracle user = &user
password = &pass
path = &path;
proc sql;
create table cdwdir.lltemp(BULKLOAD=YES
BL_LOG="/full/path/name/bl_lltemp.log"
BL_DELETE_DATAFILE=YES)
as select * from testin;
quit;
The SAS log for this example includes:
NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
***** Begin: SQL*Loader Log File *****
.
. (statements omitted)
.
***** End: SQL*Loader Log File *****
NOTE: Table CDWDIR.LLTEMP created, with 10000000 rows and 20 columns.
NOTE:
***************************************
Please look in the SQL*Loader log file for the load results.
SQL*Loader Log File location: -- /full/path/name/bl_lltemp.log --
Note: In a Client/Server environment, the Log File is located on the Server.
The log file is also echoed in the Server SAS log file.
****************************************
22 quit;
NOTE: PROCEDURE SQL used:
real time 15:21.37
cpu time 2:34.26
BULKLOAD with PROC APPEND
This next example shows the same load using PROC APPEND. Here we have also included the SQL*Loader option
ERRORS= to tell Oracle to stop loading after 50 errors. (The default is 50, but it is included here as an example.) All
SQL*Loader options are preceded by the BL_OPTIONS= statement, and the entire option list is enclosed in quotes. (Note
that the option ERRORS= should be spelled with an 'S'; the documentation indicating ERROR= is incorrect.) Further
information on the SQL*Loader options can be found in the Oracle documentation.
Sample code is:
*----------------------*;
* BULKLOAD *;
* with PROC APPEND *;
*----------------------*;
libname cdwdir oracle user = &user
password = &pass
path = &path;
proc append base=cdwdir.lltemp (BULKLOAD=YES
BL_OPTIONS=’ERRORS=50’
BL_DELETE_DATAFILE=YES)
data=testin;
run;
The SAS log includes:
NOTE: Appending WORK.TESTIN to CDWDIR.LLTEMP.
NOTE: BASE data set does not exist. DATA file is being copied to BASE file.
NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
NOTE: There were 10000000 observations read from the data set WORK.TESTIN.
NOTE: The data set CDWDIR.LLTEMP has 10000000 observations and 20 variables.
***** Begin: SQL*Loader Log File *****
.
. (statements omitted)
.
***** End: SQL*Loader Log File *****
NOTE:
***************************************
Please look in the SQL*Loader log file for the load results.
SQL*Loader Log File location: -- BL_LLTEMP_0.log --
Note: In a Client/Server environment, the Log File is located on the Server.
The log file is also echoed in the Server SAS log file.
**************************************
NOTE: PROCEDURE APPEND used:
real time 15:20.35
cpu time 2:02.73
VERIFICATION AND DEBUGGING
If you need to debug a process, or if you are just interested in what is going on between SAS and the DBMS, you can use
the SASTRACE option to see all of the statements that are passed to the DBMS. These will, of course, vary with the
DBMS used. It is recommended that you use SASTRACE with a small test data set so that you do not see every single
internal COMMIT. You may also wish to direct the output to the log; otherwise you may get excessive screen output.
Here is how to invoke the trace and direct it to the log file.
options sastrace=',,,d' sastraceloc=saslog;
BULKLOAD OPTIONS
The BULKLOAD statement allows several options to control the load and its output. The examples above show the use of
BL_DELETE_DATAFILE to delete the .dat file after loading, BL_LOG= to name and locate the .log file, and BL_OPTIONS=
to begin the list of SQL*Loader options. Note that the BULKLOAD options and the SQL*Loader options are different: the
BULKLOAD options are documented in the SAS documentation, and the SQL*Loader options in the Oracle
documentation. The list of BULKLOAD options follows.
BL_BADFILE
- creates file BL_<tablename>.bad containing rejected rows
BL_CONTROL
- creates file BL_<tablename>.ctl containing DDL statements for the SQLLDR control file
BL_DATAFILE
- creates file BL_<tablename>.dat containing all of the data to be loaded
BL_DELETE_DATAFILE
- =YES will delete the .dat file created for loading
- =NO will keep the .dat file
BL_DIRECT_PATH
- =YES sets the Oracle SQL*Loader option DIRECT to TRUE, enabling SQL*Loader to use a direct path load to insert
  rows into a table. This is the default.
- =NO sets the Oracle SQL*Loader option DIRECT to FALSE, so SQL*Loader uses a conventional path load to insert
  rows into a table.
BL_DISCARDFILE
- creates file BL_<tablename>.dsc containing rows neither rejected nor loaded
BL_LOAD_METHOD
- INSERT when loading data into an empty table
- APPEND when loading data into a table containing data
- REPLACE and TRUNCATE override the default of APPEND
BL_LOG
- creates file BL_<tablename>.log containing the SQL*Loader log
BL_OPTIONS
- begins the list of SQL*Loader options; these are fully documented in the Oracle documentation
BL_SQLLDR_PATH
- path name of the SQLLDR executable. This should be set automatically.
SUMMARY
Here is a summary table of the load times for all of the methods described. Please note that these programs were run on
a very busy system, and real times vary greatly with the system load. CPU time is more consistent and is therefore a
better basis for comparison.
CONCLUSION
Optimizing with multiple-row inserts and bulk loading is much faster than the transactional method of loading and far more
efficient than the old DBLOAD process.
BULKLOAD uses much less CPU time and is the fastest method for loading large databases.
Summary of load times:

              TRANSACTIONAL   MULTI-ROW    BULKLOAD
DBLOAD
  Real Time      4:58:13.51
  CPU Time         47:47.14
DATA Step
  Real Time        46:14.96    14:30.71    15:26.99
  CPU Time         15:25.09     8:33.30     2:19.11
PROC SQL
  Real Time        49:40.27    11:41.02    15:21.37
  CPU Time         16:12.06     8:40.06     2:34.26
PROC APPEND
  Real Time        43:48.65    19:29.32    15:20.35
  CPU Time         15:45.76     8:19.73     2:02.73

CONTACT INFORMATION
The author may be contacted at
Lois831@hotmail.com
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and
other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.

Tony Jambu   (obscure) tools of the trade for tuning oracle sq lsTony Jambu   (obscure) tools of the trade for tuning oracle sq ls
Tony Jambu (obscure) tools of the trade for tuning oracle sq ls
 
Confoo 2021 -- MySQL New Features
Confoo 2021 -- MySQL New FeaturesConfoo 2021 -- MySQL New Features
Confoo 2021 -- MySQL New Features
 
0396 oracle-goldengate-12c-tutorial
0396 oracle-goldengate-12c-tutorial0396 oracle-goldengate-12c-tutorial
0396 oracle-goldengate-12c-tutorial
 
Steps for upgrading the database to 10g release 2
Steps for upgrading the database to 10g release 2Steps for upgrading the database to 10g release 2
Steps for upgrading the database to 10g release 2
 
Watch Re-runs on your SQL Server with RML Utilities
Watch Re-runs on your SQL Server with RML UtilitiesWatch Re-runs on your SQL Server with RML Utilities
Watch Re-runs on your SQL Server with RML Utilities
 
Oracle database 12c intro
Oracle database 12c introOracle database 12c intro
Oracle database 12c intro
 
Oracle notes
Oracle notesOracle notes
Oracle notes
 

Kürzlich hochgeladen

New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 

Kürzlich hochgeladen (20)

New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 

Sas bulk load

  • 1. 1 Paper 106-29 Methods of Storing SAS® Data into Oracle Tables Lois Levin, Independent Consultant, Bethesda, MD ABSTRACT There are several ways to create a DBMS table from a SAS dataset. This paper will discuss the SAS methods that may be used to perform loads or updates, a comparison of times elapsed, and the INSERTBUFF option and the BULKLOAD option that can be used to perform multiple row inserts and direct path loads to optimize performance. INTRODUCTION You may need to store data into a database to perform a full database load, to update an existing table, or you may just be creating a temporary table for the purpose of joining to an existing table. But whenever you store data into a database you will need to consider methods that are fast and also safe and auditable to protect the integrity of the database. We will discuss the following 4 methods for creating or updating a table. They are 1) PROC DBLOAD 2) The DATA step 3) PROC SQL 4) PROC APPEND Then we will see how these methods can be optimized using 1) Multiple row inserts 2) Bulk loads All examples use Oracle 9i and the native Oracle loader SQL*Loader, Release 9.0.1.0.0. All text will refer to Oracle but most examples will be compatible with other DBMS’s unless noted. All examples were run using SAS V8.2. All examples were run on a Compaq Tru64 UNIX V5.1 (Rev. 732). The sample dataset contains 10 million observations and 20 variables. The database user, password, and path names have been previously defined in %let statements and are referenced here as &user, &pass, and &path. TRANSACTIONAL UPDATING USING PROC DBLOAD PROC DBLOAD performs updates using the transactional method of update. That is, it executes a standard Oracle insert, one row at a time. 
proc dbload dbms=oracle data=testin;
   path = &path;
   user = &user;
   pw   = &pass;
   table = lltemp;
   commit = 1000000;
   limit = 0;
   load;
run;

SUGI 29 Data Warehousing, Management and Quality
Note that the COMMIT statement is set to 1000000, which is a very large number for this example. It indicates how many rows to insert before issuing a save: a small COMMIT value saves often, while a large value issues fewer saves. This large value will give a faster load but not necessarily the safest one. You will have to choose the appropriate COMMIT value for your data and system environment.

The log includes the following statements:

NOTE: Load completed. Examine statistics below.
NOTE: Inserted (10000000) rows into table (LLTEMP)
NOTE: Rejected (0) insert attempts see the log for details.
NOTE: There were 10000000 observations read from the dataset WORK.TESTIN.
NOTE: PROCEDURE DBLOAD used:
      real time           4:58:13.51
      cpu time            47:47.14

USING THE LIBNAME ENGINE

In SAS Version 7/8 the LIBNAME engine allows the user to specify the database in the LIBNAME statement. The database can then be referenced by any DATA step or PROC statement as if it were a SAS library, and a table can be referenced as a SAS dataset. So, just as a dataset can be created using the DATA step, a DBMS table can be created in exactly the same way. Here is an example:

*-----------------------*;
* Create a table        *;
* with DATA Step        *;
*-----------------------*;
libname cdwdir oracle
   user     = &user
   password = &pass
   path     = &path;

data cdwdir.lltemp;
   set testin;
run;

The log for this example includes:

NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
NOTE: There were 10000000 observations read from the data set WORK.TESTIN.
NOTE: The data set CDWDIR.LLTEMP has 10000000 observations and 20 variables.
NOTE: DATA statement used:
      real time           46:14.96
      cpu time            15:25.09

Loads can also be performed using the LIBNAME statement and PROC SQL. Here is an example:

*-----------------------*;
* Create a table        *;
* with PROC SQL         *;
*-----------------------*;
libname cdwdir oracle
   user     = &user
   password = &pass
   path     = &path;
proc sql;
   create table cdwdir.lltemp as
      select * from testin;
quit;

The log includes:

NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
NOTE: Table CDWDIR.LLTEMP created, with 10000000 rows and 20 columns.
21   quit;
NOTE: PROCEDURE SQL used:
      real time           49:40.27
      cpu time            16:12.06

A note about SQL pass-through: Normally, using the LIBNAME statement implies that the LIBNAME engine is in effect and therefore all processing is performed in SAS. Since we are writing to the database, it might seem that we should instead use the SQL pass-through method, which moves all processing into the DBMS. In fact, for these loads all processing must happen in the DBMS anyway, so both approaches are equivalent. For the sake of simplicity, all examples here use the LIBNAME statement.

The other PROC that may be used is PROC APPEND. PROC APPEND is meant to append data to an existing table, but it can also be used to load a new table. Here is an example using PROC APPEND:

*------------------------*;
* Create a table         *;
* with PROC APPEND       *;
*------------------------*;
libname cdwdir oracle
   user     = &user
   password = &pass
   path     = &path;

proc append base=cdwdir.lltemp
            data=testin;
run;

The log includes:

NOTE: Appending WORK.TESTIN to CDWDIR.LLTEMP.
NOTE: BASE data set does not exist. DATA file is being copied to BASE file.
NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
NOTE: There were 10000000 observations read from the data set WORK.TESTIN.
NOTE: The data set CDWDIR.LLTEMP has 10000000 observations and 20 variables.
NOTE: PROCEDURE APPEND used:
      real time           43:48.65
      cpu time            15:45.76

Note the comment generated by the LIBNAME engine. If your SAS data set includes labels, formats, and lengths, they cannot be written to the DBMS table; these are SAS constructs and are not understood by any DBMS. If your data set contains indices, they also cannot be loaded into the DBMS. You will have to remove the indices, load the table, and then rebuild the indices using the DBMS software.
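Since indices cannot travel with the data, the drop-and-rebuild step has to be issued in the DBMS's own SQL. One way to do this without leaving SAS is explicit SQL pass-through; the following is a minimal sketch, in which the index name LLTEMP_IDX and the column ID are hypothetical:

```sas
*-------------------------------------*;
* Drop and rebuild an Oracle index    *;
* around a load (sketch; the index    *;
* and column names are hypothetical)  *;
*-------------------------------------*;
proc sql;
   connect to oracle (user=&user password=&pass path=&path);
   execute (drop index lltemp_idx) by oracle;   /* remove index before loading */
   disconnect from oracle;
quit;

/* ... load CDWDIR.LLTEMP by any of the methods in this paper ... */

proc sql;
   connect to oracle (user=&user password=&pass path=&path);
   execute (create index lltemp_idx on lltemp (id)) by oracle;  /* rebuild */
   disconnect from oracle;
quit;
```

The EXECUTE ... BY ORACLE statements are passed to Oracle untranslated, so any DDL your DBMS accepts can be issued this way.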
MULTIPLE ROW INSERTS

The loads we have just seen are all transactional loads and insert one row (observation) at a time. Loads using the LIBNAME engine can be optimized by including the INSERTBUFF option in the LIBNAME statement. This speeds the update by allowing you to specify the number of rows to be included in a single Oracle insert. The INSERTBUFF option uses the transactional insert method and is comparable to using the Oracle SQL*Loader conventional path load.

Note that this is different from the COMMIT statement in DBLOAD. The COMMIT statement just executes a save after a specified number of individual row inserts; INSERTBUFF actually inserts multiple observations within one Oracle insert operation.

To use the INSERTBUFF option, just include it in the LIBNAME statement like this:

libname cdwdir oracle
   user     = &user
   password = &pass
   path     = &path
   INSERTBUFF=32767;

This example uses 32767, which is the maximum buffer size. In most cases using the maximum can actually slow the system, and a smaller value is suggested. The best buffer size depends on your operating environment, and you will have to experiment to find it.

INSERTBUFF with the DATA STEP

If we use the INSERTBUFF option, and then load with a DATA step, the log is as follows:

18   data cdwdir.lltemp;
19      set testin;
20   run;

NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
NOTE: There were 10000000 observations read from the data set WORK.TESTIN.
NOTE: The data set CDWDIR.LLTEMP has 10000000 observations and 20 variables.
NOTE: DATA statement used:
      real time           14:30.71
      cpu time            8:33.30

INSERTBUFF with PROC SQL

If we use the INSERTBUFF option, and load with PROC SQL, the log includes:

18   proc sql;
19   create table cdwdir.lltemp as
20      select * from testin;
NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
NOTE: Table CDWDIR.LLTEMP created, with 10000000 rows and 20 columns.
21   quit;
NOTE: PROCEDURE SQL used:
      real time           11:41.02
      cpu time            8:40.06

INSERTBUFF with PROC APPEND

If we use the INSERTBUFF option and PROC APPEND, the log includes:
18   proc append base=cdwdir.lltemp
19               data=testin;
20   run;

NOTE: Appending WORK.TESTIN to CDWDIR.LLTEMP.
NOTE: BASE data set does not exist. DATA file is being copied to BASE file.
NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
NOTE: There were 10000000 observations read from the data set WORK.TESTIN.
NOTE: The data set CDWDIR.LLTEMP has 10000000 observations and 20 variables.
NOTE: PROCEDURE APPEND used:
      real time           19:29.32
      cpu time            8:19.73

BULKLOAD

In Version 8, the SAS/ACCESS engine for Oracle can use the BULKLOAD option to call Oracle's native load utility, SQL*Loader. It activates the DIRECT=TRUE option to execute a direct path load, which optimizes loading performance. Direct path loading, or bulk loading, means that SQL*Loader writes directly to the database instead of executing individual SQL INSERT statements. Because it bypasses standard SQL statement processing, it yields a significant improvement in performance.

The BULKLOAD option may also be used with DB2, Sybase, Teradata, ODBC, and OLE DB. In all cases it makes use of the native load utility for that particular DBMS. Bulk loading is available with the OS/390 operating system in SAS Version 8.2.

BULKLOAD is not available with the DBLOAD procedure (except with Sybase, where you may use the BULKCOPY option). DBLOAD can only use the transactional method of loading, while BULKLOAD uses the direct load method.

BULKLOAD can be used with a DATA step, PROC SQL, and PROC APPEND to load or update data into tables. There is some overhead involved in setting up the loader, so BULKLOAD may not be as efficient for small table loads as it is for very large tables. You will need to test this with your particular application.
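Because the break-even point between settings depends on table size and environment, a small timing harness is useful. This macro sketch reruns the DATA step load under several INSERTBUFF values; the candidate values are illustrative, and FULLSTIMER simply makes the timing notes in the log more detailed:

```sas
*--------------------------------------*;
* Time a load under several INSERTBUFF *;
* settings (sketch; candidate values   *;
* are illustrative)                    *;
*--------------------------------------*;
options fullstimer;                          /* detailed timing notes in the log */

%macro trybuff(values);
   %local i n;
   %let i = 1;
   %let n = %scan(&values, &i);
   %do %while(%length(&n) > 0);
      libname cdwdir oracle user=&user password=&pass path=&path
              insertbuff=&n;
      %put NOTE- Timing load with INSERTBUFF=&n;
      data cdwdir.lltemp;                    /* timings appear in the log */
         set testin;
      run;
      proc datasets lib=cdwdir nolist;       /* drop the table before the next trial */
         delete lltemp;
      quit;
      libname cdwdir clear;
      %let i = %eval(&i + 1);
      %let n = %scan(&values, &i);
   %end;
%mend trybuff;

%trybuff(1000 5000 10000 32767)
```

Comparing the CPU times reported in the log for each trial gives a fair ranking even on a busy system.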
BULKLOAD with the DATA Step

Here is an example using the BULKLOAD option with the DATA step:

*----------------------*;
* BULKLOAD             *;
* with DATA Step       *;
*----------------------*;
libname cdwdir oracle
   user     = &user
   password = &pass
   path     = &path;

data cdwdir.lltemp(BULKLOAD=YES BL_DELETE_DATAFILE=YES);
   set testin;
run;

The BULKLOAD=YES option invokes the DIRECT=TRUE option for SQL*Loader. Without it, a conventional load will be performed.

Although there is a LIBNAME statement, we are not actually using the LIBNAME engine in SAS. The BULKLOAD=YES option directs Oracle SQL*Loader to take over, and while it runs we are no longer in SAS: all of the log statements and messages come from Oracle. When SQL*Loader is finished, SAS takes over again.
BULKLOAD will create several permanent files: the .dat file, the .ctl file, and the .log file. The .dat file is a flat file containing all of the data to be loaded into Oracle. The .ctl file contains the Oracle statements needed for the load. The .log file is the SQL*Loader log. These files remain after the load is completed, so if you don't want to keep them, you must remove them explicitly. By default they are named BL_<tablename>_0.dat, BL_<tablename>_0.log, etc., and they are located in the current directory. The BULKLOAD options BL_DATAFILE= and BL_LOG= allow you to specify fully qualified names for these files, so you can use any name and any location you like.

The BL_DELETE_DATAFILE=YES option is used here because the .dat file can be very large (as large as your data). You need to be sure you have enough storage space to hold it, but you probably do not want to keep it after the load is complete, so this option deletes it automatically after loading. (Other DBMS's generate different default data file names (.ixt in DB2) and set the default for the BL_DELETE_DATAFILE option to YES, so you don't have to specify it yourself.)

The SAS log file for a bulk load is much longer than a standard SAS log because it contains the SQL*Loader log statements as well as the SAS log statements. The log includes the following:

***** Begin: SQL*Loader Log File *****
.
. (statements omitted)
.
***** End: SQL*Loader Log File *****
NOTE: ***************************************
Please look in the SQL*Loader log file for the load results.
SQL*Loader Log File location:
-- BL_LLTEMP_0.log --
Note: In a Client/Server environment, the Log File is located on the Server.
The log file is also echoed in the Server SAS log file.
***************************************
NOTE: DATA statement used:
      real time           15:26.99
      cpu time            2:19.11

BULKLOAD with PROC SQL

Here is an example using BULKLOAD with PROC SQL. The BL_LOG= option is included to specify the name and location for the BULKLOAD log.

*----------------------*;
* BULKLOAD             *;
* with PROC SQL        *;
*----------------------*;
libname cdwdir oracle
   user     = &user
   password = &pass
   path     = &path;

proc sql;
   create table cdwdir.lltemp(BULKLOAD=YES
                              BL_LOG="/full/path/name/bl_lltemp.log"
                              BL_DELETE_DATAFILE=YES) as
      select * from testin;
quit;

The SAS log for this example includes:

NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
***** Begin: SQL*Loader Log File *****
.
. (statements omitted)
.
***** End: SQL*Loader Log File *****
NOTE: Table CDWDIR.LLTEMP created, with 10000000 rows and 20 columns.
NOTE: ***************************************
Please look in the SQL*Loader log file for the load results.
SQL*Loader Log File location:
-- /full/path/name/bl_lltemp.log --
Note: In a Client/Server environment, the Log File is located on the Server.
The log file is also echoed in the Server SAS log file.
***************************************
22   quit;
NOTE: PROCEDURE SQL used:
      real time           15:21.37
      cpu time            2:34.26

BULKLOAD with PROC APPEND

This next example shows the same load using PROC APPEND. Here we have also included the SQL*Loader option ERRORS= to tell Oracle to stop loading after 50 errors. (50 is the default, but it is included here as an example.) All SQL*Loader options are preceded by the BL_OPTIONS= option, and the entire option list is enclosed in quotes. (Note that the option ERRORS= should be spelled with an 'S'; the documentation indicating ERROR= is incorrect.) Further information on the SQL*Loader options can be found in the Oracle documentation. Sample code is:

*----------------------*;
* BULKLOAD             *;
* with PROC APPEND     *;
*----------------------*;
libname cdwdir oracle
   user     = &user
   password = &pass
   path     = &path;

proc append base=cdwdir.lltemp (BULKLOAD=YES
                                BL_OPTIONS='ERRORS=50'
                                BL_DELETE_DATAFILE=YES)
            data=testin;
run;

The SAS log includes:

NOTE: Appending WORK.TESTIN to CDWDIR.LLTEMP.
NOTE: BASE data set does not exist. DATA file is being copied to BASE file.
NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
NOTE: There were 10000000 observations read from the data set WORK.TESTIN.
NOTE: The data set CDWDIR.LLTEMP has 10000000 observations and 20 variables.
***** Begin: SQL*Loader Log File *****
.
. (statements omitted)
.
***** End: SQL*Loader Log File *****
NOTE: ***************************************
Please look in the SQL*Loader log file for the load results.
SQL*Loader Log File location:
-- BL_LLTEMP_0.log --
Note: In a Client/Server environment, the Log File is located on the Server.
The log file is also echoed in the Server SAS log file.
***************************************
NOTE: PROCEDURE APPEND used:
      real time           15:20.35
      cpu time            2:02.73

VERIFICATION and DEBUGGING

If you need to debug a process, or if you are just interested in what is going on between SAS and the DBMS, you can use the SASTRACE option to see all of the statements that are passed to the DBMS. (They will, of course, vary with the DBMS used.) It is recommended that you use SASTRACE with a small test dataset so that you do not see every single internal COMMIT. You may also wish to direct the output to the log; otherwise you might get excessive screen output. Here is how to invoke the trace and direct it to the log file:

options sastrace=',,,d' sastraceloc=saslog;

BULKLOAD OPTIONS

The BULKLOAD statement allows several options to control the load and the output from the load. The examples above show the use of BL_DELETE_DATAFILE to delete the .dat file after loading, BL_LOG= to name and locate the .log file, and BL_OPTIONS to begin the list of SQL*Loader options. Note that the BULKLOAD options and the SQL*Loader options are different: the BULKLOAD options are documented in the SAS documentation, while the SQL*Loader options are documented in the Oracle documentation. The list of BULKLOAD options follows.

BL_BADFILE
   creates file BL_<tablename>.bad containing rejected rows
BL_CONTROL
   creates file BL_<tablename>.ctl containing DDL statements for the SQLLDR control file
BL_DATAFILE
   creates file BL_<tablename>.dat containing all of the data to be loaded
BL_DELETE_DATAFILE
   =YES deletes the .dat file created for loading
   =NO keeps the .dat file
BL_DIRECT_PATH
   =YES sets the Oracle SQL*Loader option DIRECT to TRUE, enabling SQL*Loader to use a direct path load to insert rows into a table. This is the default.
   =NO sets the Oracle SQL*Loader option DIRECT to FALSE, enabling SQL*Loader to use a conventional path load to insert rows into a table.
BL_DISCARDFILE
   creates file BL_<tablename>.dsc containing rows neither rejected nor loaded
BL_LOAD_METHOD
   INSERT when loading data into an empty table
   APPEND when loading data into a table containing data
   REPLACE and TRUNCATE override the default of APPEND
BL_LOG
   creates file BL_<tablename>.log containing the SQL*Loader log
BL_OPTIONS
   begins the list of SQL*Loader options; these are fully documented in the Oracle documentation
BL_SQLLDR_PATH
   path name of the SQLLDR executable; this should be set automatically

SUMMARY

Here is a summary table of the load times for all of the different methods provided. Please note that these programs were run on a very busy system, and the real times vary greatly depending on the system load. The CPU time is more consistent and so is a better unit of comparison.

                 TRANSACTIONAL   MULTI-ROW    BULKLOAD
DBLOAD
   Real Time     4:58:13.51
   CPU Time        47:47.14
DATA Step
   Real Time       46:14.96      14:30.71     15:26.99
   CPU Time        15:25.09       8:33.30      2:19.11
PROC SQL
   Real Time       49:40.27      11:41.02     15:21.37
   CPU Time        16:12.06       8:40.06      2:34.26
PROC APPEND
   Real Time       43:48.65      19:29.32     15:20.35
   CPU Time        15:45.76       8:19.73      2:02.73

CONCLUSION

Optimizing with multiple-row inserts and bulk loading is much faster than the transactional method of loading and far more efficient than the old DBLOAD process. BULKLOAD uses much less CPU time and is the fastest method for loading large databases.

REFERENCES

SAS® OnlineDoc, Version 8.

Levine, Fred (2001), "Using the SAS/ACCESS Libname Technology to Get Improvements in Performance and Optimization in SAS/SQL Queries", Proceedings of the Twenty-Sixth Annual SAS Users Group International Conference, 26.

CONTACT INFORMATION

The author may be contacted at Lois831@hotmail.com

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.