SlideShare ist ein Scribd-Unternehmen logo
1 von 18
Recipes of
Data Warehouse
and
Business Intelligence
Massimo Cenci Data Warehouse Blog
http://massimocenci.blogspot.it
A messaging
system for Oracle
Data Warehouse
(part 1)
Micro ETL Foundation
Introduction (1)
• A goal of what I have called the Micro ETL Foundation (MEF) is to provide some simple and immediate solutions to the need of a
Data Warehouse. What could be more simple (and necessary) of a log message? It may seem strange to devote an entire article to
describe how to report the message "Hello world"; message, famous for all the programmers of the computer world. But we find
that what appears simple is actually a bit more complicated.
• If we take the first paragraph of any programming book, (eg. java) the solution is expressed with a single line:
System.out.println ( Hello Word! ");
• If we use another programming language (eg. PL / SQL Oracle), the solution is again expressed in a single line:
Dbms_output.put_line ('Hello World');
• What I want to emphasize is the incompleteness of the received message; In fact, the complexity lies not in the message itself: at
the end it is a simple function call. The complexity is the lack of context. Complexity is the metadata.
• Suppose that at this time are running more loading processes. The fact of receiving a warning or alert message is not enough.
What I want to know is when the message was sent, which job was running, how much time has elapsed since the previous
message, which was the current procedure at the time of the message.
Introduction (2)
• We find out that answer to all these questions is no longer trivial. When we go out from theoretical examples printed on the books
and enter into the reality of the daily work, everything becomes more complicated.
• We will see the theory and practice, then the code, which will give us an answer to contextual needs described above. And the
solution will be (surprise!), again expressed in a single line.
• The programming language used is the Oracle PL / SQL (simply because it is widely used in the Data Warehouse world and then I
turn to a wider audience), but the techniques exposed are easily playable on any other RDBMS. I forgot: we also see how to
ensure that the message is sent via e-mail.
• This, however, is not just for programmers. It 'an article that shows all the hard work behind a simple log message. Just imagine the
work that is behind the complete loading process of a Data Warehouse.
Definitions
Naming Convention
I spent a lot of words on the importance and need to always use a naming convention for the Data Warehouse projects. I summarize
some concepts. In general, the naming convention is the method by which you choose to assign a name to the various entities of the
system being designed. This method should produce a name that must be able to represent immediately the semantic nature of the
entities that we will use. It does not matter if they belong to a Data Warehouse system or an operational system.
ETL Process
The ETL process is the set of programs that load data from external systems into the Data Warehouse tables . Since this set can be very
complex and detailed, it is extremely important to have a messaging system that gives me the most information possible about the
process.
Job
A job is a unit of logic describe a very specific task, such as loading a dimension of analysis or of a fact table or both. A set of job-related
between them is in turn a job that is part of a schedule. Ie is activated at a preset time.
In turn, a job consists of simple and sequential processing units that we call unit.
Unit
The unit is the elementary processing unit (so it is code) which in turn can call other procedures and functions that can be defined in a
generic way, modules
Execution
A job usually runs at night, but it could also turn several times a day. Each execution of a job is a run that has to be identified with a
sequential number, which can be defined as exec counter.
For the time being be satisfied with these definitions very short. Later we will analyze in more detail
Definitions
Requirements
• The requirements are very simple. To have a basic messaging system that signals everything that it is useful to monitor the
execution of the ETL process.
• It is a task of the designer to decide the content of these messages, which may simply be informative, as the number of rows
processed, or of particular attention, such as the identification of errors or anomalies. In these cases it should be possible to send
the message via e-mail.
• The system should not merely store all messages generated, but must also have the context information.
• The context is described in the paragraph relative to the definition of the ETL process. More context information we insert, and
more we can control the system and we will be more efficient and faster to resolve the problems.
• I suppose a minimum knowledge of Oracle and SQL and PL/SQL languages.
Design
• The design of the messaging component of the Micro ETL Foundation (MEF), consists of 3 tables, 2 sequences and 1 package.
• We can see them as the basic components. The minimum necessary for the other MEF components that we will use in the future.
• The naming convention used is a smaller version of the one to be applied to all objects of a Data Warehouse. [See
http://www.slideshare.net/jackbim/recipes-6-of-data-warehouse-naming-convention-techniques].
• But it would be unreasonable to apply to MEF the same logic that is used to organize the hundreds of tables typical of a large Data
Warehouse. Also we can use MEF in any type of project. We will use this simplified structure:
<entity name>=<area code>_ <type name>_<logical code>
• The basic package, being common to all, will be further simplified by eliminating the <logical name>.
• The area code is "MEF".
• The table MEF_CFT is the table of system configuration.
• The table MEF_EMAIL_CFT is the specific configuration table for e-mail addresses, the table MEF_MSG_LOT is one that will keep
the text messages.
• The sequence MEF_MSG_SEQ gives a sequence number to each message.
• The sequence MEF_RUN_SEQ is used to numerically identify each job execution.
• The Oracle package MEF is a library of programs that manages those objects.
• All scripts for creating objects, can be downloaded from slideshare that show in a pragmatic and fun way the system
[http://www.slideshare.net/jackbim/recipe-7-of-data-warehouse-a-messaging-system-for-oracle-dwh-2]. The scripts are very
minimal, with only the main structures, leaving the reader with the completion of all other accessory structures such as indexes,
constraints, etc..
• The tables will be created in the default tablespace, but you should always create ad-hoc tablespace.
The MEF_CFT table
• This table contains general information about the MEF and the Data Warehouse. Other columns will be added in the future (or on
your choice)
prj_cod: Code of the Data Warehouse project
user_cod: name of the Oracle user for ETL.
email_srv_cod: E-mail Server. Every corporate has always a server for managing e-mail, indicate here that server.
mef_root_txt: Path of the folder of MEF scripts.
mef_dir: Oracle directory pointing to mef_root_txt
The MEF_MSG_LOT table
• This table stores all log messages that are sent by loading process.
seq_num: Sequential number of the message. It is obtained from an Oracle sequence.
day_cod: Time stamp of the message insert in the format YYYYMMDD.
sched_cod: The identifier for the schedule to which the job belongs.
job_cod: job identifier. It is a logical entity in the sense that we think of it as the launch of a list of processing units.
unit_cod: Identifier of the processing unit within the job. We can think of it as a procedure or a function of the Oracle package.
module_cod: module Identifier. A unit, though complex, can in turn call the sub-routine or subfunctions, ie modules.
In this case, it is interesting to know yhid detail.
rows_num: Number of rows processed. Typically, this field is not set, but if we want to report the number of rows, for example,
inserted into a table, we also have this information.
line_txt: Message text
cline_txt: Message text in CLOB type.
ss_num: Number of seconds that have elapsed since the previous message. This information, together with the next two,
provides a summable data. If we wanted to know how long it took all the statements of a certain kind, we would be able to
calculate.
mi_num: Number of minutes that have elapsed since the previous message
hh_num: Number of hours that occurred since the previous message
elapsed_txt: Time elapsed since the previous message in the format HH24:MI:SS
stamp_dts: Time stamp insertion of the message.
exec_cnt: The identifier for the execution of the job. Every run of a job should be characterized by a number,
in turn, extracted from an Oracle sequence.
user_cod: Oracle user who posted the message. It is setted automatically by a session variable. It can be useful in cases
where multiple users contribute to the loading process(not recommended)
The MEF_EMAIL_CFT table
• This table configures the email addresses.
email_cod: Code to identify a group of recipients.
from_txt: Sender of the message. This name will appear as the sender of the e-mail message. Do not use special characters, nor
the blank between words. Eg. not set "Administrator ETL" but "Amministratore_ETL" otherwise you get a run-time error message
like:
ORA-29279: SMTP permanent error: 501
5.5.4 Invalid arguments
to_txt: Identifier of the message recipient
cc_txt: This e-mail address of the recipient in knowledge
subj_txt: Subject default message
status_cod: Status (1 = active, 0 = inactive) of the recipient
The MEF_MSG_SEQ sequence
• The sequence is an Oracle object. In practice it is a universal counter that increments each time you request it. Each message line
must have its sequential number. It is more functional than time stamp to sort the table.
• Because sometimes the messages are separated at a fraction of a second of each other, the time stamp might not be sufficiently
discriminating.
The MEF_RUN_SEQ sequence
• Sequence that indicates unambiguously, a run of a job.
The MEF package (1)
• This is the basic package. I suggest to develop all the code inside PL/SQL packages (they are basically libraries), which allow a better
management and use of the code. Now a short description of the units contained in the package.
• f_str: Utility function to generate a string after replacing the input variables.
• p_ins_msg_lot: This procedure perform the insertion of the message in the MEF_MSG_LOT table receiving a variable of row type
as input parameter. This procedure has the "pragma autonomous_transaction". It is very important and requires a thorough
description. It seems incredible, but if there were not, it would not be possible this messaging system. The pragma
autonomous_transaction allows us to commit (i.e. to validate into the database) only and exclusively the DML statements of the
unit which contains this compiler directive. This concept is crucial because it allows us to commit (so to insert data into the
database) without affecting the logic of the loading process. Let's clarify with an example. Typically, before to loading the daily data
into a table, you delete, first of all, the data of that day, and then load/reload the new data. (forget for a moment the partition
manipulation) You do the commit at the end, where the loading, (delete/insert), was successfully completed. If i ran a commit after
the delete and the insert has a problem, I could have the loss of data of the day. As the messaging needs to do a commit of the
message in the table, the execution of the innocent message "I have done the delete", would validate even the delete itself. And
this is a side effect not acceptable. The autonomous transaction solve the problem: the PL/SQL Oracle engine, produce a
"daughter" transaction who live an autonomous life, which validates the data in the MEF_MSG_LOT table without affecting the
logic of the parent transaction.
• p_rae: The management of the exception is standardized in the following manner. When any Oracle error happen, in the "when
others" instruction there is the call of the p_rae procedure that enriches the content of the error with other useful information,
such as, for example, where the exception occurred. The output will always be the standard error pv_error. The "when pv_error" is
used to make sure that you keep the original error.
• p_init: Other private procedure. It initializes the line_row variable to perform the insert in the table.
• delta_time: Procedure that, based on the input parameters, such as the date of the last message and the current date , calculates
all the delta-time information, how many seconds, minutes, hours have passed and a delta time in the 'HH24:MI:SS' format. The
ways in which we can calculate the delta time are numerous: one used is just one of many.
• f_get_seq_val: Function that extracts generalized, in a dynamic way, the next number of the Oracle sequence whose name is given
in the parameter.
• f_get_exec_cnt: Function that extracts the execution number of the job. Before calling the generic function f_get_seq_val, it verify
that has not already been set as a global variable: in this case, use the number of the current execution.
The MEF package (2)
• f_get_cft: Function that extracts the current configuration from the MEF_CFT table .
• p_esend: A function that performs the sending of email via the UTL_MAIL package.
• p_send: This is the procedure that sends the message.
• p_mail: Procedure for sending the e-mail. Using the email code in inputit looks for all the recipients in the MEF_EMAIL_CFT table
and calls the p_esend passing all required parameters.
System configuration (1)
• Before you can create all the structures described above, you must perform some environment check.
• First of all it is necessary to verify that the Oracle RDBMS is designed for sending emails.
• So you have to make sure that the Oracle user that sends the messages has all grant necessary for its operations. Let's see in detail.
SMTP check
• We need to verify that the Oracle RDBMS is designed for sending emails. In fact the sending is activated by calling a procedure that
is part of a package of the RDBMS. To verify this, connect to SQL * Plus with SYS user and check the package UTL_MAIL (from
Oracle 11):
Sqlplus / as sysdba
SQL> descr utl_mail
ERROR:
ORA-04043: object utl_mail does not exist
• If you get this error message, you need to install the package system UTL_MAIL and give the execute permission to the user.
• Always as user SYS, run the script to install the package from the rdbms/admin folder of the Oracle home, and give the execution
permissions to the ETL user.
• So you have to check the smtp server in the Oracle initialization parameter file; in the following example it is not setted and I
indicate you how to do it. Here is the sequence of instructions for making these checks (remember to replace the file path of the
utlmail.sql file with the relative path to your Oracle installation):
System configuration (2)
D:>sqlplus / as sysdba
SQL*Plus: Release 11.2.0.1.0 Production on Wed Jun 25 15:35:49 2014
Copyright (c) 1982, 2010, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning option
SQL> @...RDBMSADMINutlmail.sql
SQL> @...RDBMSADMINprvtmail.plb
Package created.
Synonym created.
SQL> grant execute on UTL_MAIL to <ETL user>;
Grant succeeded.
SQL> sho parameters smtp
NAME TYPE VALUE
------------------------------------ ----------- -------
smtp_out_server string
SQL> alter system set smtp_out_server = <email server> scope=both;
System altered.
System configuration (3)
• Almost always, in the company there is a mail server, the value should be include the domain, for example. exch.dev.com.
• The option "scope = both" make the change effective immediately and permanently.
• We must also create and configure an ACL (from Oracle11) using the system package dbms_network_acl_admin. The ACL (Access
Control List) is only one way to define external resources to the RDBMS (such as email servers) and allow access to users.
• Now open slideshare [http://www.slideshare.net/jackbim/recipe-7-of-data-warehouse-a-messaging-system-for-oracle-dwh-2]
where you will find all the instructions on how to download and run the installation script of this messaging system. The script will
execute all the necessary settings for you.
• All this will take no longer than 5 minutes of work.
Test
• In addition to the tests showned in the slides of slideshare, we perform another test now, certainly more exhaustive, which clearly
shows the functionality simulating a piece of ETL load.
• We start by creating a test table, initializing it from a system table. At this point we run an anonymous block of code (anonymous
block is defined as the set of SQL statements included between a begin and an end) that will simulate a real load of a table with
insert and delete of data. The sequence of steps is quite simple: you initialize global variables package to better identify the various
steps in the messaging table.
create table SALES as select * from tabs;
begin
mef.pv_sched_cod := 'Daily';
mef.pv_job_cod := 'Staging tables';
mef.pv_unit_cod := 'Load sales table';
mef.pv_exec_cnt := 10;
mef.p_send('proc_prova','Load of SALES table');
mef.p_send('proc_prova','Deleting...');
delete from sales;
mef.p_send('proc_prova','Deleted');
mef.p_send('proc_prova','Loading...');
insert into sales select * from tabs;
mef.p_send('proc_prova','Loaded');
mef.p_mail('MEF','ETL_administrator','Sales table loaded');
mef.p_send('proc_prova','Load ended');
end;
/
• The final result obtained, which can be seen in the table MEF_MSG_LOT, it is very interesting, and gives you the wealth of
contextual information that is talked about in the beginning.
Conclusion
• We saw in detail the steps required to build a messaging system, simple, but very useful for all the people working in the Data
Warehouse projects.
• This implementation, which is the basis of my Micro-ETL-Foundation, obviously works on any Oracle-PL/SQL project, is non-
invasive, and can be applied at any time inserting simple procedure calls in an existing ETL process.

Weitere ähnliche Inhalte

Was ist angesagt?

Data Warehouse and Business Intelligence - Recipe 3
Data Warehouse and Business Intelligence - Recipe 3Data Warehouse and Business Intelligence - Recipe 3
Data Warehouse and Business Intelligence - Recipe 3Massimo Cenci
 
Recipe 14 - Build a Staging Area for an Oracle Data Warehouse (2)
Recipe 14 - Build a Staging Area for an Oracle Data Warehouse (2)Recipe 14 - Build a Staging Area for an Oracle Data Warehouse (2)
Recipe 14 - Build a Staging Area for an Oracle Data Warehouse (2)Massimo Cenci
 
Recipe 16 of Data Warehouse and Business Intelligence - The monitoring of the...
Recipe 16 of Data Warehouse and Business Intelligence - The monitoring of the...Recipe 16 of Data Warehouse and Business Intelligence - The monitoring of the...
Recipe 16 of Data Warehouse and Business Intelligence - The monitoring of the...Massimo Cenci
 
Recipes 9 of Data Warehouse and Business Intelligence - Techniques to control...
Recipes 9 of Data Warehouse and Business Intelligence - Techniques to control...Recipes 9 of Data Warehouse and Business Intelligence - Techniques to control...
Recipes 9 of Data Warehouse and Business Intelligence - Techniques to control...Massimo Cenci
 
New T-SQL Features in SQL Server 2012
New T-SQL Features in SQL Server 2012 New T-SQL Features in SQL Server 2012
New T-SQL Features in SQL Server 2012 Richie Rump
 
MySQL Replication Evolution -- Confoo Montreal 2017
MySQL Replication Evolution -- Confoo Montreal 2017MySQL Replication Evolution -- Confoo Montreal 2017
MySQL Replication Evolution -- Confoo Montreal 2017Dave Stokes
 
Getting Started with MySQL I
Getting Started with MySQL IGetting Started with MySQL I
Getting Started with MySQL ISankhya_Analytics
 
Advanced MySQL Query Optimizations
Advanced MySQL Query OptimizationsAdvanced MySQL Query Optimizations
Advanced MySQL Query OptimizationsDave Stokes
 
Multiple files single target single interface
Multiple files single target single interfaceMultiple files single target single interface
Multiple files single target single interfaceDharmaraj Borse
 
Oracle Database 12.1.0.2 New Features
Oracle Database 12.1.0.2 New FeaturesOracle Database 12.1.0.2 New Features
Oracle Database 12.1.0.2 New FeaturesAlex Zaballa
 
Oracle Data Redaction
Oracle Data RedactionOracle Data Redaction
Oracle Data RedactionAlex Zaballa
 
Mysql Explain Explained
Mysql Explain ExplainedMysql Explain Explained
Mysql Explain ExplainedJeremy Coates
 
Oracle Database 12c - Data Redaction
Oracle Database 12c - Data RedactionOracle Database 12c - Data Redaction
Oracle Database 12c - Data RedactionAlex Zaballa
 
Working with the IFS on System i
Working with the IFS on System iWorking with the IFS on System i
Working with the IFS on System iChuck Walker
 

Was ist angesagt? (20)

Data Warehouse and Business Intelligence - Recipe 3
Data Warehouse and Business Intelligence - Recipe 3Data Warehouse and Business Intelligence - Recipe 3
Data Warehouse and Business Intelligence - Recipe 3
 
Recipe 14 - Build a Staging Area for an Oracle Data Warehouse (2)
Recipe 14 - Build a Staging Area for an Oracle Data Warehouse (2)Recipe 14 - Build a Staging Area for an Oracle Data Warehouse (2)
Recipe 14 - Build a Staging Area for an Oracle Data Warehouse (2)
 
Recipe 16 of Data Warehouse and Business Intelligence - The monitoring of the...
Recipe 16 of Data Warehouse and Business Intelligence - The monitoring of the...Recipe 16 of Data Warehouse and Business Intelligence - The monitoring of the...
Recipe 16 of Data Warehouse and Business Intelligence - The monitoring of the...
 
Recipes 9 of Data Warehouse and Business Intelligence - Techniques to control...
Recipes 9 of Data Warehouse and Business Intelligence - Techniques to control...Recipes 9 of Data Warehouse and Business Intelligence - Techniques to control...
Recipes 9 of Data Warehouse and Business Intelligence - Techniques to control...
 
New T-SQL Features in SQL Server 2012
New T-SQL Features in SQL Server 2012 New T-SQL Features in SQL Server 2012
New T-SQL Features in SQL Server 2012
 
DataBase Management System Lab File
DataBase Management System Lab FileDataBase Management System Lab File
DataBase Management System Lab File
 
T-SQL Overview
T-SQL OverviewT-SQL Overview
T-SQL Overview
 
Dbms lab Manual
Dbms lab ManualDbms lab Manual
Dbms lab Manual
 
MySQL Replication Evolution -- Confoo Montreal 2017
MySQL Replication Evolution -- Confoo Montreal 2017MySQL Replication Evolution -- Confoo Montreal 2017
MySQL Replication Evolution -- Confoo Montreal 2017
 
Getting Started with MySQL I
Getting Started with MySQL IGetting Started with MySQL I
Getting Started with MySQL I
 
Less08 Schema
Less08 SchemaLess08 Schema
Less08 Schema
 
Advanced MySQL Query Optimizations
Advanced MySQL Query OptimizationsAdvanced MySQL Query Optimizations
Advanced MySQL Query Optimizations
 
Multiple files single target single interface
Multiple files single target single interfaceMultiple files single target single interface
Multiple files single target single interface
 
Oracle Database 12.1.0.2 New Features
Oracle Database 12.1.0.2 New FeaturesOracle Database 12.1.0.2 New Features
Oracle Database 12.1.0.2 New Features
 
Oracle Data Redaction
Oracle Data RedactionOracle Data Redaction
Oracle Data Redaction
 
Mysql Explain Explained
Mysql Explain ExplainedMysql Explain Explained
Mysql Explain Explained
 
Oracle Database 12c - Data Redaction
Oracle Database 12c - Data RedactionOracle Database 12c - Data Redaction
Oracle Database 12c - Data Redaction
 
Sql loader good example
Sql loader good exampleSql loader good example
Sql loader good example
 
Oracle sql loader utility
Oracle sql loader utilityOracle sql loader utility
Oracle sql loader utility
 
Working with the IFS on System i
Working with the IFS on System iWorking with the IFS on System i
Working with the IFS on System i
 

Ähnlich wie ata Warehouse and Business Intelligence - Recipe 7 - A messaging system for Oracle Data Warehouse (part 1)

Ähnlich wie ata Warehouse and Business Intelligence - Recipe 7 - A messaging system for Oracle Data Warehouse (part 1) (20)

stigbot_beta
stigbot_betastigbot_beta
stigbot_beta
 
Mca2050 computer architecture
Mca2050  computer architectureMca2050  computer architecture
Mca2050 computer architecture
 
Lab6 rtos
Lab6 rtosLab6 rtos
Lab6 rtos
 
Concurrency and parallel in .net
Concurrency and parallel in .netConcurrency and parallel in .net
Concurrency and parallel in .net
 
Bsc it winter 2013 2nd sem
Bsc it  winter 2013 2nd semBsc it  winter 2013 2nd sem
Bsc it winter 2013 2nd sem
 
++Matlab 14 sesiones
++Matlab 14 sesiones++Matlab 14 sesiones
++Matlab 14 sesiones
 
Nt1310 Unit 3 Language Analysis
Nt1310 Unit 3 Language AnalysisNt1310 Unit 3 Language Analysis
Nt1310 Unit 3 Language Analysis
 
The best ETL questions in a nut shell
The best ETL questions in a nut shellThe best ETL questions in a nut shell
The best ETL questions in a nut shell
 
Buffer overflow attacks
Buffer overflow attacksBuffer overflow attacks
Buffer overflow attacks
 
BigDataDebugging
BigDataDebuggingBigDataDebugging
BigDataDebugging
 
ELT Publishing Tool Overview V3_Jeff
ELT Publishing Tool Overview V3_JeffELT Publishing Tool Overview V3_Jeff
ELT Publishing Tool Overview V3_Jeff
 
Ms sql server architecture
Ms sql server architectureMs sql server architecture
Ms sql server architecture
 
Multithreading by rj
Multithreading by rjMultithreading by rj
Multithreading by rj
 
XML Tutor maXbox starter27
XML Tutor maXbox starter27XML Tutor maXbox starter27
XML Tutor maXbox starter27
 
Introduction to sql server
Introduction to sql serverIntroduction to sql server
Introduction to sql server
 
5010
50105010
5010
 
Linux Assignment 3
Linux Assignment 3Linux Assignment 3
Linux Assignment 3
 
Chapter 3 chapter reading task
Chapter 3 chapter reading taskChapter 3 chapter reading task
Chapter 3 chapter reading task
 
Wireless Communication Network Communication
Wireless Communication Network CommunicationWireless Communication Network Communication
Wireless Communication Network Communication
 
DLL Tutor maXbox starter28
DLL Tutor maXbox starter28DLL Tutor maXbox starter28
DLL Tutor maXbox starter28
 

Mehr von Massimo Cenci

Il controllo temporale dei data file in staging area
Il controllo temporale dei data file in staging areaIl controllo temporale dei data file in staging area
Il controllo temporale dei data file in staging areaMassimo Cenci
 
Tecniche di progettazione della staging area in un processo etl
Tecniche di progettazione della staging area in un processo etlTecniche di progettazione della staging area in un processo etl
Tecniche di progettazione della staging area in un processo etlMassimo Cenci
 
Note di Data Warehouse e Business Intelligence - Il giorno di riferimento dei...
Note di Data Warehouse e Business Intelligence - Il giorno di riferimento dei...Note di Data Warehouse e Business Intelligence - Il giorno di riferimento dei...
Note di Data Warehouse e Business Intelligence - Il giorno di riferimento dei...Massimo Cenci
 
Recipe 12 of Data Warehouse and Business Intelligence - How to identify and c...
Recipe 12 of Data Warehouse and Business Intelligence - How to identify and c...Recipe 12 of Data Warehouse and Business Intelligence - How to identify and c...
Recipe 12 of Data Warehouse and Business Intelligence - How to identify and c...Massimo Cenci
 
Note di Data Warehouse e Business Intelligence - Pensare "Agile"
Note di Data Warehouse e Business Intelligence - Pensare "Agile"Note di Data Warehouse e Business Intelligence - Pensare "Agile"
Note di Data Warehouse e Business Intelligence - Pensare "Agile"Massimo Cenci
 
Note di Data Warehouse e Business Intelligence - La gestione delle descrizioni
Note di Data Warehouse e Business Intelligence - La gestione delle descrizioniNote di Data Warehouse e Business Intelligence - La gestione delle descrizioni
Note di Data Warehouse e Business Intelligence - La gestione delle descrizioniMassimo Cenci
 
Recipes 10 of Data Warehouse and Business Intelligence - The descriptions man...
Recipes 10 of Data Warehouse and Business Intelligence - The descriptions man...Recipes 10 of Data Warehouse and Business Intelligence - The descriptions man...
Recipes 10 of Data Warehouse and Business Intelligence - The descriptions man...Massimo Cenci
 
Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...
Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...
Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...Massimo Cenci
 
Data Warehouse - What you know about etl process is wrong
Data Warehouse - What you know about etl process is wrongData Warehouse - What you know about etl process is wrong
Data Warehouse - What you know about etl process is wrongMassimo Cenci
 
Letter to a programmer
Letter to a programmerLetter to a programmer
Letter to a programmerMassimo Cenci
 
Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...
Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...
Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...Massimo Cenci
 
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...Massimo Cenci
 
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...Massimo Cenci
 
Oracle All-in-One - how to send mail with attach using oracle pl/sql
Oracle All-in-One - how to send mail with attach using oracle pl/sqlOracle All-in-One - how to send mail with attach using oracle pl/sql
Oracle All-in-One - how to send mail with attach using oracle pl/sqlMassimo Cenci
 
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...Massimo Cenci
 
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisiNote di Data Warehouse e Business Intelligence - Le Dimensioni di analisi
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisiMassimo Cenci
 

Mehr von Massimo Cenci (16)

Il controllo temporale dei data file in staging area
Il controllo temporale dei data file in staging areaIl controllo temporale dei data file in staging area
Il controllo temporale dei data file in staging area
 
Tecniche di progettazione della staging area in un processo etl
Tecniche di progettazione della staging area in un processo etlTecniche di progettazione della staging area in un processo etl
Tecniche di progettazione della staging area in un processo etl
 
Note di Data Warehouse e Business Intelligence - Il giorno di riferimento dei...
Note di Data Warehouse e Business Intelligence - Il giorno di riferimento dei...Note di Data Warehouse e Business Intelligence - Il giorno di riferimento dei...
Note di Data Warehouse e Business Intelligence - Il giorno di riferimento dei...
 
Recipe 12 of Data Warehouse and Business Intelligence - How to identify and c...
Recipe 12 of Data Warehouse and Business Intelligence - How to identify and c...Recipe 12 of Data Warehouse and Business Intelligence - How to identify and c...
Recipe 12 of Data Warehouse and Business Intelligence - How to identify and c...
 
Note di Data Warehouse e Business Intelligence - Pensare "Agile"
Note di Data Warehouse e Business Intelligence - Pensare "Agile"Note di Data Warehouse e Business Intelligence - Pensare "Agile"
Note di Data Warehouse e Business Intelligence - Pensare "Agile"
 
Note di Data Warehouse e Business Intelligence - La gestione delle descrizioni
Note di Data Warehouse e Business Intelligence - La gestione delle descrizioniNote di Data Warehouse e Business Intelligence - La gestione delle descrizioni
Note di Data Warehouse e Business Intelligence - La gestione delle descrizioni
 
Recipes 10 of Data Warehouse and Business Intelligence - The descriptions man...
Recipes 10 of Data Warehouse and Business Intelligence - The descriptions man...Recipes 10 of Data Warehouse and Business Intelligence - The descriptions man...
Recipes 10 of Data Warehouse and Business Intelligence - The descriptions man...
 
Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...
Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...
Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...
 
Data Warehouse - What you know about etl process is wrong
Data Warehouse - What you know about etl process is wrongData Warehouse - What you know about etl process is wrong
Data Warehouse - What you know about etl process is wrong
 
Letter to a programmer
Letter to a programmerLetter to a programmer
Letter to a programmer
 
Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...
Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...
Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...
 
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
 
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
 
Oracle All-in-One - how to send mail with attach using oracle pl/sql
Oracle All-in-One - how to send mail with attach using oracle pl/sqlOracle All-in-One - how to send mail with attach using oracle pl/sql
Oracle All-in-One - how to send mail with attach using oracle pl/sql
 
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...
 
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisiNote di Data Warehouse e Business Intelligence - Le Dimensioni di analisi
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi
 

Kürzlich hochgeladen

Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...itnewsafrica
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 

Kürzlich hochgeladen (20)

Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 

ata Warehouse and Business Intelligence - Recipe 7 - A messaging system for Oracle Data Warehouse (part 1)

  • 1. Recipes of Data Warehouse and Business Intelligence Massimo Cenci Data Warehouse Blog http://massimocenci.blogspot.it A messaging system for Oracle Data Warehouse (part 1) Micro ETL Foundation
  • 2. Introduction (1) • A goal of what I have called the Micro ETL Foundation (MEF) is to provide some simple and immediate solutions to the need of a Data Warehouse. What could be more simple (and necessary) of a log message? It may seem strange to devote an entire article to describe how to report the message "Hello world"; message, famous for all the programmers of the computer world. But we find that what appears simple is actually a bit more complicated. • If we take the first paragraph of any programming book, (eg. java) the solution is expressed with a single line: System.out.println ( Hello Word! "); • If we use another programming language (eg. PL / SQL Oracle), the solution is again expressed in a single line: Dbms_output.put_line ('Hello World'); • What I want to emphasize is the incompleteness of the received message; In fact, the complexity lies not in the message itself: at the end it is a simple function call. The complexity is the lack of context. Complexity is the metadata. • Suppose that at this time are running more loading processes. The fact of receiving a warning or alert message is not enough. What I want to know is when the message was sent, which job was running, how much time has elapsed since the previous message, which was the current procedure at the time of the message.
  • 3. Introduction (2) • We find out that answer to all these questions is no longer trivial. When we go out from theoretical examples printed on the books and enter into the reality of the daily work, everything becomes more complicated. • We will see the theory and practice, then the code, which will give us an answer to contextual needs described above. And the solution will be (surprise!), again expressed in a single line. • The programming language used is the Oracle PL / SQL (simply because it is widely used in the Data Warehouse world and then I turn to a wider audience), but the techniques exposed are easily playable on any other RDBMS. I forgot: we also see how to ensure that the message is sent via e-mail. • This, however, is not just for programmers. It 'an article that shows all the hard work behind a simple log message. Just imagine the work that is behind the complete loading process of a Data Warehouse.
  • 4. Definitions Naming Convention I spent a lot of words on the importance and need to always use a naming convention for the Data Warehouse projects. I summarize some concepts. In general, the naming convention is the method by which you choose to assign a name to the various entities of the system being designed. This method should produce a name that must be able to represent immediately the semantic nature of the entities that we will use. It does not matter if they belong to a Data Warehouse system or an operational system. ETL Process The ETL process is the set of programs that load data from external systems into the Data Warehouse tables . Since this set can be very complex and detailed, it is extremely important to have a messaging system that gives me the most information possible about the process. Job A job is a unit of logic describe a very specific task, such as loading a dimension of analysis or of a fact table or both. A set of job-related between them is in turn a job that is part of a schedule. Ie is activated at a preset time. In turn, a job consists of simple and sequential processing units that we call unit. Unit The unit is the elementary processing unit (so it is code) which in turn can call other procedures and functions that can be defined in a generic way, modules Execution A job usually runs at night, but it could also turn several times a day. Each execution of a job is a run that has to be identified with a sequential number, which can be defined as exec counter. For the time being be satisfied with these definitions very short. Later we will analyze in more detail
  • 6. Requirements • The requirements are very simple. To have a basic messaging system that signals everything that it is useful to monitor the execution of the ETL process. • It is a task of the designer to decide the content of these messages, which may simply be informative, as the number of rows processed, or of particular attention, such as the identification of errors or anomalies. In these cases it should be possible to send the message via e-mail. • The system should not merely store all messages generated, but must also have the context information. • The context is described in the paragraph relative to the definition of the ETL process. More context information we insert, and more we can control the system and we will be more efficient and faster to resolve the problems. • I suppose a minimum knowledge of Oracle and SQL and PL/SQL languages.
  • 7. Design • The design of the messaging component of the Micro ETL Foundation (MEF), consists of 3 tables, 2 sequences and 1 package. • We can see them as the basic components. The minimum necessary for the other MEF components that we will use in the future. • The naming convention used is a smaller version of the one to be applied to all objects of a Data Warehouse. [See http://www.slideshare.net/jackbim/recipes-6-of-data-warehouse-naming-convention-techniques]. • But it would be unreasonable to apply to MEF the same logic that is used to organize the hundreds of tables typical of a large Data Warehouse. Also we can use MEF in any type of project. We will use this simplified structure: <entity name>=<area code>_ <type name>_<logical code> • The basic package, being common to all, will be further simplified by eliminating the <logical name>. • The area code is "MEF". • The table MEF_CFT is the table of system configuration. • The table MEF_EMAIL_CFT is the specific configuration table for e-mail addresses, the table MEF_MSG_LOT is one that will keep the text messages. • The sequence MEF_MSG_SEQ gives a sequence number to each message. • The sequence MEF_RUN_SEQ is used to numerically identify each job execution. • The Oracle package MEF is a library of programs that manages those objects. • All scripts for creating objects, can be downloaded from slideshare that show in a pragmatic and fun way the system [http://www.slideshare.net/jackbim/recipe-7-of-data-warehouse-a-messaging-system-for-oracle-dwh-2]. The scripts are very minimal, with only the main structures, leaving the reader with the completion of all other accessory structures such as indexes, constraints, etc.. • The tables will be created in the default tablespace, but you should always create ad-hoc tablespace.
  • 8. The MEF_CFT table • This table contains general information about the MEF and the Data Warehouse. Other columns will be added in the future (or on your choice) prj_cod: Code of the Data Warehouse project user_cod: name of the Oracle user for ETL. email_srv_cod: E-mail Server. Every corporate has always a server for managing e-mail, indicate here that server. mef_root_txt: Path of the folder of MEF scripts. mef_dir: Oracle directory pointing to mef_root_txt
  • 9. The MEF_MSG_LOT table • This table stores all log messages that are sent by loading process. seq_num: Sequential number of the message. It is obtained from an Oracle sequence. day_cod: Time stamp of the message insert in the format YYYYMMDD. sched_cod: The identifier for the schedule to which the job belongs. job_cod: job identifier. It is a logical entity in the sense that we think of it as the launch of a list of processing units. unit_cod: Identifier of the processing unit within the job. We can think of it as a procedure or a function of the Oracle package. module_cod: module Identifier. A unit, though complex, can in turn call the sub-routine or subfunctions, ie modules. In this case, it is interesting to know yhid detail. rows_num: Number of rows processed. Typically, this field is not set, but if we want to report the number of rows, for example, inserted into a table, we also have this information. line_txt: Message text cline_txt: Message text in CLOB type. ss_num: Number of seconds that have elapsed since the previous message. This information, together with the next two, provides a summable data. If we wanted to know how long it took all the statements of a certain kind, we would be able to calculate. mi_num: Number of minutes that have elapsed since the previous message hh_num: Number of hours that occurred since the previous message elapsed_txt: Time elapsed since the previous message in the format HH24:MI:SS stamp_dts: Time stamp insertion of the message. exec_cnt: The identifier for the execution of the job. Every run of a job should be characterized by a number, in turn, extracted from an Oracle sequence. user_cod: Oracle user who posted the message. It is setted automatically by a session variable. It can be useful in cases where multiple users contribute to the loading process(not recommended)
  • 10. The MEF_EMAIL_CFT table • This table configures the email addresses. email_cod: Code to identify a group of recipients. from_txt: Sender of the message. This name will appear as the sender of the e-mail message. Do not use special characters, nor the blank between words. Eg. not set "Administrator ETL" but "Amministratore_ETL" otherwise you get a run-time error message like: ORA-29279: SMTP permanent error: 501 5.5.4 Invalid arguments to_txt: Identifier of the message recipient cc_txt: This e-mail address of the recipient in knowledge subj_txt: Subject default message status_cod: Status (1 = active, 0 = inactive) of the recipient
  • 11. The MEF_MSG_SEQ sequence • The sequence is an Oracle object. In practice it is a universal counter that increments each time you request it. Each message line must have its sequential number. It is more functional than time stamp to sort the table. • Because sometimes the messages are separated at a fraction of a second of each other, the time stamp might not be sufficiently discriminating. The MEF_RUN_SEQ sequence • Sequence that indicates unambiguously, a run of a job.
  • 12. The MEF package (1) • This is the basic package. I suggest to develop all the code inside PL/SQL packages (they are basically libraries), which allow a better management and use of the code. Now a short description of the units contained in the package. • f_str: Utility function to generate a string after replacing the input variables. • p_ins_msg_lot: This procedure perform the insertion of the message in the MEF_MSG_LOT table receiving a variable of row type as input parameter. This procedure has the "pragma autonomous_transaction". It is very important and requires a thorough description. It seems incredible, but if there were not, it would not be possible this messaging system. The pragma autonomous_transaction allows us to commit (i.e. to validate into the database) only and exclusively the DML statements of the unit which contains this compiler directive. This concept is crucial because it allows us to commit (so to insert data into the database) without affecting the logic of the loading process. Let's clarify with an example. Typically, before to loading the daily data into a table, you delete, first of all, the data of that day, and then load/reload the new data. (forget for a moment the partition manipulation) You do the commit at the end, where the loading, (delete/insert), was successfully completed. If i ran a commit after the delete and the insert has a problem, I could have the loss of data of the day. As the messaging needs to do a commit of the message in the table, the execution of the innocent message "I have done the delete", would validate even the delete itself. And this is a side effect not acceptable. The autonomous transaction solve the problem: the PL/SQL Oracle engine, produce a "daughter" transaction who live an autonomous life, which validates the data in the MEF_MSG_LOT table without affecting the logic of the parent transaction. • p_rae: The management of the exception is standardized in the following manner. When any Oracle error happen, in the "when others" instruction there is the call of the p_rae procedure that enriches the content of the error with other useful information, such as, for example, where the exception occurred. The output will always be the standard error pv_error. The "when pv_error" is used to make sure that you keep the original error. • p_init: Other private procedure. It initializes the line_row variable to perform the insert in the table. • delta_time: Procedure that, based on the input parameters, such as the date of the last message and the current date , calculates all the delta-time information, how many seconds, minutes, hours have passed and a delta time in the 'HH24:MI:SS' format. The ways in which we can calculate the delta time are numerous: one used is just one of many. • f_get_seq_val: Function that extracts generalized, in a dynamic way, the next number of the Oracle sequence whose name is given in the parameter. • f_get_exec_cnt: Function that extracts the execution number of the job. Before calling the generic function f_get_seq_val, it verify that has not already been set as a global variable: in this case, use the number of the current execution.
  • 13. The MEF package (2) • f_get_cft: Function that extracts the current configuration from the MEF_CFT table . • p_esend: A function that performs the sending of email via the UTL_MAIL package. • p_send: This is the procedure that sends the message. • p_mail: Procedure for sending the e-mail. Using the email code in inputit looks for all the recipients in the MEF_EMAIL_CFT table and calls the p_esend passing all required parameters.
  • 14. System configuration (1) • Before you can create all the structures described above, you must perform some environment check. • First of all it is necessary to verify that the Oracle RDBMS is designed for sending emails. • So you have to make sure that the Oracle user that sends the messages has all grant necessary for its operations. Let's see in detail. SMTP check • We need to verify that the Oracle RDBMS is designed for sending emails. In fact the sending is activated by calling a procedure that is part of a package of the RDBMS. To verify this, connect to SQL * Plus with SYS user and check the package UTL_MAIL (from Oracle 11): Sqlplus / as sysdba SQL> descr utl_mail ERROR: ORA-04043: object utl_mail does not exist • If you get this error message, you need to install the package system UTL_MAIL and give the execute permission to the user. • Always as user SYS, run the script to install the package from the rdbms/admin folder of the Oracle home, and give the execution permissions to the ETL user. • So you have to check the smtp server in the Oracle initialization parameter file; in the following example it is not setted and I indicate you how to do it. Here is the sequence of instructions for making these checks (remember to replace the file path of the utlmail.sql file with the relative path to your Oracle installation):
  • 15. System configuration (2) D:>sqlplus / as sysdba SQL*Plus: Release 11.2.0.1.0 Production on Wed Jun 25 15:35:49 2014 Copyright (c) 1982, 2010, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production With the Partitioning option SQL> @...RDBMSADMINutlmail.sql SQL> @...RDBMSADMINprvtmail.plb Package created. Synonym created. SQL> grant execute on UTL_MAIL to <ETL user>; Grant succeeded. SQL> sho parameters smtp NAME TYPE VALUE ------------------------------------ ----------- ------- smtp_out_server string SQL> alter system set smtp_out_server = <email server> scope=both; System altered.
  • 16. System configuration (3) • Almost always, in the company there is a mail server, the value should be include the domain, for example. exch.dev.com. • The option "scope = both" make the change effective immediately and permanently. • We must also create and configure an ACL (from Oracle11) using the system package dbms_network_acl_admin. The ACL (Access Control List) is only one way to define external resources to the RDBMS (such as email servers) and allow access to users. • Now open slideshare [http://www.slideshare.net/jackbim/recipe-7-of-data-warehouse-a-messaging-system-for-oracle-dwh-2] where you will find all the instructions on how to download and run the installation script of this messaging system. The script will execute all the necessary settings for you. • All this will take no longer than 5 minutes of work.
  • 17. Test • In addition to the tests showned in the slides of slideshare, we perform another test now, certainly more exhaustive, which clearly shows the functionality simulating a piece of ETL load. • We start by creating a test table, initializing it from a system table. At this point we run an anonymous block of code (anonymous block is defined as the set of SQL statements included between a begin and an end) that will simulate a real load of a table with insert and delete of data. The sequence of steps is quite simple: you initialize global variables package to better identify the various steps in the messaging table. create table SALES as select * from tabs; begin mef.pv_sched_cod := 'Daily'; mef.pv_job_cod := 'Staging tables'; mef.pv_unit_cod := 'Load sales table'; mef.pv_exec_cnt := 10; mef.p_send('proc_prova','Load of SALES table'); mef.p_send('proc_prova','Deleting...'); delete from sales; mef.p_send('proc_prova','Deleted'); mef.p_send('proc_prova','Loading...'); insert into sales select * from tabs; mef.p_send('proc_prova','Loaded'); mef.p_mail('MEF','ETL_administrator','Sales table loaded'); mef.p_send('proc_prova','Load ended'); end; / • The final result obtained, which can be seen in the table MEF_MSG_LOT, it is very interesting, and gives you the wealth of contextual information that is talked about in the beginning.
  • 18. Conclusion • We saw in detail the steps required to build a messaging system, simple, but very useful for all the people working in the Data Warehouse projects. • This implementation, which is the basis of my Micro-ETL-Foundation, obviously works on any Oracle-PL/SQL project, is non- invasive, and can be applied at any time inserting simple procedure calls in an existing ETL process.