Failure Recovery in Informatica
Restartable ETL jobs are crucial to job failure recovery, supportability and data quality in any
ETL system. In one of our prior articles we discussed different design techniques for ETL
restartability, independent of the ETL tool used. We can also implement restartability in an ETL job
using the workflow recovery capabilities of Informatica PowerCenter. In this article, let's see what is
required to set up an Informatica workflow for recovery.
What is Workflow Recovery
Workflow recovery allows you to continue processing the workflow and workflow tasks from the point
of interruption. During the workflow recovery process, the Integration Service accesses the workflow
state of operation, which is stored in memory or on disk depending on the recovery configuration.
The workflow state of operation includes the status of tasks in the workflow and the workflow
variable values. Setting up recovery involves three steps:
1. Workflow Configuration for Recovery
2. Session and Tasks Configuration for Recovery
3. Recovering the Workflow from Failure
1. Workflow Configuration for Recovery
To configure a workflow for recovery, we must enable the workflow for recovery or configure the
workflow to suspend on task error.
Enable Recovery : When you enable a workflow for recovery, the Integration Service saves the
workflow state of operation in a shared location. You can recover the workflow if it terminates,
stops, or aborts. The workflow does not have to be running.
We can set up automatic recovery in the workflow as shown in the image below.
Note : An optional High Availability (HA) license is required for this check box to be available
for selection. Without the HA option, workflows must be recovered manually. That is, you must
locate the failed workflow in the Workflow Monitor client and manually tell PowerCenter to
recover the workflow, or use the command line to recover the workflow.
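For the command-line route mentioned in the note, a minimal Python sketch is given here. It only assembles the argument list for the `pmcmd recoverworkflow` command and does not execute it. The service, domain, user, folder and workflow names, and the environment variable holding the password, are hypothetical placeholders; check the exact options against the pmcmd reference for your PowerCenter version.

```python
import shlex


def build_recover_command(service, domain, user, password_env_var, folder, workflow):
    """Return the argument list for a manual `pmcmd recoverworkflow` call.

    All argument values are supplied by the caller; nothing is executed here.
    """
    return [
        "pmcmd", "recoverworkflow",
        "-sv", service,            # Integration Service name
        "-d", domain,              # domain name
        "-u", user,                # repository user
        "-pv", password_env_var,   # name of the env variable holding the password
        "-f", folder,              # repository folder that contains the workflow
        workflow,                  # workflow to recover
    ]


# Hypothetical names, for illustration only.
cmd = build_recover_command(
    "IS_DEV", "Domain_Dev", "etl_user", "PM_PASSWORD", "SALES", "wf_load_orders"
)
print(shlex.join(cmd))
```

On a machine where pmcmd is installed, the list could be passed to `subprocess.run(cmd, check=True)`. Passing the name of an environment variable instead of the password itself keeps the password out of the process list.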
Suspend : When you configure a workflow to suspend on error, the Integration Service stores
the workflow state of operation in memory. If a task fails, the workflow is suspended; you can
then fix the task error and recover the workflow. A workflow also goes to the 'suspended' state
if it is not able to recover automatically from failure within the maximum allowed number of
attempts.
We can set up the workflow to suspend on error as shown in the image below.
2. Session and Tasks Configuration for Recovery
Each session or task in a workflow has its own recovery strategy. When the Integration Service
recovers a workflow, it recovers each task or session according to the recovery strategy
specified for it. Three different options are available:
Restart task
Fail task and continue workflow
Resume from the last checkpoint
Restart task : This recovery strategy is available for all types of workflow tasks. When the Integration
Service recovers a workflow, it restarts each recoverable task that is configured with a restart
strategy. You can configure Session and Command tasks with a restart recovery strategy. All other
tasks have a restart recovery strategy by default.
Fail task and continue workflow : This recovery strategy is only available for session and
command tasks. When the Integration Service recovers a workflow, it does not recover the task. The
task status becomes failed, and the Integration Service continues running the workflow. Configure a
fail recovery strategy if you want to complete the workflow, but you do not want to recover the task.
Resume from the last checkpoint : This recovery strategy is only available for session tasks. The
Integration Service saves the session state of operation and maintains target recovery tables. If the
session aborts, stops, or terminates, the Integration Service uses the saved recovery information to
resume the session from the point of interruption.
When you configure the session recovery strategy to resume from the last checkpoint, the Integration
Service stores the session state of operation in the shared location, $PMStorageDir. It also writes
recovery information to the recovery tables (PM_RECOVERY, PM_TGT_RUN_ID, PM_REC_STATE), which
it uses to determine where to begin loading data to the target tables in case of a recovery.
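The idea behind resuming from the last checkpoint can be illustrated with a small, tool-independent Python sketch. It is not how the Integration Service is implemented; the `state` dictionary only stands in for the saved recovery information, and the function name, the `commit_interval` and the `fail_at` parameter are assumptions made for the illustration.

```python
def load_with_checkpoint(rows, target, state, commit_interval=2, fail_at=None):
    """Load rows into target in batches and record a checkpoint after each commit.

    state["committed"] is the number of rows already committed to the target.
    fail_at, if set, simulates an interruption just before that row index is read.
    """
    start = state.get("committed", 0)  # resume from the last checkpoint
    pending = []
    for i in range(start, len(rows)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError(f"simulated failure before row {i}")
        pending.append(rows[i])
        if len(pending) == commit_interval:
            target.extend(pending)       # commit the batch
            state["committed"] = i + 1   # checkpoint only after the commit
            pending = []
    if pending:                          # commit the final partial batch
        target.extend(pending)
        state["committed"] = len(rows)


rows = ["r0", "r1", "r2", "r3", "r4"]
target, state = [], {}

try:
    load_with_checkpoint(rows, target, state, fail_at=3)
except RuntimeError:
    pass  # r2 was read but never committed, so it is lost with the failed run

# The recovery run starts at the checkpoint (row 2), not at row 0.
load_with_checkpoint(rows, target, state)
print(target)  # ['r0', 'r1', 'r2', 'r3', 'r4']
```

Because the checkpoint is advanced only after a batch is committed, the recovery run neither skips the uncommitted row nor loads the committed rows a second time.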
You can see the available session recovery options as shown in the image below.
You can see the available command task recovery options as shown in the image below.
3. Recovering the Workflow from Failure
A workflow can be recovered either automatically or manually, depending on the workflow recovery
strategy.
Recovering Automatically
If you have a High Availability (HA) license and the workflow is configured to recover automatically as
described above, the Integration Service automatically attempts to recover the workflow based on the
recovery strategy set for each session or task in the workflow. If the workflow is not able to recover
automatically from failure within the maximum allowed number of attempts, it goes to the 'suspended'
state, from which it can then be recovered manually.
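The retry-then-suspend behaviour described above can be pictured with a short Python sketch. It is a conceptual model only: `auto_recover`, the `run_workflow` callable and the status strings are hypothetical names chosen for the illustration, not part of any PowerCenter API.

```python
def auto_recover(run_workflow, max_attempts):
    """Try to recover up to max_attempts times.

    run_workflow is a callable that returns True when the recovery run succeeds.
    Returns a (status, attempts_used) tuple.
    """
    for attempt in range(1, max_attempts + 1):
        if run_workflow():
            return "succeeded", attempt
    # Every allowed attempt failed: the workflow waits for manual recovery.
    return "suspended", max_attempts


# A recovery that succeeds on the third attempt.
outcomes = iter([False, False, True])
recovered = auto_recover(lambda: next(outcomes), max_attempts=3)

# A recovery that never succeeds ends up suspended.
stuck = auto_recover(lambda: False, max_attempts=2)

print(recovered, stuck)
```

Here `max_attempts` plays the role of the maximum number of automatic recovery attempts configured on the workflow.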
Recovering Manually
If you do not have a High Availability (HA) license, you can manually recover the workflow, or recover
individual tasks within a workflow separately. You can access the options shown in the image below
from the Workflow Manager or from the Workflow Monitor.
Recover workflow :- Continue processing the workflow from the point of interruption.
Recover Task :- Recover a session but not the rest of the workflow.
Recover workflow from a task :- Recover a session and continue processing a workflow.