Integration Design vs. Migration Design
Why you might need to use a traditional migration design for high-data-volume integration projects, and how to do it.
Session Abstract
When you have a large amount of data to integrate as part of an initial synchronization, a traditional transactional integration design may not perform fast enough to get the job done within the required timeframe. In this session you will learn why an integration project sometimes calls for a migration-style pattern. We will also share tips on migration design techniques that can help you import data with fast performance.
Agenda
• Performance best practice history
• Overview of integration design and migration design patterns
• Five migration design practices
Performance Best Practice History
• Message queue-based integration was the best for transaction design and for speed
• Cloud APIs introduce more latency to integration
• Batch capabilities are added to application APIs
• Data sets get larger
• Integration design and migration design might both be needed on one project
Integration Design – Transactional Pattern
• Part of many Scribe Insight templates
• Good for referential integrity and for processing transactions as units of work
• Good for duplicate avoidance and merging
• Leverages Microsoft Message Queuing for retry and performance
• Uses many convenience features built into Insight that rely on lookups
• But no longer always the fastest pattern, because it does not take full advantage of batch processing
Migration Design – Batch Pattern
• Optimized for the fastest performance by using bulk capabilities and reducing lookups at run time
• Referential integrity is more difficult to maintain
• Not the easiest approach, but the fastest as measured in rows per second
• Can be used for the initial sync phase or for regular large-scale imports
Migration Design Practices
• Using bulk when possible
• Upsert
• Using local resources for lookups
• Staging data
• Multi-processing
Bulk Processing
• Available with Insight and Scribe Online
• Insight
◦ Dynamics CRM, Salesforce, SQL Server
• Scribe Online
◦ Dynamics CRM, Salesforce, Eloqua, Marketo
• Note: Enabling bulk in Scribe Online usually improves speed even for transactional designs
Bulk Processing
Flow: select the source data, transform each source row, and store it in the bulk container; when the container is filled, write the bulk rows to the target, then continue with the next rows. You can control the bulk record size.
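A minimal sketch of that loop in Python, where transform_row is a placeholder field mapping and write_bulk is a hypothetical stand-in for the connector's bulk write:

```python
BULK_SIZE = 200  # you control the bulk record size

def transform_row(row):
    # placeholder transform: map source fields to target fields
    return {"Name": row["name"], "AccountNumber": row["id"]}

def run_bulk_load(source_rows, write_bulk, bulk_size=BULK_SIZE):
    """Accumulate transformed rows in a bulk container; flush when full."""
    container = []
    for row in source_rows:
        container.append(transform_row(row))   # transform data from source row
        if len(container) >= bulk_size:        # is the bulk container filled?
            write_bulk(container)              # write bulk rows to target
            container = []
    if container:                              # flush the final partial batch
        write_bulk(container)
```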
Comparison of Non-Bulk and Bulk - Salesforce
Operation           Source Rows   Records/Sec (Non-Bulk)   Records/Sec (Bulk)   % Increase
Salesforce Insert   5000          6.9                      81                   1074
Salesforce Update   5000          5.6                      140                  2400
Salesforce Delete   5000          4.9                      103                  2002
Salesforce Upsert   5000          2.3                      95                   4030
Comparison of Non-Bulk and Bulk – Dynamics CRM
Operation               Source Rows   Records/Sec (Non-Bulk)   Records/Sec (Bulk)   % Increase
CRM Online Insert       5000          3.2                      92                   2775
CRM Online Update       5000          1.5                      46                   2967
CRM Online Delete       5000          1.5                      45                   2900
CRM On-Premise Insert   5000          41.8                     317                  658
CRM On-Premise Update   5000          27.2                     202                  643
CRM On-Premise Delete   5000          19                       175                  821
Enabling Bulk In Insight
• Go into Configure Steps…
• Select a step
• Click the Operation tab
• Check Use Bulk Mode
* Scribe Insight
Enabling Bulk In Scribe Online
• Edit solution
• Edit map
• Edit block
• Check Process this operation in batches of…
* Scribe Online
“Upsert”
• Fast because of server-side processing
• Limits back-and-forth trips to make decisions in the mapping logic
• Reduces API hits, which Salesforce may charge extra for
• Compatible with bulk
• Insight
◦ Salesforce
◦ The adapter for Dynamics CRM will support CRM Online upsert in November
• Scribe Online
◦ Salesforce
◦ The connector for Dynamics CRM will support CRM Online upsert once the Insight adapter work is complete
• Salesforce offers “Upsert with Relationships,” which supports referential integrity without costly client-side lookups (see the sketch below)
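As a rough illustration of why upsert is a single server-side call, here is a sketch against the Salesforce REST API's external-ID upsert endpoint; the instance URL, access token, API version, and External_Id__c field name are all placeholders for your own org:

```python
import json
import urllib.request

# Sketch of a server-side upsert using the Salesforce REST API's external-ID
# pattern. instance_url, access_token, the API version, and External_Id__c
# are placeholders.
def upsert_account(instance_url, access_token, ext_id_value, fields):
    url = (f"{instance_url}/services/data/v52.0/sobjects/Account/"
           f"External_Id__c/{ext_id_value}")
    req = urllib.request.Request(
        url,
        data=json.dumps(fields).encode("utf-8"),
        method="PATCH",
        headers={"Authorization": f"Bearer {access_token}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status  # 201 = record created, 204 = record updated
```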
Use Local Resources for Lookups
• Use multi-target to do lookups against a local database when your inserts and updates run over a slower connection
• Faster than API/Web Service lookups
• Native local lookup features
◦ Insight: FILELOOKUP
◦ Scribe Online: lookup tables
• Works for:
◦ Domain table and picklist translations
◦ Primary ID pair mapping, “key cross-reference” (see the sketch below)
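A minimal sketch of a local key cross-reference, similar in spirit to Insight's FILELOOKUP or a Scribe Online lookup table; the key_xref.csv file and its source_id/target_id columns are an assumed layout:

```python
import csv

# Load a key cross-reference (source ID -> target ID) from a local file once,
# then resolve lookups in memory instead of per-row web-service calls.
def load_xref(path="key_xref.csv"):
    with open(path, newline="") as f:
        return {row["source_id"]: row["target_id"] for row in csv.DictReader(f)}

if __name__ == "__main__":
    xref = load_xref()
    print(xref.get("SRC-001"))  # resolved locally, no API round trip
```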
Local Lookup Scenario 1
• Dynamics CRM on-premise installation
• Insert a contact and relate it to its account
• Look up the account name in Dynamics CRM to get the account ID back
• Use the account ID in the contact insert to relate the contact to its account
Lookup Against Web Service
Diagram: for each contact to insert, the job looks up the account through the Dynamics CRM SDK/web service, which queries the Dynamics CRM database and returns the account ID; the contact insert then goes through the same web service and returns the insert result.
Lookup Against Local Database
Diagram: the account lookup runs directly against the Dynamics CRM database and returns the account ID without a web-service call; only the contact insert goes through the Dynamics CRM SDK/web service.
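A sketch of scenario 1 with the local lookup, assuming the on-premise CRM SQL database is reachable over ODBC; the connection string, the FilteredAccount view, and the insert_contact callable are placeholders:

```python
import pyodbc

# Resolve the account ID directly from the on-premise Dynamics CRM SQL
# database, then call the web service only for the contact insert.
CONN_STR = ("DRIVER={ODBC Driver 17 for SQL Server};"
            "SERVER=crmsql;DATABASE=Org_MSCRM;Trusted_Connection=yes")

def insert_contacts(contacts, insert_contact):
    with pyodbc.connect(CONN_STR) as conn:
        cursor = conn.cursor()
        for contact in contacts:
            row = cursor.execute(
                "SELECT accountid FROM FilteredAccount WHERE name = ?",
                contact["account_name"],
            ).fetchone()
            if row is None:
                continue  # or queue the contact until its account exists
            contact["parentcustomerid"] = row[0]  # relate contact to account
            insert_contact(contact)               # one web-service call per row
```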
Local Lookup Performance Improvement
Design                          Source Rows   Records/Sec   Time to Process
Update OR Insert, API Lookup    5,000         18            5 min 14 sec
Update OR Insert, SQL Lookup    5,000         44            2 min 28 sec
Update OR Insert, API Lookup    50,000        18            47 min 14 sec
Update OR Insert, SQL Lookup    50,000        44            19 min 16 sec
Update OR Insert, API Lookup    1,000,000     18            926 min (~15 hours)
Update OR Insert, SQL Lookup    1,000,000     44            379 min (~6 hours)
Local Lookup Scenario 2
• Cloud CRM system
• Insert sales order data into the cloud CRM system in near real time
• Inserting sales orders requires looking up price book information in the cloud CRM system
• Replicate the cloud CRM price book data to a lookup table in a local database
• Local lookups will be faster than lookups into the cloud CRM system
Lookup Using Cloud CRM System
Diagram: for each source row, the job gets price book information from the cloud CRM system across the firewall, then inserts the sales order into the cloud CRM system.
Lookup Using a Local Database
Diagram: price book information is replicated to a local database, so the lookup stays on the local side of the firewall; only the sales order insert crosses the firewall to the cloud CRM system.
• Sync price book data to a local database
• Do your lookups against the local database
• Insert sales orders faster (see the sketch below)
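A sketch of scenario 2, where fetch_price_book and insert_sales_order stand in for the cloud CRM connector calls and SQLite stands in for the local lookup database:

```python
import sqlite3

# Replicate price book data into a local table once, then resolve price
# lookups locally while inserting sales orders. Table and callable names
# are placeholders.
def sync_price_book(fetch_price_book, db_path="local_lookup.db"):
    with sqlite3.connect(db_path) as conn:
        conn.execute("CREATE TABLE IF NOT EXISTS price_book "
                     "(product_code TEXT PRIMARY KEY, unit_price REAL)")
        conn.executemany("INSERT OR REPLACE INTO price_book VALUES (?, ?)",
                         fetch_price_book())   # one pass across the firewall

def load_orders(order_rows, insert_sales_order, db_path="local_lookup.db"):
    with sqlite3.connect(db_path) as conn:
        for order in order_rows:
            row = conn.execute(
                "SELECT unit_price FROM price_book WHERE product_code = ?",
                (order["product_code"],),
            ).fetchone()
            if row:
                order["unit_price"] = row[0]   # local lookup, no cloud call
            insert_sales_order(order)          # cloud call only for the insert
```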
Staging
• Stage data in a local database where it can be prepared for batch processing
• May take multiple passes through the data to prepare it
• Works better with an empty target and de-duplicated source data
• Determine inserts and updates ahead of time (see the sketch below)
• Maintain referential integrity by bringing data down in passes
• Usually done in a SQL Server or Oracle database
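One way to determine inserts and updates ahead of time is a preparation pass over the staging table; in this sketch SQLite stands in for SQL Server or Oracle, and the staging_contact and key_xref tables are illustrative:

```python
import sqlite3

# Flag each staged row as an INSERT or an UPDATE before the batch run by
# checking it against a key cross-reference of IDs already in the target.
def flag_operations(db_path="staging.db"):
    with sqlite3.connect(db_path) as conn:
        conn.execute("""
            UPDATE staging_contact
            SET operation = CASE
                WHEN source_id IN (SELECT source_id FROM key_xref)
                    THEN 'UPDATE'
                ELSE 'INSERT'
            END
        """)
```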
Multi-Processing
• Run the integration across several processes
• Test to find the point of diminishing returns
◦ Understand your resources and adjust accordingly
◦ Monitor memory and CPU utilization
• With Insight 7.9, file-, time-, and query-based integrations can multi-process
• Plan for the referential integrity rules of the target (see the sketch below)
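A minimal sketch of running segments in parallel processes; process_segment stands in for one job run, and the worker count is something you tune by testing for the point of diminishing returns:

```python
from concurrent.futures import ProcessPoolExecutor

# Segment the source rows and run each segment in its own process. Segment
# so that referentially related records stay together in one segment.
def process_segment(rows):
    pass  # transform and write this segment to the target

def run_parallel(source_rows, workers=3):
    segments = [source_rows[i::workers] for i in range(workers)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        list(pool.map(process_segment, segments))

if __name__ == "__main__":
    run_parallel(list(range(15_000)))
```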
Multiple Workbenches
• Segment source data
• Be careful of referential integrity rules
• Can be done on one computer or across multiple computers
Job 1 Job 2 Job 3
* Scribe Insight
Multiple Solutions
Job 1 Job 2 Job 3
• Segment source data
• Be careful of referential integrity rules
• One Scribe Online agent can run multiple solutions at the same time
* Scribe Online
Wrap Up
• You can use traditional migration and batch design patterns when maximum performance is needed for integration jobs
• Using these design patterns is not the easy way, but it can be the fastest way to move data
• Major patterns:
◦ Bulk processing
◦ Upsert
◦ Local resources for lookups
◦ Data staging
◦ Multi-processing