Integration Design vs. Migration Design
Why you might need to use a traditional migration design for high-data-volume integration projects, and how to do it.
Session Abstract
When you have a large amount of data to integrate as part of an initial synchronization, a traditional transactional integration design may not perform fast enough to get the job done within the required timeframe. In this session you will learn why an integration project sometimes calls for a migration-style pattern. We will also share tips on migration design techniques that can help you import data with fast performance.
Agenda
• Performance best practice history
• Overview of integration design and migration design patterns
• Five migration design practices
Performance Best Practice History
• Message queue-based integration was the best for transaction design and for speed
• Cloud APIs introduce more latency to integration
• Batch capabilities are added to application APIs
• Data sets get larger
• Integration design and migration design might both be needed on one project
Integration Design – Transactional Pattern
• Part of many Scribe Insight templates
• Good for referential integrity and for processing transactions as units of work
• Good for duplicate avoidance and merging
• Leverages Microsoft Message Queuing for retry and performance
• Uses many convenience features built into Insight that rely on lookups
• But no longer always the fastest pattern, because it does not take full advantage of batch processing
Migration Design – Batch Pattern
• Optimized for the fastest performance by using bulk capabilities and reducing lookups at run time
• Referential integrity is more difficult to maintain
• Not the easiest approach, but the fastest as measured in rows per second
• Can be used for the initial sync phase or for regular large-scale imports
Migration Design Practices
• Using bulk when possible
• Upsert
• Using local resources for lookups
• Staging data
• Multi-processing
Bulk Processing
• Available with Insight and Scribe Online
• Insight
◦ Dynamics CRM, Salesforce, SQL Server
• Scribe Online
◦ Dynamics CRM, Salesforce, Eloqua, Marketo
• Note: Enabling bulk in Scribe Online usually improves speed even for transactional designs
Bulk Processing
Flow: select the source data, transform each source row, and store it in the bulk container; when the container is filled, write the bulk rows to the target, then continue with the next rows. You can control the bulk record size.
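A minimal sketch of that loop in Python, where transform_row is a placeholder field mapping and write_bulk is a hypothetical stand-in for the connector's bulk write:

```python
BULK_SIZE = 200  # you control the bulk record size

def transform_row(row):
    # placeholder transform: map source fields to target fields
    return {"Name": row["name"], "AccountNumber": row["id"]}

def run_bulk_load(source_rows, write_bulk, bulk_size=BULK_SIZE):
    """Accumulate transformed rows in a bulk container; flush when full."""
    container = []
    for row in source_rows:
        container.append(transform_row(row))   # transform data from source row
        if len(container) >= bulk_size:        # is the bulk container filled?
            write_bulk(container)              # write bulk rows to target
            container = []
    if container:                              # flush the final partial batch
        write_bulk(container)
```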
Comparison of Non-Bulk and Bulk - Salesforce
Operation           Source Rows   Records/Sec (Non-Bulk)   Records/Sec (Bulk)   % Increase
Salesforce Insert   5000          6.9                      81                   1074
Salesforce Update   5000          5.6                      140                  2400
Salesforce Delete   5000          4.9                      103                  2002
Salesforce Upsert   5000          2.3                      95                   4030
Comparison of Non-Bulk and Bulk – Dynamics CRM
Operation               Source Rows   Records/Sec (Non-Bulk)   Records/Sec (Bulk)   % Increase
CRM Online Insert       5000          3.2                      92                   2775
CRM Online Update       5000          1.5                      46                   2967
CRM Online Delete       5000          1.5                      45                   2900
CRM On-Premise Insert   5000          41.8                     317                  658
CRM On-Premise Update   5000          27.2                     202                  643
CRM On-Premise Delete   5000          19                       175                  821
Enabling Bulk In Insight
• Go into Configure Steps…
• Select a step
• Click the Operation tab
• Check Use Bulk Mode
* Scribe Insight
Enabling Bulk In Scribe Online
• Edit solution
• Edit map
• Edit block
• Check Process this operation in batches of…
* Scribe Online
“Upsert”
• Fast because of server-side processing
• Limits back-and-forth trips to make decisions in the mapping logic
• Reduces API hits, which Salesforce may charge extra for
• Compatible with bulk
• Insight
◦ Salesforce
◦ The adapter for Dynamics CRM will support CRM Online upsert in November
• Scribe Online
◦ Salesforce
◦ The connector for Dynamics CRM will support CRM Online upsert once the Insight adapter work is complete
• Salesforce offers “Upsert with Relationships,” which supports referential integrity without costly client-side lookups (see the sketch below)
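As a rough illustration of why upsert is a single server-side call, here is a sketch against the Salesforce REST API's external-ID upsert endpoint; the instance URL, access token, API version, and External_Id__c field name are all placeholders for your own org:

```python
import json
import urllib.request

# Sketch of a server-side upsert using the Salesforce REST API's external-ID
# pattern. instance_url, access_token, the API version, and External_Id__c
# are placeholders.
def upsert_account(instance_url, access_token, ext_id_value, fields):
    url = (f"{instance_url}/services/data/v52.0/sobjects/Account/"
           f"External_Id__c/{ext_id_value}")
    req = urllib.request.Request(
        url,
        data=json.dumps(fields).encode("utf-8"),
        method="PATCH",
        headers={"Authorization": f"Bearer {access_token}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status  # 201 = record created, 204 = record updated
```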
Use Local Resources for Lookups
• Use multi-target to do lookups against a local database when your inserts and updates run over a slower connection
• Faster than API/Web Service lookups
• Native local lookup features
◦ Insight: FILELOOKUP
◦ Scribe Online: lookup tables
• Works for:
◦ Domain table and picklist translations
◦ Primary ID pair mapping, “key cross-reference” (see the sketch below)
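A minimal sketch of a local key cross-reference, similar in spirit to Insight's FILELOOKUP or a Scribe Online lookup table; the key_xref.csv file and its source_id/target_id columns are an assumed layout:

```python
import csv

# Load a key cross-reference (source ID -> target ID) from a local file once,
# then resolve lookups in memory instead of per-row web-service calls.
def load_xref(path="key_xref.csv"):
    with open(path, newline="") as f:
        return {row["source_id"]: row["target_id"] for row in csv.DictReader(f)}

if __name__ == "__main__":
    xref = load_xref()
    print(xref.get("SRC-001"))  # resolved locally, no API round trip
```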
Local Lookup Scenario 1
• Dynamics CRM on-premise installation
• Insert a contact and relate it to its account
• Look up the account name in Dynamics CRM to get the account ID back
• Use the account ID in the contact insert to relate the contact to its account
Lookup Against Web Service
Diagram: for each contact to insert, the job looks up the account through the Dynamics CRM SDK/web service, which queries the Dynamics CRM database and returns the account ID; the contact insert then goes through the same web service and returns the insert result.
Lookup Against Local Database
Diagram: the account lookup runs directly against the Dynamics CRM database and returns the account ID without a web-service call; only the contact insert goes through the Dynamics CRM SDK/web service.
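A sketch of scenario 1 with the local lookup, assuming the on-premise CRM SQL database is reachable over ODBC; the connection string, the FilteredAccount view, and the insert_contact callable are placeholders:

```python
import pyodbc

# Resolve the account ID directly from the on-premise Dynamics CRM SQL
# database, then call the web service only for the contact insert.
CONN_STR = ("DRIVER={ODBC Driver 17 for SQL Server};"
            "SERVER=crmsql;DATABASE=Org_MSCRM;Trusted_Connection=yes")

def insert_contacts(contacts, insert_contact):
    with pyodbc.connect(CONN_STR) as conn:
        cursor = conn.cursor()
        for contact in contacts:
            row = cursor.execute(
                "SELECT accountid FROM FilteredAccount WHERE name = ?",
                contact["account_name"],
            ).fetchone()
            if row is None:
                continue  # or queue the contact until its account exists
            contact["parentcustomerid"] = row[0]  # relate contact to account
            insert_contact(contact)               # one web-service call per row
```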
Local Lookup Performance Improvement
Design                          Source Rows   Records/Sec   Time to Process
Update OR Insert, API Lookup    5,000         18            5 min 14 sec
Update OR Insert, SQL Lookup    5,000         44            2 min 28 sec
Update OR Insert, API Lookup    50,000        18            47 min 14 sec
Update OR Insert, SQL Lookup    50,000        44            19 min 16 sec
Update OR Insert, API Lookup    1,000,000     18            926 min (~15 hours)
Update OR Insert, SQL Lookup    1,000,000     44            379 min (~6 hours)
Local Lookup Scenario 2
• Cloud CRM system
• Insert sales order data into the cloud CRM system in near real time
• Inserting sales orders requires looking up price book information in the cloud CRM system
• Replicate the cloud CRM price book data to a lookup table in a local database
• Local lookups will be faster than lookups into the cloud CRM system
Lookup Using Cloud CRM System
Diagram: for each source row, the job gets price book information from the cloud CRM system across the firewall, then inserts the sales order into the cloud CRM system.
Lookup Using a Local Database
Diagram: price book information is replicated to a local database, so the lookup stays on the local side of the firewall; only the sales order insert crosses the firewall to the cloud CRM system.
• Sync price book data to a local database
• Do your lookups against the local database
• Insert sales orders faster (see the sketch below)
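A sketch of scenario 2, where fetch_price_book and insert_sales_order stand in for the cloud CRM connector calls and SQLite stands in for the local lookup database:

```python
import sqlite3

# Replicate price book data into a local table once, then resolve price
# lookups locally while inserting sales orders. Table and callable names
# are placeholders.
def sync_price_book(fetch_price_book, db_path="local_lookup.db"):
    with sqlite3.connect(db_path) as conn:
        conn.execute("CREATE TABLE IF NOT EXISTS price_book "
                     "(product_code TEXT PRIMARY KEY, unit_price REAL)")
        conn.executemany("INSERT OR REPLACE INTO price_book VALUES (?, ?)",
                         fetch_price_book())   # one pass across the firewall

def load_orders(order_rows, insert_sales_order, db_path="local_lookup.db"):
    with sqlite3.connect(db_path) as conn:
        for order in order_rows:
            row = conn.execute(
                "SELECT unit_price FROM price_book WHERE product_code = ?",
                (order["product_code"],),
            ).fetchone()
            if row:
                order["unit_price"] = row[0]   # local lookup, no cloud call
            insert_sales_order(order)          # cloud call only for the insert
```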
Staging
• Stage data in a local database where it can be prepared for batch processing
• May take multiple passes through the data to prepare it
• Works better with an empty target and de-duplicated source data
• Determine inserts and updates ahead of time (see the sketch below)
• Maintain referential integrity by bringing data down in passes
• Usually done in a SQL Server or Oracle database
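One way to determine inserts and updates ahead of time is a preparation pass over the staging table; in this sketch SQLite stands in for SQL Server or Oracle, and the staging_contact and key_xref tables are illustrative:

```python
import sqlite3

# Flag each staged row as an INSERT or an UPDATE before the batch run by
# checking it against a key cross-reference of IDs already in the target.
def flag_operations(db_path="staging.db"):
    with sqlite3.connect(db_path) as conn:
        conn.execute("""
            UPDATE staging_contact
            SET operation = CASE
                WHEN source_id IN (SELECT source_id FROM key_xref)
                    THEN 'UPDATE'
                ELSE 'INSERT'
            END
        """)
```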
Multi-Processing
• Run the integration across several processes
• Test to find the point of diminishing returns
◦ Understand your resources and adjust accordingly
◦ Monitor memory and CPU utilization
• With Insight 7.9, file-, time-, and query-based integrations can multi-process
• Plan for the referential integrity rules of the target (see the sketch below)
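A minimal sketch of running segments in parallel processes; process_segment stands in for one job run, and the worker count is something you tune by testing for the point of diminishing returns:

```python
from concurrent.futures import ProcessPoolExecutor

# Segment the source rows and run each segment in its own process. Segment
# so that referentially related records stay together in one segment.
def process_segment(rows):
    pass  # transform and write this segment to the target

def run_parallel(source_rows, workers=3):
    segments = [source_rows[i::workers] for i in range(workers)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        list(pool.map(process_segment, segments))

if __name__ == "__main__":
    run_parallel(list(range(15_000)))
```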
Multiple Workbenches
• Segment source data
• Be careful of referential integrity rules
• Can be done on one computer or across multiple computers
Job 1 Job 2 Job 3
* Scribe Insight
Multiple Solutions
Job 1 Job 2 Job 3
• Segment source data
• Be careful of referential integrity rules
• One Scribe Online agent can run multiple solutions at the same time
* Scribe Online
Wrap Up
• You can use traditional migration and batch design patterns when maximum performance is needed for integration jobs
• Using these design patterns is not the easy way, but it can be the fastest way to move data
• Major patterns:
◦ Bulk processing
◦ Upsert
◦ Local resources for lookups
◦ Data staging
◦ Multi-processing