AWS Cloud Migration Case Study for Content Collection System
1. AWS Cloud Migration – Case Study
Madhusoodanan K M, Enterprise Architect, IBM GTS Lab
Hybrid Integration Solution - Innovation and Automation
March 11, 2017
2. AGENDA
• Business case - 5 min
• Pre-migration landscape - 5 min
• Migration approach - 5 min
• Target architecture - 5 min
• Lessons learned - 10 min
• Summary - 5 min
• Q&A - 10 min
3. About the Business Case
[Figure: raw materials to finished product. Labels shown: Governments, Corporations, Institutions, Publishers]
4. Pre-migration Landscape of the Content Collection System
[Figure: pre-migration architecture. Numbered components:]
1 - Collection Interface
2 - Content Master (As Collected, Derived Value Added)
3 - Data Interface
4 - Application Database
5 - Service Interface (R/R)
6 - Datafeed API
Other labels in the figure: Internal interface, Manual data entry, 3rd-party feed, Products
• Collects 1–2 million files per day (~20 GB per day) from more than 1,000 sources across the world
• Database (DB) storage: 5 TB (expected to grow to 20 TB in 3 years)
• SAN file storage used - 8 TB
• 500 GB data distributed to products per day
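The figures above imply an average file size and a database growth rate that the slide does not state. A minimal sketch of that arithmetic follows; decimal units (1 GB = 10^9 bytes) and steady compound growth are assumptions made here for illustration, not statements from the deck.

```python
# Figures quoted on the slide. Decimal units (1 GB = 1e9 bytes) are assumed.
FILES_PER_DAY_LOW = 1_000_000
FILES_PER_DAY_HIGH = 2_000_000
INGEST_BYTES_PER_DAY = 20e9
DB_TB_NOW = 5
DB_TB_IN_3_YEARS = 20
YEARS = 3


def average_file_size_kb(bytes_per_day: float, files_per_day: int) -> float:
    """Average size of one collected file, in kilobytes."""
    return bytes_per_day / files_per_day / 1e3


def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate needed to go from start to end."""
    return (end / start) ** (1 / years) - 1


# Fewer files per day means larger files on average, and vice versa.
avg_kb_upper = average_file_size_kb(INGEST_BYTES_PER_DAY, FILES_PER_DAY_LOW)
avg_kb_lower = average_file_size_kb(INGEST_BYTES_PER_DAY, FILES_PER_DAY_HIGH)
db_growth = implied_annual_growth(DB_TB_NOW, DB_TB_IN_3_YEARS, YEARS)

print(f"Average file size: {avg_kb_lower:.0f}-{avg_kb_upper:.0f} KB")
print(f"Implied DB growth: {db_growth:.0%} per year")
```

Under these assumptions the workload consists of many small files (roughly 10 to 20 KB each), and the database would need to grow by close to 60% per year to reach 20 TB.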
Challenges
• Performance and scalability challenges
– Vertical scalability limit reached for the DB server; index rebuilds do not complete within the maintenance windows
– Longer waits and a backlog in generating product (output) files
• DR: recovery time exceeds the business RTO
• Extensive rework required to scale DB horizontally
5. Migration Approach
Phased approach
• Phase 1
• Migrated the content collection interface (item 1) to EC2 + S3
• Migrated archival content to Glacier
• Migrated final content-for-distribution to Aurora (part of item 4)
• Phase 2
• Analytics DB and reporting to be migrated to Redshift (part of item 4)
• Use Lambda for final content formatting and distribution (HTTP request/response loads) (item 5)
• Phase 3
• Migrate the core DB to Aurora and migrate the business rule engines to AWS
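The deck does not say how archival content was moved to Glacier. One common way to do it, assuming the content already sits in S3, is a bucket lifecycle rule that transitions objects to the GLACIER storage class after a set number of days. The sketch below only builds the rule; the prefix, the 90-day threshold, and the bucket name in the comment are hypothetical values chosen for illustration.

```python
def build_archive_lifecycle(prefix: str, days_before_glacier: int) -> dict:
    """Build an S3 lifecycle configuration that moves objects under
    `prefix` to the GLACIER storage class after the given number of days."""
    if days_before_glacier < 0:
        raise ValueError("days_before_glacier must be non-negative")
    return {
        "Rules": [
            {
                "ID": f"archive-{prefix.strip('/')}-to-glacier",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days_before_glacier, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


# Hypothetical prefix and retention period, not taken from the deck.
config = build_archive_lifecycle("collected/", 90)

# Applying the rule needs boto3 and AWS credentials, so it is left as a comment:
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket",  # hypothetical bucket name
#       LifecycleConfiguration=config,
#   )
```

Keeping the rule as plain data makes it easy to review and version alongside the rest of the migration configuration before it is applied.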
7. Lessons Learned
• Focus on application logging and on separating application issues from infrastructure issues. (Lift-and-shift: watch out for land mines.)
• The consolidated error log should make it possible to diagnose a failure end to end and resolve it swiftly.
• Don't underestimate the complexity of re-engineering a few config files to make a legacy application "cloud ready"
• Encapsulate every possible service as an API
• Streamline release and change management
• Ensure component self-healing (automated start-up, health checks)
• Data life cycle management
• Need to cultivate a culture and discipline of ownership and accountability for the value of resource usage
• Any resource usage should be justified in terms of business value
• Identity and access management may have to be enhanced
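The first two lessons (separating application from infrastructure issues, and a consolidated log that supports end-to-end diagnosis) can be illustrated with structured log entries that carry a correlation ID and a layer tag. This is a minimal sketch using only the Python standard library; the field names `correlation_id` and `layer`, the logger name, and the sample messages are assumptions for illustration and do not describe the logging used in the actual migration.

```python
import io
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "component": record.name,
            # Hypothetical fields supplied through the `extra` argument.
            "layer": getattr(record, "layer", "application"),
            "correlation_id": getattr(record, "correlation_id", None),
            "message": record.getMessage(),
        })


def log_event(logger, level, message, correlation_id, layer):
    """Log a message tagged with its request ID and originating layer."""
    logger.log(level, message,
               extra={"correlation_id": correlation_id, "layer": layer})


def trace(entries, correlation_id):
    """Return every entry that belongs to one end-to-end request."""
    return [e for e in entries if e["correlation_id"] == correlation_id]


# Capture output in memory so the example is self-contained.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("collector")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

cid = uuid.uuid4().hex
log_event(logger, logging.INFO, "file received", cid, "application")
log_event(logger, logging.ERROR, "volume not mounted", cid, "infrastructure")
log_event(logger, logging.INFO, "file received", uuid.uuid4().hex, "application")

entries = [json.loads(line) for line in stream.getvalue().splitlines()]
failure_path = trace(entries, cid)
```

Filtering the consolidated entries by correlation ID yields the full path of one request, and the layer tag shows at a glance whether the failing step was an application or an infrastructure problem.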
8. SUMMARY
Benefits vs. Constraints
Constraints
• Need to re-engineer apps
• Loss of control
• Security & compliance
• IAM had to be supplemented with custom access management
• Data transfer overhead between AWS and the legacy data center
Benefits
• Faster time-to-market
• Flexible cost model
• Flexible capacity
• Economies of scale