Migrating to Cloud: Inhouse
Hadoop to Databricks
Modernize your Enterprise Data Lake to Serverless Data Lake,
where data, workloads, and orchestrations can be automatically
migrated to the cloud-native infrastructure.
Migration of applications is a good thing. It forces the organization to clean up junk that is never used. It brings
innovation and new ideas to your engineering teams. It builds confidence in your teams that future migrations need
not be stressful, and it pushes teams to design flexible systems. It also sends a message to vendors that
you are not bluffing about pulling the plug if you don’t see the results you expect.
Some of the benefits, tangible and intangible, that our customers achieved by migrating from an on-premise solution to Databricks include:
Reduced commercial license and maintenance costs
Reduced cluster costs, as you can leverage Databricks auto-scaling (up and down) and spot instance pricing
Reduced labor cost of creating new infrastructure
Access to cloud-based services (Azure Data Factory and Azure DevOps, for example) and other cloud-native services such as
Lambda, EKS, S3/AZFS, etc.
Reduced maintenance costs
Easier version upgrades
Improved performance due to Databricks file system performance innovations
Easier development with notebooks
The list goes on.
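As a rough illustration of the auto-scaling and spot-pricing benefit listed above, the sketch below builds a cluster specification as a plain Python dictionary. The field names (`autoscale`, `min_workers`, `max_workers`, `aws_attributes`, `availability`, `first_on_demand`, `autotermination_minutes`) follow the Databricks Clusters API on AWS as we understand it. The cluster name, runtime version, node type, and worker counts are placeholder assumptions, not recommendations, and `build_cluster_spec` is a hypothetical helper.

```python
import json


def build_cluster_spec(name, min_workers, max_workers):
    """Return a cluster spec dict with auto-scaling and spot instances.

    All concrete values are illustrative placeholders; verify the field
    names against the Clusters API version used by your workspace.
    """
    if not 0 <= min_workers <= max_workers:
        raise ValueError("need 0 <= min_workers <= max_workers")
    return {
        "cluster_name": name,
        "spark_version": "13.3.x-scala2.12",  # placeholder runtime version
        "node_type_id": "i3.xlarge",          # placeholder node type
        # The cluster grows and shrinks between these bounds with load.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        "aws_attributes": {
            "first_on_demand": 1,  # keep the driver on an on-demand node
            "availability": "SPOT_WITH_FALLBACK",  # prefer spot, fall back if unavailable
        },
        "autotermination_minutes": 30,  # stop paying for an idle cluster
    }


spec = build_cluster_spec("migration-etl", 2, 8)
print(json.dumps(spec, indent=2))
```

The resulting dictionary could be submitted to the cluster-creation endpoint or kept in version control next to the job definitions; which of the two fits better depends on how your team manages infrastructure.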
But it is also important that the migration delivers something tangible for the business. Keeping your business partners aware
of the migration goals and expected results will greatly increase confidence in your capability and foster team spirit.
The following is the Knoldus Migration Framework, which has been tried and tested and covers the most important points of a
typical migration:
Planning and Communication phase
Phase 1
In this phase you will achieve the following:
Just like the White House coronavirus task force, form a team of experienced project managers, architects, and business
users. Ensure there is sufficient technical expertise, since this is primarily a technical project.
Establish a communication plan with the impacted teams. More often than not, migrations impact multiple organizational teams,
which could be application-owning teams and/or internal teams (security, infrastructure, database, etc.).
Collect an inventory of applications with thorough details, including application complexities, critical blackout periods that
impact schedules, critical staffing needs, etc.
Publish a roadmap with tentative dates that are subject to change based on the application complexities.
Establish the KPIs.
Business KPIs (e.g. accuracy of predictions)
Performance KPIs (total run time)
Financial KPIs (total monthly cost reduction)
Operational KPIs (number of people required for maintenance)
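The four KPI categories above can be recorded in a small structure so that they can be checked mechanically once the new platform is running. This is a minimal sketch: the `Kpi` class, the KPI names, and every number in it are hypothetical values chosen for illustration, not figures from any customer.

```python
from dataclasses import dataclass


@dataclass
class Kpi:
    name: str
    baseline: float         # value measured on the current platform
    target: float           # value agreed with the business for the new platform
    higher_is_better: bool  # direction in which the KPI improves

    def is_met(self, measured: float) -> bool:
        """Return True when the measured value reaches the target."""
        if self.higher_is_better:
            return measured >= self.target
        return measured <= self.target


# One hypothetical KPI per category: business, performance, financial, operational.
kpis = [
    Kpi("prediction_accuracy", baseline=0.80, target=0.80, higher_is_better=True),
    Kpi("total_run_time_hours", baseline=10.0, target=6.0, higher_is_better=False),
    Kpi("monthly_cost_usd", baseline=50000.0, target=35000.0, higher_is_better=False),
    Kpi("maintenance_staff", baseline=5, target=3, higher_is_better=False),
]

# Hypothetical measurements taken after the migration.
measured = {
    "prediction_accuracy": 0.82,
    "total_run_time_hours": 7.5,
    "monthly_cost_usd": 33000.0,
    "maintenance_staff": 3,
}

report = {kpi.name: kpi.is_met(measured[kpi.name]) for kpi in kpis}
print(report)
```

Recording the baseline alongside the target keeps the comparison honest when the results are reviewed with business partners.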
Define the organization structure.
Establishing a team involves several different factors. For a large organization, we established the following structure;
however, you should consider your own organizational factors before designing the migration team.
Central Migration Team
Sample Questions to Ask for Cloudera-Databricks
Ques 1. What is the key goal of this migration?
Sunsetting Cloudera to save license cost?
Improve pipeline performance (total end-to-end elapsed time)?
Cloudera cluster needs more capacity, hence the need for a flexible resource model?
Intend to leverage other cloud services (for example, Azure Data Factory)?
Better automation?
Ease of use for data scientists (i.e. new features using notebooks)?
Reduce infrastructure maintenance costs?
Ques 2. What are the size and nature of the data that needs to be migrated?
Ques 3. What are the high-level data ingress and egress needs?
Ques 4. Are the locations of the GitHub, Jenkins, Jira, and Confluence setups identified?
Ques 5. Who has to approve the merge requests?
Architecture Detailing Phase
Phase 2
This is by far the most critical phase, and the success of the migration depends heavily on what happens during it.
Engage an experienced ‘Target System Specialist’ to take a look at the current applications from an architecture
standpoint.
Identify mismatches in architecture.
Prescribe the target architecture by collaborating with the target system vendor.
Define projects to re-engineer the current system, if that is required prior to migration.
Adjust and publish schedules back to the teams based on this detailed assessment. At this point schedules tend to
be much clearer and more detailed.
One of the most important decisions in the migration of any application is whether to make it ‘Cloud Native’, ‘Lift and
Shift’, or something in between. This decision should be taken after understanding the current application in detail.
Example:
One of our customers recently migrated from Cloudera to Databricks. The customer is a large, successful American
grocer that needed to predict future sales based on historic sales data and promotions. These predictions happened at
an item category level. The existing pipeline accomplished this by running all the data related to one category through a
large R application, which is single-threaded and makes extensive use of memory.
There were two architectural choices. The first was to rewrite the code to use Spark's parallelized algorithms, which means the entire pipeline
would need to be rearchitected from the ground up. The second was to use lapply, a pseudo-parallelization construct in Spark (spark.lapply in SparkR), which lets us run
the code in its entirety in the native R runtime without a rewrite. After internal discussion, and due to time constraints,
we decided to migrate without rewriting the code, although a rewrite would be the better choice in the long run.
The bottom line is that such decisions should be made well in advance, if you have the luxury of expertise and time. Failing that,
you put the team under extreme pressure, which may result in production failures and failed projects.
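The second option in the example, running the unchanged single-threaded code once per item category and letting the platform fan the categories out, can be sketched locally as follows. This is only an analogy: a Python thread pool stands in for the Spark cluster, `forecast_category` is a hypothetical placeholder for the customer's R model, and the sales figures are made up.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def forecast_category(item):
    """Stand-in for the unchanged single-threaded model run on one category."""
    category, weekly_sales = item
    # Naive placeholder forecast: average of the last three observations.
    return category, mean(weekly_sales[-3:])


def forecast_all(sales_by_category, max_workers=4):
    """Apply the per-category function to every category in parallel."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(forecast_category, sales_by_category.items()))


# Hypothetical weekly sales per item category.
sales = {
    "dairy": [100, 110, 120, 130],
    "bakery": [50, 50, 50, 50],
    "produce": [10, 20, 30, 40],
}

forecasts = forecast_all(sales)
print(forecasts)
```

The model code itself is untouched; only the loop over categories changes. That is why this route is quick to migrate, and also why it does not remove the per-category memory pressure that a full rewrite would address.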
Lift and Shift
Far too often, companies under the stress of migration resort to a lift and shift approach. Knoldus highly recommends
a cloud-native approach, wherein the application leverages the full potential of cloud-based architectures to gain long-term
customer delight and a reduction in support costs.
Lift and Shift Migration
However, should you decide to go with lift and shift, consider the following.
Is the application intensive in incoming data or outgoing data? This has implications for data transfer costs.
What are the key components used?
Do you intend to plug in local or cloud-based monitoring systems?
How much intermediary storage is required?
How do you manage the application's configurations to tune its behavior?
What kind of integrations are necessary?
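The first question in the list above matters because cloud providers typically price inbound and outbound traffic differently. A back-of-the-envelope estimate can be sketched as below; the function name and the per-GB rates are hypothetical assumptions, so substitute the figures from your provider's current price list.

```python
def monthly_transfer_cost(ingress_gb, egress_gb,
                          ingress_rate_per_gb, egress_rate_per_gb):
    """Estimate the monthly data transfer cost for one application."""
    return ingress_gb * ingress_rate_per_gb + egress_gb * egress_rate_per_gb


# Hypothetical rates: free ingress, 0.09 currency units per GB of egress.
INGRESS_RATE = 0.0
EGRESS_RATE = 0.09

# An ingest-heavy application versus an export-heavy one (volumes are made up).
ingest_heavy = monthly_transfer_cost(10_000, 200, INGRESS_RATE, EGRESS_RATE)
export_heavy = monthly_transfer_cost(200, 10_000, INGRESS_RATE, EGRESS_RATE)

print(f"ingest-heavy: {ingest_heavy:.2f}, export-heavy: {export_heavy:.2f}")
```

Under these assumed rates the export-heavy application costs far more to run for the same total volume, which is the kind of difference worth surfacing before committing to lift and shift.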
Ques 1.
Sample questions to ask
ML
External libraries and Enrichment of data
ETL
Security / Data Redaction
Programming languages used
Observe current Spark job output for high shuffle memory usage and task failures
Are applications enabled with CI/CD?
Do applications use logging extensively?
What parts of the code will be in notebooks vs. what parts in JARs?
Are there any monitoring or logging tools currently in use that also need migration?
Job Dependencies
Criticality of output
Common Errors
High RAM requirements
Joins that are too large
Broadcasts that are too large
Ques 2. Are there any non-standard architectures or procedures used?
Single-threaded apps
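Two of the common errors listed above, joins and broadcasts that are too large, often come down to comparing the size of the smaller table with Spark's broadcast threshold. The sketch below is a deliberately simplified decision rule: `choose_join_strategy` is a hypothetical helper, Spark's real planner also uses table statistics and hints, and the 10 MB constant is the documented default of `spark.sql.autoBroadcastJoinThreshold`, which your cluster may override.

```python
# Documented default of spark.sql.autoBroadcastJoinThreshold (10 MB).
DEFAULT_BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024


def choose_join_strategy(left_bytes, right_bytes,
                         threshold_bytes=DEFAULT_BROADCAST_THRESHOLD_BYTES):
    """Return "broadcast" if the smaller side fits under the threshold.

    A negative threshold mirrors Spark's convention of -1 meaning that
    automatic broadcast joins are disabled.
    """
    smaller = min(left_bytes, right_bytes)
    if threshold_bytes >= 0 and smaller <= threshold_bytes:
        return "broadcast"
    return "shuffle"


GB = 1024 ** 3
MB = 1024 ** 2

print(choose_join_strategy(5 * GB, 2 * MB))      # small lookup table
print(choose_join_strategy(5 * GB, 500 * MB))    # too large to broadcast
print(choose_join_strategy(5 * GB, 2 * MB, -1))  # broadcast disabled
```

Running a check like this over the table sizes collected in the application inventory gives an early list of joins that are likely to shuffle heavily on the new platform.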
Pre Execution (Build Jira board)
Phase 3
Architecture detailing will give sufficient details to build the Jira board.
At Knoldus, we use the SAFe Agile process for managing multiple projects at the same time.
Conduct program increment planning that identifies relationships and dependencies between
multiple teams.
Break down overall goals into sprint goals.
Identify epics, features, stories, and spikes.
Create your Jira board.
Provide sufficient time for teams to understand their next 3-week sprint goals and discuss the issues raised. Use the inputs to
adjust the stories.
Some level of estimation is important to recognize large tasks. Tasks that are too large need to be split so that they are
manageable within the sprint.
Document key architectures and pipelines on Confluence. Do an architecture review with key stakeholders.
Document the environment strategy. Are clusters dedicated to testing, staging, and production?
Document spikes and their potential scenarios. For example, if we want to convert a critical piece of logic from R to
Scala, what will be the plan if it succeeds or fails?
Sample questions to ask:
Ques 1. What is the current collaboration design? For example, can multiple users execute the same job?
Ques 2. Is this collaboration transferable to Databricks notebooks?
Ques 3. What is the definition of done? Are CI/CD pipelines included?
Ques 4. How do we test the output accuracy? Do we need to write code to automatically test results on the new platform?
Ques 5. What is the testing process? Are test scripts prepared and ready?
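For the question about testing output accuracy automatically, one possible shape for such a check is sketched below. It assumes both platforms can export their results as a mapping from a key (here an item category) to a numeric prediction. The function name, the tolerance, and the sample values are hypothetical.

```python
import math


def compare_outputs(legacy, migrated, rel_tol=1e-6):
    """Return a list of (key, legacy_value, migrated_value) mismatches.

    A key missing on either side counts as a mismatch; numeric values are
    compared with a relative tolerance to absorb floating-point noise.
    """
    mismatches = []
    for key in sorted(set(legacy) | set(migrated)):
        old, new = legacy.get(key), migrated.get(key)
        if old is None or new is None or not math.isclose(old, new, rel_tol=rel_tol):
            mismatches.append((key, old, new))
    return mismatches


# Hypothetical predictions from the old and the new platform.
legacy_run = {"dairy": 120.0, "bakery": 50.0, "produce": 30.0}
migrated_run = {"dairy": 120.00000001, "bakery": 50.5, "frozen": 12.0}

for key, old, new in compare_outputs(legacy_run, migrated_run):
    print(f"{key}: legacy={old} migrated={new}")
```

An empty result can then serve as one item in the definition of done for each migrated pipeline. The right tolerance depends on how sensitive the business KPI is to small numeric differences.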
Execution
Phase 4
This is the easy part. It’s time to just execute based on the Jira board.
Is the foundation laid well?
Are clusters deployed?
Is the security setup in place? Which notebook folders are open to which users? How do users share code and data?
Are users trained on the new technology?
Ensure Jira board updates are reflected on each team’s Jira boards.
The scrum master should check with other scrum teams whether the dependencies expected to be complete are on track, or whether they
will impact the sprint deliverables.
Is unit testing being rigorously followed?
Are we following true agile, wherein some functionality is demonstrated in demos?
Are there any overlap issues in using the infrastructure? For example, if a job is run by two different users, what is the damage?
Are we using Slack to effectively notify all teams of a potential shutdown?
Closure
Phase 5
Measure and understand whether the KPIs are met.
If they are not met, introspect and identify what needs to be done.
Are the basic essential KPIs met, so that we can go live and address the technical debt afterwards?
Identify and document all technical debt.
Define a plan to address the technical debt.
Has the new system been up and running long enough to hand over to production support?
Celebrate.
Once you are in the cloud, you will have several tools, frameworks, and new architecture patterns at your
disposal, which immensely increases your ability to respond to business needs.
Cloud managed services
www.knoldus.com
https://www.knoldus.com/connect/contact-us
We encourage you to work with experienced application architects and teams who have exposure to
cloud-native and reactive architectures to continue the journey of digital transformation.
We hope Knoldus can be a partner in your journey. Get in touch with us to schedule a call with
our expert, or drop us a line at hello@knoldus.com.
Let’s Talk
For more such insights, follow us here:
https://www.linkedin.com/company/knoldus/about/
https://twitter.com/Knolspeak
https://www.youtube.com/channel/UCP4g5qGeUSY7OokXfim1QCQ

Weitere ähnliche Inhalte

Was ist angesagt?

Data Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsData Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsDATAVERSITY
 
ADV Slides: Strategies for Fitting a Data Lake into a Modern Data Architecture
ADV Slides: Strategies for Fitting a Data Lake into a Modern Data ArchitectureADV Slides: Strategies for Fitting a Data Lake into a Modern Data Architecture
ADV Slides: Strategies for Fitting a Data Lake into a Modern Data ArchitectureDATAVERSITY
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture DesignKujambu Murugesan
 
Emerging Trends in Data Engineering
Emerging Trends in Data EngineeringEmerging Trends in Data Engineering
Emerging Trends in Data EngineeringAnanth PackkilDurai
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
Data Warehousing in the Cloud: Practical Migration Strategies
Data Warehousing in the Cloud: Practical Migration Strategies Data Warehousing in the Cloud: Practical Migration Strategies
Data Warehousing in the Cloud: Practical Migration Strategies SnapLogic
 
Cloud Migration Cookbook: A Guide To Moving Your Apps To The Cloud
Cloud Migration Cookbook: A Guide To Moving Your Apps To The CloudCloud Migration Cookbook: A Guide To Moving Your Apps To The Cloud
Cloud Migration Cookbook: A Guide To Moving Your Apps To The CloudNew Relic
 
Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Databricks
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overviewJames Serra
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...DataScienceConferenc1
 
Getting started on your AWS migration journey
Getting started on your AWS migration journeyGetting started on your AWS migration journey
Getting started on your AWS migration journeyAmazon Web Services
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta LakeDatabricks
 
Modernize & Automate Analytics Data Pipelines
Modernize & Automate Analytics Data PipelinesModernize & Automate Analytics Data Pipelines
Modernize & Automate Analytics Data PipelinesCarole Gunst
 
Data cleansing and prep with synapse data flows
Data cleansing and prep with synapse data flowsData cleansing and prep with synapse data flows
Data cleansing and prep with synapse data flowsMark Kromer
 
Cloud Adoption Framework - Overview_partner.pptx
Cloud Adoption Framework - Overview_partner.pptxCloud Adoption Framework - Overview_partner.pptx
Cloud Adoption Framework - Overview_partner.pptxabhishek22611
 
Improving Data Literacy Around Data Architecture
Improving Data Literacy Around Data ArchitectureImproving Data Literacy Around Data Architecture
Improving Data Literacy Around Data ArchitectureDATAVERSITY
 
Getting Started with AWS Database Migration Service
Getting Started with AWS Database Migration ServiceGetting Started with AWS Database Migration Service
Getting Started with AWS Database Migration ServiceAmazon Web Services
 
Migrating Databases to the Cloud: Introduction to AWS DMS - SRV215 - Chicago ...
Migrating Databases to the Cloud: Introduction to AWS DMS - SRV215 - Chicago ...Migrating Databases to the Cloud: Introduction to AWS DMS - SRV215 - Chicago ...
Migrating Databases to the Cloud: Introduction to AWS DMS - SRV215 - Chicago ...Amazon Web Services
 

Was ist angesagt? (20)

Data Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsData Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced Analytics
 
ADV Slides: Strategies for Fitting a Data Lake into a Modern Data Architecture
ADV Slides: Strategies for Fitting a Data Lake into a Modern Data ArchitectureADV Slides: Strategies for Fitting a Data Lake into a Modern Data Architecture
ADV Slides: Strategies for Fitting a Data Lake into a Modern Data Architecture
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
 
Emerging Trends in Data Engineering
Emerging Trends in Data EngineeringEmerging Trends in Data Engineering
Emerging Trends in Data Engineering
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Data Warehousing in the Cloud: Practical Migration Strategies
Data Warehousing in the Cloud: Practical Migration Strategies Data Warehousing in the Cloud: Practical Migration Strategies
Data Warehousing in the Cloud: Practical Migration Strategies
 
Cloud Migration Cookbook: A Guide To Moving Your Apps To The Cloud
Cloud Migration Cookbook: A Guide To Moving Your Apps To The CloudCloud Migration Cookbook: A Guide To Moving Your Apps To The Cloud
Cloud Migration Cookbook: A Guide To Moving Your Apps To The Cloud
 
Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overview
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
 
Getting started on your AWS migration journey
Getting started on your AWS migration journeyGetting started on your AWS migration journey
Getting started on your AWS migration journey
 
Modern Data Platform on AWS
Modern Data Platform on AWSModern Data Platform on AWS
Modern Data Platform on AWS
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta Lake
 
Modernize & Automate Analytics Data Pipelines
Modernize & Automate Analytics Data PipelinesModernize & Automate Analytics Data Pipelines
Modernize & Automate Analytics Data Pipelines
 
Data cleansing and prep with synapse data flows
Data cleansing and prep with synapse data flowsData cleansing and prep with synapse data flows
Data cleansing and prep with synapse data flows
 
Cloud Adoption Framework - Overview_partner.pptx
Cloud Adoption Framework - Overview_partner.pptxCloud Adoption Framework - Overview_partner.pptx
Cloud Adoption Framework - Overview_partner.pptx
 
Improving Data Literacy Around Data Architecture
Improving Data Literacy Around Data ArchitectureImproving Data Literacy Around Data Architecture
Improving Data Literacy Around Data Architecture
 
Getting Started with AWS Database Migration Service
Getting Started with AWS Database Migration ServiceGetting Started with AWS Database Migration Service
Getting Started with AWS Database Migration Service
 
Migrating Databases to the Cloud: Introduction to AWS DMS - SRV215 - Chicago ...
Migrating Databases to the Cloud: Introduction to AWS DMS - SRV215 - Chicago ...Migrating Databases to the Cloud: Introduction to AWS DMS - SRV215 - Chicago ...
Migrating Databases to the Cloud: Introduction to AWS DMS - SRV215 - Chicago ...
 

Ähnlich wie Migrating to Cloud: Inhouse Hadoop to Databricks (3)

Planning for a (Mostly) Hassle-Free Cloud Migration | VTUG 2016 Winter Warmer
Planning for a (Mostly) Hassle-Free Cloud Migration | VTUG 2016 Winter WarmerPlanning for a (Mostly) Hassle-Free Cloud Migration | VTUG 2016 Winter Warmer
Planning for a (Mostly) Hassle-Free Cloud Migration | VTUG 2016 Winter WarmerJoe Conlin
 
Re-Platforming Applications for the Cloud
Re-Platforming Applications for the CloudRe-Platforming Applications for the Cloud
Re-Platforming Applications for the CloudCarter Wickstrom
 
7 Essential Steps to Cloud Adoption.pdf
7 Essential Steps to Cloud Adoption.pdf7 Essential Steps to Cloud Adoption.pdf
7 Essential Steps to Cloud Adoption.pdfAnil
 
Modernizing Mainframe Applications For The Cloud Environment.pdf
Modernizing Mainframe Applications For The Cloud Environment.pdfModernizing Mainframe Applications For The Cloud Environment.pdf
Modernizing Mainframe Applications For The Cloud Environment.pdfPetaBytz Technologies
 
IT 8003 Cloud ComputingFor this activi.docx
IT 8003 Cloud ComputingFor this activi.docxIT 8003 Cloud ComputingFor this activi.docx
IT 8003 Cloud ComputingFor this activi.docxvrickens
 
Building Cloud capability for startups
Building Cloud capability for startupsBuilding Cloud capability for startups
Building Cloud capability for startupsSekhar Mohanty
 
Best Practices for Data Center Migration Planning - August 2016 Monthly Webin...
Best Practices for Data Center Migration Planning - August 2016 Monthly Webin...Best Practices for Data Center Migration Planning - August 2016 Monthly Webin...
Best Practices for Data Center Migration Planning - August 2016 Monthly Webin...Amazon Web Services
 
Cloud-Migration-Methodology v1.0
Cloud-Migration-Methodology v1.0Cloud-Migration-Methodology v1.0
Cloud-Migration-Methodology v1.0b3535840
 
Best practices for application migration to public clouds interop presentation
Best practices for application migration to public clouds interop presentationBest practices for application migration to public clouds interop presentation
Best practices for application migration to public clouds interop presentationesebeus
 
DEVSECOPS ON CLOUD STORAGE SECURITY
DEVSECOPS ON CLOUD STORAGE SECURITYDEVSECOPS ON CLOUD STORAGE SECURITY
DEVSECOPS ON CLOUD STORAGE SECURITYIRJET Journal
 
Cloud native fundamentals
Cloud native fundamentalsCloud native fundamentals
Cloud native fundamentalsVictor Morales
 
Achieve New Heights with Modern Analytics
Achieve New Heights with Modern AnalyticsAchieve New Heights with Modern Analytics
Achieve New Heights with Modern AnalyticsSense Corp
 
Making the Journey_ 7 Essential Steps to Cloud Adoption.pdf
Making the Journey_ 7 Essential Steps to Cloud Adoption.pdfMaking the Journey_ 7 Essential Steps to Cloud Adoption.pdf
Making the Journey_ 7 Essential Steps to Cloud Adoption.pdfAnil
 
Cloud Computing Courses Online.pptx Join Now
Cloud Computing Courses Online.pptx Join NowCloud Computing Courses Online.pptx Join Now
Cloud Computing Courses Online.pptx Join Nowasmeerana605
 
Migrating to the cloud
Migrating to the cloudMigrating to the cloud
Migrating to the cloudIdeaca
 
Cloud migration
Cloud migrationCloud migration
Cloud migrationRaj Raj
 
A Practical Guide to Cloud Migration
A Practical Guide to Cloud MigrationA Practical Guide to Cloud Migration
A Practical Guide to Cloud MigrationMarianne Harness
 
Dynamics 365 saturday 2018 - data migration story
Dynamics 365 saturday   2018 - data migration storyDynamics 365 saturday   2018 - data migration story
Dynamics 365 saturday 2018 - data migration storyAndre Margono
 

Ähnlich wie Migrating to Cloud: Inhouse Hadoop to Databricks (3) (20)

Planning for a (Mostly) Hassle-Free Cloud Migration | VTUG 2016 Winter Warmer
Planning for a (Mostly) Hassle-Free Cloud Migration | VTUG 2016 Winter WarmerPlanning for a (Mostly) Hassle-Free Cloud Migration | VTUG 2016 Winter Warmer
Planning for a (Mostly) Hassle-Free Cloud Migration | VTUG 2016 Winter Warmer
 
Re-Platforming Applications for the Cloud
Re-Platforming Applications for the CloudRe-Platforming Applications for the Cloud
Re-Platforming Applications for the Cloud
 
7 Essential Steps to Cloud Adoption.pdf
7 Essential Steps to Cloud Adoption.pdf7 Essential Steps to Cloud Adoption.pdf
7 Essential Steps to Cloud Adoption.pdf
 
Modernizing Mainframe Applications For The Cloud Environment.pdf
Modernizing Mainframe Applications For The Cloud Environment.pdfModernizing Mainframe Applications For The Cloud Environment.pdf
Modernizing Mainframe Applications For The Cloud Environment.pdf
 
Cloud capability for startups
Cloud capability for startupsCloud capability for startups
Cloud capability for startups
 
IT 8003 Cloud ComputingFor this activi.docx
IT 8003 Cloud ComputingFor this activi.docxIT 8003 Cloud ComputingFor this activi.docx
IT 8003 Cloud ComputingFor this activi.docx
 
Building Cloud capability for startups
Building Cloud capability for startupsBuilding Cloud capability for startups
Building Cloud capability for startups
 
Best Practices for Data Center Migration Planning - August 2016 Monthly Webin...
Best Practices for Data Center Migration Planning - August 2016 Monthly Webin...Best Practices for Data Center Migration Planning - August 2016 Monthly Webin...
Best Practices for Data Center Migration Planning - August 2016 Monthly Webin...
 
Cloud-Migration-Methodology v1.0
Cloud-Migration-Methodology v1.0Cloud-Migration-Methodology v1.0
Cloud-Migration-Methodology v1.0
 
Best practices for application migration to public clouds interop presentation
Best practices for application migration to public clouds interop presentationBest practices for application migration to public clouds interop presentation
Best practices for application migration to public clouds interop presentation
 
DEVSECOPS ON CLOUD STORAGE SECURITY
DEVSECOPS ON CLOUD STORAGE SECURITYDEVSECOPS ON CLOUD STORAGE SECURITY
DEVSECOPS ON CLOUD STORAGE SECURITY
 
Cloud native fundamentals
Cloud native fundamentalsCloud native fundamentals
Cloud native fundamentals
 
Achieve New Heights with Modern Analytics
Achieve New Heights with Modern AnalyticsAchieve New Heights with Modern Analytics
Achieve New Heights with Modern Analytics
 
Making the Journey_ 7 Essential Steps to Cloud Adoption.pdf
Making the Journey_ 7 Essential Steps to Cloud Adoption.pdfMaking the Journey_ 7 Essential Steps to Cloud Adoption.pdf
Making the Journey_ 7 Essential Steps to Cloud Adoption.pdf
 
Cloud Computing Courses Online.pptx Join Now
Cloud Computing Courses Online.pptx Join NowCloud Computing Courses Online.pptx Join Now
Cloud Computing Courses Online.pptx Join Now
 
Migrating to the cloud
Migrating to the cloudMigrating to the cloud
Migrating to the cloud
 
Cloud migration
Cloud migrationCloud migration
Cloud migration
 
A Practical Guide to Cloud Migration
A Practical Guide to Cloud MigrationA Practical Guide to Cloud Migration
A Practical Guide to Cloud Migration
 
8.cloud migration
8.cloud migration8.cloud migration
8.cloud migration
 
Dynamics 365 saturday 2018 - data migration story
Dynamics 365 saturday   2018 - data migration storyDynamics 365 saturday   2018 - data migration story
Dynamics 365 saturday 2018 - data migration story
 

Mehr von Knoldus Inc.

Supply chain security with Kubeclarity.pptx
Supply chain security with Kubeclarity.pptxSupply chain security with Kubeclarity.pptx
Supply chain security with Kubeclarity.pptxKnoldus Inc.
 
Mastering Web Scraping with JSoup Unlocking the Secrets of HTML Parsing
Mastering Web Scraping with JSoup Unlocking the Secrets of HTML ParsingMastering Web Scraping with JSoup Unlocking the Secrets of HTML Parsing
Mastering Web Scraping with JSoup Unlocking the Secrets of HTML ParsingKnoldus Inc.
 
Akka gRPC Essentials A Hands-On Introduction
Akka gRPC Essentials A Hands-On IntroductionAkka gRPC Essentials A Hands-On Introduction
Akka gRPC Essentials A Hands-On IntroductionKnoldus Inc.
 
Entity Core with Core Microservices.pptx
Entity Core with Core Microservices.pptxEntity Core with Core Microservices.pptx
Entity Core with Core Microservices.pptxKnoldus Inc.
 
Introduction to Redis and its features.pptx
Introduction to Redis and its features.pptxIntroduction to Redis and its features.pptx
Introduction to Redis and its features.pptxKnoldus Inc.
 
GraphQL with .NET Core Microservices.pdf
GraphQL with .NET Core Microservices.pdfGraphQL with .NET Core Microservices.pdf
GraphQL with .NET Core Microservices.pdfKnoldus Inc.
 
NuGet Packages Presentation (DoT NeT).pptx
NuGet Packages Presentation (DoT NeT).pptxNuGet Packages Presentation (DoT NeT).pptx
NuGet Packages Presentation (DoT NeT).pptxKnoldus Inc.
 
Data Quality in Test Automation Navigating the Path to Reliable Testing
Data Quality in Test Automation Navigating the Path to Reliable TestingData Quality in Test Automation Navigating the Path to Reliable Testing
Data Quality in Test Automation Navigating the Path to Reliable TestingKnoldus Inc.
 
K8sGPTThe AI​ way to diagnose Kubernetes
K8sGPTThe AI​ way to diagnose KubernetesK8sGPTThe AI​ way to diagnose Kubernetes
K8sGPTThe AI​ way to diagnose KubernetesKnoldus Inc.
 
Introduction to Circle Ci Presentation.pptx
Introduction to Circle Ci Presentation.pptxIntroduction to Circle Ci Presentation.pptx
Introduction to Circle Ci Presentation.pptxKnoldus Inc.
 
Robusta -Tool Presentation (DevOps).pptx
Robusta -Tool Presentation (DevOps).pptxRobusta -Tool Presentation (DevOps).pptx
Robusta -Tool Presentation (DevOps).pptxKnoldus Inc.
 
Optimizing Kubernetes using GOLDILOCKS.pptx
Optimizing Kubernetes using GOLDILOCKS.pptxOptimizing Kubernetes using GOLDILOCKS.pptx
Optimizing Kubernetes using GOLDILOCKS.pptxKnoldus Inc.
 
Azure Function App Exception Handling.pptx
Azure Function App Exception Handling.pptxAzure Function App Exception Handling.pptx
Azure Function App Exception Handling.pptxKnoldus Inc.
 
CQRS Design Pattern Presentation (Java).pptx
CQRS Design Pattern Presentation (Java).pptxCQRS Design Pattern Presentation (Java).pptx
CQRS Design Pattern Presentation (Java).pptxKnoldus Inc.
 
ETL Observability: Azure to Snowflake Presentation
ETL Observability: Azure to Snowflake PresentationETL Observability: Azure to Snowflake Presentation
ETL Observability: Azure to Snowflake PresentationKnoldus Inc.
 
Scripting with K6 - Beyond the Basics Presentation
Scripting with K6 - Beyond the Basics PresentationScripting with K6 - Beyond the Basics Presentation
Scripting with K6 - Beyond the Basics PresentationKnoldus Inc.
 
Getting started with dotnet core Web APIs
Getting started with dotnet core Web APIsGetting started with dotnet core Web APIs
Getting started with dotnet core Web APIsKnoldus Inc.
 
Introduction To Rust part II Presentation
Introduction To Rust part II PresentationIntroduction To Rust part II Presentation
Introduction To Rust part II PresentationKnoldus Inc.
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
Configuring Workflows & Validators in JIRA
Configuring Workflows & Validators in JIRAConfiguring Workflows & Validators in JIRA
Configuring Workflows & Validators in JIRAKnoldus Inc.
 

Migrating to Cloud: Inhouse Hadoop to Databricks (3)

  • 1. Migrating to Cloud: Inhouse Hadoop to Databricks. Modernize your Enterprise Data Lake to a Serverless Data Lake, where data, workloads, and orchestrations can be automatically migrated to the cloud-native infrastructure.
  • 2. Migrating applications is a good thing. It forces the organization to clean up junk that is never used, and it brings innovation and new ideas to your engineering teams. It builds confidence in your teams that future migrations need not be stressful, and it pushes teams to design flexible systems. It also sends a message to vendors that you are not bluffing about pulling the plug if you don't see the results you expect. Some of the benefits our customers achieved by migrating from an on-premise solution to Databricks (tangible and intangible) include: commercial license and maintenance cost savings; reduced cluster costs, as you can leverage Databricks auto-scaling up/down and spot instance pricing; reduced labor cost of creating new infrastructure; access to cloud-based services (Azure Data Factory and Azure DevOps, for example) and all the cloud-native services, like Lambda, EKS, S3/AZFS, etc.; reduced maintenance costs; easier version upgrades; improved performance due to Databricks file system performance innovations.
  • 3. Easier development with notebooks. The list goes on. But it is also important that the migration delivers something tangible for the business. Keeping your business partners aware of the migration goals and expected results will enormously increase confidence in your capability and foster team spirit. The following is the Knoldus Migration Framework, which has been tried and tested and covers the most important points of a typical migration:
  • 4. Phase 1: Planning and Communication phase. In this phase you will achieve the following: Just like the White House Coronavirus Task Force, form a team of experienced project managers, architects, and business users. Ensure there is sufficient technical expertise (since this is primarily a technical project). Establish a communication plan with the impacted teams. More often than not, migrations impact multiple organizational teams, which could be a group of application-owning teams and/or internal teams (security, infrastructure, database, etc.). Collect an inventory of applications with thorough details, including application complexities, critical blackout periods that impact schedules, critical people needs, etc. Publish a roadmap with tentative dates that are subject to change based on the application complexities. Establish the KPIs: business KPIs (e.g., accuracy of predictions), performance KPIs (total run time), financial KPIs (total monthly cost reduction), and operational KPIs (number of people required for maintenance).
  • 5. Define the organization structure. Establishing a team involves several different factors. For a large organization, we established the following structure; however, you should consider your own organizational factors before designing the migration team. (Organization chart: Central Migration Team.)
  • 6. Sample questions to ask for a Cloudera-to-Databricks migration. Ques 1. What is the key goal of this migration? Sunsetting Cloudera to save license cost? Improving pipeline performance (total end-to-end elapsed time)? Does the Cloudera cluster need more capacity, hence the wish for a flexible resource model? Do you intend to leverage other cloud services (for example, Azure Data Factory)? Better automation? Ease of use for data scientists (i.e., new features using notebooks)? Reducing infrastructure maintenance costs? Ques 2. What is the size and nature of the data that needs to be migrated? Ques 3. What are the high-level data ingress and egress needs? Ques 4. Are the locations of the GitHub, Jenkins, Jira, and Confluence setups identified? Ques 5. Who has to approve the merge requests?
  • 7. Phase 2: Architecture Detailing phase. This is by far the most critical phase, and the success of the migration depends heavily on what happens during it. Engage an experienced 'Target System Specialist' to review the current applications from an architecture standpoint. Identify mismatches in architecture. Prescribe the target architecture by collaborating with the target system vendor. Define projects to re-engineer the current system, if that is required prior to migration. Adjust and publish schedules back to the teams based on this detailed assessment; at this point schedules tend to be much clearer and more detailed. One of the most important decisions in the migration of any application is whether to make it 'Cloud Native', 'Lift and Shift', or something in between. This decision should be taken after understanding the current application in detail.
  • 8. Example: One of our customers recently migrated from Cloudera to Databricks. The customer is a large, successful American grocer who needed to predict future sales based on historic sales data and promotions. These predictions happened at an item-category level. The existing pipeline accomplished this by running all the data for one category through a large R application, which is single-threaded and makes extensive use of memory. There were two architectural choices. The first was to rewrite the code to use Spark's parallelized algorithms, which would mean rearchitecting the entire pipeline from the ground up. The second was to use lapply, a pseudo-parallelization construct in Spark that lets us run the code in its entirety in the native R runtime without a rewrite. After internal discussion, and due to time constraints, we decided to migrate without rewriting the code, though a rewrite would be the better choice in the long run. The bottom line is that such decisions should be made well in advance, while you still have the luxury of expertise and time. Failing that, you put the team under extreme pressure, which may result in production failures and failed projects.
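The second choice in the example above (run the unchanged per-category code once per category, side by side) can be illustrated with a small local analogy. This is only a sketch in plain Python using the standard library's `concurrent.futures`; it is not Spark or R code and not the customer's pipeline. The names `forecast_category` and `forecast_all`, the sales figures, and the naive averaging "model" are all hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

# Hypothetical historic weekly sales per item category (made-up numbers).
SALES_BY_CATEGORY = {
    "dairy": [120.0, 130.0, 125.0, 140.0],
    "bakery": [80.0, 82.0, 79.0, 85.0],
    "produce": [200.0, 210.0, 190.0, 220.0],
}


def forecast_category(category):
    """Stand-in for the unchanged single-threaded model: one category in, one forecast out."""
    history = SALES_BY_CATEGORY[category]
    # Placeholder logic only: average of the last three observations.
    return category, mean(history[-3:])


def forecast_all(categories, workers=3):
    """Apply the per-category function to every category independently.

    The function body is not rewritten; only the outer loop is distributed.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(forecast_category, categories))


forecasts = forecast_all(sorted(SALES_BY_CATEGORY))
for name, value in forecasts.items():
    print(f"{name}: {value:.2f}")
```

The trade-off mirrors the one described in the example: each category still runs single-threaded and still needs enough memory on whichever worker picks it up, so this approach buys time rather than a cloud-native design.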
  • 9. Lift and Shift Migration. Far too often, companies under the stress of migration resort to a lift-and-shift approach. Knoldus highly recommends a cloud-native approach, wherein the application leverages the full potential of cloud-based architectures to gain long-term customer delight and a reduction in support costs.
  • 10. However, should you decide to go with lift and shift, consider the following sample questions (Ques 1). Is the application intensive in incoming data or in outgoing data? This has implications for data transfer costs. What are the key components used? Do you intend to plug in local or cloud-based monitoring systems? How much intermediary storage is required? How do you manage the application's configurations to tune its behavior? What kind of integrations are necessary? Components to review: ML; external libraries and enrichment of data; ETL; security / data redaction; programming languages used.
  • 11. Ques 2. Are there any non-standard architectures or procedures used? Observe current Spark job output for high shuffle memory usage and task failures. Are applications enabled with CI/CD? Do applications use logging extensively? What parts of the code will be in notebooks and what parts in JARs? Are there any monitoring or logging tools currently in use that also need migration? Also review: job dependencies; criticality of output; common errors; high RAM requirements; joins that are too large; broadcasts that are too large; single-threaded apps.
  • 12. Phase 3: Pre-Execution (build the Jira board). Architecture detailing will give sufficient details to build the Jira board. At Knoldus, we use the SAFe Agile process for managing multiple projects at the same time. Conduct program increment planning that identifies relationships and dependencies between multiple teams. Break down overall goals into sprint goals. Identify epics, features, stories, and spikes. Create your Jira board. Provide sufficient time for teams to understand their next 3-week sprint goals and discuss issues raised. Use the inputs to adjust the stories. Some level of estimation is important to recognize large tasks; tasks that are too large need to be split so that they are manageable within the sprint. Document key architectures and pipelines on Confluence. Do an architecture review with key stakeholders. Document the environment strategy: are clusters dedicated to testing, staging, and production? Document spikes and their potential scenarios. For example, if we want to convert a critical piece of logic from R to Scala, what will be the plan if it succeeds or fails?
  • 13. Sample questions to ask: Ques 1. What is the current collaboration design? For example, can multiple users execute the same job? Ques 2. Is this collaboration transferable to Databricks notebooks? Ques 3. What is the definition of done? Are CI/CD pipelines included? Ques 4. How do we test the output accuracy? Do we need to write code to automatically test results on the new platform? Ques 5. What is the testing process? Are test scripts prepared and ready?
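Ques 4 above asks whether code is needed to test results automatically on the new platform. One minimal way to do that, assuming both platforms can export their results as a key-to-number mapping, is sketched below. The function name `compare_outputs`, the tolerance, and the sample figures are hypothetical and chosen only for illustration.

```python
import math


def compare_outputs(legacy, migrated, rel_tol=1e-6):
    """Return human-readable differences between two result sets.

    An empty list means the migrated output matches the legacy output
    within the given relative tolerance.
    """
    problems = []
    for key in sorted(legacy.keys() | migrated.keys()):
        if key not in migrated:
            problems.append(f"{key}: missing on new platform")
        elif key not in legacy:
            problems.append(f"{key}: only on new platform")
        elif not math.isclose(legacy[key], migrated[key], rel_tol=rel_tol):
            problems.append(f"{key}: legacy={legacy[key]} migrated={migrated[key]}")
    return problems


# Hypothetical per-category forecasts from the old and the new pipeline.
legacy_run = {"dairy": 131.67, "bakery": 82.0, "produce": 206.67}
migrated_run = {"dairy": 131.67, "bakery": 82.5, "frozen": 10.0}

report = compare_outputs(legacy_run, migrated_run)
for line in report:
    print(line)
```

A check like this could run as part of the definition of done for each migrated job; the tolerance would need to be agreed with the business users, since some numeric drift between platforms may be acceptable.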
  • 14. Phase 4: Execution. This is the easy part. It's time to just execute based on the Jira board. Is the foundation laid well? Are clusters deployed? Is the security setup in place? Which notebook folders are open for which users? How do users share code and data? Are users trained on the new technology? Ensure Jira board updates are reflected on each team's Jira board. The scrum master should check with other scrum teams whether the dependencies expected to be complete are on track or whether they will impact the sprint deliverables. Is unit testing being rigorously followed? Are we following true agile, wherein some functionality is demonstrated in demos? Are there any overlap issues in using the infrastructure? For example, if a job is run by two different users, what is the damage? Are we using Slack to effectively notify all teams of a potential shutdown?
  • 15. Phase 5: Closure. Measure and understand whether the KPIs are met. If they are not met, introspect and identify what needs to be done. Are the basic essential KPIs met, so that we can go live and address the technical debt? Identify and document all technical debt. Define a plan to address technical debt. Has the new system been up and running for sufficient time to hand over to production support? Celebrate. Cloud managed services: once you are in the cloud, you will have several tools, frameworks, and new architecture patterns at your disposal, which immensely increases your ability to respond to business needs.
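The closure check above (are the KPIs established in Phase 1 met?) can be made mechanical. The following is a minimal sketch, assuming each KPI is recorded with a target and a measured value; the `Kpi` class, the `unmet_kpis` helper, and all numbers are hypothetical and not taken from any customer engagement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kpi:
    name: str
    target: float
    actual: float
    higher_is_better: bool = False  # run time, cost, and headcount are "lower is better"

    def met(self):
        if self.higher_is_better:
            return self.actual >= self.target
        return self.actual <= self.target


def unmet_kpis(kpis):
    """Names of the KPIs that still need work before or after go-live."""
    return [kpi.name for kpi in kpis if not kpi.met()]


# Hypothetical measurements, one per KPI family named in Phase 1.
measured = [
    Kpi("prediction accuracy", target=0.90, actual=0.92, higher_is_better=True),
    Kpi("total run time (hours)", target=6.0, actual=4.5),
    Kpi("monthly cost (USD)", target=50000.0, actual=58000.0),
    Kpi("maintenance staff", target=3, actual=3),
]

print("Unmet KPIs:", unmet_kpis(measured))
```

Any KPI that appears in the unmet list is a candidate for the technical-debt plan described in this phase.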
  • 16. Let's Talk. We encourage you to work with experienced application architects and teams who have exposure to cloud-native and reactive architectures to continue the journey of digital transformation. We hope Knoldus can be a partner in your journey. Get in touch with us to schedule a call with our expert (https://www.knoldus.com/connect/contact-us) or drop us a line at hello@knoldus.com. For more such insights, follow us here: https://www.linkedin.com/company/knoldus/about/ https://twitter.com/Knolspeak https://www.youtube.com/channel/UCP4g5qGeUSY7OokXfim1QCQ