1. Migrating to Cloud: Inhouse Hadoop to Databricks
Modernize your Enterprise Data Lake to Serverless Data Lake, where data, workloads, and orchestrations can be automatically migrated to the cloud-native infrastructure.
2. Migration of applications is a good thing. It forces the organization to clean up junk that is never used. It brings a lot of innovation and new ideas to your engineering teams. It builds confidence in your teams that future migrations need not be stressful, and it pushes teams to design flexible systems. It also sends a message to vendors that you are not bluffing about pulling the plug if you don’t see the results you expect.
Some of the benefits our customers achieved by migrating from an on-premise solution to Databricks include:
Commercial License and Maintenance cost
Tangible Benefits
Intangible Benefits
Reduced cluster costs, as you can leverage Databricks auto-scaling (up and down) and spot instance pricing
Reduced labor cost of creating new infrastructure
Access to cloud-based services (for example, Azure Data Factory and Azure DevOps) and cloud-native services such as Lambda, EKS, S3/AZFS, etc.
Reduced maintenance costs
Easier version upgrades
Improved performance due to Databricks file system performance innovations
www.knoldus.com
3. Easier development with notebooks
The list goes on
But it is also important that the migration delivers something tangible for the business. Keeping your business partners aware of the migration goals and expected results will enormously increase confidence in your capability and foster team spirit.
The following is the Knoldus Migration Framework, which has been tried and tested and covers the most important points of a typical migration:
4. Planning and Communication phase
Phase 1
In this phase you will achieve the following:
Just like the White House Coronavirus Task Force, form a team of experienced project managers, architects, and business users. Ensure there is sufficient technical expertise, since this is primarily a technical project.
Establish a communication plan with the impacted teams. More often than not, migrations impact multiple organizational teams, which could be a group of application-owning teams and/or internal teams (security, infrastructure, database, etc.).
Collect an inventory of applications with thorough details, including application complexities, critical blackout periods that impact schedules, critical people needs, etc.
Publish a roadmap, with tentative dates that are subject to change based on the application complexities.
Establish the KPIs.
Business KPIs (e.g., accuracy of predictions)
Performance KPIs (Total run time)
Financial KPIs (Total monthly cost reduction)
Operational KPIs (Number of people required for maintenance)
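The four KPI categories above can be recorded in a small structure so that targets are agreed up front and checked at closure. The sketch below is only an illustration: the KPI names, baselines, and targets are hypothetical placeholders, not customer figures.

```python
from dataclasses import dataclass


@dataclass
class Kpi:
    """One migration KPI with its pre-migration baseline and agreed target."""
    category: str           # Business, Performance, Financial, or Operational
    name: str
    baseline: float         # value measured on the current platform
    target: float           # value agreed with stakeholders for the new platform
    higher_is_better: bool

    def is_met(self, measured: float) -> bool:
        """Return True when the measured value reaches the target."""
        if self.higher_is_better:
            return measured >= self.target
        return measured <= self.target


# All numbers below are hypothetical placeholders, not customer results.
KPIS = [
    Kpi("Business", "prediction_accuracy_pct", 90.0, 90.0, True),
    Kpi("Performance", "total_run_time_hours", 10.0, 6.0, False),
    Kpi("Financial", "monthly_cost_usd", 50000.0, 35000.0, False),
    Kpi("Operational", "maintenance_headcount", 5.0, 3.0, False),
]


def unmet_kpis(measured: dict[str, float]) -> list[str]:
    """List the names of KPIs whose measured value misses the target."""
    return [k.name for k in KPIS if not k.is_met(measured[k.name])]
```

Calling `unmet_kpis` with the values measured on the new platform returns the KPIs that still need attention before go-live.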
5. Define the organization structure
Establishing a team involves several different factors. For a large organization, we established the following structure; however, you should consider your own organizational factors before designing the migration team.
Central Migration Team
6. Sample Questions to Ask for Cloudera-Databricks
Ques 1. What is the key goal of this migration?
Sunsetting Cloudera to save license cost?
Improve pipeline performance (total end-to-end elapsed time)?
The Cloudera cluster needs more capacity, hence you want a flexible resource model?
Intend to leverage other cloud services (for example, Azure Data Factory)?
Better automation?
Ease of use for data scientists (i.e., new features using notebooks)?
Reduce infrastructure maintenance costs?
Ques 2. What is the size and nature of the data that needs to be migrated?
Ques 3. What are the high-level data ingress and egress needs?
Ques 4. Are the locations of the GitHub, Jenkins, Jira, and Confluence setups identified?
Ques 5. Who has to approve the merge requests?
7. Architecture Detailing Phase
Phase 2
This is by far the most critical phase, and the success of the migration depends heavily on what happens during it.
Engage an experienced ‘Target System Specialist’ to review the current applications from an architecture standpoint.
Identify mismatches in architecture.
Prescribe the target architecture by collaborating with the target system vendor.
Define projects to re-engineer the current system, if that is required prior to migration.
Adjust and publish schedules back to the teams based on this detailed assessment. At this point, schedules tend to be much clearer and more detailed.
One of the most important decisions in the migration of any application is whether to make it ‘Cloud Native’, ‘Lift and Shift’, or something in between. This decision should be taken after understanding the current application in detail.
8. Example:
One of our customers recently migrated from Cloudera to Databricks. The customer is a large, successful American grocer that needed to predict future sales based on historic sales data and promotions. These predictions were made at the item category level. The existing pipeline accomplished this by running all the data for one category through a large R application, which is single-threaded and uses memory extensively.
There were two architectural choices. The first was to rewrite the code to use Spark's parallelized algorithms, which would mean rearchitecting the entire pipeline from the ground up. The second was to use lapply, a pseudo-parallelization construct in Spark (spark.lapply in SparkR), which runs the code in its entirety in the native R runtime without a rewrite. After internal discussion, and because of time constraints, we decided to migrate without rewriting the code, although a rewrite would be the better choice in the long run.
The bottom line is that such decisions should be made well in advance, while you have the luxury of expertise and time. Otherwise, you put the team under extreme pressure, which may result in production failures and failed projects.
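The "run the unchanged per-category code in parallel" option can be pictured with a small local sketch. This is not the customer's pipeline: `forecast_category` is a hypothetical stand-in for the existing single-threaded model code, and Python threads stand in for the Spark workers that lapply would use on the cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def forecast_category(category: str, history: list[float]) -> tuple[str, float]:
    """Hypothetical stand-in for the unchanged single-threaded model code.

    Here it simply averages the last three periods of sales.
    """
    recent = history[-3:]
    return category, sum(recent) / len(recent)


def forecast_all(sales_by_category: dict[str, list[float]],
                 workers: int = 4) -> dict[str, float]:
    """Run the per-category function once per category, in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each category is handled independently, so the model code itself
        # does not need to be rewritten to be parallel.
        results = pool.map(lambda kv: forecast_category(*kv),
                           sales_by_category.items())
        return dict(results)
```

The point of the design is that parallelism comes from fanning out over categories, while the code inside each call stays single-threaded and memory-hungry, which is why a full rewrite remains the better long-term option.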
9. Lift and Shift
Far too often, companies under the stress of migration resort to a lift and shift approach. Knoldus highly recommends a cloud-native approach, wherein the application leverages the full potential of cloud-based architectures to gain long-term customer delight and a reduction in support costs.
Lift and Shift Migration
10. However, should you decide to go with lift and shift, consider the following.
Sample questions to ask
Ques 1.
Is the application intensive in incoming data or outgoing data? This has implications for data transfer costs.
What are the key components used?
Do you intend to plug in local or cloud-based monitoring systems?
How much intermediary storage is required?
How do you manage the configurations that tune the behavior of the application?
What kind of integrations are necessary?
ML
External libraries and Enrichment of data
ETL
Security / Data Redaction
Programming languages used
11. Observe the current Spark job output for high shuffle memory usage and task failures.
Are applications enabled with CI/CD?
Do applications use logging extensively?
What parts of the code will be in notebooks and what parts in JARs?
Are there any monitoring or logging tools currently in use that also need migration?
Job Dependencies
Criticality of output
Common Errors
High RAM requirements
Joins that are too large
Broadcasts that are too large
Ques 2. Are there any non-standard architectures or procedures used?
Single-threaded apps
12. Pre Execution (Build Jira board)
Phase 3
Architecture detailing will give sufficient detail to build the Jira board.
At Knoldus, we use the SAFe Agile process for managing multiple projects at the same time.
Conduct program increment planning that identifies relationships and dependencies between multiple teams.
Break down overall goals into sprint goals.
Identify epics, features, stories, and spikes.
Create your Jira board.
Provide sufficient time for teams to understand their next 3-week sprint goals and discuss issues raised. Use the inputs to adjust the stories.
Some level of estimation is important to recognize large tasks. Tasks that are too large need to be split so that they are manageable within the sprint.
Document key architectures and pipelines on Confluence. Do an architecture review with key stakeholders.
Document the environment strategy. Are clusters dedicated to testing, stage, and production?
Document spikes and their potential scenarios. For example, if we want to convert a critical piece of logic from R to Scala, what will be the plan if it succeeds or fails?
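Dependencies between teams identified during planning can be turned into a workable sequence with a topological sort. The following is a minimal sketch using Python's standard `graphlib` module; the work item names are hypothetical and only illustrate the idea.

```python
from graphlib import TopologicalSorter

# Hypothetical work items: each maps to the set of items it depends on.
DEPENDENCIES = {
    "migrate_sales_pipeline": {"provision_clusters", "copy_sales_data"},
    "copy_sales_data": {"provision_clusters"},
    "setup_ci_cd": {"provision_clusters"},
    "provision_clusters": set(),
}


def plan_order(deps: dict[str, set[str]]) -> list[str]:
    """Return an order in which every item comes after its dependencies.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(deps).static_order())
```

A circular dependency between teams surfaces as a `CycleError`, which is a useful signal that the plan needs to be renegotiated before the sprint starts.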
13. Sample questions to ask:
Ques 1. What is the current collaboration design? For example, can multiple users execute the same job?
Ques 2. Is this collaboration transferable to Databricks notebooks?
Ques 3. What is the definition of done? Are CI/CD pipelines included?
Ques 4. How do we test the output accuracy? Do we need to write code to automatically test results on the new platform?
Ques 5. What is the testing process? Are test scripts prepared and ready?
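For the output accuracy question, one option is a small script that compares results from the old and new platforms key by key within a tolerance. The sketch below assumes both outputs have already been loaded into dictionaries keyed by something like item category; the function name and the tolerance are illustrative assumptions.

```python
import math


def compare_outputs(legacy: dict[str, float], migrated: dict[str, float],
                    rel_tol: float = 1e-6) -> list[str]:
    """Return human-readable mismatches between two keyed result sets."""
    problems = []
    for key in sorted(legacy.keys() | migrated.keys()):
        if key not in migrated:
            problems.append(f"{key}: missing on new platform")
        elif key not in legacy:
            problems.append(f"{key}: unexpected on new platform")
        elif not math.isclose(legacy[key], migrated[key], rel_tol=rel_tol):
            problems.append(f"{key}: {legacy[key]} != {migrated[key]}")
    return problems
```

An empty list means the two runs agree within the tolerance; anything else can be attached to the story as evidence that the definition of done is not yet met.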
14. Execution
Phase 4
This is the easy part. It’s time to just execute based on the Jira board.
Is the foundation laid well?
Are clusters deployed?
Is the security setup in place? Which notebook folders are open to which users? How do users share code and data?
Are users trained on the new technology?
Ensure Jira board updates are reflected on each team’s Jira board.
The scrum master should check with other scrum teams whether the dependencies expected to be complete are on track or whether they will impact the sprint deliverables.
Is unit testing being rigorously followed?
Are we following true agile, wherein some functionality is demonstrated in demos?
Are there any overlap issues in using the infrastructure? For example, if a job is run by two different users, what is the damage?
Are we using Slack to effectively notify all teams of a potential shutdown?
15. Closure
Phase 5
Measure and understand whether the KPIs are met.
If they are not met, introspect and identify what needs to be done.
Are the basic, essential KPIs met, so that we can go live and address the technical debt?
Identify and document all technical debt.
Define a plan to address technical debt.
Has the new system been up and running long enough to hand over to production support?
Celebrate.
Once you are in the cloud, you will have several tools, frameworks, and new architecture patterns at your disposal, which immensely increases your ability to respond to business needs.
Cloud managed services
16. Let’s Talk
We encourage you to work with experienced application architects and teams who have exposure to cloud-native and reactive architectures to continue the journey of digital transformation.
We hope Knoldus can be a partner in your journey. Get in touch with us to schedule a call with our expert or drop us a line at hello@knoldus.com.
https://www.knoldus.com/connect/contact-us
For more such insights, follow us here:
https://www.linkedin.com/company/knoldus/about/ https://twitter.com/Knolspeak https://www.youtube.com/channel/UCP4g5qGeUSY7OokXfim1QCQ