Kubeflow: portable and scalable machine learning using Jupyterhub and Kubernetes [PyData Delhi 2018]

ML solutions in production start from data ingestion and extend up to the actual deployment step. We want this workflow to be scalable, portable and simple. Containers and Kubernetes are great at the first two but not the third if you aren't a DevOps practitioner. We'll explore how you can leverage the Kubeflow project to deploy best-of-breed open-source systems for ML to diverse infrastructures.

  1. Kubeflow: scalable and portable ML
     Akash Tandon, Data Engineering @ SocialCops
     GitHub: @analyticalmonk
     Twitter: @AkashTandon
  2. Agenda
     - Need for DevOps in ML and Data Science (DataOps)
     - Containers and Kubernetes for ML
     - Opportunities and challenges
     - Kubeflow: composable, portable and scalable ML
     - Components
     - Low bar, high ceiling
     - Issues and roadmap
     - Summary and demo
  3. Current ML workflow: what you think
  4. Current ML workflow: the reality
     Source: https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
  5. DataOps - DevOps in Data Science and ML
     DataOps is an automated, process-oriented methodology, used by analytic and data teams to improve the quality and reduce the cycle time of data analytics.
     DataOps manifesto: http://dataopsmanifesto.org
  6. DataOps - DevOps in Data Science and ML
  7. We need tools that are great at DevOps
  8. Enter containers and Kubernetes
  9. Containers
     - Containers allow you to easily package an application's code, configurations, and dependencies into easy-to-use building blocks.
     - These building blocks deliver environmental consistency, operational efficiency, developer productivity, and version control.
     - To put it simply, your code runs in any environment!
  10. But managing multiple containers can be a pain. That’s where K8s steps in.
  11. Kubernetes
      - Kubernetes is an orchestration manager for containers.
      - It orchestrates computing, network and storage.
      - Simply put, it makes your life easier when working with containers.
  12. Sample K8s manifest
  13. But there’s a catch.
  14. Steep DevOps learning curve
      - Containers
      - Kubernetes primitives
      - Persistent storage
      - APIs
      - Cloud platforms
      - and it goes on...
  15. DevOps practitioners don’t know enough Data Science. Data Scientists don’t know enough DevOps. And we don’t want them to!
  16. How do we get DevOps goodness without driving data teams crazy?!
  17. Enter Kubeflow
  18. Kubeflow
      - ML toolkit for Kubernetes
      - Open-source and community-driven
      - Support for multiple ML frameworks
      - End-to-end workflows which can be shared, scaled and deployed
      Source: https://github.com/kubeflow/kubeflow/issues/187
  19. Low bar, high ceiling
      - Low bar: allow data science practitioners to get up and running on a Kubernetes cluster even without DevOps know-how.
      - High ceiling: allow sysadmins and DevOps practitioners to modify defaults and extend the framework as needed.
  20. Components
      - Jupyterhub (collaboration and interactivity)
      - K8s-native TensorFlow controller (model building)
      - K8s-native TensorFlow Serving deployment (model deployment)
      - Ambassador (reverse proxy)
      - Current and upcoming components for model tuning, model building and much more...
      - Out-of-the-box setup for putting all of this together!
  21. Jupyterhub
  22. TensorFlow
      - Open source numerical computing and ML
      - Developed by Google, open-sourced in 2015
      - Huge community and ecosystem
      - Support for multiple ML models
      - TF Serving (model deployment), TensorBoard (training visualization), etc.
      - Supports distributed training and deployment of models
  23. Why Kubeflow?
      Based on current functionality you should consider using Kubeflow if:
      - You want to train/serve TensorFlow models in different environments (e.g. local, on prem, and cloud)
      - You want to use Jupyter notebooks to manage TensorFlow training jobs
      - You want to launch training jobs that use resources, such as additional CPUs or GPUs, that aren’t available on your personal computer
      - You want to combine TensorFlow with other processes
        - For example, you may want to use tensorflow/agents to run simulations to generate data for training reinforcement learning models.
      Refer to https://www.kubeflow.org/docs/started/getting-started/ for more info.
  24. Demo
      - Kubeflow tutorial using a sequence-to-sequence model
      - Based on Hamel Husain’s wonderful post: How to create data products that are magical using sequence-to-sequence models
      - GitHub repo: https://github.com/kubeflow/examples/tree/master/github_issue_summarization
      - Let’s get started!
  25. Demo
  26. Road ahead
      - Lower the entry (bar)rier
      - Multi-tenancy on Kubernetes
      - Support for different ML libraries/packages
        - PyTorch
        - Caffe2
        - MXNet
      - v1.0 to be launched by December 2018
  27. Find out more
      - Official website: https://www.kubeflow.org/
      - GitHub: https://github.com/kubeflow/kubeflow
      - Katacoda tutorials: https://www.katacoda.com/kubeflow/
  28. Reach out at
      Email: akashtndn.acm@gmail.com, akash@socialcops.com
      Twitter: @AkashTandon
      GitHub: @analyticalmonk
  29. SocialCops is hiring! https://socialcops.com/careers/
  30. That’s all, folks! Questions?!
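
The content of the "Sample K8s manifest" slide did not survive in this transcript. As a rough illustration of what such a manifest contains, here is a minimal sketch that builds a standard Kubernetes Deployment as a Python dictionary and serializes it to JSON. The helper names `build_deployment` and `render`, the application name `model-server`, the image `example.com/model-server:0.1`, and the port are all hypothetical placeholders chosen for this example; they are not taken from the talk.

```python
import json


def build_deployment(name: str, image: str, replicas: int = 2, port: int = 8080) -> dict:
    """Return a Kubernetes Deployment manifest as a plain dict.

    All argument values are supplied by the caller; nothing here is
    specific to Kubeflow or to the manifest shown in the original slide.
    """
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            # Number of identical pods Kubernetes should keep running.
            "replicas": replicas,
            # The selector must match the pod template labels below.
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


def render(manifest: dict) -> str:
    """Serialize the manifest to JSON, which kubectl accepts alongside YAML."""
    return json.dumps(manifest, indent=2)


if __name__ == "__main__":
    # Hypothetical name and image, used only for illustration.
    print(render(build_deployment("model-server", "example.com/model-server:0.1")))
```

If the output were saved to a file such as `deployment.json`, it could in principle be submitted to a cluster with `kubectl apply -f deployment.json`. Even this small example involves several Kubernetes primitives (labels, selectors, pod templates), which hints at the learning curve the talk describes.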
