Lindsey Tulloch, Software Engineer Intern at Red Hat, presented 'The Salmon Algorithm Spawning with Kubernetes' at Eastern Canada's Kubernetes and Cloud Native Meetups in 2019.
2. Lindsey Tulloch
Also likes: books, cats, coffee, bicycles, adventures
https://github.com/kubernetes-sigs/federation-v2
- Software Engineering Intern @ Red Hat
  - Multicluster, May 2017 - Aug 2018
  - Tekton Extension, May 2019 - Thursday
- Kubernetes Release Team
  - 1.12, 1.14, 1.15, 1.16 Release Notes
- Masters Student @ University of Waterloo (as of next week!)
3. Spawning with Kubernetes
- Introduction and rationale
- Salmon
- Practical Considerations
- Containerization
- Compute Canada
- Implementation
- Experimental results
- Conclusion and Future Work
6. The Salmon Algorithm
- John Orth, 2012
- Metaheuristic algorithm for combinatorial optimization problems inspired by:
  - Genetic Algorithms (population, generations, parent and child memories)
  - Ant Colony optimization (flow of water attracts salmon)
  - Salmon spawning behaviour*
Each salmon contains two lists:
- tabu list: vertices in the current path under construction
- memory list: copy of the parent's completed tabu list
*The behaviour of the software salmon is idealized. We make no claim that real salmon behave in exactly this fashion.
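The two lists described above can be sketched as a small data structure. This is only an illustrative sketch, not Orth's implementation: the class name `Salmon`, the methods `visit` and `spawn`, and the rule that a vertex may appear only once in the tabu list (as in a TSP tour) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Salmon:
    """Hypothetical container for the two lists each salmon carries."""

    # Vertices in the path currently under construction.
    tabu: List[int] = field(default_factory=list)
    # Copy of the parent's completed tabu list.
    memory: List[int] = field(default_factory=list)

    def visit(self, vertex: int) -> None:
        """Append a vertex to the path; revisiting is disallowed (assumption)."""
        if vertex in self.tabu:
            raise ValueError(f"vertex {vertex} is already in the tabu list")
        self.tabu.append(vertex)

    def spawn(self) -> "Salmon":
        """Create a child whose memory is a copy of this salmon's tabu list."""
        return Salmon(tabu=[], memory=list(self.tabu))


if __name__ == "__main__":
    parent = Salmon()
    for v in (0, 2, 1):
        parent.visit(v)
    child = parent.spawn()
    print("child memory:", child.memory)
    print("child tabu:", child.tabu)
```

The child starts with an empty tabu list and builds its own path, while its memory list keeps an independent copy of the path its parent completed.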
7. The Salmon Algorithm
Previous work:
- Orth (2012): Error Correcting Codes, Code Clique Equivalence, TSP
- Orth, Houghten, Tulloch (2017) and Tulloch, Houghten (2017): Param-ILS to tune parameters, TSP, DNA Fragment Assembly
- Shows promising results for TSP
- Might be useful on other combinatorial optimization problems
8. DNA Fragment Assembly Problem to TSP
The layout phase of the FAP can be translated into a TSP [19] (Mallen-Fullerton, Fernandez-Anaya), with the cities representing fragments and the distances representing overlap scores.
Two minor changes:
- Negate the total: minimization becomes maximization
- Remove the distance to the last city, since FAP is not circular (inserting a dummy city is one solution)
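The two changes above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method from [19] or from the talk's code: the function names, the toy overlap matrix, and the choice of a zero-distance dummy city appended as the last index are all hypothetical.

```python
from typing import List

Matrix = List[List[int]]


def fap_to_tsp(overlap: Matrix) -> Matrix:
    """Build a TSP distance matrix from pairwise fragment overlap scores.

    Distances are negated overlaps, so minimizing tour length maximizes
    total overlap. One extra dummy city (index n) sits at distance 0 from
    every fragment, so closing the tour through it costs nothing.
    """
    n = len(overlap)
    dist = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(n):
            if i != j:
                dist[i][j] = -overlap[i][j]
    return dist


def tour_cost(dist: Matrix, tour: List[int]) -> int:
    """Length of the closed tour, including the edge back to the start."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def layout_from_tour(tour: List[int], dummy: int) -> List[int]:
    """Cut the circular tour at the dummy city to get a linear layout."""
    k = tour.index(dummy)
    return tour[k + 1:] + tour[:k]


if __name__ == "__main__":
    # Toy overlap scores: overlap[i][j] is the score of fragment i followed by j.
    overlap = [[0, 5, 1], [2, 0, 4], [3, 1, 0]]
    dist = fap_to_tsp(overlap)
    tour = [1, 2, 3, 0]  # 3 is the dummy city
    print("layout:", layout_from_tour(tour, dummy=3))
    print("total overlap:", -tour_cost(dist, tour))
```

Because the dummy city contributes zero to the tour length, negating the tour cost gives the total overlap of the linear fragment ordering.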
9. Why Kube?
(if it ain't broke...)
- Increased use of containers (including Singularity, Charliecloud, Docker).
- Research moving away from strict "job" style workflows.
- Adoption of data-streaming and in-flight processing.
- Greater use of interactive Science Gateways.
- Dependence on other, more persistent services that are not aligned with HPC systems.
- Increasing demand for reproducibility.
(if it ain't optimal, fix it!)
10. Does Kubernetes make sense as a research platform?
Advantages:
- Same API across ALL cloud resources with predictable results
- Promise of GitOps: abstract away the running and processing of research jobs
- Very popular, lots of talented engineers working on it, thriving community
- Long-term outlook looks good
14. Compute Canada
- Not-for-profit corporation
- Membership includes most of Canada's major research universities
- All Canadian faculty members have access to Compute Canada systems and can sponsor others:
  - students
  - postdocs
  - external collaborators
- No fee for Canadian university faculty
- Reduced fee for federal laboratories and not-for-profit orgs
15. Compute Canada
- Compute and storage resources, data centres
- Team of ~200 experts in utilization of advanced research computing
- 100s of research software packages
- Cloud compute and storage (OpenStack, ownCloud)
- 5-10 data centres
- 300,000 cores
- 12 Pflops, 50+ PB
16. Compute Canada
Researchers drive innovation
- The CC user base is broadening, bringing a broader set of needs.
- Tremendous interest in services enabling Research Data Management (RDM)
20. Does Kubernetes make sense as a research platform?
Advantages:
- Same API across ALL cloud resources with predictable results
- GitOps
- Very popular
Disadvantages:
- Complete overhaul of existing systems (Kubernetes and HPC are not compatible at present)
- Huge learning curve, plus YAML, Docker, GitHub, Argo, Kubeflow
- Desert of expertise in academia
- Concern that corporate open source may not be as friendly for academia as we'd like to think
31. Ease of Deployment
For a 12-month project...
August
- Started preliminary brainstorming ahead of time
September
- Official project start
October
- Meetings with Compute Canada staff interested in Kubernetes
November
- Request RAS
- End of November: actual RAS allocation
32. Ease of Deployment
For a 12-month project...
December
- KubeCon Seattle, meet Compute Canada staff working on Kubernetes face to face
- Ryan Taylor (@ Compute Canada) and Bob Killen (@ U of Michigan) offer support
January
- Figuring out how to get Kubernetes up and running on OpenStack
February
- Need more resources than originally requested
- Still working through oddities in the Kubernetes deployment
33. Ease of Deployment
For a 12-month project...
March
- Kubernetes deployed!!!
April
- Containerizing the salmon algorithm for TSP
- Figuring out and deploying Argo workflows
May
- Argo workflows successful
- DNA Fragment Assembly containers
- Salmon container git repo with CI
June
- Cluster flakes due to maintenance
- Paper writing
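For readers unfamiliar with Argo, the April and May steps (a containerized salmon run driven by an Argo workflow) might look roughly like the sketch below. It builds a minimal Argo `Workflow` manifest as a Python dictionary and prints it as JSON. The image name, the `salmon` command, its `--instance` flag, and the instance file name are placeholders invented for illustration; they are not taken from the talk or from the onyiny-ang/salmon repository.

```python
import json


def salmon_workflow(image: str, instance: str) -> dict:
    """Build a one-step Argo Workflow manifest for a containerized run.

    `image` and `instance` are hypothetical inputs: a container image that
    holds the algorithm and the name of a problem instance to solve.
    """
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        # generateName lets the cluster append a unique suffix per run.
        "metadata": {"generateName": "salmon-tsp-"},
        "spec": {
            "entrypoint": "run-salmon",
            "arguments": {
                "parameters": [{"name": "instance", "value": instance}]
            },
            "templates": [
                {
                    "name": "run-salmon",
                    "inputs": {"parameters": [{"name": "instance"}]},
                    "container": {
                        "image": image,
                        # Placeholder command line; the real CLI may differ.
                        "command": ["salmon"],
                        "args": [
                            "--instance",
                            "{{inputs.parameters.instance}}",
                        ],
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    manifest = salmon_workflow("example.org/salmon:latest", "instance.tsp")
    print(json.dumps(manifest, indent=2))
```

Since Kubernetes accepts JSON manifests as well as YAML, the printed output could be saved to a file and created on a cluster that has Argo Workflows installed.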
34. Ease of Deployment
For a 12-month project...
July
- Paper writing and submissions
- Presentation
- Graduate...?
NO KUBERNETES FOR 7/11 MONTHS
36. Future of Kubernetes at CC
- Learning curve is steep and time is precious (installing Kubernetes on bare metal just to run your workflow is not worth it)
- Lack of expertise with essential tools (YAML, Docker, GitHub)
- Given an existing Kubernetes cluster and a knowledgeable admin who can assist you, Kubernetes offers a lot of benefits worth exploring:
  - Completely automated, reproducible research workflows
  - Tools optimized for containerized, persistent services
  - Dependency hell eliminated
  - Rollback of Kubernetes versions can be fairly painless
To Kube or not to Kube?
37. Providers
- Offer Kubernetes for people to consume
- Get involved with the Kube community
- Learn as much as you can
- Provide outreach to researchers and anyone that might need to be ramped up
38. Researchers
- Engage with research institutions
- Get involved with the Kube community
- Learn as much as you can
- Share your findings widely!
39. Future Research
- Performance comparisons
- GitOps
- Sylabs.io's "Slurm operator"
- Best practices for deploying Kubernetes on prem
41. Useful Links
- CNCF Research User Group
- CNCF Academic Mailing List
- CNCF Academic Slack (#academia)
- Batch Jobs Channel (#kubernetes-batch-jobs)
- Kubernetes Big Data User Group
- Kubernetes Machine Learning Working Group
42. References
[1] G. M. Kurtzer, V. Sochat, and M. W. Bauer, "Singularity: Scientific containers for mobility of compute," PLOS ONE, vol. 12, no. 5, p. e0177459, May 2017. [Online]. Available: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0177459
[2] R. Priedhorsky and T. Randles, "Charliecloud: Unprivileged containers for user-defined software stacks in HPC," p. 12.
[3] B. Burns, "The History of Kubernetes & the Community Behind It," 2018. [Online]. Available: https://kubernetes.io/blog/2018/07/20/the-history-of-kubernetes--the-community-behind-it/
[4] B. Burns, B. Grant, D. Oppenheimer, E. Brewer, and J. Wilkes, "Borg, Omega, and Kubernetes," p. 24.
[5] "CERN Case Study," Jun. 2019. [Online]. Available: https://kubernetes.io/case-studies/cern/
[6] K. Sheets and S. Telfer, "Kubernetes, HPC and MPI," Nov. 2018. [Online]. Available: https://www.stackhpc.com/k8s-mpi.html
[7] A. Ingersoll, "The full-time job of keeping up with Kubernetes." [Online]. Available: https://gravitational.com/blog/kubernetes-release-cycle/
[8] J. Orth, S. Houghten, and L. Tulloch, "Evaluation of the salmon algorithm," in 2017 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), Aug. 2017, pp. 1-8.
[9] J. Orth, "The Salmon Algorithm-A New Population Based Search Metaheuristic," Master's Thesis, Brock University, 2012. [Online]. Available: https://dr.library.brocku.ca/handle/10464/3929
[10] C. Pahl, A. Brogi, J. Soldani, and P. Jamshidi, "Cloud Container Technologies: a State-of-the-Art Review," IEEE Transactions on Cloud Computing, vol. PP, pp. 1-1, May 2017.
[11] The Linux Foundation, "Open Container Initiative, About." [Online]. Available: https://www.opencontainers.org/faq (accessed 2019-07-01).
45. References
[29] ---, "Base code and container build repository for the Salmon Algorithm and basic DNA Fragment Assembly benchmark problems: onyiny-ang/salmon," Jul. 2019, original-date: 2019-05-10T05:53:00Z. [Online]. Available: https://github.com/onyiny-ang/salmon
[30] S. Team, "Introducing HPC Affinities to the Enterprise: A New Open Source Project Integrates Singularity and Slurm via Kubernetes," May 2019. [Online]. Available: https://www.sylabs.io/2019/05/introducing-hpc-affinities-to-the-enterprise-a-new-open-source-project-integrates-singularity-and-slurm-via-kubernetes/
[31] F. da Veiga Leprevost, B. A. Grüning, S. Alves Aflitos, H. L. Röst, J. Uszkoreit, H. Barsnes, M. Vaudel, P. Moreno, L. Gatto, J. Weber, M. Bai, R. C. Jimenez, T. Sachsenberg, J. Pfeuffer, R. Vera Alvarez, J. Griss, A. I. Nesvizhskii, and Y. Perez-Riverol, "BioContainers: an open-source and community-driven framework for software standardization," Bioinformatics, vol. 33, no. 16, pp. 2580-2582, Aug. 2017. [Online]. Available: https://academic.oup.com/bioinformatics/article/33/16/2580/3096437
[32] J. A. Novella, P. Emami Khoonsari, S. Herman, D. Whitenack, M. Capuccini, J. Burman, K. Kultima, and O. Spjuth, "Container-based bioinformatics with Pachyderm," bioRxiv, Apr. 2018. [Online]. Available: http://biorxiv.org/lookup/doi/10.1101/299032
[33] O. Spjuth, M. Capuccini, M. Carone, A. Larsson, W. Schaal, J. Novella, P. Di Tommaso, C. Notredame, P. Moreno, P. E. Khoonsari, S. Herman, K. Kultima, and S. Lampa, "Approaches for containerized scientific workflows in cloud environments with applications in life science," Aug. 2018. [Online]. Available: https://peerj.com/preprints/27141
[34] S. Turol, C. Gutierrez, and S. Matykevich, "A Multitude of Kubernetes Deployment Tools: Kubespray, kops, and kubeadm," Jun. 2018. [Online]. Available: https://www.altoros.com/blog/a-multitude-of-kubernetes-deployment-tools-kubespray-kops-and-kubeadm/