SlideShare ist ein Scribd-Unternehmen logo
1 von 17
Why are we going to talk to you
about environments/containers?
Phil Ashton, PGI Advanced Training - KEMRI Wellcome, April 2022
There are two* universally painful parts of
being a bioinformatician:
● Software installation
● Reproducibility
Environments/containers can help solve
both these problems
*at least two
Ease of use for
developers
Ease of use
for users
Hard
Easy
Hard
Easy
apt-get/yum
Some scripts and a
list of dependencies
Traditionally (i.e. before 2016) there was a trade off between the simplicity of making software available and
the simplicity of installing software
Virtual environments and containers provide a “happy medium” - relatively straightforward for both users
and developers
Installation
Reproducibility
“It works on my machine”
“We build our computer systems the way we build our cities: over time, without a plan,
on top of ruins” - Ellen Ullman
The solution is to make our machines as similar as possible, and to start from fresh each time
- environments and containers help us do this
Conda and Conda
environments
(Python) package management is (was?) a mess
https://xkcd.com/1987/
Introduction
to Conda
What is the problem?
● We want to avoid “dependency hell”:
○ I want to run “software A”
○ Software A depends on Software B, C, and D.
○ Software B depends on Software E and F
○ Software C depends on G and H
○ Software D depends on I and J
○ 

What is the solution?
● “Package management”
Enter Anaconda
● Anaconda is a distribution of
Python & R aimed at scientists
that aims to simplify package
management and deployment.
● When you install Anaconda you
get:
○ Python + common scientific
packages
○ Conda (language agnostic)
■ Package manager
■ Environment
manager
● Miniconda is:
○ Python
○ Conda
conda
● Thousands of packages available
to install via conda
● Different “channels” that are the
location where packages are
hosted
● Channels include
○ Bioconda - for bioinformatics software
○ Conda-forge - miscellaneous
community provided packages
● Channels are searched in order
Bioconda
● Conda became a victim of its own success
- as the number of packages increases,
the time taken to “solve the
dependencies” has increased.
○ What is the current state of the software
environment on the computer?
○ What is the desired state of the software
environment (for the new software)?
○ Make an upgrade plan that allows installation of
new software, without breaking existing software.
● Two things slow down conda:
○ Increasing numbers of packages installed on
your computer
○ Increasing number of packages available for
installation.
Mamba
● Mamba is a “drop in” replacement for
conda that is much quicker:
○ `mamba install`
○ `mamba create`
● Uses the libsolv dependency solver and
optimised code in C++
Mamba
Conda environments
● Conda is also an environment manager
● Environments (also known as virtual environments) are a way to have
separate and independent installations of software on the same computer.
● They work by modifying the $PATH variable (very simple explanation).
● Advantages of environments:
○ Software installs quickly (if there isn’t much inside your environment)
○ It’s possible to have software that requires e.g. different versions of the same dependencies.
● Default environment is called `base`
What do you want your base environment to look like?
vs
Limitations of Conda
● Can be differences in dependency resolution solutions between platforms
● Conda environments are not inherently cross-platform
I.e. doesn’t *necessarily* solve the “works on my machine” problem.
Outline of practical
1. Install miniconda
2. Use conda to install mamba
3. Use mamba to install a package into the base environment
4. Use mamba to create a new environment and install software into it
5. Install software into an existing environment
6. Install software from a yaml file
7. Export the specification for an existing environment to a yaml file, and create
a new environment from that file
Acknowledgements & further reading
Anna Price, Cardiff/CLIMB - https://www.youtube.com/watch?v=qORviM_ELdk
Open Software Packaging for Science | by QuantStack | Medium
Python Virtual Environments: A Primer

Weitere Àhnliche Inhalte

Was ist angesagt?

Top 5 Mistakes When Writing Spark Applications by Mark Grover and Ted Malaska
Top 5 Mistakes When Writing Spark Applications by Mark Grover and Ted MalaskaTop 5 Mistakes When Writing Spark Applications by Mark Grover and Ted Malaska
Top 5 Mistakes When Writing Spark Applications by Mark Grover and Ted Malaska
Spark Summit
 

Was ist angesagt? (20)

So You Want to Write a Connector?
So You Want to Write a Connector? So You Want to Write a Connector?
So You Want to Write a Connector?
 
KAFKA 3.1.0.pdf
KAFKA 3.1.0.pdfKAFKA 3.1.0.pdf
KAFKA 3.1.0.pdf
 
Epoll - from the kernel side
Epoll -  from the kernel sideEpoll -  from the kernel side
Epoll - from the kernel side
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Kafka at Peak Performance
Kafka at Peak PerformanceKafka at Peak Performance
Kafka at Peak Performance
 
Reactive Microservices with Spring 5: WebFlux
Reactive Microservices with Spring 5: WebFlux Reactive Microservices with Spring 5: WebFlux
Reactive Microservices with Spring 5: WebFlux
 
OpenStack Oslo Messaging RPC API Tutorial Demo Call, Cast and Fanout
OpenStack Oslo Messaging RPC API Tutorial Demo Call, Cast and FanoutOpenStack Oslo Messaging RPC API Tutorial Demo Call, Cast and Fanout
OpenStack Oslo Messaging RPC API Tutorial Demo Call, Cast and Fanout
 
Kafka Summit SF 2017 - Kafka Connect Best Practices – Advice from the Field
Kafka Summit SF 2017 - Kafka Connect Best Practices – Advice from the FieldKafka Summit SF 2017 - Kafka Connect Best Practices – Advice from the Field
Kafka Summit SF 2017 - Kafka Connect Best Practices – Advice from the Field
 
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry
 
Postgre sql best_practices
Postgre sql best_practicesPostgre sql best_practices
Postgre sql best_practices
 
[Pgday.Seoul 2017] 2. PostgreSQL을 위한 ëŠŹëˆ…ìŠ€ 컀널 씜적화 - êč€ìƒìš±
[Pgday.Seoul 2017] 2. PostgreSQL을 위한 ëŠŹëˆ…ìŠ€ 컀널 씜적화 - êč€ìƒìš±[Pgday.Seoul 2017] 2. PostgreSQL을 위한 ëŠŹëˆ…ìŠ€ 컀널 씜적화 - êč€ìƒìš±
[Pgday.Seoul 2017] 2. PostgreSQL을 위한 ëŠŹëˆ…ìŠ€ 컀널 씜적화 - êč€ìƒìš±
 
Reactive Programming for Real Use Cases
Reactive Programming for Real Use CasesReactive Programming for Real Use Cases
Reactive Programming for Real Use Cases
 
Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...
Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...
Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...
 
Top 5 Event Streaming Use Cases for 2021 with Apache Kafka
Top 5 Event Streaming Use Cases for 2021 with Apache KafkaTop 5 Event Streaming Use Cases for 2021 with Apache Kafka
Top 5 Event Streaming Use Cases for 2021 with Apache Kafka
 
Building Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta LakeBuilding Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta Lake
 
Reactive programming with examples
Reactive programming with examplesReactive programming with examples
Reactive programming with examples
 
MongoDB Backups and PITR
MongoDB Backups and PITRMongoDB Backups and PITR
MongoDB Backups and PITR
 
Tuning the Kernel for Varnish Cache
Tuning the Kernel for Varnish CacheTuning the Kernel for Varnish Cache
Tuning the Kernel for Varnish Cache
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache Kafka
 
Top 5 Mistakes When Writing Spark Applications by Mark Grover and Ted Malaska
Top 5 Mistakes When Writing Spark Applications by Mark Grover and Ted MalaskaTop 5 Mistakes When Writing Spark Applications by Mark Grover and Ted Malaska
Top 5 Mistakes When Writing Spark Applications by Mark Grover and Ted Malaska
 

Ähnlich wie 2022.03.23 Conda and Conda environments.pptx

OSAC16: Unikernel-powered Transient Microservices: Changing the Face of Softw...
OSAC16: Unikernel-powered Transient Microservices: Changing the Face of Softw...OSAC16: Unikernel-powered Transient Microservices: Changing the Face of Softw...
OSAC16: Unikernel-powered Transient Microservices: Changing the Face of Softw...
Russell Pavlicek
 

Ähnlich wie 2022.03.23 Conda and Conda environments.pptx (20)

Thinking inside the box (shared)
Thinking inside the box (shared)Thinking inside the box (shared)
Thinking inside the box (shared)
 
Effectively using Open Source with conda
Effectively using Open Source with condaEffectively using Open Source with conda
Effectively using Open Source with conda
 
Rise of the machines: Continuous Delivery at SEEK - YOW! Night Summary Slides
Rise of the machines: Continuous Delivery at SEEK - YOW! Night Summary SlidesRise of the machines: Continuous Delivery at SEEK - YOW! Night Summary Slides
Rise of the machines: Continuous Delivery at SEEK - YOW! Night Summary Slides
 
Build and deploy scientific Python Applications
Build and deploy scientific Python Applications  Build and deploy scientific Python Applications
Build and deploy scientific Python Applications
 
Magento Docker Setup.pdf
Magento Docker Setup.pdfMagento Docker Setup.pdf
Magento Docker Setup.pdf
 
Installing Software, Part 3: Command Line
Installing Software, Part 3: Command LineInstalling Software, Part 3: Command Line
Installing Software, Part 3: Command Line
 
Why you should use native packages to install PostgreSQL on Linux
Why you should use native packages to install PostgreSQL on LinuxWhy you should use native packages to install PostgreSQL on Linux
Why you should use native packages to install PostgreSQL on Linux
 
The Deck by Phil Polstra GrrCON2012
The Deck by Phil Polstra GrrCON2012The Deck by Phil Polstra GrrCON2012
The Deck by Phil Polstra GrrCON2012
 
anaconda.pptx
anaconda.pptxanaconda.pptx
anaconda.pptx
 
Applying the Unix Philosophy to Django projects: a report from the real world
Applying the Unix Philosophy to Django projects: a report from the real worldApplying the Unix Philosophy to Django projects: a report from the real world
Applying the Unix Philosophy to Django projects: a report from the real world
 
Headless Android
Headless AndroidHeadless Android
Headless Android
 
PHP Mega Meetup, Sep, 2020, Anti patterns in php
PHP Mega Meetup, Sep, 2020, Anti patterns in phpPHP Mega Meetup, Sep, 2020, Anti patterns in php
PHP Mega Meetup, Sep, 2020, Anti patterns in php
 
Polstra 44con2012
Polstra 44con2012Polstra 44con2012
Polstra 44con2012
 
Hacking and Forensics on the Go - 44CON 2012
Hacking and Forensics on the Go - 44CON 2012Hacking and Forensics on the Go - 44CON 2012
Hacking and Forensics on the Go - 44CON 2012
 
Conda environment system & how to use it on CSUC machines
Conda environment system & how to use it on CSUC machinesConda environment system & how to use it on CSUC machines
Conda environment system & how to use it on CSUC machines
 
Docker 101
Docker 101Docker 101
Docker 101
 
What's new in p2 (2009)?
What's new in p2 (2009)?What's new in p2 (2009)?
What's new in p2 (2009)?
 
OSAC16: Unikernel-powered Transient Microservices: Changing the Face of Softw...
OSAC16: Unikernel-powered Transient Microservices: Changing the Face of Softw...OSAC16: Unikernel-powered Transient Microservices: Changing the Face of Softw...
OSAC16: Unikernel-powered Transient Microservices: Changing the Face of Softw...
 
Managing Software Dependencies and the Supply Chain_ MIT EM.S20.pdf
Managing Software Dependencies and the Supply Chain_ MIT EM.S20.pdfManaging Software Dependencies and the Supply Chain_ MIT EM.S20.pdf
Managing Software Dependencies and the Supply Chain_ MIT EM.S20.pdf
 
DevOpsDays Tel Aviv DEC 2022 | Building A Cloud-Native Platform Brick by Bric...
DevOpsDays Tel Aviv DEC 2022 | Building A Cloud-Native Platform Brick by Bric...DevOpsDays Tel Aviv DEC 2022 | Building A Cloud-Native Platform Brick by Bric...
DevOpsDays Tel Aviv DEC 2022 | Building A Cloud-Native Platform Brick by Bric...
 

Mehr von Philip Ashton (6)

2022.09.26 iNTS household survey.pptx
2022.09.26 iNTS household survey.pptx2022.09.26 iNTS household survey.pptx
2022.09.26 iNTS household survey.pptx
 
2022.03.24 Snakemake.pptx
2022.03.24 Snakemake.pptx2022.03.24 Snakemake.pptx
2022.03.24 Snakemake.pptx
 
2016.11.08 doherty institute symposium
2016.11.08 doherty institute symposium2016.11.08 doherty institute symposium
2016.11.08 doherty institute symposium
 
2016.03.08 immem talk salmonella outbreaks
2016.03.08 immem talk   salmonella outbreaks2016.03.08 immem talk   salmonella outbreaks
2016.03.08 immem talk salmonella outbreaks
 
SnapperDB ASM NGS conference
SnapperDB ASM NGS conferenceSnapperDB ASM NGS conference
SnapperDB ASM NGS conference
 
Whole genome microbiology for Salmonella public health microbiology
Whole genome microbiology for Salmonella public health microbiologyWhole genome microbiology for Salmonella public health microbiology
Whole genome microbiology for Salmonella public health microbiology
 

KĂŒrzlich hochgeladen

SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
RizalinePalanog2
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learning
levieagacer
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
SĂ©rgio Sacani
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
Areesha Ahmad
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
PirithiRaju
 

KĂŒrzlich hochgeladen (20)

Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
 
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learning
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Kochi ❀CALL GIRL 84099*07087 ❀CALL GIRLS IN Kochi ESCORT SERVICE❀CALL GIRL
Kochi ❀CALL GIRL 84099*07087 ❀CALL GIRLS IN Kochi ESCORT SERVICE❀CALL GIRLKochi ❀CALL GIRL 84099*07087 ❀CALL GIRLS IN Kochi ESCORT SERVICE❀CALL GIRL
Kochi ❀CALL GIRL 84099*07087 ❀CALL GIRLS IN Kochi ESCORT SERVICE❀CALL GIRL
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .
 
❀Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💩✅.
❀Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💩✅.❀Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💩✅.
❀Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💩✅.
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
 
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verifiedConnaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
American Type Culture Collection (ATCC).pptx
American Type Culture Collection (ATCC).pptxAmerican Type Culture Collection (ATCC).pptx
American Type Culture Collection (ATCC).pptx
 

2022.03.23 Conda and Conda environments.pptx

  • 1. Why are we going to talk to you about environments/containers? Phil Ashton, PGI Advanced Training - KEMRI Wellcome, April 2022
  • 2. There are two* universally painful parts of being a bioinformatician: ● Software installation ● Reproducibility Environments/containers can help solve both these problems *at least two
  • 3. Ease of use for developers Ease of use for users Hard Easy Hard Easy apt-get/yum Some scripts and a list of dependencies Traditionally (i.e. before 2016) there was a trade off between the simplicity of making software available and the simplicity of installing software Virtual environments and containers provide a “happy medium” - relatively straightforward for both users and developers Installation
  • 4. Reproducibility “It works on my machine” “We build our computer systems the way we build our cities: over time, without a plan, on top of ruins” - Ellen Ullman The solution is to make our machines as similar as possible, and to start from fresh each time - environments and containers help us do this
  • 6. (Python) package management is (was?) a mess https://xkcd.com/1987/
  • 8. What is the problem? ● We want to avoid “dependency hell”: ○ I want to run “software A” ○ Software A depends on Software B, C, and D. ○ Software B depends on Software E and F ○ Software C depends on G and H ○ Software D depends on I and J ○ 
 What is the solution? ● “Package management”
  • 9. Enter Anaconda ● Anaconda is a distribution of Python & R aimed at scientists that aims to simplify package management and deployment. ● When you install Anaconda you get: ○ Python + common scientific packages ○ Conda (language agnostic) ■ Package manager ■ Environment manager ● Miniconda is: ○ Python ○ Conda
  • 10. conda ● Thousands of packages available to install via conda ● Different “channels” that are the location where packages are hosted ● Channels include ○ Bioconda - for bioinformatics software ○ Conda-forge - miscellaneous community provided packages ● Channels are searched in order Bioconda
  • 11. ● Conda became a victim of its own success - as the number of packages increases, the time taken to “solve the dependencies” has increased. ○ What is the current state of the software environment on the computer? ○ What is the desired state of the software environment (for the new software)? ○ Make an upgrade plan that allows installation of new software, without breaking existing software. ● Two things slow down conda: ○ Increasing numbers of packages installed on your computer ○ Increasing number of packages available for installation. Mamba
  • 12. ● Mamba is a “drop in” replacement for conda that is much quicker: ○ `mamba install` ○ `mamba create` ● Uses the libsolv dependency solver and optimised code in C++ Mamba
  • 13. Conda environments ● Conda is also an environment manager ● Environments (also known as virtual environments) are a way to have separate and independent installations of software on the same computer. ● They work by modifying the $PATH variable (very simple explanation). ● Advantages of environments: ○ Software installs quickly (if there isn’t much inside your environment) ○ It’s possible to have software that requires e.g. different versions of the same dependencies. ● Default environment is called `base`
  • 14. What do you want your base environment to look like? vs
  • 15. Limitations of Conda ● Can be differences in dependency resolution solutions between platforms ● Conda environments are not inherently cross-platform I.e. doesn’t *necessarily* solve the “works on my machine” problem.
  • 16. Outline of practical 1. Install miniconda 2. Use conda to install mamba 3. Use mamba to install a package into the base environment 4. Use mamba to create a new environment and install software into it 5. Install software into an existing environment 6. Install software from a yaml file 7. Export the specification for an existing environment to a yaml file, and create a new environment from that file
  • 17. Acknowledgements & further reading Anna Price, Cardiff/CLIMB - https://www.youtube.com/watch?v=qORviM_ELdk Open Software Packaging for Science | by QuantStack | Medium Python Virtual Environments: A Primer

Hinweis der Redaktion

  1. https://medium.com/@QuantStack/open-software-packaging-for-science-61cecee7fc23
  2. https://realpython.com/python-virtual-environments-a-primer/