The case for cloud computing in the life
sciences
Ola Spjuth <ola.spjuth@farmbio.uu.se>
Department of Pharmaceutical Biosciences
and Science for Life Laboratory
Uppsala University
About me
• Ola Spjuth, Docent
• Associate Professor at Uppsala University
– Data-intensive and translational bioinformatics (http://pharmb.io)
• Head of Bioinformatics Compute and Storage facility at SciLifeLab
– Responsible for managing resources
– Strategic e-infra planning and procurement for SciLifeLab
• Deputy Director at SNIC-UPPMAX HPC center
• Guest Researcher at Karolinska Institutet
– e-Science for Cancer Prevention and Control (eCPC), flagship
project at SeRC
From conventional microscopes…
…to digital video microscopes
and image analysis
Molecular biology is a field in transition…
From manual operations…
…to automated robotized laboratories
Today: We have access to high-throughput
technologies to study biological phenomena
Science for Life Laboratory
An internationally leading center
that develops and applies
large-scale technologies for
molecular biosciences with a
focus on health and
environment.
Became a national platform in 2013
Stockholm node
Uppsala node
2017: Human whole genome sequenced
in 3 days for ~$1100
…requires supercomputers
for analysis and storage
Massively parallel sequencing…
2017: Illumina HiSeq X systems. 15K whole human
genomes per year
2016: NGI data velocity 950 Mbp/hour ≈ 16 Mbp/min
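As a quick unit check of the 950 Mbp/hour figure, the sketch below converts an hourly rate to per-minute and per-second rates. The function names are illustrative assumptions, not part of any NGI tooling.

```python
# Unit check for a sequencing data rate given in megabase pairs (Mbp) per hour.
MBP_PER_HOUR = 950  # figure quoted on the slide


def per_minute(rate_per_hour: float) -> float:
    """Convert an hourly rate to a per-minute rate."""
    return rate_per_hour / 60


def per_second(rate_per_hour: float) -> float:
    """Convert an hourly rate to a per-second rate."""
    return rate_per_hour / 3600


print(f"{per_minute(MBP_PER_HOUR):.1f} Mbp/min")  # 15.8 Mbp/min
print(f"{per_second(MBP_PER_HOUR):.2f} Mbp/s")    # 0.26 Mbp/s
```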
Mode of operation
[Diagram labels: Scientists, Sample transfer, Platforms, Pre-processing (NGI), Data delivery, Research (SNIC), Analysis]
UPPMAX: A national e-infrastructure
• Software + reference data
• Support
• Education
• Compute resources
• Storage resources
• Efficiency + automation
Some statistics
[Charts: storage usage; projects at SNIC-UPPMAX (data-intensive bioinformatics, other disciplines); support tickets]
New challenges: Data management and
analysis
• Storage
• Analysis methods, pipelines
• Scaling
• Automation
• Data integration, security
• Predictions
• …
Why cloud in the life sciences?
• Access to resources
– Flexible configurations
– On-demand
– Cost-efficient?
• Collaborate on international level
– Publish/federate data
– E.g. Large sequencing initiatives, “move compute to the
data”
• New types of analysis environments
– Hadoop/Spark/Flink etc.
– Microservices, Docker, Kubernetes, Mesos
Challenges with cloud
• Tradition: Strong HPC tradition in academia
– Existing resources (funded by the Research Council) and personnel at six centres in Sweden (SNIC)
• Economy: Cost model is new
– Difficult to assess the costs
• Legal: Working with sensitive data
• Educational: New technology for many
Needs in bioinformatics
• Primarily resources with a lot of RAM and storage (high I/O)
• Preferably a transparent system: users don’t want to deal with e-infrastructure at all
• How to work with storage (tiered?)
My research focus
• e-infrastructure development
– Automation, Big Data
• e-Science methods development
– Prediction models, machine learning
• Applied e-Science research
– Drug discovery and individualized diagnostics
Selected research in my group
• Privacy preservation
• Workflows
• Big Data frameworks
• Data management and predictive modeling
• Data federation
• Compute federation
Reactive/continuous modeling
[Diagram labels: Data sources, Coordinate, Integrate, Version, Monitor, Train and assess model, Publish models, Archive models, Bioclipse, User]
Virtual Research Environments
[Diagram labels: Researcher, Other researchers, Tools, Data; caption: VREs aim to bridge this gap!]
Virtual Research Environments
[Diagram labels: Researcher, Other researchers, Tools, Data, Compute and storage resources; caption: Virtual Research Environment!]
PhenoMeNal approach and stack
• Enable users to deploy their own virtual infrastructure on an IaaS provider
• Containerize tools, orchestrate with workflow systems on top of Kubernetes
[Tools shown: KubeNow, Cloudflare, kubeadm, Terraform, kubectl, Packer]
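To illustrate what orchestrating containerized tools with a workflow system involves, here is a minimal sketch that runs steps in dependency order. The step names and the `run_step` placeholder are hypothetical; this is not the PhenoMeNal or KubeNow API, and a real workflow system would launch each step as a container on the cluster.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: step name -> set of steps it depends on.
workflow = {
    "align": {"fetch_reads"},
    "call_variants": {"align"},
    "annotate": {"call_variants"},
    "qc_report": {"fetch_reads"},
}


def run_step(name: str) -> str:
    # Placeholder: a real system would start a container for this step.
    return f"ran {name}"


def run_workflow(dag: dict) -> tuple[list, list]:
    """Run every step after all of its dependencies have run."""
    order = list(TopologicalSorter(dag).static_order())
    return order, [run_step(step) for step in order]


order, log = run_workflow(workflow)
print(order)
```

Steps with no dependency between them (here `qc_report` and `align`) could run in parallel, which is where a scheduler such as Kubernetes becomes useful.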
Hierarchical Analysis of Temporal and
Spatial Image Data
Carolina Wählby
PI, PhD, Professor in Quantitative Microscopy
Andreas Hellander
Co-PI, Associate Professor
Ola Spjuth
Co-PI, Associate Professor
www.cb.uu.se/~carolina/HATSID.html
Presenting at Spark Summit 2017:
“EasyMapReduce: Leverage the Power of Spark and Docker to Scale Scientific Tools in MapReduce Fashion”
https://spark-summit.org/east-2017/events/easymapreduce-leverage-the-power-of-spark-and-docker-to-scale-scientific-tools-in-mapreduce-fashion/
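The MapReduce pattern named in the talk title can be sketched in plain Python: a map step applies a tool to each data partition independently, and a reduce step merges the partial results. This is an illustration under assumptions, not the EasyMapReduce API; `gc_count` stands in for a containerized scientific tool, and the input data are made up.

```python
from functools import reduce


def gc_count(chunk: str) -> tuple[int, int]:
    """Map step: stand-in for a tool run on one data partition."""
    gc = sum(1 for base in chunk if base in "GC")
    return gc, len(chunk)


def combine(a: tuple[int, int], b: tuple[int, int]) -> tuple[int, int]:
    """Reduce step: merge two partial results."""
    return a[0] + b[0], a[1] + b[1]


partitions = ["ACGT", "GGCC", "ATAT"]  # made-up example data
gc, total = reduce(combine, map(gc_count, partitions))
print(gc / total)  # GC fraction across all partitions
```

Because each map call touches only its own partition, a framework such as Spark can distribute the calls across many machines.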
Our most recent scientific publication
http://jcheminf.springeropen.com/articles/10.1186/s13321-017-0204-4
European Open Science Cloud (EOSC)
• The vast majority of all data in the world (in fact up to 90%) has been
generated in the last two years.
• Scientific data is in direct need of openness, better handling, careful
management, machine actionability and sheer re-use.
• European Open Science Cloud: A vision of a future infrastructure to
support Open Research Data and Open Science in Europe
– It should enable trusted access to services, systems and the re-use
of shared scientific data across disciplinary, social and geographical
borders
– research data should be findable, accessible, interoperable and reusable (FAIR)
– provide the means to analyze datasets of huge sizes
http://ec.europa.eu/research/openscience/index.cfm?pg=open-science-cloud

Editor's notes

  1. Strategic funding to enable: infrastructure for high-throughput analysis; a multi-disciplinary research environment; competence in technology and analysis methodology.
  2. Access to computers (many if you need them); access to storage (a lot if you need it); pre-installed software and reference genomes; free of charge.
  3. How can we improve efficiency on shared HPC for data-intensive bioinformatics? Can Cloud Computing and Big Data frameworks aid data-intensive research? How useful are Scientific Workflows in data-intensive research? Can predictive modeling aid data acquisition, storage and analysis? How can we continuously improve predictive models as data changes?