SlideShare ist ein Scribd-Unternehmen logo
1 von 40
Downloaden Sie, um offline zu lesen
Gestion des données scientifiques en imagerie in vivo
https://piv2017.sciencesconf.org/
Christophe CERIN, Université Paris13
christophe.cerin@lipn.univ-paris13.fr
Plateformes de resources numériques
Expérience USPC
Paris, December 7, 2017
Outline
2
• Motivations related to large scale platforms
• Background:
• Criteria for observing platforms
• High Performance Computing (HPC)
• Apache suite for Big Data
• Some projects
• Contribution: USPC platform
• Conclusion and perspective
Motivations related to large
scale platforms
In experimental Sciences
we need tips for using
them appropriately / best
practices
3
4
Definition of a large scale system
• System with unprecedented amounts of hardware (>1M cores,
>100 PB storage, >100Gbits/s interconnect);
• Ultra large scale system: hardware + lines of source code +
numbers of users + volumes of data;
• New problems:
• built across multiple organizations;
• often with conflicting purposes and needs;
• constructed from heterogeneous parts with complex
dependencies and emergent properties;
• continuously evolving;
• software, hardware and human failures will be the norm,
not the exception.
Since in our lives (Cloud Computing)
5
Software as a Service
SaaS
Plateform as a Service
PaaS

Infrastructure as a 

Service

IaaS
• Don’t believe them;
• Variety in the eco-system is the norm.
6
Some speeches are ready for you
In Cloud, innovation drives technology/science
Containers and no more Virtual Machines
7
Hardware
Operating System
App App App
Conventional Stack
Hardware
OS
App App App
Hypervisor
OS OS
Virtualized Stack
▪Abstraction on resources in the past: VM technologies to hide
physical properties, interactions between system layer,
application layer and the end-users
VM versus Container
8
9
In Sciences (not in business)
• Experimental approach: modeling,…,
experiments,…
• What is an experimental plan for large
scale systems and reproducible research?
• Answers depend on the system used:
• Cluster (dedicated; managed)
• Cloud (almost non supervised… or by
you)
Background
10
▪Cluster ➔Performance ➔Flops - Top500
▪Cluster ➔Power ➔Flops per watt - Green500
▪Cloud ➔Price ➔for computing, for storing…
▪Cloud: the economic model may be strange (Amazon
Spot Instances)
▪Scientific issues: how to compare cluster and cloud?
(be fair)
11
Performance criteria for observing
large scale system
12
Performance (Top500)
13
Performance (Green500)
14
Productivity
▪Productivity traditionally refers to the ratio between the quantity
of software produced and the cost spent for it.
▪Factors: Programming language used, Program size, The experience
of programmers and design personnel, The novelty of requirements,
The complexity of the program and its data, The use of structured
programming methods, Program class or the distribution method,
Program type of the application area, Tools and environmental
conditions, Enhancing existing programs or systems, Maintaining
existing programs or systems, Reusing existing modules and standard
designs, Program generators, Fourth-generation languages,
Geographic separation of development locations, Defect potentials
and removal methods, (Existing) Documentation, Prototyping before
main development begins, Project teams and organization
structures, Morale and compensation of staff
▪Parallel (http://press3.mcs.anl.gov/atpesc/files/2015/03/ESCTP-snir_215.pdf):
▪Shared memory (OpenMP):
▪CPU+vector
▪CPU+GPU
▪CPU+accelerators (Google Tensor processing unit…)
▪Message passing (MPI)
▪Sequential with the call to « parallel data structure » (Scala)
▪Map-reduce: Hadoop, Spark (In addition to Map and Reduce operations, it
supports SQL queries, streaming data, machine learning and graph data
processing.)
15
Programming model
▪Even more complicated: Cray has added "ARM Option" (i.e. CPU blade option,
using the ThunderX2) to their XC50 supercomputers
▪We need consolidations:
▪in hardware
▪in software
▪in algorithms
▪in applications
▪in experimental methods
▪hpc facilities
▪training
16
Hardware landscape
17
Apache suite for Big Data
1- Bases de Données
1.1 Apache Hbase
1.2 Apache Cassandra
1.3 CouchDB
1.4 MongoDB
2 Accès aux données/ requêtage
2.1 Pig
2.2 Hive
2.3 Livy
3 Data Intelligence
3.1 Apache Drill
3.2 Apache Mahout
3.3 H2O
4 Data Serialisation
4.1 Apache Thrift
4.2 Apache Avro
5 Data integration
5.1 Apache Sqoop
5.2 Apache Flume
5.3 Apache Chuckwa
6 Requetage
6.1 Presto
6.2 Impala
6.3 Dremel
7 Sécurité des données
7.1 Apache Metron
7.2 Sqrrl
8.2 MapReduce
8.3 Spark
9.1 Elasticsearch
10.1 Apache Hadoop
10.6 Apache Mesos
10.10 Apache Flink
10.12 Apache Zookeeper
10.15 Apache Storm
10.20 Apache Kafka
Example of a project mixing
HPC and Big Data considerations
Where theory comes before data:
physics
chemistry
Materials
...
Where data comes before theory:
Medicine
business insights
autonomy (smart car, building, city)
18
Harp: Hadoop and Collective
Communication for Iterative Computation
19
• Project http://salsaproj.indiana.edu/harp/
• Motivation:
• Communication patterns are not abstracted and
defined in Hadoop, Pregel/Giraph, Spark
• In contrary, MPI has very fine grain based
(collective) communication primitives (based on
arrays and buffers)
• Harp provides data abstractions and communication
abstractions on top of them. It can be plugged into
Hadoop runtime to enable efficient in-memory
communication to avoid HDFS read/write.
20
Harp
• The data
abstraction
has 3
categories
horizontally
and 3 levels
vertically
• Harp works as a plugin in Hadoop. The goal is to
make Hadoop cluster can schedule original
MapReduce jobs and Map-Collective jobs at the
same time.
• Collective communication is defined as
movement of partitions within tables.
• Collective communication requires
synchronization. Hadoop scheduler is modified
to schedule tasks in BSP style.
• Fault tolerance with checkpointing.
21
Harp
• For the data-scientist of nowadays;
• Hub to facilite access to Amazon cloud
resources, to share data, to collaborate
between many people through notebooks;
• RosettaHUB: https://www.rosettahub.com
• Available through Amazon Educate program
(https://s3.amazonaws.com/awseducate-
list/AWS_Educate_Institutions.pdf)
22
RosettaHub
23
24
25
26
SPC and the deployment of
virtual machines
IDV experience
(Imageries du vivant)
27
28
http://cirrus.uspc.fr
29
30
31
Technologie OpenNebula
32
33
34
35
36
• Plateforme d'imagerie ePad
http://pf-01.lab.parisdescartes.fr:1243/dcm4chee-web3/
• Plateforme PLM dans le cadre du projet DRIVE-SPC
http://pf-01.lab.parisdescartes.fr:2345/tc/webclient
• Plateforme Sis4web de SisNCom
http://pf-01.lab.parisdescartes.fr:1323/sis4web/login.php
• Benchsys
http://pf-01.lab.parisdescartes.fr:9001/
• Owncloud
http://pf-01.lab.parisdescartes.fr:1253/owncloud/index.php/login
• …
37
Examples IDV services
Conclusion and Future work
Many, many opportunities
38
• Computer hardware will continue to evolve; Trend:
accelerators instead of GPUs;
• Platforms will continue to evolve to facilitate the
management of data, the computation on data;
• Need for convergence/resonance between HPC and Big-
data eco-systems;
• SPC: needs for training, using, transforming the scientific
approaches… with research engineers for accompany the
change;
• SPC: needs for projects… and to share betweenez
communities.
39
Conclusion and perspective
Thanks to Leila Abidi, Olivier
Waldek, Roland Chervet
40

Weitere ähnliche Inhalte

Was ist angesagt?

Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Geoffrey Fox
 
High Performance Data Analytics with Java on Large Multicore HPC Clusters
High Performance Data Analytics with Java on Large Multicore HPC ClustersHigh Performance Data Analytics with Java on Large Multicore HPC Clusters
High Performance Data Analytics with Java on Large Multicore HPC Clusters
Saliya Ekanayake
 

Was ist angesagt? (20)

51 Use Cases and implications for HPC & Apache Big Data Stack
51 Use Cases and implications for HPC & Apache Big Data Stack51 Use Cases and implications for HPC & Apache Big Data Stack
51 Use Cases and implications for HPC & Apache Big Data Stack
 
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
 
HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...
HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...
HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...
 
Hadoop - Architectural road map for Hadoop Ecosystem
Hadoop -  Architectural road map for Hadoop EcosystemHadoop -  Architectural road map for Hadoop Ecosystem
Hadoop - Architectural road map for Hadoop Ecosystem
 
Introducing the hadoop ecosystem
Introducing the hadoop ecosystemIntroducing the hadoop ecosystem
Introducing the hadoop ecosystem
 
High Performance Data Analytics with Java on Large Multicore HPC Clusters
High Performance Data Analytics with Java on Large Multicore HPC ClustersHigh Performance Data Analytics with Java on Large Multicore HPC Clusters
High Performance Data Analytics with Java on Large Multicore HPC Clusters
 
High Performance Processing of Streaming Data
High Performance Processing of Streaming DataHigh Performance Processing of Streaming Data
High Performance Processing of Streaming Data
 
Next Generation Grid: Integrating Parallel and Distributed Computing Runtimes...
Next Generation Grid: Integrating Parallel and Distributed Computing Runtimes...Next Generation Grid: Integrating Parallel and Distributed Computing Runtimes...
Next Generation Grid: Integrating Parallel and Distributed Computing Runtimes...
 
Hadoop
Hadoop Hadoop
Hadoop
 
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
 
Apache hadoop introduction and architecture
Apache hadoop  introduction and architectureApache hadoop  introduction and architecture
Apache hadoop introduction and architecture
 
Big data Analytics Hadoop
Big data Analytics HadoopBig data Analytics Hadoop
Big data Analytics Hadoop
 
Big data | Hadoop | components of hadoop |Rahul Gulab Sing
Big data | Hadoop | components of hadoop |Rahul Gulab SingBig data | Hadoop | components of hadoop |Rahul Gulab Sing
Big data | Hadoop | components of hadoop |Rahul Gulab Sing
 
Classifying Simulation and Data Intensive Applications and the HPC-Big Data C...
Classifying Simulation and Data Intensive Applications and the HPC-Big Data C...Classifying Simulation and Data Intensive Applications and the HPC-Big Data C...
Classifying Simulation and Data Intensive Applications and the HPC-Big Data C...
 
Visualizing and Clustering Life Science Applications in Parallel 
Visualizing and Clustering Life Science Applications in Parallel Visualizing and Clustering Life Science Applications in Parallel 
Visualizing and Clustering Life Science Applications in Parallel 
 
Big data and hadoop
Big data and hadoopBig data and hadoop
Big data and hadoop
 
Big dataanalyticsbeyondhadoop public_20_june_2013
Big dataanalyticsbeyondhadoop public_20_june_2013Big dataanalyticsbeyondhadoop public_20_june_2013
Big dataanalyticsbeyondhadoop public_20_june_2013
 
Hadoop
HadoopHadoop
Hadoop
 
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY ppt
 
Hadoop MapReduce Framework
Hadoop MapReduce FrameworkHadoop MapReduce Framework
Hadoop MapReduce Framework
 

Ähnlich wie C cerin piv2017_c

Anusua Trivedi, Data Scientist at Texas Advanced Computing Center (TACC), UT ...
Anusua Trivedi, Data Scientist at Texas Advanced Computing Center (TACC), UT ...Anusua Trivedi, Data Scientist at Texas Advanced Computing Center (TACC), UT ...
Anusua Trivedi, Data Scientist at Texas Advanced Computing Center (TACC), UT ...
MLconf
 
Designing Convergent HPC and Big Data Software Stacks: An Overview of the HiB...
Designing Convergent HPC and Big Data Software Stacks: An Overview of the HiB...Designing Convergent HPC and Big Data Software Stacks: An Overview of the HiB...
Designing Convergent HPC and Big Data Software Stacks: An Overview of the HiB...
inside-BigData.com
 
Introduction to Apache Hadoop
Introduction to Apache HadoopIntroduction to Apache Hadoop
Introduction to Apache Hadoop
Christopher Pezza
 
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Perficient, Inc.
 

Ähnlich wie C cerin piv2017_c (20)

Matching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesMatching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software Architectures
 
High Performance Computing and Big Data
High Performance Computing and Big Data High Performance Computing and Big Data
High Performance Computing and Big Data
 
Anusua Trivedi, Data Scientist at Texas Advanced Computing Center (TACC), UT ...
Anusua Trivedi, Data Scientist at Texas Advanced Computing Center (TACC), UT ...Anusua Trivedi, Data Scientist at Texas Advanced Computing Center (TACC), UT ...
Anusua Trivedi, Data Scientist at Texas Advanced Computing Center (TACC), UT ...
 
Hadoop Tutorial.ppt
Hadoop Tutorial.pptHadoop Tutorial.ppt
Hadoop Tutorial.ppt
 
Hadoop tutorial
Hadoop tutorialHadoop tutorial
Hadoop tutorial
 
Designing Convergent HPC and Big Data Software Stacks: An Overview of the HiB...
Designing Convergent HPC and Big Data Software Stacks: An Overview of the HiB...Designing Convergent HPC and Big Data Software Stacks: An Overview of the HiB...
Designing Convergent HPC and Big Data Software Stacks: An Overview of the HiB...
 
Big data: Descoberta de conhecimento em ambientes de big data e computação na...
Big data: Descoberta de conhecimento em ambientes de big data e computação na...Big data: Descoberta de conhecimento em ambientes de big data e computação na...
Big data: Descoberta de conhecimento em ambientes de big data e computação na...
 
myHadoop - Hadoop-on-Demand on Traditional HPC Resources
myHadoop - Hadoop-on-Demand on Traditional HPC ResourcesmyHadoop - Hadoop-on-Demand on Traditional HPC Resources
myHadoop - Hadoop-on-Demand on Traditional HPC Resources
 
Cloud Services for Big Data Analytics
Cloud Services for Big Data AnalyticsCloud Services for Big Data Analytics
Cloud Services for Big Data Analytics
 
Introduction to Apache Hadoop
Introduction to Apache HadoopIntroduction to Apache Hadoop
Introduction to Apache Hadoop
 
FC Brochure & Insert
FC Brochure & InsertFC Brochure & Insert
FC Brochure & Insert
 
useR 2014 jskim
useR 2014 jskimuseR 2014 jskim
useR 2014 jskim
 
ODSC and iRODS
ODSC and iRODSODSC and iRODS
ODSC and iRODS
 
Designing Software Libraries and Middleware for Exascale Systems: Opportuniti...
Designing Software Libraries and Middleware for Exascale Systems: Opportuniti...Designing Software Libraries and Middleware for Exascale Systems: Opportuniti...
Designing Software Libraries and Middleware for Exascale Systems: Opportuniti...
 
Capacity Management and BigData/Hadoop - Hitchhiker's guide for the Capacity ...
Capacity Management and BigData/Hadoop - Hitchhiker's guide for the Capacity ...Capacity Management and BigData/Hadoop - Hitchhiker's guide for the Capacity ...
Capacity Management and BigData/Hadoop - Hitchhiker's guide for the Capacity ...
 
IJSRED-V2I3P43
IJSRED-V2I3P43IJSRED-V2I3P43
IJSRED-V2I3P43
 
Rack Cluster Deployment for SDSC Supercomputer
Rack Cluster Deployment for SDSC SupercomputerRack Cluster Deployment for SDSC Supercomputer
Rack Cluster Deployment for SDSC Supercomputer
 
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
 
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習 Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
 
Bitkom Cray presentation - on HPC affecting big data analytics in FS
Bitkom Cray presentation - on HPC affecting big data analytics in FSBitkom Cray presentation - on HPC affecting big data analytics in FS
Bitkom Cray presentation - on HPC affecting big data analytics in FS
 

Mehr von Bertrand Tavitian

Mehr von Bertrand Tavitian (7)

14 30-mjoliot biomist-piv-2017
14 30-mjoliot biomist-piv-201714 30-mjoliot biomist-piv-2017
14 30-mjoliot biomist-piv-2017
 
9 30 fandre-dist_cnrs_piv_2017
9 30 fandre-dist_cnrs_piv_20179 30 fandre-dist_cnrs_piv_2017
9 30 fandre-dist_cnrs_piv_2017
 
M allanic piv2017_c
M allanic piv2017_cM allanic piv2017_c
M allanic piv2017_c
 
M joliot biomist-piv_2017_c
M joliot biomist-piv_2017_cM joliot biomist-piv_2017_c
M joliot biomist-piv_2017_c
 
14 00-20171207 rance-piv_c
14 00-20171207 rance-piv_c14 00-20171207 rance-piv_c
14 00-20171207 rance-piv_c
 
M kain fli-iam_piv_2017_c
M kain fli-iam_piv_2017_cM kain fli-iam_piv_2017_c
M kain fli-iam_piv_2017_c
 
Mc jacquemot piv2017_c
Mc jacquemot piv2017_cMc jacquemot piv2017_c
Mc jacquemot piv2017_c
 

Kürzlich hochgeladen

🍑👄Hyderabad Escorts Service☎️7783825323🍑👄 Call Girl service in Hyderabad☎️Hyd...
🍑👄Hyderabad Escorts Service☎️7783825323🍑👄 Call Girl service in Hyderabad☎️Hyd...🍑👄Hyderabad Escorts Service☎️7783825323🍑👄 Call Girl service in Hyderabad☎️Hyd...
🍑👄Hyderabad Escorts Service☎️7783825323🍑👄 Call Girl service in Hyderabad☎️Hyd...
Sheetaleventcompany
 
❤️ Chandigarh Call Girls☎️98151-579OO☎️ Call Girl service in Chandigarh ☎️ Ch...
❤️ Chandigarh Call Girls☎️98151-579OO☎️ Call Girl service in Chandigarh ☎️ Ch...❤️ Chandigarh Call Girls☎️98151-579OO☎️ Call Girl service in Chandigarh ☎️ Ch...
❤️ Chandigarh Call Girls☎️98151-579OO☎️ Call Girl service in Chandigarh ☎️ Ch...
Rashmi Entertainment
 
👉 Chennai Sexy Aunty’s WhatsApp Number 👉📞 7427069034 👉📞 Just📲 Call Ruhi Colle...
👉 Chennai Sexy Aunty’s WhatsApp Number 👉📞 7427069034 👉📞 Just📲 Call Ruhi Colle...👉 Chennai Sexy Aunty’s WhatsApp Number 👉📞 7427069034 👉📞 Just📲 Call Ruhi Colle...
👉 Chennai Sexy Aunty’s WhatsApp Number 👉📞 7427069034 👉📞 Just📲 Call Ruhi Colle...
rajnisinghkjn
 
Pune Call Girl Service 📞9xx000xx09📞Just Call Divya📲 Call Girl In Pune No💰Adva...
Pune Call Girl Service 📞9xx000xx09📞Just Call Divya📲 Call Girl In Pune No💰Adva...Pune Call Girl Service 📞9xx000xx09📞Just Call Divya📲 Call Girl In Pune No💰Adva...
Pune Call Girl Service 📞9xx000xx09📞Just Call Divya📲 Call Girl In Pune No💰Adva...
Sheetaleventcompany
 
Cara Menggugurkan Kandungan Dengan Cepat Selesai Dalam 24 Jam Secara Alami Bu...
Cara Menggugurkan Kandungan Dengan Cepat Selesai Dalam 24 Jam Secara Alami Bu...Cara Menggugurkan Kandungan Dengan Cepat Selesai Dalam 24 Jam Secara Alami Bu...
Cara Menggugurkan Kandungan Dengan Cepat Selesai Dalam 24 Jam Secara Alami Bu...
Cara Menggugurkan Kandungan 087776558899
 

Kürzlich hochgeladen (20)

🍑👄Hyderabad Escorts Service☎️7783825323🍑👄 Call Girl service in Hyderabad☎️Hyd...
🍑👄Hyderabad Escorts Service☎️7783825323🍑👄 Call Girl service in Hyderabad☎️Hyd...🍑👄Hyderabad Escorts Service☎️7783825323🍑👄 Call Girl service in Hyderabad☎️Hyd...
🍑👄Hyderabad Escorts Service☎️7783825323🍑👄 Call Girl service in Hyderabad☎️Hyd...
 
❤️ Chandigarh Call Girls☎️98151-579OO☎️ Call Girl service in Chandigarh ☎️ Ch...
❤️ Chandigarh Call Girls☎️98151-579OO☎️ Call Girl service in Chandigarh ☎️ Ch...❤️ Chandigarh Call Girls☎️98151-579OO☎️ Call Girl service in Chandigarh ☎️ Ch...
❤️ Chandigarh Call Girls☎️98151-579OO☎️ Call Girl service in Chandigarh ☎️ Ch...
 
💞 Safe And Secure Call Girls Coimbatore🧿 6378878445 🧿 High Class Coimbatore C...
💞 Safe And Secure Call Girls Coimbatore🧿 6378878445 🧿 High Class Coimbatore C...💞 Safe And Secure Call Girls Coimbatore🧿 6378878445 🧿 High Class Coimbatore C...
💞 Safe And Secure Call Girls Coimbatore🧿 6378878445 🧿 High Class Coimbatore C...
 
Call Girls Mussoorie Just Call 8854095900 Top Class Call Girl Service Available
Call Girls Mussoorie Just Call 8854095900 Top Class Call Girl Service AvailableCall Girls Mussoorie Just Call 8854095900 Top Class Call Girl Service Available
Call Girls Mussoorie Just Call 8854095900 Top Class Call Girl Service Available
 
Cardiac Output, Venous Return, and Their Regulation
Cardiac Output, Venous Return, and Their RegulationCardiac Output, Venous Return, and Their Regulation
Cardiac Output, Venous Return, and Their Regulation
 
ANATOMY AND PHYSIOLOGY OF REPRODUCTIVE SYSTEM.pptx
ANATOMY AND PHYSIOLOGY OF REPRODUCTIVE SYSTEM.pptxANATOMY AND PHYSIOLOGY OF REPRODUCTIVE SYSTEM.pptx
ANATOMY AND PHYSIOLOGY OF REPRODUCTIVE SYSTEM.pptx
 
ANATOMY AND PHYSIOLOGY OF RESPIRATORY SYSTEM.pptx
ANATOMY AND PHYSIOLOGY OF RESPIRATORY SYSTEM.pptxANATOMY AND PHYSIOLOGY OF RESPIRATORY SYSTEM.pptx
ANATOMY AND PHYSIOLOGY OF RESPIRATORY SYSTEM.pptx
 
Race Course Road } Book Call Girls in Bangalore | Whatsapp No 6378878445 VIP ...
Race Course Road } Book Call Girls in Bangalore | Whatsapp No 6378878445 VIP ...Race Course Road } Book Call Girls in Bangalore | Whatsapp No 6378878445 VIP ...
Race Course Road } Book Call Girls in Bangalore | Whatsapp No 6378878445 VIP ...
 
Call 8250092165 Patna Call Girls ₹4.5k Cash Payment With Room Delivery
Call 8250092165 Patna Call Girls ₹4.5k Cash Payment With Room DeliveryCall 8250092165 Patna Call Girls ₹4.5k Cash Payment With Room Delivery
Call 8250092165 Patna Call Girls ₹4.5k Cash Payment With Room Delivery
 
Lucknow Call Girls Just Call 👉👉8630512678 Top Class Call Girl Service Available
Lucknow Call Girls Just Call 👉👉8630512678 Top Class Call Girl Service AvailableLucknow Call Girls Just Call 👉👉8630512678 Top Class Call Girl Service Available
Lucknow Call Girls Just Call 👉👉8630512678 Top Class Call Girl Service Available
 
Bhawanipatna Call Girls 📞9332606886 Call Girls in Bhawanipatna Escorts servic...
Bhawanipatna Call Girls 📞9332606886 Call Girls in Bhawanipatna Escorts servic...Bhawanipatna Call Girls 📞9332606886 Call Girls in Bhawanipatna Escorts servic...
Bhawanipatna Call Girls 📞9332606886 Call Girls in Bhawanipatna Escorts servic...
 
Call Girls Service Jaipur {9521753030 } ❤️VVIP BHAWNA Call Girl in Jaipur Raj...
Call Girls Service Jaipur {9521753030 } ❤️VVIP BHAWNA Call Girl in Jaipur Raj...Call Girls Service Jaipur {9521753030 } ❤️VVIP BHAWNA Call Girl in Jaipur Raj...
Call Girls Service Jaipur {9521753030 } ❤️VVIP BHAWNA Call Girl in Jaipur Raj...
 
👉 Chennai Sexy Aunty’s WhatsApp Number 👉📞 7427069034 👉📞 Just📲 Call Ruhi Colle...
👉 Chennai Sexy Aunty’s WhatsApp Number 👉📞 7427069034 👉📞 Just📲 Call Ruhi Colle...👉 Chennai Sexy Aunty’s WhatsApp Number 👉📞 7427069034 👉📞 Just📲 Call Ruhi Colle...
👉 Chennai Sexy Aunty’s WhatsApp Number 👉📞 7427069034 👉📞 Just📲 Call Ruhi Colle...
 
Lucknow Call Girls Service { 9984666624 } ❤️VVIP ROCKY Call Girl in Lucknow U...
Lucknow Call Girls Service { 9984666624 } ❤️VVIP ROCKY Call Girl in Lucknow U...Lucknow Call Girls Service { 9984666624 } ❤️VVIP ROCKY Call Girl in Lucknow U...
Lucknow Call Girls Service { 9984666624 } ❤️VVIP ROCKY Call Girl in Lucknow U...
 
Pune Call Girl Service 📞9xx000xx09📞Just Call Divya📲 Call Girl In Pune No💰Adva...
Pune Call Girl Service 📞9xx000xx09📞Just Call Divya📲 Call Girl In Pune No💰Adva...Pune Call Girl Service 📞9xx000xx09📞Just Call Divya📲 Call Girl In Pune No💰Adva...
Pune Call Girl Service 📞9xx000xx09📞Just Call Divya📲 Call Girl In Pune No💰Adva...
 
Kolkata Call Girls Naktala 💯Call Us 🔝 8005736733 🔝 💃 Top Class Call Girl Se...
Kolkata Call Girls Naktala  💯Call Us 🔝 8005736733 🔝 💃  Top Class Call Girl Se...Kolkata Call Girls Naktala  💯Call Us 🔝 8005736733 🔝 💃  Top Class Call Girl Se...
Kolkata Call Girls Naktala 💯Call Us 🔝 8005736733 🔝 💃 Top Class Call Girl Se...
 
Cara Menggugurkan Kandungan Dengan Cepat Selesai Dalam 24 Jam Secara Alami Bu...
Cara Menggugurkan Kandungan Dengan Cepat Selesai Dalam 24 Jam Secara Alami Bu...Cara Menggugurkan Kandungan Dengan Cepat Selesai Dalam 24 Jam Secara Alami Bu...
Cara Menggugurkan Kandungan Dengan Cepat Selesai Dalam 24 Jam Secara Alami Bu...
 
Call Girls Rishikesh Just Call 9667172968 Top Class Call Girl Service Available
Call Girls Rishikesh Just Call 9667172968 Top Class Call Girl Service AvailableCall Girls Rishikesh Just Call 9667172968 Top Class Call Girl Service Available
Call Girls Rishikesh Just Call 9667172968 Top Class Call Girl Service Available
 
Circulatory Shock, types and stages, compensatory mechanisms
Circulatory Shock, types and stages, compensatory mechanismsCirculatory Shock, types and stages, compensatory mechanisms
Circulatory Shock, types and stages, compensatory mechanisms
 
Chandigarh Call Girls Service ❤️🍑 9809698092 👄🫦Independent Escort Service Cha...
Chandigarh Call Girls Service ❤️🍑 9809698092 👄🫦Independent Escort Service Cha...Chandigarh Call Girls Service ❤️🍑 9809698092 👄🫦Independent Escort Service Cha...
Chandigarh Call Girls Service ❤️🍑 9809698092 👄🫦Independent Escort Service Cha...
 

C cerin piv2017_c

  • 1. Gestion des données scientifiques en imagerie in vivo https://piv2017.sciencesconf.org/ Christophe CERIN, Université Paris13 christophe.cerin@lipn.univ-paris13.fr Plateformes de resources numériques Expérience USPC Paris, December 7, 2017
  • 2. Outline 2 • Motivations related to large scale platforms • Background: • Criteria for observing platforms • High Performance Computing (HPC) • Apache suite for Big Data • Some projects • Contribution: USPC platform • Conclusion and perspective
  • 3. Motivations related to large scale platforms In experimental Sciences we need tips for using them appropriately / best practices 3
  • 4. 4 Definition of a large scale system • System with unprecedented amounts of hardware (>1M cores, >100 PB storage, >100Gbits/s interconnect); • Ultra large scale system: hardware + lines of source code + numbers of users + volumes of data; • New problems: • built across multiple organizations; • often with conflicting purposes and needs; • constructed from heterogeneous parts with complex dependencies and emergent properties; • continuously evolving; • software, hardware and human failures will be the norm, not the exception.
  • 5. Since in our lives (Cloud Computing) 5 Software as a Service SaaS Plateform as a Service PaaS
 Infrastructure as a 
 Service
 IaaS
  • 6. • Don’t believe them; • Variety in the eco-system is the norm. 6 Some speeches are ready for you
  • 7. In Cloud, innovation drives technology/science Containers and no more Virtual Machines 7 Hardware Operating System App App App Conventional Stack Hardware OS App App App Hypervisor OS OS Virtualized Stack ▪Abstraction on resources in the past: VM technologies to hide physical properties, interactions between system layer, application layer and the end-users
  • 9. 9 In Sciences (not in business) • Experimental approach: modeling,…, experiments,… • What is an experimental plan for large scale systems and reproducible research? • Answers depend on the system used: • Cluster (dedicated; managed) • Cloud (almost non supervised… or by you)
  • 11. ▪Cluster ➔Performance ➔Flops - Top500 ▪Cluster ➔Power ➔Flops per watt - Green500 ▪Cloud ➔Price ➔for computing, for storing… ▪Cloud: the economic model may be strange (Amazon Spot Instances) ▪Scientific issues: how to compare cluster and cloud? (be fair) 11 Performance criteria for observing large scale system
  • 14. 14 Productivity ▪Productivity traditionally refers to the ratio between the quantity of software produced and the cost spent for it. ▪Factors: Programming language used, Program size, The experience of programmers and design personnel, The novelty of requirements, The complexity of the program and its data, The use of structured programming methods, Program class or the distribution method, Program type of the application area, Tools and environmental conditions, Enhancing existing programs or systems, Maintaining existing programs or systems, Reusing existing modules and standard designs, Program generators, Fourth-generation languages, Geographic separation of development locations, Defect potentials and removal methods, (Existing) Documentation, Prototyping before main development begins, Project teams and organization structures, Morale and compensation of staff
  • 15. ▪Parallel (http://press3.mcs.anl.gov/atpesc/files/2015/03/ESCTP-snir_215.pdf): ▪Shared memory (OpenMP): ▪CPU+vector ▪CPU+GPU ▪CPU+accelerators (Google Tensor processing unit…) ▪Message passing (MPI) ▪Sequential with the call to « parallel data structure » (Scala) ▪Map-reduce: Hadoop, Spark (In addition to Map and Reduce operations, it supports SQL queries, streaming data, machine learning and graph data processing.) 15 Programming model
  • 16. ▪Even more complicated: Cray has added "ARM Option" (i.e. CPU blade option, using the ThunderX2) to their XC50 supercomputers ▪We need consolidations: ▪in hardware ▪in software ▪in algorithms ▪in applications ▪in experimental methods ▪hpc facilities ▪training 16 Hardware landscape
  • 17. 17 Apache suite for Big Data 1- Bases de Données 1.1 Apache Hbase 1.2 Apache Cassandra 1.3 CouchDB 1.4 MongoDB 2 Accès aux données/ requêtage 2.1 Pig 2.2 Hive 2.3 Livy 3 Data Intelligence 3.1 Apache Drill 3.2 Apache Mahout 3.3 H2O 4 Data Serialisation 4.1 Apache Thrift 4.2 Apache Avro 5 Data integration 5.1 Apache Sqoop 5.2 Apache Flume 5.3 Apache Chuckwa 6 Requetage 6.1 Presto 6.2 Impala 6.3 Dremel 7 Sécurité des données 7.1 Apache Metron 7.2 Sqrrl 8.2 MapReduce 8.3 Spark 9.1 Elasticsearch 10.1 Apache Hadoop 10.6 Apache Mesos 10.10 Apache Flink 10.12 Apache Zookeeper 10.15 Apache Storm 10.20 Apache Kafka
  • 18. Example of a project mixing HPC and Big Data considerations Where theory comes before data: physics chemistry Materials ... Where data comes before theory: Medicine business insights autonomy (smart car, building, city) 18
  • 19. Harp: Hadoop and Collective Communication for Iterative Computation 19 • Project http://salsaproj.indiana.edu/harp/ • Motivation: • Communication patterns are not abstracted and defined in Hadoop, Pregel/Giraph, Spark • In contrary, MPI has very fine grain based (collective) communication primitives (based on arrays and buffers) • Harp provides data abstractions and communication abstractions on top of them. It can be plugged into Hadoop runtime to enable efficient in-memory communication to avoid HDFS read/write.
  • 20. 20 Harp • The data abstraction has 3 categories horizontally and 3 levels vertically
  • 21. • Harp works as a plugin in Hadoop. The goal is to make Hadoop cluster can schedule original MapReduce jobs and Map-Collective jobs at the same time. • Collective communication is defined as movement of partitions within tables. • Collective communication requires synchronization. Hadoop scheduler is modified to schedule tasks in BSP style. • Fault tolerance with checkpointing. 21 Harp
  • 22. • For the data-scientist of nowadays; • Hub to facilite access to Amazon cloud resources, to share data, to collaborate between many people through notebooks; • RosettaHUB: https://www.rosettahub.com • Available through Amazon Educate program (https://s3.amazonaws.com/awseducate- list/AWS_Educate_Institutions.pdf) 22 RosettaHub
  • 23. 23
  • 24. 24
  • 25. 25
  • 26. 26
  • 27. SPC and the deployment of virtual machines IDV experience (Imageries du vivant) 27
  • 29. 29
  • 30. 30
  • 32. 32
  • 33. 33
  • 34. 34
  • 35. 35
  • 36. 36
  • 37. • Plateforme d'imagerie ePad http://pf-01.lab.parisdescartes.fr:1243/dcm4chee-web3/ • Plateforme PLM dans le cadre du projet DRIVE-SPC http://pf-01.lab.parisdescartes.fr:2345/tc/webclient • Plateforme Sis4web de SisNCom http://pf-01.lab.parisdescartes.fr:1323/sis4web/login.php • Benchsys http://pf-01.lab.parisdescartes.fr:9001/ • Owncloud http://pf-01.lab.parisdescartes.fr:1253/owncloud/index.php/login • … 37 Examples IDV services
  • 38. Conclusion and Future work Many, many opportunities 38
  • 39. • Computer hardware will continue to evolve; Trend: accelerators instead of GPUs; • Platforms will continue to evolve to facilitate the management of data, the computation on data; • Need for convergence/resonance between HPC and Big- data eco-systems; • SPC: needs for training, using, transforming the scientific approaches… with research engineers for accompany the change; • SPC: needs for projects… and to share betweenez communities. 39 Conclusion and perspective
  • 40. Thanks to Leila Abidi, Olivier Waldek, Roland Chervet 40