SlideShare ist ein Scribd-Unternehmen logo
1 von 29
Downloaden Sie, um offline zu lesen
© 2012 IBM Corporation1
Software and Hardware Infrastructures to conquer Data
Explosion in Life Science - Life Science Network Basel
Romeo Kienzler
Data Scientist and Architect, Pos. Graduate in Information Systems and Bioinformatics
IBM Innovation Center Zurich romeo.kienzler@ch.ibm.com
https://www.ibm.com/developerworks/mydeveloperworks/profiles/user/RomeoKienzler
© 2012 IBM Corporation2
Outline
●
Data Growth
●
Data Growth in Life Science
●
BigData in Life Science
●
How to address BigData?
●
Outlook
© 2012 IBM Corporation3
3
Data Growth
Data AVAILABLE to an
organization
data an organization can
PROCESS
Missed
opportunity
100 Million Tweets are posted every day, 35 hours of video are being uploaded every
minute,6.1 x 10^12 text messages have been sent in 2011 and 247 x 10^9 E-Mails passed
through the net.80 % spam and viruses. => Filtering is more and more important.
Up to 2003 the same amount of data has been produced as between 2003 and now
© 2012 IBM Corporation4
New Data Sources in Life Sciences
●
DNA (RNA) Sequencing
●
Next-Generation Sequencing
●
DNA Transistor
●
Imaging and Video
●
Unstructured Text
© 2012 IBM Corporation5
Data Growth in Life Sciences
Source: www.osehra.org
© 2012 IBM Corporation6
Data Growth in Life Sciences
Source: www.crops.org
© 2012 IBM Corporation7
Examples - NGS
© 2012 IBM Corporation8
Images and Videos
Source: www.phys.org
© 2012 IBM Corporation9
Examples – Text Analytics
Source: www.theglobalistreport.com
© 2012 IBM Corporation10
Watson
© 2012 IBM Corporation11
SIIB (Strategic IP Insight Platform)
Integrated chemical, biological and textual search
Deep analytics on scientific literature and patents
Aggregation of world wide Patent Data and scientific literature (30M+ docs) with ongoing updates
© 2012 IBM Corporation12
The challange
●
Store a huge amount of data
●
Process a huge amount of data
(incl. Search/Find)
●
Don't consume too much energy
© 2012 IBM Corporation13
Use many Hard Drives
© 2012 IBM Corporation14
Use many Hard Drives
© 2012 IBM Corporation15
Use many Hard Drives - Limits
(*) Given a Disk Capacity of 25TB
300 Crashes per Day, Data Loss after two weeks
© 2012 IBM Corporation16
Separate the Signal From the Noise¹
¹http://www.ibmsystemsmag.com/power/businessstrategy/BI-and-Analytics/signal_noise/
© 2012 IBM Corporation17
Store only what you need
© 2012 IBM Corporation18
Use many CPU's
Supercomputer before
➔
Weather
➔
Atom Bombs
➔
Science
➔
Crash Tests
Supercomputer in a Rack
➔
18 TB Main Memory, 1008 CPU Cores, 113 TFLOPS (1st
TOP500 2013: 17590 TFLOPS 2004: 71 TFLOPS)
© 2012 IBM Corporation19
© 2012 IBM Corporation20
Use specialized CPU's: GPUs
Source: www.ethz.ch
Source: www.nvidia.com
© 2012 IBM Corporation21
Use specialized CPU's: FPGA's
Source: www.virtex.com
© 2012 IBM Corporation22
Example FPGA: IBM Pure Data
●
Up to 1,28 PB Storage
●
Up to 10 Racks
●
Up to 500 GigaByte/s Throughput
●
Up to 1120 FPGA + 1120 Intel CPU Cores / 960 Hard Drives
© 2012 IBM Corporation23
Example FPGA: Conveycomputers
●
Accelerates BWA by 15x
●
Accelerates Smith-Waterman
Source: www.conveycomputer.com
© 2012 IBM Corporation24
Example: Algorithms
Source: www.biomedcentral.com/1471-2105/9/S2/S10
© 2012 IBM Corporation25
Example: Cloud
●
Managed Infrastructure
●
Dynamic Provisioning
●
Specialized HW
●
SaaS
Source: www.basespace.illumina.com
© 2012 IBM Corporation26
Conclusion
●
Main BigData Sources are Sequences and Plain Text
●
Many others to come (e.g. Images and Videos)
●
Store Data on many Commodity Hard Drives (Energy Problem not solved)
●
Filter Signal from Noise
●
Process Data on many CPU's
●
Usage of specialized Hardware / CPU's
●
Research in performance of algorithms
© 2012 IBM Corporation27
Outlook
●
Currently very heterogeneous infrastructures
●
Trends:
●
Virtualization
●
Standardization
●
Consumerization
●
Limits
●
Space
●
Energy consumption
●
What shall I do?
●
RELAX
© 2012 IBM Corporation28
The future will be full of surprises
A battery powered pocket size super
computer?
Raspberry Pi
Parallela
© 2012 IBM Corporation29
Acknowledgements
Slides 14 – 16 & 21 have been taken from a
Keynote speech of Axel Köster, IBM Germany

Weitere ähnliche Inhalte

Ähnlich wie Software and Hardware Infrastructures to conquer Data Explosion in Life Science - Life Science Network Basel

The datascientists workplace of the future, IBM developerDays 2014, Vienna by...
The datascientists workplace of the future, IBM developerDays 2014, Vienna by...The datascientists workplace of the future, IBM developerDays 2014, Vienna by...
The datascientists workplace of the future, IBM developerDays 2014, Vienna by...Romeo Kienzler
 
Introduction to Big Data by Manouj Bongirr
Introduction to Big Data by Manouj BongirrIntroduction to Big Data by Manouj Bongirr
Introduction to Big Data by Manouj BongirrPranav Kulkarni
 
Technology Outlook - The new Era of computing
Technology Outlook - The new Era of computingTechnology Outlook - The new Era of computing
Technology Outlook - The new Era of computingSwiss Big Data User Group
 
Flash Ahead: IBM Flash System Selling Point
Flash Ahead: IBM Flash System Selling PointFlash Ahead: IBM Flash System Selling Point
Flash Ahead: IBM Flash System Selling PointCTI Group
 
PuppetConf 2017 | Adobe Advertising Cloud: A Lean Puppet Workflow to Support ...
PuppetConf 2017 | Adobe Advertising Cloud: A Lean Puppet Workflow to Support ...PuppetConf 2017 | Adobe Advertising Cloud: A Lean Puppet Workflow to Support ...
PuppetConf 2017 | Adobe Advertising Cloud: A Lean Puppet Workflow to Support ...Nicolas Brousse
 
PuppetConf 2017: Adobe Advertising Cloud: Lean Puppet Workflow to Support Mul...
PuppetConf 2017: Adobe Advertising Cloud: Lean Puppet Workflow to Support Mul...PuppetConf 2017: Adobe Advertising Cloud: Lean Puppet Workflow to Support Mul...
PuppetConf 2017: Adobe Advertising Cloud: Lean Puppet Workflow to Support Mul...Puppet
 
Hardware and Software Architectures for the CELL BROADBAND ENGINE processor
Hardware and Software Architectures for the CELL BROADBAND ENGINE processorHardware and Software Architectures for the CELL BROADBAND ENGINE processor
Hardware and Software Architectures for the CELL BROADBAND ENGINE processorSlide_N
 
EMC Big Data Solutions Overview
EMC Big Data Solutions OverviewEMC Big Data Solutions Overview
EMC Big Data Solutions Overviewwalshe1
 
IBM Tape the future of tape
IBM Tape the future of tapeIBM Tape the future of tape
IBM Tape the future of tapeJosef Weingand
 
dell-emc-powerscale-for-ngs.pptx
dell-emc-powerscale-for-ngs.pptxdell-emc-powerscale-for-ngs.pptx
dell-emc-powerscale-for-ngs.pptxSriramFreelance
 
Transforming your Business with Scale-Out Flash: How MongoDB & Flash Accelera...
Transforming your Business with Scale-Out Flash: How MongoDB & Flash Accelera...Transforming your Business with Scale-Out Flash: How MongoDB & Flash Accelera...
Transforming your Business with Scale-Out Flash: How MongoDB & Flash Accelera...MongoDB
 
S100296 data-footprint-orlando-v1804a
S100296 data-footprint-orlando-v1804aS100296 data-footprint-orlando-v1804a
S100296 data-footprint-orlando-v1804aTony Pearson
 
Trender, inspirationer och visioner - Mikael Haglund #ibmbpsse18
Trender, inspirationer och visioner - Mikael Haglund #ibmbpsse18Trender, inspirationer och visioner - Mikael Haglund #ibmbpsse18
Trender, inspirationer och visioner - Mikael Haglund #ibmbpsse18IBM Sverige
 
Next generation storage: eliminating the guesswork and avoiding forklift upgrade
Next generation storage: eliminating the guesswork and avoiding forklift upgradeNext generation storage: eliminating the guesswork and avoiding forklift upgrade
Next generation storage: eliminating the guesswork and avoiding forklift upgradeJisc
 
Ibm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bkIbm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bkIBM Switzerland
 
Helathcare modernize-tebc105-v1704a
Helathcare modernize-tebc105-v1704aHelathcare modernize-tebc105-v1704a
Helathcare modernize-tebc105-v1704aTony Pearson
 
Cell Technology for Graphics and Visualization
Cell Technology for Graphics and VisualizationCell Technology for Graphics and Visualization
Cell Technology for Graphics and VisualizationSlide_N
 

Ähnlich wie Software and Hardware Infrastructures to conquer Data Explosion in Life Science - Life Science Network Basel (20)

The datascientists workplace of the future, IBM developerDays 2014, Vienna by...
The datascientists workplace of the future, IBM developerDays 2014, Vienna by...The datascientists workplace of the future, IBM developerDays 2014, Vienna by...
The datascientists workplace of the future, IBM developerDays 2014, Vienna by...
 
Introduction to Big Data by Manouj Bongirr
Introduction to Big Data by Manouj BongirrIntroduction to Big Data by Manouj Bongirr
Introduction to Big Data by Manouj Bongirr
 
Technology Outlook - The new Era of computing
Technology Outlook - The new Era of computingTechnology Outlook - The new Era of computing
Technology Outlook - The new Era of computing
 
Flash Ahead: IBM Flash System Selling Point
Flash Ahead: IBM Flash System Selling PointFlash Ahead: IBM Flash System Selling Point
Flash Ahead: IBM Flash System Selling Point
 
IBM PureSystems
IBM PureSystemsIBM PureSystems
IBM PureSystems
 
Deeplearningusingcloudpakfordata
DeeplearningusingcloudpakfordataDeeplearningusingcloudpakfordata
Deeplearningusingcloudpakfordata
 
PuppetConf 2017 | Adobe Advertising Cloud: A Lean Puppet Workflow to Support ...
PuppetConf 2017 | Adobe Advertising Cloud: A Lean Puppet Workflow to Support ...PuppetConf 2017 | Adobe Advertising Cloud: A Lean Puppet Workflow to Support ...
PuppetConf 2017 | Adobe Advertising Cloud: A Lean Puppet Workflow to Support ...
 
PuppetConf 2017: Adobe Advertising Cloud: Lean Puppet Workflow to Support Mul...
PuppetConf 2017: Adobe Advertising Cloud: Lean Puppet Workflow to Support Mul...PuppetConf 2017: Adobe Advertising Cloud: Lean Puppet Workflow to Support Mul...
PuppetConf 2017: Adobe Advertising Cloud: Lean Puppet Workflow to Support Mul...
 
Hardware and Software Architectures for the CELL BROADBAND ENGINE processor
Hardware and Software Architectures for the CELL BROADBAND ENGINE processorHardware and Software Architectures for the CELL BROADBAND ENGINE processor
Hardware and Software Architectures for the CELL BROADBAND ENGINE processor
 
EMC Big Data Solutions Overview
EMC Big Data Solutions OverviewEMC Big Data Solutions Overview
EMC Big Data Solutions Overview
 
IBM Tape the future of tape
IBM Tape the future of tapeIBM Tape the future of tape
IBM Tape the future of tape
 
dell-emc-powerscale-for-ngs.pptx
dell-emc-powerscale-for-ngs.pptxdell-emc-powerscale-for-ngs.pptx
dell-emc-powerscale-for-ngs.pptx
 
Transforming your Business with Scale-Out Flash: How MongoDB & Flash Accelera...
Transforming your Business with Scale-Out Flash: How MongoDB & Flash Accelera...Transforming your Business with Scale-Out Flash: How MongoDB & Flash Accelera...
Transforming your Business with Scale-Out Flash: How MongoDB & Flash Accelera...
 
The future of tape
The future of tapeThe future of tape
The future of tape
 
S100296 data-footprint-orlando-v1804a
S100296 data-footprint-orlando-v1804aS100296 data-footprint-orlando-v1804a
S100296 data-footprint-orlando-v1804a
 
Trender, inspirationer och visioner - Mikael Haglund #ibmbpsse18
Trender, inspirationer och visioner - Mikael Haglund #ibmbpsse18Trender, inspirationer och visioner - Mikael Haglund #ibmbpsse18
Trender, inspirationer och visioner - Mikael Haglund #ibmbpsse18
 
Next generation storage: eliminating the guesswork and avoiding forklift upgrade
Next generation storage: eliminating the guesswork and avoiding forklift upgradeNext generation storage: eliminating the guesswork and avoiding forklift upgrade
Next generation storage: eliminating the guesswork and avoiding forklift upgrade
 
Ibm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bkIbm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bk
 
Helathcare modernize-tebc105-v1704a
Helathcare modernize-tebc105-v1704aHelathcare modernize-tebc105-v1704a
Helathcare modernize-tebc105-v1704a
 
Cell Technology for Graphics and Visualization
Cell Technology for Graphics and VisualizationCell Technology for Graphics and Visualization
Cell Technology for Graphics and Visualization
 

Mehr von Romeo Kienzler

Parallelization Stategies of DeepLearning Neural Network Training
Parallelization Stategies of DeepLearning Neural Network TrainingParallelization Stategies of DeepLearning Neural Network Training
Parallelization Stategies of DeepLearning Neural Network TrainingRomeo Kienzler
 
Cognitive IoT using DeepLearning on data parallel frameworks like Spark & Flink
Cognitive IoT using DeepLearning on data parallel frameworks like Spark & FlinkCognitive IoT using DeepLearning on data parallel frameworks like Spark & Flink
Cognitive IoT using DeepLearning on data parallel frameworks like Spark & FlinkRomeo Kienzler
 
Love & Innovative technology presented by a technology pioneer and an AI expe...
Love & Innovative technology presented by a technology pioneer and an AI expe...Love & Innovative technology presented by a technology pioneer and an AI expe...
Love & Innovative technology presented by a technology pioneer and an AI expe...Romeo Kienzler
 
Blockchain Technology Book Vernisage
Blockchain Technology Book VernisageBlockchain Technology Book Vernisage
Blockchain Technology Book VernisageRomeo Kienzler
 
Architecture of the Hyperledger Blockchain Fabric - Christian Cachin - IBM Re...
Architecture of the Hyperledger Blockchain Fabric - Christian Cachin - IBM Re...Architecture of the Hyperledger Blockchain Fabric - Christian Cachin - IBM Re...
Architecture of the Hyperledger Blockchain Fabric - Christian Cachin - IBM Re...Romeo Kienzler
 
IBM Middle East Data Science Connect 2016 - Doha, Qatar
IBM Middle East Data Science Connect 2016 - Doha, QatarIBM Middle East Data Science Connect 2016 - Doha, Qatar
IBM Middle East Data Science Connect 2016 - Doha, QatarRomeo Kienzler
 
Apache SystemML - Declarative Large-Scale Machine Learning
Apache SystemML - Declarative Large-Scale Machine LearningApache SystemML - Declarative Large-Scale Machine Learning
Apache SystemML - Declarative Large-Scale Machine LearningRomeo Kienzler
 
Intro to DeepLearning4J on ApacheSpark SDS DL Workshop 16
Intro to DeepLearning4J on ApacheSpark SDS DL Workshop 16Intro to DeepLearning4J on ApacheSpark SDS DL Workshop 16
Intro to DeepLearning4J on ApacheSpark SDS DL Workshop 16Romeo Kienzler
 
DeepLearning and Advanced Machine Learning on IoT
DeepLearning and Advanced Machine Learning on IoTDeepLearning and Advanced Machine Learning on IoT
DeepLearning and Advanced Machine Learning on IoTRomeo Kienzler
 
Real-time DeepLearning on IoT Sensor Data
Real-time DeepLearning on IoT Sensor DataReal-time DeepLearning on IoT Sensor Data
Real-time DeepLearning on IoT Sensor DataRomeo Kienzler
 
Cloud scale predictive DevOps automation using Apache Spark: Velocity in Amst...
Cloud scale predictive DevOps automation using Apache Spark: Velocity in Amst...Cloud scale predictive DevOps automation using Apache Spark: Velocity in Amst...
Cloud scale predictive DevOps automation using Apache Spark: Velocity in Amst...Romeo Kienzler
 
Scala, Apache Spark, The PlayFramework and Docker in IBM Platform As A Service
Scala, Apache Spark, The PlayFramework and Docker in IBM Platform As A ServiceScala, Apache Spark, The PlayFramework and Docker in IBM Platform As A Service
Scala, Apache Spark, The PlayFramework and Docker in IBM Platform As A ServiceRomeo Kienzler
 
IBM Watson Technical Deep Dive Swiss Group for Artificial Intelligence and Co...
IBM Watson Technical Deep Dive Swiss Group for Artificial Intelligence and Co...IBM Watson Technical Deep Dive Swiss Group for Artificial Intelligence and Co...
IBM Watson Technical Deep Dive Swiss Group for Artificial Intelligence and Co...Romeo Kienzler
 
TDWI_DW2014_SQLNoSQL_DBAAS
TDWI_DW2014_SQLNoSQL_DBAASTDWI_DW2014_SQLNoSQL_DBAAS
TDWI_DW2014_SQLNoSQL_DBAASRomeo Kienzler
 
Cloudant Overview Bluemix Meetup from Lisa Neddam
Cloudant Overview Bluemix Meetup from Lisa NeddamCloudant Overview Bluemix Meetup from Lisa Neddam
Cloudant Overview Bluemix Meetup from Lisa NeddamRomeo Kienzler
 
The European Conference on Software Architecture (ECSA) 14 - IBM BigData Refe...
The European Conference on Software Architecture (ECSA) 14 - IBM BigData Refe...The European Conference on Software Architecture (ECSA) 14 - IBM BigData Refe...
The European Conference on Software Architecture (ECSA) 14 - IBM BigData Refe...Romeo Kienzler
 
DBaaS Bluemix Meetup DACH 26.8.14
DBaaS Bluemix Meetup DACH 26.8.14DBaaS Bluemix Meetup DACH 26.8.14
DBaaS Bluemix Meetup DACH 26.8.14Romeo Kienzler
 
Data Science Connect, July 22nd 2014 @IBM Innovation Center Zurich
Data Science Connect, July 22nd 2014 @IBM Innovation Center ZurichData Science Connect, July 22nd 2014 @IBM Innovation Center Zurich
Data Science Connect, July 22nd 2014 @IBM Innovation Center ZurichRomeo Kienzler
 
Cloud Databases, Developer Week Nuernberg 2014
Cloud Databases, Developer Week Nuernberg 2014Cloud Databases, Developer Week Nuernberg 2014
Cloud Databases, Developer Week Nuernberg 2014Romeo Kienzler
 

Mehr von Romeo Kienzler (20)

Parallelization Stategies of DeepLearning Neural Network Training
Parallelization Stategies of DeepLearning Neural Network TrainingParallelization Stategies of DeepLearning Neural Network Training
Parallelization Stategies of DeepLearning Neural Network Training
 
Cognitive IoT using DeepLearning on data parallel frameworks like Spark & Flink
Cognitive IoT using DeepLearning on data parallel frameworks like Spark & FlinkCognitive IoT using DeepLearning on data parallel frameworks like Spark & Flink
Cognitive IoT using DeepLearning on data parallel frameworks like Spark & Flink
 
Love & Innovative technology presented by a technology pioneer and an AI expe...
Love & Innovative technology presented by a technology pioneer and an AI expe...Love & Innovative technology presented by a technology pioneer and an AI expe...
Love & Innovative technology presented by a technology pioneer and an AI expe...
 
Blockchain Technology Book Vernisage
Blockchain Technology Book VernisageBlockchain Technology Book Vernisage
Blockchain Technology Book Vernisage
 
Architecture of the Hyperledger Blockchain Fabric - Christian Cachin - IBM Re...
Architecture of the Hyperledger Blockchain Fabric - Christian Cachin - IBM Re...Architecture of the Hyperledger Blockchain Fabric - Christian Cachin - IBM Re...
Architecture of the Hyperledger Blockchain Fabric - Christian Cachin - IBM Re...
 
IBM Middle East Data Science Connect 2016 - Doha, Qatar
IBM Middle East Data Science Connect 2016 - Doha, QatarIBM Middle East Data Science Connect 2016 - Doha, Qatar
IBM Middle East Data Science Connect 2016 - Doha, Qatar
 
Apache SystemML - Declarative Large-Scale Machine Learning
Apache SystemML - Declarative Large-Scale Machine LearningApache SystemML - Declarative Large-Scale Machine Learning
Apache SystemML - Declarative Large-Scale Machine Learning
 
Intro to DeepLearning4J on ApacheSpark SDS DL Workshop 16
Intro to DeepLearning4J on ApacheSpark SDS DL Workshop 16Intro to DeepLearning4J on ApacheSpark SDS DL Workshop 16
Intro to DeepLearning4J on ApacheSpark SDS DL Workshop 16
 
DeepLearning and Advanced Machine Learning on IoT
DeepLearning and Advanced Machine Learning on IoTDeepLearning and Advanced Machine Learning on IoT
DeepLearning and Advanced Machine Learning on IoT
 
Geo Python16 keynote
Geo Python16 keynoteGeo Python16 keynote
Geo Python16 keynote
 
Real-time DeepLearning on IoT Sensor Data
Real-time DeepLearning on IoT Sensor DataReal-time DeepLearning on IoT Sensor Data
Real-time DeepLearning on IoT Sensor Data
 
Cloud scale predictive DevOps automation using Apache Spark: Velocity in Amst...
Cloud scale predictive DevOps automation using Apache Spark: Velocity in Amst...Cloud scale predictive DevOps automation using Apache Spark: Velocity in Amst...
Cloud scale predictive DevOps automation using Apache Spark: Velocity in Amst...
 
Scala, Apache Spark, The PlayFramework and Docker in IBM Platform As A Service
Scala, Apache Spark, The PlayFramework and Docker in IBM Platform As A ServiceScala, Apache Spark, The PlayFramework and Docker in IBM Platform As A Service
Scala, Apache Spark, The PlayFramework and Docker in IBM Platform As A Service
 
IBM Watson Technical Deep Dive Swiss Group for Artificial Intelligence and Co...
IBM Watson Technical Deep Dive Swiss Group for Artificial Intelligence and Co...IBM Watson Technical Deep Dive Swiss Group for Artificial Intelligence and Co...
IBM Watson Technical Deep Dive Swiss Group for Artificial Intelligence and Co...
 
TDWI_DW2014_SQLNoSQL_DBAAS
TDWI_DW2014_SQLNoSQL_DBAASTDWI_DW2014_SQLNoSQL_DBAAS
TDWI_DW2014_SQLNoSQL_DBAAS
 
Cloudant Overview Bluemix Meetup from Lisa Neddam
Cloudant Overview Bluemix Meetup from Lisa NeddamCloudant Overview Bluemix Meetup from Lisa Neddam
Cloudant Overview Bluemix Meetup from Lisa Neddam
 
The European Conference on Software Architecture (ECSA) 14 - IBM BigData Refe...
The European Conference on Software Architecture (ECSA) 14 - IBM BigData Refe...The European Conference on Software Architecture (ECSA) 14 - IBM BigData Refe...
The European Conference on Software Architecture (ECSA) 14 - IBM BigData Refe...
 
DBaaS Bluemix Meetup DACH 26.8.14
DBaaS Bluemix Meetup DACH 26.8.14DBaaS Bluemix Meetup DACH 26.8.14
DBaaS Bluemix Meetup DACH 26.8.14
 
Data Science Connect, July 22nd 2014 @IBM Innovation Center Zurich
Data Science Connect, July 22nd 2014 @IBM Innovation Center ZurichData Science Connect, July 22nd 2014 @IBM Innovation Center Zurich
Data Science Connect, July 22nd 2014 @IBM Innovation Center Zurich
 
Cloud Databases, Developer Week Nuernberg 2014
Cloud Databases, Developer Week Nuernberg 2014Cloud Databases, Developer Week Nuernberg 2014
Cloud Databases, Developer Week Nuernberg 2014
 

Kürzlich hochgeladen

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 

Kürzlich hochgeladen (20)

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 

Software and Hardware Infrastructures to conquer Data Explosion in Life Science - Life Science Network Basel

  • 1. © 2012 IBM Corporation1 Software and Hardware Infrastructures to conquer Data Explosion in Life Science - Life Science Network Basel Romeo Kienzler Data Scientist and Architect, Pos. Graduate in Information Systems and Bioinformatics IBM Innovation Center Zurich romeo.kienzler@ch.ibm.com https://www.ibm.com/developerworks/mydeveloperworks/profiles/user/RomeoKienzler
  • 2. © 2012 IBM Corporation2 Outline ● Data Growth ● Data Growth in Life Science ● BigData in Life Science ● How to address BigData? ● Outlook
  • 3. © 2012 IBM Corporation3 3 Data Growth Data AVAILABLE to an organization data an organization can PROCESS Missed opportunity 100 Million Tweets are posted every day, 35 hours of video are being uploaded every minute,6.1 x 10^12 text messages have been sent in 2011 and 247 x 10^9 E-Mails passed through the net.80 % spam and viruses. => Filtering is more and more important. Up to 2003 the same amount of data has been produced as between 2003 and now
  • 4. © 2012 IBM Corporation4 New Data Sources in Life Sciences ● DNA (RNA) Sequencing ● Next-Generation Sequencing ● DNA Transistor ● Imaging and Video ● Unstructured Text
  • 5. © 2012 IBM Corporation5 Data Growth in Life Sciences Source: www.osehra.org
  • 6. © 2012 IBM Corporation6 Data Growth in Life Sciences Source: www.crops.org
  • 7. © 2012 IBM Corporation7 Examples - NGS
  • 8. © 2012 IBM Corporation8 Images and Videos Source: www.phys.org
  • 9. © 2012 IBM Corporation9 Examples – Text Analytics Source: www.theglobalistreport.com
  • 10. © 2012 IBM Corporation10 Watson
  • 11. © 2012 IBM Corporation11 SIIB (Strategic IP Insight Platform) Integrated chemical, biological and textual search Deep analytics on scientific literature and patents Aggregation of world wide Patent Data and scientific literature (30M+ docs) with ongoing updates
  • 12. © 2012 IBM Corporation12 The challange ● Store a huge amount of data ● Process a huge amount of data (incl. Search/Find) ● Don't consume too much energy
  • 13. © 2012 IBM Corporation13 Use many Hard Drives
  • 14. © 2012 IBM Corporation14 Use many Hard Drives
  • 15. © 2012 IBM Corporation15 Use many Hard Drives - Limits (*) Given a Disk Capacity of 25TB 300 Crashes per Day, Data Loss after two weeks
  • 16. © 2012 IBM Corporation16 Separate the Signal From the Noise¹ ¹http://www.ibmsystemsmag.com/power/businessstrategy/BI-and-Analytics/signal_noise/
  • 17. © 2012 IBM Corporation17 Store only what you need
  • 18. © 2012 IBM Corporation18 Use many CPU's Supercomputer before ➔ Weather ➔ Atom Bombs ➔ Science ➔ Crash Tests Supercomputer in a Rack ➔ 18 TB Main Memory, 1008 CPU Cores, 113 TFLOPS (1st TOP500 2013: 17590 TFLOPS 2004: 71 TFLOPS)
  • 19. © 2012 IBM Corporation19
  • 20. © 2012 IBM Corporation20 Use specialized CPU's: GPUs Source: www.ethz.ch Source: www.nvidia.com
  • 21. © 2012 IBM Corporation21 Use specialized CPU's: FPGA's Source: www.virtex.com
  • 22. © 2012 IBM Corporation22 Example FPGA: IBM Pure Data ● Up to 1,28 PB Storage ● Up to 10 Racks ● Up to 500 GigaByte/s Throughput ● Up to 1120 FPGA + 1120 Intel CPU Cores / 960 Hard Drives
  • 23. © 2012 IBM Corporation23 Example FPGA: Conveycomputers ● Accelerates BWA by 15x ● Accelerates Smith-Waterman Source: www.conveycomputer.com
  • 24. © 2012 IBM Corporation24 Example: Algorithms Source: www.biomedcentral.com/1471-2105/9/S2/S10
  • 25. © 2012 IBM Corporation25 Example: Cloud ● Managed Infrastructure ● Dynamic Provisioning ● Specialized HW ● SaaS Source: www.basespace.illumina.com
  • 26. © 2012 IBM Corporation26 Conclusion ● Main BigData Sources are Sequences and Plain Text ● Many others to come (e.g. Images and Videos) ● Store Data on many Commodity Hard Drives (Energy Problem not solved) ● Filter Signal from Noise ● Process Data on many CPU's ● Usage of specialized Hardware / CPU's ● Research in performance of algorithms
  • 27. © 2012 IBM Corporation27 Outlook ● Currently very heterogeneous infrastructures ● Trends: ● Virtualization ● Standardization ● Consumerization ● Limits ● Space ● Energy consumption ● What shall I do? ● RELAX
  • 28. © 2012 IBM Corporation28 The future will be full of surprises A battery powered pocket size super computer? Raspberry Pi Parallela
  • 29. © 2012 IBM Corporation29 Acknowledgements Slides 14 – 16 & 21 have been taken from a Keynote speech of Axel Köster, IBM Germany