Trends from the Trenches
2012 Bio-IT World Asia, Singapore




I’m Chris.

I’m an infrastructure geek.

I work for the BioTeam.

BioTeam
Who, what & why




 ‣ Independent consulting shop
 ‣ Staffed by scientists forced to
   learn IT, SW & HPC to get our
   own research done
 ‣ 10+ years bridging the “gap”
   between science, IT & high
   performance computing

BioTeam
Why we get invited to these sorts of talks ...


  ‣ Lots of people hire us across a
    wide range of project types
    •   Pharma, Biotech, EDU,
        Nonprofit, .Gov, .Mil, etc.
  ‣ We get to see how groups of
    smart people approach similar
    problems
  ‣ We can speak honestly &
    objectively about what we see
    “in the real world”
Disclaimer.




Listen to me at your own risk
Seriously.



  ‣ I’m not an expert, pundit,
    visionary or “thought leader”
  ‣ All career success entirely due
    to shamelessly copying what
    actual smart people do
  ‣ I’m biased, burnt-out & cynical
  ‣ Filter my words accordingly

 1. Introduction
 2. Business & Marketplace
 3. Storage
 4. Cloud
 5. Hot for ’12 ...
Business Landscape
So far 2012 feels a lot like 2011 ...


Business & Meta Observations
More of the same in ’12 ...


 ‣ ~4 staff full time on issues involving data handling, data
   management and multi-instrument Next-Gen
   sequencing/analysis
 ‣ ~2 staff full time on infrastructure, storage and facility
   related projects
   •   Dwan: Big infrastructure & facility projects for Fortune 20
       companies, research consortia & .GOV customers
   •   Dag: 40% infrastructure, 20% storage, 20% cloud

 ‣ ~1 staff full time on Amazon Cloud projects
What that tells us


 ‣ Same problem(s) as last year
 ‣ Next-gen sequencing still
   causing a lot of pain when it
   comes to data handling,
   storage, organization &
   integration
 ‣ As sequencing continues to be
   commoditized, this will likely
   only get worse
Storage
Science-centric Storage
Current State Assessment




 ‣ Storage still making me crazy in ’12




Science-centric Storage
Why I’m not worried




 ‣ Peta-capable storage is trivial to acquire in 2012
 ‣ Scale-out NAS has won the battle
 ‣ It’s simply not as hard/risky as it used to be



On the other hand ...




OMG! The Sky Is Falling!
Maybe a little panic is appropriate ...




The sky IS falling!
Uncomfortable truths


‣ Cost of acquiring data (genomes)
  falling faster than rate at which
  industry is increasing drive capacity
‣ Human researchers downstream of
  these datasets are also consuming
  more storage (and less predictably)
‣ High-scale labs must react or
  potentially have catastrophic issues
  in 2012-2013

The sky IS falling!
Current Practices Are Not Sustainable

 ‣ FACT: Chemistry changing faster than we can refresh our
   datacenters and research IT infrastructure
 ‣ FACT: Rate at which we can cheaply acquire interesting data
   exceeds rate at which storage companies can increase the
   capacity of their products
 ‣ FACT: We are poor at managing, tagging, valuing & curating our
   data. Few scientists really understand true cost/complexity
   involved with keeping data safe, online & accessible
 ‣ FACT: In 2012 people still think “keep everything online, forever”
   is a viable demand to be making of IT staff
 ‣ FACT: Something is going to break. Soon.
CRAM it.



The sky IS falling!
CRAM it in 2012 ...

 ‣ Minor improvements are useless; order-of-magnitude needed
 ‣ Some people are talking about radical new methods –
   compressing against reference sequences and only storing the
   diffs
   •   With a variable compression “quality budget” to spend on
       lossless techniques in the areas you care about
 ‣ http://biote.am/5v - Ewan Birney on “Compressing DNA”
 ‣ http://biote.am/5w - The actual CRAM paper
 ‣ If CRAM takes off, storage landscape will change
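
The "compress against a reference and only store the diffs" idea can be sketched in a few lines. This is a toy illustration, not the actual CRAM format: the function names are hypothetical, it handles substitutions only (no insertions, deletions or quality scores), and the reference and read are made-up strings.

```python
def encode_read(reference, pos, read):
    """Store a read as (position, length, substitutions) relative to a reference."""
    window = reference[pos:pos + len(read)]
    if len(window) != len(read):
        raise ValueError("read runs past the end of the reference")
    # Keep only the offsets where the read disagrees with the reference.
    diffs = [(i, base)
             for i, (ref_base, base) in enumerate(zip(window, read))
             if ref_base != base]
    return pos, len(read), diffs


def decode_read(reference, record):
    """Rebuild the original read from the reference plus the stored diffs."""
    pos, length, diffs = record
    bases = list(reference[pos:pos + length])
    for offset, base in diffs:
        bases[offset] = base
    return "".join(bases)


# Made-up example data.
reference = "ACGTACGTTAGCCGATAGGCTA"
read = "TAGCCGTTAG"  # matches reference[8:18] except for one base

record = encode_read(reference, 8, read)
restored = decode_read(reference, record)
```

Here a 10-base read collapses to a position, a length and a single substitution, which is where the large savings come from when most reads match the reference closely.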
Storage: What comes next?
        Next 18 months will be really fun...
What comes next.
The same rules apply for 2012 and beyond ...


 ‣ Accept that science changes faster than IT infrastructure
 ‣ Be glad you are not Broad/Sanger/BGI/NCBI
 ‣ Flexibility, scalability and agility become the key
   requirements of research informatics platforms
   •   Tiered storage is in your future ...
 ‣ Shared/concurrent access is still the overwhelming
   storage use case

What comes next.
In the following year ...


 ‣ Many peta-scale capable systems deployed
   •   Most will operate in the hundreds-of-TBs range
 ‣ Far more aggressive “data triage”
 ‣ Genome compression via CRAM
 ‣ Even more data will sit untouched & unloved
 ‣ Growing need for tiers, HSM & even tape
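
One way to picture "data triage" across tiers is a simple age-based placement rule. The sketch below is only an illustration: the thresholds, tier names and the `choose_tier` function are assumptions, not something from a real HSM product or from this talk.

```python
# Hypothetical thresholds, for illustration only; tune them to your own data.
HOT_DAYS = 30      # touched within the last month
WARM_DAYS = 365    # touched within the last year


def choose_tier(days_since_last_access):
    """Pick a storage tier from how long a dataset has sat untouched."""
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access must be non-negative")
    if days_since_last_access <= HOT_DAYS:
        return "scale-out NAS"
    if days_since_last_access <= WARM_DAYS:
        return "nearline"
    return "tape / archive"


# Example: three made-up datasets and their ages in days.
placements = {name: choose_tier(age)
              for name, age in {"run_a": 3, "run_b": 200, "run_c": 900}.items()}
```

Real policies would also weigh things like cost to regenerate the data and regulatory retention, which a single age threshold cannot capture.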


What comes next.
In the following year ...

 ‣ Broad and others are paving the way with respect to
   metadata-aware & policy driven storage frameworks
   •   And we’ll shamelessly copy a year or two later
 ‣ I’m still on my cloud storage kick
   •   Economics are inescapable; Will be built into storage
       platforms, gateways & VMs
   •   Cloud object stores are only an HTTP RESTful call away
   •   Cloud will become “just another tier”

What comes next.
Expect your storage to be smarter & more capable ...


 ‣ What do DDN, Panasas, Isilon,
   BlueArc, etc. have in common?
   •   Under the hood they all run
       Unix or Unix-like OS’s on
       x86_64 architectures
 ‣ Some storage arrays can
   already run applications natively
   •   More will follow
   •   Likely a big trend for 2012
Storage: The road ahead
                My $.02 for 2012...
The Road Ahead
                      Trends & Tips for 2012

‣ Peta-capable platforms required
‣ Scale-out NAS still the best fit
‣ Customers will no longer build one
  big scale-out NAS tier
‣ My ‘hack’ of using nearline spec
  storage as primary science tier is
  obsolete in ’12
‣ pNFS mainstream in 2012?
‣ Not everything is worth backing up
‣ Expect disruptive stuff
The Road Ahead
                      Trends & Tips for 2012


‣ Your storage will be able to run apps
  •   Dedupe, cloud gateways &
      replication
  •   ‘CRAM’ or similar compression
  •   Storage Resource Brokers
      (iRODS) & metadata servers
  •   HDFS/Hadoop hooks?
  •   Lab, Data management & LIMS
      applications
  [Image: Drobo Appliance running BioTeam MiniLIMS internally...]


The Road Ahead
                        Trends & Tips for 2012


‣ Hadoop / MapReduce / BigData
  •   Just like GRID and CLOUD the
      space is being over-hyped
  •   You still need to think about it
  •   ... and have a roadmap for doing it
  •   Deep, deep ties to your storage
  •   Your users want/need it
  •   My $.02? Fantastic cloud use case

Disruptive Storage Example

Backblaze Pod For Biotech

100 Terabytes for $12,000 USD
http://bioteam.net/tag/backblaze/
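
The headline figure works out to $120 per terabyte. A minimal sketch of that arithmetic follows, assuming decimal terabytes and counting raw hardware only (no power, cooling, staff time or redundancy); the function names are hypothetical.

```python
import math

# Figures from the slide; everything else here is an illustrative assumption.
POD_CAPACITY_TB = 100
POD_COST_USD = 12_000


def cost_per_tb():
    """Raw hardware cost per terabyte for one pod."""
    return POD_COST_USD / POD_CAPACITY_TB


def pods_needed(target_tb):
    """Whole pods required to reach a target raw capacity."""
    return math.ceil(target_tb / POD_CAPACITY_TB)


def raw_hardware_cost(target_tb):
    """Hardware-only cost of enough pods for the target capacity."""
    return pods_needed(target_tb) * POD_COST_USD


per_tb = cost_per_tb()                  # 120.0 USD per TB
petabyte_cost = raw_hardware_cost(1000)  # ten pods for 1 PB of raw capacity
```

At this price a raw petabyte is ten pods, or $120,000 of hardware, before any of the operational costs that make up the real bill.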

Storage Future Feels Like This ...
Multiple Tiers, Multiple Vendors, Multiple Products



The ‘C’ word
   Does a Bio-IT talk exist if it does not mention “the cloud”?
Cloud Stuff


 ‣ Before I make some blunt comments ...
 ‣ I am not an Amazon Cloud shill
 ‣ I am a jaded, cynical, zero-loyalty consumer of IT
   services and products that let me get work done
 ‣ Because I only get paid when my solutions work, I am
   picky about what tools I keep in my toolkit
 ‣ Amazon Web Services is a fantastic tool

So you think
you have a cloud?
No APIs?
Not a cloud.
No self-service?
  Not a cloud.
Installing VMware
& issuing a press release?
     Not a cloud.
Block storage
and virtual servers only?

    (Barely) a cloud.
Amazon is the IaaS Cloud Leader

‣ Why Amazon is attractive for infrastructure clouds:
  •       Anyone can do virtual servers and block/object storage
  •       Bio-IT needs “more stuff” in order to get real work done
  •       AWS product & service stack (“the glue”) is far more
          comprehensive than that of any cloud competitor
      -        Need some examples?
           -     ElasticIP, VPC, IAM, SQS, SNS, SES, SimpleDB,
                 DynamoDB, CloudFormation, ElasticBeanstalk, SWF,
                 DirectConnect, etc.

Amazon Cloud Dominance Could Be A Good Thing

 ‣ Amazon Cloud Dominance May Be Good For Bio-IT
 ‣ The competition must innovate in really interesting ways
   in order to compete. This is already happening.
   •   Purpose-built platforms for regulated/compliant operation
   •   “Hands-on” Managed Services for Healthcare/Pharma
   •   Hybrid on-premise/off-premise solutions
   •   Full life science solution & software service stacks
   •   Bespoke Service Level Agreements (SLAs)
   •   ...
Private Clouds
            My $.02
Private Clouds in 2012:


 ‣ I’m no longer dismissing them as “useless”
 ‣ Usable & useful in certain situations
 ‣ Hype vs. Reality ratio still unbalanced
 ‣ Sensible only for certain environments
   •   Have you seen what you have to do
       to your networks & gear?
 ‣ There are easier ways
Private Clouds: My Advice for ’12



 ‣ Remain cynical (test vendor claims)
 ‣ Due Diligence still essential
 ‣ I personally would not deploy anything that does not
   explicitly provide Amazon API compatibility
Private Clouds: My Advice for ’12



 Most people are better off:
  1. Adding VM platforms to existing HPC clusters &
     environments
  2. Extending enterprise VM platforms to allow user self-
     service & server catalogs
Cloud Advice
               My $.02
Cloud Advice
Don’t get left behind




 ‣ Research IT Organizations need a cloud strategy today
 ‣ Those that don’t will be bypassed by frustrated users
 ‣ IaaS cloud services are only a departmental credit card
   away ... and some senior scientists are too big to be fired
   for violating IT policy


Cloud Advice
Design Patterns




 ‣ You will need three tested cloud design patterns:


 ‣ (1) To handle ‘legacy’ scientific apps & workflows
 ‣ (2) The special stuff that is worth re-architecting
 ‣ (3) Hadoop & big data analytics


Cloud Advice
(1) Legacy HPC on the Cloud




 ‣ MIT StarCluster
   •   http://web.mit.edu/star/cluster/
 ‣ This is your baseline for legacy apps on ‘the cloud’
 ‣ Extend as needed



Cloud Advice
(2) “Cloudy” HPC




 ‣ Some of our research workflows are important enough to
   be rewritten for “the cloud” and the advantages that a
   truly elastic & API-driven infrastructure can deliver
 ‣ This is where you have the most freedom
 ‣ Many published best practices you can borrow
 ‣ Good commercial options: Cycle Computing, BT, etc.

Cloud Advice
(3) Big Data HPC




 ‣ It will be a MapReduce world, get used to it
 ‣ Little need to roll your own Hadoop in 2012
 ‣ ISV & commercial ecosystem already healthy
 ‣ Multiple providers today; both onsite & cloud-based
 ‣ Often an excellent cloud use case
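
For anyone who has not met the MapReduce pattern yet, here is a toy, single-process sketch of its three phases (map, shuffle, reduce) applied to counting k-mers in sequencing reads. It is plain Python rather than Hadoop, the function names are hypothetical, and the reads are made up.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k):
    """Map phase: emit a (k-mer, 1) pair for every window in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce phase: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


# Made-up reads for illustration.
reads = ["ACGTAC", "GTACGT"]
pairs = chain.from_iterable(map_kmers(read, 3) for read in reads)
counts = reduce_counts(shuffle(pairs))
```

A framework such as Hadoop runs the same map and reduce steps across many machines and handles the shuffle, scheduling and storage for you, which is why the pattern ties so closely to your storage design.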


Cloud Data Movement
            My $.02
Cloud Data Movement




‣ Over several years we have participated in a number of
  large “cloud data movement” efforts
‣ We used to be big fans of physical media movement
‣ However ...



Physical Data Movement Is Not Easy.




Cloud Data Movement



‣ At first glance, physical data movement “seems easy”
‣ It’s not. It is hard to do correctly and requires significant
  human effort and operational resources
‣ This has been a hard lesson learned over several years
‣ We have a new strategy for 2012 and the next image
  shows why ...

March 2012
Cloud Data Movement
Wow!

 ‣ With a 1GbE internet connection ...
 ‣ and using Aspera software ...
 ‣ We sustained 700 Mb/sec for more than 7 hours
   freighting genomes into Amazon Web Services
 ‣ This is fast enough for many use cases, including
   genome sequencing core facilities*
 ‣ Chris Dwan’s webinar on this topic:
   http://biote.am/7e
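
To put the 700 Mb/sec figure in perspective, the sketch below converts it into data moved. It assumes decimal units (1 Mb = 1,000,000 bits, 1 TB = 10^12 bytes) and a perfectly steady rate; the function names and the 100 TB example dataset are illustrative assumptions.

```python
# Figures from the slide.
RATE_MBIT_PER_S = 700
LINK_MBIT_PER_S = 1000   # a 1GbE link
HOURS = 7


def terabytes_moved(rate_mbit_per_s, hours):
    """Decimal terabytes transferred at a steady rate over a number of hours."""
    bits = rate_mbit_per_s * 1_000_000 * hours * 3600
    return bits / 8 / 1e12


def days_to_move(terabytes, rate_mbit_per_s):
    """Days needed to move a dataset of the given size at a steady rate."""
    seconds = terabytes * 1e12 * 8 / (rate_mbit_per_s * 1_000_000)
    return seconds / 86400


moved_tb = terabytes_moved(RATE_MBIT_PER_S, HOURS)      # about 2.2 TB
utilisation = RATE_MBIT_PER_S / LINK_MBIT_PER_S          # 0.7 of the link
days_for_100_tb = days_to_move(100, RATE_MBIT_PER_S)     # roughly 13 days
```

Seven hours at that rate is roughly 2.2 TB, using about 70% of the link; since the run lasted "more than 7 hours", that is a lower bound. A hypothetical 100 TB dataset would take about two weeks at the same rate.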

Cloud Data Movement
Wow!



 ‣ Results like this mean we now favor network-based data
   movement over physical media movement
 ‣ Large-scale physical data movement carries a high
   operational burden and consumes non-trivial staff time &
   resources
 ‣ *Unclear if our experience holds true for Asia or
   Asia-EU-Americas data transfers


Cloud Data Movement
There are three ways to do network data movement ...


 ‣ (1) Buy software from Aspera and be done with it
 ‣ (2) Attend the annual SuperComputing conference & see
   which student group wins the bandwidth challenge
   contest; use their code
 ‣ (3) Get GridFTP from the Globus folks
   •   Trend: At every single “data movement” talk I went to in
       2011, it seemed that any speaker who was NOT using Aspera
       was a very happy user of GridFTP. #notCoincidence


Hot topics for 2012 ...
Hot for ’12
BioTeam side projects & research interests




 ‣ I’d like to wrap up with some topics we think are
   interesting
 ‣ Who knows? These might be trends for 2013!




Siri Voice Control of Instruments/Pipelines

 ‣ BioTeam recently revealed
   work with BT and Accelrys
 ‣ Demonstrated Siri voice
   control of a Pipeline Pilot
   experiment running in the BT
   Compute Cloud
 ‣ http://biote.am/7h
 ‣ We expect to continue doing
   cool things with Siri in ’12
Smart Storage & Lab-local Appliances

 ‣ I firmly expect the “storage
   arrays running apps & VMs”
   trend to go mainstream
 ‣ This has beneficial implications
   for life science informatics
 ‣ We’ll be hitting this topic hard
   on systems ranging from Drobo
   to DataDirect
 ‣ Also working with the Intel
   Modular Server concept
Lab Local Appliances
Intel Modular Server


  ‣ Interesting hardware
    combination; storage +
    servers + native
    hypervisor
  ‣ VM Pool 1: MiniLIMS +
    other useful lab software
  ‣ VM Pool 2: Amazon
    Storage Gateway
    Appliance
  ‣ Server Blade 3:
    BrightCluster HPC Stack
  ‣ http://biote.am/7i
Cloud, Community & Orchestration


‣ The emerging class of “DevOps” and “Infrastructure
  Automation” methods are incredibly interesting
  •   We love Opscode & Chef (http://opscode.com)
‣ We’ll be doing more with systems orchestration in ’12
  •   And hopefully expanding our community collection of
      useful Chef cookbooks for life science informatics
‣ We also still love MIT StarCluster and will hopefully be
  contributing plugins and enhancements
Thanks!
Slides online at: http://slideshare.net/chrisdag/

Webinar: 5 Reasons Primary Cloud Storage is Broken and How to Fix themWebinar: 5 Reasons Primary Cloud Storage is Broken and How to Fix them
Webinar: 5 Reasons Primary Cloud Storage is Broken and How to Fix them
 
Multi-cloud strategy for enterprise
Multi-cloud strategy for enterprise Multi-cloud strategy for enterprise
Multi-cloud strategy for enterprise
 
Insights into Real-world Data Management Challenges
Insights into Real-world Data Management ChallengesInsights into Real-world Data Management Challenges
Insights into Real-world Data Management Challenges
 
DCD Big Discussion Guide
DCD Big Discussion GuideDCD Big Discussion Guide
DCD Big Discussion Guide
 
Cloud Computing: The Hard Problems Never Go Away
Cloud Computing: The Hard Problems Never Go AwayCloud Computing: The Hard Problems Never Go Away
Cloud Computing: The Hard Problems Never Go Away
 
The Distributed & Decentralized Cloud
The Distributed & Decentralized CloudThe Distributed & Decentralized Cloud
The Distributed & Decentralized Cloud
 
Cloud as a Flexible & Collaborative Tool for Creators
Cloud as a Flexible & Collaborative Tool for CreatorsCloud as a Flexible & Collaborative Tool for Creators
Cloud as a Flexible & Collaborative Tool for Creators
 
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
 
Bimodal IT and EDW Modernization
Bimodal IT and EDW ModernizationBimodal IT and EDW Modernization
Bimodal IT and EDW Modernization
 
The New Database Frontier: Harnessing the Cloud
The New Database Frontier: Harnessing the CloudThe New Database Frontier: Harnessing the Cloud
The New Database Frontier: Harnessing the Cloud
 
ADV Slides: 2021 Trends in Enterprise Analytics
ADV Slides: 2021 Trends in Enterprise AnalyticsADV Slides: 2021 Trends in Enterprise Analytics
ADV Slides: 2021 Trends in Enterprise Analytics
 

Kürzlich hochgeladen

UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-pyJamie (Taka) Wang
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding TeamAdam Moalla
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDELiveplex
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarPrecisely
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfJamie (Taka) Wang
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxUdaiappa Ramachandran
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdfPedro Manuel
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URLRuncy Oommen
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Brian Pichman
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 

Kürzlich hochgeladen (20)

UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-py
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity Webinar
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
 
20230104 - machine vision
20230104 - machine vision20230104 - machine vision
20230104 - machine vision
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptx
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URL
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 

Trends from the Trenches (Singapore Edition)

  • 1. Trends from the Trenches 2012 Bio-IT World Asia, Singapore 1
  • 2. I’m Chris. I’m an infrastructure geek. I work for the BioTeam. 2
  • 3. BioTeam Who, what & why ‣ Independent consulting shop ‣ Staffed by scientists forced to learn IT, SW & HPC to get our own research done ‣ 10+ years bridging the “gap” between science, IT & high performance computing 3
  • 4. BioTeam Why we get invited to these sorts of talks ... ‣ Lots of people hire us across wide range of project types • Pharma, Biotech, EDU, Nonprofit, .Gov, .Mil, etc. ‣ We get to see how groups of smart people approach similar problems ‣ We can speak honestly & objectively about what we see “in the real world” 4
  • 6. Listen to me at your own risk Seriously. ‣ I’m not an expert, pundit, visionary or “thought leader” ‣ All career success entirely due to shamelessly copying what actual smart people do ‣ I’m biased, burnt-out & cynical ‣ Filter my words accordingly 6
  • 7. Introduction 1 Business & Marketplace 2 Storage 3 Cloud 4 Hot for ’12 ... 5 7
  • 8. Business Landscape So far 2012 feels a lot like 2011 ... 8
  • 9. Business & Meta Observations More of the same in ’12 ... ‣ ~4 staff full time on issues involving data handling, data management and multi-instrument Next-Gen sequencing/analysis ‣ ~2 staff full time on infrastructure, storage and facility related projects • Dwan: Big infrastructure & facility projects for Fortune 20 companies, research consortia & .GOV customers • Dag: 40% infrastructure, 20% storage, 20% cloud ‣ ~1 staff full time on Amazon Cloud projects 9
  • 10. What that tells us ‣ Same problem(s) as last year ‣ Next-gen sequencing still causing a lot of pain when it comes to data handling, storage, organization & integration ‣ As sequencing continues to be commoditized, this will likely only get worse 10
  • 11. Storage 11
  • 12. Science-centric Storage Current State Assessment ‣ Storage still making me crazy in ’12 12
  • 13. Science-centric Storage Why I’m not worried ‣ Peta-capable storage is trivial to acquire in 2012 ‣ Scale-out NAS has won the battle ‣ It’s simply not as hard/risky as it used to be 13
  • 14. On the other hand ... 14
  • 15. OMG! The Sky Is Falling! Maybe a little panic is appropriate ... 15
  • 16. The sky IS falling! Uncomfortable truths ‣ Cost of acquiring data (genomes) falling faster than rate at which industry is increasing drive capacity ‣ Human researchers downstream of these datasets are also consuming more storage (and less predictably) ‣ High-scale labs must react or potentially have catastrophic issues in 2012-2013 16
  • 17. The sky IS falling! Current Practices Are Not Sustainable ‣ FACT: Chemistry changing faster than we can refresh our datacenters and research IT infrastructure ‣ FACT: Rate at which we can cheaply acquire interesting data exceeds rate at which storage companies can increase the capacity of their products ‣ FACT: We are poor at managing, tagging, valuing & curating our data. Few scientists really understand true cost/complexity involved with keeping data safe, online & accessible ‣ FACT: In 2012 people still think “keep everything online, forever” is a viable demand to be making of IT staff ‣ FACT: Something is going to break. Soon. 17
  • 18. CRAM it. 18
  • 19. The sky IS falling! CRAM it in 2012 ... ‣ Minor improvements are useless; order-of-magnitude needed ‣ Some people are talking about radical new methods – compressing against reference sequences and only storing the diffs • With a variable compression “quality budget” to spend on lossless techniques in the areas you care about ‣ http://biote.am/5v - Ewan Birney on “Compressing DNA” ‣ http://biote.am/5w - The actual CRAM paper ‣ If CRAM takes off, storage landscape will change 19
  • 20. Storage: What comes next? Next 18 months will be really fun... 20
  • 21. What comes next. The same rules apply for 2012 and beyond ... ‣ Accept that science changes faster than IT infrastructure ‣ Be glad you are not Broad/Sanger/BGI/NCBI ‣ Flexibility, scalability and agility become the key requirements of research informatics platforms • Tiered storage is in your future ... ‣ Shared/concurrent access is still the overwhelming storage use case 21
  • 22. What comes next. In the following year ... ‣ Many peta-scale capable systems deployed • Most will operate in the hundreds-of-TBs range ‣ Far more aggressive “data triage” ‣ Genome compression via CRAM ‣ Even more data will sit untouched & unloved ‣ Growing need for tiers, HSM & even tape 22
  • 23. What comes next. In the following year ... ‣ Broad and others are paving the way with respect to metadata-aware & policy-driven storage frameworks • And we’ll shamelessly copy a year or two later ‣ I’m still on my cloud storage kick • Economics are inescapable; will be built into storage platforms, gateways & VMs • Cloud object stores are only an HTTP RESTful call away • Cloud will become “just another tier” 23
  • 24. What comes next. Expect your storage to be smarter & more capable ... ‣ What do DDN, Panasas, Isilon, BlueArc, etc. have in common? • Under the hood they all run Unix or Unix-like OS’s on x86_64 architectures ‣ Some storage arrays can already run applications natively • More will follow • Likely a big trend for 2012 24
  • 25. Storage: The road ahead My $.02 for 2012... 25
  • 26. The Road Ahead Trends & Tips for 2012 ‣ Peta-capable platforms required ‣ Scale-out NAS still the best fit ‣ Customers will no longer build one big scale-out NAS tier ‣ My ‘hack’ of using nearline spec storage as primary science tier is obsolete in ’12 ‣ pNFS mainstream in 2012? ‣ Not everything is worth backing up ‣ Expect disruptive stuff 26
  • 27. The Road Ahead Trends & Tips for 2012 ‣ Your storage will be able to run apps • Dedupe, cloud gateways & replication • ‘CRAM’ or similar compression • Storage Resource Brokers (iRODS) & metadata servers • HDFS/Hadoop hooks? • Lab, data management & LIMS applications [Image: Drobo appliance running BioTeam MiniLIMS internally] 27
  • 28. The Road Ahead Trends & Tips for 2012 ‣ Hadoop / MapReduce / BigData • Just like GRID and CLOUD the space is being over-hyped • You still need to think about it • ... and have a roadmap for doing it • Deep, deep ties to your storage • Your users want/need it • My $.02? Fantastic cloud use case 28
  • 30. Backblaze Pod For Biotech 30
  • 31. 100 Terabytes for $12,000 USD http://bioteam.net/tag/backblaze/ 31
  • 32. Storage Future Feels Like This ... Multiple Tiers, Multiple Vendors, Multiple Products 32
  • 33. The ‘C’ word Does a Bio-IT talk exist if it does not mention “the cloud”? 33
  • 34. Cloud Stuff ‣ Before I make some blunt comments ... ‣ I am not an Amazon Cloud shill ‣ I am a jaded, cynical, zero-loyalty consumer of IT services and products that let me get work done ‣ Because I only get paid when my solutions work, I am picky about what tools I keep in my toolkit ‣ Amazon Web Services is a fantastic tool 34
  • 35. So you think you have a cloud?
  • 36. No APIs? Not a cloud.
  • 37. No self-service? Not a cloud.
  • 38. Installing VMware & issuing a press release? Not a cloud.
  • 39. Block storage and virtual servers only? (Barely) a cloud.
  • 40. Amazon is the IaaS Cloud Leader ‣ Why Amazon is attractive for infrastructure clouds: • Anyone can do virtual servers and block/object storage • Bio-IT needs “more stuff” in order to get real work done • AWS product & service stack (“the glue”) is far more comprehensive than that of any other cloud competitor - Need some examples? - ElasticIP, VPC, IAM, SQS, SNS, SES, SimpleDB, DynamoDB, CloudFormation, ElasticBeanstalk, SWF, DirectConnect, etc. 40
  • 41. Amazon Cloud Dominance Could Be A Good Thing ‣ Amazon Cloud Dominance May Be Good For Bio-IT ‣ The competition must innovate in really interesting ways in order to compete. This is already happening. • Purpose-built platforms for regulated/compliant operation • “Hands-on” Managed Services for Healthcare/Pharma • Hybrid on-premise/off-premise solutions • Full life science solution & software service stacks • Bespoke Service Level Agreements (SLAs) • ... 41
  • 42. Private Clouds My $.02 42
  • 43. Private Clouds in 2012: ‣ I’m no longer dismissing them as “useless” ‣ Usable & useful in certain situations ‣ Hype vs. Reality ratio still unbalanced ‣ Sensible only for certain environments • Have you seen what you have to do to your networks & gear? ‣ There are easier ways
  • 44. Private Clouds: My Advice for ’12 ‣ Remain cynical (test vendor claims) ‣ Due diligence still essential ‣ I personally would not deploy anything that does not explicitly provide Amazon API compatibility
  • 45. Private Clouds: My Advice for ’12 Most people are better off: 1. Adding VM platforms to existing HPC clusters & environments 2. Extending enterprise VM platforms to allow user self-service & server catalogs
  • 46. Cloud Advice My $.02 46
  • 47. Cloud Advice Don’t get left behind ‣ Research IT Organizations need a cloud strategy today ‣ Those that don’t will be bypassed by frustrated users ‣ IaaS cloud services are only a departmental credit card away ... and some senior scientists are too big to be fired for violating IT policy 47
  • 48. Cloud Advice Design Patterns ‣ You will need three tested cloud design patterns: ‣ (1) To handle ‘legacy’ scientific apps & workflows ‣ (2) The special stuff that is worth re-architecting ‣ (3) Hadoop & big data analytics 48
  • 49. Cloud Advice (1) Legacy HPC on the Cloud ‣ MIT StarCluster • http://web.mit.edu/star/cluster/ ‣ This is your baseline for legacy apps on ‘the cloud’ ‣ Extend as needed 49
  • 50. Cloud Advice (2) “Cloudy” HPC ‣ Some of our research workflows are important enough to be rewritten for “the cloud” and the advantages that a truly elastic & API-driven infrastructure can deliver ‣ This is where you have the most freedom ‣ Many published best practices you can borrow ‣ Good commercial options: Cycle Computing, BT, etc. 50
  • 51. Cloud Advice (3) Big Data HPC ‣ It will be a MapReduce world, get used to it ‣ Little need to roll your own Hadoop in 2012 ‣ ISV & commercial ecosystem already healthy ‣ Multiple providers today; both onsite & cloud-based ‣ Often an excellent cloud use case 51
  • 52. Cloud Data Movement My $.02 52
  • 53. Cloud Data Movement ‣ Over several years we have participated in a number of large “cloud data movement” efforts ‣ We used to be big fans of physical media movement ‣ However ... 53
  • 54. Physical Data Movement Is Not Easy. 54
  • 55. Cloud Data Movement ‣ At first glance, physical data movement “seems easy” ‣ It’s not. It is hard to do correctly and requires significant human effort and operational resources ‣ This has been a hard lesson learned over several years ‣ We have a new strategy for 2012 and the next image shows why ... 55
  • 57. Cloud Data Movement Wow! ‣ With a 1GbE internet connection ... ‣ and using Aspera software ... ‣ We sustained 700 Mb/sec for more than 7 hours freighting genomes into Amazon Web Services ‣ This is fast enough for many use cases, including genome sequencing core facilities* ‣ Chris Dwan’s webinar on this topic: http://biote.am/7e 57
  • 58. Cloud Data Movement Wow! ‣ Results like this mean we now favor network-based data movement over physical media movement ‣ Large-scale physical data movement carries a high operational burden and consumes non-trivial staff time & resources ‣ *Unclear if our experience holds true for Asia or Asia-EU-Americas data transfers 58
  • 59. Cloud Data Movement There are three ways to do network data movement ... ‣ (1) Buy software from Aspera and be done with it ‣ (2) Attend the annual SuperComputing conference & see which student group wins the bandwidth challenge contest; use their code ‣ (3) Get GridFTP from the Globus folks • Trend: At every single “data movement” talk I’ve been to in 2011 it seemed that any speaker who was NOT using Aspera was a very happy user of GridFTP. #notCoincidence 59
  • 60. Hot topics for 2012 ... 60
  • 61. Hot for ’12 BioTeam side projects & research interests ‣ Like to wrap up with some topics we think are interesting ‣ Who knows? These might be trends for 2013! 61
  • 62. Siri Voice Control of Instruments/Pipelines ‣ BioTeam recently revealed work with BT and Accelrys ‣ Demonstrated Siri voice control of a Pipeline Pilot experiment running in the BT Compute Cloud ‣ http://biote.am/7h ‣ We expect to continue doing cool things with Siri in ’12 62
  • 63. Smart Storage & Lab-local Appliances ‣ I firmly expect the “storage arrays running apps & VMs” trend to go mainstream ‣ This has beneficial implications for life science informatics ‣ We’ll be hitting this topic hard on systems ranging from Drobo to DataDirect ‣ Also working with the Intel Modular Server concept 63
  • 64. Lab Local Appliances Intel Modular Server ‣ Interesting hardware combination; storage + servers + native hypervisor ‣ VM Pool 1: MiniLIMS + other useful lab software ‣ VM Pool 2: Amazon Storage Gateway Appliance http://biote.am/7i ‣ Server Blade 3: BrightCluster HPC Stack 64
  • 65. Cloud, Community & Orchestration ‣ The emerging class of “DevOps” and “Infrastructure Automation” methods is incredibly interesting • We love Opscode & Chef (http://opscode.com) ‣ We’ll be doing more with systems orchestration in ’12 • And hopefully expanding our community collection of useful Chef cookbooks for life science informatics ‣ We also still love MIT StarCluster and will hopefully be contributing plugins and enhancements 65
  • 66. Thanks! Slides online at: http://slideshare.net/chrisdag/ 66
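
Slide 19 describes compressing reads against a reference sequence and storing only the diffs. The Python sketch below illustrates that idea in its simplest form. It is not the CRAM format: the function names, the record layout and the toy sequences are all hypothetical, it handles substitutions only (no insertions or deletions), and it ignores quality scores entirely, so the "quality budget" from the slide is not modelled.

```python
# Toy illustration of reference-based compression (NOT the real CRAM format).
# Assumption: the read is already aligned and differs from the reference
# only by single-base substitutions.

def encode_against_reference(read: str, reference: str, position: int) -> dict:
    """Store a read as its alignment position plus the bases that differ."""
    window = reference[position:position + len(read)]
    if len(window) != len(read):
        raise ValueError("read does not fit inside the reference at this position")
    diffs = [
        (offset, base)
        for offset, (base, ref_base) in enumerate(zip(read, window))
        if base != ref_base
    ]
    return {"position": position, "length": len(read), "diffs": diffs}


def decode_against_reference(record: dict, reference: str) -> str:
    """Rebuild the original read from the reference and the stored diffs."""
    start = record["position"]
    bases = list(reference[start:start + record["length"]])
    for offset, base in record["diffs"]:
        bases[offset] = base
    return "".join(bases)


# Hypothetical toy data: the read matches REFERENCE[8:19] except at one base.
REFERENCE = "ACGTACGTTAGCCGATAGGCTTAACGGA"
read = "TAGCCGTTAGG"

record = encode_against_reference(read, REFERENCE, 8)
restored = decode_against_reference(record, REFERENCE)
print(record)
print(restored == read)
```

The record holds a position, a length and one substitution instead of eleven bases. The saving grows with read length and with how closely the reads match the reference, which is why the slide treats this as a possible order-of-magnitude change rather than a minor improvement.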
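
Slide 57 reports a sustained 700 Mb/sec for more than 7 hours over a 1GbE link. The short calculation below turns that into a data volume and estimates how long a larger transfer would take at the same rate. It assumes "Mb" means megabits (700 megabytes per second would exceed a 1GbE link) and uses decimal units (1 TB = 10^12 bytes). The helper names are hypothetical, and the 100 TB figure is borrowed from slide 31 purely as an example size.

```python
# Back-of-the-envelope transfer arithmetic for the figures on slide 57.
# Assumptions: "Mb" = megabits, decimal units (1 TB = 1e12 bytes),
# and a rate that stays constant for the whole transfer.

def bytes_transferred(rate_megabits_per_s: float, hours: float) -> float:
    """Bytes moved at a sustained rate over a number of hours."""
    bits = rate_megabits_per_s * 1_000_000 * hours * 3600
    return bits / 8


def hours_to_move(terabytes: float, rate_megabits_per_s: float) -> float:
    """Hours needed to move a data set of the given size at a sustained rate."""
    bits = terabytes * 1e12 * 8
    return bits / (rate_megabits_per_s * 1_000_000) / 3600


moved_tb = bytes_transferred(700, 7) / 1e12
print(f"700 Mb/sec for 7 hours moves about {moved_tb:.1f} TB")

hours_100tb = hours_to_move(100, 700)
print(f"100 TB at 700 Mb/sec takes about {hours_100tb:.0f} hours "
      f"({hours_100tb / 24:.1f} days)")
```

Since the slide says "more than 7 hours", the first figure is a lower bound on what was actually moved. The second figure is a rough planning number only; as slide 58 notes, it is unclear whether the measured rate holds for Asia or intercontinental transfers.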
