SlideShare ist ein Scribd-Unternehmen logo
1 von 42
BIOTEAM
Enabling Science




                   Storage Infrastructure
Font: Optima Regular




                   and Data Management
Colors:
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)




                      in Life Sciences
                                                   Ari E. Berman, Ph.D.
                                         Senior Scientific Consultant, BioTeam, Inc.




                                             ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                                  A little about me
Enabling Science




                   • Ph.D. in Molecular Biology/Neuroscience
Font: Optima Regular

Colors:
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)




                   • Trained in laboratory and bioinformatics
                   • 13 years experience as an IT infrastructure/
                           HPC geek/Perl monger
                   • Odd mix of skills led me to BioTeam
                   • Joined BioTeam in May
                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                                  Who is BioTeam?
Enabling Science




                   • Independent Consulting Practice
Font: Optima Regular

Colors:

                   • Made up of scientists vast experience in
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)


                           software, HPC, and IT
                   • Unique cross-section of skill sets
                   • 10+ years of bridging the gap between
                           technology and science
                   • Functions as much as a think tank as a
                           consulting practice.

                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
Why am I here talking
BIOTEAM
Enabling Science                      to you?

                   • We work on broad range of projects:
Font: Optima Regular



                           Pharma, Biotech, EDU, .gov, .mil, etc.
Colors:
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)




                   • We are in a unique position: can see how
                           people are approaching current problems
                   • We work from a tech agnostic perspective:
                           we provide what’s best for the customer
                   • Our niche: 1000ft. overview of tech
                           problems in life sciences

                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM
Enabling Science
                                         Why are we all here?

Font: Optima Regular

Colors:
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)




                 Big data in life-sciences: just when you thought
                   it was safe to go back into the datacenter...

                                          ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                              Big data: the tired story
Enabling Science



          •      Next-generation sequencing,
                 Mass spec, imaging, etc.
Font: Optima Regular




          •
Colors:

                 High-throughput
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)


                 experimentation

          •      Clinical research/standard
                 healthcare - personalized
                 medicine

          •      Un-natural expansion of
                 technology (sequencing)

          •      Now: we can get the data
                 fast, what do we do with it?

                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                              Big data: the tired story
Enabling Science


          •      At this point, this is an old
                 problem
Font: Optima Regular




          •
Colors:
                 Most sequencers generating
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)

                 0.5TB/day

          •      Final genomes around
                 300GB

          •      High-volume quantitative
                 methods quickly produce
                 100’s of TBs of data

          •      The kicker: tight research
                 budgets

                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                                    Storing Big Data
Enabling Science



                                                                •   Problem is less about storing
Font: Optima Regular
                                                                    the data. We’ve solved
Colors:
Dark Blue #003399 (CMYK 96, 69, 3, 0)
                                                                    storage.
Light Blue #6699CC (CMYK 62, 22, 3, 0)


                                                                •   We can now put in
                                                                    thousands of spindles in a
                                                                    semi-affordable manner

                                                                •   Lots of high-density boxes

                                                                •   The petabyte challenge has
                                                                    been met

                                                                •   Now, it needs to work well

                                                                •   And still be affordable

                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
Today’s problem: Accessing
BIOTEAM
Enabling Science                    Big Data

Font: Optima Regular

Colors:
Dark Blue #003399 (CMYK 96, 69, 3, 0)                           •   In practice - get to 1.5PB,
Light Blue #6699CC (CMYK 62, 22, 3, 0)
                                                                    500M files: metadata falls off
                                                                    a cliff

                                                                •   Directory listings take
                                                                    minutes

                                                                •   Sorting takes forever

                                                                •   Forget about filesystem
                                                                    profiling/optimization




                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
Today’s problem: Accessing
BIOTEAM
Enabling Science                    Big Data

               •
Font: Optima Regular

Colors:
                       What’s being done?

               •
Dark Blue #003399 (CMYK 96, 69, 3, 0)
                       SSDs thought to be our
Light Blue #6699CC (CMYK 62, 22, 3, 0)


                       savior

               •       Blazing fast, SLC, many in
                       parallel

               •       Parallel filesystems could
                       cache metadata on SSDs

               •       Reduce search time orders
                       of magnitude



                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
Today’s problem: Accessing
BIOTEAM
Enabling Science                    Big Data

               •
Font: Optima Regular
                       Of course, it’s not that
Colors:
                       simple
Dark Blue #003399 (CMYK 96, 69, 3, 0)


               •
Light Blue #6699CC (CMYK 62, 22, 3, 0)
                       Now, distribution and access
                       points of SSDs matter

               •       How they are addressed
                       matters

               •       How many small files on the
                       filesystem matters

               •       How the files are to be used
                       matters


                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                                  But wait: there’s more
Enabling Science




               •       A consistent array of disks
Font: Optima Regular

Colors:
                       no longer enough beyond
                       1.5PB
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)




               •       Entire solutions of high-
                       speed disks not cost-
                       effective

               •       Distribution of file access
                       needs: some fast, some
                       archive

               •       Tiering of storage
                       infrastructure


                                          ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                                                Tiering
Enabling Science



             •         Keep archival data on
Font: Optima Regular
                       slower, cheaper disks
Colors:
             •         No SSDs
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)


             •         Keep fast access files on
                       smaller, high-speed disks with
                       many (possibly all) SSDs
                       (HPC, high throughput
                       needs)
             •         Mid-level tiers for
                       administrative needs
                       (documents, etc)
             •         Can even add a tape tier for
                       more permanent storage

                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                                     Managing Tiers
Enabling Science




             •
Font: Optima Regular
                       Administratively difficult

             •
Colors:
                       Can manage by different
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)

                       mount points, quotas, user
                       education

             •         Better: policy engines

             •         Use with parallel file systems
                       (GPFS, OneFS, etc)

             •         Policy based automated
                       movement of files through
                       tiers, even to tape


                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                        By golly, we’ve done it!
Enabling Science




Font: Optima Regular

Colors:
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)




                   • If done correctly, single namespace
                           infrastructure can work well for all needs
                   • Can handle HPC to archive
                   • Can be done in a semi-affordable manner

                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                                         Now what?
Enabling Science




Colors:            • Now, we’re faced with more problems
Font: Optima Regular


Dark Blue #003399 (CMYK 96, 69, 3, 0)



                   • For NIH, HIPAA laws, and general sanity,
Light Blue #6699CC (CMYK 62, 22, 3, 0)




                           need DR
                   • Need twice the space than you’ll use
                   • No other way to do it right now
                   • Use inexpensive, slow disk solutions to save
                           money on DR

                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                                         Now what?
Enabling Science




Font: Optima Regular



               •
Colors:
                       Also: how to keep track
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)

                       of data?

               •       At 1PB and 0.5 billion
                       files, creative directory
                       structures lose out

               •       Complexity too much
                       for anyone to handle




                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                                  Data Management
Enabling Science




                   •
Font: Optima Regular

Colors:                   One solution: databases
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)



                   •      Keep higher depth of
                          metadata (tagging,
                          descriptions)

                   •      Cumbersome for the
                          general user to use: adds
                          complexity layer to user
                          experience




                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                                  Data Management
Enabling Science




                   •
Font: Optima Regular

Colors:                   Databases can work,
Dark Blue #003399 (CMYK 96, 69, 3, 0)
                          though
Light Blue #6699CC (CMYK 62, 22, 3, 0)




                   •      iRODS is a good
                          example

                   •      Put the metadata
                          database layer in-
                          between the filesystem
                          and the user




                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                                  Data Management
Enabling Science




                   •      Others working on this
                          model as well
Font: Optima Regular

Colors:
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)


                   •      Cambridge Computer:
                          “As you approach
                          billions of files, file
                          exploring is no longer
                          feasible.”

                   •      Need a new interface

                   •      Rich metadata to keep
                          track of the files


                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                          Wait, more metadata?
Enabling Science




                   •      More metadata? wasn’t
                          this the original problem
Font: Optima Regular

Colors:

                          on large filesystems?
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)




                   •      Wouldn’t this make
                          matters worse?

                   •      Depends on how it is
                          done.

                   •      Current models have
                          metadata completely
                          separate from files


                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                          Wait, more metadata?
Enabling Science



                   •      And...

                   •
Font: Optima Regular

Colors:
                          Who’s going to go back
                          and type all of that
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)

                          metadata in?

                   •      No one - we kind of
                          need to start over

                   •      ...or, need a way of
                          inferring metadata and
                          filling in the blanks from
                          existing data

                   •      Still need legacy support
                          for systems

                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                                   Or: Middleware
Enabling Science




Colors:
                   • Use an interactive software product
Font: Optima Regular


Dark Blue #003399 (CMYK 96, 69, 3, 0)

                           between filesystem and user
Light Blue #6699CC (CMYK 62, 22, 3, 0)




                   • Can manage link between filesystem
                           and extended metadata
                   • Can enhance the scientific process:
                           manage data, analysis, results, and
                           facilitate collaboration

                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                    What are scientists doing?
Enabling Science

    Lab Scientist w/
      Excel
    •Accessible for most
          scientists
Font: Optima Regular
    •Flexible
Colors:
    •   Data maintenance
Dark Blue #003399 (CMYK 96, 69, 3, 0)
      burden on lab scientists
Light Blue #6699CC (CMYK 62, 22, 3, 0)

    •Quickly overwhelmed in
      size and complexity
    •Data publication by email




                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                    What are scientists doing?
Enabling Science

    Lab Scientist w/
      Excel
    •Accessible for most
          scientists
Font: Optima Regular
    •Flexible
Colors:
    •   Data maintenance
Dark Blue #003399 (CMYK 96, 69, 3, 0)
                                         Lab Bioinformatician
                           •Quick development of web-
      burden on lab scientists
Light Blue #6699CC (CMYK 62, 22, 3, 0)

    •Quickly overwhelmed in based system
      size and complexity •Rapid turn around for
    •Data publication by emailscientist needs
                           •Single point of failure
                           •Limited breadth of
                              experience
                           •Poor documentation, poor
                              transition




                                            ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                    What are scientists doing?
Enabling Science

    Lab Scientist w/
      Excel
    •Accessible for most
          scientists
Font: Optima Regular
    •Flexible
Colors:
    •   Data maintenance
Dark Blue #003399 (CMYK 96, 69, 3, 0)
                                         Lab Bioinformatician
                           •Quick development of web-
      burden on lab scientists
Light Blue #6699CC (CMYK 62, 22, 3, 0)

    •Quickly overwhelmed in based system
      size and complexity •Rapid turn around for    Outsource custom
    •Data publication by emailscientist needs          software
                           •Single point of failure
                           •Limited breadth of      •Stable, professional software
                              experience            •Well documented, easier
                                                       transition
                           •Poor documentation, poor Communication barrier with
                              transition            •
                                                       scientists
                                                    •Lack of domain knowledge
                                                       leaves large functionality
                                                       gaps
                                                    •Inflexible design leaves
                                                       software obsolete in a
                                                       matter




                                            ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                    What are scientists doing?
Enabling Science

    Lab Scientist w/
      Excel
    •Accessible for most
          scientists
Font: Optima Regular
    •Flexible
Colors:
    •   Data maintenance
Dark Blue #003399 (CMYK 96, 69, 3, 0)
                                         Lab Bioinformatician
                           •Quick development of web-
      burden on lab scientists
Light Blue #6699CC (CMYK 62, 22, 3, 0)

    •Quickly overwhelmed in based system
      size and complexity •Rapid turn around for    Outsource custom
    •Data publication by emailscientist needs          software
                           •Single point of failure
                           •Limited breadth of      •Stable, professional software
                              experience            •Well documented, easier
                                                       transition
                           •Poor documentation, poor Communication barrier with
                              transition            •
                                                       scientists
                                                                          “Shrink-wrapped”
                                                    •Lack of domain knowledge
                                                                             software.
                                                       leaves large functionality
                                                       gaps               •Lab data management solutions
                                                    •Inflexible design leavesleverage many customers, years
                                                       software obsolete in a experience
                                                                             of
                                                       matter             •Year-to-year enhancement of
                                                                             product
                                                                          •High purchase price due to limited
                                                                             market
                                                                          •Mismatch to local lab expertise
                                                                             and workflow
                                                                          •Unused complexity
                                            ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                                                 LIMS
Enabling Science




Font: Optima Regular
                       • Laboratory Information Management
Colors:
                            System
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)




                       • Many out there now: standard and custom
                       • Many focus markets
                       • Basespace: Illumina (NGS)
                       • Quartzy: general lab monkey
                       • MiniLIMS
                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                                         Disclaimer
Enabling Science




Font: Optima Regular

Colors:
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)




                   • This will feel like a sales pitch
                   • Just want to illustrate how we’re tackling
                           information mangement problem




                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                    The BioTeam Solution: MiniLIMS
Enabling Science




              • An affordable software product that leverages
Font: Optima Regular



                       real world experience
Colors:
Dark Blue #003399 (CMYK 96, 69, 3, 0)



                       •
Light Blue #6699CC (CMYK 62, 22, 3, 0)

                            Decades of combined software and informatics
                            expertise
                       •    Years of LIMS customization
                       •    $4995 license for academic labs

              • Flexible architecture that adapts to new
                       processes and technologies
                       •    Schema-less design allows real time changes to data
                            model
                       •    Plugin architecture allows mix and match functionality



                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                    The BioTeam Solution: MiniLIMS
Enabling Science




Font: Optima Regular   • Customization options that match lab
Colors:
                            resources
Dark Blue #003399 (CMYK 96, 69, 3, 0)


                          •       End user customizable system and Excel import/
Light Blue #6699CC (CMYK 62, 22, 3, 0)



                                  export that empowers lab scientists
                          •       Accessible source code and APIs for in-house
                                  developers
                          •       BioTeam consulting for labs without development
                                  resources, or development teams that are stretched
                                  thin




                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                  The BioTeam Solution: MiniLIMS
Enabling Science




                                                   End user configurable
Font: Optima Regular

Colors:         Form and Page            Display                                            GC Mass Spec
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)

                                                                                              Invoicing




                                                                          PHP API
               Data and Configuration
                             Objects
                                                                                            Analysis tools

                               Data Broker
                                                                                                    NGS

                       Schema-less MySQL
                              persistence




                                         MiniLIMS Core                              MiniLIMS Plugins


                                            ©BioTeam, Inc. 2012 - http://www.bioteam.net
MiniLIMS: Linking lab to
BIOTEAM                                        datacenter
Enabling Science


                                          Central auth
                                                                                 Customer workflow   MiniLIMS Core
                                                          Reagent
                                                         inventory    Uptime
            User acct setup, login                                   reporting     Lab workflow      MiniLIMS Plugin
Font: Optima Regular

Colors:                                                                                             MiniLIMS Custom
Dark Blue #003399 (CMYK 96, 69, 3, 0)
                                           Sample receiving
              Sample registration
Light Blue #6699CC (CMYK 62, 22, 3, 0)




                                            Sample / library prep,
                                                     QC




                                           Run / slide / flowcell
            Sample status                  setup




                                                   Instrument console



                                           Run monitoring




            Results delivery, billing      Analysis launch,
                                           monitoring, results




                                          ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                    Simple/Flexible Concept
Enabling Science




Font: Optima Regular
                                                                  Type          Name

Colors:
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)




                                                                Property                Value




                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                                  Simple to query
Enabling Science




Font: Optima Regular

Colors:
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)




                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                     Customizations: plugins
Enabling Science




Font: Optima Regular

Colors:
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)




                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                     Customizations: plugins
Enabling Science




Font: Optima Regular

Colors:
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)




                                               1. Select Bowtie                                     3. Click to run
                                                                       2. Select the Fastq File &
                                                                                                       the protocol
                                                                          Name the experiment

                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                     Customizations: plugins
Enabling Science




Font: Optima Regular

Colors:
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)




                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                                          Workflows
Enabling Science




Font: Optima Regular

Colors:
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)




                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                     Moving forward: Appliance
Enabling Science



                   •      Turnkey solution

                   •
Font: Optima Regular
                          MiniLIMS + Local
                          Analysis Engine
Colors:
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)



                   •      Plan is to link to cloud
                          resources: automatic
                          backup & link to hosted
                          MiniLIMS

                   •      16 cores, 96GB RAM,
                          18T redundant storage,
                          SSD for OS.

                   •      Solution for any lab
                          needing LIMS

                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM                              How to enable science
Enabling Science




         •      Solidify storage infrastructure
Font: Optima Regular

Colors:
         •      Add tiered storage with
Dark Blue #003399 (CMYK 96, 69, 3, 0)

                policy engine to move data
Light Blue #6699CC (CMYK 62, 22, 3, 0)




         •      Supply DR

         •      Enable metadata
                acceleration: SSDs + cache

         •      Implement middleware for
                rich metadata tracking

         •      Make it easy for the
                scientists


                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net
BIOTEAM
Enabling Science




Font: Optima Regular

Colors:
Dark Blue #003399 (CMYK 96, 69, 3, 0)
Light Blue #6699CC (CMYK 62, 22, 3, 0)




                                               Thank you!




                                         ©BioTeam, Inc. 2012 - http://www.bioteam.net

Weitere ähnliche Inhalte

Ähnlich wie Ari Berman - Intel Big Data Seminar 9/6/2012

Brokering Data: Accelerating Data Evaluation with Databricks White Label
Brokering Data: Accelerating Data Evaluation with Databricks White LabelBrokering Data: Accelerating Data Evaluation with Databricks White Label
Brokering Data: Accelerating Data Evaluation with Databricks White LabelDatabricks
 
Establishing Release Quality Levels and Release Acceptance Tests
Establishing Release Quality Levels and Release Acceptance TestsEstablishing Release Quality Levels and Release Acceptance Tests
Establishing Release Quality Levels and Release Acceptance TestsLuke Hohmann
 
Email Management Using Oracle WebCenter Content Records
Email Management Using Oracle WebCenter Content RecordsEmail Management Using Oracle WebCenter Content Records
Email Management Using Oracle WebCenter Content RecordsRaoul Miller
 
GC Tuning in the HotSpot Java VM - a FISL 10 Presentation
GC Tuning in the HotSpot Java VM - a FISL 10 PresentationGC Tuning in the HotSpot Java VM - a FISL 10 Presentation
GC Tuning in the HotSpot Java VM - a FISL 10 PresentationLudovic Poitou
 
Designing Cloud Backup to reduce DR downtime for IT Professionals
Designing Cloud Backup to reduce DR downtime for IT ProfessionalsDesigning Cloud Backup to reduce DR downtime for IT Professionals
Designing Cloud Backup to reduce DR downtime for IT ProfessionalsStorage Switzerland
 
Climb stateoftheartintro
Climb stateoftheartintroClimb stateoftheartintro
Climb stateoftheartintrothomasrconnor
 
Webinar: Designing Storage and Apps to Enable Data Monetization
Webinar: Designing Storage and Apps to Enable Data MonetizationWebinar: Designing Storage and Apps to Enable Data Monetization
Webinar: Designing Storage and Apps to Enable Data MonetizationStorage Switzerland
 
Cryocrate presentation for zoom 3.26.19
Cryocrate presentation for zoom 3.26.19Cryocrate presentation for zoom 3.26.19
Cryocrate presentation for zoom 3.26.19kphodel
 
Key Considerations for a Microfiche Scanning Project.pdf
Key Considerations for a Microfiche Scanning Project.pdfKey Considerations for a Microfiche Scanning Project.pdf
Key Considerations for a Microfiche Scanning Project.pdfManaged Ousource Solutions
 
CLIMB System Introduction Talk - CLIMB Launch
CLIMB System Introduction Talk - CLIMB LaunchCLIMB System Introduction Talk - CLIMB Launch
CLIMB System Introduction Talk - CLIMB LaunchTom Connor
 
Big data in Engineering Application
Big data in Engineering ApplicationBig data in Engineering Application
Big data in Engineering ApplicationMathews Job
 
The Transformation of HPC: Simulation and Cognitive Methods in the Era of Big...
The Transformation of HPC: Simulation and Cognitive Methods in the Era of Big...The Transformation of HPC: Simulation and Cognitive Methods in the Era of Big...
The Transformation of HPC: Simulation and Cognitive Methods in the Era of Big...inside-BigData.com
 
Microfilm or Digitize: Which is Right for You?
Microfilm or Digitize: Which is Right for You?Microfilm or Digitize: Which is Right for You?
Microfilm or Digitize: Which is Right for You?Brad Houston
 
Four Reasons Why Your Backup & Recovery Hardware will Break by 2020
Four Reasons Why Your Backup & Recovery Hardware will Break by 2020Four Reasons Why Your Backup & Recovery Hardware will Break by 2020
Four Reasons Why Your Backup & Recovery Hardware will Break by 2020Storage Switzerland
 
17783_bigdata-notes2.ppt
17783_bigdata-notes2.ppt17783_bigdata-notes2.ppt
17783_bigdata-notes2.pptHARIKRISHNANU13
 

Ähnlich wie Ari Berman - Intel Big Data Seminar 9/6/2012 (16)

Brokering Data: Accelerating Data Evaluation with Databricks White Label
Brokering Data: Accelerating Data Evaluation with Databricks White LabelBrokering Data: Accelerating Data Evaluation with Databricks White Label
Brokering Data: Accelerating Data Evaluation with Databricks White Label
 
Establishing Release Quality Levels and Release Acceptance Tests
Establishing Release Quality Levels and Release Acceptance TestsEstablishing Release Quality Levels and Release Acceptance Tests
Establishing Release Quality Levels and Release Acceptance Tests
 
Email Management Using Oracle WebCenter Content Records
Email Management Using Oracle WebCenter Content RecordsEmail Management Using Oracle WebCenter Content Records
Email Management Using Oracle WebCenter Content Records
 
GC Tuning in the HotSpot Java VM - a FISL 10 Presentation
GC Tuning in the HotSpot Java VM - a FISL 10 PresentationGC Tuning in the HotSpot Java VM - a FISL 10 Presentation
GC Tuning in the HotSpot Java VM - a FISL 10 Presentation
 
Designing Cloud Backup to reduce DR downtime for IT Professionals
Designing Cloud Backup to reduce DR downtime for IT ProfessionalsDesigning Cloud Backup to reduce DR downtime for IT Professionals
Designing Cloud Backup to reduce DR downtime for IT Professionals
 
Climb stateoftheartintro
Climb stateoftheartintroClimb stateoftheartintro
Climb stateoftheartintro
 
Webinar: Designing Storage and Apps to Enable Data Monetization
Webinar: Designing Storage and Apps to Enable Data MonetizationWebinar: Designing Storage and Apps to Enable Data Monetization
Webinar: Designing Storage and Apps to Enable Data Monetization
 
Cryocrate presentation for zoom 3.26.19
Cryocrate presentation for zoom 3.26.19Cryocrate presentation for zoom 3.26.19
Cryocrate presentation for zoom 3.26.19
 
4 campanile seagate strategy_20150316
4 campanile seagate strategy_201503164 campanile seagate strategy_20150316
4 campanile seagate strategy_20150316
 
Key Considerations for a Microfiche Scanning Project.pdf
Key Considerations for a Microfiche Scanning Project.pdfKey Considerations for a Microfiche Scanning Project.pdf
Key Considerations for a Microfiche Scanning Project.pdf
 
CLIMB System Introduction Talk - CLIMB Launch
CLIMB System Introduction Talk - CLIMB LaunchCLIMB System Introduction Talk - CLIMB Launch
CLIMB System Introduction Talk - CLIMB Launch
 
Big data in Engineering Application
Big data in Engineering ApplicationBig data in Engineering Application
Big data in Engineering Application
 
The Transformation of HPC: Simulation and Cognitive Methods in the Era of Big...
The Transformation of HPC: Simulation and Cognitive Methods in the Era of Big...The Transformation of HPC: Simulation and Cognitive Methods in the Era of Big...
The Transformation of HPC: Simulation and Cognitive Methods in the Era of Big...
 
Microfilm or Digitize: Which is Right for You?
Microfilm or Digitize: Which is Right for You?Microfilm or Digitize: Which is Right for You?
Microfilm or Digitize: Which is Right for You?
 
Four Reasons Why Your Backup & Recovery Hardware will Break by 2020
Four Reasons Why Your Backup & Recovery Hardware will Break by 2020Four Reasons Why Your Backup & Recovery Hardware will Break by 2020
Four Reasons Why Your Backup & Recovery Hardware will Break by 2020
 
17783_bigdata-notes2.ppt
17783_bigdata-notes2.ppt17783_bigdata-notes2.ppt
17783_bigdata-notes2.ppt
 

Kürzlich hochgeladen

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 

Kürzlich hochgeladen (20)

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 

Ari Berman - Intel Big Data Seminar 9/6/2012

  • 1. BIOTEAM Enabling Science Storage Infrastructure Font: Optima Regular and Data Management Colors: Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) in Life Sciences Ari E. Berman, Ph.D. Senior Scientific Consultant, BioTeam, Inc. ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 2. BIOTEAM A little about me Enabling Science • Ph.D. in Molecular Biology/Neuroscience Font: Optima Regular Colors: Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) • Trained in laboratory and bioinformatics • 13 years experience as an IT infrastructure/ HPC geek/Perl monger • Odd mix of skills led me to BioTeam • Joined BioTeam in May ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 3. BIOTEAM Who is BioTeam? Enabling Science • Independent Consulting Practice Font: Optima Regular Colors: • Made up of scientists vast experience in Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) software, HPC, and IT • Unique cross-section of skill sets • 10+ years of bridging the gap between technology and science • Functions as much as a think tank as a consulting practice. ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 4. Why am I here talking BIOTEAM Enabling Science to you? • We work on broad range of projects: Font: Optima Regular Pharma, Biotech, EDU, .gov, .mil, etc. Colors: Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) • We are in a unique position: can see how people are approaching current problems • We work from a tech agnostic perspective: we provide what’s best for the customer • Our niche: 1000ft. overview of tech problems in life sciences ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 5. BIOTEAM Enabling Science Why are we all here? Font: Optima Regular Colors: Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) Big data in life-sciences: just when you thought it was safe to go back into the datacenter... ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 6. BIOTEAM Big data: the tired story Enabling Science • Next-generation sequencing, Mass spec, imaging, etc. Font: Optima Regular • Colors: High-throughput Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) experimentation • Clinical research/standard healthcare - personalized medicine • Un-natural expansion of technology (sequencing) • Now: we can get the data fast, what do we do with it? ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 7. BIOTEAM Big data: the tired story Enabling Science • At this point, this is an old problem Font: Optima Regular • Colors: Most sequencers generating Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) 0.5TB/day • Final genomes around 300GB • High-volume quantitative methods quickly produce 100’s of TBs of data • The kicker: tight research budgets ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 8. BIOTEAM Storing Big Data Enabling Science • Problem is less about storing Font: Optima Regular the data. We’ve solved Colors: Dark Blue #003399 (CMYK 96, 69, 3, 0) storage. Light Blue #6699CC (CMYK 62, 22, 3, 0) • We can now put in thousands of spindles in a semi-affordable manner • Lots of high-density boxes • The petabyte challenge has been met • Now, it needs to work well • And still be affordable ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 9. Today’s problem: Accessing BIOTEAM Enabling Science Big Data Font: Optima Regular Colors: Dark Blue #003399 (CMYK 96, 69, 3, 0) • In practice - get to 1.5PB, Light Blue #6699CC (CMYK 62, 22, 3, 0) 500M files: metadata falls off a cliff • Directory listings take minutes • Sorting takes forever • Forget about filesystem profiling/optimization ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 10. Today’s problem: Accessing BIOTEAM Enabling Science Big Data • Font: Optima Regular Colors: What’s being done? • Dark Blue #003399 (CMYK 96, 69, 3, 0) SSDs thought to be our Light Blue #6699CC (CMYK 62, 22, 3, 0) savior • Blazing fast, SLC, many in parallel • Parallel filesystems could cache metadata on SSDs • Reduce search time orders of magnitude ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 11. Today’s problem: Accessing BIOTEAM Enabling Science Big Data • Font: Optima Regular Of course, it’s not that Colors: simple Dark Blue #003399 (CMYK 96, 69, 3, 0) • Light Blue #6699CC (CMYK 62, 22, 3, 0) Now, distribution and access points of SSDs matter • How they are addressed matters • How many small files on the filesystem matters • How the files are to be used matters ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 12. BIOTEAM But wait: there’s more Enabling Science • A consistent array of disks Font: Optima Regular Colors: no longer enough beyond 1.5PB Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) • Entire solutions of high- speed disks not cost- effective • Distribution of file access needs: some fast, some archive • Tiering of storage infrastructure ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 13. BIOTEAM Tiering Enabling Science • Keep archival data on Font: Optima Regular slower, cheaper disks Colors: • No SSDs Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) • Keep fast access files on smaller, high-speed disks with many (possibly all) SSDs (HPC, high throughput needs) • Mid-level tiers for administrative needs (documents, etc) • Can even add a tape tier for more permanent storage ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 14. BIOTEAM Managing Tiers Enabling Science • Font: Optima Regular Administratively difficult • Colors: Can manage by different Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) mount points, quotas, user education • Better: policy engines • Use with parallel file systems (GPFS, OneFS, etc) • Policy based automated movement of files through tiers, even to tape ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 15. BIOTEAM By golly, we’ve done it! Enabling Science Font: Optima Regular Colors: Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) • If done correctly, single namespace infrastructure can work well for all needs • Can handle HPC to archive • Can be done in a semi-affordable manner ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 16. BIOTEAM Now what? Enabling Science Colors: • Now, we’re faced with more problems Font: Optima Regular Dark Blue #003399 (CMYK 96, 69, 3, 0) • For NIH, HIPAA laws, and general sanity, Light Blue #6699CC (CMYK 62, 22, 3, 0) need DR • Need twice the space than you’ll use • No other way to do it right now • Use inexpensive, slow disk solutions to save money on DR ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 17. BIOTEAM Now what? Enabling Science Font: Optima Regular • Colors: Also: how to keep track Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) of data? • At 1PB and 0.5 billion files, creative directory structures lose out • Complexity too much for anyone to handle ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 18. BIOTEAM Data Management Enabling Science • Font: Optima Regular Colors: One solution: databases Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) • Keep higher depth of metadata (tagging, descriptions) • Cumbersome for the general user to use: adds complexity layer to user experience ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 19. BIOTEAM Data Management Enabling Science • Font: Optima Regular Colors: Databases can work, Dark Blue #003399 (CMYK 96, 69, 3, 0) though Light Blue #6699CC (CMYK 62, 22, 3, 0) • iRODS is a good example • Put the metadata database layer in- between the filesystem and the user ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 20. BIOTEAM Data Management Enabling Science • Others working on this model as well Font: Optima Regular Colors: Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) • Cambridge Computer: “As you approach billions of files, file exploring is no longer feasible.” • Need a new interface • Rich metadata to keep track of the files ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 21. BIOTEAM Wait, more metadata? Enabling Science • More metadata? wasn’t this the original problem Font: Optima Regular Colors: on large filesystems? Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) • Wouldn’t this make matters worse? • Depends on how it is done. • Current models have metadata completely separate from files ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 22. BIOTEAM Wait, more metadata? Enabling Science • And... • Font: Optima Regular Colors: Who’s going to go back and type all of that Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) metadata in? • No one - we kind of need to start over • ...or, need a way of inferring metadata and filling in the blanks from existing data • Still need legacy support for systems ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 23. BIOTEAM Or: Middleware Enabling Science Colors: • Use an interactive software product Font: Optima Regular Dark Blue #003399 (CMYK 96, 69, 3, 0) between filesystem and user Light Blue #6699CC (CMYK 62, 22, 3, 0) • Can manage link between filesystem and extended metadata • Can enhance the scientific process: manage data, analysis, results, and facilitate collaboration ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 24. BIOTEAM What are scientists doing? Enabling Science Lab Scientist w/ Excel •Accessible for most scientists Font: Optima Regular •Flexible Colors: • Data maintenance Dark Blue #003399 (CMYK 96, 69, 3, 0) burden on lab scientists Light Blue #6699CC (CMYK 62, 22, 3, 0) •Quickly overwhelmed in size and complexity •Data publication by email ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 25. BIOTEAM What are scientists doing? Enabling Science Lab Scientist w/ Excel •Accessible for most scientists Font: Optima Regular •Flexible Colors: • Data maintenance Dark Blue #003399 (CMYK 96, 69, 3, 0) Lab Bioinformatician •Quick development of web- burden on lab scientists Light Blue #6699CC (CMYK 62, 22, 3, 0) •Quickly overwhelmed in based system size and complexity •Rapid turn around for •Data publication by emailscientist needs •Single point of failure •Limited breadth of experience •Poor documentation, poor transition ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 26. BIOTEAM What are scientists doing? Enabling Science Lab Scientist w/ Excel •Accessible for most scientists Font: Optima Regular •Flexible Colors: • Data maintenance Dark Blue #003399 (CMYK 96, 69, 3, 0) Lab Bioinformatician •Quick development of web- burden on lab scientists Light Blue #6699CC (CMYK 62, 22, 3, 0) •Quickly overwhelmed in based system size and complexity •Rapid turn around for Outsource custom •Data publication by emailscientist needs software •Single point of failure •Limited breadth of •Stable, professional software experience •Well documented, easier transition •Poor documentation, poor Communication barrier with transition • scientists •Lack of domain knowledge leaves large functionality gaps •Inflexible design leaves software obsolete in a matter ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 27. BIOTEAM What are scientists doing? Enabling Science Lab Scientist w/ Excel •Accessible for most scientists Font: Optima Regular •Flexible Colors: • Data maintenance Dark Blue #003399 (CMYK 96, 69, 3, 0) Lab Bioinformatician •Quick development of web- burden on lab scientists Light Blue #6699CC (CMYK 62, 22, 3, 0) •Quickly overwhelmed in based system size and complexity •Rapid turn around for Outsource custom •Data publication by emailscientist needs software •Single point of failure •Limited breadth of •Stable, professional software experience •Well documented, easier transition •Poor documentation, poor Communication barrier with transition • scientists “Shrink-wrapped” •Lack of domain knowledge software. leaves large functionality gaps •Lab data management solutions •Inflexible design leavesleverage many customers, years software obsolete in a experience of matter •Year-to-year enhancement of product •High purchase price due to limited market •Mismatch to local lab expertise and workflow •Unused complexity ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 28. BIOTEAM LIMS Enabling Science Font: Optima Regular • Laboratory Information Management Colors: System Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) • Many out there now: standard and custom • Many focus markets • Basespace: Illumina (NGS) • Quartzy: general lab monkey • MiniLIMS ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 29. BIOTEAM Disclaimer Enabling Science Font: Optima Regular Colors: Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) • This will feel like a sales pitch • Just want to illustrate how we’re tackling information mangement problem ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 30. BIOTEAM The BioTeam Solution: MiniLIMS Enabling Science • An affordable software product that leverages Font: Optima Regular real world experience Colors: Dark Blue #003399 (CMYK 96, 69, 3, 0) • Light Blue #6699CC (CMYK 62, 22, 3, 0) Decades of combined software and informatics expertise • Years of LIMS customization • $4995 license for academic labs • Flexible architecture that adapts to new processes and technologies • Schema-less design allows real time changes to data model • Plugin architecture allows mix and match functionality ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 31. BIOTEAM The BioTeam Solution: MiniLIMS Enabling Science Font: Optima Regular • Customization options that match lab Colors: resources Dark Blue #003399 (CMYK 96, 69, 3, 0) • End user customizable system and Excel import/ Light Blue #6699CC (CMYK 62, 22, 3, 0) export that empowers lab scientists • Accessible source code and APIs for in-house developers • BioTeam consulting for labs without development resources, or development teams that are stretched thin ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 32. BIOTEAM The BioTeam Solution: MiniLIMS Enabling Science End user configurable Font: Optima Regular Colors: Form and Page Display GC Mass Spec Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) Invoicing PHP API Data and Configuration Objects Analysis tools Data Broker NGS Schema-less MySQL persistence MiniLIMS Core MiniLIMS Plugins ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 33. MiniLIMS: Linking lab to BIOTEAM datacenter Enabling Science Central auth Customer workflow MiniLIMS Core Reagent inventory Uptime User acct setup, login reporting Lab workflow MiniLIMS Plugin Font: Optima Regular Colors: MiniLIMS Custom Dark Blue #003399 (CMYK 96, 69, 3, 0) Sample receiving Sample registration Light Blue #6699CC (CMYK 62, 22, 3, 0) Sample / library prep, QC Run / slide / flowcell Sample status setup Instrument console Run monitoring Results delivery, billing Analysis launch, monitoring, results ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 34. BIOTEAM Simple/Flexible Concept Enabling Science Font: Optima Regular Type Name Colors: Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) Property Value ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 35. BIOTEAM Simple to query Enabling Science Font: Optima Regular Colors: Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 36. BIOTEAM Customizations: plugins Enabling Science Font: Optima Regular Colors: Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 37. BIOTEAM Customizations: plugins Enabling Science Font: Optima Regular Colors: Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) 1. Select Bowtie 3. Click to run 2. Select the Fastq File & the protocol Name the experiment ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 38. BIOTEAM Customizations: plugins Enabling Science Font: Optima Regular Colors: Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 39. BIOTEAM Workflows Enabling Science Font: Optima Regular Colors: Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 40. BIOTEAM Moving forward: Appliance Enabling Science • Turnkey solution • Font: Optima Regular MiniLIMS + Local Analysis Engine Colors: Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) • Plan is to link to cloud resources: automatic backup & link to hosted MiniLIMS • 16 cores, 96GB RAM, 18T redundant storage, SSD for OS. • Solution for any lab needing LIMS ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 41. BIOTEAM How to enable science Enabling Science • Solidify storage infrastructure Font: Optima Regular Colors: • Add tiered storage with Dark Blue #003399 (CMYK 96, 69, 3, 0) policy engine to move data Light Blue #6699CC (CMYK 62, 22, 3, 0) • Supply DR • Enable metadata acceleration: SSDs + cache • Implement middleware for rich metadata tracking • Make it easy for the scientists ©BioTeam, Inc. 2012 - http://www.bioteam.net
  • 42. BIOTEAM Enabling Science Font: Optima Regular Colors: Dark Blue #003399 (CMYK 96, 69, 3, 0) Light Blue #6699CC (CMYK 62, 22, 3, 0) Thank you! ©BioTeam, Inc. 2012 - http://www.bioteam.net

Hinweis der Redaktion

  1. \n
  2. \n
  3. \n
  4. \n
  5. \n
  6. \n
  7. \n
  8. \n
  9. \n
  10. \n
  11. \n
  12. \n
  13. \n
  14. \n
  15. \n
  16. \n
  17. \n
  18. \n
  19. \n
  20. \n
  21. \n
  22. \n
  23. \n
  24. \n
  25. \n
  26. \n
  27. \n
  28. \n
  29. \n
  30. \n
  31. \n
  32. \n
  33. \n
  34. \n
  35. \n
  36. \n
  37. \n
  38. \n
  39. \n
  40. \n
  41. \n
  42. \n