Wednesday, February 22, 12
Eric Kavanagh
                             Eric.kavanagh@bloorgroup.com




                                             Twitter Tag: #briefr
To conduct an Open Research program that
                       invites the participation of both IT users and
                       technology vendors

                       To assist IT buyers in understanding database
                       technology and the architecture that surrounds
                       it.

                       Allow audience members to pose serious
                       questions... and get answers!

                       Publish all findings


Your Host: Eric Kavanagh
                   Research Leader: Mark Madsen - Third Nature
                  Primary Collaborator: Robin Bloor - The Bloor
                  Group
                   Guest Analyst 1: Rajeev Rawat - BI Results
                   Guest Analyst 2: Malcolm Chisholm - Consultant



Rajeev Rawat is the founder and CEO of BI
             Results. His career has involved leading large
             cross-functional teams at both IBM and Xerox,
             where he was involved in direct customer facing
             roles as well as taking part in headquarters
             assignments.
             His headquarters positions with worldwide
             responsibility included strategic assignments for
             alliances and relationships with technology
             partners, product management and product
             marketing. Other responsibilities included
             restructuring business models, testing new
             technology platforms, and sales coverage plans.
             Rajeev led the introduction of new technologies
             and solutions for Xerox and IBM.
             www.biresults.com, biresult@gmail.com
             LinkedIn: Rajeev Rawat



The Bloor Group

       Fit for Purpose: The New Database Revolution
       The Bloor Group – February 22, 2012

         Five Years of Incredible Excitement In
         Information Acrobatics!

         -Seismic shift in data

           Variety, Volume, Velocity





                                                                                          Rajeev Rawat
                                                                    Serving to achieve your full potential
                     ©Copyright BI Results, LLC 2012
   
   
   



       The Next Five Years
       The Most Exciting Times In Information Acrobatics

         New Venture Funding
                                                                      Key Value Store, Big Table, Graph DB, Document DB
         New (Needed) Functionality


         New Skills


         New Ventures

         Innovative Code
                                                                                   NoSQL Innovation
         Lots of Great Innovation                                                Apache Project, Amazon, Facebook,
                                                                               Google, Open Source Community, Twitter



                                                                Reports of the Death of The RDBMS
                                                                             Are Highly Exaggerated

       RDBMS Still Dominates
       Reliable Heavy Lifting

                                                                                 RDBMS Vs. NoSQL?

         Strengths
         - Robust (ACID, Fail-proof)
         - Structure (Granular, Scalable, Fast)
         - Governance (Backups, Precision)
         - Tools (ETL, Analytics, Reporting)
         - Ecosystem (Global deep collaboration)
         - Skills (Certifications, Experience)
         - Policies, Procedures (Reliability)
         - Documentation (Support, Training)
                                                                               Photo: Watchmojo.com




                                                                    Reports of the Death of RDBMS
                                                                            Are Highly Exaggerated

       NoSQL
       Being Tested, Validated, Calibrated
                                                                        Key Value Store, Big Table, Graph DB, Document DB
         - Co-Existence, Transition,
           NoSQL Only

         - Meta Tag, Master Data
           Other scheme/s

         - Data Governance, Controls.
           Authentication, Security

         - Deep Analytics on Mixed
           Datasets
                                                                            Complexity, Semi-Structured,
                                                                              Highly Connected Data


                                                                    Fantastic Growth Opportunity
                                                                                 Skills, Investing

       NoSQL, RDBMS Innovation
       Fantastic Opportunity for Growth

         Gaps You Can Help Close
                                                                        The Race Is On!
         - Mapping Big Data with
           Legacy Data

         - Strategy and Policy for                                       Finish Line
           Governance, Precision,
           Controls

         - Opportunities at all sides
           - Enterprise
           - Legacy Vendors
           - Innovative Ventures                                    Tested For Prime Time
           - Technology and Business
                                                                     Time to Rise To The Top
                                                                             Skills, Investing
   
   
   


Dissection &
                             Discussion




Robin Bloor is Chief
                               Analyst at The
                                Bloor Group.



                              Robin.Bloor@Bloorgroup.com




RDBMS




The SQL Barrier
           SQL has:
            DDL (for data definition)
            DML (for Select, Project and Join)
                But it has no MML or TML

           Usually result sets are brought to the
           client for further manipulation, but
           using them for further data access
           becomes problematic.

           Conclusions:
            This separation of data from process
            is arbitrary and unhelpful

           [Diagram labels: SQL Barrier; "Results processing must be done here"; "Or results processing must be done here"; SQL; Analytic DBMS]
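
The pattern described on this slide can be sketched in a few lines. This is only an illustration, assuming Python's built-in sqlite3 module and a hypothetical `orders` table that is not part of the original talk. The result set is pulled to the client, a median is computed there, and using that result for further data access costs a second round trip per group.

```python
import sqlite3
import statistics

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")  # DDL
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("a", 10.0), ("a", 30.0), ("b", 5.0), ("b", 7.0), ("b", 100.0)],
)

# DML: selection and projection happen inside the DBMS.
rows = conn.execute(
    "SELECT customer, amount FROM orders ORDER BY customer"
).fetchall()

# Further manipulation happens on the client side of the barrier.
by_customer = {}
for customer, amount in rows:
    by_customer.setdefault(customer, []).append(amount)
medians = {c: statistics.median(v) for c, v in by_customer.items()}

# Using the client-side result for further data access means going
# back through SQL again, one query per customer in this sketch.
above_median = {}
for customer, median in medians.items():
    above_median[customer] = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE customer = ? AND amount > ?",
        (customer, median),
    ).fetchone()[0]
```

The second loop is where the barrier shows: the data has to cross between client and database twice for what is logically one analysis.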




That MapReduce Thing
           There are two fundamental
           approaches to parallelism
             Data Partitioning
             Process partitioning
           MapReduce implements an
           approach which is oriented to
           the first of these. It thus proves
           to be suited to many “big data”
           tasks.
           It is not the end of the parallel
           processing story by any means.
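
As a rough sketch of data partitioning, the word count below splits the input records into partitions, runs the same map function on each partition, and then merges the partial results in a reduce step. It is not Hadoop or any particular MapReduce framework; the function names and the use of a thread pool are assumptions made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def partition(records, n):
    """Split the records into n partitions (round-robin)."""
    return [records[i::n] for i in range(n)]


def map_partition(lines):
    """Map step: count words within a single partition."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial counts."""
    return left + right


def word_count(records, workers=3):
    parts = partition(records, workers)
    # Every worker runs the same code on a different slice of the data.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_partition, parts))
    return reduce(reduce_counts, partials, Counter())


records = ["big data", "big table", "graph data", "data store"]
totals = word_count(records)
```

The result does not depend on how many partitions are used, which is what makes this style of parallelism straightforward to scale out. Process partitioning, where different workers run different stages, is a separate design and is not shown here.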




Malcolm Chisholm has 25+ years of experience in
               data management working in finance, insurance,
               manufacturing, government, defense,
               pharmaceuticals, and retail. He specializes in
               data governance, MDM, metadata engineering,
               business rules management/execution, data
               architecture and design. He is a well-known
               presenter at conferences in the U.S. and Europe,
               writes columns in trade journals, and has
               authored the books: Managing Reference Data in
               Enterprise Databases; How to Build a Business
               Rules Engine; and Definitions in Information
               Management. In 2011, Malcolm was presented
               with the prestigious DAMA International
               Professional Achievement Award for
               contributions to Master Data Management.
               He can be contacted at
               mchisholm@refdataportal.com.



Dissection &
                             Discussion




The New Database Revolution:
                        Relational Roundtable

                                          The Virtual Circle

                                            February 22, 2012
                                               San Francisco


                                        Malcolm Chisholm Ph.D.
                                      mchisholm@refdataportal.com
                               Telephone 732-687-9283 • Fax 407-264-6809
                                         www.refdataportal.com
                                        www.bizrulesengine.com

    © AskGet.com Inc., 2012. All rights reserved

“Big Data” Is Used Differently
       Relational Paradigm                              ULS Dataspace in Cloud

       “Set at a time” processing                       Uncover individual facts
       Behavior of populations of identical things      Much is master data
       Event data predominates                          Events are not as much repetitive transactions
       Exception reporting for singular things/events   Can aggregate from individual facts
       (but still top-down)                             (but bottom-up)
       Heavy data entry supported                       Surf and drill
                                                        Data entry is to support analysis


     • The relational paradigm is different to ULS “Big Data”. [ULS = Ultra-Large
       Scale - usually Petabyte scale]
     • Difficult to rely on relational thinking in Cloud databases

Sources
     [Diagram: Sources A to E supply emails, documents, web pages, XML, relational data, flat files, audio, image and video through INGESTION into the ULS Dataspace in Cloud]
     •   Sources provide data to the ULS dataspace
     •   One source can provide many data formats
     •   Many sources can provide the same format
     •   Sources may duplicate the same data
     •   HINT – Think metadata
Segments in Dataspace
     [Diagram: Sources A to N pass through INGESTION into the ULS Dataspace in Cloud. Segments shown: Ingested Data Store, Terms in Documents, Document-Term Inverted Index, Extracted Master Data, Deduplicated Master Data. The segments are linked by M/R steps.]




    •   The ULS dataspace is not a single “blob” of data
    •   It will have different segments with different kinds of data in it
    •   The segments will be derived from the originally ingested data
    •   MapReduce (M/R) is the equivalent of ETL to move data around and
        transform it (filter, summarize)
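
One of the derived segments on this slide, the document-term inverted index, can be sketched as a map step followed by a reduce step. This is a single-process illustration rather than a real MapReduce job; the document ids and texts are made up for the example.

```python
from collections import defaultdict


def map_terms(doc_id, text):
    """Map step: emit one (term, doc_id) pair per distinct term."""
    for term in set(text.lower().split()):
        yield term, doc_id


def reduce_postings(pairs):
    """Reduce step: group document ids by term."""
    index = defaultdict(list)
    for term, doc_id in pairs:
        index[term].append(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Hypothetical ingested data store: document id -> raw text.
ingested = {
    "d1": "graph data store",
    "d2": "big data",
    "d3": "graph db",
}

pairs = [
    pair
    for doc_id, text in ingested.items()
    for pair in map_terms(doc_id, text)
]
inverted_index = reduce_postings(pairs)
```

The ingested store is left untouched and the index is written as a new segment, which matches the ETL-like role the slide gives to M/R.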

No Common Notation for Columnar Designs

                                                                            ?


                                                            Col A   Col B   Col C   Col D   Col E
                                                   Row 01   Val1A
                                                   Row 02   Val2A   Val2B   Val2C   Val2D   Val2E
                                                   Row 03   Val3A           Val3C           Val3E




    •   E/R diagramming techniques allow us to visualize a relational database
    •   There is nothing that is quite the same for columnar databases
    •   (a) It is sparse and columns may be missing
    •   (b) How do you show the MapReduce transformations (not quite relations)?
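
Point (a) can be made concrete with a small sketch. Assuming each row is stored as a mapping from column name to value (an assumption for illustration, not a description of any specific product), the three rows from the slide look like this, and the set of columns can only be discovered by scanning the data.

```python
# Rows from the slide, stored sparsely: absent columns are simply not there.
rows = {
    "Row 01": {"Col A": "Val1A"},
    "Row 02": {
        "Col A": "Val2A",
        "Col B": "Val2B",
        "Col C": "Val2C",
        "Col D": "Val2D",
        "Col E": "Val2E",
    },
    "Row 03": {"Col A": "Val3A", "Col C": "Val3C", "Col E": "Val3E"},
}

# There is no global schema, so the column list is derived from the data.
all_columns = sorted({col for cells in rows.values() for col in cells})

# Columns that each row does not carry.
missing = {
    key: [col for col in all_columns if col not in cells]
    for key, cells in rows.items()
}


def get(row_key, column, default=None):
    """Read a cell, returning a default when the column is absent."""
    return rows[row_key].get(column, default)
```

Because the shape differs row by row, a fixed box-and-line diagram in the E/R style has nothing stable to draw.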



Need a Data Dictionary




    • The ULS dataspace can grow quickly and have many data objects
    • Without a DD developers and users will get hopelessly lost (none of the
      logic imposed by the relational model)
    • The fundamental unit is the field – show where it occurs in rows, ColQuals
      and payloads
    • Tables less important than in relational
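
A field-centric data dictionary of the kind described here might look like the sketch below. Everything in it is hypothetical: the class names, the three location kinds (taken from the slide's "rows, ColQuals and payloads"), and the sample fields. It only shows the idea of indexing by field rather than by table.

```python
from dataclasses import dataclass, field

# Hypothetical location kinds, following the slide's wording.
LOCATIONS = {"row", "colqual", "payload"}


@dataclass
class FieldEntry:
    name: str
    description: str
    # (data_object, location) pairs recording where the field occurs.
    occurrences: list = field(default_factory=list)


class DataDictionary:
    def __init__(self):
        self._fields = {}

    def register(self, name, description, data_object, location):
        """Record that a field occurs in a data object at a location."""
        if location not in LOCATIONS:
            raise ValueError(f"unknown location: {location}")
        entry = self._fields.setdefault(name, FieldEntry(name, description))
        entry.occurrences.append((data_object, location))

    def where_used(self, name):
        """Return every place a field is known to occur."""
        return list(self._fields[name].occurrences)


dd = DataDictionary()
dd.register("customer_id", "Identifier of a customer", "ingested_store", "row")
dd.register("customer_id", "Identifier of a customer", "master_data", "colqual")
dd.register("email_body", "Raw text of an email", "ingested_store", "payload")
```

Calling `dd.where_used("customer_id")` then lists both places the field appears, which is the lookup a developer needs when no relational schema is there to consult.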

Dissection &
                             Discussion




Mark Madsen is founder of Third Nature, a
                research and consulting firm focused on
                analytics, BI and decision-making. Mark
                spent the past two decades working on
                analysis and decision support in many
                industries and countries. He is an award-
                winning architect and former CTO whose
                work has been featured in numerous
                industry publications. Over the past ten
                years Mark received awards for his work
                from the American Productivity & Quality
                Center, TDWI, and the Smithsonian Institute.
                He is an international speaker, a contributing
                editor at Intelligent Enterprise, and manages
                the open source channel at the Business
                Intelligence Network. For more information
                or to contact Mark, visit
                http://ThirdNature.net.




One Size Doesn’t Fit All



                             February 22, 2012

                             Mark R. Madsen
                             http://ThirdNature.net




The future of data is the database




You keep using that word.
                             I do not think it means
                             what you think it means.




Good conceptual model, but a prematurely
                             standardized implementation




The relational database is the franchise technology for storing and
retrieving data, but…
1. Global, static schema model
2. No rich typing system
3. Many are not a good fit for network parallel computing, aka cloud
4. Limited API in atomic SQL statement syntax & simple result set return



Plus, if they’re all the same why are there so many?
       Sybase IQ, ASE                        EnterpriseDB        Algebraix
       Teradata, Aster Data                  LucidDB             Intersystems Caché
       Oracle, RAC                           Vectorwise          Streambase
       Microsoft SQLServer, PDW              MonetDB             SQLStream
       IBM DB2s, Netezza                     Exasol              Coral8
       Paraccel                              Illuminate          Ingres
       Kognitio                              Vertica             Postgres
       EMC/Greenplum                         InfiniDB            Cassandra
       Oracle Exadata                        1010 Data           CouchDB
       SAP HANA                              SAND                Mongo
       Infobright                            Endeca              Hbase
       MySQL                                 Xtreme Data         Redis
       MarkLogic                             IMS                 RainStor
       Tokyo Cabinet                         Hive                Scalaris

                             And a few hundred more.
The future of data is the relational database?




                             SQL                       noSQL








                             Technologies are not
                             perfect replacements for
                             one another.

                             When replacing the old
                             with the new (or ignoring
                             the new over the old) you
                             always make tradeoffs, and
                             usually you won’t see them
                             for a long time.


Dissection &
                             Discussion




March:
                             Vendor Research
                             March 14th: Second Round Table focusing on NoSQL databases and
                             their application
                             DB Revolution Survey conducted

                    April:
                             Vendor Research
                             Publishing of Round Table Transcripts, with comments
                    May:
                             Authoring of White Paper
                             Publishing of White Paper
                             Publishing of survey activity



March 14th: Second DB
               Revolution Round Table

               March Briefing Room:
               Integration

               April Briefing Room:
               Discovery

               May Briefing Room: Analytics




Thank You
                              For Your
                             Attention



How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 

Kürzlich hochgeladen (20)

Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 

Roundtable 1: Relational and Analytic Database Innovations

  • 2. Eric Kavanagh Eric.kavanagh@bloorgroup.com Twitter Tag: #briefr Wednesday, February 22, 12
  • 3. To conduct an Open Research program that invites the participation of both IT users and technology vendors. To assist IT buyers in understanding database technology and the architecture that surrounds it. Allow audience members to pose serious questions... and get answers! Publish all findings.
  • 4. Your Host: Eric Kavanagh. Research Leader: Mark Madsen - Third Nature. Primary Collaborator: Robin Bloor - The Bloor Group. Guest Analyst 1: Rajeev Rawat - BI Results. Guest Analyst 2: Malcolm Chisholm - Consultant.
  • 5. Rajeev Rawat is the founder and CEO of BI Results. His career has involved leading large cross-functional teams at both IBM and Xerox, where he was involved in direct customer-facing roles as well as taking part in headquarters assignments. His headquarters positions with worldwide responsibility included strategic assignments for alliances and relationships with technology partners, product management and product marketing. Other responsibilities include restructuring business models, test of new technology platforms, and sales coverage plans. Rajeev led the introduction of new technologies and solutions for Xerox and IBM. www.biresults.com, biresult@gmail.com LinkedIn: Rajeev Rawat
  • 6. The Bloor Group. Fit for Purpose: The New Database Revolution. The Bloor Group, February 22, 2012. Five Years of Incredible Excitement in Information Acrobatics! Seismic shift in data: Variety, Volume, Velocity. Rajeev Rawat. Serving to achieve your full potential. ©Copyright BI Results, LLC 2012
  • 7. The Next Five Years: The Most Exciting Times in Information Acrobatics. New Venture Funding. Key Value Store, Big Table, Graph DB, Document DB. New (Needed) Functionality. New Skills. New Ventures. Innovative Code. NoSQL Innovation: Lots of Great Innovation (Apache Project, Amazon, Facebook, Google, Open Source Community, Twitter). Reports of the Death of the RDBMS Are Highly Exaggerated.
  • 8. RDBMS Still Dominates: Reliable Heavy Lifting. RDBMS vs. NoSQL? Strengths: Robust (ACID, Fail-proof); Structure (Granular, Scalable, Fast); Governance (Backups, Precision); Tools (ETL, Analytics, Reporting); Ecosystem (Global deep collaboration); Skills (Certifications, Experience); Policies, Procedures (Reliability); Documentation (Support, Training). Photo: Watchmojo.com. Reports of the Death of RDBMS Are Highly Exaggerated.
  • 9. NoSQL Being Tested, Validated, Calibrated. Key Value Store, Big Table, Graph DB, Document DB. Co-Existence, Transition, NoSQL Only. Meta Tag, Master Data, Other scheme/s. Data Governance, Controls, Authentication, Security. Deep Analytics on Mixed Datasets: Complexity, Semi-Structured, Highly Connected Data. Fantastic Growth Opportunity: Skills, Investing.
  • 10. NoSQL, RDBMS Innovation: Fantastic Opportunity for Growth. Gaps You Can Help Close. The Race Is On! (Finish Line). Mapping Big Data with Legacy Data. Strategy and Policy for Governance, Precision, Controls. Opportunities at all sides: Enterprise, Legacy Vendors, Innovative Ventures. Tested for Prime Time: Technology and Business. Time to Rise to the Top: Skills, Investing.
  • 11. Dissection & Discussion
  • 12. Robin Bloor is Chief Analyst at The Bloor Group. Robin.Bloor@Bloorgroup.com
  • 15. The SQL Barrier. SQL has DDL (for data definition) and DML (for Select, Project and Join), but it has no MML or TML. [Diagram: an Analytic DBMS behind the SQL barrier returns results; processing must be done on either side of the barrier.] Usually result sets are brought to the client for further manipulation, but using them for further data access becomes problematic. Conclusion: this separation of data from process is arbitrary and unhelpful.
  • 16. That MapReduce Thing. There are two fundamental approaches to parallelism: data partitioning and process partitioning. MapReduce implements an approach oriented to the first of these, and so proves suited to many “big data” tasks. It is not the end of the parallel processing story by any means.
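As a rough illustration of the data-partitioning approach that slide 16 attributes to MapReduce, the sketch below splits the input records into partitions, applies the same map function to each partition, and then reduces the grouped results. It is a single-process toy, not how any real MapReduce framework is implemented; the function names (`partition`, `map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def partition(records, n):
    """Data partitioning: split the records into n round-robin partitions."""
    return [records[i::n] for i in range(n)]


def map_phase(partition_records):
    """Map: emit a (word, 1) pair for every word in one partition."""
    return [(word.lower(), 1)
            for line in partition_records
            for word in line.split()]


def shuffle(mapped_partitions):
    """Shuffle: group the emitted values by key across all partitions."""
    groups = defaultdict(list)
    for pairs in mapped_partitions:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(records, n_partitions=2):
    parts = partition(records, n_partitions)
    # In a real cluster each partition would be mapped on a separate node;
    # here the loop only simulates that independence.
    mapped = [map_phase(p) for p in parts]
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    print(word_count(["big data", "big table", "data data"]))
```

The same process runs everywhere and only the data differs per partition, which is what distinguishes this from process partitioning, where different stages of the work would be assigned to different workers.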
  • 17. Malcolm Chisholm has 25+ years of experience in data management working in finance, insurance, manufacturing, government, defense, pharmaceuticals, and retail. He specializes in data governance, MDM, metadata engineering, business rules management/execution, data architecture and design. He is a well-known presenter at conferences in the U.S. and Europe, writes columns in trade journals, and has authored the books: Managing Reference Data in Enterprise Databases; How to Build a Business Rules Engine; and Definitions in Information Management. In 2011, Malcolm was presented with the prestigious DAMA International Professional Achievement Award for contributions to Master Data Management. He can be contacted at mchisholm@refdataportal.com.
  • 18. Dissection & Discussion
  • 19. The New Database Revolution: Relational Roundtable. The Virtual Circle, February 22, 2012, San Francisco. Malcolm Chisholm Ph.D. mchisholm@refdataportal.com Telephone 732-687-9283 • Fax 407-264-6809 www.refdataportal.com www.bizrulesengine.com © AskGet.com Inc., 2012. All rights reserved
  • 20. “Big Data” Is Used Differently. Relational Paradigm ULS Dataspace in Cloud “Set at a time” processing Uncover individual facts Behavior of populations of identical things Much is master data Event data predominates Events are not as much repetitive transactions Exception reporting for singular things/events (but still top-down) Can aggregate from individual facts (but bottom-up) Heavy data entry supported Surf and drill Data entry is to support analysis • The relational paradigm is different to ULS “Big Data”. [ULS = Ultra-Large Scale - usually Petabyte scale] • Difficult to rely on relational thinking in Cloud databases
  • 21. Sources. [Diagram: Sources A to E pass through INGESTION into the ULS Dataspace in Cloud. Formats shown: Emails, Documents, Web Pages, XML, Relational, Flat Files, Audio, Image, Video.] • Sources provide data to the ULS dataspace • One source can provide many data formats • Many sources can provide the same format • Sources may duplicate the same data • HINT – Think metadata
  • 22. Segments in Dataspace. [Diagram: Sources A, B, C ... N pass through INGESTION into the Ingested Data Store in the ULS Dataspace in Cloud. M/R steps derive further segments: Terms in Documents, Document-Term Inverted Index, Extracted Master Data, Deduplicated Master Data.] • The ULS dataspace is not a single “blob” of data • It will have different segments with different kinds of data in it • The segments will be derived from the originally ingested data • MapReduce (M/R) is the equivalent of ETL to move data around and transform it (filter, summarize)
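Slide 22 lists a document-term inverted index as one of the segments derived from ingested data by MapReduce. A minimal sketch of that derivation follows, assuming the ingested data store can be modeled as a dictionary of document id to text; the function names and the sample documents are hypothetical, and the slide does not say how the original system did this.

```python
from collections import defaultdict


def map_document(doc_id, text):
    """Map: emit one (term, doc_id) pair per distinct term in a document."""
    return [(term, doc_id) for term in set(text.lower().split())]


def reduce_terms(pairs):
    """Reduce: collect, for each term, the sorted list of documents containing it."""
    index = defaultdict(set)
    for term, doc_id in pairs:
        index[term].add(doc_id)
    return {term: sorted(doc_ids) for term, doc_ids in index.items()}


def build_inverted_index(documents):
    """Derive the inverted-index segment from the ingested documents."""
    pairs = []
    for doc_id, text in documents.items():
        pairs.extend(map_document(doc_id, text))
    return reduce_terms(pairs)


if __name__ == "__main__":
    ingested = {"d1": "master data store", "d2": "event data"}
    for term, doc_ids in sorted(build_inverted_index(ingested).items()):
        print(term, doc_ids)
```

The ingested store is left untouched and the index is written as a separate segment, which matches the slide's point that M/R plays the role ETL plays in a relational environment.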
  • 23. No Common Notation for Columnar Designs. [Example table with columns Col A to Col E. Row 01: Val1A. Row 02: Val2A, Val2B, Val2C, Val2D, Val2E. Row 03: Val3A, Val3C, Val3E.] • E/R diagramming techniques allow us to visualize a relational database • There is nothing that is quite the same for columnar databases • (a) It is sparse and columns may be missing • (b) How do you show the MapReduce transformations (not quite relations)?
  • 24. Need a Data Dictionary • The ULS dataspace can grow quickly and have many data objects • Without a DD, developers and users will get hopelessly lost (none of the logic imposed by the relational model) • The fundamental unit is the field – show where it occurs in rows, ColQuals and payloads • Tables less important than in relational
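Slide 24 proposes a data dictionary whose fundamental unit is the field, recording where each field occurs in rows, ColQuals and payloads. One way this could be modeled is sketched below; the class names, the location labels (`"row"`, `"colqual"`, `"payload"`) and the sample field are all assumptions made for illustration, not a structure described in the presentation.

```python
from dataclasses import dataclass, field


@dataclass
class FieldEntry:
    """One data dictionary entry; the field, not the table, is the unit."""
    name: str
    definition: str
    occurrences: list = field(default_factory=list)  # (table, location) pairs


class DataDictionary:
    # Hypothetical labels for the three places slide 24 says a field can occur.
    LOCATIONS = {"row", "colqual", "payload"}

    def __init__(self):
        self._fields = {}

    def define(self, name, definition):
        """Register a field together with its business definition."""
        self._fields[name] = FieldEntry(name, definition)

    def record_occurrence(self, name, table, location):
        """Note that a defined field appears in a table at a given location."""
        if location not in self.LOCATIONS:
            raise ValueError(f"unknown location: {location}")
        self._fields[name].occurrences.append((table, location))

    def where_used(self, name):
        """Return every (table, location) where the field occurs."""
        return list(self._fields[name].occurrences)


if __name__ == "__main__":
    dd = DataDictionary()
    dd.define("customer_id", "Identifier assigned to a customer")
    dd.record_occurrence("customer_id", "orders", "row")
    dd.record_occurrence("customer_id", "events", "payload")
    print(dd.where_used("customer_id"))
```

Looking a field up by name and getting back every place it lives is the query a developer would need in a dataspace that has no relational schema to consult.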
  • 25. Dissection & Discussion
  • 26. Mark Madsen is founder of Third Nature, a research and consulting firm focused on analytics, BI and decision-making. Mark spent the past two decades working on analysis and decision support in many industries and countries. He is an award-winning architect and former CTO whose work has been featured in numerous industry publications. Over the past ten years Mark received awards for his work from the American Productivity & Quality Center, TDWI, and the Smithsonian Institute. He is an international speaker, a contributing editor at Intelligent Enterprise, and manages the open source channel at the Business Intelligence Network. For more information or to contact Mark, visit http://ThirdNature.net.
  • 27. One Size Doesn’t Fit All. February 22, 2012. Mark R. Madsen, http://ThirdNature.net
  • 28. The future of data is the database
  • 29. You keep using that word. I do not think it means what you think it means.
  • 30. Good conceptual model, but a prematurely standardized implementation. The relational database is the franchise technology for storing and retrieving data, but… 1. Global, static schema model 2. No rich typing system 3. Many are not a good fit for network parallel computing, aka cloud 4. Limited API in atomic SQL statement syntax & simple result set return
  • 31. Plus, if they’re all the same why are there so many? Sybase IQ, ASE; EnterpriseDB; Algebraix; Teradata, Aster Data; LucidDB; Intersystems Caché; Oracle, RAC; Vectorwise; Streambase; Microsoft SQLServer, PDW; MonetDB; SQLStream; IBM DB2s, Netezza; Exasol; Coral8; Paraccel; Illuminate; Ingres; Kognitio; Vertica; Postgres; EMC/Greenplum; InfiniDB; Cassandra; Oracle Exadata; 1010 Data; CouchDB; SAP HANA; SAND; Mongo; Infobright; Endeca; Hbase; MySQL; Xtreme Data; Redis; MarkLogic; IMS; RainStor; Tokyo Cabinet; Hive; Scalaris. And a few hundred more.
  • 32. The future of data is the relational database? SQL, noSQL
  • 33. The future of data is the relational database? SQL, noSQL
  • 34. Technologies are not perfect replacements for one another. When replacing the old with the new (or ignoring the new over the old) you always make tradeoffs, and usually you won’t see them for a long time.
  • 35. Dissection & Discussion
  • 37. March: Vendor Research; March 14th: Second Round Table focusing on NoSQL databases and their application; DB Revolution Survey conducted. April: Vendor Research; Publishing of Round Table Transcripts, with comments. May: Authoring of White Paper; Publishing of White Paper; Publishing of survey activity.
  • 38. March 14th: Second DB Revolution Round Table. March Briefing Room: Integration. April Briefing Room: Discovery. May Briefing Room: Analytics.
  • 39. Thank You For Your Attention