www.Objectivity.com




                      Choosing The Right Big
                      Data Tools For The Job
                      – A Polyglot Approach

                      A Webinar Presented by Leon Guzenda
                                 on August 9, 2012
Overview

The Problem

•
    Current Big Data Analytics

•
    Relationship Analytics

•
    Leveraging Alternative Technologies
    –
      NoSQL

•
    The Polyglot Approach
About Objectivity Inc.
Company          • Objectivity, Inc. is headquartered in Sunnyvale, CA.
                 • Established in 1988 to tackle database problems that network/hierarchical/relational and file-based technologies
                 struggle with.

                 • Objectivity has over two decades of Big Data and NoSQL experience

Products         • Develops NoSQL platforms for managing and discovering relationships and patterns in complex data:
                          • Objectivity/DB - an object database that manages localized, centralized or distributed databases
                          • InfiniteGraph - a massively scalable graph database built on Objectivity/DB that enables organizations
                          to find, store and exploit the relationships in their data


Markets          • The Big Data market is projected to be around $12B in 2012, with a CAGR of 28% over the next five years.
                 • 40% per year data growth, cloud adoption, mobile usage and improved real-time analytics underpin Objectivity’s
                 growth opportunities as a Big Data analytics enabler.


Customers • Embedded in hundreds of enterprises, government organizations and products - millions of deployments.

Financials • Consistently generates increased revenues.
                 • Privately held by the employees and a few venture capital companies.



      Copyright © Objectivity, Inc. 2012
The Problem

Information Overload!

Making sense of it all takes time and $$$




            Current “Big Data” Analytics
A Typical “Big Data” Analytics Setup

                       Data Aggregation and Analytics Applications


          Commodity Linux Platforms and/or High Performance Computing Clusters




          RDBMS | Column Store | Data W/H | Graph DB | Object DB | Hadoop | Doc DB | K-V Store


         Structured                 Semi-Structured                     Unstructured
Leveraging Alternative Technologies
Not Only SQL – a group of 4 primary technologies
•
    Users choose between four different primary technologies for different
    purposes:
    –
        Key-Value Stores
    –
        “Big Table” Clones
    –
        Document Databases
    –
        Object and Graph databases (including InfiniteGraph)

•
    Many implementations sacrifice consistency (giving up ACID transactions
    and settling for eventual consistency, per the CAP theorem) for performance.

•
    Technologies such as Objectivity/DB and InfiniteGraph offer ACID
    transactions, with consistency and performance.
The NoSQL Market
Key-Value Stores

“Dynamo: Amazon’s Highly Available Key-Value Store” [2007]

•
    Data model:
    –
        Global key-value mapping
    –
        Scalable (sharded) HashMap
    –
        Highly fault tolerant (typically)

•
    Examples:
    –
        Riak, Redis and Voldemort
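
The data model above can be sketched in a few lines of Python. This is only a toy illustration: the class name `ShardedKV`, the shard count and the hash-based routing are assumptions made for the example, not the design of Dynamo, Riak, Redis or Voldemort.

```python
import hashlib


class ShardedKV:
    """Toy sharded key-value map: each key is routed to one shard by hash."""

    def __init__(self, num_shards=4):
        # Each shard is a plain dict standing in for a separate node.
        self.shards = [{} for _ in range(num_shards)]

    def _shard_for(self, key):
        # A stable hash, so the same key always lands on the same shard.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.shards[int(digest, 16) % len(self.shards)]

    def put(self, key, value):
        self._shard_for(key)[key] = value

    def get(self, key, default=None):
        return self._shard_for(key).get(key, default)


store = ShardedKV(num_shards=4)
store.put("user:42", {"name": "Ada"})
store.put("user:43", {"name": "Grace"})
```

The store sees only keys and opaque values. It has no notion of a link between `user:42` and `user:43`, so any relationship has to be maintained by the application.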
Key-Value Stores: Pros & Cons
•
    Strengths:
    –
        Simple data model
    –
        Great at scaling out horizontally
    –
        Scalable
    –
        Available
    [Diagram: KEY -> VALUE]
•
    Weaknesses:
    –
        Simplistic data model
    –
        Poor for complex data
    –
        Unsuited for interconnected data
Big Table Clones – Column Family
•
    Google’s “Bigtable: A Distributed Storage System for
    Structured Data” [2006]
•
    Column-family stores are essentially Bigtable clones.
    [Diagram: KEY -> Column (Name, Value, D/Time)]
•
    Data Model:
    –
        A big table, with column families.
    –
        Map-reduce for parallel query/processing.

•
    Examples:
    –
        HBase, Hypertable and Cassandra.
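
A rough Python sketch of the column-family cell layout (row key, column name, value, timestamp). The class `ColumnFamilyTable` and the `family:qualifier` column naming are assumptions for illustration; real systems add persistence, distribution and map-reduce processing on top.

```python
from collections import defaultdict


class ColumnFamilyTable:
    """Toy table: row key -> column name -> list of (timestamp, value) versions."""

    def __init__(self):
        self.rows = defaultdict(lambda: defaultdict(list))

    def put(self, row_key, column, value, timestamp):
        # Every write is kept; cells are versioned by timestamp.
        self.rows[row_key][column].append((timestamp, value))

    def get(self, row_key, column):
        # Return the value with the newest timestamp, or None if absent.
        versions = self.rows.get(row_key, {}).get(column)
        if not versions:
            return None
        return max(versions, key=lambda tv: tv[0])[1]


table = ColumnFamilyTable()
table.put("row1", "profile:name", "Ada", timestamp=1)
table.put("row1", "profile:name", "Ada L.", timestamp=2)
table.put("row1", "stats:visits", 7, timestamp=1)
```

Rows need not share the same columns, which is what makes the model a fit for semi-structured data.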
Big Table Clones – Pros & Cons
•
    Strengths:
    –
        Data model supports semi-structured data
    –
        Naturally indexed (columns)
    –
        Good at scaling out horizontally

    [Diagram: KEY -> Column (Name, Value, D/Time)]
•
    Weaknesses:
    –
        Complex data model
    –
        Unsuited for highly interconnected data
Document Databases
•
    Data Model:
    –
        A collection of unstructured or semi-structured documents.
    –
        Each document is referenced using a key-value pair.
    –
        The “value” can range from unstructured text to a collection of key-
        value pairs or a group of XML objects.
    –
        Index-centric to support queries based on content.

•
    Examples:
    –
        CouchDB and MongoDB.

    [Diagram: KEY -> DOCUMENT]
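
A minimal Python sketch of the document model: documents stored by key, plus a secondary index that answers content-based queries. The `DocumentStore` class and its methods are hypothetical and do not reflect the CouchDB or MongoDB APIs.

```python
from collections import defaultdict


class DocumentStore:
    """Toy document store: key -> document (a dict), with one index per chosen field."""

    def __init__(self, indexed_fields):
        self.docs = {}
        self.indexed_fields = list(indexed_fields)
        # index[field][value] -> set of document keys
        self.index = {f: defaultdict(set) for f in self.indexed_fields}

    def put(self, key, document):
        # Drop stale index entries if the key is being overwritten.
        old = self.docs.get(key)
        if old is not None:
            for f in self.indexed_fields:
                if f in old:
                    self.index[f][old[f]].discard(key)
        self.docs[key] = document
        for f in self.indexed_fields:
            if f in document:
                self.index[f][document[f]].add(key)

    def find(self, field, value):
        # Content-based query answered from the index (indexed fields only).
        keys = self.index[field].get(value, set())
        return [self.docs[k] for k in sorted(keys)]


db = DocumentStore(indexed_fields=["city"])
db.put("d1", {"name": "Ada", "city": "London"})
db.put("d2", {"name": "Grace", "city": "New York", "tags": ["navy"]})
db.put("d3", {"name": "Alan", "city": "London"})
```

Documents need not share a schema (`d2` carries an extra `tags` field), but a query on a field that was not indexed is not supported in this sketch.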
Document Databases – Pros & Cons
•
    Strengths:
    –
        Simple, powerful data model
    –
        Good scalability if sharding is supported

    [Diagram: KEY -> DOCUMENT]
•
    Weaknesses:
    –
        Unsuited for interconnected data
    –
        Query model is limited to keys and indexes
    –
        Generally uses Map-Reduce (designed for batch operations) for
        larger queries
Object Databases
•
    Data Model [ODMG'93]:
    –
        Objects have a Class (type) and a group of Values
    –
        Each Object instance has a unique Object Identifier [OID]
    –
        Connections use Object Identifiers for efficiency
    –
        Supports class inheritance and polymorphism

•
    Examples:
    –
        Objectivity/DB and db4objects

    [Diagram: OID -> OBJECT, with Connections to other objects]
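
The object model can be illustrated with a short Python sketch: every instance gets a unique OID, connections are stored as OIDs, and subclasses override behavior polymorphically. The `PersistentObject` base class and the in-memory `_store` are assumptions for the example; this is not the Objectivity/DB or db4objects API.

```python
import itertools


class PersistentObject:
    """Base class: every instance receives a unique object identifier (OID)."""

    _next_oid = itertools.count(1)
    _store = {}  # OID -> object; stands in for the database

    def __init__(self):
        self.oid = next(PersistentObject._next_oid)
        self.connections = []  # OIDs of related objects
        PersistentObject._store[self.oid] = self

    def connect(self, other):
        self.connections.append(other.oid)

    def related(self):
        # Navigation: follow OIDs directly, with no JOIN table involved.
        return [PersistentObject._store[oid] for oid in self.connections]

    def describe(self):
        return f"object {self.oid}"


class Person(PersistentObject):
    def __init__(self, name):
        super().__init__()
        self.name = name

    def describe(self):  # polymorphic override
        return f"person {self.name}"


class Company(PersistentObject):
    def __init__(self, name):
        super().__init__()
        self.name = name

    def describe(self):  # polymorphic override
        return f"company {self.name}"


ada = Person("Ada")
acme = Company("Acme")
ada.connect(acme)
```

Calling `describe()` on each object returned by `ada.related()` dispatches to the right subclass without the caller knowing the concrete type.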
Object Databases – Pros & Cons
•
    Strengths:
    –
        Simple, powerful data model that includes inheritance and
        polymorphism
    –
        Every object has a class (type) and a unique Object Identifier
    –
        Good scalability if sharding is supported
    –
        Uses Object Identifiers instead of JOIN tables to support very fast
        navigational operations

    [Diagram: OID -> OBJECT, with Connections to other objects]
•
    Weaknesses:

    –
        The query language never became a standard
    –
        Supports standard object oriented languages but isn't supported by
        a wide range of third party tools in the way that SQL is.
Graph Databases
•
    Data model:
    –
        Node (Vertex) and Relationship (Edge) objects
    –
        Directed
    –
        May be a hypergraph (edges with multiple endpoints)

•
    Examples:
    –
        InfiniteGraph, Neo4j, OrientDB, AllegroGraph, TitanDB and Dex


    [Diagram: VERTEX and EDGE, linked with cardinalities 2 and N]
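
A small Python sketch of the vertex/edge model with directed edges and one-hop navigation. The names `Vertex`, `Edge`, `connect` and `neighbors` are made up for the example and are not taken from InfiniteGraph, Neo4j or any other product listed above; hypergraph edges are not modeled.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality; avoids comparing cyclic structures
class Vertex:
    name: str
    out_edges: list = field(default_factory=list)


@dataclass(eq=False)
class Edge:
    kind: str
    source: Vertex
    target: Vertex
    properties: dict = field(default_factory=dict)


def connect(source, target, kind, **properties):
    """Create a directed edge and attach it to its source vertex."""
    edge = Edge(kind, source, target, properties)
    source.out_edges.append(edge)
    return edge


def neighbors(vertex, kind=None):
    """Navigate one hop along outgoing edges, optionally filtered by edge kind."""
    return [e.target for e in vertex.out_edges if kind is None or e.kind == kind]


alice = Vertex("Alice")
bob = Vertex("Bob")
acme = Vertex("Acme")
connect(alice, bob, "knows", since=2010)
connect(alice, acme, "works_at")
```

Because the edge is a first-class object, it can carry its own data (here `since=2010`), which is the "data in those connections" that node-centric stores tend to lose.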
Graph Databases – Pros & Cons
•
    Strengths:
    –
        Extremely fast for connected data
    –
        Scales out, typically
    –
        Easy to query (navigation)
    –
        Simple data model

•
    Weaknesses:
    –
        May not support distribution or sharding
    –
        Requires conceptual shift... a different way of thinking


    [Diagram: VERTEX and EDGE, linked with cardinalities 2 and N]
Competing “Big Data” Analytics Solutions
Typical “Big Data” Analytics Phases



      Front-End Processing  ->  Repository  ->  Analytics and Visualization Tools




      The strategic competitors are all moving in the same direction
Incremental Improvements Aren’t Enough

All current solutions use the same basic architectural model

• None of the current solutions have a way to store connections between
  entities in different silos

• Most analytic technology focuses on the content of the data nodes,
  rather than the many kinds of connections between the nodes and the
  data in those connections

• Why? Because relational and most NoSQL solutions are bad at handling
  relationships.

• Object and Graph databases can efficiently store, manage and query the
  many kinds of relationships hidden in the data.
Relationship Analytics
Example 1 - Market Analysis
The 10 companies that control a majority of U.S. consumer goods brands
Example 2 - Demographics
Used in social network analysis, marketing, medical research etc.
Example 3 - Seed To Consumer Tracking




                                        ?
Example 4 - Ad Placement Networks

Smartphone Ad placement - based on the user’s profile and location data
 captured by opt-in applications.

• The location data can be stored and distilled in a key-value and column store
  hybrid database, such as Cassandra

• The locations are matched with geospatial data to deduce user interests.
• As Ad placement orders arrive, an application built on a graph database, such
  as InfiniteGraph, matches groups of users with Ads:

• Maximizes relevance for the user.
• Yields maximum value for the advertiser and the placer.
Example 5 - Healthcare Informatics



Problem: Physicians need better electronic records to manage patient data on a global
 basis and to match symptoms, causes, treatments and interdependencies to improve
 diagnoses and outcomes.

• Solution: Create a database capable of leveraging existing architecture using NoSQL tools
  such as Objectivity/DB and InfiniteGraph that can handle data capture, symptoms,
  diagnoses, treatments, reactions to medications, interactions and progress.

• Result: It works:
  • Diagnosis is faster and more accurate
  • The knowledge base tracks similar medical cases.
  • Treatment success rates have improved.
Relationship (Connection) Analytics...
Relational Database
Think about the SQL query for finding all links between the two “blue” rows... Good luck!
               Table_A       Table_B    Table_C   Table_D   Table_E      Table_F        Table_G




       Relational databases aren’t good at handling complex relationships!
Objectivity/DB or InfiniteGraph - The solution can be found with a few lines of code

          A3                                                                                      G4
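
To give a sense of what "a few lines of code" can look like once links are stored as a graph, here is a generic breadth-first search in Python. It is not the Objectivity/DB or InfiniteGraph API, and every link below is hypothetical; only the endpoint names A3 and G4 come from the slide.

```python
from collections import deque


def find_path(links, start, goal):
    """Breadth-first search over undirected links; returns one shortest path or None."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through the predecessors to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in adjacency.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical row-to-row links spanning several tables.
links = [("A3", "B7"), ("B7", "C2"), ("C2", "D5"), ("D5", "G4"),
         ("B7", "E1"), ("E1", "F9")]
path = find_path(links, "A3", "G4")
```

The same question in SQL needs a chain of joins (or a recursive query) whose shape depends on how many tables lie between the two rows, which is not known in advance.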
Visual Analytics
The Polyglot Approach
Lesson 1 – The Repository Matters A Lot

NEED                          RDBMS   Key-Value   Column Family   Document Database   ODBMS   Graph Database
OLTP                          YES     No          Maybe           No                  Maybe   No
Text Handling                 No      No          No              YES                 Maybe   No
Multimedia                    No      Maybe       No              Maybe               YES     Maybe
Engineering/Scientific        No      No          No              No                  YES     Maybe
Business Intelligence         YES     No          Maybe           No                  Maybe   Maybe
Log Processing                Maybe   No          Maybe           No                  YES     Maybe
Connection Handling/Analysis  No      No          No              No                  Maybe   YES
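
One way to put the Lesson 1 table to work is to encode it as a lookup and ask which repositories fit each need in a workload. The ratings below are transcribed from the table; the function names `candidates` and `plan` are assumptions for this sketch.

```python
REPOSITORIES = ["RDBMS", "Key-Value", "Column Family",
                "Document Database", "ODBMS", "Graph Database"]

# Ratings transcribed from the Lesson 1 table, one row per need.
FIT = {
    "OLTP":                         ["YES", "No", "Maybe", "No", "Maybe", "No"],
    "Text Handling":                ["No", "No", "No", "YES", "Maybe", "No"],
    "Multimedia":                   ["No", "Maybe", "No", "Maybe", "YES", "Maybe"],
    "Engineering/Scientific":       ["No", "No", "No", "No", "YES", "Maybe"],
    "Business Intelligence":        ["YES", "No", "Maybe", "No", "Maybe", "Maybe"],
    "Log Processing":               ["Maybe", "No", "Maybe", "No", "YES", "Maybe"],
    "Connection Handling/Analysis": ["No", "No", "No", "No", "Maybe", "YES"],
}


def candidates(need, include_maybe=False):
    """Repositories rated YES (and optionally Maybe) for a given need."""
    accepted = {"YES", "Maybe"} if include_maybe else {"YES"}
    return [repo for repo, rating in zip(REPOSITORIES, FIT[need])
            if rating in accepted]


def plan(needs):
    """Map each need in a workload to its best-fit repositories."""
    return {need: candidates(need) for need in needs}
```

A workload that mixes, say, OLTP, text handling and connection analysis maps to three different repositories, which is the polyglot argument in miniature.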
Lesson 2 – Languages and Tools Matter Too

  NEED                          Repository          Language             BI Tools   Visual Analytics
  OLTP                          RDBMS               SQL, Java            YES        Maybe
  Text                          Document Database   Java, XML            No         Maybe
  Multimedia                    ODBMS               Java, C++            No         Maybe
  Eng/Science                   ODBMS               C, C++, R, Fortran   Maybe      YES
  Business Intelligence         RDBMS               Java, SQL, R         YES        YES
  Log Processing                NoSQL, ODBMS        C++, R, Java, SQL    Maybe      YES
  Connection Handling/Analysis  Graph Database      Java, C++, SPARQL    Maybe      YES
SUMMARY: A Polyglot Approach Works Best...


          LANGUAGE                 REPOSITORY




                      PROBLEM




                      ANALYTICS




      BI TOOLS       GRAPH TOOLS      VISUAL ANALYTICS
...SUMMARY: A Polyglot Approach Works Best
InfiniteGraph
     THE BIG DATA CONNECTION PLATFORM
SPARE SLIDES
InfiniteGraph - The Enterprise Graph Database

• A high performance distributed database engine that supports analyst-time decision
    support and actionable intelligence
• Cost effective link analysis – flexible deployment on commodity resources (hardware
    and OS).
•   Efficient, scalable, risk averse technology – enterprise proven.
•   High Speed parallel ingest to load graph data quickly.
•   Parallel, distributed queries
•   Flexible plugin architecture
•   Complementary technology
•   Fast proof of concept – easy to use Graph API.
Objectivity/DB
 A distributed, object database built for handling data with many complex relationships.

• Reliable - Deployed in process control, telecom and medical equipment, Big Science,
  complex financial, defense and Intelligence Community applications.

• Provably scalable - used to build the World’s first Petabyte+ database at Stanford
  Linear Accelerator in the year 2000.

• Advanced query capabilities - Parallel Query Engine
• Interoperable - across languages and platforms
  –
      C++, C#, Java, Python and SQL++
  –
      Linux, Mac OS X and Windows (32 and 64-bit)
The Big Data Connection Platform

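[Diagram: vendor logos arranged in five layers; acquisition notes are listed per layer]
• Data Visualization & Analytics (notes: *Now HP, *Now IBM)
• Big Data Connection Platform
• Processing Platform (notes: *Now SAP, *Now Teradata, *Now HP, *Now EMC, *Now IBM, *Now IBM)
• Connectors / Integration
• Servers / File Storage (note: *Now Oracle)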
Thank You!

 Please take a look at objectivity.com
For Online Demos, White Papers, Free Downloads,
              Samples & Tutorials


     You Can Also See Us At NoSQL Now!
         In San Jose, CA on August 22

Weitere ähnliche Inhalte

Was ist angesagt?

A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...Qian Lin
 
Expert summit SQL Server 2016
Expert summit   SQL Server 2016Expert summit   SQL Server 2016
Expert summit SQL Server 2016Łukasz Grala
 
Temporal Tables, Transparent Archiving in DB2 for z/OS and IDAA
Temporal Tables, Transparent Archiving in DB2 for z/OS and IDAATemporal Tables, Transparent Archiving in DB2 for z/OS and IDAA
Temporal Tables, Transparent Archiving in DB2 for z/OS and IDAACuneyt Goksu
 
Migrating from Oracle to Postgres
Migrating from Oracle to PostgresMigrating from Oracle to Postgres
Migrating from Oracle to PostgresEDB
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databasesJames Serra
 
Big Data, Simple and Fast: Addressing the Shortcomings of Hadoop
Big Data, Simple and Fast: Addressing the Shortcomings of HadoopBig Data, Simple and Fast: Addressing the Shortcomings of Hadoop
Big Data, Simple and Fast: Addressing the Shortcomings of HadoopHazelcast
 
EDBT 2013 - Near Realtime Analytics with IBM DB2 Analytics Accelerator
EDBT 2013 - Near Realtime Analytics with IBM DB2 Analytics AcceleratorEDBT 2013 - Near Realtime Analytics with IBM DB2 Analytics Accelerator
EDBT 2013 - Near Realtime Analytics with IBM DB2 Analytics AcceleratorDaniel Martin
 
Database Cloud Services Office Hours : Oracle sharding hyperscale globally d...
Database Cloud Services Office Hours : Oracle sharding  hyperscale globally d...Database Cloud Services Office Hours : Oracle sharding  hyperscale globally d...
Database Cloud Services Office Hours : Oracle sharding hyperscale globally d...Tammy Bednar
 
Reducing the Risks of Migrating Off Oracle
Reducing the Risks of Migrating Off OracleReducing the Risks of Migrating Off Oracle
Reducing the Risks of Migrating Off OracleEDB
 
Which Postgres is Right for You? - Part 2
Which Postgres is Right for You? - Part 2Which Postgres is Right for You? - Part 2
Which Postgres is Right for You? - Part 2EDB
 
Things Every Oracle DBA Needs to Know About the Hadoop Ecosystem 20170527
Things Every Oracle DBA Needs to Know About the Hadoop Ecosystem 20170527Things Every Oracle DBA Needs to Know About the Hadoop Ecosystem 20170527
Things Every Oracle DBA Needs to Know About the Hadoop Ecosystem 20170527Zohar Elkayam
 
Avoiding.the.pitfallsof.oracle.migration.2013
Avoiding.the.pitfallsof.oracle.migration.2013Avoiding.the.pitfallsof.oracle.migration.2013
Avoiding.the.pitfallsof.oracle.migration.2013EDB
 
Active/Active Database Solutions with Log Based Replication in xDB 6.0
Active/Active Database Solutions with Log Based Replication in xDB 6.0Active/Active Database Solutions with Log Based Replication in xDB 6.0
Active/Active Database Solutions with Log Based Replication in xDB 6.0EDB
 
An Expert Guide to Migrating Legacy Databases to PostgreSQL
An Expert Guide to Migrating Legacy Databases to PostgreSQLAn Expert Guide to Migrating Legacy Databases to PostgreSQL
An Expert Guide to Migrating Legacy Databases to PostgreSQLEDB
 
Overview of EnterpriseDB Postgres Plus Advanced Server 9.4 and Postgres Enter...
Overview of EnterpriseDB Postgres Plus Advanced Server 9.4 and Postgres Enter...Overview of EnterpriseDB Postgres Plus Advanced Server 9.4 and Postgres Enter...
Overview of EnterpriseDB Postgres Plus Advanced Server 9.4 and Postgres Enter...EDB
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache KuduAndriy Zabavskyy
 
Minimize Headaches with Your Postgres Deployment
Minimize Headaches with Your Postgres DeploymentMinimize Headaches with Your Postgres Deployment
Minimize Headaches with Your Postgres DeploymentEDB
 
Accelerating Business Intelligence Solutions with Microsoft Azure pass
Accelerating Business Intelligence Solutions with Microsoft Azure   passAccelerating Business Intelligence Solutions with Microsoft Azure   pass
Accelerating Business Intelligence Solutions with Microsoft Azure passJason Strate
 
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911Ines Sombra
 
AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021
AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021
AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021Sandesh Rao
 

Was ist angesagt? (20)

A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
 
Expert summit SQL Server 2016
Expert summit   SQL Server 2016Expert summit   SQL Server 2016
Expert summit SQL Server 2016
 
Temporal Tables, Transparent Archiving in DB2 for z/OS and IDAA
Temporal Tables, Transparent Archiving in DB2 for z/OS and IDAATemporal Tables, Transparent Archiving in DB2 for z/OS and IDAA
Temporal Tables, Transparent Archiving in DB2 for z/OS and IDAA
 
Migrating from Oracle to Postgres
Migrating from Oracle to PostgresMigrating from Oracle to Postgres
Migrating from Oracle to Postgres
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databases
 
Big Data, Simple and Fast: Addressing the Shortcomings of Hadoop
Big Data, Simple and Fast: Addressing the Shortcomings of HadoopBig Data, Simple and Fast: Addressing the Shortcomings of Hadoop
Big Data, Simple and Fast: Addressing the Shortcomings of Hadoop
 
EDBT 2013 - Near Realtime Analytics with IBM DB2 Analytics Accelerator
EDBT 2013 - Near Realtime Analytics with IBM DB2 Analytics AcceleratorEDBT 2013 - Near Realtime Analytics with IBM DB2 Analytics Accelerator
EDBT 2013 - Near Realtime Analytics with IBM DB2 Analytics Accelerator
 
Database Cloud Services Office Hours : Oracle sharding hyperscale globally d...
Database Cloud Services Office Hours : Oracle sharding  hyperscale globally d...Database Cloud Services Office Hours : Oracle sharding  hyperscale globally d...
Database Cloud Services Office Hours : Oracle sharding hyperscale globally d...
 
Reducing the Risks of Migrating Off Oracle
Reducing the Risks of Migrating Off OracleReducing the Risks of Migrating Off Oracle
Reducing the Risks of Migrating Off Oracle
 
Which Postgres is Right for You? - Part 2
Which Postgres is Right for You? - Part 2Which Postgres is Right for You? - Part 2
Which Postgres is Right for You? - Part 2
 
Things Every Oracle DBA Needs to Know About the Hadoop Ecosystem 20170527
Things Every Oracle DBA Needs to Know About the Hadoop Ecosystem 20170527Things Every Oracle DBA Needs to Know About the Hadoop Ecosystem 20170527
Things Every Oracle DBA Needs to Know About the Hadoop Ecosystem 20170527
 
Avoiding.the.pitfallsof.oracle.migration.2013
Avoiding.the.pitfallsof.oracle.migration.2013Avoiding.the.pitfallsof.oracle.migration.2013
Avoiding.the.pitfallsof.oracle.migration.2013
 
Active/Active Database Solutions with Log Based Replication in xDB 6.0
Active/Active Database Solutions with Log Based Replication in xDB 6.0Active/Active Database Solutions with Log Based Replication in xDB 6.0
Active/Active Database Solutions with Log Based Replication in xDB 6.0
 
An Expert Guide to Migrating Legacy Databases to PostgreSQL
An Expert Guide to Migrating Legacy Databases to PostgreSQLAn Expert Guide to Migrating Legacy Databases to PostgreSQL
An Expert Guide to Migrating Legacy Databases to PostgreSQL
 
Overview of EnterpriseDB Postgres Plus Advanced Server 9.4 and Postgres Enter...
Overview of EnterpriseDB Postgres Plus Advanced Server 9.4 and Postgres Enter...Overview of EnterpriseDB Postgres Plus Advanced Server 9.4 and Postgres Enter...
Overview of EnterpriseDB Postgres Plus Advanced Server 9.4 and Postgres Enter...
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache Kudu
 
Minimize Headaches with Your Postgres Deployment
Minimize Headaches with Your Postgres DeploymentMinimize Headaches with Your Postgres Deployment
Minimize Headaches with Your Postgres Deployment
 
Accelerating Business Intelligence Solutions with Microsoft Azure pass
Accelerating Business Intelligence Solutions with Microsoft Azure   passAccelerating Business Intelligence Solutions with Microsoft Azure   pass
Accelerating Business Intelligence Solutions with Microsoft Azure pass
 
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
 
AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021
AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021
AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021
 

Ähnlich wie Choosing the Right Big Data Tools for the Job - A Polyglot Approach

Oracle NoSQL DB & InfiniteGraph - Trends in Big Data and Graph Technology
Oracle NoSQL DB & InfiniteGraph - Trends in Big Data and Graph TechnologyOracle NoSQL DB & InfiniteGraph - Trends in Big Data and Graph Technology
Oracle NoSQL DB & InfiniteGraph - Trends in Big Data and Graph TechnologyInfiniteGraph
 
Silicon valley nosql meetup april 2012
Silicon valley nosql meetup  april 2012Silicon valley nosql meetup  april 2012
Silicon valley nosql meetup april 2012InfiniteGraph
 
SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureVenu Anuganti
 
introduction to NOSQL Database
introduction to NOSQL Databaseintroduction to NOSQL Database
introduction to NOSQL Databasenehabsairam
 
Big Data technology Landscape
Big Data technology LandscapeBig Data technology Landscape
Big Data technology LandscapeShivanandaVSeeri
 
Graph Database and Neo4j
Graph Database and Neo4jGraph Database and Neo4j
Graph Database and Neo4jSina Khorami
 
No Sql Movement
No Sql MovementNo Sql Movement
No Sql MovementAjit Koti
 
Oracle Week 2016 - Modern Data Architecture
Oracle Week 2016 - Modern Data ArchitectureOracle Week 2016 - Modern Data Architecture
Oracle Week 2016 - Modern Data ArchitectureArthur Gimpel
 
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Denodo
 
Gilbane Boston 2012 Big Data 101
Gilbane Boston 2012 Big Data 101Gilbane Boston 2012 Big Data 101
Gilbane Boston 2012 Big Data 101Peter O'Kelly
 
No SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageNo SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageBethmi Gunasekara
 
Gilbane Boston 2011 big data
Gilbane Boston 2011 big dataGilbane Boston 2011 big data
Gilbane Boston 2011 big dataPeter O'Kelly
 
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 How to use Big Data and Data Lake concept in business using Hadoop and Spark... How to use Big Data and Data Lake concept in business using Hadoop and Spark...
How to use Big Data and Data Lake concept in business using Hadoop and Spark...Institute of Contemporary Sciences
 
Big Data Warehousing Meetup with Riak
Big Data Warehousing Meetup with RiakBig Data Warehousing Meetup with Riak
Big Data Warehousing Meetup with RiakCaserta
 
Big Data with Not Only SQL
Big Data with Not Only SQLBig Data with Not Only SQL
Big Data with Not Only SQLPhilippe Julio
 
Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which DataWorks Summit
 
Demystifying data engineering
Demystifying data engineeringDemystifying data engineering
Demystifying data engineeringThang Bui (Bob)
 
Evolution of Distributed Database Technologies in the Digital era
Evolution of Distributed Database Technologies in the Digital eraEvolution of Distributed Database Technologies in the Digital era
Evolution of Distributed Database Technologies in the Digital eraVishal Puri
 

Ähnlich wie Choosing the Right Big Data Tools for the Job - A Polyglot Approach (20)

Oracle NoSQL DB & InfiniteGraph - Trends in Big Data and Graph Technology
Oracle NoSQL DB & InfiniteGraph - Trends in Big Data and Graph TechnologyOracle NoSQL DB & InfiniteGraph - Trends in Big Data and Graph Technology
Oracle NoSQL DB & InfiniteGraph - Trends in Big Data and Graph Technology
 
Silicon valley nosql meetup april 2012
Silicon valley nosql meetup  april 2012Silicon valley nosql meetup  april 2012
Silicon valley nosql meetup april 2012
 
SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data Architecture
 
introduction to NOSQL Database
introduction to NOSQL Databaseintroduction to NOSQL Database
introduction to NOSQL Database
 
Big Data technology Landscape
Big Data technology LandscapeBig Data technology Landscape
Big Data technology Landscape
 
Graph Database and Neo4j
Graph Database and Neo4jGraph Database and Neo4j
Graph Database and Neo4j
 
No Sql Movement
No Sql MovementNo Sql Movement
No Sql Movement
 
Oracle Week 2016 - Modern Data Architecture
Oracle Week 2016 - Modern Data ArchitectureOracle Week 2016 - Modern Data Architecture
Oracle Week 2016 - Modern Data Architecture
 
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
 
Gilbane Boston 2012 Big Data 101
Gilbane Boston 2012 Big Data 101Gilbane Boston 2012 Big Data 101
Gilbane Boston 2012 Big Data 101
 
No SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageNo SQL- The Future Of Data Storage
No SQL- The Future Of Data Storage
 
Gilbane Boston 2011 big data
Gilbane Boston 2011 big dataGilbane Boston 2011 big data
Gilbane Boston 2011 big data
 
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 How to use Big Data and Data Lake concept in business using Hadoop and Spark... How to use Big Data and Data Lake concept in business using Hadoop and Spark...
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 
NoSql Brownbag
NoSql BrownbagNoSql Brownbag
NoSql Brownbag
 
Big Data Warehousing Meetup with Riak
Big Data Warehousing Meetup with RiakBig Data Warehousing Meetup with Riak
Big Data Warehousing Meetup with Riak
 
Big Data with Not Only SQL
Big Data with Not Only SQLBig Data with Not Only SQL
Big Data with Not Only SQL
 
Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which
 
UNIT-2.pptx
UNIT-2.pptxUNIT-2.pptx
UNIT-2.pptx
 
Demystifying data engineering
Demystifying data engineeringDemystifying data engineering
Demystifying data engineering
 
Evolution of Distributed Database Technologies in the Digital era
Evolution of Distributed Database Technologies in the Digital eraEvolution of Distributed Database Technologies in the Digital era
Evolution of Distributed Database Technologies in the Digital era
 

Choosing the Right Big Data Tools for the Job - A Polyglot Approach

  • 1. www.Objectivity.com Choosing The Right Big Data Tools For The Job – A Polyglot Approach A Webinar Presented by Leon Guzenda on August 9, 2012
  • 2. Overview The Problem • Current Big Data Analytics • Relationship Analytics • Leveraging Alternative Technologies – NoSQL • The Polyglot Approach
  • 3. About Objectivity Inc. Company • Objectivity, Inc. is headquartered in Sunnyvale, CA. • Established in 1988 to tackle database problems that network/hierarchical/relational and file-based technologies struggle with. • Objectivity has over two decades of Big Data and NoSQL experience Products • Develops NoSQL platforms for managing and discovering relationships and patterns in complex data: • Objectivity/DB - an object database that manages localized, centralized or distributed databases • InfiniteGraph - a massively scalable graph database built on Objectivity/DB that enables organizations to find, store and exploit the relationships in their data Markets • The Big Data market is projected to be around $12B in 2012, with a CAGR of 28% over the next five years. • 40% per year data growth, cloud adoption, mobile usage and improved real-time analytics underpin Objectivity’s growth opportunities as a Big Data analytics enabler. Customers • Embedded in hundreds of enterprises, government organizations and products - millions of deployments. Financials • Consistently generates increased revenues. • Privately held by the employees and a few venture capital companies. Copyright © Objectivity, Inc. 2012
  • 4. The Problem Information Overload! Making sense of it all takes time and $$$ Current “Big Data” Analytics
  • 5. A Typical “Big Data” Analytics Setup Data Aggregation and Analytics Applications • Commodity Linux Platforms and/or High Performance Computing Clusters • Repositories: Column Store, Data W/H, Graph DB, Object DB, K-V Store, RDBMS, Hadoop, Doc DB • Data: Structured, Semi-Structured, Unstructured
  • 7. Not Only SQL – a group of 4 primary technologies • Users choose between four different primary technologies for different purposes: – Key-Value Stores – “Big Table” Clones – Document Databases – Object and Graph databases (including InfiniteGraph) • Many implementations sacrifice consistency (ACID transactions, CAP – eventual consistency) for performance. • Technologies such as Objectivity/DB and InfiniteGraph offer ACID transactions, with consistency and performance.
  • 9. Key-Value Stores “Dynamo: Amazon’s Highly Available Key-Value Store” [2007] • Data model: – Global key-value mapping – Scalable (sharded) HashMap – Highly fault tolerant (typically) • Examples: – Riak, Redis and Voldemort
  • 10. Key-Value Stores: Pros & Cons • Strengths: – Simple data model – Great at scaling out horizontally – Scalable – Available • Weaknesses: – Simplistic data model – Poor for complex data – Unsuited for interconnected data
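The "scalable (sharded) HashMap" model and its weakness for interconnected data can be illustrated with a minimal Python sketch. It is not the API of Riak, Redis or Voldemort; the class `ShardedKeyValueStore`, the shard count and the sample keys are assumptions made up for illustration.

```python
import hashlib


class ShardedKeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self, shard_count=4):
        # Each shard is a plain dict; a real store would place shards on separate nodes.
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # Use a stable hash so the same key always lands on the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return self.shards[int.from_bytes(digest[:8], "big") % len(self.shards)]

    def put(self, key, value):
        self._shard_for(key)[key] = value

    def get(self, key, default=None):
        return self._shard_for(key).get(key, default)


store = ShardedKeyValueStore()
store.put("user:7", {"name": "Grace", "friends": []})
store.put("user:42", {"name": "Ada", "friends": ["user:7"]})

# The store only knows keys and opaque values, so following a "friends" link
# is a second lookup that the application has to issue itself.
ada = store.get("user:42")
first_friend = store.get(ada["friends"][0])
```

Lookups by key stay simple and spread evenly across shards, but every hop between related records costs another round trip, which is why the slide lists interconnected data as a weakness.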
  • 11. Big Table Clones – Column Family • Google’s “Bigtable: A Distributed Storage System for Structured Data” [2006] • Column-family stores are essentially Bigtable clones. • Data Model: – A big table, with column families (Key → Column Name, Value, D/Time). – Map-reduce for parallel query/processing. • Examples: – HBase, Hypertable and Cassandra.
  • 12. Big Table Clones – Pros & Cons • Strengths: – Data model supports semi-structured data – Naturally indexed (columns) – Good at scaling out horizontally • Weaknesses: – Complex data model – Unsuited for highly interconnected data
  • 13. Document Databases • Data Model: – A collection of unstructured or semi-structured documents. – Each document is referenced using a key-value pair. – The “value” can range from unstructured text to a collection of key-value pairs or a group of XML objects. – Index-centric to support queries based on content. • Examples: – CouchDB and MongoDB.
  • 14. Document Databases – Pros & Cons • Strengths: – Simple, powerful data model – Good scalability if sharding is supported • Weaknesses: – Unsuited for interconnected data – Query model is limited to keys and indexes – Generally uses Map-Reduce (designed for batch operations) for larger queries
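The "index-centric" query model can be sketched in a few lines of Python. This is a toy, not the CouchDB or MongoDB API; `DocumentCollection`, the field names and the sample documents are hypothetical, and the sketch does not handle updates to already indexed documents.

```python
from collections import defaultdict


class DocumentCollection:
    """Toy document store with secondary indexes (hypothetical, for illustration)."""

    def __init__(self, indexed_fields):
        self.documents = {}
        # One index per chosen field: field value -> set of document keys.
        self.indexes = {name: defaultdict(set) for name in indexed_fields}

    def insert(self, key, document):
        self.documents[key] = document
        for name, index in self.indexes.items():
            if name in document:
                index[document[name]].add(key)

    def find(self, field_name, value):
        if field_name in self.indexes:
            # Indexed field: a direct lookup.
            keys = self.indexes[field_name].get(value, set())
            return [self.documents[k] for k in sorted(keys)]
        # Unindexed field: fall back to scanning every document.
        return [d for d in self.documents.values() if d.get(field_name) == value]


articles = DocumentCollection(indexed_fields=["author"])
articles.insert("a1", {"author": "Leon", "title": "Polyglot persistence", "year": 2012})
articles.insert("a2", {"author": "Leon", "title": "Graph analytics", "year": 2012})
articles.insert("a3", {"author": "Maria", "title": "Key-value stores", "year": 2011})

by_author = articles.find("author", "Leon")   # served from the index
by_year = articles.find("year", 2011)         # full scan, no index on "year"
```

Queries on indexed fields are cheap, while anything else degrades to a scan over all documents. At scale that scan is the kind of work the slide says is typically handed to batch-oriented Map-Reduce jobs.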
  • 15. Object Databases • Data Model [ODMG'93]: – Objects have a Class (type) and a group of Values – Each Object instance has a unique Object Identifier [OID] – Connections use Object Identifiers for efficiency – Supports class inheritance and polymorphism • Examples: – Objectivity/DB and db4objects
  • 16. Object Databases – Pros & Cons • Strengths: – Simple, powerful data model that includes inheritance and polymorphism – Every object has a class (type) and a unique Object Identifier – Good scalability if sharding is supported – Uses Object Identifiers instead of JOIN tables to support very fast navigational operations • Weaknesses: – The query language never became a standard – Supports standard object oriented languages but isn't supported by a wide range of third party tools in the way that SQL is.
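The idea of navigating by Object Identifier instead of through a JOIN table can be sketched in plain Python. This is not the Objectivity/DB or db4objects API; the `Company` and `Person` classes, the `objects_by_oid` registry and the helper functions are hypothetical stand-ins for what an object database manages internally.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional

_next_oid = count(1)       # hypothetical OID generator
objects_by_oid = {}        # stands in for the database's OID -> object lookup


@dataclass
class Company:
    name: str
    oid: int = field(default_factory=lambda: next(_next_oid))


@dataclass
class Person:
    name: str
    employer_oid: Optional[int] = None   # the connection is stored as an OID
    oid: int = field(default_factory=lambda: next(_next_oid))


def persist(obj):
    """Register an object under its OID and return the OID."""
    objects_by_oid[obj.oid] = obj
    return obj.oid


def employer_of(person):
    # One direct lookup by OID; no join table is consulted.
    return objects_by_oid[person.employer_oid]


acme = Company("Acme")
persist(acme)
alice = Person("Alice", employer_oid=acme.oid)
persist(alice)

employer = employer_of(alice)
```

Each hop along a connection is a single lookup by identifier, so the cost of following a chain of relationships grows with the length of the chain rather than with the size of the tables involved.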
  • 17. Graph Databases • Data model: – Node (Vertex) and Relationship (Edge) objects – Directed – May be a hypergraph (edges with multiple endpoints) • Examples: – InfiniteGraph, Neo4j, OrientDB, AllegroGraph, TitanDB and Dex
  • 18. Graph Databases – Pros & Cons • Strengths: – Extremely fast for connected data – Scales out, typically – Easy to query (navigation) – Simple data model • Weaknesses: – May not support distribution or sharding – Requires conceptual shift... a different way of thinking
  • 19. Competing “Big Data” Analytics Solutions
  • 20. Typical “Big Data” Analytics Phases Analytics and Front-End Processing Repository Visualization Tools The strategic competitors are all moving in the same direction
  • 21. Incremental Improvements Aren’t Enough All current solutions use the same basic architectural model • None of the current solutions have a way to store connections between entities in different silos • Most analytic technology focuses on the content of the data nodes, rather than the many kinds of connections between the nodes and the data in those connections • Why? Because relational and most NoSQL solutions are bad at handling relationships. • Object and Graph databases can efficiently store, manage and query the many kinds of relationships hidden in the data.
  • 23. Example 1 - Market Analysis The 10 companies that control a majority of U.S. consumer goods brands
  • 24. Example 2 - Demographics Used in social network analysis, marketing, medical research etc.
  • 25. Example 3 - Seed To Consumer Tracking ?
  • 26. Example 4 - Ad Placement Networks Smartphone Ad placement - based on the user’s profile and location data captured by opt-in applications. • The location data can be stored and distilled in a key-value and column store hybrid database, such as Cassandra • The locations are matched with geospatial data to deduce user interests. • As Ad placement orders arrive, an application built on a graph database such as InfiniteGraph matches groups of users with Ads: • Maximizes relevance for the user. • Yields maximum value for the advertiser and the placer.
  • 27. Example 5 - Healthcare Informatics Problem: Physicians need better electronic records for managing patient data on a global basis and for matching symptoms, causes, treatments and interdependencies to improve diagnoses and outcomes. • Solution: Create a database capable of leveraging existing architecture using NoSQL tools such as Objectivity/DB and InfiniteGraph that can handle data capture, symptoms, diagnoses, treatments, reactions to medications, interactions and progress. • Result: It works: • Diagnosis is faster and more accurate • The knowledge base tracks similar medical cases. • Treatment success rates have improved.
  • 28. Relationship (Connection) Analytics... Relational Database Think about the SQL query for finding all links between the two “blue” rows... Good luck! Table_A Table_B Table_C Table_D Table_E Table_F Table_G Relational databases aren’t good at handling complex relationships!
  • 29. Relationship (Connection) Analytics... Relational Database Think about the SQL query for finding all links between the two “blue” rows... Good luck! Table_A Table_B Table_C Table_D Table_E Table_F Table_G Objectivity/DB or InfiniteGraph - The solution can be found with a few lines of code A3 G4
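To give a feel for why a navigational model needs only a few lines here, the sketch below finds a chain of links between two records with a breadth-first search in plain Python. It does not use the InfiniteGraph or Objectivity/DB API; the node names `A3` and `G4` come from the slide, while the intermediate nodes and edges are invented sample data.

```python
from collections import deque

# Hypothetical links between rows of the tables on the slide (A3 ... G4).
edges = [("A3", "B1"), ("B1", "C2"), ("C2", "G4"), ("A3", "D5"), ("D5", "E1")]


def build_adjacency(edge_list):
    """Turn an undirected edge list into a node -> neighbours mapping."""
    adjacency = {}
    for left, right in edge_list:
        adjacency.setdefault(left, set()).add(right)
        adjacency.setdefault(right, set()).add(left)
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns the first shortest path found, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in sorted(adjacency.get(node, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


adjacency = build_adjacency(edges)
path = shortest_path(adjacency, "A3", "G4")
```

With the sample edges, `path` runs from `A3` through `B1` and `C2` to `G4`. The equivalent SQL would need a self-join per hop or a recursive query, because the number of hops is not known in advance.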
  • 32. Lesson 1 – The Repository Matters A Lot
        NEED                          RDBMS  Key-Value  Column Family  Document Database  ODBMS  Graph Database
        OLTP                          YES    No         Maybe          No                 Maybe  No
        Text Handling                 No     No         No             YES                Maybe  No
        Multimedia                    No     Maybe      No             Maybe              YES    Maybe
        Engineering/Scientific        No     No         No             No                 YES    Maybe
        Business Intelligence         YES    No         Maybe          No                 Maybe  Maybe
        Log Processing                Maybe  No         Maybe          No                 YES    Maybe
        Connection Handling/Analysis  No     No         No             No                 Maybe  YES
  • 33. Lesson 2 – Languages and Tools Matter Too
        NEED                          Repository         Language            BI Tools  Visual Analytics
        OLTP                          RDBMS              SQL, Java           YES       Maybe
        Text                          Document Database  Java, XML           No        Maybe
        Multimedia                    ODBMS              Java, C++           No        Maybe
        Eng/Science                   ODBMS              C, C++, R, Fortran  Maybe     YES
        Business Intelligence         RDBMS              Java, SQL, R        YES       YES
        Log Processing                NoSQL, ODBMS       C++, R, Java, SQL   Maybe     YES
        Connection Handling/Analysis  Graph Database     Java, C++, SPARQL   Maybe     YES
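The Lesson 1 matrix can also be held as data and queried, as in this small Python sketch. The ratings are transcribed from the slide; the names `REPOSITORIES`, `MATRIX` and `repositories_for` are hypothetical and not part of any product.

```python
REPOSITORIES = ("RDBMS", "Key-Value", "Column Family",
                "Document Database", "ODBMS", "Graph Database")

# One row per need, with ratings in the same order as REPOSITORIES.
MATRIX = {
    "OLTP":                         ("YES", "No", "Maybe", "No", "Maybe", "No"),
    "Text Handling":                ("No", "No", "No", "YES", "Maybe", "No"),
    "Multimedia":                   ("No", "Maybe", "No", "Maybe", "YES", "Maybe"),
    "Engineering/Scientific":       ("No", "No", "No", "No", "YES", "Maybe"),
    "Business Intelligence":        ("YES", "No", "Maybe", "No", "Maybe", "Maybe"),
    "Log Processing":               ("Maybe", "No", "Maybe", "No", "YES", "Maybe"),
    "Connection Handling/Analysis": ("No", "No", "No", "No", "Maybe", "YES"),
}


def repositories_for(need, accept=("YES",)):
    """Return the repository types whose rating for `need` is in `accept`."""
    ratings = MATRIX[need]
    return [repo for repo, rating in zip(REPOSITORIES, ratings) if rating in accept]


best_for_connections = repositories_for("Connection Handling/Analysis")
candidates_for_logs = repositories_for("Log Processing", accept=("YES", "Maybe"))
```

A polyglot design would run this kind of lookup once per workload, which usually yields a different repository for each need rather than a single winner.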
  • 34. SUMMARY: A Polyglot Approach Works Best... LANGUAGE REPOSITORY PROBLEM ANALYTICS BI TOOLS GRAPH TOOLS VISUAL ANALYTICS
  • 35. ...SUMMARY: A Polyglot Approach Works Best
  • 36. InfiniteGraph THE BIG DATA CONNECTION PLATFORM
  • 38. InfiniteGraph - The Enterprise Graph Database • A high performance distributed database engine that supports analyst-time decision support and actionable intelligence • Cost effective link analysis – flexible deployment on commodity resources (hardware and OS). • Efficient, scalable, risk averse technology – enterprise proven. • High Speed parallel ingest to load graph data quickly. • Parallel, distributed queries • Flexible plugin architecture • Complementary technology • Fast proof of concept – easy to use Graph API.
  • 39. Objectivity/DB A distributed, object database built for handling data with many complex relationships. • Reliable - Deployed in process control, telecom and medical equipment, Big Science, complex financial, defense and Intelligence Community applications. • Provably scalable - used to build the World’s first Petabyte+ database at Stanford Linear Accelerator in the year 2000. • Advanced query capabilities - Parallel Query Engine • Interoperable - across languages and platforms – C++, C#, Java, Python and SQL++ – Linux, Mac OS X and Windows (32 and 64-bit)
  • 40. The Big Data Connection Platform Data Visualization & Analytics *Now HP *Now IBM Big Data Connection Platform Processing Platform *Now EMC *Now IBM *Now IBM *Now Teradata *Now HP *Now SAP Connectors / Integration Servers / File Storage *Now Oracle
  • 41. The Big Data Connection Platform Data Visualization & Analytics *Now HP *Now IBM Big Data Connection Platform Processing Platform *Now EMC *Now IBM *Now IBM *Now Teradata *Now HP *Now SAP Connectors / Integration Servers / File Storage *Now Oracle
  • 42. Thank You! Please take a look at objectivity.com For Online Demos, White Papers, Free Downloads, Samples & Tutorials You Can Also See Us At NoSQL Now! In San Jose, CA on August 22

Editor's Notes

  1. Thinking we should be less about Objy in the last bullet… possibly Object oriented and graph databases… ?
  2. Note Object Oriented Databases as NOSQL here.
  3. By adopting a polyglot approach, one can keep using existing SQL-based architecture and databases while still gaining the competitive advantage that the latest NoSQL technologies provide. One example of this polyglot approach is shown here. The technologies used would depend on the use case.
  4. By adopting a polyglot approach, one can keep using existing SQL-based architecture and databases while still gaining the competitive advantage that the latest NoSQL technologies provide. One example of this polyglot approach is shown here. The technologies used would depend on the use case.
  5. By adopting a polyglot approach, one can keep using existing SQL-based architecture and databases while still gaining the competitive advantage that the latest NoSQL technologies provide. One example of this polyglot approach is shown here. The technologies used would depend on the use case.
  6. By adopting a polyglot approach, one can keep using existing SQL-based architecture and databases while still gaining the competitive advantage that the latest NoSQL technologies provide. One example of this polyglot approach is shown here. The technologies used would depend on the use case.
  7. This section seems out of place.
  8. With a scalable, distributed platform that can manage connections between all types of disparate data, enterprises can easily capitalize on the best tools for the job at hand.
  9. With a scalable, distributed platform that can manage connections between all types of disparate data, enterprises can easily capitalize on the best tools for the job at hand.