Tuesday, May 22, 12
Eric.kavanagh@bloorgroup.com




    Twitter Tag: #briefr
Reveal the essential characteristics of enterprise
                 software, good and bad

                 Provide a forum for detailed analysis of today’s
                 innovative technologies

                 Give vendors a chance to explain their product to
                 savvy analysts

                 Allow audience members to pose serious questions...
                 and get answers!



May: Analytics

                      June: Intelligence

                      July: Governance

                      August: Analytics

                      September: Integration

                      October: Database



Ultimately analytics is about businesses making optimal
                      decisions, although the range of technologies that inhabit
                      this area is wide: statistical analysis, data mining, process
                      mining, predictive analytics, predictive modeling, business
                      process modeling and complex event processing.

                      With the advent of big data, analytics has become “big
                      analytics”, with organizations diving into large heaps of data
                      that previously were not available or usable.

                      A major challenge with this trend is providing adequate
                      performance for all BI and analytics workloads on the data
                      volumes now being assembled, which are continuously growing.


Robin Bloor is Chief Analyst at The Bloor Group.



                            Robin.Bloor@Bloorgroup.com




SAP Sybase has a history of database innovation and
                      application from the corporate RDBMS through to the
                      mobile and embedded market.
                      Sybase IQ has been deployed in many application areas
                      and is used in many complex predictive analytics
                      deployments, where speed, data capacity, and
                      versatility are critical.
                      Recently it has been upgraded to work symbiotically
                      with Hadoop, providing a comprehensive capability as a
                      BI and analytics engine for Big Data applications.


David Jonker works in the area of Data Management &
                                       Analytics for SAP and is Product Marketing Director for Sybase
                                       IQ. In the last 5 years David has led product marketing teams
                                       for Sybase’s Data Management & Analytics product lines,
                                       including Sybase IQ, Sybase ASE, SQL Anywhere, and Advantage
                                       Database Server. His career includes over 10 years in software
                                       engineering and product management. Before joining Sybase,
                                       David had consulting, product management and software
                                       development roles.


                 Courtney Claussen is a product manager at Sybase, Inc.,
              focusing on Sybase's data warehousing and analytics products.
               She has enjoyed a 30 year career in software development,
                 technical support and product marketing in the areas of
              computer aided design, computer aided software engineering,
               database management systems, middleware, and analytics.




Sybase IQ 15.4 Overview —
      Big data analytics & Hadoop




Sybase IQ
    Widespread success


      • Manage and analyze statistical measures for the entire nation of Canada
      • Analyze ALL Federal tax returns in the US
      • Analyze complex models in more than 200 financial institutions worldwide
      • Stands out as the leading enterprise data warehouse among the largest banks,
        insurance agencies, and telecom operators worldwide
      • Store and analyze massive amounts of industry segment data in 30 of the largest
        information providers in the world, including TransUnion, Nielsen, and Acxiom




    © 2012 SAP AG. All rights reserved.

BIG DATA ANALYTICS ISSUES
     Dealing with volume, variety, velocity, costs, skills


     • Volume: managing and harnessing terabytes of data
     • Variety: harmonizing silos of structured and unstructured data
     • Velocity: keeping up with unpredictable data and query flows
     • Costs: too expensive to acquire, operate, and expand
     • Skills: lack of adequate skills for non-standard platforms and APIs


Sybase IQ 15
    A powerful big data analytics platform in the making

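     Releases on the way to big data analytics:
     v15.0 (2009), v15.1 (2009), v15.2 (2010), v15.3 (2011), v15.4 (2011)

     Capabilities added along the way, by the issue each addresses:
     • Volume: VLDB Platform Foundation
     • Velocity: In-Database Analytics API
     • Variety: Text Search, Web 2.0 API
     • Costs: PlexQ™ MPP Foundation
     • Skills: MapReduce API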

Sybase IQ 15.4
    A comprehensive platform for big data analytics




     [Architecture diagram; labels recovered from the slide]
     • Eco-System: Sybase Control Center, Sybase PowerDesigner, certified ISV tools
     • App Services: Web 2.0, Java, C/C++, SQL, Federation
     • DBMS
     • Ingest + Persist
     • Unstructured Data (Hadoop, Content Mgmt)
     • Structured Data (DBMS)




Details: In-Database
    Analytics & Hadoop




In-database analytics in Sybase IQ


    No compromise for complex analytics
     • Basic to advanced analytical functions are available to SQL directly from the Sybase IQ engine
     • Data never leaves the database until results are materialized
     • Analytics code / models must be shareable yet must allow ad-hoc analysis
     • Analytics code / models must be applicable to the latest data set
     • Standards-based access and concept extensibility are compulsory
     • Performance and scalability are a given
     • The average developer must be able to build in-database analytical models
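
     The "data never leaves the database" idea above can be sketched with Python's built-in sqlite3 module as a stand-in engine. This is not Sybase IQ and not its UDF API; the function name log_return, the prices table, and its rows are illustrative assumptions. The sketch only shows custom logic registered with the engine and applied inside the query, so that just the matching results reach the client.

```python
import math
import sqlite3


def log_return(prev_close, close):
    """Custom scalar logic (hypothetical example): log return between two prices."""
    return math.log(close / prev_close)


# In-memory SQLite database used purely as a stand-in for an analytics engine.
conn = sqlite3.connect(":memory:")

# Register the Python function so SQL can call it by name with 2 arguments.
conn.create_function("log_return", 2, log_return)

conn.execute("CREATE TABLE prices (day INTEGER, prev_close REAL, close REAL)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [(1, 100.0, 110.0), (2, 110.0, 99.0), (3, 99.0, 99.0)],
)

# Both the custom function and the filter run inside the engine;
# only the rows with a negative return are materialized for the client.
rows = conn.execute(
    "SELECT day, log_return(prev_close, close) AS r "
    "FROM prices WHERE log_return(prev_close, close) < 0 ORDER BY day"
).fetchall()

for day, r in rows:
    print(day, round(r, 4))
```

     The alternative would be to fetch every row and compute the returns on the client, which moves all the data instead of only the result.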


     [Figure: Sybase IQ process with built-in functions and external DLL "A" applied in database. Labels: Database = Logic/Filtering]

     Analytics simplified: Logic To Data = Fast + Efficient




In-database analytics in Sybase IQ
    Custom functions APIs


    Several different forms of C++ and JAVA UDF APIs for building custom In-database
    analytics, each valid at different locations within queries

    1. {Scalar} to {Scalar} functions, e.g. sin, cosine, …

    2. {Scalar set} to {Scalar} functions, e.g. max, min, …

    3. {Scalar set} to {Scalar set}, e.g. OLAP windows, …

    4. {Scalar set} to {Tables}, e.g. join result sets, …

    5. {Scalar set, Tables} to {Tables}, e.g. MapReduce, …

    All variants are parallelizable, but (5) is also distributable across the PlexQ™ grid
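
    As a rough illustration of the five input/output shapes listed above, here is a sketch in plain Python. It does not use the Sybase IQ C++ or Java UDF APIs; every function name and sample value is a hypothetical stand-in chosen only to show what goes in and what comes out for each shape.

```python
import math
from collections import Counter


def scalar_fn(x):
    """Shape 1: one scalar in, one scalar out (like sin or cosine)."""
    return math.cos(x)


def aggregate_fn(values):
    """Shape 2: a set of scalars in, one scalar out (like max or min)."""
    return max(values)


def window_fn(values, width=2):
    """Shape 3: a set of scalars in, a set of the same length out.

    Here: trailing sum over the last `width` values, in the spirit of an OLAP window.
    """
    return [sum(values[max(0, i - width + 1): i + 1]) for i in range(len(values))]


def table_fn(values):
    """Shape 4: a set of scalars in, table rows out."""
    return [(v, v * v) for v in values]


def table_param_fn(threshold, rows):
    """Shape 5: scalars plus a table in, a table out.

    Here: count rows per key whose value meets the threshold, a tiny
    map (filter and emit key) plus reduce (count per key) step.
    """
    counts = Counter(key for key, value in rows if value >= threshold)
    return sorted(counts.items())


print(scalar_fn(0.0))
print(aggregate_fn([3, 9, 4]))
print(window_fn([1, 2, 3, 4]))
print(table_fn([2, 3]))
print(table_param_fn(10, [("a", 5), ("a", 12), ("b", 30), ("a", 11)]))
```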



In-database analytics in Sybase IQ
    Java custom functions


     Feature 3: JAVA User Defined Function offers a new in-database analytics API

     Characteristics
     • External algorithms written as JAVA functions, plugged into Sybase IQ
     • JAVA functions via SQL: run in-database, much faster than client side
     • JAVA functions run protected/fault tolerant (in separate process)
     • Supports scalar and table outputs
     • Supports all data types

     Big Data Use Cases
     • Ideal for ISV or custom Data Mining libraries for Healthcare, eCommerce, Public Sector
     • Apps include: ISV partner Zementis built a plug-in for PMML (Predictive Modeling Markup Language) models
         – Validates PMML from SAS, R,..
         – Translates PMML to JAVA UDFs
         – JAVA UDFs called from SQL

     [Figure: PMML to Zementis plug-in to JAVA UDF in Sybase IQ]




SYBASE IQ 15.4 DECONSTRUCTED
        App services — integrating Sybase IQ + Hadoop: at client side

      Feature 6a: Client side federation: join data from Sybase IQ AND Hadoop at a client application level

      Characteristics
      • Client tool capable of querying Sybase IQ and Hadoop
      • Currently certified client tool is Quest Toad for Cloud
      • Better performance when results from sources are pre-computed/pre-aggregated

      Big Data Use Cases
      • Ideal for bringing together Big Data Analytics pre-computations from different domains
      • Example, in telecommunications: Sybase IQ with aggregated customer loyalty data and Hadoop with aggregated network utilization data; Quest Toad for Cloud can bring data from both sources, linking customer loyalty to network utilization or network faults (e.g. dropped calls)

      [Figure: Toad for Cloud Databases connected to Sybase IQ and Hadoop Hive]
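
      A minimal sketch of the client-side join described on this slide, in plain Python. The two row lists are hypothetical stand-ins for pre-aggregated results already fetched from Sybase IQ and from Hadoop/Hive; no real driver, connection, or Toad for Cloud API is used, and the region names and numbers are invented for illustration.

```python
def client_side_join(loyalty_rows, network_rows):
    """Hash join on region, performed entirely in the client application."""
    dropped_by_region = {region: dropped for region, dropped in network_rows}
    return [
        (region, score, dropped_by_region[region])
        for region, score in loyalty_rows
        if region in dropped_by_region
    ]


# Stand-in for an aggregated customer loyalty result from the warehouse.
loyalty = [("north", 0.82), ("south", 0.64), ("west", 0.71)]

# Stand-in for an aggregated dropped-call count from the Hadoop side.
network = [("south", 412), ("north", 35)]

joined = client_side_join(loyalty, network)
for row in joined:
    print(row)
```

      Because the join happens on the client, both inputs must fit comfortably there, which is why the slide recommends pre-computed or pre-aggregated results.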




SYBASE IQ 15.4 DECONSTRUCTED
        App services — integrating Sybase IQ + Hadoop: using ETL

      Feature 6b: Load Hadoop data into Sybase IQ column store: extract, transform, load data from HDFS (Hadoop Distributed File System) into Sybase IQ schemas

      Characteristics
      • Extract & load subsets of HDFS data into Sybase IQ column store
          – Raw data from HDFS
          – Results of Hadoop MR jobs
      • HDFS data stored in Sybase IQ is treated like other Sybase IQ data
          – Gets ACID properties of a DBMS
          – Can be indexed, joined, parallelized
          – Can be queried in an ad-hoc way
      • Visible to BI and other client tools via Sybase IQ ANSI SQL API only
      • Currently, the Apache bulk data transfer utility SQOOP (built by Cloudera) is certified to provide this ETL capability

      Big Data Use Cases
      • Ideal for combining subsets of HDFS unstructured data or summary of HDFS data into Sybase IQ for mid to long term usage in business reports
      • Example, in eCommerce: clickstream data from weblogs stored in HDFS and outputs of MR jobs on that data (to study browsing behavior) are ETL’d into Sybase IQ. The transactional sales data in Sybase IQ is joined with the clickstream data to understand and predict customer browsing-to-buying behavior

      [Figure: ETL. Clickstream data in HDFS is transferred by SQOOP into Sybase IQ, which holds the sales data]
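
      The extract, load, then join flow on this slide can be sketched as follows. This is an assumption-laden stand-in: a CSV string plays the role of raw HDFS data, Python's built-in sqlite3 plays the role of the warehouse, and the table names, column names, and rows are invented. It does not use SQOOP or Sybase IQ.

```python
import csv
import io
import sqlite3

# Stand-in for raw clickstream records sitting in a distributed file system.
RAW_CLICKSTREAM = "\n".join([
    "customer_id,page",
    "1,/product/a",
    "1,/home",
    "1,/product/b",
    "2,/product/a",
    "3,/product/c",
])


def extract(raw_text, page_prefix):
    """Extract step: keep only the subset of records worth loading."""
    for record in csv.DictReader(io.StringIO(raw_text)):
        if record["page"].startswith(page_prefix):
            yield int(record["customer_id"]), record["page"]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks (customer_id INTEGER, page TEXT)")
conn.execute("CREATE TABLE sales (customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 20.0), (2, 35.5)])

# Load step: the insert runs in one transaction, so it lands fully or not at all.
with conn:
    conn.executemany("INSERT INTO clicks VALUES (?, ?)", extract(RAW_CLICKSTREAM, "/product"))

# Once loaded, the data can be indexed and joined like any other table.
conn.execute("CREATE INDEX idx_clicks_customer ON clicks (customer_id)")
report = conn.execute(
    "SELECT s.customer_id, COUNT(c.page) AS product_views, s.amount "
    "FROM sales AS s JOIN clicks AS c ON c.customer_id = s.customer_id "
    "GROUP BY s.customer_id, s.amount ORDER BY s.customer_id"
).fetchall()
print(report)
```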




SYBASE IQ 15.4 DECONSTRUCTED
        App services — integrating Sybase IQ + Hadoop: using Data Federation

      Feature 6c: Join HDFS data with Sybase IQ data on the fly: fetch and join subsets of HDFS data on-demand using SQL queries from Sybase IQ (Data Federation technique)

      Characteristics
      • Scan and fetch specified data subsets from HDFS via table UDF
          – Can read and fetch HDFS data subsets
          – Called as part of Sybase IQ SQL query
          – Output joinable with Sybase IQ data
      • HDFS data not stored in Sybase IQ
          – Fetched into Sybase IQ in-memory tables
          – ACID properties not applicable
      • Visible to BI/other client tools via Sybase IQ ANSI SQL API

      Big Data Use Cases
      • Ideal for combining subsets of HDFS data with Sybase IQ data for operational (transient) business reports
      • Example, in retail: Point Of Sale (POS) detailed data is stored in HDFS. The Sybase IQ EDW fetches POS data for specific hot-selling SKUs from HDFS at fixed intervals and combines it with inventory data in Sybase IQ to predict and prevent inventory “stockouts”

      [Figure: POS data in HDFS reaches Sybase IQ, which holds the inventory data, through a UDF bridge]
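
      A small sketch of the on-demand fetch pattern from this slide, in plain Python. The generator fetch_pos_rows is a hypothetical stand-in for a table UDF, the list external_store stands in for HDFS, and the SKUs, quantities, and threshold are invented; nothing here is the Sybase IQ UDF API.

```python
def fetch_pos_rows(store, hot_skus):
    """Table-function stand-in: scan the external store, yield only requested SKUs.

    Rows are produced lazily at query time and are never persisted locally.
    """
    for sku, units_sold in store:
        if sku in hot_skus:
            yield sku, units_sold


def stockout_risk(pos_rows, inventory, threshold):
    """Join fetched sales with local inventory; flag SKUs about to run out."""
    sold = {}
    for sku, units_sold in pos_rows:
        sold[sku] = sold.get(sku, 0) + units_sold
    return sorted(
        sku for sku, units in sold.items()
        if inventory.get(sku, 0) - units < threshold
    )


# Stand-in for detailed point-of-sale records held outside the warehouse.
external_store = [("A", 5), ("B", 2), ("A", 4), ("C", 9)]

# Stand-in for inventory data held inside the warehouse.
inventory = {"A": 10, "B": 50, "C": 9}

at_risk = stockout_risk(fetch_pos_rows(external_store, {"A", "B"}), inventory, threshold=5)
print(at_risk)
```

      Unlike the ETL sketch, nothing is loaded or indexed here: each run re-reads the external data, which suits transient reports more than repeated use.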




SYBASE IQ 15.4 DECONSTRUCTED
        App services — integrating Sybase IQ + Hadoop: using Query Federation

      Feature 6d: Combine results of Hadoop MR jobs with Sybase IQ data on the fly: initiate and join results of Hadoop MR jobs on-demand using SQL queries from Sybase IQ (Query Federation technique)

      Characteristics
      • Trigger and fetch Hadoop MR job results via table UDF
          – Can trigger Hadoop MR jobs
          – Called as part of Sybase IQ SQL query
          – Output joinable with Sybase IQ data
      • HDFS data not stored in Sybase IQ
          – Fetched into Sybase IQ in-memory tables
          – ACID properties not applicable
          – Repeated use: put fetched data in tables
      • Visible to BI and other client tools via Sybase IQ ANSI SQL API

      Big Data Use Cases
      • Ideal for combining Hadoop MR job results with Sybase IQ data for operational (transient) business reports
      • Example, in utilities: smart meter and smart grid data can be combined for load monitoring and demand forecast. Smart grid transmission quality data (multi-attribute time series data) stored in HDFS can be computed via Hadoop MR jobs triggered from Sybase IQ and combined with smart meter data stored in Sybase IQ to analyze demand and workload.

      [Figure: smart grid transmission data in HDFS reaches Sybase IQ, which holds the smart meter consumption data, through a UDF bridge]
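
      The trigger-then-join pattern on this slide can be sketched with a toy, in-process map/reduce in plain Python. This is not Hadoop and not the Sybase IQ table UDF API; run_map_reduce, the feeder names, and all readings are hypothetical, chosen to mirror the smart grid example.

```python
from collections import defaultdict


def run_map_reduce(records, mapper, reducer):
    """Toy map/reduce: map each record to (key, value) pairs, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def quality_mapper(record):
    """Map step: emit the transmission quality reading keyed by feeder."""
    feeder, quality = record
    yield feeder, quality


def worst_quality_reducer(feeder, qualities):
    """Reduce step: keep the worst (lowest) quality reading per feeder."""
    return min(qualities)


# Stand-in for smart grid transmission data held outside the warehouse.
grid_readings = [("feeder1", 0.98), ("feeder2", 0.91), ("feeder1", 0.94)]

# Stand-in for smart meter consumption data held inside the warehouse.
meter_load = {"feeder1": 120.0, "feeder2": 310.0}

# "Trigger" the job on demand, then join its output with the local data.
worst_quality = run_map_reduce(grid_readings, quality_mapper, worst_quality_reducer)
report = sorted(
    (feeder, meter_load[feeder], quality)
    for feeder, quality in worst_quality.items()
    if feeder in meter_load
)
print(report)
```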




SYBASE IQ 15.4
     Unique, user community focused platform for big data analytics



              [Figure: user communities served by the platform; labels recovered from the slide]
              • Data Discovery (Data Scientists)
              • Application Modeling (Business Analysts)
              • Reports/Dashboards (BI Programmers)
              • Business Decisions (Business End Users)
              • Infrastructure Management (DBAs)
              • Full Mesh High Speed Interconnect
              • SAN Fabric




              • Dynamic, elastic PlexQ™ MPP grid
                   – Grow, shrink, provision on-demand
                   – Heavy parallelization
              • Load, prepare, mine, report in a workflow
                   – Privacy through isolation of resources
                   – Collaboration through sharing of results/data via sharing of resources


Thank you

    Courtney Claussen
    Product Manager, Sybase IQ
    courtney.claussen@sap.com

    David Jonker
    Product Marketing Director, Sybase IQ
    david.jonker@sap.com



Most of the Big Data opportunity is, in the end, a Big
                      Analytics opportunity.

                      There are two challenges in this:
                        Managing the data and the data flow
                        Providing acceptable performance for analytics
                        applications

                      Hadoop and its associated technologies can be both a
                      blessing and a curse.



• Hadoop = Key-value store & Parallel processing framework
                      • Some NoSQL databases are DHT-based, some are specialized DBMS
                      • Column-store DBMS vary, but in general they are MPP RDBMS and NewSQL DBMS


Data volumes (includes
                      complexity of data
                      structure)

                      Concurrency (includes
                      also workload
                      variability)

                      Computation (is
                      application dependent)

                      Data flow architecture
                      is a factor


In many ways this is similar to the
                      Data Warehouse data flow
                      challenge, writ large

                      Latency is about application service
                      levels

                      This is probably still a three stage
                      process

                      This is, by the way, a simplification



Big Analytics is here to stay

            In some analytical application areas
            speed is desirable, in others speed is
            critical.

            Warning: Workloads can be mixed

            Analytic speed depends upon the
            database engine, but also data flow
            architecture

            Business effectiveness depends upon
            integration with the business process

The prebuilt functions clearly make sense (for speed of
                      processing). Are they intended to make some analytic tools
                      unnecessary or simply to be called directly by such tools?

                      What does SAP see as the appropriate role(s) for Hadoop in most
                      businesses?

                      As I understand it, Sybase IQ can fully replace Hadoop in some
                      contexts. What are the situations where you think Hadoop AND
                      Sybase IQ is appropriate?

                      I’m intrigued by the idea of JOINing data between Hadoop
                      results and Sybase IQ, but I’m not sure of the role of such a
                      capability. How is this different from using MR for data ingest?

                      As you can link up to Hadoop/Sybase IQ at the front or at the
                      back-end, which would you tend to use when?




You speak of broad and comprehensive capability, in combination
                      with Hadoop.

                         So which areas do you think are sweet spots?

                         And which kinds of application and/or data collections do you
                         think require different approaches?

                      Who have been the early adopters of this Hadoop/Sybase IQ
                      capability and what kind of business problems are they trying to
                      solve?

                      What do you see as SAP HANA’s role in this? Are the same
                      analytical capabilities being added to SAP HANA?






In the Mix: How Native Integration Improves MapReduce and Hadoop

  • 2. Eric.kavanagh@bloorgroup.com Twitter Tag: #briefr Tuesday, May 22, 12
  • 3. Reveal the essential characteristics of enterprise software, good and bad.
      Provide a forum for detailed analysis of today’s innovative technologies.
      Give vendors a chance to explain their product to savvy analysts.
      Allow audience members to pose serious questions... and get answers!
  • 4. May: Analytics; June: Intelligence; July: Governance; August: Analytics; September: Integration; October: Database
  • 5. Ultimately analytics is about businesses making optimal decisions, although the range of technologies that inhabit this area is wide: statistical analysis, data mining, process mining, predictive analytics, predictive modeling, business process modeling and complex event processing.
      With the advent of big data, analytics has become “big analytics” with organizations diving into large heaps of data that previously was not available or usable.
      A major challenge with this market trend is to be able to provide adequate performance for all BI and analytics workloads on the volumes of data that are now being assembled and which are continuously growing.
  • 6. Robin Bloor is Chief Analyst at The Bloor Group. Robin.Bloor@Bloorgroup.com
  • 7. SAP Sybase has a history of database innovation and application from the corporate RDBMS through to the mobile and embedded market. Sybase IQ has been deployed in many areas of application and is used in many complex predictive analytics deployments, where speed, data capacity and versatility are critical. Recently it has been upgraded to be used in a symbiotic manner with Hadoop in order to provide a comprehensive capability as a BI and analytics engine for Big Data applications.
  • 8. David Jonker works in the area of Data Management & Analytics for SAP and is Product Marketing Director for Sybase IQ. In the last 5 years David has led product marketing teams for Sybase’s Data Management & Analytics product lines, including Sybase IQ, Sybase ASE, SQL Anywhere, and Advantage Database Server. His career includes over 10 years in software engineering and product management. Before joining Sybase, David had consulting, product management and software development roles.
      Courtney Claussen is a product manager at Sybase, Inc., focusing on Sybase's data warehousing and analytics products. She has enjoyed a 30-year career in software development, technical support and product marketing in the areas of computer aided design, computer aided software engineering, database management systems, middleware, and analytics.
  • 9. Sybase IQ 15.4 Overview: Big data analytics & Hadoop
  • 10. Sybase IQ: Widespread success
      – Stands out as the leading enterprise data warehouse among the largest banks, insurance agencies, and telecom operators worldwide
      – Manage and analyze statistical measures for the entire nation of Canada
      – Analyze ALL Federal tax returns in the US
      – Analyze complex models in more than 200 financial institutions worldwide
      – Store and analyze massive amounts of industry segment data in 30 of the largest information providers in the world, including Transunion, Nielsen and Axiom
  • 11. BIG DATA ANALYTICS ISSUES: Dealing with volume, variety, velocity, costs, skills
      – Volume: Managing and harnessing terabytes of data
      – Variety: Harmonizing silos of structured and unstructured data
      – Velocity: Keeping up with unpredictable data and query flows
      – Costs: Too expensive to acquire, operate, and expand
      – Skills: Lack of adequate skills for non-standard platforms and APIs
  • 12. Sybase IQ 15: A powerful big data analytics platform in the making
      Releases: v15.0 (2009), v15.1 (2009), v15.2 (2010), v15.3 (2011), v15.4 (2011), leading to big data analytics
      Issue and matching capability:
      – Volume: VLDB Platform Foundation
      – Velocity: In-Database Analytics API
      – Variety: Text Search, Web 2.0 API
      – Costs: PlexQ™ MPP Foundation
      – Skills: MapReduce API
  • 13. Sybase IQ 15.4: A comprehensive platform for big data analytics
      Diagram labels:
      – Sybase Eco-System: Sybase Control Center, Sybase PowerDesigner, certified ISV tools
      – App Services: Web 2.0, Java, C/C++, SQL, Federation
      – Unstructured Data Ingest + Persist (Hadoop, Content Mgmt)
      – Structured Data (DBMS)
  • 14. Details: In-Database Analytics & Hadoop
  • 15. In-database analytics in Sybase IQ: No compromise for complex analytics
      – Basic to advanced analytical functions available to SQL directly from the Sybase IQ engine
      – Data never leaves the database until results are materialized
      – Analytics code/models must be shareable yet must allow ad hoc analysis
      – Analytics code/models must be applicable to the latest data set
      – Standards-based access; concept extensibility is compulsory
      – Performance and scalability are a given
      – Average developer must be able to build in-database analytical models
      Diagram labels: Sybase IQ process; Database = Logic/Filtering; built-in functions; external DLL “A” applied in database
      Analytics simplified: Logic To Data = Fast + Efficient
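The "logic to data" idea on this slide can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in engine (an assumption; this is not Sybase IQ), and the `trades` table and `risk_score` function are hypothetical names. The custom function is registered with the engine and called from SQL, so only the aggregated result leaves the database.

```python
import sqlite3


def risk_score(notional: float, volatility: float) -> float:
    """Hypothetical analytic function: exposure weighted by volatility."""
    return notional * volatility


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (desk TEXT, notional REAL, volatility REAL)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?)",
    [("rates", 100.0, 0.5), ("rates", 200.0, 0.25), ("fx", 400.0, 0.5)],
)

# Register the function with the engine so SQL can call it directly.
conn.create_function("risk_score", 2, risk_score)

# The logic runs where the data lives; only one row per desk is returned.
in_db = conn.execute(
    "SELECT desk, SUM(risk_score(notional, volatility)) "
    "FROM trades GROUP BY desk ORDER BY desk"
).fetchall()

# Client-side alternative for contrast: every row is pulled out first.
client_side = {}
for desk, notional, vol in conn.execute("SELECT * FROM trades"):
    client_side[desk] = client_side.get(desk, 0.0) + risk_score(notional, vol)

print(in_db)
```

Both paths compute the same totals; the difference is how many rows cross the boundary between engine and client.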
  • 17. In-database analytics in Sybase IQ: Custom functions APIs
      Several different forms of C++ and JAVA UDF APIs for building custom in-database analytics, each valid at different locations within queries:
      1. {Scalar} to {Scalar} functions, e.g. sin, cosine, …
      2. {Scalar set} to {Scalar} functions, e.g. max, min, …
      3. {Scalar set} to {Scalar set}, e.g. OLAP windows, …
      4. {Scalar set} to {Tables}, e.g. join result sets, …
      5. {Scalar set, Tables} to {Tables}, e.g. MapReduce, …
      All variants are parallelizable, but (5) is also distributable across the PlexQ™ grid
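To make the five input/output shapes above concrete, here is a rough sketch in plain Python. It only mirrors the shapes; it is not the Sybase IQ C++ or Java UDF API, and every function name here is hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Row = Tuple[str, float]


def scalar_fn(x: float) -> float:
    """(1) scalar -> scalar, e.g. sin."""
    return math.sin(x)


def aggregate_fn(xs: Iterable[float]) -> float:
    """(2) scalar set -> scalar, e.g. max."""
    return max(xs)


def window_fn(xs: Iterable[float]) -> List[float]:
    """(3) scalar set -> scalar set, e.g. a running total over a window."""
    out, total = [], 0.0
    for x in xs:
        total += x
        out.append(total)
    return out


def table_fn(keys: Iterable[str]) -> List[Row]:
    """(4) scalar set -> table: expand scalar parameters into rows."""
    return [(k, float(len(k))) for k in keys]


def table_param_fn(threshold: float, rows: Iterable[Row]) -> List[Row]:
    """(5) scalar set + table -> table: filter rows, then sum values per key."""
    groups: Dict[str, float] = defaultdict(float)
    for key, value in rows:
        if value >= threshold:
            groups[key] += value
    return sorted(groups.items())
```

Form (5) is the only one that takes a table as input, which is what lets a MapReduce-style computation be expressed and, per the slide, distributed across the grid.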
  • 18. In-database analytics in Sybase IQ: Java custom functions (3)
      Feature: JAVA User Defined Function offers a new in-database analytics API
      Characteristics:
      – External algorithms written as JAVA fns, plugged into Sybase IQ
      – JAVA fns via SQL: runs in-database, much faster than client side
      – JAVA fns run protected/fault tolerant (in separate process)
      – Supports scalar and table outputs
      – Supports all data types
      Big Data Use Cases:
      – Ideal for ISV or custom data mining libraries for Healthcare, eCommerce, Public Sector
      – Apps include: ISV partner Zementis built a plug-in for PMML (Predictive Model Markup Language) models
          – Validates PMML from SAS, R, ..
          – Translates PMML to JAVA UDFs
          – JAVA UDFs called from SQL
      Diagram labels: PMML, Zementis Plug-In, JAVA UDF, Sybase IQ
  • 19. SYBASE IQ 15.4 DECONSTRUCTED (6a): App services, integrating Sybase IQ + Hadoop at client side
      Feature: Client side federation: Join data from Sybase IQ AND Hadoop at a client application level
      Characteristics:
      – Client tool capable of querying Sybase IQ and Hadoop
      – Currently certified client tool is Quest Toad for Cloud
      – Better performance when results from sources are pre-computed/pre-aggregated
      Big Data Use Cases:
      – Ideal for bringing together Big Data Analytics pre-computations from different domains
      – Example (Telecommunication): Sybase IQ with aggregated customer loyalty data & Hadoop with aggregated network utilization data; Quest Toad for Cloud can bring data from both sources, linking customer loyalty to network utilization or network faults (e.g. dropped calls)
      Diagram labels: Toad for Cloud Databases, Hadoop Hive, Sybase IQ
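A minimal sketch of the client-side pattern described above, under stated assumptions: the two fetch functions are hypothetical stand-ins for queries a client tool would send to each system, and the figures are invented for illustration. Each source returns pre-aggregated results and the join happens in the client.

```python
from typing import Dict, List, Tuple


def fetch_loyalty_from_warehouse() -> Dict[str, float]:
    """Stand-in for a pre-aggregated warehouse query (region -> loyalty score)."""
    return {"north": 0.82, "south": 0.64}


def fetch_dropped_calls_from_hadoop() -> Dict[str, int]:
    """Stand-in for a pre-aggregated Hive query (region -> dropped calls)."""
    return {"north": 120, "south": 950, "west": 40}


def join_on_client(
    left: Dict[str, float], right: Dict[str, int]
) -> List[Tuple[str, float, int]]:
    """Inner join of the two result sets, performed entirely in the client."""
    return [(k, left[k], right[k]) for k in sorted(left.keys() & right.keys())]


report = join_on_client(
    fetch_loyalty_from_warehouse(), fetch_dropped_calls_from_hadoop()
)
print(report)
```

Because the client does the join, keeping both inputs small through pre-aggregation is what makes this approach practical, which matches the performance note on the slide.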
  • 20. SYBASE IQ 15.4 DECONSTRUCTED (6b): App services, integrating Sybase IQ + Hadoop using ETL
      Feature: Load Hadoop unstructured data into Sybase IQ column store: Extract, transform, load data from HDFS (Hadoop Distributed File System) into Sybase IQ schemas
      Characteristics:
      – Extract & load subsets of HDFS data into Sybase IQ column store
          – Raw data from HDFS
          – Results of Hadoop MR jobs
      – HDFS data stored in Sybase IQ is treated like other Sybase IQ data
          – Gets ACID properties of a DBMS
          – Can be indexed, joined, parallelized
          – Can be queried in an ad hoc way
      – Visible to BI and other client tools via Sybase IQ ANSI SQL API only
      – Currently, the Apache bulk data transfer utility SQOOP (built by Cloudera) is certified to provide this ETL capability
      Big Data Use Cases:
      – Ideal for combining subsets of HDFS data or summary of HDFS data into Sybase IQ for mid to long term usage in business reports
      – Example (eCommerce): clickstream data from weblogs stored in HDFS and outputs of MR jobs on that data (to study browsing behavior) ETL’d into Sybase IQ. The transactional sales data in Sybase IQ joined with clickstream data to understand and predict customer browsing to buying behavior
      Diagram labels: Clickstream Data (HDFS), ETL (SQOOP), Sales Data (Sybase IQ)
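The ETL flow above can be sketched as follows. This is an illustration only: sqlite3 stands in for the warehouse, an in-memory tab-separated string stands in for an HDFS extract, and no Sqoop command is shown. Table and column names are hypothetical.

```python
import csv
import io
import sqlite3

# Stand-in for a clickstream extract exported from HDFS (session, sku, views).
hdfs_extract = io.StringIO("s1\tA\t3\ns2\tA\t1\ns3\tB\t5\n")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sku TEXT, units INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [("A", 10), ("B", 2)])
conn.execute("CREATE TABLE clickstream (session TEXT, sku TEXT, views INTEGER)")

# Extract + transform: parse rows and cast the view count to an integer.
rows = [(s, sku, int(v)) for s, sku, v in csv.reader(hdfs_extract, delimiter="\t")]

# Load inside a transaction; afterwards the rows behave like any other table data.
with conn:
    conn.executemany("INSERT INTO clickstream VALUES (?, ?, ?)", rows)
conn.execute("CREATE INDEX idx_click_sku ON clickstream (sku)")

# Ad hoc join of the loaded clickstream data with the sales data.
browse_to_buy = conn.execute(
    "SELECT s.sku, SUM(c.views), s.units "
    "FROM clickstream c JOIN sales s ON s.sku = c.sku "
    "GROUP BY s.sku, s.units ORDER BY s.sku"
).fetchall()
print(browse_to_buy)
```

Once loaded, the clickstream rows can be indexed and joined with plain SQL, which is the trade-off this option makes: extra load time in exchange for database treatment of the data.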
  • 21. SYBASE IQ 15.4 DECONSTRUCTED (6c): App services, integrating Sybase IQ + Hadoop using Data Federation
      Feature: Join HDFS data with Sybase IQ data on the fly: Fetch and join subsets of HDFS data on-demand using SQL queries from Sybase IQ (Data Federation technique)
      Characteristics:
      – Scan and fetch specified data subsets from HDFS via table UDF
          – Can read and fetch HDFS data subsets
          – Called as part of Sybase IQ SQL query
          – Output joinable with Sybase IQ data
      – HDFS data not stored in Sybase IQ
          – Fetched into Sybase IQ in-memory tables
          – ACID properties not applicable
      – Visible to BI/other client tools via Sybase IQ ANSI SQL API
      Big Data Use Cases:
      – Ideal for combining subsets of HDFS data with Sybase IQ data for operational (transient) business reports
      – Example (Retail): Point Of Sale (POS) detailed data stored in HDFS. Sybase IQ EDW fetches POS data at fixed intervals from HDFS of specific hot selling SKUs, combines with inventory data in Sybase IQ to predict and prevent inventory “stockouts”
      Diagram labels: POS Data (HDFS), UDF Bridge, Inventory Data (Sybase IQ)
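A small sketch of the data federation pattern above, with loud assumptions: `HDFS_POS` is an in-memory stand-in for files in HDFS, and `hdfs_pos_subset` is a hypothetical table function, not the actual UDF bridge. Rows are fetched on demand, joined with warehouse data, and never persisted.

```python
from typing import Dict, Iterator, List, Set, Tuple

# Stand-in for detailed POS records living in HDFS: (sku, store, qty sold).
HDFS_POS = [("A", "s1", 5), ("B", "s1", 1), ("A", "s2", 7), ("C", "s2", 2)]


def hdfs_pos_subset(skus: Set[str]) -> Iterator[Tuple[str, str, int]]:
    """Hypothetical table function: scan the external store, yield requested SKUs only."""
    for row in HDFS_POS:
        if row[0] in skus:
            yield row


def stockout_risk(
    inventory: Dict[str, int], hot_skus: Set[str]
) -> List[Tuple[str, int, int]]:
    """Join fetched POS rows with warehouse inventory; nothing is stored afterwards."""
    sold: Dict[str, int] = {}
    for sku, _store, qty in hdfs_pos_subset(hot_skus):
        sold[sku] = sold.get(sku, 0) + qty
    return [(sku, sold[sku], inventory[sku]) for sku in sorted(sold) if sku in inventory]


result = stockout_risk({"A": 10, "B": 50}, {"A", "B"})
print(result)
```

Each call re-reads the external data, so this fits transient, operational reports; for repeated use, loading the data as in the ETL option avoids refetching.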
  • 22. SYBASE IQ 15.4 DECONSTRUCTED (6d): App services, integrating Sybase IQ + Hadoop using Query Federation
      Feature: Combine results of Hadoop MR jobs with Sybase IQ data on the fly: Initiate and join results of Hadoop MR jobs on-demand using SQL queries from Sybase IQ (Query Federation technique)
      Characteristics:
      – Trigger and fetch Hadoop MR job results via table UDF
          – Can trigger Hadoop MR jobs
          – Called as part of Sybase IQ SQL query
          – Output joinable with Sybase IQ data
      – HDFS data not stored in Sybase IQ
          – Fetched into Sybase IQ in-memory tables
          – ACID properties not applicable
      – Repeated use: put fetched data in tables
      – Visible to BI and other client tools via Sybase IQ ANSI SQL API
      Big Data Use Cases:
      – Ideal for combining results of Hadoop MR jobs with Sybase IQ data for operational (transient) business reports
      – Example (Utilities): Smart meter and smart grid data can be combined for load monitoring and demand forecast. Smart grid transmission quality data (multi-attribute time series data) stored in HDFS can be computed via Hadoop MR jobs triggered from Sybase IQ and combined with smart meter data stored in Sybase IQ to analyze demand and workload.
      Diagram labels: Smart Grid Transmission Data (HDFS), UDF Bridge, Smart Meter Consumption Data (Sybase IQ)
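The query federation pattern differs from data federation in that calling the table function triggers a computation on the Hadoop side rather than a plain scan. A toy sketch, assuming a single-process `run_mapreduce` helper in place of a real Hadoop job and hypothetical names throughout:

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# Stand-in for smart grid time series in HDFS: (feeder, voltage deviation).
HDFS_GRID = [("f1", 0.25), ("f1", 0.75), ("f2", 0.5)]


def run_mapreduce(
    records: Iterable[tuple],
    mapper: Callable[[tuple], Iterable[Tuple[str, float]]],
    reducer: Callable[[List[float]], float],
) -> Dict[str, float]:
    """Toy single-process MapReduce: group mapper output by key, then reduce."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for rec in records:
        for key, value in mapper(rec):
            groups[key].append(value)
    return {key: reducer(values) for key, values in groups.items()}


def grid_quality_table() -> List[Tuple[str, float]]:
    """Hypothetical table function: runs the job when called and returns its rows."""
    result = run_mapreduce(
        HDFS_GRID,
        mapper=lambda rec: [(rec[0], rec[1])],
        reducer=lambda vals: sum(vals) / len(vals),
    )
    return sorted(result.items())


def demand_report(meter_kwh: Dict[str, float]) -> List[Tuple[str, float, float]]:
    """Join job output with meter data held in the warehouse."""
    return [(f, dev, meter_kwh[f]) for f, dev in grid_quality_table() if f in meter_kwh]


report = demand_report({"f1": 120.0, "f2": 80.0})
print(report)
```

The job runs again on every call, which is why the slide suggests putting fetched data into tables when the same result is needed repeatedly.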
  • 23. SYBASE IQ 15.4: Unique, user community focused platform for big data analytics
      Diagram labels: Data Discovery (Data Scientists); Application Modeling (Business Analysts); Reports/Dashboards (BI Programmers); Business Decisions (Business End Users); Infrastructure Management (DBAs); Full Mesh High Speed Interconnect; SAN Fabric
      – Dynamic, elastic PlexQ™ MPP grid
          – Grow, shrink, provision on-demand
          – Heavy parallelization
      – Load, prepare, mine, report in a workflow
          – Privacy through isolation of resources
          – Collaboration through sharing of results/data via sharing of resources
  • 24. Thank you
      Courtney Claussen, Product Manager, Sybase IQ, courtney.claussen@sap.com
      David Jonker, Product Marketing Director, Sybase IQ, david.jonker@sap.com
  • 27. Most of the Big Data opportunity is, in the end, a Big Analytics opportunity. There are two challenges in this:
      – Managing the data and the data flow
      – Providing acceptable performance for analytics applications
      Hadoop and its associated technologies can be both a blessing and a curse.
  • 28. – Hadoop = Key-value store & Parallel processing framework
      – Some NoSQL databases are DHT-based, some are specialized DBMS
      – Column-store DBMS vary, but in general they are MPP RDBMS and NewSQL DBMS
  • 29. – Data volumes (includes complexity of data structure)
      – Concurrency (includes also workload variability)
      – Computation (is application dependent)
      – Data flow architecture is a factor
  • 30. – In many ways this is similar to the Data Warehouse data flow challenge, writ larger
      – Latency is about application service levels
      – This is probably still a three stage process
      – This is, by the way, a simplification
  • 31. – Big Analytics is here to stay
      – In some analytical application areas speed is desirable, in others speed is critical. Warning: Workloads can be mixed
      – Analytic speed depends upon the database engine, but also data flow architecture
      – Business effectiveness depends upon integration with the business process
  • 32. – The prebuilt functions clearly make sense (for speed of processing). Are they intended to make some analytic tools unnecessary or simply to be called directly by such tools?
      – What does SAP see as the appropriate role(s) for Hadoop in most businesses?
      – As I understand it, Sybase IQ can fully replace Hadoop in some contexts. What are the situations where you think Hadoop AND Sybase IQ is appropriate?
      – I’m intrigued by the idea of JOINing data between Hadoop results and Sybase IQ, but I’m not sure of the role of such a capability. How is this different from using MR for data ingest?
      – As you can link up to Hadoop/Sybase IQ at the front or at the back-end, which would you tend to use when?
  • 33. – You speak of broad and comprehensive capability, in combination with Hadoop. So which areas do you think are sweet spots? And which kinds of application and/or data collections do you think require different approaches?
      – Who have been the early adopters of this Hadoop/Sybase IQ capability and what kind of business problems are they trying to solve?
      – What do you see as SAP HANA’s role in this? Are the same analytical capabilities being added to SAP HANA?
  • 35. May: Analytics; June: Intelligence; July: Governance; August: Analytics; September: Integration; October: Database