InfoSphere BigInsights
Hadoop business ready
Wilfried Hoge
IT Architect Big Data
© 2013 International Business Machines Corporation 2
Getting the Value from Big Data – Why a Platform?
§  Almost all big data use cases require
an integrated set of big data technologies
to address the business pain completely
§  Reduce time and cost and provide quick ROI
by leveraging pre-integrated components
§  Be flexible in the combination of technologies
§  Start small with a single project and progress
to others over your big data journey
[Diagram: BIG DATA PLATFORM. Components: Systems Management, Application Development, Discovery; Accelerators; Hadoop System, Stream Computing, Data Warehouse; Information Integration & Governance. Data sources: Data, Media, Content, Machine, Social.]
InfoSphere BigInsights: Bringing Hadoop to the enterprise
InfoSphere BigInsights is IBM's distribution of Hadoop that delivers additional value
Accelerators: speed time to value with analytic and application accelerators
New Architecture to Leverage All Data and Analytics
Data in Motion
Data at Rest
Data in Many Forms
Information Ingestion and Operational Information
Decision Management
BI and Predictive Analytics
Navigation and Discovery
Intelligence Analysis
Landing Area, Analytics Zone and Archive
§  Raw Data
§  Structured Data
§  Text Analytics
§  Data Mining
§  Entity Analytics
§  Machine Learning
Real-time Analytics
§  Video/Audio
§  Network/Sensor
§  Entity Analytics
§  Predictive
Exploration, Integrated Warehouse, and Mart Zones
§  Discovery
§  Deep Reflection
§  Operational
§  Predictive
§  Stream Processing
§  Data Integration
§  Master Data
Streams
Information Governance, Security and Business Continuity
InfoSphere BigInsights
•  Brings Hadoop to the enterprise
•  Enhances ease of use and consumability
•  Takes the complexity out of getting started with Hadoop
•  Lets users across the organization build applications and get insights at their fingertips without having to learn new skill sets
Tools for Administrators
•  Monitoring capabilities provide a centralized dashboard view of key performance
indicators, including CPU, disk, memory, and network usage for the cluster; data services
such as HDFS, HBase, ZooKeeper, and Flume; and application services including MapReduce,
Hive, and Oozie
•  Status information and control
over the major cluster
capabilities
•  Advanced capabilities to control
application permissions and
deployment
•  Capability to view and control
all applications from a single
page
BigSheets to analyze and visualize
•  Model “big data” collected
from various sources in
spreadsheet-like structures
•  Filter and enrich content with
built-in functions
•  Combine data in different
workbooks
•  Visualize results through
spreadsheets and charts
•  Export data into common
formats (if desired)
No programming knowledge needed!
Centralized dashboard & data flows
A centralized dashboard to visualize analytic results:
•  BigSheets collections
•  Analytic application results
•  Monitoring metrics
•  Ability to view BigSheets data flows between and across data sets to quickly navigate and relate analysis and charts
•  Visualize inner and outer joins, enhanced filters for BigSheets columns, column data-type mapping for collections, application of analytics to BigSheets columns, etc.
Tools for Developers
Editors
•  A workflow editor that greatly simplifies the creation of complex Oozie workflows with a consumable interface
•  A Pig/Jaql editor with content assist and syntax highlighting that enables users to create and execute new applications using Pig or Jaql in local or cluster mode from the Eclipse IDE
Application development & deployment
•  Enablement of BigSheets macro and BigSheets reader development
•  Text Analytics development, including support for modular rule sets
•  Publish new applications: BigSheets Macro, BigSheets Reader, AQL module, Jaql module
1. Sample your data
2. Develop your application using BigInsights tools
3. Test your application
4. Package and publish your application
5. Deploy your application on the cluster
Running Applications on Big Data
•  Browse available applications
•  Deploy published applications
(administrators only)
•  Launch (or schedule for launch) a
deployed application
•  Monitor job (application) execution
status
•  Predefined applications
•  Import & Export Data
•  Database & Files
•  Web and Social
•  Analyze and Query
•  Predictive Analytics
•  Text Analytics
•  SQL/Hive, Jaql, Pig, HBase
•  Accelerators
Application linking and interfaces to build new apps
•  Compose new
applications from
existing applications
and BigSheets
•  Invoke analytics
applications from the
web console, including
integration within
BigSheets
•  REST data source App
that enables users to
load data from any data source supporting REST APIs into BigInsights, including
popular social media services
•  Sampling App that enables users to sample data for analysis
•  Subsetting App that enables users to subset data for data analysis
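The slide does not describe how the Sampling and Subsetting apps work internally. Purely to illustrate the two ideas, the sketch below draws a fixed-size random sample in one pass using reservoir sampling (a textbook technique swapped in here, not necessarily what BigInsights does) and subsets records with a predicate. The function names and the record fields are hypothetical.

```python
import random
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def reservoir_sample(records: Iterable[T], k: int, seed: int = 0) -> List[T]:
    """Pick k records uniformly at random in a single pass (Algorithm R)."""
    rng = random.Random(seed)
    sample: List[T] = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Replace a reservoir slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = record
    return sample


def subset(records: Iterable[T], keep: Callable[[T], bool]) -> Iterator[T]:
    """Yield only the records for which the predicate holds."""
    return (r for r in records if keep(r))


# Hypothetical input: 1000 events from two kinds of sources.
events = [{"id": i, "source": "web" if i % 2 else "sensor"} for i in range(1000)]

sampled = reservoir_sample(events, k=10, seed=42)
sensor_only = list(subset(events, lambda e: e["source"] == "sensor"))
```

Sampling keeps the working set small enough for interactive exploration, while subsetting keeps every record that matches a rule; which one fits depends on whether representativeness or completeness matters more for the analysis.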
Collaborative Big Data for many roles
§  Business users can get their hands on big
data and use big data applications and
BigSheets to get insights into their data
§  Data scientists can perform deeper analysis
and get richer insights
§  Administrators are empowered to be more
agile through better controls and views into key
performance indicators
§  Developers can leverage unified tooling in a Big Data
Application Development Lifecycle and are able to create and
deploy new types of applications, with enhancements that
simplify even complex workflows
Built-in accelerators
•  Software components that accelerate development and/or implementation of specific
solutions or use cases on top of the Big Data platform
•  Provide business logic, data processing, and UI/visualization, tailored for a given use case
•  Bundled with Big Data platform components – InfoSphere BigInsights and InfoSphere
Streams
•  Key Benefits
–  Time to value
–  Leverage best practices around implementation of a given use case.
•  Analytical Accelerators
–  Text analytics
–  Geospatial analytics
–  Machine learning
–  Time series
–  Data mining
•  Application Accelerators
–  Machine Data Analytics – operational data including logs for operations efficiency
–  Social Data Analytics – sentiment analytics, intent to purchase
–  Telecommunications – CDR streaming analytics, deep customer event analytics
–  Finance Analysis – streaming options, trading, insurance and banking DW models
Machine Data Analytics Accelerator
What does it do?
§  Provides the ability to ingest, parse and extract a wide
variety of machine data
– Faceted search enables easy navigation and discovery
– Visualization enables easy analysis of the data
Machine Data Analytics
Example Application: Facilities Management
• Use real-time data from building devices such as meters, sensors, and motion
detectors to monitor and manage power usage
Why should you care?
§  It enables clients to gain insights into operations, customer experience,
transactions and behavior, processing machine data in minutes instead of days
and weeks
§  With these insights, clients can:
– Proactively plan to increase operational efficiency
– Troubleshoot problems and investigate security incidents
– Monitor end-to-end infrastructure to avoid service degradation or outages
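As a rough illustration of what "ingest, parse and extract" followed by faceted navigation can look like, the sketch below parses a few log lines with a regular expression and counts values per field. The log layout, field names, and sample lines are hypothetical; the accelerator's actual parsers and formats are not described on the slide.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List, Optional

# Hypothetical log layout: "<ISO timestamp> <host> <LEVEL> <message>"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<host>\S+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$"
)


def parse_line(line: str) -> Optional[Dict[str, str]]:
    """Extract named fields from one log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def facet_counts(records: Iterable[Dict[str, str]], field: str) -> Counter:
    """Count how often each value of a field occurs (one facet)."""
    return Counter(r[field] for r in records)


raw_lines = [
    "2013-04-30T10:15:00Z meter-01 INFO power_kw=4.2",
    "2013-04-30T10:15:05Z sensor-07 WARN temperature_c=41",
    "2013-04-30T10:15:09Z meter-01 ERROR read timeout",
    "malformed entry",
]

# Keep only the lines that could be parsed.
parsed: List[Dict[str, str]] = [
    r for r in map(parse_line, raw_lines) if r is not None
]

by_host = facet_counts(parsed, "host")
by_level = facet_counts(parsed, "level")
```

Each facet is just a count per distinct field value, which is what lets a search interface offer drill-down filters such as "host = meter-01" or "level = ERROR".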
Machine Data Analytics Accelerator High-Level Workflow
Use the Machine Data Analytics Accelerator by starting the
predefined applications
View results of MDA in web, BigSheets and dashboard
BigInsights Enterprise Edition
[Component diagram; legend: Open Source, IBM, optional IBM and partner offerings]
Connectivity and Integration: Streams, Netezza, JDBC, Flume
Infrastructure: Jaql, Hive, Pig, HBase, MapReduce, HDFS, ZooKeeper, Lucene (indexing), Adaptive MapReduce, Oozie, text compression, enhanced security, flexible scheduler
Analytics and discovery: BigSheets, text processing engine and library, "Apps" (Web Crawler, Distrib file copy, DB export, Boardreader, DB import, Ad hoc query, Machine learning, Data processing, . . .)
Administrative and development tools
Web console
•  Monitor cluster health, jobs, etc.
•  Add / remove nodes
•  Start / stop services
•  Inspect job status
•  Inspect workflow status
•  Deploy applications
•  Launch apps / jobs
•  Work with distrib file system
•  Work with spreadsheet interface
•  Support REST-based API
•  . . .
Eclipse tools
•  Text analytics
•  MapReduce programming
•  Jaql, Hive, Pig development
•  BigSheets plug-in development
•  Oozie workflow generation
Integrated installer
Other components shown: DB2, R, Cognos BI, GPFS (EAP), Accelerator for machine data analysis, Accelerator for social data analysis, Guardium, DataStage, Data Explorer, Sqoop, HCatalog
BigInsights: Value Beyond Open Source
Enterprise Capabilities: Administration & Security, Workload Optimization, Connectors, Advanced Engines, Visualization & Exploration, Development Tools
Open source components
IBM-certified
Apache Hadoop or or …
Key differentiators
•  Built-in analytics
•  Enterprise software integration
•  Spreadsheet-style analysis
•  Integrated installation of supported open
source and other components
•  Web Console for admin and application
access
•  Platform enrichment: additional security,
performance features, . . .
•  World-class support
•  Full open source compatibility
Business benefits
•  Quicker time-to-value due to IBM
technology and support
•  Reduced operational risk
•  Enhanced business knowledge with flexible
analytical platform
•  Leverages and complements existing
software
If this were easy, everyone would already be
leveraging big data
“Big Data offers big business gains but hidden costs and complexity present
barriers that most organizations will struggle with”
- The Cost of Big Data, Eric Savitz, Forbes 5/2012
§  Open source Apache Hadoop for enterprise usage is incomplete
§  Hadoop skills are in short supply
§  Custom built solutions lack integrated cluster management
§  Requires integration effort within the existing analytic ecosystem
§  Most integrated solutions do not help with archival
Simplifying Big Data for the Enterprise
The new PureData System for Hadoop
§  Accelerate time to value
§  Accelerate time to insight
§  Simplify big data adoption and consumption
§  Extend the value of the data warehouse
§  Implement enterprise class big data
§  Minimize system setup and administration
§  Available in 2H2013
Benefits of IBM PureData System for Hadoop
Accelerate Big Data Time to Value (Built-in Expertise)
§  Deploy 8x faster than custom-built solutions (1)
§  Built-in visualization to accelerate insight
§  Built-in analytic accelerators, unlike big data appliances on the market (2)
Simplify Big Data Adoption & Consumption (Simplified Experience)
§  Single system console for full system administration
§  Rapid maintenance updates with automation
§  No assembly required, data load ready in hours
Implement Enterprise Class Big Data (Integration by Design)
§  Only integrated Hadoop system with built-in archiving tools (2)
§  Delivered with more robust security than open source software
§  Architected for high availability
(1) Based on IBM internal testing and customer feedback. "Custom built clusters" refer to clusters that are not professionally pre-built, pre-tested and optimized. Individual results may vary.
(2) Based on current commercially available Big Data appliance product data sheets from large vendors. US ONLY CLAIM.
SQL Access for Hadoop: Why?
•  Data warehouse augmentation is
a leading Hadoop use case
•  MapReduce is difficult
–  MapReduce Java API is tedious and
requires programming expertise
–  Unfamiliar languages (e.g., Pig) also require special skills
•  SQL support would open the data to a much wider audience
–  Familiar, widely known syntax
–  Common catalog for identifying data and structure
–  Declarative – clear separation of the what (the data you’re after) vs.
the how (processing)
[Diagram: three use cases: (1) Pre-Processing Hub, (2) Query-able Archive, (3) Exploratory Analysis. Labels: Information Integration; Data Warehouse; Streams (real-time processing); BigInsights (landing zone for all data); BigInsights (can combine with unstructured information).]
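To make the "what versus how" contrast on this slide concrete, the sketch below computes the same per-region total twice: once by spelling out map, shuffle, and reduce steps by hand, and once with a single declarative SQL statement. It runs in plain Python with the standard-library `sqlite3` module as a stand-in; it is not Hadoop MapReduce or Big SQL, and the table, column names, and data are hypothetical.

```python
import sqlite3
from collections import defaultdict
from itertools import chain

# Hypothetical input records: (region, amount)
sales = [("north", 120), ("south", 80), ("north", 30), ("east", 50)]


# The "how": write the map, shuffle and reduce steps explicitly.
def map_phase(record):
    region, amount = record
    yield region, amount  # emit a (key, value) pair


def reduce_phase(region, amounts):
    return region, sum(amounts)


grouped = defaultdict(list)  # shuffle: group emitted values by key
for key, value in chain.from_iterable(map_phase(r) for r in sales):
    grouped[key].append(value)
mapreduce_totals = dict(reduce_phase(k, v) for k, v in grouped.items())

# The "what": declare the result and let the engine plan the processing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", sales)
sql_totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)
conn.close()
```

Both paths produce the same totals, but only the SQL version leaves grouping and execution strategy to the engine, which is the property the slide attributes to declarative access.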
SQL for Hadoop: What’s the Problem?
•  SQL Access to data in Hadoop is challenging
–  Data is in many formats
•  CSV, JSON, Hive RCFile, HBase, ...
•  Some formats (HBase composite keys) don’t map cleanly
to relational models
–  No schemas or statistics
–  Hadoop was not designed to be a query engine
•  Hive (with HiveQL): limited query access for Hadoop
–  SQL-like, but NOT SQL
•  Limited data types – no varchar(n), decimal(p,s), etc…
•  Limited join support
•  No subqueries
•  No windowed aggregates
–  Very limited JDBC/ODBC driver
–  Everything executes in MapReduce
•  Even very small queries requiring little processing
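For readers unfamiliar with the two constructs the slide lists as missing from HiveQL at the time, the sketch below shows a scalar subquery and a windowed aggregate in standard SQL. It uses SQLite through Python's `sqlite3` module purely as a convenient stand-in engine (window functions need SQLite 3.25 or later); it says nothing about Hive's or Big SQL's actual syntax support, and the table and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "acme", 100), (2, "acme", 50), (3, "zenith", 300), (4, "zenith", 10)],
)

# Subquery: orders whose amount exceeds the overall average (115 here).
above_average = conn.execute(
    """
    SELECT customer, amount
    FROM orders
    WHERE amount > (SELECT AVG(amount) FROM orders)
    ORDER BY amount
    """
).fetchall()

# Windowed aggregate: running total per customer, without collapsing rows.
running_totals = conn.execute(
    """
    SELECT id, customer, amount,
           SUM(amount) OVER (PARTITION BY customer ORDER BY id) AS running_total
    FROM orders
    ORDER BY id
    """
).fetchall()

conn.close()
```

Without these constructs, the same results need extra joins or multiple passes over the data, which is one reason the slide calls HiveQL "SQL-like, but NOT SQL".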
Big SQL: Native SQL Query Access for Hadoop
•  Native SQL access to data
stored in BigInsights
–  ANSI SQL 92+
–  Standard syntax support (joins, data types, …)
•  Real JDBC/ODBC drivers
–  Prepared statements
–  Cancel support
–  Database metadata API support
–  Secure socket connections (SSL)
•  Optimization
–  Leveraging MapReduce parallelism
or…
–  Direct access for low-latency queries
•  Varied data sources
–  HBase (including secondary indexes)
–  CSV, Delimited files, Sequence files
–  JSON
–  Hive tables
[Diagram: Application → JDBC / ODBC Driver → SQL → JDBC / ODBC Server → Big SQL Engine (in BigInsights) → Data Sources: Hive tables, HBase tables, CSV files]
From Getting Started to Enterprise Deployment
InfoSphere BigInsights Brings Hadoop to the Enterprise
Apache Hadoop
Basic Edition (free download)
- Web-based mgmt console
- Jaql
- Integrated install
Enterprise Edition (sold by # of terabytes managed)
- Accelerators
- Performance Optimization
- Visualization Capabilities
- Pre-built applications
- Text analytics
- Spreadsheet-style tool
- RDBMS, warehouse connectivity
- Administrative tools, security
- Eclipse development tools
- Enterprise Integration . . . .
PureData for Hadoop
- Appliance simplicity for the enterprise
[Chart axes: breadth of capabilities, enterprise class]
Where to start with BigInsights?
•  Learn it at BigDataUniversity.com
•  Try it on Smart Cloud Enterprise: ibm.biz/Bdx8FF
•  Read about it in “Harness the Power of Big Data”
at ibm.biz/Bdx8RP
•  Learn about Big Data at www.ibmbigdatahub.com
•  Register for “Big Data at the speed of business” event on
April 30th at ibm.co/bigdataevent
•  Try BigSQL: bigsql.imdemocloud.com
•  YouTube Videos - Big Data Channel: youtube.com/user/ibmbigdata
Please Note
IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without
notice at IBM’s sole discretion.
Information regarding potential future products is intended to outline our general product direction and it
should not be relied on in making a purchasing decision.
The information mentioned regarding potential future products is not a commitment, promise, or legal
obligation to deliver any material, code or functionality. Information about potential future products may not
be incorporated into any contract. The development, release, and timing of any future features or
functionality described for our products remains at our sole discretion.
Performance is based on measurements and projections using standard IBM benchmarks in a controlled
environment. The actual throughput or performance that any user will experience will vary depending upon
many factors, including considerations such as the amount of multiprogramming in the user’s job stream,
the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can
be given that an individual user will achieve results similar to those stated here.

EMC Pivotal overview deckmister_moun
 






InfoSphere BigInsights

  • 1. InfoSphere BigInsights: Hadoop business ready. Wilfried Hoge, IT Architect Big Data
  • 2. Getting the Value from Big Data – Why a Platform? (© 2013 International Business Machines Corporation)
      • Almost all big data use cases require an integrated set of big data technologies to address the business pain completely
      • Reduce time and cost and provide quick ROI by leveraging pre-integrated components
      • Be flexible in the combination of technologies
      • Start small with a single project and progress to others over your big data journey
      • [Figure: BIG DATA PLATFORM. Components: Accelerators; Information Integration & Governance; Data Warehouse; Stream Computing; Hadoop System; Discovery; Application Development; Systems Management. Data types: Data, Media, Content, Machine, Social]
  • 3. InfoSphere BigInsights: Bringing Hadoop to the enterprise
      • InfoSphere BigInsights is IBM's distribution of Hadoop that delivers additional value
      • Accelerators: speed time to value with analytic and application accelerators
      • [Figure: BIG DATA PLATFORM diagram, repeated from slide 2]
  • 4. New Architecture to Leverage All Data and Analytics
      • Inputs: Data in Motion; Data at Rest; Data in Many Forms
      • Information Ingestion and Operational Information: Stream Processing; Data Integration; Master Data
      • Landing Area, Analytics Zone and Archive: Raw Data; Structured Data; Text Analytics; Data Mining; Entity Analytics; Machine Learning
      • Real-time Analytics (Streams): Video/Audio; Network/Sensor; Entity Analytics; Predictive
      • Exploration, Integrated Warehouse, and Mart Zones: Discovery; Deep Reflection; Operational; Predictive
      • Consumers: Decision Management; BI and Predictive Analytics; Navigation and Discovery; Intelligence Analysis
      • Foundation: Information Governance, Security and Business Continuity
  • 5. New Architecture to Leverage All Data and Analytics: InfoSphere BigInsights
      • [Figure: architecture diagram, repeated from slide 4]
      • Brings Hadoop to the enterprise
      • Enhances ease of use and consumability
      • Takes the complexity out of getting started with Hadoop
      • Users across the organization can build applications and get insights at their fingertips without having to learn new skill sets
  • 6. Tools for Administrators
      • Monitoring capabilities provide a centralized dashboard view to visualize key performance indicators, including CPU, disk, memory, and network usage for the cluster; data services such as HDFS, HBase, ZooKeeper, and Flume; and application services including MapReduce, Hive, and Oozie
      • Status information and control over the major cluster capabilities
      • Advanced capabilities to control application permissions and deployment
      • Capability to view and control all applications from a single page
  • 7. BigSheets to analyze and visualize
      • Model "big data" collected from various sources in spreadsheet-like structures
      • Filter and enrich content with built-in functions
      • Combine data in different workbooks
      • Visualize results through spreadsheets and charts
      • Export data into common formats (if desired)
      • No programming knowledge needed!
  • 8. Centralized dashboard & data flows
      • A centralized dashboard to visualize analytic results:
          – BigSheets collections
          – Analytic application results
          – Monitoring metrics
      • Ability to view BigSheets data flows between and across data sets to quickly navigate and relate analysis and charts
      • Visualize inner and outer joins, enhanced filters for BigSheets columns, column data-type mapping for collections, application of analytics to BigSheets columns, etc.
  • 9. Tools for Developers
      • Editors
          – A workflow editor that greatly simplifies the creation of complex Oozie workflows with a consumable interface
          – A Pig/Jaql editor with content assist and syntax highlighting that enables users to create and execute new applications using Pig or Jaql in local or cluster mode from the Eclipse IDE
      • Application development & deployment
          – Enablement of BigSheets macro and BigSheets reader development
          – Text Analytics development, including support for modular rule sets
          – Publish new application: BigSheets Macro, BigSheets Reader, AQL module, Jaql module
      • Development lifecycle
          1. Sample your data
          2. Develop your application using BigInsights tools
          3. Test your application
          4. Package and publish your application
          5. Deploy your application on the cluster
  • 10. Running Applications on Big Data
      • Browse available applications
      • Deploy published applications (administrators only)
      • Launch (or schedule for launch) a deployed application
      • Monitor job (application) execution status
      • Predefined applications:
          – Import & Export Data
          – Database & Files
          – Web and Social
          – Analyze and Query
          – Predictive Analytics
          – Text Analytics
          – SQL/Hive, Jaql, Pig, HBase
          – Accelerators
  • 11. Application linking and interfaces to build new apps
      • Compose new applications from existing applications and BigSheets
      • Invoke analytics applications from the web console, including integration within BigSheets
      • REST data source app that enables users to load data from any data source supporting REST APIs into BigInsights, including popular social media services
      • Sampling app that enables users to sample data for analysis
      • Subsetting app that enables users to subset data for data analysis
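The slide does not say how the Sampling and Subsetting apps work internally. As a rough illustration of the two ideas only, the sketch below draws a uniform random sample from a stream of records (reservoir sampling, a technique chosen here, not one the deck names) and filters a subset by a predicate. The function names `reservoir_sample` and `subset` are hypothetical and are not part of BigInsights.

```python
import random
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(records: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Return up to k records chosen uniformly at random from a stream.

    Hypothetical helper: reads the input once and keeps only k records in memory,
    which matters when the full data set is too large to load.
    """
    rng = random.Random(seed)
    sample: List[T] = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Replace an existing entry with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                sample[j] = record
    return sample


def subset(records: Iterable[T], predicate: Callable[[T], bool]) -> List[T]:
    """Return only the records that satisfy the predicate (hypothetical helper)."""
    return [record for record in records if predicate(record)]
```

For example, `reservoir_sample(range(1_000_000), 100, seed=42)` returns 100 values without materializing the full range, and `subset(rows, lambda r: r["country"] == "DE")` keeps only matching rows of a hypothetical `rows` list.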
  • 12. Collaborative Big Data for many roles
      • Business users can get their hands on big data and use big data applications and BigSheets to get insights into their data
      • Data scientists can perform deeper analysis and get richer insights
      • Administrators are empowered to be more agile through better controls and views into key performance indicators
      • Developers can leverage unified tooling in a Big Data Application Development Lifecycle and are able to create and deploy new types of applications, with enhancements that simplify even complex workflows
  • 13. Built-in accelerators
      • Software components that accelerate development and/or implementation of specific solutions or use cases on top of the Big Data platform
      • Provide business logic, data processing, and UI/visualization, tailored for a given use case
      • Bundled with Big Data platform components: InfoSphere BigInsights and InfoSphere Streams
      • Key benefits
          – Time to value
          – Leverage best practices around implementation of a given use case
      • Analytical accelerators
          – Text analytics
          – Geospatial analytics
          – Machine learning
          – Time series
          – Data mining
      • Application accelerators
          – Machine Data Analytics: operational data, including logs, for operations efficiency
          – Social Data Analytics: sentiment analytics, intent to purchase
          – Telecommunications: CDR streaming analytics, deep customer event analytics
          – Finance Analysis: streaming options, trading, insurance and banking DW models
  • 14. Machine Data Analytics Accelerator
      • What does it do?
          – Provides the ability to ingest, parse, and extract a wide variety of machine data
          – Faceted search enables easy navigation and discovery
          – Visualization enables easy analysis of the data
      • Example application: Facilities Management
          – Use real-time data from building devices such as meters, sensors, and motion detectors to monitor and manage power usage
      • Why should you care?
          – It enables clients to gain insights into operations, customer experience, transactions, and behavior, processing machine data in minutes instead of days and weeks
          – With these insights, clients can proactively plan to increase operational efficiency
          – Troubleshoot problems and investigate security incidents
          – Monitor end-to-end infrastructure to avoid service degradation or outages
  • 15. Machine Data Analytics Accelerator: High-Level Workflow
  • 16. Use the Machine Data Analytics Accelerator by starting the predefined applications
  • 17. View results of MDA in web, BigSheets and dashboard
  • 18. BigInsights Enterprise Edition
      • [Figure: component diagram. Section labels: Connectivity and Integration; Infrastructure; Analytics and discovery; "Apps"; Administrative and development tools; Optional IBM and partner offerings. Legend: Open Source, IBM]
      • Components named in the diagram: Streams, Netezza, DB2, JDBC, Flume, Sqoop, HCatalog, Text processing engine and library, Jaql, Hive, Pig, HBase, MapReduce, HDFS, ZooKeeper, Indexing, Lucene, Adaptive MapReduce, Oozie, Text compression, Enhanced security, Flexible scheduler, BigSheets, R, Integrated installer, GPFS (EAP)
      • "Apps": Web Crawler, Distrib file copy, DB export, Boardreader, DB import, Ad hoc query, Machine learning, Data processing, . . .
      • Web console
          – Monitor cluster health, jobs, etc.
          – Add / remove nodes
          – Start / stop services
          – Inspect job status
          – Inspect workflow status
          – Deploy applications
          – Launch apps / jobs
          – Work with distrib file system
          – Work with spreadsheet interface
          – Support REST-based API
          – . . .
      • Eclipse tools
          – Text analytics
          – MapReduce programming
          – Jaql, Hive, Pig development
          – BigSheets plug-in development
          – Oozie workflow generation
      • Other offerings named: IBM Cognos BI, Accelerator for machine data analysis, Accelerator for social data analysis, Guardium, DataStage, Data Explorer
  • 19. BigInsights: Value Beyond Open Source
      • [Figure: Enterprise Capabilities (Administration & Security; Workload Optimization; Connectors; Advanced Engines; Visualization & Exploration; Development Tools) layered on open source components: IBM-certified Apache Hadoop or …]
      • Key differentiators
          – Built-in analytics
          – Enterprise software integration
          – Spreadsheet-style analysis
          – Integrated installation of supported open source and other components
          – Web Console for admin and application access
          – Platform enrichment: additional security, performance features, . . .
          – World-class support
          – Full open source compatibility
      • Business benefits
          – Quicker time-to-value due to IBM technology and support
          – Reduced operational risk
          – Enhanced business knowledge with flexible analytical platform
          – Leverages and complements existing software
  • 20. If this were easy, everyone would already be leveraging big data
      • "Big Data offers big business gains but hidden costs and complexity present barriers that most organizations will struggle with" (The Cost of Big Data, Eric Savitz, Forbes 5/2012)
      • Open source Apache Hadoop for enterprise usage is incomplete
      • Hadoop skills are in short supply
      • Custom built solutions lack integrated cluster management
      • Requires integration effort within the existing analytic ecosystem
      • Most integrated solutions do not help with archival
  • 21. Simplifying Big Data for the Enterprise: The new PureData System for Hadoop
      • Accelerate time to value
      • Accelerate time to insight
      • Simplify big data adoption and consumption
      • Extend the value of the data warehouse
      • Implement enterprise class big data
      • Minimize system setup and administration
      • Available in 2H2013
  • 22. Benefits of IBM PureData System for Hadoop
      • Accelerate Big Data Time to Value (Built-in Expertise)
          – Deploy 8x faster than custom-built solutions (1)
          – Built-in visualization to accelerate insight
          – Built-in analytic accelerators (2), unlike big data appliances on the market
      • Simplify Big Data Adoption & Consumption (Simplified Experience)
          – Single system console for full system administration
          – Rapid maintenance updates with automation
          – No assembly required, data load ready in hours
      • Implement Enterprise Class Big Data (Integration by Design)
          – Only integrated Hadoop system with built-in archiving tools (2)
          – Delivered with more robust security than open source software
          – Architected for high availability
      • (1) Based on IBM internal testing and customer feedback. "Custom built clusters" refer to clusters that are not professionally pre-built, pre-tested and optimized. Individual results may vary.
      • (2) Based on current commercially available Big Data appliance product data sheets from large vendors. US ONLY CLAIM.
  • 23. SQL Access for Hadoop: Why?
      •  Data warehouse augmentation is a leading Hadoop use case
      •  MapReduce is difficult
         –  MapReduce Java API is tedious and requires programming expertise
         –  Unfamiliar languages (e.g. Pig) also require special skills
      •  SQL support would open the data to a much wider audience
         –  Familiar, widely known syntax
         –  Common catalog for identifying data and structure
         –  Declarative – clear separation of the what (the data you’re after) vs. the how (processing)
      [Figure: three data warehouse augmentation patterns: (1) Pre-Processing Hub, (2) Query-able Archive, (3) Exploratory Analysis. Labels: Information Integration; Streams (real-time processing); BigInsights (landing zone for all data; can combine with unstructured information); Data Warehouse]
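
The "what vs. the how" point on this slide can be illustrated with a small sketch. It is not BigInsights or Hadoop code: it assumes Python's built-in `sqlite3` module as a stand-in SQL engine, and the `SALES` records and both function names are hypothetical. The first function spells out map, shuffle and reduce steps by hand; the second states only the desired result in SQL.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (region, amount) sales records.
SALES = [("north", 120.0), ("south", 80.0), ("north", 30.0),
         ("east", 55.0), ("south", 20.0)]


def totals_mapreduce_style(records):
    """The 'how': hand-written map, shuffle and reduce steps."""
    # Map: emit one (key, value) pair per record.
    mapped = [(region, amount) for region, amount in records]
    # Shuffle: group the values by key.
    grouped = defaultdict(list)
    for region, amount in mapped:
        grouped[region].append(amount)
    # Reduce: collapse each group to a single total.
    return {region: sum(amounts) for region, amounts in grouped.items()}


def totals_sql(records):
    """The 'what': describe the result and let the engine plan the work."""
    conn = sqlite3.connect(":memory:")  # stand-in engine, not Big SQL
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", records)
        rows = conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region"
        ).fetchall()
        return dict(rows)
    finally:
        conn.close()


if __name__ == "__main__":
    print(totals_mapreduce_style(SALES))
    print(totals_sql(SALES))
```

Both functions return the same totals per region; the SQL version leaves grouping and aggregation strategy to the engine, which is the separation the slide describes.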
  • 24. SQL for Hadoop: What’s the Problem?
      •  SQL access to data in Hadoop is challenging
         –  Data is in many formats
            •  CSV, JSON, Hive RCFile, HBase, ...
            •  Some formats (HBase composite keys) don’t map cleanly to relational models
         –  No schemas or statistics
         –  Hadoop was not designed to be a query engine
      •  Hive (with HiveQL): limited query access for Hadoop
         –  SQL-like, but NOT SQL
            •  Limited data types – no varchar(n), decimal(p,s), etc.
            •  Limited join support
            •  No subqueries
            •  No windowed aggregates
         –  Very limited JDBC/ODBC driver
         –  Everything executes in MapReduce
            •  Even very small queries requiring little processing
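
To make the listed gaps concrete, the following sketch shows the kind of standard SQL the slide has in mind: a join combined with a scalar subquery. It again assumes Python's built-in `sqlite3` module as a stand-in engine; the `customers` and `orders` tables and their contents are hypothetical, and nothing here is Hive or BigInsights code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in engine for illustration only

# SQLite accepts VARCHAR(n) and DECIMAL(p,s) declarations but does not
# enforce the length or precision; they are shown only to mirror the slide.
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name VARCHAR(20));
    CREATE TABLE orders (customer_id INTEGER, total DECIMAL(8,2));
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 50), (1, 70), (2, 10), (3, 200);
""")

# Join plus scalar subquery: customers whose total spend exceeds the
# average value of a single order.
rows = conn.execute("""
    SELECT c.name, SUM(o.total) AS spent
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
    HAVING SUM(o.total) > (SELECT AVG(total) FROM orders)
    ORDER BY c.name
""").fetchall()
conn.close()

if __name__ == "__main__":
    for name, spent in rows:
        print(name, spent)
```

A query of this shape is routine for a warehouse analyst, which is why the missing features listed above limit who can work with the data.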
  • 25. Big SQL: Native SQL Query Access for Hadoop
      •  Native SQL access to data stored in BigInsights
         –  ANSI SQL 92+
         –  Standard syntax support (joins, data types, …)
      •  Real JDBC/ODBC drivers
         –  Prepared statements
         –  Cancel support
         –  Database metadata API support
         –  Secure socket connections (SSL)
      •  Optimization
         –  Leveraging MapReduce parallelism or…
         –  Direct access for low-latency queries
      •  Varied data sources
         –  HBase (including secondary indexes)
         –  CSV, delimited files, sequence files
         –  JSON
         –  Hive tables
      [Figure: Big SQL architecture. Labels: Application; JDBC / ODBC Driver; JDBC / ODBC Server; Big SQL Engine; SQL; BigInsights; Data Sources (Hive tables, HBase tables, CSV files)]
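
Two of the driver features named on this slide, prepared statements and metadata access, can be sketched in a few lines. The sketch uses Python's DB-API with the built-in `sqlite3` module as an analogy; it does not use the Big SQL JDBC/ODBC drivers, whose class names and connection URLs are not given in the deck. The `events` table and both function names are hypothetical.

```python
import sqlite3


def open_demo_db():
    """Create a small in-memory table standing in for a real data source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (host TEXT, status INTEGER)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [("web1", 200), ("web1", 500), ("web2", 200)],
    )
    return conn


def count_by_status(conn, status):
    """Run a parameterized query and report column metadata with the rows."""
    # The SQL text stays fixed; the value is bound separately, which is
    # the idea behind a prepared statement.
    cur = conn.execute(
        "SELECT host, COUNT(*) AS n FROM events "
        "WHERE status = ? GROUP BY host ORDER BY host",
        (status,),
    )
    # Result-set metadata: column names come from the driver, not the caller.
    columns = [d[0] for d in cur.description]
    return columns, cur.fetchall()


if __name__ == "__main__":
    conn = open_demo_db()
    print(count_by_status(conn, 200))
    print(count_by_status(conn, 500))
    conn.close()
```

Applications and BI tools rely on exactly these two capabilities to reuse statements safely and to discover result structure without hard-coding it.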
  • 26. From Getting Started to Enterprise Deployment: InfoSphere BigInsights Brings Hadoop to the Enterprise
      [Figure: editions plotted by breadth of capabilities vs. enterprise class, built on Apache Hadoop]
      Basic Edition (free download)
      - Web-based mgmt console
      - Jaql
      - Integrated install
      Enterprise Edition (sold by # of terabytes managed)
      - Accelerators
      - Performance Optimization
      - Visualization Capabilities
      - Pre-built applications
      - Text analytics
      - Spreadsheet-style tool
      - RDBMS, warehouse connectivity
      - Administrative tools, security
      - Eclipse development tools
      - Enterprise Integration
      - . . . .
      PureData for Hadoop
      - Appliance simplicity for the enterprise
  • 27. Where to start with BigInsights?
      •  Learn it at BigDataUniversity.com
      •  Try it on Smart Cloud Enterprise: ibm.biz/Bdx8FF
      •  Read about it in “Harness the Power of Big Data” at ibm.biz/Bdx8RP
      •  Learn about Big Data at www.ibmbigdatahub.com
      •  Register for the “Big Data at the speed of business” event on April 30th at ibm.co/bigdataevent
      •  Try BigSQL: bigsql.imdemocloud.com
      •  YouTube Videos - Big Data Channel: youtube.com/user/ibmbigdata
  • 28. Please Note
      IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion. Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.
      Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user’s job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.