Haven 2 0
- 1. © Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
HP HAVEn
Big Data Use Cases
Mikolaj Nietz, Solution Architect
Application Services Global Delivery,
Hewlett-Packard
- 2.
The changing Big Data landscape
Human Information and Machine Data: 90% of information, annual growth ~100%
Business Data: 10% of information, annual growth ~10%
- 3.
Imagine if you could…
Interact with and process 100% of your data seamlessly
Transactional data, Social media, Images, Audio, Video, Mobile, Email, Texts, Documents, In-memory, Hadoop
Open connectors
Machine Data, Business Data, Human Information
Ingest, Analyze, Understand
Standard APIs and tools
Dashboards & alerts, Business intelligence, Your custom apps, Packaged apps
- 4.
Big Data Platform: HAVEn
Social media, IT/OT, Images, Audio, Video, Transactional data, Mobile, Search engine, Email, Texts, Documents
• Hadoop/HDFS: Catalogue massive volumes of distributed data
• Autonomy IDOL: Process and index all information
• Vertica: Analyze at extreme scale in real-time
• Enterprise Security: Collect & unify machine data
• nApps: Powering HP Software + your apps
hp.com/Haven
- 5.
Why HAVEn?
Hadoop
Autonomy IDOL
Vertica
Enterprise Security (HP ArcSight)
n – a number of other apps
"Safe Haven" (in Polish: "Bezpieczna Przystań")
- 6.
HP HAVEn/Big Data Reference Architecture
Rich-media data
Unstructured text data
Mixed-structure data
Unknown-structure data
Semi-structured text data
Structured text data
ODS
EDW
Data marts
Hadoop
HDFS
MapReduce
Data integration
NotOnly SQL
Analytics
Operational mgt.
Access-in-place
Meaning-based analytics (Autonomy IDOL)
Autonomy value-add applications
BI/Visualization tools
Analytic tools
Lightweight ETL
Hadoop Extended Tools
Access-in-place
Indexed metadata
Vertica
Analytics RDBMS
Native analytics
UDx extensions
R-Functions
Access-in-place
Indexed metadata
WWW
- 7.
Apache Hadoop
Open source Linux-based platform for data storage and processing that is scalable, fault tolerant, and distributed
Has flexibility to store and mine any type of data
• Query previously inaccessible structured and unstructured data
• Not bound by a single schema
Excels at processing complex data
• Scale-out architecture divides workloads across multiple nodes
• Flexible file system eliminates ETL (extract, transform, load) bottlenecks
Scales economically
• Deployable on commodity hardware
• Open source platform guards against vendor lock-in
Core Hadoop system components (workloads)
• Hadoop Distributed File System (HDFS): self-healing, high-bandwidth clustered storage
• MapReduce: distributed computing framework
Like Linux, there are several distributions of Hadoop
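The MapReduce component named on this slide can be illustrated with a minimal in-memory sketch of the programming model. This is plain Python, not the Hadoop API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and a real job would run the map and reduce steps on many nodes over HDFS blocks.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit one (word, 1) pair per word in an input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values of one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on a single machine."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data", "Big Data platform"])
```

Because each map call sees only one line and each reduce call sees only one key, both phases can be split across nodes, which is the scale-out property the slide describes.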
- 8.
HP Autonomy IDOL
IDOL: OS for Human Information
Information Types: Social Media, Video, Audio, Email, Texts, Mobile, Transactional Data, Documents, XML, Search Engine, Images
Apps (HP Autonomy IDOL Applications): eDiscovery, Enterprise Search, Media Monitoring, Social Media Analytics, Decision Support, Augmented Reality, Partner/In-house apps, HC Analytics
500 Functions (IDOL Services): Multimedia, Informatics, Enrichment, Capture, Interaction, Analytics, Discovery, Concept Clouds, Active Matching, Visualization
Cloud, Enterprise
Autonomy Connectors
Repositories: ACA, MediaBin, Connected, LiveVault, TRIM, AeD, Data Protector, WorkSite, DigitalSafe, ERP, CRM, Database, Jive, …, Image, HIS, Data Warehouse, Hadoop, SharePoint
- 9.
HP Autonomy IDOL
High-performance human information processing
HP Autonomy IDOL platform: all data types, all content repositories, unmatched understanding
• 400+ connectors: Seamlessly access virtually any enterprise content repository, including file systems, email, or knowledge bases
• Over 500 functions: Leverage the power of functions like sentiment, categorization, and clustering to deliver intelligence and insight
• 1,000+ file types: Process virtually any file type such as text (email, tweet, document), audio, video, and even people profiles & behavior
• Distributable architecture: Achieve big data scalability and high performance with distributable ingest and query architecture
- 10.
HP Vertica
Real Time Analytics Platform
• Standard SQL Interface: Leverages BI, ETL, Hadoop/MapReduce and OLTP investments
• Auto Database Design: Automatic setup, optimization, and DB management
• Native High Availability: Built-in redundancy that also speeds up queries
• MPP (Massively Parallel Processing): Native DB-aware clustering on low-cost x86 Linux nodes
• Advanced Compression: Up to 90% space reduction using 12+ algorithms
• Column Orientation
Key points
• 10x–100x the performance of a classic RDBMS
• High scalability from TBs to PBs
• Simple integration with existing ETL and BI solutions
• Superior performance on off-the-shelf hardware
• Ultimate deployment flexibility
• 24/7 Load and Query
• Flexzone
• Very close Hadoop integration
• Soon-to-come: Vertica-on-YARN
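Why column orientation and compression go together can be shown with a small sketch. The slide does not name the compression algorithms, so run-length encoding is used here only as a generic columnar technique chosen for illustration; the sample rows and the helper names `rle_encode` and `rle_decode` are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple


def rle_encode(column: List[str]) -> List[Tuple[str, int]]:
    """Run-length encode a column: one (value, run length) pair per run."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(column)]


def rle_decode(runs: List[Tuple[str, int]]) -> List[str]:
    """Expand (value, run length) pairs back into the original column."""
    return [value for value, count in runs for _ in range(count)]


# Row-oriented view: each tuple is one record (country, repair count).
rows = [("DE", 7), ("DE", 9), ("DE", 11), ("PL", 10), ("PL", 12)]

# Column-oriented view: all values of one attribute stored together.
country_column = [country for country, _ in rows]

encoded = rle_encode(country_column)  # [("DE", 3), ("PL", 2)]
```

Storing a column contiguously puts similar values next to each other, so a sorted, low-cardinality column collapses into a few runs. In a row layout the same values are interleaved with other attributes and do not compress this way.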
- 11.
Why Hadoop and Vertica are complementary
Both purpose-built scalable platforms
Vertica
• Designed for Performance
• Interactive Analytics
• A Rich SQL Ecosystem
Hadoop
• Designed for Fault Tolerance
• Storage & Batch Processing
• A Rich Programming Model
- 12.
HP Vertica
High-performance data analytics platform purpose-built for big data
HP Vertica Analytics platform: speed, scalability, and openness at lower TCO
• Blazing fast analytics: Gain insight into your data in near-real time by running queries 50x-1,000x faster than legacy products
• Massive scalability: Infinitely scale your solution by adding an unlimited number of industry-standard servers
• Open architecture: Protect and embrace your investment in hardware and software with built-in support for Hadoop, R, and a range of ETL and BI tools
• Optimized data storage: Store 10x-30x more data per server than row databases with patented columnar compression
- 13.
HP ArcSight
High-performance universal log management to consolidate machine data across IT
HP ArcSight universal log management platform: collect, store, and analyze any machine data across IT
• 315+ connectors: Collect, normalize, and categorize machine data such as logs, events, and flows from any device, any time, anywhere, from any vendor
• Search over 1,000,000 events per second: Filtering and parsing unify the machine data and enrich it with metadata, so you can search it with simple text-based keywords without domain expertise
• Store years' worth of data: The unified data is stored at a high compression ratio in any of your existing storage formats, eliminating the need for expensive databases and DBAs
• Analytics & intelligence: Built-in content packs, algorithms, rules, and the unified machine data help you deploy IT security, IT operations, IT GRC, and log analytics
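The "collect, normalize, and categorize" idea on this slide can be sketched generically. This is not the ArcSight connector API or its event format: the two log formats, the keyword rules, and the names `PATTERNS`, `CATEGORY_KEYWORDS`, `categorize`, and `normalize` are all assumptions made up for illustration.

```python
import re
from typing import Dict, Optional

# Two hypothetical input formats; real devices emit many more.
PATTERNS = {
    "kv": re.compile(r"^time=(?P<ts>\S+) device=(?P<host>\S+) event=(?P<msg>.*)$"),
    "syslog_like": re.compile(r"^(?P<ts>\S+) (?P<host>\S+) \w+: (?P<msg>.*)$"),
}

# Hypothetical keyword-to-category rules.
CATEGORY_KEYWORDS = {"login": "authentication", "denied": "firewall"}


def categorize(message: str) -> str:
    """Assign a coarse category based on the first matching keyword."""
    text = message.lower()
    for keyword, category in CATEGORY_KEYWORDS.items():
        if keyword in text:
            return category
    return "other"


def normalize(line: str) -> Optional[Dict[str, str]]:
    """Parse a raw line into one common schema, or return None if unknown."""
    for name, pattern in PATTERNS.items():
        match = pattern.match(line.strip())
        if match:
            return {
                "timestamp": match.group("ts"),
                "host": match.group("host"),
                "message": match.group("msg"),
                "category": categorize(match.group("msg")),
                "source_format": name,
            }
    return None


events = [
    normalize("2013-11-05T10:00:00Z fw01 sshd: login failed for root"),
    normalize("time=2013-11-05T10:00:05Z device=fw02 event=connection denied"),
]
```

Once every source lands in the same schema, a plain keyword search over `message` or a filter on `category` works across vendors, which is the benefit the slide attributes to unified machine data.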
- 14.
The "n"
- 15.
Autonomy + Vertica + Tableau + HP Anywhere on Tablet
- 16.
German Car Manufacturer
Early Warning System
Business problem
Detect unusual increases in the number of
warranty repairs (OT warranty) as soon as they
appear.
Data analysis problem
Detect anomalies (outliers) in time series.
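The deck does not say which detection method the project used. As one common baseline, the sketch below flags a point when it rises more than a chosen number of standard deviations above the mean of the preceding window. The function name `flag_increases`, the window size, the threshold, and the sample counts are all assumptions for illustration.

```python
from statistics import mean, stdev
from typing import List


def flag_increases(series: List[float], window: int = 6,
                   threshold: float = 3.0) -> List[int]:
    """Return indices whose value exceeds the trailing-window mean
    by more than `threshold` sample standard deviations."""
    flagged = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        # A perfectly flat window has sigma == 0 and cannot be scored; skip it.
        if sigma > 0 and (series[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical weekly warranty-repair counts with one sudden spike.
weekly_repairs = [20, 22, 19, 21, 20, 23, 21, 58, 22]
alerts = flag_increases(weekly_repairs)
```

Only upward deviations are flagged because the business problem is about unusual increases. Note that a spike inflates the statistics of the windows that follow it, so a production system would likely prefer a robust estimator such as median and MAD.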
- 17.
German Car Manufacturer
Big Data Labs: Warranty Repairs
HP HAVEn Platform: Landing Zone, Integrated Data, Analytical Record, Analytical Processing, Visualization
Data (External, Internal): Repairs, Claims, Sales, Storage, Parts & Production, Diagnostics, Reference, Weather
- 18.
Global Telecommunication Group
Log Analysis
Vertica Cluster
NFS
Hadoop Cluster
Log System
POC environment
Vertica Hadoop Connector
JDBC
3 Vertica nodes:
• 2 x 2-core Intel Xeon @ 2.7 GHz
• 32 GB RAM
• 9.7 TB storage
Java applications
Analytics & Reporting clients
- 19.
Global Cranes Manufacturer
Sensor Data Analysis
Remote
- 20.
Facebook
Big Data Architecture for Log Analysis
Mobile
PC/Laptop
Web Servers
Logs
Hadoop/HDFS
2 huge Hadoop clusters
• 1.7 exabytes
• 15,000 nodes
• 40,000 nodes
Job Scheduler
Vertica
Logs
15 mins
Hourly
Daily
Legacy
• 600K MR jobs/day
• 50K Informatica jobs/day
- 21.
Develop, Operate, Secure, Monetize, Govern
HAVEn
hp.com/haven
Thank you!
- 22.
Resources:
• www.hp.com/haven
• www.vertica.com
• www.autonomy.com
• www.hortonworks.com
• Vertica to try:
https://my.vertica.com/?redirect_to=https%3A%2F%2Fmy.vertica.com%2Fdownload-community-edition%2F
• About HAVEn-on-demand:
http://www.datacenterknowledge.com/archives/2014/12/03/hp-launches-big-data-cloud-called-haven-ondemand/