Implementing Big Data at the Speed of Business

Copyright © 2013 Splunk Inc.

Big Data at the Speed of Business

Raanan Dagan, Big Data PM, Splunk

Maciej Jagiellowicz, Monitoring and Response Senior Specialist, Allegro

What We’ll Talk About

• What is Splunk?
• Real-Time Monitoring and Alerts at Allegro
• Integration Platform with Splunk Applications
• Archiving Big Data at Allegro
• Q&A

Splunk
• Company (NASDAQ: SPLK)
– Founded 2004, first software release in 2006
– HQ: San Francisco, CA
• 5,200+ Enterprise Customers
• #1 Big Data Innovator*
• #1 Big Data – Pure Play Vendor**
* Fast Company's Most Innovative Companies Issue (March 2013)
** Forbes/Wikibon (Feb 2013)

Allegro
• Online transaction platform
• Was formed in 1999
• E-commerce leader in Central and Eastern Europe, a group of companies managing 129 platforms in over 23 countries
• More than 12.5 million users
• Web site: allegro.pl

Big Data Comes from Machines
Volume | Velocity | Variety | Variability

Machine-generated data is one of the fastest growing, most complex and most valuable segments of big data.

Sources: GPS, RFID, Hypervisor, Web Servers, Email, Messaging, Clickstreams, Mobile, Telephony, IVR, Databases, Sensors, Telematics, Storage, Servers, Security Devices, Desktops

What Does Machine Data Look Like?
Sources:
• Order Processing
• Middleware Error
• Care IVR
• Twitter

Machine Data Contains Critical Insights
Sources and the fields each one carries:
• Order Processing: Customer ID, Order ID, Product ID
• Middleware Error: Order ID, Customer ID
• Care IVR: Time Waiting On Hold, Customer ID
• Twitter: Twitter ID, Customer’s Tweet, Company’s Twitter ID
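
The point of the shared IDs is that events from unrelated sources can be stitched together per customer. A minimal Python sketch of that idea follows. The events are hand-written and hypothetical, the field names only mirror the slide, and it assumes the Twitter ID has already been mapped to a customer ID, which the slide does not show.

```python
from collections import defaultdict

# Hypothetical events; values are made up for illustration only.
events = [
    {"source": "order_processing", "customer_id": "C42", "order_id": "O1001", "product_id": "P7"},
    {"source": "middleware_error", "customer_id": "C42", "order_id": "O1001"},
    {"source": "care_ivr", "customer_id": "C42", "hold_seconds": 540},
    {"source": "twitter", "customer_id": "C42", "twitter_id": "@example_user"},
    {"source": "order_processing", "customer_id": "C77", "order_id": "O1002", "product_id": "P3"},
]

def group_by_customer(events):
    """Group events from all sources under the customer ID they share."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["customer_id"]].append(event)
    return dict(grouped)

def sources_per_customer(events):
    """Return, for each customer, the sorted list of sources that mention them."""
    return {
        customer: sorted({e["source"] for e in evts})
        for customer, evts in group_by_customer(events).items()
    }

timeline = sources_per_customer(events)
```

A customer who shows up in all four sources (an order, a middleware error, a long hold time and a tweet) is the kind of cross-source story the slide is hinting at.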

Splunk: The Platform for Machine Data
Machine data goes into the Splunk Index and comes out as Operational Intelligence:
• Search and Investigation
• Proactive Monitoring
• Statistical Analysis
• Insight and Visualizations for Executives

Serves Needs Across IT and Business
Use cases:
• IT Operations Management
• Application Management
• Security and Compliance
• Web Intelligence
• Business Analytics

Users:
• Customer Support
• LOB Owners/Executives
• Operations Teams
• Website/Business Analysts
• System Administrator
• Application Developers
• IT Executives
• Security Analysts
• Auditors

Splunk for Real-Time Monitoring and Alerts

Why do we like Splunk …

• Meets strategic needs across IT
• Scales from laptop to datacenter to cloud
• For all types of users
• Users want to use it

Where do we use Splunk
• Real-time monitoring
- Web servers
- App servers
- Active Directory
- Security devices
• Post-incident log analysis
- Historical data analysis
• Application debugging
- Real-time log analysis

Splunk Architecture
• Concurrent Users = 250
• Search Heads = 5
• Indexers = 2
• Forwarders = 1500

• Total Data Processed Per Day = 100GB
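
To put these figures in perspective, the following Python sketch works out the average load per component. It assumes data and users are spread evenly and that 1 GB = 1024 MB; neither assumption comes from the slides, and real deployments are rarely this uniform.

```python
# Figures taken from the slide.
DAILY_VOLUME_GB = 100
INDEXERS = 2
FORWARDERS = 1500
SEARCH_HEADS = 5
CONCURRENT_USERS = 250

def per_unit(total, units):
    """Average share of `total` per unit, assuming an even spread."""
    return total / units

gb_per_indexer = per_unit(DAILY_VOLUME_GB, INDEXERS)
mb_per_forwarder = per_unit(DAILY_VOLUME_GB * 1024, FORWARDERS)  # assumes 1 GB = 1024 MB
users_per_search_head = per_unit(CONCURRENT_USERS, SEARCH_HEADS)

print(f"{gb_per_indexer:.0f} GB/day per indexer")
print(f"{mb_per_forwarder:.1f} MB/day per forwarder")
print(f"{users_per_search_head:.0f} concurrent users per search head")
```

On these assumptions each indexer handles about 50 GB/day, each forwarder ships roughly 68 MB/day, and each search head serves about 50 concurrent users.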

Visualizing Real-Time Data in Splunk
Real-time monitoring:
• Transactions with financial institutions and banks
• Monitoring of key referrals to allegro.pl web site
• Monitoring of application JMS queues
• Top areas of application errors
• Business transactions
• Monitoring of SMS and mobile devices communications

Key Functions

• Searching and Reporting (Search Head)

• Indexing and Search Services (Indexer)

• Local and Distributed Management (Deployment Server)

• Data Collection and Forwarding (Forwarder)

A Splunk install can be one or all roles…

Splunk Components and Scalability
• Automatic load balancing linearly scales indexing
• Distributed analysis
• Role-based security

Search Heads
Offload search load to Splunk Search Heads

Indexers
Auto load-balanced forwarding to as many Splunk Indexers as you need to index terabytes/day

Forwarders
Send data from 1000s of servers using combination of Splunk Forwarders, syslog, WMI, message queues, or other remote protocols
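
The idea behind load-balanced forwarding can be illustrated with a small Python sketch. This is a plain round-robin rotation chosen for simplicity, not Splunk's actual forwarding algorithm, and the class name and indexer names are hypothetical.

```python
from collections import Counter
from itertools import cycle

class RoundRobinForwarder:
    """Toy forwarder that spreads events evenly over a fixed set of indexers."""

    def __init__(self, indexers):
        if not indexers:
            raise ValueError("at least one indexer is required")
        self._targets = cycle(indexers)

    def route(self, event):
        """Return (indexer, event) for the next indexer in the rotation."""
        return next(self._targets), event

forwarder = RoundRobinForwarder(["idx-1", "idx-2"])  # hypothetical indexer names
routed = [forwarder.route(f"event {n}") for n in range(6)]
load = Counter(indexer for indexer, _ in routed)
```

Because each indexer receives an equal share, adding an indexer to the list lowers the per-indexer load proportionally, which is the sense in which indexing scales linearly.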

Splunk Real-time Analytics

Data inputs:
• Monitor Input
• TCP/UDP Input
• Scripted Input

Parsing Pipeline:
• Source, event typing
• Character set normalization
• Line breaking
• Timestamp identification

Outputs:
• Real-time Search
• Splunk Index (Raw data, Index Files)
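
The parsing steps named on this slide can be sketched in a few lines of Python. This is an illustration only, not Splunk's implementation: it assumes UTF-8 input and a single timestamp layout at the start of each event, and the sample log lines and file path are made up.

```python
import re
from datetime import datetime

# Assumed timestamp layout for this sketch: "2013-06-26 14:03:07" at line start.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")

def parse(raw_bytes, source, encoding="utf-8"):
    """Turn a raw byte stream into a list of events with source, time and text."""
    text = raw_bytes.decode(encoding, errors="replace")  # character set normalization
    events = []
    for line in text.splitlines():  # line breaking
        match = TIMESTAMP.match(line)
        if match:  # timestamp identification: a timestamp starts a new event
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            events.append({"source": source, "time": when, "raw": line})
        elif events:  # no timestamp: treat as a continuation of the previous event
            events[-1]["raw"] += "\n" + line
        # lines seen before any timestamped line are dropped in this sketch
    return events

raw = (
    b"2013-06-26 14:03:07 ERROR payment failed\n"
    b"  at checkout.pay(checkout.py:42)\n"
    b"2013-06-26 14:03:09 INFO retry scheduled\n"
)
events = parse(raw, source="/var/log/app.log")  # hypothetical source path
```

The three input lines become two events, because the indented stack-trace line carries no timestamp and is merged into the error event before it.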

Splunk Delivers Big Data in Days or Weeks

Product-based Solution
• Easy to download and deploy
• Pre-integrated, end-to-end functionality
• Enterprise-grade features

Real-time Platform
• Collects data from tens of thousands of sources
• Advanced real-time and historical analysis of data
• Fast, custom visualizations for IT and business users

Performance at scale
• Proven at multi-terabyte scale per day
• Upwards of PB under management
• Thousands of enterprise customers

Splunk: A Platform for Big Data Integration

Core capabilities: Ad hoc search, Monitor and alert, Report and analyze, Custom dashboards, Developer Platform

Splunk Dev Platform
• API and SDKs to build Big Data apps

Splunk DB Connect
• Real-time integration to relational DBs

Splunk Hadoop Connect
• Reliable bi-directional integration to Hadoop

Splunk Hadoop Connect

Delivers reliable integration between Splunk and Hadoop

Splunk DB Connect
Reliable, scalable, real-time integration between Splunk and traditional relational databases
• Enrich search results with additional business context
• Easily import data into Splunk for deeper analysis
• Integrate multiple DBs concurrently
• Simple set-up, non-invasive and secure

Components: a Java Bridge Server (Database Lookup, Connection Pooling, Database Query) connects over JDBC to Oracle Database, Microsoft SQL Server and Other Databases.
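
Enriching search results with business context amounts to a lookup keyed on a shared field. The Python sketch below illustrates the idea only. It does not use DB Connect: an in-memory SQLite table stands in for the relational database, and the table, column names and sample rows are hypothetical.

```python
import sqlite3

# An in-memory SQLite table stands in for the relational database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (customer_id TEXT PRIMARY KEY, name TEXT, segment TEXT)")
db.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("C42", "Example Customer A", "premium"), ("C77", "Example Customer B", "standard")],
)

def enrich(event, connection):
    """Return a copy of `event` with customer columns added when a match exists."""
    row = connection.execute(
        "SELECT name, segment FROM customers WHERE customer_id = ?",
        (event["customer_id"],),
    ).fetchone()
    enriched = dict(event)
    if row is not None:
        enriched["customer_name"], enriched["segment"] = row
    return enriched

# Hypothetical search results that carry only a customer ID.
search_results = [
    {"customer_id": "C42", "error": "payment failed"},
    {"customer_id": "C99", "error": "timeout"},
]
enriched_results = [enrich(e, db) for e in search_results]
```

Events with a matching row gain the name and segment columns; events without a match pass through unchanged, so the lookup never drops data.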

Splunk Developer Platform
1. Accelerate Dev & Test
2. Integrate with IT Infrastructure
3. Build Real-time Data Applications

Developer Platform (REST API, SDKs)

Enables enterprise developers to extend the power of Splunk Enterprise with
robust API and Java, JavaScript and Python SDKs
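
As a rough illustration of working against the REST API without an SDK, the Python sketch below builds, but does not send, a request to create a search job. The `/services/search/jobs` path and the `search` and `output_mode` parameters follow Splunk's REST API; the host, credentials, index and field names are placeholders, and the helper name `build_search_request` is hypothetical.

```python
import base64
import urllib.parse
import urllib.request

def build_search_request(base_url, username, password, query):
    """Build (but do not send) a POST request that would create a search job."""
    body = urllib.parse.urlencode({"search": f"search {query}", "output_mode": "json"})
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/services/search/jobs",
        data=body.encode(),
        headers={"Authorization": f"Basic {token}"},
        method="POST",
    )

request = build_search_request(
    "https://splunk.example.com:8089",  # hypothetical host; 8089 is the usual management port
    "admin", "changeme",                # placeholder credentials
    "index=web status=500",             # hypothetical index and field names
)
```

Passing the request to `urllib.request.urlopen` against a reachable Splunk instance would submit the job; the SDKs mentioned on the slide wrap this kind of call.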

Splunk Hadoop Monitoring
Splunk HadoopOps Forwarder Package on every host
Splunk HadoopOps: Dashboards, alerts and notifications, powered by Splunk search

Capabilities: Add Knowledge, Collect & Index Data, Distributed Search, Monitor & Alert, Rich UI Framework

Monitored layers: Host, Operating System, Infrastructure

Hadoop Components

• Hive
• Flume
• Mahout
• MapReduce

Why and Where do we Use Hadoop

• Big Data archive
• Web services statistics
• Mail flow statistics

Where we do not use Hadoop

• Not for Visualization
• Not for Analytics
• Not for Real-time
• Not for Access Control

Where we are today and where do we want to be tomorrow

Splunk 5,200+ Licensed Customers

Industries: Cloud and Online Services, Education, Energy and Utilities, Financial Services and Insurance, Government, Healthcare, Manufacturing, Media, Retail, Technology, Telecommunications, Travel and Leisure

Splunk Big Data Platform

• Product-based solution
• Real-time Platform
• Performance at scale

Visit Splunk Booth
