2. What We’ll Talk About
• Our quest for visibility
• Analyzing at scale
• Splunk and Big Data
• Where do you start?
• Q&A
3. About Splunk
Company (NASDAQ: SPLK)
Founded 2004, first software release in 2006
HQ: San Francisco
Business Model / Products
Industry-leading machine data platform
On-premise, in the cloud and SaaS
5,600+ Customers
63 of the Fortune 100
Largest license: 100 Terabytes per day
#1 Big Data Innovator*
* Fast Company's Most Innovative Companies Issue (March 2013)
4. About ShareThis and Socialize
ShareThis makes the world more
connected, trusted and valuable through sharing
Powers the social web, touching the lives
of 95 percent of U.S.
Acquires Socialize, which makes mobile
and social more engaging
Socialize integrated into thousands of
iOS and Android Apps
Installed on 80M+ devices
15. Expanding Universe of Data Sources
Machine-generated Data, Business Application Data, Human-generated Data
Highly Structured Arbitrarily Structured
2012-12-05 07:04:44
Id=00Q000000Rd910EAJ City=New York
Country=US CreatedDate=“2012-12-05 07:06:44”
Email=jdoe@gmail.com Email_Opt_In_c
Customer_Street_Address_c=“123 Main St.”
purchased_product_id=
product_i BD-01 twitter_username
john_t_doe
16. Industry Leading Platform for Machine Data
Any Machine Data Operational Intelligence
HA Indexes and Storage
Commodity Servers
Developer Platform
Custom dashboards
Monitor and alert
Ad hoc search
Report and analyze
17. Analyzing Heterogeneous Data
Universal Index, Schema-on-the-fly, Flexibility and Fast Time to Value
• No data normalization
• Automatically handles timestamps
• Parsers not required
• Index every term & pattern “blindly”
• No attempt to “understand” up front
• Structure applied at search-time
• No brittle schema to work around
• Automatically find transactions, patterns and trends
• Normalization as it’s needed
• Faster implementation
• Easy search language
• Multiple views into the same data
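To make "structure applied at search-time" concrete, here is a minimal Python sketch. It is not Splunk's implementation; the event text is a hypothetical record loosely modeled on the sample in the data sources slide, and `extract_fields` and `search` are illustrative names. The raw text is stored untouched, and key=value fields are only pulled out when a query runs.

```python
import re

# Hypothetical raw events, stored exactly as received (no up-front schema).
RAW_EVENTS = [
    '2012-12-05 07:04:44 Id=00Q000000Rd910EAJ Country=US '
    'Customer_Street_Address_c="123 Main St."',
    '2012-12-05 07:05:10 Id=00Q000000Rd910EAK Country=CA',
]

# key=value pairs where the value is either double-quoted or a bare token.
KV_PATTERN = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def extract_fields(raw: str) -> dict[str, str]:
    """Derive fields from a raw event at read time, not at ingest time."""
    fields = {}
    for key, quoted, bare in KV_PATTERN.findall(raw):
        # findall returns '' for the alternative that did not match.
        fields[key] = quoted if quoted else bare
    return fields


def search(events: list[str], field: str, value: str) -> list[str]:
    """Return the raw events whose extracted field equals the given value."""
    return [raw for raw in events if extract_fields(raw).get(field) == value]


us_events = search(RAW_EVENTS, "Country", "US")
```

Because nothing is normalized at ingest, a new field that shows up in tomorrow's events needs no schema change; the same extraction picks it up at query time.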
18. Gain Critical Insights … in Real-time
Sources: Twitter, Care IVR, Middleware Error, Order Processing
Fields: Order ID, Customer ID, Twitter ID, Product ID, Time Waiting On Hold, Customer’s Tweet, Company’s Name
19. Deep Visibility and Insight for IT and Business
IT Operations Management
Application Management
Security and Compliance
Web Intelligence
Business Analytics
Industrial Data / Internet of Things
Over 5,600 organizations using Splunk across IT and business users
21. Hadoop
The ShareThis Insights Platform
On Father’s day:
“What were the most shared topics?”
“What types of beers do people drink?”
API, ETL, Pre-aggregation, Analytics
?
22. Finding the Optimal Approach
Hadoop and MapReduce are great for complex data science on data
at rest – the previous architecture took 9 months with a team of
engineers, data architects, etc.
The Splunk platform delivers real-time, interactive analysis –
we can build many of the same insights within 1 hour
What should be the core focus or competency of your team?
Conclusion: find the optimal approach for the business
24. PR Insights Example
What was the situation? (e.g. fast moving business, needed
real-time insights)
What was the PR team struggling with? Difficult to find useful
data to build interesting use-cases
What did they want? They wanted a flexible real-time reporting
environment to extract insights useful for the market
How did my team help? Delivered a single dashboard with
real-time data on sharing behaviors across our network
27. Operational Analytics for an Online World
Website
API
Notification
Feedback Processor
Google (GCM)
Apple (APNS)
Notifications Systems
Driving Superior Customer Experience
How many 500 errors have I had over time?
Look for anomalies and spikes!
Zone in directly to the customer!!
Online Device Notifications
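The "500 errors over time" question can be sketched in a few lines of Python. This is an illustration only, not the search used on the slide: the event tuples, the per-minute bucketing and the z-score threshold are all assumptions, and minutes with no errors at all are simply absent from the counts.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical parsed log events as (minute, HTTP status) pairs.
SAMPLE_ERRORS = {"10:00": 1, "10:01": 1, "10:02": 2,
                 "10:03": 1, "10:04": 1, "10:05": 12}
EVENTS = [(minute, 500) for minute, n in SAMPLE_ERRORS.items() for _ in range(n)]
EVENTS += [("10:00", 200), ("10:03", 200)]  # successful requests are ignored


def errors_per_minute(events, status=500):
    """Count events with the given status code in each one-minute bucket."""
    counts = Counter(minute for minute, code in events if code == status)
    return dict(sorted(counts.items()))


def find_spikes(counts, threshold=2.0):
    """Return buckets whose count is more than `threshold` standard
    deviations above the mean (a simple z-score test, assumed here)."""
    values = list(counts.values())
    if len(values) < 2:
        return []
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [bucket for bucket, c in counts.items() if (c - mu) / sigma > threshold]


counts = errors_per_minute(EVENTS)
spikes = find_spikes(counts)
```

Once a spike minute is known, filtering the same events by that minute and by a customer or device identifier is the "zone in" step.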
30. Derive Actionable Insights from Raw Data
1. Point Splunk at Hadoop Cluster
2. Immediately start exploring, analyzing and visualizing raw data in Hadoop
Hadoop Storage
Explore, Analyze, Visualize, Dashboards, Share
Splunk $186 million. Turns machine data into valuable insights. Splunk now has more than 600 employees worldwide, with headquarters in San Francisco and 14 offices around the world. Since first shipping its software in 2006, Splunk has grown to over 4,400 customers in 80+ countries. These organizations are using Splunk software to improve service levels, reduce operations costs, mitigate security risks, enable compliance, enhance DevOps collaboration and create new product and service offerings. Please always refer to the latest company data found here: http://www.splunk.com/company.
Talk specifically about how Splunk supports:
• Volume: scalable real-time architecture.
• Velocity: horizontal scalability.
• Variety: universal forwarding and indexing for highly diverse data from thousands of heterogeneous sources.
• Variability: late-binding schema for maximum search-time analysis.
When we look more closely at the data, we see that it contains critical information: customer ID, order ID, time waiting on hold, Twitter ID … and what was tweeted. What’s important is, first of all, the ability to see across all these disparate data sources, and then to correlate related events across them. If you can correlate and visualize related events across these disparate sources, you can build a picture of activity, behavior and experience. And what if you can do all of this in real time? You can respond more quickly to events that matter. You can extrapolate this example to a wide range of use cases: security and fraud, transaction monitoring and analysis, web analytics, IT operations and so on.
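As a rough illustration of correlating related events across sources, the Python sketch below follows shared identifiers from one order outward. Everything in it is assumed for the example: the field names, the ID values, and in particular the idea that the Twitter event already carries a customer ID (in practice that link would probably come from a lookup).

```python
# Hypothetical events from the four sources; all IDs and values are made up.
EVENTS = [
    {"source": "order_processing", "order_id": "O-1", "customer_id": "C-42",
     "product_id": "P-7"},
    {"source": "middleware", "order_id": "O-1", "error": "timeout"},
    {"source": "care_ivr", "customer_id": "C-42", "seconds_on_hold": 780},
    {"source": "twitter", "twitter_id": "@jdoe", "customer_id": "C-42",
     "tweet": "still on hold"},
    {"source": "order_processing", "order_id": "O-2", "customer_id": "C-99",
     "product_id": "P-3"},
]

ID_FIELDS = ("order_id", "customer_id", "twitter_id")


def correlate(events, field, value):
    """Collect every event reachable from (field, value) via shared ID values."""
    known = {(field, value)}
    matched, remaining = [], list(events)
    changed = True
    while changed:  # repeat until no new event links to the known IDs
        changed = False
        unmatched = []
        for event in remaining:
            ids = {(f, event[f]) for f in ID_FIELDS if f in event}
            if ids & known:
                known |= ids  # this event's other IDs become searchable too
                matched.append(event)
                changed = True
            else:
                unmatched.append(event)
        remaining = unmatched
    return matched


related = correlate(EVENTS, "order_id", "O-1")
related_sources = sorted(event["source"] for event in related)
```

Starting from a single order ID, the result ties together the failed order, the time on hold and the tweet, while the unrelated order for another customer is left out.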
Splunk turns raw machine data into new visibility, insights and analytics for IT and business professionals. Intelligence from operational data can help organizations meaningfully improve performance in a wide range of areas, e.g. meet service levels, reduce costs, mitigate security risks, maintain compliance and gain insights. It also provides analysis of the real-time activity and behavior of products, users, services and servers.
Example users of Splunk today include:
• Customer support
• Operations teams
• Sysadmins
• App developers
• Security analysts
• Auditors
• IT execs
• Web/biz analysts
• LOB owners / execs
API -> Notification Server -> either Apple or Google. At some later time, they respond with whether there were any real problems. With Splunk I can look at each individual piece and at the system as a whole, and see how a message traversed the system. Without Splunk, I would not know how to do it.
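A small Python sketch of that kind of end-to-end trace follows. It is a simplified stand-in, not the actual Splunk search: the log tuples, the `message_id` field, the component names and the outcome labels are all hypothetical.

```python
from collections import defaultdict

# Hypothetical log events: (timestamp_seconds, message_id, component, outcome).
LOG = [
    (3, "m1", "apns", "sent"),
    (1, "m1", "api", "accepted"),
    (2, "m1", "notification_server", "queued"),
    (1, "m2", "api", "accepted"),
    (2, "m2", "notification_server", "queued"),
    (3, "m2", "gcm", "sent"),
    (9, "m2", "feedback_processor", "invalid_token"),
]

OK_OUTCOMES = frozenset({"accepted", "queued", "sent"})


def trace_messages(log):
    """Group events by message ID, with each message's hops in time order."""
    traces = defaultdict(list)
    for _, message_id, component, outcome in sorted(log):
        traces[message_id].append((component, outcome))
    return dict(traces)


def failed_messages(traces):
    """Return IDs of messages with at least one hop outside OK_OUTCOMES."""
    return [message_id for message_id, hops in traces.items()
            if any(outcome not in OK_OUTCOMES for _, outcome in hops)]


traces = trace_messages(LOG)
failures = failed_messages(traces)
```

Here the late feedback event for `m2` is attached to the same trace as its original send, which is the piece that is hard to see when each component's logs are read separately.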
This is why we are announcing a new product from Splunk. It’s in Beta, it’s called “Hunk” and it’s SPLUNK ANALYTICS FOR HADOOP. This is a NEW PRODUCT from Splunk that delivers INTERACTIVE DATA EXPLORATION, ANALYSIS and VISUALIZATIONS FOR HADOOP.
Because it’s based on proven Splunk technology, deployed at thousands of organizations, we’ve naturally made it easy to deploy. Simply point it at your Hadoop cluster and start interacting with and analyzing data immediately.