The document discusses machine data and how it can be used. It defines machine data as logs and usage data produced by devices and systems. It provides examples of sources such as sensors, applications, websites, and smart home devices. The document also outlines how machine data can help governments and organizations improve services, security, business processes, and compliance.
We are nearing the dawn of a very interesting age. From robotics to smart homes to web-connected light bulbs, HVAC units, servers and routers, machines are in use everywhere. These machines have a lot to say, but what happens when you start listening? What things come to light and what new discoveries can you make? What questions can you now ask of your world? This session will explore machine-to-machine analytics as government organizations deploy more applications for their citizens and contend with an exploding Internet of Things.
Data is growing and embodies new characteristics not found in traditional structured data: Volume, Velocity, Variety, Variability.
Machine data is one of the fastest-growing, most complex and most valuable segments of big data. "Big data" is a term applied to these expanding data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process the data within a tolerable elapsed time.
All the web servers, applications and network devices, all of the technology infrastructure running an enterprise or organization, generate massive streams of data in an array of unpredictable formats that are difficult to process and analyze by traditional methods or in a timely manner.
Why is this “machine data” valuable? Because it contains a trace - a categorical record - of user behavior, cyber-security risks, application behavior, service levels, fraudulent activity and customer experience.
Physically, the data may look like streams of cryptic information, but it contains valuable information. Weather sensors may stream temperature, barometric pressure, and humidity information. Door swipes log the entry of employees to a facility. Wireless routers track user online access, and E-ZPass readers monitor traffic flowing through [CITE CURRENT/BIGGEST CITY] midtown Manhattan. All of this data is continually produced by thousands of devices, and it can contain a blueprint of what is going on in the world around us.
If you can correlate and visualize related events across these disparate sources, you can build a picture of activity, behavior and experience. And what if you can do all of this in real-time? You can respond more quickly to events that matter.
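As a minimal sketch of what "correlating across disparate sources" can mean, assume two feeds (door swipes and Wi-Fi logins) that happen to share a user field. The event layout, field names, sample values and the ten-minute window are all illustrative assumptions, not taken from any real system.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical events from two sources; field names are illustrative only.
door_swipes = [
    {"source": "door", "user": "alice", "time": datetime(2016, 1, 4, 8, 58)},
    {"source": "door", "user": "bob", "time": datetime(2016, 1, 4, 9, 15)},
]
wifi_logins = [
    {"source": "wifi", "user": "alice", "time": datetime(2016, 1, 4, 9, 1)},
    {"source": "wifi", "user": "carol", "time": datetime(2016, 1, 4, 9, 2)},
]


def correlate(sources, window):
    """Return users seen in more than one source, with all events inside `window`."""
    by_user = defaultdict(list)
    for events in sources:
        for event in events:
            by_user[event["user"]].append(event)

    correlated = {}
    for user, events in by_user.items():
        events.sort(key=lambda e: e["time"])
        span = events[-1]["time"] - events[0]["time"]
        distinct_sources = {e["source"] for e in events}
        if len(distinct_sources) > 1 and span <= window:
            correlated[user] = events
    return correlated


matches = correlate([door_swipes, wifi_logins], timedelta(minutes=10))
for user, events in matches.items():
    print(user, [e["source"] for e in events])
```

Only users who appear in both feeds within the window come back, which is the "picture of activity" idea in miniature. A real deployment would correlate on whatever shared key the sources actually have.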
For example, if an organization captured the customer's Twitter ID in their customer profile, this correlation would be possible. Where that didn't exist, they could at least group the tweets by demographic.
You can extrapolate this example to a wide range of use cases – security and fraud, transaction monitoring and analysis, web analytics, IT operations and so on.
So what do most organizations do with all of this machine data?
IT security, of course!
But today we’ll be exploring a bunch of other use cases, all of which could be implemented today by one department or another in this room.
-----------------
OLD CONTENT – JUST FOR BACKGROUND ON BORROWED SLIDE
For years, state & local government has been aggregating and storing the massive amounts of data they collect, all resting on the promise that someday the information will be valuable. Now is the time for agencies to capitalize on their data.
The technology now exists to quickly process and analyze data. Organizations can finally derive value from their data lakes as well as from their machine data, or data that is created without the intervention of humans, from transactions, APIs, call centers, sensors, and more.
Currently, machine data is coming from every one of New York’s departments in one form or another.
<animation>
Add to that machine data from the general public, retail and manufacturing segments, and you have a treasure trove of information.
But all of this data may as well be passing river water if we don't collect it. So where can we get it from, and where are we getting it from now?
Let's look at six examples of machine data being collected by various departments in major cities.
Many of you in this room may be directly tied to some of these examples, so please see me afterward with any corrections!
Already, there are many city services collecting machine data. A few notable ones include these:
This 3D tracking radar and camera form a red-light safety system manufactured by Arizona’s “American Traffic Solutions” (ATS). The radar portion of the system tracks the location of passing vehicles and, if a car runs a red light, the camera is activated. That image is then processed to recognize the license plate number, and a ticket is mailed to the registered address. A ticket issued by this system is treated differently from one issued by a police officer, with a lower fine and no points on the driver’s license, as it is difficult to prove who was driving the vehicle at the time. Due to NY state legislation, the number of red-light systems is heavily regulated.
Source: http://www.invisibleboxes.info/3d-tracking-radar/
Already, there are many city services collecting machine data. A few notable ones include:
Noise & Vibration Monitor
Data is automatically sent to a compliance website
Provides e-mail alerts, reports, and warnings of excessive noise or vibration
The monitoring system includes four (4) noise and five (5) vibration monitors placed along the perimeter of the WTC site in Lower Manhattan. ButterJAM established a web interface which queries noise and vibration monitoring databases within the off-site server. The data is automatically sent to an innovative compliance website which displays appropriate data and allows access by authorized users. The system provides e-mail alerts, warning of any noise or vibration limit exceedance or equipment malfunction. In addition, the system automatically generates reports and directly e-mails reports to authorized personnel.
Source: http://www.invisibleboxes.info/noise-monitoring-terminal/
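The alerting behavior described for the noise and vibration monitors can be sketched roughly as below. The limit values, monitor IDs and units are made-up placeholders (the actual site limits are not stated in these notes), and a missing reading stands in for an equipment malfunction.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical limits; the real site limits are not given in the source notes.
LIMITS = {"noise_db": 85.0, "vibration_mm_s": 12.7}


@dataclass
class Reading:
    monitor_id: str
    kind: str                # "noise_db" or "vibration_mm_s"
    value: Optional[float]   # None models a monitor that failed to report


def check(reading):
    """Return an alert message, or None when the reading is within limits."""
    if reading.value is None:
        return f"{reading.monitor_id}: equipment malfunction (no reading)"
    limit = LIMITS[reading.kind]
    if reading.value > limit:
        return f"{reading.monitor_id}: {reading.kind} {reading.value} exceeds limit {limit}"
    return None


readings = [
    Reading("N1", "noise_db", 78.2),
    Reading("N2", "noise_db", 91.5),
    Reading("V3", "vibration_mm_s", None),
]

# In a real system these messages would be e-mailed; here they are just collected.
alerts = [msg for msg in map(check, readings) if msg is not None]
for msg in alerts:
    print(msg)
```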
Already, there are many city services collecting machine data. A few notable ones include these:
[ORIGINAL TEXT: Subway turnstiles, police cameras and license plate readers, traffic signal control systems, and the soon-to-be-released LinkNYC pay-phone replacement.]
[FOR TX, ADD/REPLACE WITH:
License plate scanners/toll booths/tag readers
HVAC
Traffic light]
Security Monitoring
Intelligent decision making
Alerting
Reporting
OLD CONTENT:
Web analytics insights
Improving citizen self-service
Monitoring busiest usage time for resource allocation
Analyzing form abandonment rate
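To make the last two bullets concrete, here is a rough sketch of computing a form abandonment rate and the busiest hour from web events. The event tuples, the action names `form_start` and `form_submit`, and the sample data are assumptions for illustration only.

```python
from collections import Counter
from datetime import datetime

# Hypothetical web events: (session_id, action, timestamp).
events = [
    ("s1", "form_start", datetime(2016, 3, 1, 9, 5)),
    ("s1", "form_submit", datetime(2016, 3, 1, 9, 9)),
    ("s2", "form_start", datetime(2016, 3, 1, 9, 30)),
    ("s3", "form_start", datetime(2016, 3, 1, 14, 2)),
    ("s3", "form_submit", datetime(2016, 3, 1, 14, 10)),
    ("s4", "form_start", datetime(2016, 3, 1, 9, 45)),
]

# Sessions that started the form vs. sessions that finished it.
started = {sid for sid, action, _ in events if action == "form_start"}
submitted = {sid for sid, action, _ in events if action == "form_submit"}

# Abandonment rate: share of started sessions that never submitted.
abandonment_rate = len(started - submitted) / len(started)

# Busiest hour of day, counted over all events.
hits_per_hour = Counter(ts.hour for _, _, ts in events)
busiest_hour, busiest_count = hits_per_hour.most_common(1)[0]

print(f"abandonment rate: {abandonment_rate:.0%}")
print(f"busiest hour: {busiest_hour}:00 with {busiest_count} events")
```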
There are 4 key areas where Splunk helps accelerate business analytics:
Digital Marketing – Real-time insights into marketing campaigns, user engagement and shopping cart conversion across multiple channels. Digital marketers and web/digital analysts looking to complement free tools and move beyond a single source of data (clickstream) benefit from using Splunk.
Customer Experience Analytics – Measurement and analysis of customer behavior, and identification of opportunities to increase customer engagement and conversion. Web/digital analysts or WebOps teams responsible for providing a better user experience on the site need to go deeper into the data and combine and correlate data across various sources.
Product Analytics – Analysis of product feature adoption, usage and effectiveness, resulting in better conversion or user engagement. Product managers and product analysts who monitor and optimize the website or mobile app benefit from Splunk, as they see usage and adoption of features in real time and can pinpoint areas of opportunity for improvement.
Business Process Analytics – Business process analytics provides end-to-end, real-time insights across the complete business process. Taking data from middleware and from various applications or touchpoints within websites or services helps business owners, customer service organizations and business analysts monitor and optimize business processes.
CHOOSE MAJOR EVENT IN YOUR CITY
MA: PATRIOTS GAME?
TX: SXSW/ACL OR RODEO
CHICAGO: OBAMA COMING HOME?
If a street is closed:
How will traffic be affected?
How will parking revenue be impacted?
How should buses be rerouted?
How do we measure the effectiveness of our decisions in real time?
SXSW/ACL – city of austin/travis county – picture
Generic rodeo picture – houston – police officer upset w/ people and lobbed stun grenade at rodeo - caught
I’m looking for better pictures
Transactions are determined by UIDs within several systems (transaction_id, correlation_id)
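One way to picture stitching a transaction together from those UIDs is sketched below. How `transaction_id` and `correlation_id` actually relate across the systems is not stated here, so this assumes at least one system logs both and can act as a bridge; the system names, steps and sample events are hypothetical.

```python
from collections import defaultdict

# Hypothetical log events from three systems. Assumption: the gateway logs
# both IDs, so it can bridge systems that log only one of them.
events = [
    {"system": "web", "transaction_id": "T1", "step": "submit"},
    {"system": "gateway", "transaction_id": "T1", "correlation_id": "C9", "step": "route"},
    {"system": "billing", "correlation_id": "C9", "step": "charge"},
    {"system": "web", "transaction_id": "T2", "step": "submit"},
]

# First pass: learn correlation_id -> transaction_id from events carrying both.
bridge = {
    e["correlation_id"]: e["transaction_id"]
    for e in events
    if "correlation_id" in e and "transaction_id" in e
}

# Second pass: assign every event to a transaction, using the bridge if needed.
transactions = defaultdict(list)
for e in events:
    tid = e.get("transaction_id") or bridge.get(e.get("correlation_id"))
    if tid is not None:
        transactions[tid].append(e["step"])

for tid, steps in transactions.items():
    print(tid, steps)
```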
At first, I had no requirements. No one knew the questions they would have to answer. Then I had Santa’s wishlist, with questions I could not hope to answer like, “How many people live in this customer’s residence?” So I took my best stab at what I thought the users wanted based on the available data and went from there. I call this ‘Requirement Inference’.