Building a system for machine and event-oriented data - SF HUG Nov 2015
1. © Rocana, Inc. All Rights Reserved. | 1
JOEY ECHEVERRIA | @fwiffo | November 4th, 2015
San Francisco Hadoop Users Group
Building a System for Machine and
Event-Oriented Data
Joey
• Where I work: Rocana, Director of Engineering
• Where I used to work: Cloudera ('11-'15), NSA
• Distributed systems, security, data processing, "big data"
Free stuff!
• Tweet @rocanainc with #SFHUG: best three tweets get a book
What we do
• Build a system for the operation of modern data centers
• Triage and diagnostics, exploration, trends, advanced analytics of complex systems
• Our data:
  • logs, metrics, human activity, anything that occurs in the data center
• "Enterprise Software" (i.e. we build for others.)
• Today: how we built what we built
Our typical customer use cases
• >100K events / sec (8.6B events / day), sub-second end to end latency, full fidelity retention, critical use cases
• Quality of service - "are credit card transactions happening fast enough?"
• Fraud detection - "detect, investigate, prosecute, and learn from fraud."
• Forensic diagnostics - "what really caused the outage last Friday?"
• Security - "who's doing what, where, when, why, and how, and is that ok?"
• User behavior - "capture and correlate user behavior with system performance, then feed it to downstream systems in realtime."
High level architecture
Guarantees
• No single point of failure exists
• All components scale horizontally[1]
• Data retention and latency are a function of cost, not tech[1]
• Every event is delivered provided no more than N - 1 failures occur (where N is the Kafka replication level)
• All operations, including upgrade, are online[2]
• Every event is (or appears to be) delivered exactly once[3]
[1] We're positive there's a limit, but thus far it has been cost.
[2] From the user's perspective, at a system level.
[3] When queried via our UI. Lots of details here.
Modeling our world
• Everything is an event
• Each event contains a timestamp, type, location, host, service, body, and type-specific attributes (k/v pairs)
• Build specialized aggregates as necessary - just optimized views of the data
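As a rough illustration of the event model above, here is a minimal Python sketch. The class name, field names, and field types (for example, epoch milliseconds for the timestamp) are assumptions made for this example; the slides only list which fields an event carries.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Event:
    """One event, with the fields listed on the slide (names and types assumed)."""
    ts: int                  # observation time; epoch milliseconds is an assumption
    event_type_id: int       # tells downstream systems how to read body/attributes
    location: str
    host: str
    service: str
    body: bytes              # raw payload
    attributes: Dict[str, str] = field(default_factory=dict)  # type-specific k/v pairs


# Hypothetical event; the values are illustrative only.
example = Event(
    ts=1446595200000,
    event_type_id=100,
    location="us-west-2a",
    host="w2a-demo-1",
    service="dhclient",
    body=b"raw bytes go here",
    attributes={"syslog_severity": "6"},
)
```

Keeping the attributes as a plain string-to-string map mirrors the "k/v pairs" wording and leaves interpretation to the event type metadata.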
Event types
• Some event types are standard
  • syslog, http, log4j, generic text record, …
• Users define custom event types
• Producers populate event type
• Transformations can turn one event type into another
• Event type metadata tells downstream systems how to interpret body and attributes
Ex: generic syslog event
event_type_id: 100, // rfc3164, rfc5424 (syslog)
body: … // raw syslog message bytes
attributes: { // extracted fields from body
  syslog_message: "DHCPACK from 10.10.0.1 (xid=0x45b63bdc)",
  syslog_severity: "6", // info severity
  syslog_facility: "3", // daemon facility
  syslog_process: "dhclient",
  syslog_pid: "668",
  …
}
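To show how attributes like these could be extracted from a raw body, here is a small Python sketch for RFC 3164-style lines. The function name, the regular expression, and the sample input line are assumptions for illustration, not the parser used in the system described. The one fixed rule it relies on is that the syslog PRI value equals facility * 8 + severity.

```python
import re
from typing import Dict

# Loose RFC 3164 pattern: <PRI>Mmm dd hh:mm:ss host process[pid]: message
_RFC3164 = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<process>[^\[:\s]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def parse_syslog(body: bytes) -> Dict[str, str]:
    """Return extracted attributes, or an empty dict if the line does not match."""
    match = _RFC3164.match(body.decode("utf-8", errors="replace"))
    if match is None:
        return {}
    facility, severity = divmod(int(match.group("pri")), 8)
    attrs = {
        "syslog_message": match.group("message"),
        "syslog_severity": str(severity),
        "syslog_facility": str(facility),
        "syslog_process": match.group("process"),
    }
    if match.group("pid") is not None:
        attrs["syslog_pid"] = match.group("pid")
    return attrs


# Hypothetical raw line; PRI 30 = facility 3 (daemon) * 8 + severity 6 (info).
raw = b"<30>Nov  4 10:00:00 w2a-demo-1 dhclient[668]: DHCPACK from 10.10.0.1 (xid=0x45b63bdc)"
attrs = parse_syslog(raw)
```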
Consumers
• …do most of the work
• Parallelism
• Kafka offset management
• Message de-duplication
• Transformation (embedded library)
• Dead letter queue support
• Downstream system knowledge
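Three of the consumer responsibilities above (de-duplication, transformation, and dead letter handling) can be sketched in a few lines of Python. This is an in-memory illustration only: it does not use Kafka, and the per-message id, the `consume` function, and the list-based dead letter queue are all assumptions made for the example.

```python
from typing import Callable, Iterable, List, Set, Tuple

Message = Tuple[str, bytes]  # (message id, payload); the id is an assumed field


def consume(
    messages: Iterable[Message],
    transform: Callable[[bytes], int],
    seen: Set[str],
    dead_letters: List[Message],
) -> List[int]:
    """Transform each new message once; park failures on the dead letter list."""
    out: List[int] = []
    for msg_id, payload in messages:
        if msg_id in seen:
            continue  # duplicate delivery: skip it
        try:
            out.append(transform(payload))
        except Exception:
            dead_letters.append((msg_id, payload))  # keep the bad message for later
        seen.add(msg_id)  # mark as handled whether it succeeded or not
    return out


seen_ids: Set[str] = set()
dlq: List[Message] = []
batch = [("a", b"1"), ("b", b"x"), ("a", b"1"), ("c", b"3")]
results = consume(batch, lambda p: int(p.decode()), seen_ids, dlq)
```

In this sketch a failed message is marked as seen so a redelivery is not dead-lettered twice; a real consumer would also have to bound the size of the seen set and decide when offsets are committed.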
Metrics and time series
Aggregation
• Mostly for time series metrics
• Two halves: on write and on query
• Data model: (dimensions) => (aggregates)
• On write
  • reduce(a: A, b: A) => A over window
  • Store "base" aggregates, all associative and commutative
• On query
  • Perform same aggregate or derivative aggregates
  • Group by the same dimensions
  • SQL (Impala)
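The on-write half, reduce(a: A, b: A) => A, can be sketched in Python as a merge of two partial aggregates. The `Agg` class and its field names are assumptions; in particular, carrying a timestamp next to "last" is one possible way to keep that aggregate associative and commutative, and is not something the slides specify.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Agg:
    count: int
    sum: int
    min: int
    max: int
    last_ts: int  # assumed: timestamp of the most recent observation
    last: int     # value observed at last_ts


def of(ts: int, value: int) -> Agg:
    """Aggregate for a single observation."""
    return Agg(1, value, value, value, ts, value)


def merge(a: Agg, b: Agg) -> Agg:
    """Combine two partial aggregates; the result does not depend on order."""
    last_ts, last = max((a.last_ts, a.last), (b.last_ts, b.last))
    return Agg(
        a.count + b.count,
        a.sum + b.sum,
        min(a.min, b.min),
        max(a.max, b.max),
        last_ts,
        last,
    )


points = [(1, 3), (2, 10), (3, 1), (4, 8)]  # hypothetical (ts, value) pairs
forward = reduce(merge, (of(ts, v) for ts, v in points))
backward = reduce(merge, (of(ts, v) for ts, v in reversed(points)))
```

Because `merge` is order-independent, partial aggregates produced by different consumers, or at different times, can be combined again at query time.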
Aside: late arriving data (it's a thing)
• Never trust a (wall) clock
• Producer determines observation time; the rest of the system always uses it
• Data that shows up late is always processed according to observation time
• Aggregation consequences
  • The same time window can appear multiple times
  • Solution: aggregate every N seconds, potentially generating multiple aggregates for the same time bin
• This is real and you must deal with it
  • Do what we did, or
  • Build a system that mutates/replaces aggregates already output, or
  • Delay aggregate output for some slop time; drop any data that shows up later than that
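The "aggregate every N seconds" approach can be sketched as follows. Events are binned by their observation time, and each flush emits whatever partial aggregates have accumulated, so a late event produces a second row for a bin that was already output. The class name, the millisecond units, and the count/sum pair are assumptions for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class BinnedAggregator:
    """Bins values by observation time and emits partial aggregates on flush."""

    def __init__(self, window_ms: int = 60_000) -> None:
        self.window_ms = window_ms
        # bin start -> [count, sum]
        self._open: Dict[int, List[int]] = defaultdict(lambda: [0, 0])

    def add(self, observation_ts: int, value: int) -> None:
        bin_start = observation_ts - observation_ts % self.window_ms
        agg = self._open[bin_start]
        agg[0] += 1
        agg[1] += value

    def flush(self) -> List[Tuple[int, int, int]]:
        """Called every N seconds: emit (bin_start, count, sum) rows and reset."""
        rows = [(b, c, s) for b, (c, s) in sorted(self._open.items())]
        self._open.clear()
        return rows


agg = BinnedAggregator()
agg.add(1_000, 2)
agg.add(59_000, 3)
first = agg.flush()    # one row for the bin starting at 0
agg.add(30_000, 4)     # late event for the bin that was already flushed
agg.add(61_000, 1)
second = agg.flush()   # emits the bin starting at 0 again, plus the next bin
```

Nothing already written is mutated; the duplicate rows for a bin are reconciled later by grouping on the same dimensions at query time.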
Ex: service event volume by host and minute
• Dimensions: ts, window, location, host, service, metric
• On write, aggregates: count, sum, min, max, last
  • epoch, 60000, us-west-2a, w2a-demo-1, sshd, event_volume => 17, 42, 1, 10, 8
• On query:
  • SELECT floor(ts / 60000) as bin, loc, host, service, metric, sum(value_sum) FROM metrics
    WHERE ts BETWEEN x AND y AND metric = 'event_volume'
    GROUP BY bin, loc, host, service, metric
• If late arriving data existed in events, the same dimensions would repeat with another set of aggregates and would be rolled up as a result of the group by
• tl;dr: normal window aggregation operations
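The roll-up of repeated rows can be tried locally with Python's built-in sqlite3 module, swapped in here for Impala purely so the example is runnable. The table layout, the column names `win`, `value_count`, and `value_sum`, and the inserted rows are assumptions; integer division stands in for floor() because the timestamps are non-negative integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metrics (ts INTEGER, win INTEGER, loc TEXT, host TEXT, "
    "service TEXT, metric TEXT, value_count INTEGER, value_sum INTEGER)"
)
dims = ("us-west-2a", "w2a-demo-1", "sshd", "event_volume")
rows = [
    (0, 60000, *dims, 17, 42),     # first aggregate written for minute 0
    (0, 60000, *dims, 1, 5),       # late data: same dimensions, second aggregate
    (60000, 60000, *dims, 3, 9),   # minute 1
]
conn.executemany("INSERT INTO metrics VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

query = """
SELECT ts / 60000 AS bin, loc, host, service, metric, sum(value_sum)
FROM metrics
WHERE ts BETWEEN ? AND ? AND metric = 'event_volume'
GROUP BY ts / 60000, loc, host, service, metric
ORDER BY bin
"""
result = conn.execute(query, (0, 119999)).fetchall()
conn.close()
```

The two rows for minute 0 collapse into a single result row with a summed value of 47, which is the roll-up the slide describes.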
Extension, pain, and advice
Extending the system
• Custom producers
• Custom consumers
• Event types
• Parser / transformation plugins
• Custom metric definition and aggregate functions
• Custom processing jobs on landed data
Pain (aka: the struggle is real)
• Lots of tradeoffs when picking a stream processing solution
  • Apache Samza: right features, but low level programming model, not supported by vendors, missing security features.
  • Apache Storm: too rigid, too slow, not supported by all Hadoop vendors.
  • Apache Spark Streaming: tons of issues initially, but lots of community energy. Improving.
  • @digitallogic: "My heart says Samza, but my head says Spark Streaming."
  • Our (current) needs are meager; do work inside consumers.
• Stack complexity, (relative im)maturity
• Scaling SolrCloud to billions of events per day
If you're going to try this…
• Read all the literature on stream processing[1]
• Treat it like the distributed systems problem it is
• Understand, make, and make good on guarantees
• Find the right abstractions
• Never trust the hand waving or "hello worlds"
• Fully evaluate the projects/products in this space
• Understand it's not just about search
[1] Wait, like all of it? Yeah, like all of it.
Things I didn't talk about
• Reprocessing data when bad code / transformations are detected
• Dealing with data quality issues ("the struggle is real" part 2)
• The user interface and all the fancy analytics
  • data visualization and exploration
  • event search
  • anomalous trend and event detection
  • metric, source, and event correlation
  • motif finding
  • noise reduction and dithering
• Event delivery semantics (e.g. at least once, exactly once, etc.)
• Alerting
Questions?
@fwiffo | batman@rocana.com