Ted Dunning – Very High Bandwidth Time Series Database Implementation - NoSQL matters Barcelona 2014


Ted Dunning – Very High Bandwidth Time Series Database Implementation

This talk describes our work creating time series databases with very high ingest rates (over 100 million points / second) on very small clusters. Starting with OpenTSDB and the off-the-shelf version of MapR-DB, we were able to accelerate ingest by >1000x. I will describe our techniques in detail and talk about the architectural changes required. We are also working to allow access to OpenTSDB data using SQL via Apache Drill. In addition, I will talk about how this work bears on the much-fabled Internet of Things, and tell some stories about the origins of open source big data in the 19th century at sea.



1. Title slide: Very High Bandwidth Time Series Database Implementation (NoSQL matters Barcelona 2014)
2. Agenda
• The Internet is turning upside down
• Distributed nervous system
• The last (mile) shall be first
• Time series on NoSQL
• Faster time series on NoSQL
3. How the Internet Works
• Big content servers feed data across the backbone to
• Regional caches and servers feed data across neighborhood transport to
• The “last mile”
• Bits are nearly conserved, $ are concentrated centrally
– But total $ mass at the edge is much higher
4. How The Internet Works
(diagram: server → caches → gateways, switches, and firewalls → clients c1, c2)
5. Conservation of Bits Decreases Bandwidth
(same network diagram)
6. Total Investment Dominated by Last Mile
(same network diagram)
7. The Rub
• What's the problem?
– Speed (end-to-end latency, backbone bandwidth)
– Feasibility (cost of consumer links)
– Caching
• What do we need?
– Cheap last-mile hardware
– Good caches
8. What has changed? Where will it lead?
9–18. (image-only slides, no text to transcribe)
19. Things
20. Emitting data
21. How The Internet Works
(same network diagram as slide 4)
22. How the Internet is Going to Work
(diagram: server and caches now reach controllers, switches, and gateways serving machines m1–m6 at the edge)
23. Where Will The $ Go?
(same edge-device diagram)
24. Sensors
25. Controllers
26. The Problems
• Sensors and controllers have little processing power or space
– SIM cards: 20 MHz processor, 128 kb (= 16 kB) of space
– Arduino Mini: 15 kB RAM (more EEPROM)
– BeagleBone / Raspberry Pi: ~500 MB RAM
• Sensors and controllers have little power
– Very common to power down 99% of the time
• Sensors and controllers often have very low bandwidth
– Mesh networks with base rates << 1 Mb/s
– Power-line networking
– Intermittent 3G/4G/LTE connectivity
27. What Do We Need to Do With a Time Series?
• Acquire
– Measurement, transmission, reception
– Mostly not our problem
• Store
– We own this
• Retrieve
– We have to allow this
• Analyze and visualize
– We facilitate this via retrieval
28. Retrieval Requirements
• Retrieve by time series, time range, tags
– Possibly pull millions of data points at a time
– Possibly do on-the-fly windowed aggregations (sketched below)
• Search by unstructured data
– Typically requires time-windowed faceting after search
– Also need to dive in with the first kind of retrieval
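A minimal sketch of the kind of on-the-fly windowed aggregation meant here, assuming samples arrive sorted by time (which a time-range scan naturally provides); the Sample class and window parameter are illustrative, not part of OpenTSDB's API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sample type; OpenTSDB's own data point classes differ.
class Sample {
    final long timestampMs;
    final double value;
    Sample(long timestampMs, double value) { this.timestampMs = timestampMs; this.value = value; }
}

public class WindowedMean {
    // Averages time-sorted samples into fixed windows of windowMs milliseconds.
    static List<double[]> windowedMean(List<Sample> samples, long windowMs) {
        List<double[]> out = new ArrayList<>();   // {windowStart, mean} pairs
        long windowStart = Long.MIN_VALUE;
        double sum = 0;
        int n = 0;
        for (Sample s : samples) {
            long w = s.timestampMs - s.timestampMs % windowMs;
            if (w != windowStart && n > 0) {      // crossed a window boundary
                out.add(new double[]{windowStart, sum / n});
                sum = 0;
                n = 0;
            }
            windowStart = w;
            sum += s.value;
            n++;
        }
        if (n > 0) {
            out.add(new double[]{windowStart, sum / n});
        }
        return out;
    }
}
```

One streaming pass like this avoids materializing millions of raw points in the client when only the aggregates are wanted.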
29. Storage Choices and Trade-offs
• Flat files
– Great for rapid ingest of massive data
– Handle essentially any data type
– Less good for data requiring frequent updates
– Harder to find specific ranges
• Traditional relational DB
– Ingests up to ~10,000 rows/sec; prefers well-structured (numerical) data; expensive
• Non-relational DB: tables (such as MapR tables in M7, or HBase)
– Ingests up to ~100,000 rows/sec
– Handles a wide variety of data
– Good for frequent updates
– Easily scanned over a range (see the scan sketch below)
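A sketch of why "easily scanned over a range" matters, using the 2014-era HBase client API; the table name "tsdb" and the (seriesId, baseTime) key layout are assumptions for illustration:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class RangeScan {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "tsdb");  // assumed table name

        // Row keys sort as (seriesId, baseTime), so one day of one series
        // is a single contiguous key range rather than a full-table filter.
        byte[] start = Bytes.add(Bytes.toBytes(42), Bytes.toBytes(1400000000L));
        byte[] stop  = Bytes.add(Bytes.toBytes(42), Bytes.toBytes(1400086400L));

        ResultScanner scanner = table.getScanner(new Scan(start, stop));
        for (Result row : scanner) {
            // each Result carries all columns (points) stored under one row key
        }
        scanner.close();
        table.close();
    }
}
```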
30. Specific Example
• Consider a server farm
• Lots of system metrics
• Typically 100–300 stats / 30 s
• Loads, RPCs, packets, requests/s
• Common to have 100–10,000 machines
31. The General Outline
• 10 samples / second / machine × 1,000 machines = 10,000 samples / second
• This is what OpenTSDB was designed to handle
• Install and go, but don't test at scale
32. Specific Example
• Consider oil drilling rigs
• When drilling wells, there are *lots* of moving parts
• Typically a drilling rig makes about 10K samples/s
• Temperatures, pressures, magnetics, machine vibration levels, salinity, voltage, currents, many others
• A typical project has 100 rigs
33. The General Outline
• 10K samples / second / rig × 100 rigs = 1M samples / second
34. The General Outline
• 10K samples / second / rig × 100 rigs = 1M samples / second
• But wait, there's more
– Suppose you want to test your system
– Perhaps with a year of data
– And you want to load that data in << 1 year
• 100× real time = 100M samples / second (worked through in the snippet below)
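The same arithmetic as a tiny runnable check; the numbers come straight from the slides, nothing else is assumed:

```java
public class ReplayRate {
    public static void main(String[] args) {
        long samplesPerRig = 10_000;           // samples / second / rig
        long rigs = 100;
        long liveRate = samplesPerRig * rigs;  // 1,000,000 samples / second live

        long speedup = 100;                    // replay history at 100x real time
        long replayRate = liveRate * speedup;  // 100,000,000 samples / second
        double replayDays = 365.0 / speedup;   // a year of data in ~3.65 days

        System.out.printf("live %d/s, replay %d/s, done in %.2f days%n",
                liveRate, replayRate, replayDays);
    }
}
```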
35. How Does That Work? (OpenTSDB on MapR)
(diagram: samples → message queue → collector → MapR table → web service → users)
36. Introduction to OpenTSDB
(architecture diagram; runs on HBase or MapR-DB)
37. Wide Table Design: Point-by-Point
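The slide's table image isn't reproduced here, but the point-by-point idea in OpenTSDB's published schema is one row per series per hour with one column per sample. A simplified sketch; real OpenTSDB packs metric and tag UIDs plus value flags into these bytes more compactly:

```java
import java.nio.ByteBuffer;

public class PointLayout {
    // Row key: series id plus the start of the hour the sample falls in,
    // so all of one hour's points land in one wide row.
    static byte[] rowKey(int seriesId, long epochSeconds) {
        long baseTime = epochSeconds - epochSeconds % 3600;
        return ByteBuffer.allocate(12).putInt(seriesId).putLong(baseTime).array();
    }

    // Column qualifier: offset in seconds into the hour (0..3599),
    // so columns within the row sort in time order.
    static byte[] qualifier(long epochSeconds) {
        return ByteBuffer.allocate(2).putShort((short) (epochSeconds % 3600)).array();
    }
}
```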
38. Wide Table Design: Hybrid Point-by-Point + Blob
• Inserting the data as a blob makes the original columns redundant
• This is the way that TSD should work, not quite how it does work
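A sketch of the hybrid step under the same simplified layout: fold one row's (offset, value) points into a single blob column, after which the per-point columns are redundant and can be deleted. The blob encoding here is an assumption, not OpenTSDB's exact format:

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.SortedMap;

public class BlobCompactor {
    // Packs one hour of points into a single value: 2 bytes of offset
    // followed by 8 bytes of sample, repeated, in time order.
    static byte[] toBlob(SortedMap<Short, Double> pointsByOffset) {
        ByteBuffer buf = ByteBuffer.allocate(pointsByOffset.size() * 10);
        for (Map.Entry<Short, Double> e : pointsByOffset.entrySet()) {
            buf.putShort(e.getKey());     // seconds into the hour
            buf.putDouble(e.getValue());  // sample value
        }
        // The caller writes this under one fixed qualifier, then deletes
        // the individual point columns that the blob now replaces.
        return buf.array();
    }
}
```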
39. Speeding up OpenTSDB
• 20,000 data points per second per node in the test cluster
• Why can't it be faster?
40. Status to This Point
• Each sample requires one insertion; compaction requires another
• Typical performance on SE cluster
– 1 edge node + 4 cluster nodes
– 20,000 samples per second observed
– Would be faster on a performance cluster, but possibly not by a lot
• Suitable for server monitoring
• Not suitable for large-scale history ingestion
• Bulk load helps a little, but not much
• Still 1000× too slow for industrial work
41. Small Trick … Buffer Data in Memory
(diagram: samples → message queue → collector with local log → MapR table → web service → users)
• Buffering data for 1 hour in the collector allows a >1000× decrease in insertion rate
• Logging the latest hour of data allows clean restart of the collector (lambda + epsilon architecture)
• The web service queries both the database and the collector (sketched below)
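A sketch of the buffering trick; appendToLog and flushBlob are hypothetical hooks standing in for the restart log and the table write. The point is that 3600 one-second samples collapse into a single insertion:

```java
import java.util.ArrayList;
import java.util.List;

public class BufferingCollector {
    private final List<double[]> buffer = new ArrayList<>();  // {time, value} pairs
    private long currentHour = -1;

    synchronized void add(long epochSeconds, double value) {
        appendToLog(epochSeconds, value);        // log first: enables clean restart
        long hour = epochSeconds / 3600;
        if (hour != currentHour && !buffer.isEmpty()) {
            flushBlob(currentHour, buffer);      // one table insertion per hour
            buffer.clear();
        }
        currentHour = hour;
        buffer.add(new double[]{epochSeconds, value});
    }

    // Placeholder: append to a local log so the latest hour survives a crash.
    private void appendToLog(long epochSeconds, double value) { }

    // Placeholder: write the buffered hour as one blob row (see slide 38).
    private void flushBlob(long hour, List<double[]> points) { }
}
```

Because the latest hour lives only in the collector until it is flushed, queries must consult both the table and the collector's buffer, which is exactly what the slide's diagram shows.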
42. Speeding up OpenTSDB: Open-Source MapR Extensions
• Available on GitHub: https://github.com/mapr-demos/opentsdb
43. Status to This Point
• 3600 samples require one insertion
• Typical results on SE cluster
– 1 edge node + 4 cluster nodes
– 14 million samples per second observed
– ~700× faster ingestion
• Typical results on performance cluster
– 2–4 edge nodes + 4–9 cluster nodes
– 110 million samples/s (4 nodes) to >200 million samples/s (8 nodes)
• Suitable for large-scale history ingestion
• 30 million data points retrieved in 20 s
• Ready for industrial work
44. Key Lessons
• Ingestion is network-limited
– Edge nodes are the critical resource
– The number of edge nodes sets the limit on scaling
• With enough edge nodes, scaling is near perfect
• Performance of raw OpenTSDB is limited by the stateless daemon
• Modified OpenTSDB can run 1000× faster
45. Overall Ingestion Rate
(chart: total ingestion rate, in millions of points / second, vs. cluster size at 4, 5, 8, and 9 nodes)
46. Normalized Ingestion Rate
(chart: ingestion per node, in millions of points / second, vs. cluster size at 4, 5, 8, and 9 nodes)
47. Why MapR?
• MapR tables are inherently faster and safer
– Sustained >1 GB/s ingest rate in tests
• Mirror to an M5 or M7 cluster to isolate analytics load
• Transaction logging involves frequent appends across many files
48. When Is This All Wrong?
• In some cases, retrieval by series-id + time range is not sufficient
• May need very flexible retrieval of events based on text-like criteria
• Search may then be better than a classic time-series database (see the sketch below)
• Lucene-based search can scale to >1 million events / second
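For the search-shaped case, a sketch of pushing events into Lucene; the field names and index path are illustrative, and this uses a recent Lucene API (6+) rather than the 2014-era one:

```java
import java.nio.file.Paths;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;

public class EventIndexer {
    public static void main(String[] args) throws Exception {
        try (IndexWriter writer = new IndexWriter(
                FSDirectory.open(Paths.get("/tmp/event-index")),   // illustrative path
                new IndexWriterConfig(new StandardAnalyzer()))) {
            Document doc = new Document();
            // Indexed timestamp supports time-windowed faceting after search.
            doc.add(new LongPoint("ts", System.currentTimeMillis()));
            // Free-text body supports the text-like retrieval criteria.
            doc.add(new TextField("body",
                    "pressure spike on rig 12 pump 7", Field.Store.YES));
            writer.addDocument(doc);
        }
    }
}
```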
49. Summary
• The Internet is turning upside down
• This will make time series ubiquitous
• Current open source systems are much too slow
• We can fix that with modern NoSQL systems
– (I wear a red hat for a reason)
50. Questions
51. Thank You
Ted Dunning, Chief Application Architect, MapR Technologies
tdunning@mapr.com / tdunning@apache.org
@mapr · maprtech · mapr-technologies
