SlideShare ist ein Scribd-Unternehmen logo
1 von 51
Downloaden Sie, um offline zu lesen
© 2014 MapR Techno©lo 2g0ie1s4 MapR Technologies 1
© 2014 MapR Technologies 2 
Agenda 
• The Internet is turning upside down 
• Distributed nervous system 
• The last (mile) shall be first 
• Time series on NO-SQL 
• Faster time series on NO-SQL
© 2014 MapR Technologies 3 
How the Internet Works 
• Big content servers feed data across the backbone to 
• Regional caches and servers feed data across neighborhood 
transport to 
• The “last mile” 
• Bits are nearly conserved, $ are concentrated centrally 
– But total $ mass at the edge is much higher
© 2014 MapR Technologies 4 
How The Internet Works 
Server 
Cache 
Cache 
Gateway 
Switch 
Firewall 
c1 
c2 
Gateway 
Switch Firewall 
c1 
c2 
Switch 
Firewall c1 
c2
© 2014 MapR Technologies 5 
Conservation of Bits Decreases Bandwidth 
Server 
Cache 
Cache 
Gateway 
Switch 
Firewall 
c1 
c2 
Gateway 
Switch Firewall 
c1 
c2 
Switch 
Firewall c1 
c2
© 2014 MapR Technologies 6 
Total Investment Dominated by Last Mile 
Server 
Cache 
Cache 
Gateway 
Switch 
Firewall 
c1 
c2 
Gateway 
Switch Firewall 
c1 
c2 
Switch 
Firewall c1 
c2
© 2014 MapR Technologies 7 
The Rub 
• What's the problem? 
– Speed (end-to-end latency, backbone bw) 
– Feasibility (cost for consumer links) 
– Caching 
• What do we need? 
– Cheap last-mile hardware 
– Good caches
© 2014 MapR Technologies 8 
What has changed? 
Where will it lead?
© 2014 MapR Technologies 9
© 2014 MapR Technologies 10
© 2014 MapR Technologies 11
© 2014 MapR Technologies 12
© 2014 MapR Technologies 13
© 2014 MapR Technologies 14
© 2014 MapR Technologies 15
© 2014 MapR Technologies 16
© 2014 MapR Technologies 17
© 2014 MapR Technologies 18
© 2014 MapR Technologies 19 
Things
© 2014 MapR Technologies 20 
Emitting data
© 2014 MapR Technologies 21 
How The Internet Works 
Server 
Cache 
Cache 
Gateway 
Switch 
Firewall 
c1 
c2 
Gateway 
Switch Firewall 
c1 
c2 
Switch 
Firewall c1 
c2
© 2014 MapR Technologies 22 
How the Internet is Going to Work 
Server 
Cache 
Cache 
Controller Switch Gateway 
m4 
m3 
Gateway 
Switch 
Controller 
m6 
m5 
Switch 
m2 Controller 
m1
© 2014 MapR Technologies 23 
Where Will The $ Go? 
Server 
Cache 
Cache 
Controller Switch Gateway 
m4 
m3 
Gateway 
Switch 
Controller 
m6 
m5 
Switch 
m2 Controller 
m1
© 2014 MapR Technologies 24 
Sensors
© 2014 MapR Technologies 25 
Controllers
© 2014 MapR Technologies 26 
The Problems 
• Sensors and controllers have little processing or space 
– SIM cards = 20Mhz processor, 128kb space = 16kB 
– Arduino mini = 15kB RAM (more EPROM) 
– BeagleBone/Raspberry Pi = 500 kB RAM 
• Sensors and controllers have little power 
– Very common to power down 99% of the time 
• Sensors and controls often have very low bandwidth 
– Mesh networks with base rates << 1Mb/s 
– Power line networking 
– Intermittent 3G/4G/LTE connectivity
© 2014 MapR Technologies 27 
What Do We Need to Do With a Time Series 
• Acquire 
– Measurement, transmission, reception 
– Mostly not our problem 
• Store 
– We own this 
• Retrieve 
– We have to allow this 
• Analyze and visualize 
– We facilitate this via retrieval
© 2014 MapR Technologies 28 
Retrieval Requirements 
• Retrieve by time-series, time range, tags 
– Possibly pull millions of data points at a time 
– Possibly do on-the-fly windowed aggregations 
• Search by unstructured data 
– Typically require time windowed facetting after search 
– Also need to dive in with first kind of retrieval
© 2014 MapR Technologies 29 
Storage choices and trade-offs 
• Flat files 
– Great for rapid ingest with massive data 
– Handles essentially any data type 
– Less good for data requiring frequent updates 
– Harder to find specific ranges 
• Traditional relational db 
– Ingests up to 10,000’s/ sec; prefers well structured (numerical) data; expensive 
• Non-relational db: Tables (such as MapR tables in M7 or HBase) 
– Ingests up to 100,000 rows/sec 
– Handles wide variety of data 
– Good for frequent updates 
– Easily scanned in a range
© 2014 MapR Technologies 30 
Specific Example 
• Consider a server farm 
• Lots of system metrics 
• Typically 100-300 stats / 30 s 
• Loads, RPC’s, packets, requests/s 
• Common to have 100 – 10,000 machines
© 2014 MapR Technologies 31 
The General Outline 
• 10 samples / second / machine 
x 1,000 machines 
= 10,000 samples / second 
• This is what Open TSDB was designed to handle 
• Install and go, but don’t test at scale
© 2014 MapR Technologies 32 
Specific Example 
• Consider oil drilling rigs 
• When drilling wells, there are *lots* of moving parts 
• Typically a drilling rig makes about 10K samples/s 
• Temperatures, pressures, magnetics, 
machine vibration levels, salinity, voltage, 
currents, many others 
• Typical project has 100 rigs
© 2014 MapR Technologies 33 
The General Outline 
• 10K samples / second / rig 
x 100 rigs 
= 1M samples / second
© 2014 MapR Technologies 34 
The General Outline 
• 10K samples / second / rig 
x 100 rigs 
= 1M samples / second 
• But wait, there’s more 
– Suppose you want to test your system 
– Perhaps with a year of data 
– And you want to load that data in << 1 year 
• 100x real-time = 100M samples / second
© 2014 MapR Technologies 35 
How does that Work (Open TSDB on MapR)? 
Message 
queue 
Collector 
MapR 
table Samples 
Web service Users
© 2014 MapR Technologies 36 
Introduction to Open TSDB 
HBase 
or 
MapR-DB
© 2014 MapR Technologies 37 
Wide Table Design: Point-by-Point
© 2014 MapR Technologies 38 
Wide Table Design: Hybrid Point-by-Point + Blob 
Insertion of data as blob makes original columns redundant 
This is the way that TSD should work, not quite how it does work
© 2014 MapR Technologies 39 
Speeding up OpenTSDB 
20,000 data points per second per node in the test cluster 
Why can’t it be faster ?
© 2014 MapR Technologies 40 
Status to This Point 
• Each sample requires one insertion, compaction requires 
another 
• Typical performance on SE cluster 
– 1 edge node + 4 cluster nodes 
– 20,000 samples per second observed 
– Would be faster on performance cluster, possibly not a lot 
• Suitable for server monitoring 
• Not suitable for large scale history ingestion 
• Bulk load helps a little, but not much 
• Still 1000x too slow for industrial work
© 2014 MapR Technologies 41 
Small Trick … Buffer Data in Memory 
Message 
queue Samples 
Users 
Collector 
MapR 
table 
Web service 
Log 
Buffering data for 1 hour in 
collector allows >1000x 
decrease in insertion rate 
Logging latest hour of data allows 
clean restart of collector 
(lambda + epsilon architecture) 
Web service queries 
database and 
collector
© 2014 MapR Technologies 42 
Speeding up OpenTSDB: open source MapR extensions 
Available on Github: https://github.com/mapr-demos/opentsdb
© 2014 MapR Technologies 43 
Status to This Point 
• 3600 samples require one insertion 
• Typical results on SE cluster 
– 1 edge node + 4 cluster nodes 
– 14 million samples per second observed 
– ~700x faster ingestion 
• Typical results on performance cluster 
– 2-4 edge nodes + 4-9 cluster nodes 
– 110 million samples/s (4 nodes) to >200 million samples/s (8 nodes) 
• Suitable for large scale history ingestion 
• 30 million data points retrieved in 20s 
• Ready for industrial work
© 2014 MapR Technologies 44 
Key Lessons 
• Ingestion is network limited 
– Edge nodes are the critical resource 
– Number of edge nodes defines a limit to scaling 
• With enough edge nodes scaling is near perfect 
• Performance of raw OpenTSDB is limited by stateless demon 
• Modified OpenTSDB can run 1000x faster
© 2014 MapR Technologies 45 
Overall Ingestion Rate 
Nodes 
Total Ingestion Rate (millions of points / second) 
4 5 8 9 
0 50 150 250
© 2014 MapR Technologies 46 
Normalized Ingestion Rate 
Nodes 
Ingestion per node (millions of points / second) 
4 5 8 9 
0 10 20 30 40
© 2014 MapR Technologies 47 
Why MapR? 
• MapR tables are inherently faster, safer 
– Sustained > 1GB/s ingest rate in tests 
• Mirror to M5 or M7 cluster to isolate analytics load 
• Transaction logs involves frequent appends, many files
© 2014 MapR Technologies 48 
When is this All Wrong? 
• In some cases, retrieval by series-id + time range not sufficient 
• May need very flexible retrieval of events based on text-like 
criteria 
• Search may be better than class time-series database 
• Can scale Lucene based search to > 1 million events / second
© 2014 MapR Technologies 49 
Summary 
• The internet is turning upside down 
• This will make time series ubiquitous 
• Current open source systems are much too slow 
• We can fix that with modern NoSQL systems 
– (I wear a red hat for a reason)
© 2014 MapR Technologies 50 
Questions
© 2014 MapR Technologies 51 
Thank You 
@mapr maprtech 
tdunning@mapr.com 
tdunning@apache.org 
Ted Dunning, Chief Application Architect 
MapRTechnologies 
maprtech 
mapr-technologies

Weitere ähnliche Inhalte

Was ist angesagt?

Time Series Processing with Solr and Spark
Time Series Processing with Solr and SparkTime Series Processing with Solr and Spark
Time Series Processing with Solr and SparkJosef Adersberger
 
Apache Solr as a compressed, scalable, and high performance time series database
Apache Solr as a compressed, scalable, and high performance time series databaseApache Solr as a compressed, scalable, and high performance time series database
Apache Solr as a compressed, scalable, and high performance time series databaseFlorian Lautenschlager
 
Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...
Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...
Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...DataStax
 
Chronix Poster for the Poster Session FAST 2017
Chronix Poster for the Poster Session FAST 2017Chronix Poster for the Poster Session FAST 2017
Chronix Poster for the Poster Session FAST 2017Florian Lautenschlager
 
Discretized Stream - Fault-Tolerant Streaming Computation at Scale - SOSP
Discretized Stream - Fault-Tolerant Streaming Computation at Scale - SOSPDiscretized Stream - Fault-Tolerant Streaming Computation at Scale - SOSP
Discretized Stream - Fault-Tolerant Streaming Computation at Scale - SOSPTathagata Das
 
Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016Markus Höfer
 
Guest Lecture on Spark Streaming in Stanford CME 323: Distributed Algorithms ...
Guest Lecture on Spark Streaming in Stanford CME 323: Distributed Algorithms ...Guest Lecture on Spark Streaming in Stanford CME 323: Distributed Algorithms ...
Guest Lecture on Spark Streaming in Stanford CME 323: Distributed Algorithms ...Tathagata Das
 
Go and Uber’s time series database m3
Go and Uber’s time series database m3Go and Uber’s time series database m3
Go and Uber’s time series database m3Rob Skillington
 
Advanced Apache Cassandra Operations with JMX
Advanced Apache Cassandra Operations with JMXAdvanced Apache Cassandra Operations with JMX
Advanced Apache Cassandra Operations with JMXzznate
 
Efficient and Fast Time Series Storage - The missing link in dynamic software...
Efficient and Fast Time Series Storage - The missing link in dynamic software...Efficient and Fast Time Series Storage - The missing link in dynamic software...
Efficient and Fast Time Series Storage - The missing link in dynamic software...Florian Lautenschlager
 
Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...
Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...
Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...Lucidworks
 
A Fast and Efficient Time Series Storage Based on Apache Solr
A Fast and Efficient Time Series Storage Based on Apache SolrA Fast and Efficient Time Series Storage Based on Apache Solr
A Fast and Efficient Time Series Storage Based on Apache SolrQAware GmbH
 
Gnocchi v4 - past and present
Gnocchi v4 - past and presentGnocchi v4 - past and present
Gnocchi v4 - past and presentGordon Chung
 
Chronix: Long Term Storage and Retrieval Technology for Anomaly Detection in ...
Chronix: Long Term Storage and Retrieval Technology for Anomaly Detection in ...Chronix: Long Term Storage and Retrieval Technology for Anomaly Detection in ...
Chronix: Long Term Storage and Retrieval Technology for Anomaly Detection in ...Florian Lautenschlager
 
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-timeChris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-timeFlink Forward
 
HBaseCon 2013: OpenTSDB at Box
HBaseCon 2013: OpenTSDB at BoxHBaseCon 2013: OpenTSDB at Box
HBaseCon 2013: OpenTSDB at BoxCloudera, Inc.
 
Building a Fast, Resilient Time Series Store with Cassandra (Alex Petrov, Dat...
Building a Fast, Resilient Time Series Store with Cassandra (Alex Petrov, Dat...Building a Fast, Resilient Time Series Store with Cassandra (Alex Petrov, Dat...
Building a Fast, Resilient Time Series Store with Cassandra (Alex Petrov, Dat...DataStax
 

Was ist angesagt? (20)

Time Series Processing with Solr and Spark
Time Series Processing with Solr and SparkTime Series Processing with Solr and Spark
Time Series Processing with Solr and Spark
 
Apache Solr as a compressed, scalable, and high performance time series database
Apache Solr as a compressed, scalable, and high performance time series databaseApache Solr as a compressed, scalable, and high performance time series database
Apache Solr as a compressed, scalable, and high performance time series database
 
Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...
Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...
Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...
 
Chronix Poster for the Poster Session FAST 2017
Chronix Poster for the Poster Session FAST 2017Chronix Poster for the Poster Session FAST 2017
Chronix Poster for the Poster Session FAST 2017
 
The new time series kid on the block
The new time series kid on the blockThe new time series kid on the block
The new time series kid on the block
 
Matlab netcdf guide
Matlab netcdf guideMatlab netcdf guide
Matlab netcdf guide
 
Gnocchi v3
Gnocchi v3Gnocchi v3
Gnocchi v3
 
Discretized Stream - Fault-Tolerant Streaming Computation at Scale - SOSP
Discretized Stream - Fault-Tolerant Streaming Computation at Scale - SOSPDiscretized Stream - Fault-Tolerant Streaming Computation at Scale - SOSP
Discretized Stream - Fault-Tolerant Streaming Computation at Scale - SOSP
 
Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016
 
Guest Lecture on Spark Streaming in Stanford CME 323: Distributed Algorithms ...
Guest Lecture on Spark Streaming in Stanford CME 323: Distributed Algorithms ...Guest Lecture on Spark Streaming in Stanford CME 323: Distributed Algorithms ...
Guest Lecture on Spark Streaming in Stanford CME 323: Distributed Algorithms ...
 
Go and Uber’s time series database m3
Go and Uber’s time series database m3Go and Uber’s time series database m3
Go and Uber’s time series database m3
 
Advanced Apache Cassandra Operations with JMX
Advanced Apache Cassandra Operations with JMXAdvanced Apache Cassandra Operations with JMX
Advanced Apache Cassandra Operations with JMX
 
Efficient and Fast Time Series Storage - The missing link in dynamic software...
Efficient and Fast Time Series Storage - The missing link in dynamic software...Efficient and Fast Time Series Storage - The missing link in dynamic software...
Efficient and Fast Time Series Storage - The missing link in dynamic software...
 
Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...
Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...
Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...
 
A Fast and Efficient Time Series Storage Based on Apache Solr
A Fast and Efficient Time Series Storage Based on Apache SolrA Fast and Efficient Time Series Storage Based on Apache Solr
A Fast and Efficient Time Series Storage Based on Apache Solr
 
Gnocchi v4 - past and present
Gnocchi v4 - past and presentGnocchi v4 - past and present
Gnocchi v4 - past and present
 
Chronix: Long Term Storage and Retrieval Technology for Anomaly Detection in ...
Chronix: Long Term Storage and Retrieval Technology for Anomaly Detection in ...Chronix: Long Term Storage and Retrieval Technology for Anomaly Detection in ...
Chronix: Long Term Storage and Retrieval Technology for Anomaly Detection in ...
 
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-timeChris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
 
HBaseCon 2013: OpenTSDB at Box
HBaseCon 2013: OpenTSDB at BoxHBaseCon 2013: OpenTSDB at Box
HBaseCon 2013: OpenTSDB at Box
 
Building a Fast, Resilient Time Series Store with Cassandra (Alex Petrov, Dat...
Building a Fast, Resilient Time Series Store with Cassandra (Alex Petrov, Dat...Building a Fast, Resilient Time Series Store with Cassandra (Alex Petrov, Dat...
Building a Fast, Resilient Time Series Store with Cassandra (Alex Petrov, Dat...
 

Ähnlich wie Ted Dunning – Very High Bandwidth Time Series Database Implementation - NoSQL matters Barcelona 2014

How the Internet of Things is Turning the Internet Upside Down
How the Internet of Things is Turning the Internet Upside DownHow the Internet of Things is Turning the Internet Upside Down
How the Internet of Things is Turning the Internet Upside DownTed Dunning
 
Dunning time-series-2015
Dunning time-series-2015Dunning time-series-2015
Dunning time-series-2015Ted Dunning
 
Dealing with an Upside Down Internet With High Performance Time Series Database
Dealing with an Upside Down Internet  With High Performance Time Series DatabaseDealing with an Upside Down Internet  With High Performance Time Series Database
Dealing with an Upside Down Internet With High Performance Time Series DatabaseDataWorks Summit
 
Dealing with an Upside Down Internet
Dealing with an Upside Down InternetDealing with an Upside Down Internet
Dealing with an Upside Down InternetMapR Technologies
 
How the Internet of Things are Turning the Internet Upside Down
How the Internet of Things are Turning the Internet Upside DownHow the Internet of Things are Turning the Internet Upside Down
How the Internet of Things are Turning the Internet Upside DownDataWorks Summit
 
Time Series Data in a Time Series World
Time Series Data in a Time Series WorldTime Series Data in a Time Series World
Time Series Data in a Time Series WorldMapR Technologies
 
Building HBase Applications - Ted Dunning
Building HBase Applications - Ted DunningBuilding HBase Applications - Ted Dunning
Building HBase Applications - Ted DunningMapR Technologies
 
Real time-hadoop
Real time-hadoopReal time-hadoop
Real time-hadoopTed Dunning
 
HUG_Ireland_Streaming_Ted_Dunning
HUG_Ireland_Streaming_Ted_DunningHUG_Ireland_Streaming_Ted_Dunning
HUG_Ireland_Streaming_Ted_DunningJohn Mulhall
 
Real-time Hadoop: The Ideal Messaging System for Hadoop
Real-time Hadoop: The Ideal Messaging System for Hadoop Real-time Hadoop: The Ideal Messaging System for Hadoop
Real-time Hadoop: The Ideal Messaging System for Hadoop DataWorks Summit/Hadoop Summit
 
Network-aware Data Management for Large Scale Distributed Applications, IBM R...
Network-aware Data Management for Large Scale Distributed Applications, IBM R...Network-aware Data Management for Large Scale Distributed Applications, IBM R...
Network-aware Data Management for Large Scale Distributed Applications, IBM R...balmanme
 
Surge 2013: Maximizing Scalability, Resiliency, and Engineering Velocity in t...
Surge 2013: Maximizing Scalability, Resiliency, and Engineering Velocity in t...Surge 2013: Maximizing Scalability, Resiliency, and Engineering Velocity in t...
Surge 2013: Maximizing Scalability, Resiliency, and Engineering Velocity in t...Coburn Watson
 
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...balmanme
 
Lawrence Livermore Labs talk 2011
Lawrence Livermore Labs talk 2011Lawrence Livermore Labs talk 2011
Lawrence Livermore Labs talk 2011MapR Technologies
 
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopApache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopHortonworks
 
Geospatial Sensor Networks and Partitioning Data
Geospatial Sensor Networks and Partitioning DataGeospatial Sensor Networks and Partitioning Data
Geospatial Sensor Networks and Partitioning DataAlexMiowski
 
Tsinghua University: Two Exemplary Applications in China
Tsinghua University: Two Exemplary Applications in ChinaTsinghua University: Two Exemplary Applications in China
Tsinghua University: Two Exemplary Applications in ChinaDataStax Academy
 
Business Models for Dynamically Provisioned Optical Networks
Business Models for Dynamically Provisioned Optical NetworksBusiness Models for Dynamically Provisioned Optical Networks
Business Models for Dynamically Provisioned Optical NetworksTal Lavian Ph.D.
 

Ähnlich wie Ted Dunning – Very High Bandwidth Time Series Database Implementation - NoSQL matters Barcelona 2014 (20)

How the Internet of Things is Turning the Internet Upside Down
How the Internet of Things is Turning the Internet Upside DownHow the Internet of Things is Turning the Internet Upside Down
How the Internet of Things is Turning the Internet Upside Down
 
Dunning time-series-2015
Dunning time-series-2015Dunning time-series-2015
Dunning time-series-2015
 
Dealing with an Upside Down Internet With High Performance Time Series Database
Dealing with an Upside Down Internet  With High Performance Time Series DatabaseDealing with an Upside Down Internet  With High Performance Time Series Database
Dealing with an Upside Down Internet With High Performance Time Series Database
 
Dealing with an Upside Down Internet
Dealing with an Upside Down InternetDealing with an Upside Down Internet
Dealing with an Upside Down Internet
 
How the Internet of Things are Turning the Internet Upside Down
How the Internet of Things are Turning the Internet Upside DownHow the Internet of Things are Turning the Internet Upside Down
How the Internet of Things are Turning the Internet Upside Down
 
Time Series Data in a Time Series World
Time Series Data in a Time Series WorldTime Series Data in a Time Series World
Time Series Data in a Time Series World
 
Building HBase Applications - Ted Dunning
Building HBase Applications - Ted DunningBuilding HBase Applications - Ted Dunning
Building HBase Applications - Ted Dunning
 
Keys for Success from Streams to Queries
Keys for Success from Streams to QueriesKeys for Success from Streams to Queries
Keys for Success from Streams to Queries
 
Real time-hadoop
Real time-hadoopReal time-hadoop
Real time-hadoop
 
HUG_Ireland_Streaming_Ted_Dunning
HUG_Ireland_Streaming_Ted_DunningHUG_Ireland_Streaming_Ted_Dunning
HUG_Ireland_Streaming_Ted_Dunning
 
Real-time Hadoop: The Ideal Messaging System for Hadoop
Real-time Hadoop: The Ideal Messaging System for Hadoop Real-time Hadoop: The Ideal Messaging System for Hadoop
Real-time Hadoop: The Ideal Messaging System for Hadoop
 
Network-aware Data Management for Large Scale Distributed Applications, IBM R...
Network-aware Data Management for Large Scale Distributed Applications, IBM R...Network-aware Data Management for Large Scale Distributed Applications, IBM R...
Network-aware Data Management for Large Scale Distributed Applications, IBM R...
 
Surge 2013: Maximizing Scalability, Resiliency, and Engineering Velocity in t...
Surge 2013: Maximizing Scalability, Resiliency, and Engineering Velocity in t...Surge 2013: Maximizing Scalability, Resiliency, and Engineering Velocity in t...
Surge 2013: Maximizing Scalability, Resiliency, and Engineering Velocity in t...
 
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...
 
Lawrence Livermore Labs talk 2011
Lawrence Livermore Labs talk 2011Lawrence Livermore Labs talk 2011
Lawrence Livermore Labs talk 2011
 
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopApache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with Hadoop
 
Geospatial Sensor Networks and Partitioning Data
Geospatial Sensor Networks and Partitioning DataGeospatial Sensor Networks and Partitioning Data
Geospatial Sensor Networks and Partitioning Data
 
Tsinghua University: Two Exemplary Applications in China
Tsinghua University: Two Exemplary Applications in ChinaTsinghua University: Two Exemplary Applications in China
Tsinghua University: Two Exemplary Applications in China
 
Kafka talk
Kafka talkKafka talk
Kafka talk
 
Business Models for Dynamically Provisioned Optical Networks
Business Models for Dynamically Provisioned Optical NetworksBusiness Models for Dynamically Provisioned Optical Networks
Business Models for Dynamically Provisioned Optical Networks
 

Mehr von NoSQLmatters

Nathan Ford- Divination of the Defects (Graph-Based Defect Prediction through...
Nathan Ford- Divination of the Defects (Graph-Based Defect Prediction through...Nathan Ford- Divination of the Defects (Graph-Based Defect Prediction through...
Nathan Ford- Divination of the Defects (Graph-Based Defect Prediction through...NoSQLmatters
 
Stefan Hochdörfer - The NoSQL Store everyone ignores: PostgreSQL - NoSQL matt...
Stefan Hochdörfer - The NoSQL Store everyone ignores: PostgreSQL - NoSQL matt...Stefan Hochdörfer - The NoSQL Store everyone ignores: PostgreSQL - NoSQL matt...
Stefan Hochdörfer - The NoSQL Store everyone ignores: PostgreSQL - NoSQL matt...NoSQLmatters
 
Adrian Colyer - Keynote: NoSQL matters - NoSQL matters Dublin 2015
Adrian Colyer - Keynote: NoSQL matters - NoSQL matters Dublin 2015Adrian Colyer - Keynote: NoSQL matters - NoSQL matters Dublin 2015
Adrian Colyer - Keynote: NoSQL matters - NoSQL matters Dublin 2015NoSQLmatters
 
Peter Bakas - Zero to Insights - Real time analytics with Kafka, C*, and Spar...
Peter Bakas - Zero to Insights - Real time analytics with Kafka, C*, and Spar...Peter Bakas - Zero to Insights - Real time analytics with Kafka, C*, and Spar...
Peter Bakas - Zero to Insights - Real time analytics with Kafka, C*, and Spar...NoSQLmatters
 
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...NoSQLmatters
 
Mark Harwood - Building Entity Centric Indexes - NoSQL matters Dublin 2015
Mark Harwood - Building Entity Centric Indexes - NoSQL matters Dublin 2015Mark Harwood - Building Entity Centric Indexes - NoSQL matters Dublin 2015
Mark Harwood - Building Entity Centric Indexes - NoSQL matters Dublin 2015NoSQLmatters
 
Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...
Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...
Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...NoSQLmatters
 
Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...
Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...
Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...NoSQLmatters
 
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015NoSQLmatters
 
Chris Ward - Understanding databases for distributed docker applications - No...
Chris Ward - Understanding databases for distributed docker applications - No...Chris Ward - Understanding databases for distributed docker applications - No...
Chris Ward - Understanding databases for distributed docker applications - No...NoSQLmatters
 
Philipp Krenn - Host your database in the cloud, they said... - NoSQL matters...
Philipp Krenn - Host your database in the cloud, they said... - NoSQL matters...Philipp Krenn - Host your database in the cloud, they said... - NoSQL matters...
Philipp Krenn - Host your database in the cloud, they said... - NoSQL matters...NoSQLmatters
 
Lucian Precup - Back to the Future: SQL 92 for Elasticsearch? - NoSQL matters...
Lucian Precup - Back to the Future: SQL 92 for Elasticsearch? - NoSQL matters...Lucian Precup - Back to the Future: SQL 92 for Elasticsearch? - NoSQL matters...
Lucian Precup - Back to the Future: SQL 92 for Elasticsearch? - NoSQL matters...NoSQLmatters
 
Bruno Guedes - Hadoop real time for dummies - NoSQL matters Paris 2015
Bruno Guedes - Hadoop real time for dummies - NoSQL matters Paris 2015Bruno Guedes - Hadoop real time for dummies - NoSQL matters Paris 2015
Bruno Guedes - Hadoop real time for dummies - NoSQL matters Paris 2015NoSQLmatters
 
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...NoSQLmatters
 
Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...
Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...
Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...NoSQLmatters
 
David Pilato - Advance search for your legacy application - NoSQL matters Par...
David Pilato - Advance search for your legacy application - NoSQL matters Par...David Pilato - Advance search for your legacy application - NoSQL matters Par...
David Pilato - Advance search for your legacy application - NoSQL matters Par...NoSQLmatters
 
Tugdual Grall - From SQL to NoSQL in less than 40 min - NoSQL matters Paris 2015
Tugdual Grall - From SQL to NoSQL in less than 40 min - NoSQL matters Paris 2015Tugdual Grall - From SQL to NoSQL in less than 40 min - NoSQL matters Paris 2015
Tugdual Grall - From SQL to NoSQL in less than 40 min - NoSQL matters Paris 2015NoSQLmatters
 
Gregorry Letribot - Druid at Criteo - NoSQL matters 2015
Gregorry Letribot - Druid at Criteo - NoSQL matters 2015Gregorry Letribot - Druid at Criteo - NoSQL matters 2015
Gregorry Letribot - Druid at Criteo - NoSQL matters 2015NoSQLmatters
 
Michael Hackstein - Polyglot Persistence & Multi-Model NoSQL Databases - NoSQ...
Michael Hackstein - Polyglot Persistence & Multi-Model NoSQL Databases - NoSQ...Michael Hackstein - Polyglot Persistence & Multi-Model NoSQL Databases - NoSQ...
Michael Hackstein - Polyglot Persistence & Multi-Model NoSQL Databases - NoSQ...NoSQLmatters
 
Rob Harrop- Key Note The God, the Bad and the Ugly - NoSQL matters Paris 2015
Rob Harrop- Key Note The God, the Bad and the Ugly - NoSQL matters Paris 2015Rob Harrop- Key Note The God, the Bad and the Ugly - NoSQL matters Paris 2015
Rob Harrop- Key Note The God, the Bad and the Ugly - NoSQL matters Paris 2015NoSQLmatters
 

Mehr von NoSQLmatters (20)

Nathan Ford- Divination of the Defects (Graph-Based Defect Prediction through...
Nathan Ford- Divination of the Defects (Graph-Based Defect Prediction through...Nathan Ford- Divination of the Defects (Graph-Based Defect Prediction through...
Nathan Ford- Divination of the Defects (Graph-Based Defect Prediction through...
 
Stefan Hochdörfer - The NoSQL Store everyone ignores: PostgreSQL - NoSQL matt...
Stefan Hochdörfer - The NoSQL Store everyone ignores: PostgreSQL - NoSQL matt...Stefan Hochdörfer - The NoSQL Store everyone ignores: PostgreSQL - NoSQL matt...
Stefan Hochdörfer - The NoSQL Store everyone ignores: PostgreSQL - NoSQL matt...
 
Adrian Colyer - Keynote: NoSQL matters - NoSQL matters Dublin 2015
Adrian Colyer - Keynote: NoSQL matters - NoSQL matters Dublin 2015Adrian Colyer - Keynote: NoSQL matters - NoSQL matters Dublin 2015
Adrian Colyer - Keynote: NoSQL matters - NoSQL matters Dublin 2015
 
Peter Bakas - Zero to Insights - Real time analytics with Kafka, C*, and Spar...
Peter Bakas - Zero to Insights - Real time analytics with Kafka, C*, and Spar...Peter Bakas - Zero to Insights - Real time analytics with Kafka, C*, and Spar...
Peter Bakas - Zero to Insights - Real time analytics with Kafka, C*, and Spar...
 
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
 
Mark Harwood - Building Entity Centric Indexes - NoSQL matters Dublin 2015
Mark Harwood - Building Entity Centric Indexes - NoSQL matters Dublin 2015Mark Harwood - Building Entity Centric Indexes - NoSQL matters Dublin 2015
Mark Harwood - Building Entity Centric Indexes - NoSQL matters Dublin 2015
 
Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...
Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...
Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...
 
Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...
Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...
Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...
 
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
 
Chris Ward - Understanding databases for distributed docker applications - No...
Chris Ward - Understanding databases for distributed docker applications - No...Chris Ward - Understanding databases for distributed docker applications - No...
Chris Ward - Understanding databases for distributed docker applications - No...
 
Philipp Krenn - Host your database in the cloud, they said... - NoSQL matters...
Philipp Krenn - Host your database in the cloud, they said... - NoSQL matters...Philipp Krenn - Host your database in the cloud, they said... - NoSQL matters...
Philipp Krenn - Host your database in the cloud, they said... - NoSQL matters...
 
Lucian Precup - Back to the Future: SQL 92 for Elasticsearch? - NoSQL matters...
Lucian Precup - Back to the Future: SQL 92 for Elasticsearch? - NoSQL matters...Lucian Precup - Back to the Future: SQL 92 for Elasticsearch? - NoSQL matters...
Lucian Precup - Back to the Future: SQL 92 for Elasticsearch? - NoSQL matters...
 
Bruno Guedes - Hadoop real time for dummies - NoSQL matters Paris 2015
Bruno Guedes - Hadoop real time for dummies - NoSQL matters Paris 2015Bruno Guedes - Hadoop real time for dummies - NoSQL matters Paris 2015
Bruno Guedes - Hadoop real time for dummies - NoSQL matters Paris 2015
 
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...
 
Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...
Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...
Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...
 
David Pilato - Advance search for your legacy application - NoSQL matters Par...
David Pilato - Advance search for your legacy application - NoSQL matters Par...David Pilato - Advance search for your legacy application - NoSQL matters Par...
David Pilato - Advance search for your legacy application - NoSQL matters Par...
 
Tugdual Grall - From SQL to NoSQL in less than 40 min - NoSQL matters Paris 2015
Tugdual Grall - From SQL to NoSQL in less than 40 min - NoSQL matters Paris 2015Tugdual Grall - From SQL to NoSQL in less than 40 min - NoSQL matters Paris 2015
Tugdual Grall - From SQL to NoSQL in less than 40 min - NoSQL matters Paris 2015
 
Gregorry Letribot - Druid at Criteo - NoSQL matters 2015
Gregorry Letribot - Druid at Criteo - NoSQL matters 2015Gregorry Letribot - Druid at Criteo - NoSQL matters 2015
Gregorry Letribot - Druid at Criteo - NoSQL matters 2015
 
Michael Hackstein - Polyglot Persistence & Multi-Model NoSQL Databases - NoSQ...
Michael Hackstein - Polyglot Persistence & Multi-Model NoSQL Databases - NoSQ...Michael Hackstein - Polyglot Persistence & Multi-Model NoSQL Databases - NoSQ...
Michael Hackstein - Polyglot Persistence & Multi-Model NoSQL Databases - NoSQ...
 
Rob Harrop- Key Note The God, the Bad and the Ugly - NoSQL matters Paris 2015
Rob Harrop- Key Note The God, the Bad and the Ugly - NoSQL matters Paris 2015Rob Harrop- Key Note The God, the Bad and the Ugly - NoSQL matters Paris 2015
Rob Harrop- Key Note The God, the Bad and the Ugly - NoSQL matters Paris 2015
 

Kürzlich hochgeladen

Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一F sss
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024Susanna-Assunta Sansone
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一F La
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...ttt fff
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 

Kürzlich hochgeladen (20)

Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 

Ted Dunning – Very High Bandwidth Time Series Database Implementation - NoSQL matters Barcelona 2014

  • 1. © 2014 MapR Techno©lo 2g0ie1s4 MapR Technologies 1
  • 2. © 2014 MapR Technologies 2 Agenda • The Internet is turning upside down • Distributed nervous system • The last (mile) shall be first • Time series on NO-SQL • Faster time series on NO-SQL
  • 3. © 2014 MapR Technologies 3 How the Internet Works • Big content servers feed data across the backbone to • Regional caches and servers feed data across neighborhood transport to • The “last mile” • Bits are nearly conserved, $ are concentrated centrally – But total $ mass at the edge is much higher
  • 4. © 2014 MapR Technologies 4 How The Internet Works Server Cache Cache Gateway Switch Firewall c1 c2 Gateway Switch Firewall c1 c2 Switch Firewall c1 c2
  • 5. © 2014 MapR Technologies 5 Conservation of Bits Decreases Bandwidth Server Cache Cache Gateway Switch Firewall c1 c2 Gateway Switch Firewall c1 c2 Switch Firewall c1 c2
  • 6. © 2014 MapR Technologies 6 Total Investment Dominated by Last Mile Server Cache Cache Gateway Switch Firewall c1 c2 Gateway Switch Firewall c1 c2 Switch Firewall c1 c2
  • 7. © 2014 MapR Technologies 7 The Rub • What's the problem? – Speed (end-to-end latency, backbone bw) – Feasibility (cost for consumer links) – Caching • What do we need? – Cheap last-mile hardware – Good caches
  • 8. © 2014 MapR Technologies 8 What has changed? Where will it lead?
  • 9. © 2014 MapR Technologies 9
  • 10. © 2014 MapR Technologies 10
  • 11. © 2014 MapR Technologies 11
  • 12. © 2014 MapR Technologies 12
  • 13. © 2014 MapR Technologies 13
  • 14. © 2014 MapR Technologies 14
  • 15. © 2014 MapR Technologies 15
  • 16. © 2014 MapR Technologies 16
  • 17. © 2014 MapR Technologies 17
  • 18. © 2014 MapR Technologies 18
  • 19. © 2014 MapR Technologies 19 Things
  • 20. © 2014 MapR Technologies 20 Emitting data
  • 21. © 2014 MapR Technologies 21 How The Internet Works Server Cache Cache Gateway Switch Firewall c1 c2 Gateway Switch Firewall c1 c2 Switch Firewall c1 c2
  • 22. © 2014 MapR Technologies 22 How the Internet is Going to Work Server Cache Cache Controller Switch Gateway m4 m3 Gateway Switch Controller m6 m5 Switch m2 Controller m1
  • 23. © 2014 MapR Technologies 23 Where Will The $ Go? Server Cache Cache Controller Switch Gateway m4 m3 Gateway Switch Controller m6 m5 Switch m2 Controller m1
  • 24. © 2014 MapR Technologies 24 Sensors
  • 25. © 2014 MapR Technologies 25 Controllers
  • 26. © 2014 MapR Technologies 26 The Problems • Sensors and controllers have little processing or space – SIM cards = 20Mhz processor, 128kb space = 16kB – Arduino mini = 15kB RAM (more EPROM) – BeagleBone/Raspberry Pi = 500 kB RAM • Sensors and controllers have little power – Very common to power down 99% of the time • Sensors and controls often have very low bandwidth – Mesh networks with base rates << 1Mb/s – Power line networking – Intermittent 3G/4G/LTE connectivity
  • 27. © 2014 MapR Technologies 27 What Do We Need to Do With a Time Series • Acquire – Measurement, transmission, reception – Mostly not our problem • Store – We own this • Retrieve – We have to allow this • Analyze and visualize – We facilitate this via retrieval
  • 28. © 2014 MapR Technologies 28 Retrieval Requirements • Retrieve by time-series, time range, tags – Possibly pull millions of data points at a time – Possibly do on-the-fly windowed aggregations • Search by unstructured data – Typically require time windowed facetting after search – Also need to dive in with first kind of retrieval
  • 29. © 2014 MapR Technologies 29 Storage choices and trade-offs • Flat files – Great for rapid ingest with massive data – Handles essentially any data type – Less good for data requiring frequent updates – Harder to find specific ranges • Traditional relational db – Ingests up to 10,000’s/ sec; prefers well structured (numerical) data; expensive • Non-relational db: Tables (such as MapR tables in M7 or HBase) – Ingests up to 100,000 rows/sec – Handles wide variety of data – Good for frequent updates – Easily scanned in a range
  • 30. © 2014 MapR Technologies 30 Specific Example • Consider a server farm • Lots of system metrics • Typically 100-300 stats / 30 s • Loads, RPC’s, packets, requests/s • Common to have 100 – 10,000 machines
  • 31. © 2014 MapR Technologies 31 The General Outline • 10 samples / second / machine x 1,000 machines = 10,000 samples / second • This is what Open TSDB was designed to handle • Install and go, but don’t test at scale
  • 32. © 2014 MapR Technologies 32 Specific Example • Consider oil drilling rigs • When drilling wells, there are *lots* of moving parts • Typically a drilling rig makes about 10K samples/s • Temperatures, pressures, magnetics, machine vibration levels, salinity, voltage, currents, many others • Typical project has 100 rigs
  • 33. © 2014 MapR Technologies 33 The General Outline • 10K samples / second / rig x 100 rigs = 1M samples / second
  • 34. © 2014 MapR Technologies 34 The General Outline • 10K samples / second / rig x 100 rigs = 1M samples / second • But wait, there’s more – Suppose you want to test your system – Perhaps with a year of data – And you want to load that data in << 1 year • 100x real-time = 100M samples / second
  • 35. © 2014 MapR Technologies 35 How does that Work (Open TSDB on MapR)? Message queue Collector MapR table Samples Web service Users
  • 36. © 2014 MapR Technologies 36 Introduction to Open TSDB HBase or MapR-DB
  • 37. © 2014 MapR Technologies 37 Wide Table Design: Point-by-Point
  • 38. © 2014 MapR Technologies 38 Wide Table Design: Hybrid Point-by-Point + Blob Insertion of data as blob makes original columns redundant This is the way that TSD should work, not quite how it does work
  • 39. © 2014 MapR Technologies 39 Speeding up OpenTSDB 20,000 data points per second per node in the test cluster Why can’t it be faster ?
  • 40. © 2014 MapR Technologies 40 Status to This Point • Each sample requires one insertion, compaction requires another • Typical performance on SE cluster – 1 edge node + 4 cluster nodes – 20,000 samples per second observed – Would be faster on performance cluster, possibly not a lot • Suitable for server monitoring • Not suitable for large scale history ingestion • Bulk load helps a little, but not much • Still 1000x too slow for industrial work
  • 41. © 2014 MapR Technologies 41 Small Trick … Buffer Data in Memory Message queue Samples Users Collector MapR table Web service Log Buffering data for 1 hour in collector allows >1000x decrease in insertion rate Logging latest hour of data allows clean restart of collector (lambda + epsilon architecture) Web service queries database and collector
  • 42. © 2014 MapR Technologies 42 Speeding up OpenTSDB: open source MapR extensions Available on Github: https://github.com/mapr-demos/opentsdb
  • 43. © 2014 MapR Technologies 43 Status to This Point • 3600 samples require one insertion • Typical results on SE cluster – 1 edge node + 4 cluster nodes – 14 million samples per second observed – ~700x faster ingestion • Typical results on performance cluster – 2-4 edge nodes + 4-9 cluster nodes – 110 million samples/s (4 nodes) to >200 million samples/s (8 nodes) • Suitable for large scale history ingestion • 30 million data points retrieved in 20s • Ready for industrial work
  • 44. © 2014 MapR Technologies 44 Key Lessons • Ingestion is network limited – Edge nodes are the critical resource – Number of edge nodes defines a limit to scaling • With enough edge nodes scaling is near perfect • Performance of raw OpenTSDB is limited by stateless demon • Modified OpenTSDB can run 1000x faster
  • 45. © 2014 MapR Technologies 45 Overall Ingestion Rate Nodes Total Ingestion Rate (millions of points / second) 4 5 8 9 0 50 150 250
  • 46. © 2014 MapR Technologies 46 Normalized Ingestion Rate Nodes Ingestion per node (millions of points / second) 4 5 8 9 0 10 20 30 40
  • 47. © 2014 MapR Technologies 47 Why MapR? • MapR tables are inherently faster, safer – Sustained > 1GB/s ingest rate in tests • Mirror to M5 or M7 cluster to isolate analytics load • Transaction logs involves frequent appends, many files
  • 48. © 2014 MapR Technologies 48 When is this All Wrong? • In some cases, retrieval by series-id + time range not sufficient • May need very flexible retrieval of events based on text-like criteria • Search may be better than class time-series database • Can scale Lucene based search to > 1 million events / second
  • 49. © 2014 MapR Technologies 49 Summary • The internet is turning upside down • This will make time series ubiquitous • Current open source systems are much too slow • We can fix that with modern NoSQL systems – (I wear a red hat for a reason)
  • 50. © 2014 MapR Technologies 50 Questions
  • 51. © 2014 MapR Technologies 51 Thank You @mapr maprtech tdunning@mapr.com tdunning@apache.org Ted Dunning, Chief Application Architect MapRTechnologies maprtech mapr-technologies