www.infobip.com
REAL-TIME BIG DATA INGESTION AND QUERYING OF AGGREGATED DATA
Davor Poldrugo
software engineer
Davor Poldrugo @ Infobip
Software engineer with interest in backend development,
high availability and distributed systems.
https://about.me/davor.poldrugo
Our services
● MOBILE SERVICES: Professional SMS, number validation, voice, USSD, mobile payments; deeply integrated into the telecoms world
● ENTERPRISE PRODUCTS for businesses of any scale and need (mGate, fully-featured web apps, SMS authentication solutions, reseller solutions...)
● APP ENGAGEMENT PLATFORM based on advanced push notifications
● APIs and protocols for EASY INTEGRATION: XML, SOAP/REST, SMPP, HTTP, JSON
● Full 24/7 TECHNICAL SUPPORT regardless of location
● QUALITY guaranteed by a strict SLA
Presentation overview
● Dictionary
● The real-time use case and the challenges (because there are no problems ;)
● The platform and how we got here
● Our path towards real-time data
● Architecture and component overview
● Numbers and conclusion
Dictionary
REAL-TIME noun
“the actual time during which something takes place <the computer may
partly analyze the data in real time (as it comes in) — R. H. March>
<chatted online in real time>
– real-time adjective”
http://www.merriam-webster.com/dictionary/real%20time
BIG DATA noun
“an accumulation of data that is too large and complex for processing by
traditional database management tools”
http://www.merriam-webster.com/dictionary/big%20data
Dictionary
INGEST verb
“to take (something, such as food) into your body : to swallow
(something)
— sometimes used figuratively
She ingested [=absorbed] large amounts of information very quickly.”
http://www.learnersdictionary.com/definition/ingest
I'll use this figurative meaning... in the context of data ingestion.
The real-time use case and the challenges
● Our new web requirement: provide real-time data and graphs of traffic
● SMS Campaigns Web application
Near real-time
● But we wanted real-time!
The platform and how we got here
● There was only one node – a monolith
● One transactional database (OLTP)
● Traffic increased
● After a while the database became a bottleneck
● Then we introduced multiple transactional databases
● Then multiple monolith nodes were introduced – one per database
● Then load balancers were needed
The platform and how we got here
● After that, querying became complex:
– when one or more databases were down for maintenance, data from those databases was missing
– queries had to span multiple databases and then the results had to be joined
– aggregate reports became a problem (complexity, availability)
– aggregation databases were introduced (ETL) that pulled from the transactional databases
● In the meantime we decoupled our monolithic node into lots of microservice nodes (IpCore, Billing, Contacts, Campaigns, ...)
● As traffic increased, non-transactional queries (apps, reports) became a problem – throughput decreased
The platform and how we got here
● Our Database Team introduced GREEN – our ODS/DWH
– named after the color of the pencil used to draw on the board ;)
– Near real-time ETL (for traffic tables with 150+ columns)
– Centralized reporting
– Decreased workload on the transactional databases
– Throughput increase of our core nodes (IpCore)
– Specialized indexes
– Specialized aggregations
– But still... near real-time...
– 1 to 60 minutes out of sync with the transactional databases (depending on the load)
Our path towards real-time data
● GREEN ODS/DWH provided an abstract solution for all our traffic data but was not in REAL-TIME
● GREEN consists of big hardware – scales vertically
● This approach tries to solve particular REAL-TIME use cases one by one – it is not a silver bullet!
● Because REAL-TIME isn't always needed
● Resources are limited
● The path towards horizontal scalability
Our path towards real-time data
1. All data entering the system is dispatched to both the batch layer and the speed layer for processing.
2. The batch layer has two functions: (i) managing the master dataset (an immutable, append-only set of raw data), and (ii) pre-computing the batch views.
3. The serving layer indexes the batch views so that they can be queried in a low-latency, ad-hoc way.
4. The speed layer compensates for the high latency of updates to the serving layer and deals with recent data only.
5. Any incoming query can be answered by merging results from batch views and real-time views.
Lambda architecture ( http://lambda-architecture.net/ )
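Step 5 above can be sketched in a few lines of Python. This is only an illustration of the generic lambda idea, not code from the platform: the function name `merge_views`, the per-campaign counts, and the dictionary layout are all assumptions made for the example.

```python
from collections import defaultdict


def merge_views(batch_view, realtime_view):
    """Merge per-key totals from a batch view and a real-time view."""
    merged = defaultdict(int)
    for view in (batch_view, realtime_view):
        for key, count in view.items():
            merged[key] += count
    return dict(merged)


# Hypothetical data: message counts keyed by campaign id.
batch_view = {29680: 1000, 29681: 250}   # pre-computed by the batch layer
realtime_view = {29680: 12, 29682: 3}    # recent events only (speed layer)

totals = merge_views(batch_view, realtime_view)
print(totals)
```

The sketch assumes the two views never count the same event twice; a real system has to define where the batch view ends and the real-time view begins.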
Our path towards real-time data
Know thyself! Adapt lambda architecture to fit your needs!
[Diagram: Messaging Cloud Apps, IpCore (Core Message Processing) nodes with ingest points, and Transactional Databases (OLTP). Message events flow from the ingest points to the REAL-TIME LAYER. GREEN DB (ODS/DWH) is the newly proclaimed BATCH/SERVING LAYER. The QUERY LAYER queries REAL-TIME or BATCH.]
Architecture and component overview
[Diagram: Messaging Cloud Apps produce message events. The Billing ingest point and the IpCore ingest point feed the REAL-TIME LAYER, which consists of the Data Ingestion Service (process message, process delta, pairing and composing a new message), the Kafka cluster and the Druid cluster.]
Architecture and component overview
[Diagram: REAL-TIME LAYER – Data Ingestion Service, Kafka cluster, Druid cluster]
{
  "sendDateTime": "2016-02-19T12:07:47Z",
  "campaignId": 29680,
  "currencyId": 2,
  "currencyHNBCode": "EUR",
  "currencySymbol": "€",
  "countDelta": 1,
  "priceDelta": 0.02
}
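To see why the event carries deltas rather than totals, here is a small Python sketch that sums `countDelta` and `priceDelta` per campaign and currency, roughly what an aggregating store does at ingestion or query time. The `roll_up` function and the output names `totalCount` and `price` are assumptions for illustration (the names are borrowed from the Druid request later in the deck; the actual ingestion spec is not shown).

```python
import json
from collections import defaultdict

# The sample delta event from the slide, as one JSON string.
RAW_EVENT = (
    '{"sendDateTime":"2016-02-19T12:07:47Z","campaignId":29680,'
    '"currencyId":2,"currencyHNBCode":"EUR","currencySymbol":"€",'
    '"countDelta":1,"priceDelta":0.02}'
)


def roll_up(events):
    """Sum the deltas per (campaignId, currencyId) pair."""
    totals = defaultdict(lambda: {"totalCount": 0, "price": 0.0})
    for event in events:
        key = (event["campaignId"], event["currencyId"])
        totals[key]["totalCount"] += event["countDelta"]
        totals[key]["price"] += event["priceDelta"]
    return dict(totals)


# Three identical events stand in for three sent messages.
events = [json.loads(RAW_EVENT) for _ in range(3)]
totals = roll_up(events)
print(totals[(29680, 2)])
```

Because each event is a delta, the order in which events arrive does not change the totals.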
Architecture and component overview
[Diagram: Messaging Cloud Apps call the QUERY LAYER (Data Query Service), which decides "Is realtime?". TRUE: the query goes to the REAL-TIME LAYER (Data Ingestion Service, Kafka cluster, Druid cluster). FALSE: the query goes to the BATCH LAYER (GREEN DB, ODS/DWH).]
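The "Is realtime?" decision in the diagram can be sketched as a simple router. Everything here is hypothetical: the function names, the stub backends and their return values only illustrate the idea that one query goes to exactly one layer, real-time or batch.

```python
def query_campaign_totals(campaign_id, is_realtime, realtime_backend, batch_backend):
    """Route the query to the real-time layer or the batch layer."""
    backend = realtime_backend if is_realtime else batch_backend
    return backend(campaign_id)


# Stub backends standing in for the Druid cluster and GREEN DB.
def druid_stub(campaign_id):
    return {"source": "real-time", "campaignId": campaign_id}


def green_stub(campaign_id):
    return {"source": "batch", "campaignId": campaign_id}


realtime_result = query_campaign_totals(29680, True, druid_stub, green_stub)
batch_result = query_campaign_totals(29680, False, druid_stub, green_stub)
print(realtime_result, batch_result)
```

Passing the backends in as parameters keeps the routing rule testable without a running cluster.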
Architecture and component overview
[Diagram: QUERY LAYER (Data Query Service) and REAL-TIME LAYER (Druid cluster)]
Request to Druid
POST /druid/v2 HTTP/1.1
Host: druid-broker-node:8080
Content-Type: application/json

{
  "queryType": "groupBy",
  "dataSource": "campaign-totals-v2",
  "granularity": "all",
  "intervals": [ "2012-01-01T00:00:00.000/2100-01-01T00:00:00.000" ],
  "dimensions": ["campaignId", "currencyId", "currencySymbol", "currencyHNBCode"],
  "filter": { "type": "selector", "dimension": "campaignId", "value": 29680 },
  "aggregations": [
    { "type": "longSum", "name": "totalCountSum", "fieldName": "totalCount" },
    { "type": "doubleSum", "name": "totalPriceSum", "fieldName": "price" }
  ]
}
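A minimal Python sketch of how a client might build this request with the standard library. The helper name `build_group_by_request` is an assumption, and the broker URL simply reuses the host and path from the slide; the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_group_by_request(broker_url, campaign_id):
    """Build (but do not send) the groupBy query shown on the slide."""
    query = {
        "queryType": "groupBy",
        "dataSource": "campaign-totals-v2",
        "granularity": "all",
        "intervals": ["2012-01-01T00:00:00.000/2100-01-01T00:00:00.000"],
        "dimensions": ["campaignId", "currencyId", "currencySymbol", "currencyHNBCode"],
        "filter": {"type": "selector", "dimension": "campaignId", "value": campaign_id},
        "aggregations": [
            {"type": "longSum", "name": "totalCountSum", "fieldName": "totalCount"},
            {"type": "doubleSum", "name": "totalPriceSum", "fieldName": "price"},
        ],
    }
    return urllib.request.Request(
        broker_url,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_group_by_request("http://druid-broker-node:8080/druid/v2", 29680)
payload = json.loads(request.data)
print(request.get_method(), request.full_url)
```

Sending it would be a call to `urllib.request.urlopen(request)` against a reachable broker.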
Architecture and component overview
[Diagram: QUERY LAYER (Data Query Service) and REAL-TIME LAYER (Druid cluster)]
Response from Druid
[
  {
    "version": "v1",
    "timestamp": "2012-01-01T00:00:00.000Z",
    "event": {
      "totalCountSum": 1000000,
      "currencyid": "2",
      "totalPriceSum": 20000,
      "currencysymbol": "€",
      "currencyhnbcode": "EUR",
      "campaignid": "29680"
    }
  }
]
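A short Python sketch of how a query service might turn this response into typed values. The function name `extract_totals` and the output field names are assumptions; note that the dimension keys in the response are lowercase and their values are strings, so the sketch converts the campaign id back to an integer.

```python
import json

# The response from the slide, as a JSON string.
SAMPLE_RESPONSE = """
[
  {
    "version": "v1",
    "timestamp": "2012-01-01T00:00:00.000Z",
    "event": {
      "totalCountSum": 1000000,
      "currencyid": "2",
      "totalPriceSum": 20000,
      "currencysymbol": "€",
      "currencyhnbcode": "EUR",
      "campaignid": "29680"
    }
  }
]
"""


def extract_totals(response_text):
    """Flatten each groupBy row into a small dict with typed values."""
    rows = json.loads(response_text)
    return [
        {
            "campaignId": int(row["event"]["campaignid"]),
            "currency": row["event"]["currencyhnbcode"],
            "count": row["event"]["totalCountSum"],
            "price": float(row["event"]["totalPriceSum"]),
        }
        for row in rows
    ]


totals = extract_totals(SAMPLE_RESPONSE)
print(totals)
```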
Architecture and component overview
KAFKA - https://kafka.apache.org/
● Kafka maintains feeds of messages in categories called topics
● A distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design.
FEATURES
● two messaging models incorporated in an abstraction called consumer group (group id) – queue and publish-subscribe
– queue - a pool of consumers may read from a server and each message goes to one of them
– publish-subscribe - the message is broadcast to all consumers
● constant performance with respect to data size
● replay – all messages are stored and can be accessed with a sequential id number called the offset
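The consumer group and offset ideas above can be illustrated with a toy in-memory model in Python. This is not the Kafka client API: `ToyLog` and `deliver` are made-up names, and the round-robin split per message is a simplification (Kafka itself distributes partitions, not single messages, among the consumers of a group).

```python
class ToyLog:
    """An append-only list where the list index plays the role of the offset."""

    def __init__(self):
        self.messages = []

    def append(self, message):
        self.messages.append(message)
        return len(self.messages) - 1  # offset of the new message

    def read_from(self, offset):
        """Replay: return (offset, message) pairs starting at the given offset."""
        return list(enumerate(self.messages))[offset:]


def deliver(log, groups):
    """Every group sees every message; inside a group each message goes to one consumer."""
    received = {name: [] for consumers in groups.values() for name in consumers}
    for consumers in groups.values():
        for offset, message in log.read_from(0):
            consumer = consumers[offset % len(consumers)]
            received[consumer].append(message)
    return received


log = ToyLog()
for text in ["m0", "m1", "m2", "m3"]:
    log.append(text)

# "billing" has two consumers (queue behaviour inside the group);
# "reports" is a second group, so it also gets every message (publish-subscribe).
received = deliver(log, {"billing": ["b1", "b2"], "reports": ["r1"]})
replayed = log.read_from(2)
print(received, replayed)
```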
[Diagram: REAL-TIME LAYER – Data Ingestion Service, Kafka cluster, Druid cluster]
Architecture and component overview
DRUID - http://druid.io/
Druid is a fast column-oriented distributed data store.
Real-time Streams
Druid supports streaming data ingestion and offers insights on
events immediately after they occur. Retain events indefinitely and
unify real-time and historical views.
Sub-Second Queries
Druid supports fast aggregations and sub-second OLAP queries.
Scalable to Petabytes
Existing Druid clusters have scaled to petabytes of data and trillions
of events, ingesting millions of events every second. Druid is
extremely cost effective, even at scale.
Deploy Anywhere
Druid runs on commodity hardware. Deploy it in the cloud or on-premise. Integrate with existing data systems such as Hadoop, Spark, Kafka, Storm, and Samza.
[Diagram: REAL-TIME LAYER – Data Ingestion Service, Kafka cluster, Druid cluster]
Numbers and conclusion
Data pipeline                                      Max. throughput (msg/s)
Ingest points → Data Ingestion Service             7700
  Billing ingest point → Data Ingestion Service    5500
  IpCore ingest point → Data Ingestion Service     2200
Data Ingestion Service → Kafka                     2130
Druid firehose pull and aggregate from Kafka       29000
Real-time!
<2 sec delay
Numbers and conclusion
PROBLEMS / CHALLENGES ;)
● Added complexity to the flow
– Maintenance of “ingest point” code
– Maintenance of Data Ingestion Service
– Operational knowledge of Kafka / Druid
● Scaling Druid – problems with “Druid realtime nodes” and Kafka topics with multiple partitions
● Druid – exactly-once semantics are not guaranteed with real-time ingestion in Druid, but we didn't have problems with our configuration. Definitive solution: Druid batch ingestion using Tranquility
www.infobip.com
Q/A
Davor Poldrugo
software engineer
davor.poldrugo@infobip.com
dpoldrugo@gmail.com
REAL-TIME BIG DATA INGESTION AND QUERYING OF AGGREGATED DATA

Weitere ähnliche Inhalte

Was ist angesagt?

Lambda architecture @ Indix
Lambda architecture @ IndixLambda architecture @ Indix
Lambda architecture @ IndixRajesh Muppalla
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...StreamNative
 
Modern ETL Pipelines with Change Data Capture
Modern ETL Pipelines with Change Data CaptureModern ETL Pipelines with Change Data Capture
Modern ETL Pipelines with Change Data CaptureDatabricks
 
Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf
 
Unified, Efficient, and Portable Data Processing with Apache Beam
Unified, Efficient, and Portable Data Processing with Apache BeamUnified, Efficient, and Portable Data Processing with Apache Beam
Unified, Efficient, and Portable Data Processing with Apache BeamDataWorks Summit/Hadoop Summit
 
Using Hazelcast in the Kappa architecture
Using Hazelcast in the Kappa architectureUsing Hazelcast in the Kappa architecture
Using Hazelcast in the Kappa architectureOliver Buckley-Salmon
 
How Tencent Applies Apache Pulsar to Apache InLong - Pulsar Summit Asia 2021
How Tencent Applies Apache Pulsar to Apache InLong - Pulsar Summit Asia 2021How Tencent Applies Apache Pulsar to Apache InLong - Pulsar Summit Asia 2021
How Tencent Applies Apache Pulsar to Apache InLong - Pulsar Summit Asia 2021StreamNative
 
Introduction to Streaming Distributed Processing with Storm
Introduction to Streaming Distributed Processing with StormIntroduction to Streaming Distributed Processing with Storm
Introduction to Streaming Distributed Processing with StormBrandon O'Brien
 
The Key to Machine Learning is Prepping the Right Data with Jean Georges Perrin
The Key to Machine Learning is Prepping the Right Data with Jean Georges Perrin The Key to Machine Learning is Prepping the Right Data with Jean Georges Perrin
The Key to Machine Learning is Prepping the Right Data with Jean Georges Perrin Databricks
 
RUNNING A PETASCALE DATA SYSTEM: GOOD, BAD, AND UGLY CHOICES by Alexey Kharlamov
RUNNING A PETASCALE DATA SYSTEM: GOOD, BAD, AND UGLY CHOICES by Alexey KharlamovRUNNING A PETASCALE DATA SYSTEM: GOOD, BAD, AND UGLY CHOICES by Alexey Kharlamov
RUNNING A PETASCALE DATA SYSTEM: GOOD, BAD, AND UGLY CHOICES by Alexey KharlamovBig Data Spain
 
The State of Stream Processing
The State of Stream ProcessingThe State of Stream Processing
The State of Stream Processingconfluent
 
Lambda architecture with Spark
Lambda architecture with SparkLambda architecture with Spark
Lambda architecture with SparkVincent GALOPIN
 
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...HostedbyConfluent
 
Cloud Lambda Architecture Patterns
Cloud Lambda Architecture PatternsCloud Lambda Architecture Patterns
Cloud Lambda Architecture PatternsAsis Mohanty
 
Lessons Learned - Monitoring the Data Pipeline at Hulu
Lessons Learned - Monitoring the Data Pipeline at HuluLessons Learned - Monitoring the Data Pipeline at Hulu
Lessons Learned - Monitoring the Data Pipeline at HuluDataWorks Summit
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming VisualizationGuido Schmutz
 
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...HostedbyConfluent
 
Taboola Road To Scale With Apache Spark
Taboola Road To Scale With Apache SparkTaboola Road To Scale With Apache Spark
Taboola Road To Scale With Apache Sparktsliwowicz
 
Real-time Data Streaming from Oracle to Apache Kafka
Real-time Data Streaming from Oracle to Apache Kafka Real-time Data Streaming from Oracle to Apache Kafka
Real-time Data Streaming from Oracle to Apache Kafka confluent
 
ETL as a Platform: Pandora Plays Nicely Everywhere with Real-Time Data Pipelines
ETL as a Platform: Pandora Plays Nicely Everywhere with Real-Time Data PipelinesETL as a Platform: Pandora Plays Nicely Everywhere with Real-Time Data Pipelines
ETL as a Platform: Pandora Plays Nicely Everywhere with Real-Time Data Pipelinesconfluent
 

Was ist angesagt? (20)

Lambda architecture @ Indix
Lambda architecture @ IndixLambda architecture @ Indix
Lambda architecture @ Indix
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
 
Modern ETL Pipelines with Change Data Capture
Modern ETL Pipelines with Change Data CaptureModern ETL Pipelines with Change Data Capture
Modern ETL Pipelines with Change Data Capture
 
Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7
 
Unified, Efficient, and Portable Data Processing with Apache Beam
Unified, Efficient, and Portable Data Processing with Apache BeamUnified, Efficient, and Portable Data Processing with Apache Beam
Unified, Efficient, and Portable Data Processing with Apache Beam
 
Using Hazelcast in the Kappa architecture
Using Hazelcast in the Kappa architectureUsing Hazelcast in the Kappa architecture
Using Hazelcast in the Kappa architecture
 
How Tencent Applies Apache Pulsar to Apache InLong - Pulsar Summit Asia 2021
How Tencent Applies Apache Pulsar to Apache InLong - Pulsar Summit Asia 2021How Tencent Applies Apache Pulsar to Apache InLong - Pulsar Summit Asia 2021
How Tencent Applies Apache Pulsar to Apache InLong - Pulsar Summit Asia 2021
 
Introduction to Streaming Distributed Processing with Storm
Introduction to Streaming Distributed Processing with StormIntroduction to Streaming Distributed Processing with Storm
Introduction to Streaming Distributed Processing with Storm
 
The Key to Machine Learning is Prepping the Right Data with Jean Georges Perrin
The Key to Machine Learning is Prepping the Right Data with Jean Georges Perrin The Key to Machine Learning is Prepping the Right Data with Jean Georges Perrin
The Key to Machine Learning is Prepping the Right Data with Jean Georges Perrin
 
RUNNING A PETASCALE DATA SYSTEM: GOOD, BAD, AND UGLY CHOICES by Alexey Kharlamov
RUNNING A PETASCALE DATA SYSTEM: GOOD, BAD, AND UGLY CHOICES by Alexey KharlamovRUNNING A PETASCALE DATA SYSTEM: GOOD, BAD, AND UGLY CHOICES by Alexey Kharlamov
RUNNING A PETASCALE DATA SYSTEM: GOOD, BAD, AND UGLY CHOICES by Alexey Kharlamov
 
The State of Stream Processing
The State of Stream ProcessingThe State of Stream Processing
The State of Stream Processing
 
Lambda architecture with Spark
Lambda architecture with SparkLambda architecture with Spark
Lambda architecture with Spark
 
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
 
Cloud Lambda Architecture Patterns
Cloud Lambda Architecture PatternsCloud Lambda Architecture Patterns
Cloud Lambda Architecture Patterns
 
Lessons Learned - Monitoring the Data Pipeline at Hulu
Lessons Learned - Monitoring the Data Pipeline at HuluLessons Learned - Monitoring the Data Pipeline at Hulu
Lessons Learned - Monitoring the Data Pipeline at Hulu
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming Visualization
 
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
 
Taboola Road To Scale With Apache Spark
Taboola Road To Scale With Apache SparkTaboola Road To Scale With Apache Spark
Taboola Road To Scale With Apache Spark
 
Real-time Data Streaming from Oracle to Apache Kafka
Real-time Data Streaming from Oracle to Apache Kafka Real-time Data Streaming from Oracle to Apache Kafka
Real-time Data Streaming from Oracle to Apache Kafka
 
ETL as a Platform: Pandora Plays Nicely Everywhere with Real-Time Data Pipelines
ETL as a Platform: Pandora Plays Nicely Everywhere with Real-Time Data PipelinesETL as a Platform: Pandora Plays Nicely Everywhere with Real-Time Data Pipelines
ETL as a Platform: Pandora Plays Nicely Everywhere with Real-Time Data Pipelines
 

Andere mochten auch

Andere mochten auch (20)

Javantura v3 - The Internet of (Lego) Trains – Johan Janssen, Ingmar van der ...
Javantura v3 - The Internet of (Lego) Trains – Johan Janssen, Ingmar van der ...Javantura v3 - The Internet of (Lego) Trains – Johan Janssen, Ingmar van der ...
Javantura v3 - The Internet of (Lego) Trains – Johan Janssen, Ingmar van der ...
 
Javantura v3 - ES6 – Future Is Now – Nenad Pečanac
Javantura v3 - ES6 – Future Is Now – Nenad PečanacJavantura v3 - ES6 – Future Is Now – Nenad Pečanac
Javantura v3 - ES6 – Future Is Now – Nenad Pečanac
 
Javantura v3 - Just say it – using language to communicate with the computer ...
Javantura v3 - Just say it – using language to communicate with the computer ...Javantura v3 - Just say it – using language to communicate with the computer ...
Javantura v3 - Just say it – using language to communicate with the computer ...
 
Javantura v3 - Spring Boot under the hood– Nicolas Fränkel
Javantura v3 - Spring Boot under the hood– Nicolas FränkelJavantura v3 - Spring Boot under the hood– Nicolas Fränkel
Javantura v3 - Spring Boot under the hood– Nicolas Fränkel
 
Javantura v4 - Security architecture of the Java platform - Martin Toshev
Javantura v4 - Security architecture of the Java platform - Martin ToshevJavantura v4 - Security architecture of the Java platform - Martin Toshev
Javantura v4 - Security architecture of the Java platform - Martin Toshev
 
Javantura v4 - DMN – supplement your BPMN - Željko Šmaguc
Javantura v4 - DMN – supplement your BPMN - Željko ŠmagucJavantura v4 - DMN – supplement your BPMN - Željko Šmaguc
Javantura v4 - DMN – supplement your BPMN - Željko Šmaguc
 
Javantura v4 - CroDuke Indy and the Kingdom of Java Skills - Branko Mihaljevi...
Javantura v4 - CroDuke Indy and the Kingdom of Java Skills - Branko Mihaljevi...Javantura v4 - CroDuke Indy and the Kingdom of Java Skills - Branko Mihaljevi...
Javantura v4 - CroDuke Indy and the Kingdom of Java Skills - Branko Mihaljevi...
 
Javantura v4 - JVM++ The GraalVM - Martin Toshev
Javantura v4 - JVM++ The GraalVM - Martin ToshevJavantura v4 - JVM++ The GraalVM - Martin Toshev
Javantura v4 - JVM++ The GraalVM - Martin Toshev
 
Javantura v4 - FreeMarker in Spring web - Marin Kalapać
Javantura v4 - FreeMarker in Spring web - Marin KalapaćJavantura v4 - FreeMarker in Spring web - Marin Kalapać
Javantura v4 - FreeMarker in Spring web - Marin Kalapać
 
Javantura v4 - Let me tell you a story why Scrum is not for you - Roko Roić
Javantura v4 - Let me tell you a story why Scrum is not for you - Roko RoićJavantura v4 - Let me tell you a story why Scrum is not for you - Roko Roić
Javantura v4 - Let me tell you a story why Scrum is not for you - Roko Roić
 
Javantura v4 - The power of cloud in professional services company - Ivan Krn...
Javantura v4 - The power of cloud in professional services company - Ivan Krn...Javantura v4 - The power of cloud in professional services company - Ivan Krn...
Javantura v4 - The power of cloud in professional services company - Ivan Krn...
 
Javantura v4 - Test-driven documentation with Spring REST Docs - Danijel Mitar
Javantura v4 - Test-driven documentation with Spring REST Docs - Danijel MitarJavantura v4 - Test-driven documentation with Spring REST Docs - Danijel Mitar
Javantura v4 - Test-driven documentation with Spring REST Docs - Danijel Mitar
 
Javantura v4 - True RESTful Java Web Services with JSON API and Katharsis - M...
Javantura v4 - True RESTful Java Web Services with JSON API and Katharsis - M...Javantura v4 - True RESTful Java Web Services with JSON API and Katharsis - M...
Javantura v4 - True RESTful Java Web Services with JSON API and Katharsis - M...
 
Javantura v4 - What’s NOT new in modular Java - Milen Dyankov
Javantura v4 - What’s NOT new in modular Java - Milen DyankovJavantura v4 - What’s NOT new in modular Java - Milen Dyankov
Javantura v4 - What’s NOT new in modular Java - Milen Dyankov
 
Javantura v4 - Java and lambdas and streams - are they better than for loops ...
Javantura v4 - Java and lambdas and streams - are they better than for loops ...Javantura v4 - Java and lambdas and streams - are they better than for loops ...
Javantura v4 - Java and lambdas and streams - are they better than for loops ...
 
Javantura v4 - (Spring)Boot your application on Red Hat middleware stack - Al...
Javantura v4 - (Spring)Boot your application on Red Hat middleware stack - Al...Javantura v4 - (Spring)Boot your application on Red Hat middleware stack - Al...
Javantura v4 - (Spring)Boot your application on Red Hat middleware stack - Al...
 
Javantura v4 - Cloud-native Architectures and Java - Matjaž B. Jurič
Javantura v4 - Cloud-native Architectures and Java - Matjaž B. JuričJavantura v4 - Cloud-native Architectures and Java - Matjaž B. Jurič
Javantura v4 - Cloud-native Architectures and Java - Matjaž B. Jurič
 
Javantura v4 - Android App Development in 2017 - Matej Vidaković
Javantura v4 - Android App Development in 2017 - Matej VidakovićJavantura v4 - Android App Development in 2017 - Matej Vidaković
Javantura v4 - Android App Development in 2017 - Matej Vidaković
 
Javantura v4 - Keycloak – instant login for your app - Marko Štrukelj
Javantura v4 - Keycloak – instant login for your app - Marko ŠtrukeljJavantura v4 - Keycloak – instant login for your app - Marko Štrukelj
Javantura v4 - Keycloak – instant login for your app - Marko Štrukelj
 
Javantura v4 - Java or Scala – Web development with Playframework 2.5.x - Kre...
Javantura v4 - Java or Scala – Web development with Playframework 2.5.x - Kre...Javantura v4 - Java or Scala – Web development with Playframework 2.5.x - Kre...
Javantura v4 - Java or Scala – Web development with Playframework 2.5.x - Kre...
 

Ähnlich wie Javantura v3 - Real-time BigData ingestion and querying of aggregated data – Davor Poldrugo

Io t world_2016_iot_smart_gateways_moe
Io t world_2016_iot_smart_gateways_moeIo t world_2016_iot_smart_gateways_moe
Io t world_2016_iot_smart_gateways_moeShawn Moe
 
Pivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream AnalyticsPivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream Analyticskgshukla
 
(ARC346) Scaling To 25 Billion Daily Requests Within 3 Months On AWS
(ARC346) Scaling To 25 Billion Daily Requests Within 3 Months On AWS(ARC346) Scaling To 25 Billion Daily Requests Within 3 Months On AWS
(ARC346) Scaling To 25 Billion Daily Requests Within 3 Months On AWSAmazon Web Services
 
The Most Trusted In-Memory database in the world- Altibase
The Most Trusted In-Memory database in the world- AltibaseThe Most Trusted In-Memory database in the world- Altibase
The Most Trusted In-Memory database in the world- AltibaseAltibase
 
Processing Real-Time Data at Scale: A streaming platform as a central nervous...
Processing Real-Time Data at Scale: A streaming platform as a central nervous...Processing Real-Time Data at Scale: A streaming platform as a central nervous...
Processing Real-Time Data at Scale: A streaming platform as a central nervous...confluent
 
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)Spark Summit
 
Big Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Big Data Taiwan 2014 Track2-2: Informatica Big Data SolutionBig Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Big Data Taiwan 2014 Track2-2: Informatica Big Data SolutionEtu Solution
 
CS8091_BDA_Unit_IV_Stream_Computing
CS8091_BDA_Unit_IV_Stream_ComputingCS8091_BDA_Unit_IV_Stream_Computing
CS8091_BDA_Unit_IV_Stream_ComputingPalani Kumar
 
Artur Borycki - Beyond Lambda - how to get from logical to physical - code.ta...
Artur Borycki - Beyond Lambda - how to get from logical to physical - code.ta...Artur Borycki - Beyond Lambda - how to get from logical to physical - code.ta...
Artur Borycki - Beyond Lambda - how to get from logical to physical - code.ta...AboutYouGmbH
 
Otimizações de Projetos de Big Data, Dw e AI no Microsoft Azure
Otimizações de Projetos de Big Data, Dw e AI no Microsoft AzureOtimizações de Projetos de Big Data, Dw e AI no Microsoft Azure
Otimizações de Projetos de Big Data, Dw e AI no Microsoft AzureLuan Moreno Medeiros Maciel
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Guido Schmutz
 
Log everything! @DC13
Log everything! @DC13Log everything! @DC13
Log everything! @DC13DECK36
 
A performance analysis of OpenStack Cloud vs Real System on Hadoop Clusters
A performance analysis of OpenStack Cloud vs Real System on Hadoop ClustersA performance analysis of OpenStack Cloud vs Real System on Hadoop Clusters
A performance analysis of OpenStack Cloud vs Real System on Hadoop ClustersKumari Surabhi
 
Data & analytics challenges in a microservice architecture
Data & analytics challenges in a microservice architectureData & analytics challenges in a microservice architecture
Data & analytics challenges in a microservice architectureNiels Naglé
 
Dataservices - Processing Big Data The Microservice Way
Dataservices - Processing Big Data The Microservice WayDataservices - Processing Big Data The Microservice Way
Dataservices - Processing Big Data The Microservice WayJosef Adersberger
 
Enabling Microservices Frameworks to Solve Business Problems
Enabling Microservices Frameworks to Solve  Business ProblemsEnabling Microservices Frameworks to Solve  Business Problems
Enabling Microservices Frameworks to Solve Business ProblemsKen Owens
 
Data Streaming with Apache Kafka & MongoDB
Data Streaming with Apache Kafka & MongoDBData Streaming with Apache Kafka & MongoDB
Data Streaming with Apache Kafka & MongoDBconfluent
 
Streaming Data and Stream Processing with Apache Kafka
Streaming Data and Stream Processing with Apache KafkaStreaming Data and Stream Processing with Apache Kafka
Streaming Data and Stream Processing with Apache Kafkaconfluent
 
Streaming Data Ingest and Processing with Apache Kafka
Streaming Data Ingest and Processing with Apache KafkaStreaming Data Ingest and Processing with Apache Kafka
Streaming Data Ingest and Processing with Apache KafkaAttunity
 

Ähnlich wie Javantura v3 - Real-time BigData ingestion and querying of aggregated data – Davor Poldrugo (20)

Real time analytics
Real time analyticsReal time analytics
Real time analytics
 
Io t world_2016_iot_smart_gateways_moe
Io t world_2016_iot_smart_gateways_moeIo t world_2016_iot_smart_gateways_moe
Io t world_2016_iot_smart_gateways_moe
 
Pivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream AnalyticsPivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream Analytics
 
(ARC346) Scaling To 25 Billion Daily Requests Within 3 Months On AWS
(ARC346) Scaling To 25 Billion Daily Requests Within 3 Months On AWS(ARC346) Scaling To 25 Billion Daily Requests Within 3 Months On AWS
(ARC346) Scaling To 25 Billion Daily Requests Within 3 Months On AWS
 

Javantura v3 - Real-time BigData ingestion and querying of aggregated data – Davor Poldrugo

  • 1. www.infobip.com REAL-TIME BIG DATA INGESTION AND QUERYING OF AGGREGATED DATA Davor Poldrugo software engineer
  • 2. Davor Poldrugo @ Infobip Software engineer with interest in backend development, high availability and distributed systems. https://about.me/davor.poldrugo
  • 3.
  • 4. ● MOBILE SERVICES: Professional SMS, number validation, voice, USSD, mobile payments; deeply integrated into the telecoms world ● ENTERPRISE PRODUCTS for businesses of any scale and need (mGate, fully-featured web apps, SMS authentication solutions, reseller solutions...) ● APP ENGAGEMENT PLATFORM based on advanced push notifications ● APIs and protocols for EASY INTEGRATION: xml, soap/rest, smpp, http, json ● Full 24/7 TECHNICAL SUPPORT regardless of location ● QUALITY guaranteed by a strict SLA Our services
  • 5.
  • 6. Presentation overview ● Dictionary ● The real-time use case and the challenges (because there are no problems ;) ● The platform and how we got here ● Our path towards real-time data ● Architecture and component overview ● Numbers and conclusion
  • 7. Dictionary REAL-TIME noun “the actual time during which something takes place <the computer may partly analyze the data in real time (as it comes in) — R. H. March> <chatted online in real time> – real-time adjective” http://www.merriam-webster.com/dictionary/real%20time BIG DATA noun “an accumulation of data that is too large and complex for processing by traditional database management tools” http://www.merriam-webster.com/dictionary/big%20data
  • 8. Dictionary INGEST verb “to take (something, such as food) into your body : to swallow (something) — sometimes used figuratively She ingested [=absorbed] large amounts of information very quickly.” http://www.learnersdictionary.com/definition/ingest I'll use this figurative meaning... in context of data ingestion.
  • 9. The real-time use case and the challenges ● Our new web requirement: provide real-time data and graphs of traffic ● SMS Campaigns Web application Near real-time ● But we wanted real-time!
  • 10. The platform and how we got here ● There was only one node – a monolith ● One transactional database (OLTP) ● Traffic increased ● After a while the database began to be a bottleneck ● Then we introduced multiple transaction databases ● Then multiple monolith nodes were introduced – one per database ● Then load balancers were needed
  • 11. The platform and how we got here ● After that, querying became complex: – when one or more databases were down for maintenance, data from those databases was missing – queries had to span multiple databases and the results then had to be joined – aggregate reports became a problem (complexity, availability) – aggregation databases (ETL) were introduced that pulled from the transactional databases ● In the meantime we decoupled our monolithic node into many microservice nodes (IpCore, Billing, Contacts, Campaigns, ...) ● As traffic increased, non-transactional queries (apps, reports) became a problem – throughput decreased
  • 12. The platform and how we got here ● Our Database Team introduced GREEN – our ODS/DWH – named after the color of the pencil used to draw on the board ;) – Near real-time ETL (for traffic tables with 150+ columns) – Centralized reporting – Decreased workload from transactional databases – Throughput increase of our core nodes (IpCore) – Specialized indexes – Specialized aggregations – But still... near real-time... – 1 to 60 minutes out of sync – with the transactional databases (depending on the load)
  • 13. Our path towards real-time data ● GREEN ODS/DWH provided an abstract solution for all our traffic data but was not in REAL-TIME ● GREEN consists of big hardware – scales vertically ● This approach tries to solve a particular REAL-TIME use case – one by one – not a silver bullet! ● Because REAL-TIME isn't always needed ● Resources are limited ● The path towards horizontal scalability
  • 14. Our path towards real-time data 1. All data entering the system is dispatched to both the batch layer and the speed layer for processing. 2. The batch layer has two functions: (i) managing the master dataset (an immutable, append-only set of raw data), and (ii) to pre-compute the batch views. 3. The serving layer indexes the batch views so that they can be queried in low-latency, ad-hoc way. 4. The speed layer compensates for the high latency of updates to the serving layer and deals with recent data only. 5. Any incoming query can be answered by merging results from batch views and real-time views. Lambda architecture ( http://lambda-architecture.net/ )
  • 15. Our path towards real-time data Know thyself! Adapt lambda architecture to fit your needs! [Diagram labels: Messaging Cloud App, Message Event, Ingest point, IpCore (Core Message Processing), Transactional Databases (OLTP), GREEN DB ODS DWH (newly proclaimed BATCH/SERVING LAYER), REAL-TIME LAYER, QUERY LAYER (queries REAL-TIME OR BATCH)]
  • 16. Architecture and component overview Messaging Cloud App Message Event App Message Event App Message Event REAL-TIME LAYER ... Data Ingestion Service Process Message Process Delta Pairing and composing a new message Kafka cluster Druid cluster Billing ingest point IpCore ingest point
  • 17. Architecture and component overview REAL-TIME LAYER Data Ingestion Service Kafka cluster Druid cluster { "sendDateTime":"2016-02-19T12:07:47Z", "campaignId":29680, "currencyId":2, "currencyHNBCode":"EUR", "currencySymbol":"€", "countDelta":1, "priceDelta":0.02 }
  • 18. Architecture and component overview REAL-TIME LAYER Kafka cluster Druid cluster Data Ingestion Service GREEN DB ODS DWH BATCH LAYER QUERY LAYER Data Query Service Messaging Cloud App Messaging Cloud App Messaging Cloud App Messaging Cloud App Messaging Cloud App Is realtime? TRUE FALSE
  • 19. Architecture and component overview REAL-TIME LAYER Druid cluster QUERY LAYER Data Query Service POST /druid/v2 HTTP/1.1 Host: druid-broker-node:8080 Content-Type: application/json { "queryType": "groupBy", "dataSource": "campaign-totals-v2", "granularity": "all", "intervals": [ "2012-01-01T00:00:00.000/2100-01-01T00:00:00.000" ], "dimensions": ["campaignId", "currencyId", "currencySymbol", "currencyHNBCode"], "filter": { "type": "selector", "dimension": "campaignId", "value": 29680 }, "aggregations": [ { "type": "longSum", "name": "totalCountSum", "fieldName": "totalCount" }, { "type": "doubleSum", "name": "totalPriceSum", "fieldName": "price" } ] } Request to Druid
  • 20. Architecture and component overview REAL-TIME LAYER Druid cluster QUERY LAYER Data Query Service Response from Druid [ { "version": "v1", "timestamp": "2012-01-01T00:00:00.000Z", "event": { "totalCountSum": 1000000, "currencyid": "2", "totalPriceSum": 20000, "currencysymbol": "€", "currencyhnbcode": "EUR", "campaignid": "29680" } } ]
  • 21. Architecture and component overview KAFKA - https://kafka.apache.org/ ● Kafka maintains feeds of messages in categories called topics ● A distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design. FEATURES ● two messaging models incorporated in an abstraction called consumer group (group id) – queue and publish-subscribe – queue - a pool of consumers may read from a server and each message goes to one of them – publish-subscribe - the message is broadcast to all consumers ● constant performance with respect to data size ● replay – all messages are stored and can be accessed with a sequential id number called the offset [Diagram labels: REAL-TIME LAYER, Data Ingestion Service, Kafka cluster, Druid cluster]
  • 22. Architecture and component overview DRUID - http://druid.io/ Druid is a fast column-oriented distributed data store. Real-time Streams: Druid supports streaming data ingestion and offers insights on events immediately after they occur. Retain events indefinitely and unify real-time and historical views. Sub-Second Queries: Druid supports fast aggregations and sub-second OLAP queries. Scalable to Petabytes: Existing Druid clusters have scaled to petabytes of data and trillions of events, ingesting millions of events every second. Druid is extremely cost effective, even at scale. Deploy Anywhere: Druid runs on commodity hardware. Deploy it in the cloud or on-premise. Integrate with existing data systems such as Hadoop, Spark, Kafka, Storm, and Samza. [Diagram labels: REAL-TIME LAYER, Data Ingestion Service, Kafka cluster, Druid cluster]
  • 23. Numbers and conclusion
      Data pipeline | Max. throughput (msg/s)
      Ingest points → Data Ingestion Service | 7700
      Billing ingest point → Data Ingestion Service | 5500
      IpCore ingest point → Data Ingestion Service | 2200
      Data Ingestion Service → Kafka | 2130
      Druid firehose pull and aggregate from Kafka | 29000
      Real-time! <2 sec delay
  • 24. Numbers and conclusion PROBLEMS / CHALLENGES ;) ● Added complexity to the flow – Maintenance of “ingest point” code – Maintenance of Data Ingestion Service – Operational knowledge of Kafka / Druid ● Scaling Druid – problems with “Druid realtime nodes” and Kafka topics with multiple partitions ● Druid – exactly-once semantics are not guaranteed with real-time ingestion in Druid, but we didn't have problems with our configuration – definitive solution: Druid batch ingestion using Tranquility
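
The Druid request and response shown on slides 19 and 20 can be sketched from the point of view of the Data Query Service. This is a minimal Python sketch under stated assumptions, not the service presented in the talk: the function names `build_campaign_totals_query` and `parse_campaign_totals` are hypothetical, and the query body and the lower-cased response keys are copied from the slides as they appear there. Sending the body to the broker (the `POST /druid/v2` call on slide 19) is left out so the example stays self-contained.

```python
import json


def build_campaign_totals_query(campaign_id):
    """Build the groupBy query body from slide 19 for one campaign.

    Hypothetical helper; data source, dimensions and aggregations
    are taken verbatim from the slide.
    """
    return {
        "queryType": "groupBy",
        "dataSource": "campaign-totals-v2",
        "granularity": "all",
        "intervals": ["2012-01-01T00:00:00.000/2100-01-01T00:00:00.000"],
        "dimensions": [
            "campaignId", "currencyId", "currencySymbol", "currencyHNBCode",
        ],
        "filter": {
            "type": "selector",
            "dimension": "campaignId",
            "value": campaign_id,
        },
        "aggregations": [
            {"type": "longSum", "name": "totalCountSum", "fieldName": "totalCount"},
            {"type": "doubleSum", "name": "totalPriceSum", "fieldName": "price"},
        ],
    }


def parse_campaign_totals(response_text):
    """Turn a response shaped like the one on slide 20 into plain dicts.

    Assumes the lower-cased dimension keys shown on the slide
    ("campaignid", "currencyhnbcode").
    """
    rows = json.loads(response_text)
    return [
        {
            "campaign_id": int(row["event"]["campaignid"]),
            "currency": row["event"]["currencyhnbcode"],
            "total_count": row["event"]["totalCountSum"],
            "total_price": row["event"]["totalPriceSum"],
        }
        for row in rows
    ]


if __name__ == "__main__":
    # Body that would be sent to the broker as application/json.
    print(json.dumps(build_campaign_totals_query(29680), indent=2))
```

Keeping query construction and response parsing as pure functions means both can be checked without a running Druid cluster; only the HTTP call itself depends on the deployment.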

Editor's notes

  1. I'll use only this meaning, although there are many meanings for the noun big data. Real time – can anything really be real time? Even the world as we experience it has latency... what our eyes, ears, smell and touch sense has to be processed by our brain. This processing takes time, maybe as little as milliseconds.
  2. OLTP – online transaction processing
  3. ODS – operational data store
  4. ODS – operational data store
  5. ODS – operational data store
  6. ODS – operational data store
  7. ODS – operational data store
  8. ODS – operational data store
  9. - queue - a pool of consumers may read from a server and each message goes to one of them - publish-subscribe - the message is broadcast to all consumers - If all the consumer instances have the same consumer group, then this works just like a traditional queue balancing load over the consumers. - If all the consumer instances have different consumer groups, then this works like publish-subscribe and all messages are broadcast to all consumers. - Kafka's performance is constant with respect to data size, so retaining lots of data is not a problem. - You can forget about Kafka – it just works!
  10. ODS – operational data store
  11. ODS – operational data store
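
The consumer-group behaviour described in note 9 can be illustrated with a small toy model. This is a sketch in plain Python, not Kafka client code: the `deliver` function is hypothetical, and it hands messages out round-robin inside a group, whereas a real Kafka cluster balances load by assigning topic partitions to the consumers of a group. It only shows the two outcomes the note names: one shared group behaves like a queue, distinct groups behave like publish-subscribe.

```python
from collections import defaultdict
from itertools import cycle


def deliver(messages, consumers):
    """Toy model of consumer-group delivery.

    consumers maps a consumer name to its group id. Every group sees
    every message; inside a group each message goes to exactly one
    member (round-robin here, which is a simplification).
    """
    groups = defaultdict(list)
    for name, group_id in consumers.items():
        groups[group_id].append(name)

    received = {name: [] for name in consumers}
    for members in groups.values():
        for message, member in zip(messages, cycle(members)):
            received[member].append(message)
    return received


if __name__ == "__main__":
    messages = ["m1", "m2", "m3", "m4"]
    # Same group id: load is balanced, like a traditional queue.
    print(deliver(messages, {"a": "reports", "b": "reports"}))
    # Different group ids: every consumer gets every message.
    print(deliver(messages, {"a": "reports", "b": "billing"}))
```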