SlideShare ist ein Scribd-Unternehmen logo
1 von 75
A real-time architecture using
Hadoop and Storm.
Speaker

Nathan Bijnens
@nathan_gs

A real-time architecture using Hadoop & Storm. #JaxLondon

2
Our Vision

Volume
Big Data

test

A real-time architecture using Hadoop & Storm. #JaxLondon

3
Big Data

Velocity
test

A real-time architecture using Hadoop & Storm. #JaxLondon

4
Our Vision

Volum
e

Variety
test

A real-time architecture using Hadoop & Storm. #JaxLondon

5
Computing Trends
Past

Current

Computation (CPUs)
Expensive

Computation Cheap
(Many Core Computers)

Disk Storage Expensive

Disk Storage Cheap
(Cheap Commodity Disks)

DRAM Expensive

DRAM / SSD
Getting Cheap

Coordination Easy
(Latches Don’t Often Hit)

Coordination Hard
(Latches Stall a Lot, etc)

Source: Immutability Changes Everything - Pat Helland, RICON2012

A real-time architecture using Hadoop & Storm. #JaxLondon

6
Credits
Nathan Marz
Ex-Backtype & Twitter
Startup in
Stealthmode
Storm
Cascalog
ElephantDB
manning.com/marz

A real-time architecture using Hadoop & Storm. #JaxLondon

7
A Data System

A real-time architecture using Hadoop & Storm. #JaxLondon

8
Data is more than Information

Not all information is equal.
Some information is derived from other pieces of
information.

A real-time architecture using Hadoop & Storm. #JaxLondon

9
Data is more than Information

Eventually you will reach the most
‘raw’ form of information.
This is the information you hold true, simple because it
exists.
Let’s call this ‘data’, very similar to ‘event’.

A real-time architecture using Hadoop & Storm. #JaxLondon

10
Events - Before

Events used to manipulate
the master data.

A real-time architecture using Hadoop & Storm. #JaxLondon

11
Events - After

Today, events are the master
data.

A real-time architecture using Hadoop & Storm. #JaxLondon

12
Data System

Let’s store everything.

A real-time architecture using Hadoop & Storm. #JaxLondon

13
Events

Data is Immutable

A real-time architecture using Hadoop & Storm. #JaxLondon

14
Events

Data is Time Based

A real-time architecture using Hadoop & Storm. #JaxLondon

15
Capturing change traditionally

Person

Location

Person

Location

Nathan

Antwerp

Nathan

Ghent

Geert

Dendermonde

Geert

Dendermonde

John

Ghent

John

Ghent

A real-time architecture using Hadoop & Storm. #JaxLondon

16
Capturing change

Person

Location

Timestamp

Person

Location

Time

Nathan

Antwerp

2005-01-01

Nathan

Antwerp

2005-01-01

Geert

Dendermonde

2011-10-08

Geert

Dendermond
e

2011-10-08

John

Ghent

2010-05-02

John

Ghent

2010-05-02

Nathan

Ghent

2013-02-03

A real-time architecture using Hadoop & Storm. #JaxLondon

17
Query

The data you query is often
transformed, aggregated, ...
Rarely used in it’s original form.

A real-time architecture using Hadoop & Storm. #JaxLondon

18
Query

Query = function ( all data
)
A real-time architecture using Hadoop & Storm. #JaxLondon

19
Number of people living in each city.

Person

Location

Time

Location

Count

Nathan

Antwerp

2005-01-01

Ghent

2

Geert

Dendermond
e

2011-10-08

Dendermonde

1

John

Ghent

2010-05-02

Nathan

Ghent

2013-02-03

A real-time architecture using Hadoop & Storm. #JaxLondon

20
Query

All Data

Query

A real-time architecture using Hadoop & Storm. #JaxLondon

22
Query: Precompute

All Data

Precomputed
View

Query

A real-time architecture using Hadoop & Storm. #JaxLondon

23
Layered Architecture

Batch Layer

Speed Layer

Serving Layer

A real-time architecture using Hadoop & Storm. #JaxLondon

24
Layered Architecture

Query

Cassandr
a

Incoming Data
Hadoop

Elephan
tDB

A real-time architecture using Hadoop & Storm. #JaxLondon

25
Batch Layer

A real-time architecture using Hadoop & Storm. #JaxLondon

26
Batch Layer

Incoming Data
Hadoop

Elephan
tDB

A real-time architecture using Hadoop & Storm. #JaxLondon

27
Batch Layer

Unrestrained computation.

A real-time architecture using Hadoop & Storm. #JaxLondon

28
Batch Layer

No need to De-Normalize.

A real-time architecture using Hadoop & Storm. #JaxLondon

29
Batch Layer

Horizontal scalable.

A real-time architecture using Hadoop & Storm. #JaxLondon

30
Batch Layer

High Latency.
Let’s pretend temporarily that update latency
doesn’t matter.

A real-time architecture using Hadoop & Storm. #JaxLondon

31
Batch Layer

Functional computation, based on
immutable inputs, is idempotent.

A real-time architecture using Hadoop & Storm. #JaxLondon

32
Batch Layer

Stores master copy of data
set...
append only.

A real-time architecture using Hadoop & Storm. #JaxLondon

33
Batch Layer

A real-time architecture using Hadoop & Storm. #JaxLondon

34
Batch: View generation

View
#1

Master Dataset

MapReduc
e

View
#2

View
#3

A real-time architecture using Hadoop & Storm. #JaxLondon

35
MapReduce

MAP

1. Take a large data set and divide it into subsets
…

2. Perform the same function on all subsets

REDUCE

DoWork()

DoWork()

DoWork()

…

3. Combine the output from all subsets
…

Output

A real-time architecture using Hadoop & Storm. #JaxLondon

36
Serialization & Schema

Catch errors as quickly as they happen.
Validation on write vs on read.

A real-time architecture using Hadoop & Storm. #JaxLondon

37
Serialization & Schema

CSV is actually a serialization language that is
just poorly defined.

A real-time architecture using Hadoop & Storm. #JaxLondon

38
Serialization & Schema
Use a format with a schema.
-

Thrift
Avro
Protobuffers

Added bonus: it’s faster & uses less space.

A real-time architecture using Hadoop & Storm. #JaxLondon

39
Batch View Database

Read only database.
No random writes required.

A real-time architecture using Hadoop & Storm. #JaxLondon

40
Batch View Database

Every iteration produces the
Views from scratch.

A real-time architecture using Hadoop & Storm. #JaxLondon

41
Batch View Database
ElephantDB
Splout
Voldemort
…

A real-time architecture using Hadoop & Storm. #JaxLondon

42
Batch Layer
We are not done yet…

Just a few hours of data.

Data absorbed into Batch Views

Not yet
absorbed.

A real-time architecture using Hadoop & Storm. #JaxLondon

No
w

Time

44
Speed Layer

A real-time architecture using Hadoop & Storm. #JaxLondon

45
Overview
Cassandr
a

Incoming Data
Hadoop

Elephan
tDB

A real-time architecture using Hadoop & Storm. #JaxLondon

46
Speed Layer

Stream processing.

A real-time architecture using Hadoop & Storm. #JaxLondon

47
Speed Layer

Continuous computation.

A real-time architecture using Hadoop & Storm. #JaxLondon

48
Speed Layer

Transactional.

A real-time architecture using Hadoop & Storm. #JaxLondon

49
Speed Layer

Storing a limited window of
data.
Compensating for the last few hours of data.

A real-time architecture using Hadoop & Storm. #JaxLondon

50
Speed Layer

All the complexity is isolated in the
Speed layer.
If anything goes wrong, it’s auto-corrected.

A real-time architecture using Hadoop & Storm. #JaxLondon

51
CAP
You have a choice between:
Availability
-

Queries are eventual consistent.

Consistency
-

Queries are consistent.

A real-time architecture using Hadoop & Storm. #JaxLondon

52
Eventual accuracy

Some algorithms are hard to
implement in real time. For those
cases we could estimate the results.

A real-time architecture using Hadoop & Storm. #JaxLondon

53
Speed Layer

Real
Time
View 1

Incoming Data
Real
Time
View 2

A real-time architecture using Hadoop & Storm. #JaxLondon

54
Storm
Message passing.
Distributed processing.
Horizontally scalable.
Incremental algorithms.
Fast.
Data in motion.

A real-time architecture using Hadoop & Storm. #JaxLondon

55
Storm

Nimbus

Execute
r

Execute
r

Worker
Node

Supervis
or

Execute
r

Execute
r

Execute
r

Worker
Node

Supervis
or

Execute
r

Execute
r

Execute
r

Execute
r

Supervis
or

Zookeep
er

Worker
Node

A real-time architecture using Hadoop & Storm. #JaxLondon

56
Storm
Tuple

Stream

A real-time architecture using Hadoop & Storm. #JaxLondon

57
Storm
Spout

Bolt

A real-time architecture using Hadoop & Storm. #JaxLondon

58
Storm
Grouping

A real-time architecture using Hadoop & Storm. #JaxLondon

59
Data Ingestion
Kafka
Flume
Scribe
*MQ
Kestrel

A real-time architecture using Hadoop & Storm. #JaxLondon

60
Speed Layer Views
The views are stored in Read & Write database.
-

Cassandra
Hbase
Redis
MySQL
ElasticSearch
…

Much more complex than a read only view.

A real-time architecture using Hadoop & Storm. #JaxLondon

61
Serving Layer

A real-time architecture using Hadoop & Storm. #JaxLondon

62
Overview

Query

Cassandr
a

Incoming Data
Hadoop

Elephan
tDB

A real-time architecture using Hadoop & Storm. #JaxLondon

63
Serving Layer

Random reads

A real-time architecture using Hadoop & Storm. #JaxLondon

64
Serving Layer

This layer queries the Batch & Real
Time views and merges it.

A real-time architecture using Hadoop & Storm. #JaxLondon

65
Serving Layer

Batch
Views

Merge
Real
Time
Views

A real-time architecture using Hadoop & Storm. #JaxLondon

66
Serving Layer

How to query an Average?

A real-time architecture using Hadoop & Storm. #JaxLondon

67
Overview

A real-time architecture using Hadoop & Storm. #JaxLondon

68
Overview

Query

Cassandr
a

Incoming Data
Hadoop

Elephan
tDB

A real-time architecture using Hadoop & Storm. #JaxLondon

69
Lambda Architecture

A real-time architecture using Hadoop & Storm. #JaxLondon

70
Lambda Architecture

Can discard any view, batch and real
time, and just recreate everything from
the master data.

A real-time architecture using Hadoop & Storm. #JaxLondon

71
Lambda Architecture

Mistakes are corrected via recomputation.
Write bad data? Remove the data & recompute.
Bug in view generation? Just recompute the view.

A real-time architecture using Hadoop & Storm. #JaxLondon

72
Lambda Architecture

Data storage is highly optimized.

A real-time architecture using Hadoop & Storm. #JaxLondon

73
Lambda Architecture

Immutability changes everything.

A real-time architecture using Hadoop & Storm. #JaxLondon

74
Questions?

Questions?
@nathan_gs & #BigDataCon13

A real-time architecture using Hadoop & Storm. #JaxLondon

75
DataCrunchers
We enable companies in envisioning, defining and
implementing a data strategy.
A one-stop-shop for all your Big Data needs.
The first Big Data Consultancy agency in Belgium.

A real-time architecture using Hadoop & Storm. #JaxLondon

76
Thank you

Thank you
@nathan_gs

A real-time architecture using Hadoop & Storm. #JaxLondon

77

Weitere ähnliche Inhalte

Was ist angesagt?

Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...
Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...
Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...Brian O'Neill
 
Microservices and Teraflops: Effortlessly Scaling Data Science with PyWren wi...
Microservices and Teraflops: Effortlessly Scaling Data Science with PyWren wi...Microservices and Teraflops: Effortlessly Scaling Data Science with PyWren wi...
Microservices and Teraflops: Effortlessly Scaling Data Science with PyWren wi...Databricks
 
Advanced Data Science on Spark-(Reza Zadeh, Stanford)
Advanced Data Science on Spark-(Reza Zadeh, Stanford)Advanced Data Science on Spark-(Reza Zadeh, Stanford)
Advanced Data Science on Spark-(Reza Zadeh, Stanford)Spark Summit
 
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...Brian O'Neill
 
High Performance Cloud Computing
High Performance Cloud ComputingHigh Performance Cloud Computing
High Performance Cloud ComputingAmazon Web Services
 
High Performance Machine Learning in R with H2O
High Performance Machine Learning in R with H2OHigh Performance Machine Learning in R with H2O
High Performance Machine Learning in R with H2OSri Ambati
 
Towards Benchmaking Modern Distruibuted Systems-(Grace Huang, Intel)
Towards Benchmaking Modern Distruibuted Systems-(Grace Huang, Intel)Towards Benchmaking Modern Distruibuted Systems-(Grace Huang, Intel)
Towards Benchmaking Modern Distruibuted Systems-(Grace Huang, Intel)Spark Summit
 
Sparkling Water 5 28-14
Sparkling Water 5 28-14Sparkling Water 5 28-14
Sparkling Water 5 28-14Sri Ambati
 
Fast and Scalable Python
Fast and Scalable PythonFast and Scalable Python
Fast and Scalable PythonTravis Oliphant
 
Data Science with Spark
Data Science with SparkData Science with Spark
Data Science with SparkKrishna Sankar
 
Optimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache SparkOptimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache SparkDatabricks
 
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui MengChallenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui MengDatabricks
 
Fast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonFast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonBenjamin Bengfort
 
Project Tungsten: Bringing Spark Closer to Bare Metal
Project Tungsten: Bringing Spark Closer to Bare MetalProject Tungsten: Bringing Spark Closer to Bare Metal
Project Tungsten: Bringing Spark Closer to Bare MetalDatabricks
 
Rapid Prototyping in PySpark Streaming: The Thermodynamics of Docker Containe...
Rapid Prototyping in PySpark Streaming: The Thermodynamics of Docker Containe...Rapid Prototyping in PySpark Streaming: The Thermodynamics of Docker Containe...
Rapid Prototyping in PySpark Streaming: The Thermodynamics of Docker Containe...Richard Seymour
 
Frustration-Reduced PySpark: Data engineering with DataFrames
Frustration-Reduced PySpark: Data engineering with DataFramesFrustration-Reduced PySpark: Data engineering with DataFrames
Frustration-Reduced PySpark: Data engineering with DataFramesIlya Ganelin
 
H2O Design and Infrastructure with Matt Dowle
H2O Design and Infrastructure with Matt DowleH2O Design and Infrastructure with Matt Dowle
H2O Design and Infrastructure with Matt DowleSri Ambati
 
Performance Optimization Case Study: Shattering Hadoop's Sort Record with Spa...
Performance Optimization Case Study: Shattering Hadoop's Sort Record with Spa...Performance Optimization Case Study: Shattering Hadoop's Sort Record with Spa...
Performance Optimization Case Study: Shattering Hadoop's Sort Record with Spa...Databricks
 
Introduction to Spark Training
Introduction to Spark TrainingIntroduction to Spark Training
Introduction to Spark TrainingSpark Summit
 

Was ist angesagt? (20)

Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...
Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...
Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...
 
Microservices and Teraflops: Effortlessly Scaling Data Science with PyWren wi...
Microservices and Teraflops: Effortlessly Scaling Data Science with PyWren wi...Microservices and Teraflops: Effortlessly Scaling Data Science with PyWren wi...
Microservices and Teraflops: Effortlessly Scaling Data Science with PyWren wi...
 
Advanced Data Science on Spark-(Reza Zadeh, Stanford)
Advanced Data Science on Spark-(Reza Zadeh, Stanford)Advanced Data Science on Spark-(Reza Zadeh, Stanford)
Advanced Data Science on Spark-(Reza Zadeh, Stanford)
 
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
 
High Performance Cloud Computing
High Performance Cloud ComputingHigh Performance Cloud Computing
High Performance Cloud Computing
 
High Performance Machine Learning in R with H2O
High Performance Machine Learning in R with H2OHigh Performance Machine Learning in R with H2O
High Performance Machine Learning in R with H2O
 
Towards Benchmaking Modern Distruibuted Systems-(Grace Huang, Intel)
Towards Benchmaking Modern Distruibuted Systems-(Grace Huang, Intel)Towards Benchmaking Modern Distruibuted Systems-(Grace Huang, Intel)
Towards Benchmaking Modern Distruibuted Systems-(Grace Huang, Intel)
 
Sparkling Water 5 28-14
Sparkling Water 5 28-14Sparkling Water 5 28-14
Sparkling Water 5 28-14
 
Fast and Scalable Python
Fast and Scalable PythonFast and Scalable Python
Fast and Scalable Python
 
Data Science with Spark
Data Science with SparkData Science with Spark
Data Science with Spark
 
Optimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache SparkOptimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache Spark
 
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui MengChallenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
 
Fast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonFast Data Analytics with Spark and Python
Fast Data Analytics with Spark and Python
 
Project Tungsten: Bringing Spark Closer to Bare Metal
Project Tungsten: Bringing Spark Closer to Bare MetalProject Tungsten: Bringing Spark Closer to Bare Metal
Project Tungsten: Bringing Spark Closer to Bare Metal
 
Rapid Prototyping in PySpark Streaming: The Thermodynamics of Docker Containe...
Rapid Prototyping in PySpark Streaming: The Thermodynamics of Docker Containe...Rapid Prototyping in PySpark Streaming: The Thermodynamics of Docker Containe...
Rapid Prototyping in PySpark Streaming: The Thermodynamics of Docker Containe...
 
Frustration-Reduced PySpark: Data engineering with DataFrames
Frustration-Reduced PySpark: Data engineering with DataFramesFrustration-Reduced PySpark: Data engineering with DataFrames
Frustration-Reduced PySpark: Data engineering with DataFrames
 
H2O Design and Infrastructure with Matt Dowle
H2O Design and Infrastructure with Matt DowleH2O Design and Infrastructure with Matt Dowle
H2O Design and Infrastructure with Matt Dowle
 
Performance Optimization Case Study: Shattering Hadoop's Sort Record with Spa...
Performance Optimization Case Study: Shattering Hadoop's Sort Record with Spa...Performance Optimization Case Study: Shattering Hadoop's Sort Record with Spa...
Performance Optimization Case Study: Shattering Hadoop's Sort Record with Spa...
 
Building data pipelines
Building data pipelinesBuilding data pipelines
Building data pipelines
 
Introduction to Spark Training
Introduction to Spark TrainingIntroduction to Spark Training
Introduction to Spark Training
 

Andere mochten auch

Big Events, Mob Scale - Darach Ennis (Push Technology)
Big Events, Mob Scale - Darach Ennis (Push Technology)Big Events, Mob Scale - Darach Ennis (Push Technology)
Big Events, Mob Scale - Darach Ennis (Push Technology)jaxLondonConference
 
How Java got its Mojo Back - James Governor (Redmonk)
How Java got its Mojo Back - James Governor (Redmonk)					How Java got its Mojo Back - James Governor (Redmonk)
How Java got its Mojo Back - James Governor (Redmonk) jaxLondonConference
 
Lambda Expressions: Myths and Mistakes - Richard Warburton (jClarity)
Lambda Expressions: Myths and Mistakes - Richard Warburton (jClarity)Lambda Expressions: Myths and Mistakes - Richard Warburton (jClarity)
Lambda Expressions: Myths and Mistakes - Richard Warburton (jClarity)jaxLondonConference
 
Legal and ethical considerations redone
Legal and ethical considerations   redoneLegal and ethical considerations   redone
Legal and ethical considerations redoneNicole174
 
Interactive media applications
Interactive media applicationsInteractive media applications
Interactive media applicationsNicole174
 
45 second video proposal
45 second video proposal45 second video proposal
45 second video proposalNicole174
 
Bringing your app to the web with Dart - Chris Buckett (Entity Group)
Bringing your app to the web with Dart - Chris Buckett (Entity Group)Bringing your app to the web with Dart - Chris Buckett (Entity Group)
Bringing your app to the web with Dart - Chris Buckett (Entity Group)jaxLondonConference
 
Practical Performance: Understand the Performance of Your Application - Chris...
Practical Performance: Understand the Performance of Your Application - Chris...Practical Performance: Understand the Performance of Your Application - Chris...
Practical Performance: Understand the Performance of Your Application - Chris...jaxLondonConference
 
Streams and Things - Darach Ennis (Ubiquiti Networks)
Streams and Things - Darach Ennis (Ubiquiti Networks)Streams and Things - Darach Ennis (Ubiquiti Networks)
Streams and Things - Darach Ennis (Ubiquiti Networks)jaxLondonConference
 
Are you better than a coin toss? - Richard Warbuton & John Oliver (jClarity)
Are you better than a coin toss?  - Richard Warbuton & John Oliver (jClarity)Are you better than a coin toss?  - Richard Warbuton & John Oliver (jClarity)
Are you better than a coin toss? - Richard Warbuton & John Oliver (jClarity)jaxLondonConference
 
Interactive media applications
Interactive media applicationsInteractive media applications
Interactive media applicationsNicole174
 
What makes Groovy Groovy - Guillaume Laforge (Pivotal)
What makes Groovy Groovy  - Guillaume Laforge (Pivotal)What makes Groovy Groovy  - Guillaume Laforge (Pivotal)
What makes Groovy Groovy - Guillaume Laforge (Pivotal)jaxLondonConference
 
Introducing Vert.x 2.0 - Taking polyglot application development to the next ...
Introducing Vert.x 2.0 - Taking polyglot application development to the next ...Introducing Vert.x 2.0 - Taking polyglot application development to the next ...
Introducing Vert.x 2.0 - Taking polyglot application development to the next ...jaxLondonConference
 
How Hailo fuels its growth using NoSQL storage and analytics - Dave Gardner (...
How Hailo fuels its growth using NoSQL storage and analytics - Dave Gardner (...How Hailo fuels its growth using NoSQL storage and analytics - Dave Gardner (...
How Hailo fuels its growth using NoSQL storage and analytics - Dave Gardner (...jaxLondonConference
 
Scaling Scala to the database - Stefan Zeiger (Typesafe)
Scaling Scala to the database - Stefan Zeiger (Typesafe)Scaling Scala to the database - Stefan Zeiger (Typesafe)
Scaling Scala to the database - Stefan Zeiger (Typesafe)jaxLondonConference
 
Databases and agile development - Dwight Merriman (MongoDB)
Databases and agile development - Dwight Merriman (MongoDB)Databases and agile development - Dwight Merriman (MongoDB)
Databases and agile development - Dwight Merriman (MongoDB)jaxLondonConference
 
The state of the art biorepository at ILRI
The state of the art biorepository at ILRIThe state of the art biorepository at ILRI
The state of the art biorepository at ILRIAbsolomon Kihara
 
Real-world polyglot programming on the JVM - Ben Summers (ONEIS)
Real-world polyglot programming on the JVM  - Ben Summers (ONEIS)Real-world polyglot programming on the JVM  - Ben Summers (ONEIS)
Real-world polyglot programming on the JVM - Ben Summers (ONEIS)jaxLondonConference
 
What You Need to Know About Lambdas - Jamie Allen (Typesafe)
What You Need to Know About Lambdas - Jamie Allen (Typesafe)What You Need to Know About Lambdas - Jamie Allen (Typesafe)
What You Need to Know About Lambdas - Jamie Allen (Typesafe)jaxLondonConference
 

Andere mochten auch (20)

Big Events, Mob Scale - Darach Ennis (Push Technology)
Big Events, Mob Scale - Darach Ennis (Push Technology)Big Events, Mob Scale - Darach Ennis (Push Technology)
Big Events, Mob Scale - Darach Ennis (Push Technology)
 
How Java got its Mojo Back - James Governor (Redmonk)
How Java got its Mojo Back - James Governor (Redmonk)					How Java got its Mojo Back - James Governor (Redmonk)
How Java got its Mojo Back - James Governor (Redmonk)
 
Lambda Expressions: Myths and Mistakes - Richard Warburton (jClarity)
Lambda Expressions: Myths and Mistakes - Richard Warburton (jClarity)Lambda Expressions: Myths and Mistakes - Richard Warburton (jClarity)
Lambda Expressions: Myths and Mistakes - Richard Warburton (jClarity)
 
Legal and ethical considerations redone
Legal and ethical considerations   redoneLegal and ethical considerations   redone
Legal and ethical considerations redone
 
Interactive media applications
Interactive media applicationsInteractive media applications
Interactive media applications
 
45 second video proposal
45 second video proposal45 second video proposal
45 second video proposal
 
Bringing your app to the web with Dart - Chris Buckett (Entity Group)
Bringing your app to the web with Dart - Chris Buckett (Entity Group)Bringing your app to the web with Dart - Chris Buckett (Entity Group)
Bringing your app to the web with Dart - Chris Buckett (Entity Group)
 
Practical Performance: Understand the Performance of Your Application - Chris...
Practical Performance: Understand the Performance of Your Application - Chris...Practical Performance: Understand the Performance of Your Application - Chris...
Practical Performance: Understand the Performance of Your Application - Chris...
 
Streams and Things - Darach Ennis (Ubiquiti Networks)
Streams and Things - Darach Ennis (Ubiquiti Networks)Streams and Things - Darach Ennis (Ubiquiti Networks)
Streams and Things - Darach Ennis (Ubiquiti Networks)
 
Are you better than a coin toss? - Richard Warbuton & John Oliver (jClarity)
Are you better than a coin toss?  - Richard Warbuton & John Oliver (jClarity)Are you better than a coin toss?  - Richard Warbuton & John Oliver (jClarity)
Are you better than a coin toss? - Richard Warbuton & John Oliver (jClarity)
 
Interactive media applications
Interactive media applicationsInteractive media applications
Interactive media applications
 
What makes Groovy Groovy - Guillaume Laforge (Pivotal)
What makes Groovy Groovy  - Guillaume Laforge (Pivotal)What makes Groovy Groovy  - Guillaume Laforge (Pivotal)
What makes Groovy Groovy - Guillaume Laforge (Pivotal)
 
Introducing Vert.x 2.0 - Taking polyglot application development to the next ...
Introducing Vert.x 2.0 - Taking polyglot application development to the next ...Introducing Vert.x 2.0 - Taking polyglot application development to the next ...
Introducing Vert.x 2.0 - Taking polyglot application development to the next ...
 
How Hailo fuels its growth using NoSQL storage and analytics - Dave Gardner (...
How Hailo fuels its growth using NoSQL storage and analytics - Dave Gardner (...How Hailo fuels its growth using NoSQL storage and analytics - Dave Gardner (...
How Hailo fuels its growth using NoSQL storage and analytics - Dave Gardner (...
 
Scaling Scala to the database - Stefan Zeiger (Typesafe)
Scaling Scala to the database - Stefan Zeiger (Typesafe)Scaling Scala to the database - Stefan Zeiger (Typesafe)
Scaling Scala to the database - Stefan Zeiger (Typesafe)
 
Databases and agile development - Dwight Merriman (MongoDB)
Databases and agile development - Dwight Merriman (MongoDB)Databases and agile development - Dwight Merriman (MongoDB)
Databases and agile development - Dwight Merriman (MongoDB)
 
The state of the art biorepository at ILRI
The state of the art biorepository at ILRIThe state of the art biorepository at ILRI
The state of the art biorepository at ILRI
 
Why other ppl_dont_get_it
Why other ppl_dont_get_itWhy other ppl_dont_get_it
Why other ppl_dont_get_it
 
Real-world polyglot programming on the JVM - Ben Summers (ONEIS)
Real-world polyglot programming on the JVM  - Ben Summers (ONEIS)Real-world polyglot programming on the JVM  - Ben Summers (ONEIS)
Real-world polyglot programming on the JVM - Ben Summers (ONEIS)
 
What You Need to Know About Lambdas - Jamie Allen (Typesafe)
What You Need to Know About Lambdas - Jamie Allen (Typesafe)What You Need to Know About Lambdas - Jamie Allen (Typesafe)
What You Need to Know About Lambdas - Jamie Allen (Typesafe)
 

Ähnlich wie A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van Landeghem - DataCrunchers

A real-time architecture using Hadoop and Storm @ JAX London
A real-time architecture using Hadoop and Storm @ JAX LondonA real-time architecture using Hadoop and Storm @ JAX London
A real-time architecture using Hadoop and Storm @ JAX LondonNathan Bijnens
 
A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013Nathan Bijnens
 
Developing high frequency indicators using real time tick data on apache supe...
Developing high frequency indicators using real time tick data on apache supe...Developing high frequency indicators using real time tick data on apache supe...
Developing high frequency indicators using real time tick data on apache supe...Zekeriya Besiroglu
 
Observing Intraday Indicators Using Real-Time Tick Data on Apache Superset an...
Observing Intraday Indicators Using Real-Time Tick Data on Apache Superset an...Observing Intraday Indicators Using Real-Time Tick Data on Apache Superset an...
Observing Intraday Indicators Using Real-Time Tick Data on Apache Superset an...DataWorks Summit
 
Paris Spark Meetup (Feb2015) ccarbone : SPARK Streaming vs Storm / MLLib / Ne...
Paris Spark Meetup (Feb2015) ccarbone : SPARK Streaming vs Storm / MLLib / Ne...Paris Spark Meetup (Feb2015) ccarbone : SPARK Streaming vs Storm / MLLib / Ne...
Paris Spark Meetup (Feb2015) ccarbone : SPARK Streaming vs Storm / MLLib / Ne...Cedric CARBONE
 
A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...
A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...
A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...Nathan Bijnens
 
Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Djamel Zouaoui
 
Big Data, Fast Data @ PayPal (YOW 2018)
Big Data, Fast Data @ PayPal (YOW 2018)Big Data, Fast Data @ PayPal (YOW 2018)
Big Data, Fast Data @ PayPal (YOW 2018)Sid Anand
 
a real-time architecture using Hadoop and Storm at Devoxx
a real-time architecture using Hadoop and Storm at Devoxxa real-time architecture using Hadoop and Storm at Devoxx
a real-time architecture using Hadoop and Storm at DevoxxNathan Bijnens
 
Big Data analytics with Nginx, Logstash, Redis, Google Bigquery and Neo4j, ja...
Big Data analytics with Nginx, Logstash, Redis, Google Bigquery and Neo4j, ja...Big Data analytics with Nginx, Logstash, Redis, Google Bigquery and Neo4j, ja...
Big Data analytics with Nginx, Logstash, Redis, Google Bigquery and Neo4j, ja...javier ramirez
 
Open Security Operations Center - OpenSOC
Open Security Operations Center - OpenSOCOpen Security Operations Center - OpenSOC
Open Security Operations Center - OpenSOCSheetal Dolas
 
Aleksei Udatšnõi – Crunching thousands of events per second in nearly real ti...
Aleksei Udatšnõi – Crunching thousands of events per second in nearly real ti...Aleksei Udatšnõi – Crunching thousands of events per second in nearly real ti...
Aleksei Udatšnõi – Crunching thousands of events per second in nearly real ti...NoSQLmatters
 
Pivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream AnalyticsPivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream Analyticskgshukla
 
Leonard Austin (Ravelin) - DevOps in a Machine Learning World
Leonard Austin (Ravelin) - DevOps in a Machine Learning WorldLeonard Austin (Ravelin) - DevOps in a Machine Learning World
Leonard Austin (Ravelin) - DevOps in a Machine Learning WorldOutlyer
 
Introduction to near real time computing
Introduction to near real time computingIntroduction to near real time computing
Introduction to near real time computingTao Li
 
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in ProductionTugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in ProductionCodemotion
 
Cassandra Day SV 2014: Spark, Shark, and Apache Cassandra
Cassandra Day SV 2014: Spark, Shark, and Apache CassandraCassandra Day SV 2014: Spark, Shark, and Apache Cassandra
Cassandra Day SV 2014: Spark, Shark, and Apache CassandraDataStax Academy
 
ESWC SS 2013 - Tuesday Keynote Steffen Staab: Programming the Semantic Web
ESWC SS 2013 - Tuesday Keynote Steffen Staab: Programming the Semantic WebESWC SS 2013 - Tuesday Keynote Steffen Staab: Programming the Semantic Web
ESWC SS 2013 - Tuesday Keynote Steffen Staab: Programming the Semantic Webeswcsummerschool
 
Staab programming thesemanticweb
Staab programming thesemanticwebStaab programming thesemanticweb
Staab programming thesemanticwebAneta Tu
 

Ähnlich wie A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van Landeghem - DataCrunchers (20)

A real-time architecture using Hadoop and Storm @ JAX London
A real-time architecture using Hadoop and Storm @ JAX LondonA real-time architecture using Hadoop and Storm @ JAX London
A real-time architecture using Hadoop and Storm @ JAX London
 
A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013
 
Developing high frequency indicators using real time tick data on apache supe...
Developing high frequency indicators using real time tick data on apache supe...Developing high frequency indicators using real time tick data on apache supe...
Developing high frequency indicators using real time tick data on apache supe...
 
Observing Intraday Indicators Using Real-Time Tick Data on Apache Superset an...
Observing Intraday Indicators Using Real-Time Tick Data on Apache Superset an...Observing Intraday Indicators Using Real-Time Tick Data on Apache Superset an...
Observing Intraday Indicators Using Real-Time Tick Data on Apache Superset an...
 
Paris Spark Meetup (Feb2015) ccarbone : SPARK Streaming vs Storm / MLLib / Ne...
Paris Spark Meetup (Feb2015) ccarbone : SPARK Streaming vs Storm / MLLib / Ne...Paris Spark Meetup (Feb2015) ccarbone : SPARK Streaming vs Storm / MLLib / Ne...
Paris Spark Meetup (Feb2015) ccarbone : SPARK Streaming vs Storm / MLLib / Ne...
 
A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...
A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...
A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...
 
Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming
 
Big Data, Fast Data @ PayPal (YOW 2018)
Big Data, Fast Data @ PayPal (YOW 2018)Big Data, Fast Data @ PayPal (YOW 2018)
Big Data, Fast Data @ PayPal (YOW 2018)
 
a real-time architecture using Hadoop and Storm at Devoxx
a real-time architecture using Hadoop and Storm at Devoxxa real-time architecture using Hadoop and Storm at Devoxx
a real-time architecture using Hadoop and Storm at Devoxx
 
Big Data analytics with Nginx, Logstash, Redis, Google Bigquery and Neo4j, ja...
Big Data analytics with Nginx, Logstash, Redis, Google Bigquery and Neo4j, ja...Big Data analytics with Nginx, Logstash, Redis, Google Bigquery and Neo4j, ja...
Big Data analytics with Nginx, Logstash, Redis, Google Bigquery and Neo4j, ja...
 
Open Security Operations Center - OpenSOC
Open Security Operations Center - OpenSOCOpen Security Operations Center - OpenSOC
Open Security Operations Center - OpenSOC
 
Aleksei Udatšnõi – Crunching thousands of events per second in nearly real ti...
Aleksei Udatšnõi – Crunching thousands of events per second in nearly real ti...Aleksei Udatšnõi – Crunching thousands of events per second in nearly real ti...
Aleksei Udatšnõi – Crunching thousands of events per second in nearly real ti...
 
Pivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream AnalyticsPivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream Analytics
 
Leonard Austin (Ravelin) - DevOps in a Machine Learning World
Leonard Austin (Ravelin) - DevOps in a Machine Learning WorldLeonard Austin (Ravelin) - DevOps in a Machine Learning World
Leonard Austin (Ravelin) - DevOps in a Machine Learning World
 
Spark Streaming into context
Spark Streaming into contextSpark Streaming into context
Spark Streaming into context
 
Introduction to near real time computing
Introduction to near real time computingIntroduction to near real time computing
Introduction to near real time computing
 
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in ProductionTugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
 
Cassandra Day SV 2014: Spark, Shark, and Apache Cassandra
Cassandra Day SV 2014: Spark, Shark, and Apache CassandraCassandra Day SV 2014: Spark, Shark, and Apache Cassandra
Cassandra Day SV 2014: Spark, Shark, and Apache Cassandra
 
ESWC SS 2013 - Tuesday Keynote Steffen Staab: Programming the Semantic Web
ESWC SS 2013 - Tuesday Keynote Steffen Staab: Programming the Semantic WebESWC SS 2013 - Tuesday Keynote Steffen Staab: Programming the Semantic Web
ESWC SS 2013 - Tuesday Keynote Steffen Staab: Programming the Semantic Web
 
Staab programming thesemanticweb
Staab programming thesemanticwebStaab programming thesemanticweb
Staab programming thesemanticweb
 

Mehr von jaxLondonConference

Garbage Collection: the Useful Parts - Martijn Verburg & Dr John Oliver (jCla...
Garbage Collection: the Useful Parts - Martijn Verburg & Dr John Oliver (jCla...Garbage Collection: the Useful Parts - Martijn Verburg & Dr John Oliver (jCla...
Garbage Collection: the Useful Parts - Martijn Verburg & Dr John Oliver (jCla...jaxLondonConference
 
Conflict Free Replicated Data-types in Eventually Consistent Systems - Joel J...
Conflict Free Replicated Data-types in Eventually Consistent Systems - Joel J...Conflict Free Replicated Data-types in Eventually Consistent Systems - Joel J...
Conflict Free Replicated Data-types in Eventually Consistent Systems - Joel J...jaxLondonConference
 
JVM Support for Multitenant Applications - Steve Poole (IBM)
JVM Support for Multitenant Applications - Steve Poole (IBM)JVM Support for Multitenant Applications - Steve Poole (IBM)
JVM Support for Multitenant Applications - Steve Poole (IBM)jaxLondonConference
 
Packed Objects: Fast Talking Java Meets Native Code - Steve Poole (IBM)
Packed Objects: Fast Talking Java Meets Native Code - Steve Poole (IBM)Packed Objects: Fast Talking Java Meets Native Code - Steve Poole (IBM)
Packed Objects: Fast Talking Java Meets Native Code - Steve Poole (IBM)jaxLondonConference
 
Are Hypermedia APIs Just Hype? - Aaron Phethean (Temenos) & Daniel Feist (Mul...
Are Hypermedia APIs Just Hype? - Aaron Phethean (Temenos) & Daniel Feist (Mul...Are Hypermedia APIs Just Hype? - Aaron Phethean (Temenos) & Daniel Feist (Mul...
Are Hypermedia APIs Just Hype? - Aaron Phethean (Temenos) & Daniel Feist (Mul...jaxLondonConference
 
Java Testing With Spock - Ken Sipe (Trexin Consulting)
Java Testing With Spock - Ken Sipe (Trexin Consulting)Java Testing With Spock - Ken Sipe (Trexin Consulting)
Java Testing With Spock - Ken Sipe (Trexin Consulting)jaxLondonConference
 
The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...
The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...
The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...jaxLondonConference
 
Java EE 7 Platform: Boosting Productivity and Embracing HTML5 - Arun Gupta (R...
Java EE 7 Platform: Boosting Productivity and Embracing HTML5 - Arun Gupta (R...Java EE 7 Platform: Boosting Productivity and Embracing HTML5 - Arun Gupta (R...
Java EE 7 Platform: Boosting Productivity and Embracing HTML5 - Arun Gupta (R...jaxLondonConference
 
Exploring the Talend unified Big Data toolset for sentiment analysis - Ben Br...
Exploring the Talend unified Big Data toolset for sentiment analysis - Ben Br...Exploring the Talend unified Big Data toolset for sentiment analysis - Ben Br...
Exploring the Talend unified Big Data toolset for sentiment analysis - Ben Br...jaxLondonConference
 
The Curious Clojurist - Neal Ford (Thoughtworks)
The Curious Clojurist - Neal Ford (Thoughtworks)The Curious Clojurist - Neal Ford (Thoughtworks)
The Curious Clojurist - Neal Ford (Thoughtworks)jaxLondonConference
 
Run Your Java Code on Cloud Foundry - Andy Piper (Pivotal)
Run Your Java Code on Cloud Foundry - Andy Piper (Pivotal)Run Your Java Code on Cloud Foundry - Andy Piper (Pivotal)
Run Your Java Code on Cloud Foundry - Andy Piper (Pivotal)jaxLondonConference
 
Put your Java apps to sleep? Find out how - John Matthew Holt (Waratek)
Put your Java apps to sleep? Find out how - John Matthew Holt (Waratek)Put your Java apps to sleep? Find out how - John Matthew Holt (Waratek)
Put your Java apps to sleep? Find out how - John Matthew Holt (Waratek)jaxLondonConference
 
Project Lambda: Functional Programming Constructs in Java - Simon Ritter (Ora...
Project Lambda: Functional Programming Constructs in Java - Simon Ritter (Ora...Project Lambda: Functional Programming Constructs in Java - Simon Ritter (Ora...
Project Lambda: Functional Programming Constructs in Java - Simon Ritter (Ora...jaxLondonConference
 
Do You Like Coffee with Your dessert? Java and the Raspberry Pi - Simon Ritte...
Do You Like Coffee with Your dessert? Java and the Raspberry Pi - Simon Ritte...Do You Like Coffee with Your dessert? Java and the Raspberry Pi - Simon Ritte...
Do You Like Coffee with Your dessert? Java and the Raspberry Pi - Simon Ritte...jaxLondonConference
 
Little words of wisdom for the developer - Guillaume Laforge (Pivotal)
Little words of wisdom for the developer - Guillaume Laforge (Pivotal)Little words of wisdom for the developer - Guillaume Laforge (Pivotal)
Little words of wisdom for the developer - Guillaume Laforge (Pivotal)jaxLondonConference
 
Designing Resilient Application Platforms with Apache Cassandra - Hayato Shim...
Designing Resilient Application Platforms with Apache Cassandra - Hayato Shim...Designing Resilient Application Platforms with Apache Cassandra - Hayato Shim...
Designing Resilient Application Platforms with Apache Cassandra - Hayato Shim...jaxLondonConference
 

Mehr von jaxLondonConference (17)

Garbage Collection: the Useful Parts - Martijn Verburg & Dr John Oliver (jCla...
Garbage Collection: the Useful Parts - Martijn Verburg & Dr John Oliver (jCla...Garbage Collection: the Useful Parts - Martijn Verburg & Dr John Oliver (jCla...
Garbage Collection: the Useful Parts - Martijn Verburg & Dr John Oliver (jCla...
 
Conflict Free Replicated Data-types in Eventually Consistent Systems - Joel J...
Conflict Free Replicated Data-types in Eventually Consistent Systems - Joel J...Conflict Free Replicated Data-types in Eventually Consistent Systems - Joel J...
Conflict Free Replicated Data-types in Eventually Consistent Systems - Joel J...
 
JVM Support for Multitenant Applications - Steve Poole (IBM)
JVM Support for Multitenant Applications - Steve Poole (IBM)JVM Support for Multitenant Applications - Steve Poole (IBM)
JVM Support for Multitenant Applications - Steve Poole (IBM)
 
Packed Objects: Fast Talking Java Meets Native Code - Steve Poole (IBM)
Packed Objects: Fast Talking Java Meets Native Code - Steve Poole (IBM)Packed Objects: Fast Talking Java Meets Native Code - Steve Poole (IBM)
Packed Objects: Fast Talking Java Meets Native Code - Steve Poole (IBM)
 
Are Hypermedia APIs Just Hype? - Aaron Phethean (Temenos) & Daniel Feist (Mul...
Are Hypermedia APIs Just Hype? - Aaron Phethean (Temenos) & Daniel Feist (Mul...Are Hypermedia APIs Just Hype? - Aaron Phethean (Temenos) & Daniel Feist (Mul...
Are Hypermedia APIs Just Hype? - Aaron Phethean (Temenos) & Daniel Feist (Mul...
 
Java Testing With Spock - Ken Sipe (Trexin Consulting)
Java Testing With Spock - Ken Sipe (Trexin Consulting)Java Testing With Spock - Ken Sipe (Trexin Consulting)
Java Testing With Spock - Ken Sipe (Trexin Consulting)
 
The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...
The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...
The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...
 
Java EE 7 Platform: Boosting Productivity and Embracing HTML5 - Arun Gupta (R...
Java EE 7 Platform: Boosting Productivity and Embracing HTML5 - Arun Gupta (R...Java EE 7 Platform: Boosting Productivity and Embracing HTML5 - Arun Gupta (R...
Java EE 7 Platform: Boosting Productivity and Embracing HTML5 - Arun Gupta (R...
 
Exploring the Talend unified Big Data toolset for sentiment analysis - Ben Br...
Exploring the Talend unified Big Data toolset for sentiment analysis - Ben Br...Exploring the Talend unified Big Data toolset for sentiment analysis - Ben Br...
Exploring the Talend unified Big Data toolset for sentiment analysis - Ben Br...
 
The Curious Clojurist - Neal Ford (Thoughtworks)
The Curious Clojurist - Neal Ford (Thoughtworks)The Curious Clojurist - Neal Ford (Thoughtworks)
The Curious Clojurist - Neal Ford (Thoughtworks)
 
TDD at scale - Mash Badar (UBS)
TDD at scale - Mash Badar (UBS)TDD at scale - Mash Badar (UBS)
TDD at scale - Mash Badar (UBS)
 
Run Your Java Code on Cloud Foundry - Andy Piper (Pivotal)
Run Your Java Code on Cloud Foundry - Andy Piper (Pivotal)Run Your Java Code on Cloud Foundry - Andy Piper (Pivotal)
Run Your Java Code on Cloud Foundry - Andy Piper (Pivotal)
 
Put your Java apps to sleep? Find out how - John Matthew Holt (Waratek)
Put your Java apps to sleep? Find out how - John Matthew Holt (Waratek)Put your Java apps to sleep? Find out how - John Matthew Holt (Waratek)
Put your Java apps to sleep? Find out how - John Matthew Holt (Waratek)
 
Project Lambda: Functional Programming Constructs in Java - Simon Ritter (Ora...
Project Lambda: Functional Programming Constructs in Java - Simon Ritter (Ora...Project Lambda: Functional Programming Constructs in Java - Simon Ritter (Ora...
Project Lambda: Functional Programming Constructs in Java - Simon Ritter (Ora...
 
Do You Like Coffee with Your dessert? Java and the Raspberry Pi - Simon Ritte...
Do You Like Coffee with Your dessert? Java and the Raspberry Pi - Simon Ritte...Do You Like Coffee with Your dessert? Java and the Raspberry Pi - Simon Ritte...
Do You Like Coffee with Your dessert? Java and the Raspberry Pi - Simon Ritte...
 
Little words of wisdom for the developer - Guillaume Laforge (Pivotal)
Little words of wisdom for the developer - Guillaume Laforge (Pivotal)Little words of wisdom for the developer - Guillaume Laforge (Pivotal)
Little words of wisdom for the developer - Guillaume Laforge (Pivotal)
 
Designing Resilient Application Platforms with Apache Cassandra - Hayato Shim...
Designing Resilient Application Platforms with Apache Cassandra - Hayato Shim...Designing Resilient Application Platforms with Apache Cassandra - Hayato Shim...
Designing Resilient Application Platforms with Apache Cassandra - Hayato Shim...
 

Kürzlich hochgeladen

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 

Kürzlich hochgeladen (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 

A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van Landeghem - DataCrunchers

Hinweis der Redaktion

  1. How much data do you have? 44 times as much data in the next decade, 15 Zb in 2015Data silos (erp, crm, …)CustomersTrimble (3Tb in hun database systeem)Truvo (wijzigen van een index duurt 24u)Traditionele systemen kunnen dit volume niet aan.How many data do you have?Turn 12 terabytes of Tweets created each day into improved product sentiment analysisConvert 350 billion annual meter readings to better predict power consumption
  2. Real timeTime sensitivedecisiontakingFrauddetectionEnergy allocationMarketing campaignsMarket transactionsSolution:Real-time solutions in combination with batch (hadoop)Nosql systems
  3. StructuredUnstructured80% is unstructured data, A key drawback of using traditional relational database systems is that they're not good at handling variable data. A flexible data modelWord, email, foto, text, video, APIs, …?What are your needs regarding variety?The end result: bringingstructureintounstructured dataMonitor 100’s of live video feeds from surveillance cameras to target points of interestExploit the 80% data growth in images, video and documents to improve customer satisfaction
  4. We can afford to keep Immutable Copies of lots of data.We NEED immutability to Coordinate with fewer challenges.Semaphores & Locks are the things to avoid: Instruction opportunities lost waiting for a semaphore increase with more cores…
  5. The # of followers on Twitter = all follows & unfollows combined.Account balance
  6. Data = eventIn an ever changing world we found a ‘safe heaven’ for dataEverything we do generates events:Pay with Credit CardCommit to GitClick on a webpageTweet
  7. It is easier to store all data in a cost effective way.Compare to DWH world.
  8. Immutability greatly restricts the range of errors that can cause data loss or data corruption.Ex. Only CR, no more CRUD.Information might of course change.Fault ToleranceData lossHuman error, Hardware failureData CorruptionParallel met functioneelprogrammeren.
  9. Allows state regeneration. Eg. What was my bank balance on 1 may 2005?
  10. Queries as pure functions that take all data as input is the most general formulation.Different functions may look at different portions and aggregate information in different ways.
  11. Too slow; might be petabyte scaleImpala/Drill: why not
  12. The batch layer can calculate anything (given enough time).
  13. The batch layer stores the data normalized, but in the views it generates, data is often, if not always de normalized.
  14. Not vertically
  15. It’s OK to croak and restart
  16. Is something really immutable when it’s name can change.
  17. Doesn’t have to be Hadoop. The importance here is a Distributed FS combined with a processing framework.Spark,
  18. Source: PolybasePass2012.pptxhttp://whyjava.wordpress.com/2011/08/04/how-i-explained-mapreduce-to-my-wife/
  19. http://www.quora.com/Apache-Hadoop/What-is-the-advantage-of-writing-custom-input-format-and-writable-versus-the-TextInputFormat-and-Text-writable/answer/Eric-Sammer?srid=PU&st=nsValue of schemas• Structural integrity• Guarantees on what can and can’t be stored• Prevents corruptionOtherwise you’ll detect corruption issues at read-time
  20. http://www.quora.com/Apache-Hadoop/What-is-the-advantage-of-writing-custom-input-format-and-writable-versus-the-TextInputFormat-and-Text-writable/answer/Eric-Sammer?srid=PU&st=ns
  21. Maarkanopgelostworden, door bvb ES je views op voorhandtegenereren.
  22. In some circumstances.
  23. All the complexity of *dealing* with the CAP theorem (like read repair) is isolated in the realtime layer.
  24. Consistency (all nodes see the same data at the same time)Availability (a guarantee that every request receives a response about whether it was successful or failed)Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system)http://codahale.com/you-cant-sacrifice-partition-tolerance/Hbasavs Cassandra
  25. Eg. Unique countsML
  26. Nimbus:Manages the clusterWorker Node:Supervisor:Manages workers; restarts them if neededExecuterPhysical JVM process.Execute tasks (those are spread evenly across the workers)TasksEach in his own Thread. Is the actual Bolt or Spout.Processes the stream.
  27. Tuple:Named list of valuesDynamicly typedStreamSequence of Tuples
  28. SpoutSource of StreamsSometimes replayableBoltStream transformationsAt least 1 input stream0 - * output streams
  29. The serving layer needs to be able to answer any query in a short amount of time.
  30. AVG = sum + count; preaggregate, but not everything is possible.
  31. Lambda first named by Alonzo Church, he needed a letter for functional abstraction in theory of computation in the 1930s.
  32. High tolerance for human & system errors.
  33. http://www.quora.com/Apache-Hadoop/What-is-the-advantage-of-writing-custom-input-format-and-writable-versus-the-TextInputFormat-and-Text-writable/answer/Eric-Sammer?srid=PU&st=ns
  34. Data storage layer optimized independently from query resolution layer
  35. If you remember one thing about this presentation is: Immutability.