Swiss Transport in Real Time:
Tribulations in the Big Data Stack
Alexandre Masselot
Soft-shake, Geneva
October 2016
Is it possible to build
a simple scalable infrastructure, to
dispatch, store, transform

and visualize “near real time” data
and achieve a posteriori analysis?
This is only
a POC!!!
Finding a dataset
• social media
• finance
• sport
• energy
• transport
• log analysis
• meteorology
• bioinformatics
• personalized health
• monitoring
• security
• IOT
www.voev.ch
AAGL Autobus AG Liestal
AAGR Auto AG Rothenburg
AAGS Auto AG Schwyz
AAGU AUTO AG URI
AB Appenzeller Bahnen AG
ABl Autolinee Bleniesi SA
ABF Autobusbetrieb Freienbach
AFA Automobilverkehr Frutigen Adelboden AG
AMSA Autolinea Mendrisiense SA
AOT Autokurse Oberthurgau AG
ARAG Rottal Auto AG
ARBAG Aletsch Riederalp Bahnen AG
ARL Autolinee Regionali Luganesi
AS Autobetrieb Sernftal AG
ASGS Autotransports Sion-Grône-Sierre
ASm Aare Seeland mobil AG
AVG Autoverkehr Grindelwald AG
AVJ Autotransports de la Vallée de Joux
AWA Autobetrieb Weesen-Amden
AZZK Autobus Zürich-Zollikon-Küsnacht
BB Bürgenstock Bahnen
BBA Busbetrieb Aarau AAR bus+bahn
BBBW Bus-Betrieb Binggeli
BDWM BDWM Transport AG
BGU BGU Busbetrieb Grenchen und Umgebung AG
BLAG Busland AG
BLM Bergbahn Lauterbrunnen-Mürren AG
BLS BLS AG
BLT BLT Baselland Transport AG
BLWE Busbetrieb Lichtensteig-Wattwil-Ebnat-Kappel
BOB Berner Oberland-Bahnen AG
BOGG Busbetrieb Olten Gösgen Gäu AG
BOS BUS Ostschweiz AG
BOS-M BOS Management AG
BRB Brienz Rothorn Bahn AG
BRER Busbetrieb Rapperswil-Eschenbach-Rüti
BRSB Braunwald-Standseilbahn AG
BSU Busbetrieb Solothurn und Umgebung AG
BVB Basler Verkehrs-Betriebe
CGN CGN SA
CJ Compagnie des chemins de fer du Jura (C.J.) SA
CROS Crossrail AG
DBSCH DB Schenker Rail Schweiz GmbH
DBZ Dolderbahn Zürich
ETB Emmentalbahn, Huttwil
FART Ferrovie Autolinee Regionali Ticinesi
FB Forchbahn AG
FC FUNICAR Kursbetriebe AG
FLP Ferrovie Luganesi SA
FW Frauenfeld-Wil-Bahn AG
GGB Gornergrat Bahn AG
HBSAG Hafenbahn Schweiz AG
JB Jungfraubahn AG
LEB Chemin de fer Lausanne-Echallens-Bercher
LLB AG für Verkehrsbetriebe Leuk-Leukerbad und Umgebung
LSMS Schilthornbahn AG
MBC Transports de la région Morges-Bière-Cossonay SA
MG Ferrovia Monte Generoso SA
MGB Matterhorn Gotthard Bahn
MIB Kraftwerke Oberhasli AG Meiringen-Innertkirchen-Bahn
MOB Chemin de fer Montreux-Oberland Bernois
MVR Transports Montreux-Vevey-Riviera SA
NHB Niederhornbahn
NB Niesenbahn AG
NStCM Chemin de fer Nyon-St. Cergue-Morez
OeBB Oensingen-Balsthal-Bahn
PAG PostAuto Schweiz AG
PB PILATUS-BAHNEN AG
RA RegionAlps SA
RAILG Railgate AG
RB RIGI BAHNEN AG
RBL Regionalbus Lenzburg AG
RBS Regionalverkehr Bern-Solothurn AG
REGO Regiobus Gossau AG
RhB Rhätische Bahn AG
RNCH DB Schenker Rail Schweiz GmbH
RLC railCare
RVBW Regionale Verkehrsbetriebe Baden-Wettingen AG
RVSH SchaffhausenBus, Regionale Verkehrsbetriebe SH AG
SBB SBB AG
SBB-D SBB GmbH
SBC Stadtbus Chur AG
SBF Stadtbus Frauenfeld
SBW Stadtbus Winterthur
SMC Cie de Chemin de Fer+d'Autobus Sierre-Montana-Crans (SMC) SA
SMGN Société des Mouettes Genevoises Navigation SA
SMtS Funiculaire St-Imier - Mont-Soleil SA
SOB Schweizerische Südostbahn AG
SRTAG Swiss Rail Traffic AG
SSIF Società Subalpina di Imprese Ferroviarie S.p.A.
ST Sursee-Triengen-Bahn
STB Sensetalbahn AG
STI Verkehrsbetriebe STI AG
SVB BERNMOBIL Städt. Verkehrsbetriebe Bern
SWAG Seilbahn Weissenstein AG
SZU Sihltal Zürich Uetliberg Bahn SZU AG
THURBO Thurbo AG
TL Transports publics de la région lausannoise SA
TMR TRANSPORTS DE MARTIGNY ET REGIONS SA
TPC Transports Publics du Chablais SA
TPF Transports publics fribourgeois SA
TPG Transports publics genevois
TPL Trasporti Pubblici Luganesi SA
TPN Transports Publics de la Région Nyonnaise SA
TRN Transports Publics Neuchâtelois SA
TRAVYS TRAVYS SA Transports Vallée de Joux-Yverdon-Sainte-Croix
TSD Theytaz Excursions Sion
VB Verkehrsbetriebe Biel
VBD Verkehrsbetrieb der Landschaft Davos
VBG VBG Verkehrsbetriebe Glattal AG
VBH Verkehrsbetriebe Herisau
VBL Verkehrsbetriebe Luzern AG
VBSG Verkehrsbetriebe St.Gallen
VBSH Verkehrsbetriebe Schaffhausen
VBZ Verkehrsbetriebe Zürich
VMCV Transports publics Vevey-Montreux-Chillon-Villeneuve
VSSU Verband Schweizerischer Schifffahrtsunternehmen
VZO Verkehrsbetriebe Zürichsee und Oberland AG
WAB Wengernalpbahn AG
WB Waldenburgerbahn AG
WRS Widmer Rail Services Personal AG
WSB Wynental- und Suhrentalbahn AAR bus+bahn
ZB zb Zentralbahn AG
ZVB Zugerland Verkehrsbetriebe AG
ZVV Zürcher Verkehrsverbund ZVV
AES Ägerisee Schifffahrt AG
BLS BLS AG Schifffahrt Berner Oberland Thuner- und Brienzersee
BPG Basler Personenschifffahrt AG
BSG Bielersee-Schifffahrts-Gesellschaft AG
CGN CGN SA
FHM Zürichsee-Fähre Horgen-Meilen AG
LNM Société de Navigation Lacs de Neuchâtel et Morat SA
NLM Navigazione Lago Maggiore
SBS SBS Schifffahrt AG
SGG Schifffahrts-Genossenschaft Greifensee
SGH Schifffahrtsgesellschaft Hallwilersee AG
SGV Schifffahrtsgesellschaft des Vierwaldstättersees
SGZ Schifffahrtsgesellschaft für den Zugersee AG / Ägerisee
SNL Società Navigazione del Lago di Lugano SA
SW Schiffsbetrieb Walensee AG
URh Schweiz. Schifffahrtsgesellschaft Untersee und Rhein AG
ZSG Zürichsee-Schifffahrtsgesellschaft AG
What do we propose?
https://github.com/alexmasselot/swiss-transport-realtime
Is it possible to build
a simple scalable infrastructure, to
dispatch, transform and visualize

“near real time” massive data
and achieve a posteriori analysis?
[Diagram: inputs (vehicles positions, station boards); paths (real time, offline); consumers (users, data analysts)]
[Diagram: pipeline stages between the inputs (vehicles positions, station boards) and the consumers (users, data analysts), on the real time and offline paths: dispatch, format, storage, transform, expose, analysis, visualization]
Acquire
• vehicles positions: SBB rest api
• station boards: OpenData transport api
{
  "id": "12345xyz",
  "category": "IR",
  "name": "IR 72928",
  "destination": "Alpnach",
  "position": {
    "lat": 46.940582,
    "lon": 8.275442
  }
}
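A minimal sketch of how one such position event might be decoded on the consuming side, assuming the payload arrives as valid JSON with the field names shown above. The `VehiclePosition` class and the `parse_position` helper are hypothetical names introduced for illustration; they are not taken from the project.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class VehiclePosition:
    """One vehicle position event (field names taken from the slide)."""
    id: str
    category: str
    name: str
    destination: str
    lat: float
    lon: float


def parse_position(raw: str) -> VehiclePosition:
    """Decode a JSON position event and flatten the nested position object."""
    doc = json.loads(raw)
    return VehiclePosition(
        id=doc["id"],
        category=doc["category"],
        name=doc["name"],
        destination=doc["destination"],
        lat=float(doc["position"]["lat"]),
        lon=float(doc["position"]["lon"]),
    )


sample = """{
  "id": "12345xyz", "category": "IR", "name": "IR 72928",
  "destination": "Alpnach",
  "position": {"lat": 46.940582, "lon": 8.275442}
}"""
event = parse_position(sample)
print(event.name, event.lat, event.lon)
```

Flattening `position` into `lat` and `lon` is a design choice of this sketch; it keeps later distance or binning computations free of nested lookups.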
station boards
{
  "station": {
    "name": "Lausanne",
    "location": {lat, long}
  },
  "departures": [
    {
      "to": "Domodossola",
      "time": "20:13",
      "delayed": 4,
      "prognosis": {
        "capacity2nd": 3,
        "capacity1st": 1
      }
    },
    {…}
  ]
}
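As an illustration of what can be done with a station board event, the sketch below lists the delayed departures. It assumes the event has already been decoded into a Python dictionary with the field names shown above; the unit of `delayed` is not stated on the slide, and the second departure in the sample is made up. `delayed_departures` is a hypothetical helper.

```python
from typing import Any


def delayed_departures(board: dict[str, Any], min_delay: int = 1) -> list[str]:
    """Return one readable line per departure delayed by at least min_delay."""
    station = board["station"]["name"]
    lines = []
    for dep in board["departures"]:
        delay = dep.get("delayed") or 0  # a missing or null delay counts as on time
        if delay >= min_delay:
            lines.append(f"{station} -> {dep['to']} at {dep['time']}: +{delay}")
    return lines


board = {
    "station": {"name": "Lausanne"},
    "departures": [
        {"to": "Domodossola", "time": "20:13", "delayed": 4},
        {"to": "Genève", "time": "20:15", "delayed": None},  # made-up second entry
    ],
}
print(delayed_departures(board))
```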
Dispatch
Events are streamed to Kafka
“Kafka is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.”
kafka.apache.org
Kafka, RabbitMQ, ZeroMQ…
TIMTOWTDI (there is more than one way to do it)
Store
[Diagram: dispatch, format (logstash), storage (elasticsearch, flat files)]
Logstash, Flume, Filebeat…
TIMTOWTDI
Elasticsearch, HBase, Cassandra…
TIMTOWTDI
[Diagram: the real time path: dispatch, transform, expose, visualization]
Stream transformation
• We have an input flow of events and want to:
• know if a train is stopped in a station;
• know if a train has exited the network;
• expose an aggregated station board.
• We need to:
• digest the input flow;
• process with temporary state persistence;
• be able to expose snapshots.
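The three needs listed above (digest the flow, keep temporary state, expose snapshots) can be sketched in a few lines. This is a simplified stand-in, not the project's implementation: it treats "no movement between consecutive events" as stopped and ignores station coordinates, and both thresholds (`still_events`, `timeout_s`) are assumptions. `TrainTracker` and the train "S1" are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrainState:
    lat: float
    lon: float
    last_seen: float      # timestamp of the latest event, in seconds
    still_count: int = 0  # consecutive events without movement


class TrainTracker:
    """Keeps temporary per-train state and exposes snapshots of it."""

    def __init__(self, still_events: int = 2, timeout_s: float = 120.0):
        self.still_events = still_events  # assumed threshold
        self.timeout_s = timeout_s        # assumed threshold
        self.trains = {}                  # train id -> TrainState

    def on_event(self, train_id: str, lat: float, lon: float, ts: float) -> None:
        """Digest one position event and update that train's state."""
        prev = self.trains.get(train_id)
        if prev is not None and (prev.lat, prev.lon) == (lat, lon):
            still = prev.still_count + 1
        else:
            still = 0
        self.trains[train_id] = TrainState(lat, lon, ts, still)

    def evict_exited(self, now: float) -> list[str]:
        """Drop trains not seen for more than timeout_s seconds; return their ids."""
        gone = [tid for tid, s in self.trains.items()
                if now - s.last_seen > self.timeout_s]
        for tid in gone:
            del self.trains[tid]
        return gone

    def snapshot(self) -> dict[str, bool]:
        """Map each tracked train id to whether it currently looks stopped."""
        return {tid: s.still_count >= self.still_events
                for tid, s in self.trains.items()}


tracker = TrainTracker()
for ts in (0.0, 10.0, 20.0):
    tracker.on_event("IR 72928", 46.940582, 8.275442, ts)  # same place three times
tracker.on_event("S1", 46.5, 6.6, 0.0)                      # hypothetical second train
tracker.on_event("S1", 46.6, 6.7, 20.0)

snap = tracker.snapshot()
gone = tracker.evict_exited(now=200.0)
print(snap, gone)
```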
Stream transformation
• Scala is The language for Big Data (functional & OO)

• Akka (actors):
• lightweight entities (one per train, per station);
• easy asynchronous communications;
• the perfect use case.
• Play framework for REST service, configuration etc.
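The "one lightweight entity per train" idea can be illustrated without Akka. The sketch below is a plain-Python stand-in, not Akka code: each entity owns private state and a mailbox, and a router creates entities on first use. Real actors also process their mailboxes asynchronously and run under supervision, which is left out here. `TrainEntity` and `Router` are hypothetical names.

```python
from collections import deque


class TrainEntity:
    """Stand-in for an actor: private state plus a mailbox, one message at a time."""

    def __init__(self, train_id: str):
        self.train_id = train_id
        self.mailbox = deque()
        self.positions_seen = 0
        self.last_position = None

    def tell(self, message: dict) -> None:
        """Enqueue a message; the sender does not wait for a result."""
        self.mailbox.append(message)

    def drain(self) -> None:
        """Process queued messages in arrival order."""
        while self.mailbox:
            msg = self.mailbox.popleft()
            self.positions_seen += 1
            self.last_position = (msg["lat"], msg["lon"])


class Router:
    """Creates one entity per train id on first use and routes events to it."""

    def __init__(self):
        self.entities = {}

    def route(self, event: dict) -> None:
        train_id = event["id"]
        if train_id not in self.entities:
            self.entities[train_id] = TrainEntity(train_id)
        self.entities[train_id].tell(event)

    def run(self) -> None:
        """Let every entity process its pending messages."""
        for entity in self.entities.values():
            entity.drain()


router = Router()
router.route({"id": "a", "lat": 46.9, "lon": 8.2})
router.route({"id": "b", "lat": 46.5, "lon": 6.6})
router.route({"id": "a", "lat": 47.0, "lon": 8.3})
router.run()
print({tid: e.positions_seen for tid, e in router.entities.items()})
```

Because state is only touched by the entity that owns it, no locking is needed around a train's data, which is the property that makes this model a good fit for per-train and per-station logic.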
Spark Streaming, Storm, Flink…
TIMTOWTDI
DevOps: putting everything together
• The “simple” infrastructure is not so light;
• A developer should have everything on his/her laptop without polluting the machine;
• Docker comes to the rescue:
• lightweight containers,
• pre-existing images,
• docker-compose to describe the infrastructure,
• deploy directly to AWS or GCE.
Performance: 2 numbers
15% CPU: nodeJS + kafka + akka + play
15x faster ajax queries (vs SBB rest) to gather 30 times more trains
A scalable infrastructure
• Kafka: partitioning and zookeeper
• Logstash: ? (but naturally recovers on failure)
• Elasticsearch: partitioning
• Spark streaming: distributed by essence & write ahead logs
• Akka: Akka cluster, supervisors & failure strategy
• Docker: Kubernetes, AWS, GCE, Exoscale
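Key-based partitioning is what lets a topic be spread over several brokers while keeping the events of one train in order. The sketch below shows the principle only: it uses MD5 as a stable hash, which is an assumption made for this example and not the hash function Kafka itself uses. `partition_for` and the train "S1" are hypothetical.

```python
import hashlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Pick a partition from a stable hash of the key (MD5, chosen for this sketch)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


partitions = defaultdict(list)
events = [("IR 72928", 1), ("S1", 1), ("IR 72928", 2), ("S1", 2), ("IR 72928", 3)]
for train, seq in events:
    partitions[partition_for(train, num_partitions=3)].append((train, seq))

for p, items in sorted(partitions.items()):
    print(p, items)
```

All events sharing a key land in the same partition, so a consumer reading that partition sees them in the order they were produced.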
JS for large data set: React
• Only a rendering library (but fast);
• Use a flux architecture;
• Built by Facebook.
[Diagram: flux architecture: Action, Dispatcher, Store, View, and a new Action emitted from the View]
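The unidirectional flow of a flux architecture (an action goes to the dispatcher, the dispatcher notifies stores, stores notify views) can be sketched independently of React. The example is written in Python to stay short; the class names, the `POSITION_RECEIVED` action type, and the view function are all hypothetical.

```python
from typing import Callable


class Dispatcher:
    """Single hub: every action passes through here to all registered stores."""

    def __init__(self):
        self._callbacks = []

    def register(self, callback: Callable[[dict], None]) -> None:
        self._callbacks.append(callback)

    def dispatch(self, action: dict) -> None:
        for callback in self._callbacks:
            callback(action)


class TrainStore:
    """Holds application state; it changes only in response to dispatched actions."""

    def __init__(self, dispatcher: Dispatcher):
        self.positions = {}
        self._listeners = []
        dispatcher.register(self._on_action)

    def subscribe(self, listener: Callable[[], None]) -> None:
        self._listeners.append(listener)

    def _on_action(self, action: dict) -> None:
        if action["type"] == "POSITION_RECEIVED":
            self.positions[action["id"]] = (action["lat"], action["lon"])
            for listener in self._listeners:
                listener()


rendered = []
dispatcher = Dispatcher()
store = TrainStore(dispatcher)


def view() -> None:
    """Stand-in for a view: re-renders from the store whenever it changes."""
    rendered.append(f"{len(store.positions)} train(s) on the map")


store.subscribe(view)
dispatcher.dispatch({"type": "POSITION_RECEIVED", "id": "a", "lat": 46.9, "lon": 8.2})
dispatcher.dispatch({"type": "POSITION_RECEIVED", "id": "b", "lat": 46.5, "lon": 6.6})
dispatcher.dispatch({"type": "UNKNOWN"})  # ignored by the store, so no re-render
print(rendered)
```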
JavaScript for big data viz
• React can handle viz >100k elements (don’t show them individually!);
• Beware of performance issues;
• Testing is not optional.
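One way to avoid showing elements individually is to aggregate them into grid cells before rendering, so the view draws one mark per cell instead of one per point. The sketch below shows that aggregation step in Python; the 0.1 degree cell size and the sample coordinates are assumptions, and `bin_points` is a hypothetical helper rather than the project's code.

```python
import math
from collections import Counter


def bin_points(points, cell_deg: float = 0.1) -> Counter:
    """Count points per grid cell; a cell is identified by its integer (row, col) index."""
    cells = Counter()
    for lat, lon in points:
        cells[(math.floor(lat / cell_deg), math.floor(lon / cell_deg))] += 1
    return cells


# Illustrative coordinates only.
points = [(46.94, 8.27), (46.95, 8.28), (46.52, 6.63), (46.21, 6.14)]
cells = bin_points(points, cell_deg=0.1)
for cell, count in sorted(cells.items()):
    print(cell, count)
```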
4.5 months of data
A. What is the train occupancy during weekdays, between Lausanne and Geneva?
B. When are the trains most delayed?
C. Where are the trains most delayed?
A. Lausanne-Genève: when to have a seat?
[Charts: Lausanne-Genève train occupancy, annotated “Good luck in finding a spot!”, “Wake up earlier!” and “or pay…”]
B. When are the trains most delayed?
C. Where are the trains most delayed?
[Chart legend: Trains Expected, Trains Delayed]
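Question B comes down to a group-by on the hour of day. The sketch below shows the computation on a handful of made-up records; the record layout (ISO timestamp, delay) and the helper `mean_delay_by_hour` are assumptions for illustration, and none of the numbers come from the collected data.

```python
from collections import defaultdict
from datetime import datetime


def mean_delay_by_hour(records):
    """Average the delay per hour of day; records are (ISO timestamp, delay) pairs."""
    totals = defaultdict(lambda: [0.0, 0])  # hour -> [sum of delays, count]
    for ts, delay in records:
        hour = datetime.fromisoformat(ts).hour
        totals[hour][0] += delay
        totals[hour][1] += 1
    return {hour: total / count for hour, (total, count) in sorted(totals.items())}


# Made-up records for illustration only.
records = [
    ("2016-09-05T07:42:00", 4),
    ("2016-09-05T07:55:00", 2),
    ("2016-09-05T18:10:00", 6),
    ("2016-09-06T18:20:00", 2),
]
by_hour = mean_delay_by_hour(records)
worst_hour = max(by_hour, key=by_hour.get)
print(by_hour, worst_hour)
```

Question C follows the same pattern with the station name as the grouping key instead of the hour.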
Data analysis tooling…
…or “reproducible science”
a data science notebook
• Web application
• Interactively edit and run pieces of code (analysis steps)
• Inclined towards Python (although other languages are available)
• Beware of performance with large datasets (sample data or use Spark mode)
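Sampling is the cheaper of the two options mentioned for large datasets. A minimal sketch, assuming rows arrive as an iterable too large to hold in memory: reservoir sampling keeps a fixed-size uniform sample in a single pass. The synthetic rows and the `sample_rows` helper are hypothetical.

```python
import random


def sample_rows(rows, k: int, seed: int = 42):
    """Reservoir sampling: keep k rows chosen uniformly from an iterable of unknown size."""
    rng = random.Random(seed)  # fixed seed so a notebook run is reproducible
    reservoir = []
    for i, row in enumerate(rows):
        if i < k:
            reservoir.append(row)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = row
    return reservoir


# Synthetic stand-in data, generated lazily.
rows = ({"train": n, "delayed": n % 7} for n in range(100_000))
subset = sample_rows(rows, k=1_000)
print(len(subset))
```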
Jupyter, Zeppelin, RStudio…
TIMTOWTDI
https://github.com/alexmasselot/swiss-transport-realtime
@alex_mass
amasselot@octo.com
Nov 8th 7 pm, Genève: “Banknote Recognition System” (Machine Learning)
Nov 10th 6 pm, Genève: “Data Science & Machine Learning: Explorer, Comprendre Et Prédire”
Demo on OCTO stand

Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 

Swiss Transport in Real Time: Tribulations in the Big Data Stack

  • 1. Swiss Transport in Real Time: Tribulations in the Big Data Stack Alexandre Masselot Soft-shake, Geneva October 2016
  • 3. Is it possible to build a simple scalable infrastructure, to dispatch, store, transform and visualize “near real time” data and achieve a posteriori analysis? This is only a POC!!!
  • 4. Finding a dataset • social media • finance • sport • energy • transport • log analysis • meteorology • bioinformatics • personalized health • monitoring • security • IOT
  • 10. AAGL Autobus AG Liestal AAGR Auto AG Rothenburg AAGS Auto AG Schwyz AAGU AUTO AG URI AB Appenzeller Bahnen AG ABl Autolinee Bleniesi SA ABF Autobusbetrieb Freienbach AFA Automobilverkehr Frutigen Adelboden AG AMSA Autolinea Mendrisiense SA AOT Autokurse Oberthurgau AG ARAG Rottal Auto AG ARBAG Aletsch Riederalp Bahnen AG ARL Autolinee Regionali Luganesi AS Autobetrieb Sernftal AG ASGS Autotransports Sion-Grône-Sierre ASm Aare Seeland mobil AG AVG Autoverkehr Grindelwald AG AVJ Autotransports de la Vallée de Joux AWA Autobetrieb Weesen-Amden AZZK Autobus Zürich-Zollikon-Küsnacht BB Bürgenstock Bahnen BBA Busbetrieb Aarau AAR bus+bahn BBBW Bus-Betrieb Binggeli BDWM BDWM Transport AG BGU BGU Busbetrieb Grenchen und Umgebung AG BLAG Busland AG BLM Bergbahn Lauterbrunnen-Mürren AG BLS BLS AG BLT BLT Baselland Transport AG BLWE Busbetrieb Lichtensteig-Wattwil-Ebnat-Kappel BOB Berner Oberland-Bahnen AG BOGG Busbetrieb Olten Gösgen Gäu AG BOS BUS Ostschweiz AG BOS-M BOS Management AG BRB Brienz Rothorn Bahn AG BRER Busbetrieb Rapperswil-Eschenbach-Rüti BRSB Braunwald-Standseilbahn AG BSU Busbetrieb Solothurn und Umgebung AG BVB Basler Verkehrs-Betriebe CGN CGN SA CJ Compagnie des chemins de fer du Jura (C.J.) SA CROS Crossrail AG DBSCH DB Schenker Rail Schweiz GmbH DBZ Dolderbahn Zürich ETB Emmentalbahn, Huttwil FART Ferrovie Autolinee Regionali Ticinesi FB Forchbahn AG FC FUNICAR Kursbetriebe AG FLP Ferrovie Luganesi SA FW Frauenfeld-Wil-Bahn AG GGB Gornergrat Bahn AG HBSAG Hafenbahn Schweiz AG JB Jungfraubahn AG LEB Chemin de fer Lausanne-Echallens-Bercher LLB AG für Verkehrsbetriebe Leuk-Leukerbad und Umgebung LSMS Schilthornbahn AG MBC Transports de la région Morges-Bière-Cossonay SA MG Ferrovia Monte Generoso SA MGB Matterhorn Gotthard Bahn MIB Kraftwerke Oberhasli AG Meiringen-Innertkirchen-Bahn MOB Chemin de fer Montreux-Oberland Bernois MVR Transports Montreux-Vevey-Riviera SA NHB Niederhornbahn NB Niesenbahn AG NStCM Chemin de fer Nyon-St. 
Cergue-Morez OeBB Oensingen-Balsthal-Bahn PAG PostAuto Schweiz AG PB PILATUS-BAHNEN AG RA RegionAlps SA RAILG Railgate AG RB RIGI BAHNEN AG RBL Regionalbus Lenzburg AG RBS Regionalverkehr Bern-Solothurn AG REGO Regiobus Gossau AG RhB Rhätische Bahn AG RNCH DB Schenker Rail Schweiz GmbH RLC railCare RVBW Regionale Verkehrsbetriebe Baden-Wettingen AG RVSH SchaffhausenBus, Regionale Verkehrsbetriebe SH AG SBB SBB AG SBB-D SBB GmbH SBC Stadtbus Chur AG SBF Stadtbus Frauenfeld SBW Stadtbus Winterthur SMC Cie de Chemin de Fer+d'Autobus Sierre-Montana-Crans (SMC) SA SMGN Société des Mouettes Genevoises Navigation SA SMtS Funiculaire St-Imier - Mont-Soleil SA SOB Schweizerische Südostbahn AG SRTAG Swiss Rail Traffic AG SSIF Società Subalpina di Imprese Ferroviarie S.p.A. ST Sursee-Triengen-Bahn STB Sensetalbahn AG STI Verkehrsbetriebe STI AG SVB BERNMOBIL Städt. Verkehrsbetriebe Bern SWAG Seilbahn Weissenstein AG SZU Sihltal Zürich Uetliberg Bahn SZU AG THURBO Thurbo AG TL Transports publics de la région lausannoise SA TMR TRANSPORTS DE MARTIGNY ET REGIONS SA TPC Transports Publics du Chablais SA TPF Transports publics fribourgeois SA TPG Transports publics genevois TPL Trasporti Pubblici Luganesi SA TPN Transports Publics de la Région Nyonnaise SA TRN Transports Publics Neuchâtelois SA TRAVYS TRAVYS SA Transports Vallée de Joux-Yverdon-Sainte-Croix TSD Theytaz Excursions Sion VB Verkehrsbetriebe Biel VBD Verkehrsbetrieb der Landschaft Davos VBG VBG Verkehrsbetriebe Glattal AG VBH Verkehrsbetriebe Herisau VBL Verkehrsbetriebe Luzern AG VBSG Verkehrsbetriebe St.Gallen VBSH Verkehrsbetriebe Schaffhausen VBZ Verkehrsbetriebe Zürich VMCV Transports publics Vevey-Montreux-Chillon-Villeneuve VSSU Verband Schweizerischer Schifffahrtsunternehmen VZO Verkehrsbetriebe Zürichsee und Oberland AG WAB Wengernalpbahn AG WB Waldenburgerbahn AG WRS Widmer Rail Services Personal AG WSB Wynental- und Suhrentalbahn AAR bus+bahn ZB zb Zentralbahn AG ZVB Zugerland Verkehrsbetriebe AG ZVV 
Zürcher Verkehrsverbund ZVV AES Ägerisee Schifffahrt AG BLS BLS AG Schifffahrt Berner Oberland Thuner- und Brienzersee BPG Basler Personenschifffahrt AG BSG Bielersee-Schifffahrts-Gesellschaft AG CGN CGN SA FHM Zürichsee-Fähre Horgen-Meilen AG LNM Société de Navigation Lacs de Neuchâtel et Morat SA NLM Navigazione Lago Maggiore SBS SBS Schifffahrt AG SGG Schifffahrts-Genossenschaft Greifensee SGH Schifffahrtsgesellschaft Hallwilersee AG SGV Schifffahrtsgesellschaft des Vierwaldstättersees SGZ Schifffahrtsgesellschaft für den Zugersee AG / Ägerisee SNL Società Navigazione del Lago di Lugano SA SW Schiffsbetrieb Walensee AG URh Schweiz. Schifffahrtsgesellschaft Untersee und Rhein AG ZSG Zürichsee-Schifffahrtsgesellschaft AG
  • 13. What do we propose? https://github.com/alexmasselot/swiss-transport-realtime
  • 16. Is it possible to build a simple scalable infrastructure, to dispatch, transform and visualize “near real time” massive data and achieve a posteriori analysis?
  • 28. Positions: { id: 12345xyz, category: IR, name: IR 72928, destination: Alpnach, position: { lat: 46.940582, lon: 8.275442 } }
  • 29. Station boards: { station: { name: Lausanne, location: {lat, long} }, departures: [ { to: Domodossola, time: 20:13, delayed: 4, prognosis: { capacity2nd: 3, capacity1st: 1 } }, {…} ] }
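The position event shown on the slide could be modelled as follows. This is a minimal Python sketch, not the project's code (the talk mentions NodeJS and Scala); the names `TrainPosition` and `parse_position` are hypothetical, and only the fields visible on the slide are assumed.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainPosition:
    """One vehicle position event, with the fields shown on the slide."""
    id: str
    category: str
    name: str
    destination: str
    lat: float
    lon: float


def parse_position(raw: str) -> TrainPosition:
    """Parse a JSON position event and flatten the nested `position` object."""
    doc = json.loads(raw)
    return TrainPosition(
        id=doc["id"],
        category=doc["category"],
        name=doc["name"],
        destination=doc["destination"],
        lat=float(doc["position"]["lat"]),
        lon=float(doc["position"]["lon"]),
    )


# The sample values are the ones printed on the slide.
raw_event = json.dumps({
    "id": "12345xyz",
    "category": "IR",
    "name": "IR 72928",
    "destination": "Alpnach",
    "position": {"lat": 46.940582, "lon": 8.275442},
})
event = parse_position(raw_event)
print(event.name, event.destination, event.lat, event.lon)
```

Flattening the coordinates keeps later distance checks simple; a station board event could be modelled the same way.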
  • 31. Events are streamed to Kafka: “Kafka is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.” (kafka.apache.org)
  • 32. Events are streamed to Kafka, then consumed along two paths: real time and offline.
  • 37. Store: format, dispatch, storage (Logstash, Elasticsearch, flat files)
  • 43. Stream transformation • We have an input flow of events and want to: • know if a train is stopped in a station; • know if a train has exited the network; • expose an aggregated station board. • We need to: • digest the input flow; • process with temporary state persistence; • be able to expose snapshots.
  • 44. Stream transformation • Scala is The language for Big Data (functional & OO) • Akka (actors): • lightweight entities (one per train, per station); • easy asynchronous communications; • the perfect use case. • Play framework for REST service, configuration etc.
  • 45. Spark Streaming, Storm, Flink… TIMTOWTDI
  • 48. Putting everything together • The “simple” infrastructure is not so light; • A developer should have everything on his/her laptop without polluting the machine; • Docker comes to the rescue: • lightweight containers, • pre-existing images, • docker-compose to describe the infrastructure, • deploy directly to AWS or GCE.
  • 53. Performance: 2 numbers • 15% CPU: nodeJS + kafka + akka + play • 15x faster ajax queries (vs SBB REST) to gather 30 times more trains
  • 56. A scalable infrastructure • Kafka: partitioning and ZooKeeper • Logstash: ? (but naturally recovers on failure) • Elasticsearch: partitioning • Spark Streaming: distributed by essence & write-ahead logs • Akka: Akka cluster, supervisors & failure strategy • Docker: Kubernetes, AWS, GCE, Exoscale
  • 62. React: JS for large data sets • Only a rendering library (but fast); • Use a flux architecture (Action → Dispatcher → Store → View, with views emitting further Actions); • Built by Facebook.
  • 66. JavaScript for big data viz • React can handle viz with >100k elements (don’t show them individually!) • Beware of performance issues; • Testing is not an option.
  • 69. 4.5 months of data A. What is the train occupancy during weekdays, between Lausanne and Geneva? B. When are the trains most delayed? C. Where are the trains most delayed?
  • 73. Lausanne-Genève: when to have a seat? Good luck in finding a spot! Wake up earlier! Or pay…
  • 77. B. When are the trains most delayed?
  • 79. C. Where are the trains most delayed?
  • 86. A data science notebook • Web application • Interactively edit and run pieces of code (analysis steps) • Inclined towards Python (although other languages are available) • Beware of performance with large datasets (sample data or use Spark mode)
  • 91. @alex_massamasselot@octo.com • Nov 8th 7 pm, Genève: “Banknote Recognition System” (Machine Learning) • Nov 10th 6 pm, Genève: “Data Science & Machine Learning: Explorer, Comprendre Et Prédire” • Demo on OCTO stand

Editor's notes

  1. OCTO Lausanne is a firm that does consulting and development. More and more of our clients have Big Data problems; there are technologies we already use with clients and others we want to explore. 2016 is the Big Data year at OCTO, during which we are organizing a series of afterworks. The next one, on data science, is on November 10.
  2. This started as a project done on our 20% R&D time during the […]. Tons of data are available, so the first question was to find an interesting dataset.
  3. We wanted something not directly tied to an ongoing client engagement and that would interest a lot of people. “Who took a boat, a train or a bus to come here this morning?”
  4. What could be more Swiss than taking an interest in public transport?
  5. It is part of the culture. And it is not only the CFF. There are the trains,
  6. buses, especially postal buses. As an aside, most of the project I am presenting today was developed during my train and bus commutes, which a few times meant looking up one station too late…
  7. In short, transport happens on land, on water and in the air,
  8. and what we tend to forget is that it is not only the CFF, but 170 transport companies.
  9. These companies work together under the umbrella of the Union des Transports Publics, which provides, among other things, the general travelcard (abonnement général), but also apps,
  10. apps that give real-time delays and even a map showing the trains. Incidentally, that visualization app is fairly buggy, which made us want to play. Conveniently, the data can be retrieved through APIs.
  11. By comparison, without any optimization, the queries that return all the trains in Switzerland and the one that fetches all the boards take ~300 ms (instead of 5 seconds for 30 times fewer vehicles). The demo is online on Google Cloud, but we are limiting the risks. Let us now get to the heart of the presentation.
  12. So our goal becomes an application along these lines.
  13. Let us get into the hard part and see what a simple architecture covers.
  14. The architecture therefore boils down to: acquiring the position and board data; having a system that collects the data and dispatches it to whoever subscribes; storing the data for a posteriori analysis; processing the data (is a train in a station? has it left the network?…); visualizing. Staying “simple” is not easy in the Big Data stack,
  15. because the first problem, as soon as the buzzword “Big Data” comes up, is the avalanche of technologies. And this graph keeps changing. So let us not forget to stay simple.
  16. We will skim over each building block, trying to relate it to other technologies. One could spend a whole Soft-shake session on each of these blocks. The choices proposed are the fruit of our tribulations, sometimes outside the solutions we know and currently push at our clients; but that is the purpose of this POC. For each technology presented, we will try to mention alternative solutions. Let us get practical, with the first building block.
  17. Right, we have covered the “simple infrastructure” part; now we go into detail. Let us start with acquisition and dispatch.
  18. This is the part that takes the events (train positions and station boards) and serves them to consumers, which then process them. However, we must first address the question of acquisition. Unfortunately, we have access neither to the vehicles' GPS nor to the CFF information system to know what is happening in a station.
  19. A train contains… a station board contains… All these events are collected and pushed further on, to Kafka,
  20. Kafka being a message hub, or message broker.
  21. Kafka is an open source application: producers push rich messages to topics (“vehicle position”, “station board”); consumers subscribe to topics and pull messages; it is multi-language (here we used Scala & NodeJS); beware of changes between releases.
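The producer, topic and consumer model described in this note can be illustrated without a broker. The sketch below is a toy in-memory stand-in written in Python, not the Kafka client API: `TopicLog`, `publish` and `poll` are hypothetical names. It only shows the idea that each topic is an append-only log and that each consumer pulls from its own offset.

```python
from collections import defaultdict


class TopicLog:
    """Toy in-memory stand-in for a message broker (not the Kafka API)."""

    def __init__(self):
        self._logs = defaultdict(list)     # topic -> append-only list of messages
        self._offsets = defaultdict(int)   # (consumer, topic) -> next index to read

    def publish(self, topic, message):
        """Producer side: append a message to the topic's log."""
        self._logs[topic].append(message)

    def poll(self, consumer, topic):
        """Consumer side: return unread messages and advance this consumer's offset."""
        start = self._offsets[(consumer, topic)]
        messages = self._logs[topic][start:]
        self._offsets[(consumer, topic)] = start + len(messages)
        return messages


broker = TopicLog()
broker.publish("vehicle-position", {"id": "12345xyz", "lat": 46.940582, "lon": 8.275442})
broker.publish("station-board", {"station": "Lausanne", "to": "Domodossola"})
broker.publish("vehicle-position", {"id": "12345xyz", "lat": 46.941, "lon": 8.276})

# Two independent consumers read the same topic without affecting each other.
realtime_batch = broker.poll("realtime", "vehicle-position")
storage_batch = broker.poll("storage", "vehicle-position")
second_poll = broker.poll("realtime", "vehicle-position")
print(len(realtime_batch), len(storage_batch), len(second_poll))
```

Because offsets are tracked per consumer, adding a new consumer (for storage, streaming or analysis) does not disturb the existing ones.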
  22. Kafka is not the only solution. Events are then pulled towards storage.
  23. Logstash is widely used for log processing; it has many input and output adapters; it can be a bit resource-hungry. We had several destinations: Elasticsearch, a document store, for ease of installation and querying; and basic flat files (rotated every hour).
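Hourly rotation of flat files amounts to deriving the file name from the event timestamp. In the talk this is handled by Logstash configuration; the Python sketch below only illustrates the bucketing, and the directory layout and the `hourly_path` helper are assumptions.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def hourly_path(base_dir: str, topic: str, epoch_seconds: float) -> PurePosixPath:
    """Return the flat file that should receive an event with this timestamp.

    All events from the same UTC hour map to the same file, so a new file
    starts every hour. The naming scheme is illustrative only.
    """
    ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return PurePosixPath(base_dir) / topic / f"{ts:%Y-%m-%d}" / f"{ts:%H}.jsonl"


def utc_epoch(*args) -> float:
    """Helper for the example: epoch seconds of a UTC date and time."""
    return datetime(*args, tzinfo=timezone.utc).timestamp()


# Two events in the 07:00 hour share a file; the 08:00 event starts a new one.
a = hourly_path("/data", "positions", utc_epoch(2016, 10, 20, 7, 15, 0))
b = hourly_path("/data", "positions", utc_epoch(2016, 10, 20, 7, 59, 59))
c = hourly_path("/data", "positions", utc_epoch(2016, 10, 20, 8, 0, 0))
print(a, b, c, sep="\n")
```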
  24. Other alternatives are less versatile in terms of connectors, offer fewer transformations, or are more log-oriented. We tried Flume but gave up because of problems with the connectors to ES. Logstash worked within minutes.
  25. For large time series, a columnar database would certainly have been a better choice. ES was the easy choice.
  26. Kafka allows as many consumers as desired. We now turn to real-time streaming.
  27. So far, apart from the fairly simple acquisition part, setting up Kafka, Logstash and Elasticsearch was only configuration. Let us roll up our sleeves and move on to the transformation part.
  28. A stopped train means that its position is the same as the previous one. “In a station” means checking whether the train's position is within 200 meters of a station's position.
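The two rules in this note can be sketched in a few lines. The following Python example is an illustration under stated assumptions, not the project's Scala/Akka code: the great-circle (haversine) distance is one possible way to measure the 200 meters, and the function names and sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius
STATION_RADIUS_M = 200      # threshold quoted in the note


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_stopped(previous, current):
    """A train is stopped when its position equals the previous one."""
    return previous is not None and previous == current


def is_in_station(train, station, radius_m=STATION_RADIUS_M):
    """A train is in a station when it lies within `radius_m` of the station."""
    return haversine_m(*train, *station) <= radius_m


station = (46.940582, 8.275442)   # illustrative coordinates
near = (46.941000, 8.275442)      # a few tens of metres north of the station
far = (46.950000, 8.275442)       # about one kilometre north of the station
print(is_in_station(near, station), is_in_station(far, station), is_stopped(near, near))
```

Combining both checks gives "stopped in a station"; keeping the previous position per train is the temporary state the stream processor needs.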
  29. Spark Streaming (with state): designed for HDFS, naturally suited for distribution, usable across all Big Data use cases. We have now reached the point where the information is digested on the fly and exposed through REST.
  30. We have coded and configured our infrastructure. Let us take 2 minutes to look at how to deploy and scale, with a word on performance.
  31. Beware of the cost on AWS or GCE. In practice, each of our little boxes can be put in a container.
  32. 1000 trains and 3000 stations. GCE snapshot: 1 CPU, 4 GB memory => 15% of CPU.
  33. We are moving along our plan. Even though the topic has been touched on implicitly, let us talk briefly about scalability.
  34. Kafka: 100k events per second. Scalability goes hand in hand with fault tolerance. A warning on Akka, where everything is possible but you have to build it by hand.
  35. We have implemented the infrastructure, but the goal was to serve users, either in real time or offline,
  36. and therefore with data ready to be visualized in the browser.
  37. Visualization means presentation and interactivity. Circle size: number of trains expected; orange sector: number of delayed trains. Mouse over a station => REST request to fetch the snapshot of that station. The base map is Google Maps. For the viz itself, there is no TIMTOWTDI: it is d3.js.
  38. It is d3.js: we transform data into DOM elements and interact with them. The sky is the limit.
  39. React for large data: just a rendering library; it needs a flux architecture. There are many preconceptions about JavaScript: the abundance of frameworks, the speed of obsolescence.
  40. Perf: beyond putting too many elements on screen, simple transitions can sink performance. Tests: we could spend the morning on the subject, but having a page that gathers different configurations lets you check everything at a glance, rather than spending your time navigating the interface or relying on hypothetical testers.
  41. We have covered what is needed to build our interactive real-time visualization application. The data is stored, but we still lack the data analysis part. Let us first look at a few results, then at how to obtain them.
  42. - We will come back to the question of the 4.5 months.
  43. In the CFF app, there is an estimate of the occupancy rate, and this information is pushed by the station boards. So we want to go back into the data and look at when the trains are loaded, as a function of the time of day. If we shift by one hour, there are seats.
  44. Or, since we live in a wonderful capitalist country, everything can be bought.
  45. And in practice, it simply amounts to a groupBy, along the lines of groupBy(r => (r.timestamp.hourOfDay, r.capacity)) followed by a count per group.
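The grouping described in this note can be reproduced in a notebook cell. The Python sketch below uses a handful of made-up records (they are not taken from the 4.5 months of data) and counts departures per (hour of day, capacity level) pair; the record layout is an assumption.

```python
from collections import Counter
from datetime import datetime

# Made-up station-board records: (departure time, 2nd class capacity level).
records = [
    (datetime(2016, 10, 3, 7, 12), 3),
    (datetime(2016, 10, 3, 7, 42), 3),
    (datetime(2016, 10, 3, 8, 12), 2),
    (datetime(2016, 10, 4, 7, 12), 3),
    (datetime(2016, 10, 4, 6, 12), 1),
]

# Equivalent of grouping by (hour of day, capacity) and taking each group's size.
counts = Counter((ts.hour, capacity) for ts, capacity in records)

for (hour, capacity), n in sorted(counts.items()):
    print(f"{hour:02d}h capacity={capacity}: {n}")
```

On a large dataset the same grouping would typically be pushed down to the storage or processing engine rather than done in local memory.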
  46. We first look at the number of trains announced in stations as a function of the time of day: nothing at night, a peak in the morning and another in the late afternoon.
  47. Same for the delays, and the two datasets can be superimposed.
  48. In red, the number of trains; in blue, the number of delays. There are few trains at night, but one in 10 is running late. Overall, 4% of trains are delayed. Knowing when trains are delayed is good. Knowing where is even better.
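Superimposing trains and delays per hour boils down to a ratio per hour plus an overall ratio. The Python sketch below shows the arithmetic on made-up samples, so its numbers are not the talk's results; treating any delay above zero minutes as "delayed" is an assumption.

```python
from collections import defaultdict

# Made-up (hour of day, delay in minutes) samples, for illustration only.
departures = [(2, 0), (2, 5), (8, 0), (8, 0), (8, 0), (8, 3), (17, 0), (17, 0)]

totals = defaultdict(int)    # hour -> number of trains
delayed = defaultdict(int)   # hour -> number of delayed trains
for hour, delay_min in departures:
    totals[hour] += 1
    if delay_min > 0:        # assumption: any positive delay counts as delayed
        delayed[hour] += 1

ratio_by_hour = {hour: delayed[hour] / totals[hour] for hour in sorted(totals)}
overall = sum(delayed.values()) / sum(totals.values())
print(ratio_by_hour, overall)
```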
  49. - A map of Switzerland.
  50. A histogram of the busiest stations, with a peak in the Zurich area.
  51. The most massive delays are also in Zurich, but also in the Lugano area and in the valley going up to Zermatt. These pictures are pretty, but how do we get there?
  52. Store the data, the data processing method and the results together so that they can be reproduced. Mix interactivity and reproducibility.
  53. And that is the end.
  54. This is a global view of the simple infrastructure we presented, but it is only
  55. A discussion with my bus driver.