Large Infrastructure Monitoring
@ CERN
Matthias Bräger CERN
Thursday, 10/15/2015
Big Data Spain
Agenda
Matthias Bräger
Software Engineer
CERN
matthias.braeger@cern.ch
▪  Big Data @ CERN
▪  In-Memory Data Grid &
Streaming Analytics
▪  Concrete CERN Example
Physics data (>100 PB)
Metadata of physics data
Sensor data of technical installations
Log data
Configuration data
Documents
Media data
Others
European Organization for Nuclear Research
▪ Founded in 1954 (60 years ago!)
▪ 21 Member States
▪ ~ 3’360 Staff, fellows, students...
▪ ~ 10’000 Scientists from
113 different countries
▪ Budget: 1 billion CHF/year
http://cern.ch
From Physics to Industry
ATLAS
CMS LHCb
ALICE
LHC
The world's biggest machine
Generated 30 Petabytes in 2012
> 100 PB in total!
LHC - Large Hadron Collider
27km ring of superconducting magnets
Started operation in 2010 with 3.5 + 3.5 TeV,
4 + 4 TeV in 2012
2013 – 2015 in Long Shutdown 1
(machine upgrade)
Restarted in April 2015 with 6.5 + 6.5 TeV max
Some ATLAS facts
▪  25m diameter, 46m length, 7’000 tons
▪  100 million channels
▪  40MHz collision rate (~ 1 PB/s)
▪  Run 1: 200 Hz (~ 320 MB/s)
event rate after filtering
▪  Run 2: up to 1 kHz
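The ATLAS figures above imply a few derived numbers. The following is a back-of-the-envelope sketch only: it assumes decimal units (1 PB = 10^15 bytes) and treats the approximate rates on the slide as exact. All variable names are illustrative.

```python
# Back-of-the-envelope arithmetic from the ATLAS figures quoted above.
# Assumption: decimal units (1 MB = 1e6 bytes, 1 PB = 1e15 bytes).
MB = 1e6
PB = 1e15

collision_rate_hz = 40e6        # 40 MHz collision rate
raw_rate_bytes_s = 1 * PB       # ~1 PB/s before filtering
run1_event_rate_hz = 200        # Run 1 event rate after filtering
run1_rate_bytes_s = 320 * MB    # ~320 MB/s after filtering

# Average raw data volume produced per collision (bunch crossing).
bytes_per_crossing = raw_rate_bytes_s / collision_rate_hz

# Average size of one stored event in Run 1.
bytes_per_event = run1_rate_bytes_s / run1_event_rate_hz

# How many crossings occur for every event that survives the filtering.
crossings_per_kept_event = collision_rate_hz / run1_event_rate_hz

print(f"raw data per crossing: {bytes_per_crossing / MB:.0f} MB")
print(f"stored event size:     {bytes_per_event / MB:.1f} MB")
print(f"events kept:           1 in {crossings_per_kept_event:,.0f}")
```

Under these assumptions the detector produces roughly 25 MB per crossing, a stored Run 1 event averages about 1.6 MB, and about one crossing in 200,000 is kept.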
Is Hadoop used for storing
the ~30 PB/year of physics data?
No ;-(
Experimental data are mainly stored
on tape
CERN uses Hadoop for storing the metadata
of the experimental data
Physics Data Handling
▪  Run 1: 30 PB per year
demanding 100’000 processors
with peaks of 20 GB/s writing to tape
spread across 80 tape drives
▪  Run 2: > 50 PB per year
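The Run 1 tape figures can be put in relation to each other with a little arithmetic. This sketch assumes decimal units, an even spread of the peak load over all 80 drives, and an average taken over a full calendar year, although the accelerator does not take data all year. None of these assumptions come from the slides.

```python
# Rough arithmetic on the Run 1 tape figures quoted above.
# Assumptions: decimal units, load spread evenly over all drives,
# yearly volume averaged over a full calendar year.
GB = 1e9
PB = 1e15
SECONDS_PER_YEAR = 365 * 24 * 3600

run1_volume_bytes = 30 * PB     # 30 PB per year
peak_tape_rate = 20 * GB        # peaks of 20 GB/s writing to tape
tape_drives = 80                # spread across 80 tape drives

per_drive_peak = peak_tape_rate / tape_drives
average_rate = run1_volume_bytes / SECONDS_PER_YEAR
peak_to_average = peak_tape_rate / average_rate

print(f"peak rate per drive: {per_drive_peak / 1e6:.0f} MB/s")
print(f"yearly average rate: {average_rate / GB:.2f} GB/s")
print(f"peak / average:      {peak_to_average:.0f}x")
```

With these assumptions each drive has to sustain about 250 MB/s at peak, and the peak is roughly 21 times the yearly average of just under 1 GB/s.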
CERN’s Computer Center (1st floor)
Physics Data Handling
By 2013, more than 100 PB were already stored in total!
▪  > 88 PB on 55’000 tapes
▪  > 13 PB on disk
▪  > 150 PB free tape storage waiting for
Run 2
CERN’s tape robot
Why tape storage?
▪  Cost of tape storage is a lot less than
disk storage
▪  No electricity consumption when tapes
are not being accessed
▪  Tape storage size = Data + Copy
Hadoop storage size = Data + 2 Copies
▪  No requirement to have all recorded
physics data available within seconds
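The storage-size bullet above can be turned into numbers. The sketch below applies the two formulas from the slide to the 100 PB total quoted earlier. The helper name is hypothetical, and "Data + 2 Copies" is read as a replication factor of 3, which is the HDFS default.

```python
# Raw capacity needed for a given data volume, following the slide:
#   tape:   Data + 1 copy
#   Hadoop: Data + 2 copies (replication factor 3)
def raw_capacity_pb(data_pb: float, extra_copies: int) -> float:
    """Return the raw capacity in PB for `data_pb` plus `extra_copies` copies."""
    if extra_copies < 0:
        raise ValueError("extra_copies must not be negative")
    return data_pb * (1 + extra_copies)


data_pb = 100  # total physics data volume quoted in the slides
tape_pb = raw_capacity_pb(data_pb, extra_copies=1)
hadoop_pb = raw_capacity_pb(data_pb, extra_copies=2)

print(f"tape:   {tape_pb:.0f} PB")
print(f"Hadoop: {hadoop_pb:.0f} PB")
print(f"extra capacity for Hadoop: {hadoop_pb - tape_pb:.0f} PB")
```

For 100 PB of data this gives 200 PB of tape against 300 PB of Hadoop storage, before any difference in cost per PB is considered.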
CERN’s tape robot
@ CERN
3 HBase Clusters
▪  CASTOR Cluster with ~10 servers
-  ~ 100 GB of Logs per day
-  > 120 TB of Logs in total
▪  ATLAS Cluster with ~20 servers
-  Event index Catalogue for
experimental Data in the Grid
▪  Monitoring Cluster with ~10 servers
-  Log events from CERN Computer Center
Metadata from physics event
Metadata are created upon recording of the physics event
Example 1:
▪  Tape Storage event log
-  On which tape is my file stored?
-  Is there a copy on disk?
-  List all events for a given tape or drive
-  Was the tape repacked?
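The four questions above map naturally onto a log that is indexed by file, tape and drive. The following is a minimal in-memory sketch of that idea, not the schema of the real HBase-backed log: the event fields, the action names (`WRITE`, `STAGE`, `EVICT`, `REPACK`) and all identifiers are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TapeEvent:
    """One log entry. All fields and action names are hypothetical."""
    timestamp: int
    action: str                      # "WRITE", "STAGE", "EVICT" or "REPACK"
    tape: str
    drive: str
    file_name: Optional[str] = None  # REPACK events concern a whole tape


class TapeEventLog:
    """Event log indexed by file, tape and drive."""

    def __init__(self) -> None:
        self._by_file: Dict[str, List[TapeEvent]] = defaultdict(list)
        self._by_tape: Dict[str, List[TapeEvent]] = defaultdict(list)
        self._by_drive: Dict[str, List[TapeEvent]] = defaultdict(list)

    def append(self, event: TapeEvent) -> None:
        if event.file_name is not None:
            self._by_file[event.file_name].append(event)
        self._by_tape[event.tape].append(event)
        self._by_drive[event.drive].append(event)

    def tape_of(self, file_name: str) -> Optional[str]:
        """On which tape is my file stored? (tape of the latest WRITE)"""
        writes = [e for e in self._by_file.get(file_name, []) if e.action == "WRITE"]
        return max(writes, key=lambda e: e.timestamp).tape if writes else None

    def has_disk_copy(self, file_name: str) -> bool:
        """Is there a copy on disk? (latest STAGE not followed by an EVICT)"""
        disk = [e for e in self._by_file.get(file_name, [])
                if e.action in ("STAGE", "EVICT")]
        return bool(disk) and max(disk, key=lambda e: e.timestamp).action == "STAGE"

    def events_for_tape(self, tape: str) -> List[TapeEvent]:
        return sorted(self._by_tape.get(tape, []), key=lambda e: e.timestamp)

    def events_for_drive(self, drive: str) -> List[TapeEvent]:
        return sorted(self._by_drive.get(drive, []), key=lambda e: e.timestamp)

    def was_repacked(self, tape: str) -> bool:
        return any(e.action == "REPACK" for e in self._by_tape.get(tape, []))


# Usage with made-up identifiers.
log = TapeEventLog()
log.append(TapeEvent(1, "WRITE", "T0001", "D07", "run1/file.raw"))
log.append(TapeEvent(2, "STAGE", "T0001", "D07", "run1/file.raw"))
log.append(TapeEvent(3, "REPACK", "T0002", "D03"))
print(log.tape_of("run1/file.raw"), log.has_disk_copy("run1/file.raw"))
```

Keeping one index per query dimension trades memory for lookup speed, the same trade-off a row-key design has to make in a key-value store.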
Example 1: Tape Storage event log
Metadata from physics event
Metadata are created upon recording of the physics event
Example 2:
▪  Information about
-  Event number
-  run number
-  timestamp
-  luminosity block number
-  trigger that selected the event, etc.
Example 2: ATLAS EventIndex catalogue
Prototype of an event-level metadata catalogue for all ATLAS events
▪  In 2011 and 2012, ATLAS produced 2 billion real events and 4 billion simulated events
▪  Migration to Hadoop for Run 2 of the LHC
Data are read from the brokers, decoded and stored into Hadoop.
Example 2: ATLAS EventIndex catalogue
The major use cases of the EventIndex project are:
▪  Event picking:
give me the reference (pointer) to "this" event in "that" format for a given processing cycle.
▪  Production consistency checks:
technical checks that processing cycles are complete (event counts match).
▪  Event service:
give me the references (pointers) for "this" list of events, or for the events satisfying given selection criteria.
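Two of the use cases above, event picking and production consistency checks, can be illustrated with a small lookup structure. This is a toy sketch and not the EventIndex implementation: the class, its methods, the run number, the cycle name and the reference strings are all made up for illustration.

```python
from typing import Dict, Iterable, List, Optional, Tuple

EventKey = Tuple[int, int]   # (run number, event number)
Variant = Tuple[str, str]    # (data format, processing cycle)


class EventIndex:
    """Toy event-level catalogue: one reference per event and variant."""

    def __init__(self) -> None:
        self._refs: Dict[Variant, Dict[EventKey, str]] = {}

    def add(self, run: int, event: int, data_format: str,
            cycle: str, reference: str) -> None:
        self._refs.setdefault((data_format, cycle), {})[(run, event)] = reference

    def pick(self, run: int, event: int, data_format: str,
             cycle: str) -> Optional[str]:
        """Event picking: reference to one event in a given format and cycle."""
        return self._refs.get((data_format, cycle), {}).get((run, event))

    def pick_many(self, keys: Iterable[EventKey], data_format: str,
                  cycle: str) -> List[str]:
        """References for a list of events; unknown events are skipped."""
        refs = self._refs.get((data_format, cycle), {})
        return [refs[key] for key in keys if key in refs]

    def count(self, data_format: str, cycle: str) -> int:
        return len(self._refs.get((data_format, cycle), {}))

    def counts_match(self, a: Variant, b: Variant) -> bool:
        """Consistency check: do two variants hold the same number of events?"""
        return self.count(*a) == self.count(*b)


# Usage with made-up values: three RAW events, only two of them processed to AOD.
index = EventIndex()
for event in (1, 2, 3):
    index.add(200842, event, "RAW", "cycle-A", f"raw-file-01#{event}")
for event in (1, 2):
    index.add(200842, event, "AOD", "cycle-A", f"aod-file-07#{event}")

print(index.pick(200842, 2, "AOD", "cycle-A"))
print(index.counts_match(("RAW", "cycle-A"), ("AOD", "cycle-A")))
```

The failed count comparison in the usage example is exactly the kind of incomplete processing cycle the consistency check is meant to catch.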
Growth of Data
Transactions, Sensors, Logs, M2M, ...
Big Data is first of all a cost factor!
The infrastructure has to be put in place to store the data.
To get a maximum return on investment, it requires
good analytic tools and well-defined target goals
to harvest the precious insights of your data.
The value of real time
Latency Matters
Uptime, SLAs, HA
Performance and Scale
The Shift
90% of Data in Disk-based Databases
90% of Data In-Memory
RAM is 58,000 times faster than disk and
2,000 times faster than solid-state drives (SSD)
Tiered Storage
▪  Process Memory: 4 GB, latency in micro-seconds, 2,000,000+ TPS
▪  Local off-heap Memory: 32 GB – 12 TB, latency in micro-seconds, 1,000,000 TPS
▪  Distributed memory (Server RAM or Flash/SSD): 100s GB – 100s TB, latency in milli-seconds, 100,000 TPS
▪  External Data Source (e.g., Database, Hadoop, Data Warehouse): latency in seconds, 1,000s TPS
Achieving High Availability with RAM?
In-memory data grids replicate
the data to one or more nodes
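The replication idea above can be sketched in a few lines. The toy grid below writes every entry to a primary node and to a configurable number of backup nodes, so that a read still succeeds after one node loses its memory. It is an illustration under simplifying assumptions (no rebalancing, no network, no recovery of failed nodes) and not the behaviour of any particular IMDG product. All names are hypothetical.

```python
import zlib
from typing import Any, Dict, List


class ReplicatedGrid:
    """Toy in-memory grid: each key lives on a primary node and N backups."""

    def __init__(self, node_count: int, backups: int = 1) -> None:
        if not 0 <= backups < node_count:
            raise ValueError("need 0 <= backups < node_count")
        self._nodes: List[Dict[str, Any]] = [{} for _ in range(node_count)]
        self._alive: List[bool] = [True] * node_count
        self._backups = backups

    def replicas(self, key: str) -> List[int]:
        """Node indices holding `key`: the primary first, then its backups."""
        primary = zlib.crc32(key.encode("utf-8")) % len(self._nodes)
        return [(primary + i) % len(self._nodes) for i in range(self._backups + 1)]

    def put(self, key: str, value: Any) -> None:
        for node in self.replicas(key):
            if self._alive[node]:
                self._nodes[node][key] = value

    def get(self, key: str) -> Any:
        for node in self.replicas(key):
            if self._alive[node] and key in self._nodes[node]:
                return self._nodes[node][key]
        raise KeyError(key)

    def fail(self, node: int) -> None:
        """Simulate a crash: the node's RAM content is lost."""
        self._alive[node] = False
        self._nodes[node].clear()


# Usage: the value survives the loss of its primary node.
grid = ReplicatedGrid(node_count=3, backups=1)
grid.put("sensor-42", 21.5)
primary, backup = grid.replicas("sensor-42")
grid.fail(primary)
print(grid.get("sensor-42"))
```

With one backup the grid tolerates a single node failure per key. Losing the backup as well makes the entry unavailable, which is why real deployments also persist or re-replicate data.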
Why now?
Explosion in volume
and velocity of
data
Steep drop in price
of RAM
In-Memory Data Grid (IMDG) Platforms
-  Scale of NoSQL
-  Low latency of In-Memory databases
-  Reliability & Fault Tolerance
-  Transactional Guarantees
Fast Big Data
Scale with data and processing needs
Increase Data in Memory
Reduce database dependency
[Diagram: Application, In-Memory, Distributed In-Memory, DB]
Use Cases for IMDG
•  Cache to overcome legacy data bottlenecks
•  Cache for transient data
•  Primary store for modern apps
•  NoSQL database at in-memory speed
•  Data services fabric for real-time data integration
•  Compute grid at in-memory speed
In-Memory Data Grid solutions
Forrester Wave™:
In-Memory Data Grids, Q3 2015
Analyzing data In-Motion
Streaming analytics and filtering with Complex Event Processing (CEP)
CEP
engine
… Events … Actions
In-Memory
Event stack
Hadoop Storage
Alarming
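The flow above (events in, in-memory event stack, actions out) can be illustrated with a single hand-written rule. Real CEP engines express such rules in a query language. The class below is only a stand-in, and the rule itself ("at least N values above a threshold within a time window") as well as all names are assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds
    source: str
    value: float


class ThresholdCountRule:
    """Fire `action` when at least `min_count` events exceed `threshold`
    within a sliding window of `window_s` seconds."""

    def __init__(self, threshold: float, min_count: int, window_s: float,
                 action: Callable[[Event], None]) -> None:
        self._threshold = threshold
        self._min_count = min_count
        self._window_s = window_s
        self._action = action
        self._hits: Deque[float] = deque()  # timestamps of recent matches

    def on_event(self, event: Event) -> None:
        if event.value <= self._threshold:
            return  # filtered out: not relevant for this rule
        self._hits.append(event.timestamp)
        # Drop matches that have fallen out of the time window.
        while self._hits and event.timestamp - self._hits[0] > self._window_s:
            self._hits.popleft()
        if len(self._hits) >= self._min_count:
            self._action(event)


# Usage: collect alarms in a list instead of notifying an operator.
alarms: List[Event] = []
rule = ThresholdCountRule(threshold=80.0, min_count=3, window_s=10.0,
                          action=alarms.append)
stream = [(0, 85.0), (4, 90.0), (20, 95.0), (22, 70.0), (25, 88.0), (29, 91.0)]
for timestamp, value in stream:
    rule.on_event(Event(timestamp, "cooling.temperature", value))
print(len(alarms))
```

Only the last event triggers the action: the first two matches have already left the window when the third one arrives at t = 20.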
Streaming Analytics
•  Many existing products, but still no standards
•  : Open Source, SQL like query language
•  JEPC: An attempt to standardize event processing,
from Database Research Group, University of Marburg:
http://www.mathematik.uni-marburg.de/~bhossbach/jepc/
Use cases
Influencing operations and decisions
Cooling
Access Control
Safety Systems
Network and
Hardware Controls
Cryogenics
Electricity
TIM – Technical Infrastructure Monitoring
▪ Operational since 2005
▪ Used to monitor and control infrastructure at CERN
▪ 24/7 service
▪ ~ 100 different main users at CERN
▪ Since Jan. 2012 based on
new server architecture with C2MON
CERN Control Center at LHC startup
Data sources: Cooling, Safety Systems, Electricity, Access, Network and Hardware Controls, Cryogenics
Data Acquisition & Filtering: ca. 400 million raw values per day, filtered down to ca. 2 million updates
TIM Server based on C2MON
> 1200 commands
> 1300 rules
> 91k data sensors
> 41k alarms
Client Tier: TIM Viewer, Access Management, Alarm Console, Data Analysis, Video Viewer, Web Apps
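The volumes quoted for TIM's data acquisition layer (ca. 400 million raw values per day reduced to ca. 2 million updates) translate into the following rates. The sketch treats the rounded figures as exact and assumes an even distribution over the day, which real sensor traffic will not follow.

```python
# Effect of the DAQ-layer filtering, using the rounded figures from the slide.
SECONDS_PER_DAY = 24 * 3600

raw_values_per_day = 400_000_000  # ca. 400 million raw values per day
updates_per_day = 2_000_000       # ca. 2 million updates after filtering

reduction_factor = raw_values_per_day / updates_per_day
kept_percent = 100 * updates_per_day / raw_values_per_day
raw_per_second = raw_values_per_day / SECONDS_PER_DAY
updates_per_second = updates_per_day / SECONDS_PER_DAY

print(f"reduction factor:  {reduction_factor:.0f}:1")
print(f"values kept:       {kept_percent:.1f} %")
print(f"raw values/s:      {raw_per_second:.0f}")
print(f"server updates/s:  {updates_per_second:.0f}")
```

On average the filtering passes on one value in 200, so the server handles about 23 updates per second instead of about 4,600 raw values per second.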
C2MON - CERN Control and Monitoring Platform
C2MON server
C2MON client API
my app
C2MON DAQ API
my DAQ
…
▪  Allows the rapid implementation of high-performance
monitoring solutions
▪  Modular and scalable at all layers
▪  Optimized for High Availability & big data volume
▪  Based on In-Memory solution
▪  All written in Java
Currently used by two big monitoring systems @CERN:
TIM & DIAMON
Central LHC alarm system (“LASER”) in migration phase
http://cern.ch/c2mon
C2MON server
C2MON architecture
Application Tier
C2MON server
DAQs
History /
Backup
In-Memory
-  Configuration
-  Rule logic
-  Latest sensor values
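The three things kept in memory (configuration, rule logic, latest sensor values) can be pictured with a small cache that re-evaluates rules whenever one of their inputs changes. This is a sketch of the general pattern and not C2MON code: the class, its methods, the sensor names and the example rule are hypothetical.

```python
from typing import Any, Callable, Dict, Optional, Tuple


class LatestValueCache:
    """Keeps the latest value per sensor and re-evaluates dependent rules."""

    def __init__(self) -> None:
        self._latest: Dict[str, Any] = {}
        # Rule configuration: name -> (input sensor ids, rule logic)
        self._rules: Dict[str, Tuple[Tuple[str, ...], Callable[..., Any]]] = {}
        self._results: Dict[str, Any] = {}

    def add_rule(self, name: str, inputs: Tuple[str, ...],
                 logic: Callable[..., Any]) -> None:
        self._rules[name] = (inputs, logic)

    def update(self, sensor_id: str, value: Any) -> None:
        self._latest[sensor_id] = value
        for name, (inputs, logic) in self._rules.items():
            # Evaluate only rules that use this sensor and have all inputs.
            if sensor_id in inputs and all(i in self._latest for i in inputs):
                self._results[name] = logic(*(self._latest[i] for i in inputs))

    def latest(self, sensor_id: str) -> Optional[Any]:
        return self._latest.get(sensor_id)

    def rule_result(self, name: str) -> Optional[Any]:
        return self._results.get(name)


# Usage with a made-up rule over two made-up sensors.
cache = LatestValueCache()
cache.add_rule("pump_overheating", ("pump.temperature", "pump.running"),
               lambda temperature, running: running and temperature > 70.0)

cache.update("pump.temperature", 75.0)
result_before = cache.rule_result("pump_overheating")  # inputs incomplete
cache.update("pump.running", True)
result_after = cache.rule_result("pump_overheating")
print(result_before, result_after)
```

Because every input is already in memory, a rule can be re-evaluated on each update without a database round trip.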
C2MON Server
C2MON server core
In-Memory Store
(JCache - JSR-107)
DAQ out
DAQ in
DAQ
supervision
Cache
persistence
Cache
loading
Lifecycle
Configuration
Cache
DB access
Logging Alarm Rules Benchmark Video access
C2MON server modules
Client communication
Authentication
Open Source time-series databases
▪  OpenTSDB: Uses Apache HBase as storage model
▪  : Uses Apache as storage model
▪  : Natively time-series, using LMDB storage engine
▪  : Built on top of Apache Lucene™
Scenario 1: High availability
•  moderate data size
•  average throughput
•  min service interrupts
•  high availability
[Diagram: DAQ processes, clustered JMS brokers, C2MON server with a standby C2MON server, C2MON clients]
Raw data filtering on DAQ layer
GIQO
Garbage In
Quality out
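The slide does not say which filters the DAQ layer applies. As one common possibility, the sketch below shows a value deadband filter: a reading is forwarded only if it differs from the last forwarded value of the same sensor by more than a configured amount. The class name, the sensor id and the deadband of 0.5 are assumptions chosen for the example.

```python
from typing import Dict, List


class DeadbandFilter:
    """Forward a value only if it moved more than `deadband` away from the
    last value that was forwarded for the same sensor."""

    def __init__(self, deadband: float) -> None:
        if deadband < 0:
            raise ValueError("deadband must not be negative")
        self._deadband = deadband
        self._last_sent: Dict[str, float] = {}

    def accept(self, sensor_id: str, value: float) -> bool:
        last = self._last_sent.get(sensor_id)
        if last is not None and abs(value - last) <= self._deadband:
            return False  # change too small: drop the raw value
        self._last_sent[sensor_id] = value
        return True


# Usage: six raw readings, only the significant changes are forwarded.
daq_filter = DeadbandFilter(deadband=0.5)
readings = [20.0, 20.1, 20.2, 20.6, 20.7, 19.9]
forwarded: List[float] = [value for value in readings
                          if daq_filter.accept("room.temperature", value)]
print(forwarded)
```

Comparing against the last forwarded value, rather than the last raw value, prevents a slow drift from being filtered out forever.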
Scenario 2: High requirements
•  large data set
•  high throughput
•  min service interrupts
•  high availability
[Diagram: DAQ processes, JMS broker clusters, C2MON server cluster with server array, C2MON clients]
C2MON Roadmap
▪  Offering C2MON to the Open Source community
http://cern.ch/c2mon
▪  Introduction of Complex Event Processing (CEP) module
▪  Migrating historical event store from
relational database to time series database
IoT = Internet of Things or … Intranet of Things?
▪ Creating a smarter world happens first in the Intranet
▪ Challenge: Integrating heterogeneous systems and protocols
▪ Many IoT solutions available, but often closed products
which are not compatible with each other
Internet of Things:
▪ Integrating and analysing monitoring data from a variety of installations of the same
device type throughout the industry is essential.
Takeaways
▪  Data and High Availability services are more important than ever
before for all modern organizations.
▪  Deriving value from collected data is key to success.
▪  In-Memory platforms are essential for high value & high velocity
data storage and processing.
Credits & References
Many thanks to CERN & Software AG:
-  Sebastien Ponce (CERN), for providing information about CASTOR
-  Rainer Toebbicke (CERN), for providing information about the CERN HBase service
-  Jan Iven (CERN), for helping to find information about existing CERN Hadoop projects
-  Software AG/Terracotta Product & Engineering Team
References:
-  C2MON: http://cern.ch/c2mon
-  The ATLAS EventIndex: https://cds.cern.ch/record/1690609
-  Agile Infrastructure at CERN - Moving 9'000 Servers into a Private Cloud, Helge Meinhard (CERN):
http://vimeo.com/93247922
-  CRAN, The Comprehensive R Archive Network: http://cran.r-project.org
-  Software AG Terracotta: http://www.terracotta.org
Questions?
Thank you very much for your attention!
Large Infrastructure Monitoring At CERN by Matthias Braeger at Big Data Spain 2015

Weitere ähnliche Inhalte

Was ist angesagt?

Going Real-Time: Creating Frequently-Updating Datasets for Personalization: S...
Going Real-Time: Creating Frequently-Updating Datasets for Personalization: S...Going Real-Time: Creating Frequently-Updating Datasets for Personalization: S...
Going Real-Time: Creating Frequently-Updating Datasets for Personalization: S...Spark Summit
 
Multiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezMultiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezBig Data Spain
 
Apache Spark AI Use Case in Telco: Network Quality Analysis and Prediction wi...
Apache Spark AI Use Case in Telco: Network Quality Analysis and Prediction wi...Apache Spark AI Use Case in Telco: Network Quality Analysis and Prediction wi...
Apache Spark AI Use Case in Telco: Network Quality Analysis and Prediction wi...Databricks
 
Data Gloveboxes: A Philosophy of Data Science Data Security
Data Gloveboxes: A Philosophy of Data Science Data SecurityData Gloveboxes: A Philosophy of Data Science Data Security
Data Gloveboxes: A Philosophy of Data Science Data SecurityDataWorks Summit
 
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...Spark Summit
 
Sherlock: an anomaly detection service on top of Druid
Sherlock: an anomaly detection service on top of Druid Sherlock: an anomaly detection service on top of Druid
Sherlock: an anomaly detection service on top of Druid DataWorks Summit
 
Analyzing 2TB of Raw Trace Data from a Manufacturing Process: A First Use Cas...
Analyzing 2TB of Raw Trace Data from a Manufacturing Process: A First Use Cas...Analyzing 2TB of Raw Trace Data from a Manufacturing Process: A First Use Cas...
Analyzing 2TB of Raw Trace Data from a Manufacturing Process: A First Use Cas...Databricks
 
Spark at NASA/JPL-(Chris Mattmann, NASA/JPL)
Spark at NASA/JPL-(Chris Mattmann, NASA/JPL)Spark at NASA/JPL-(Chris Mattmann, NASA/JPL)
Spark at NASA/JPL-(Chris Mattmann, NASA/JPL)Spark Summit
 
Fast and Scalable Python
Fast and Scalable PythonFast and Scalable Python
Fast and Scalable PythonTravis Oliphant
 
Spark Summit EU talk by Sameer Agarwal
Spark Summit EU talk by Sameer AgarwalSpark Summit EU talk by Sameer Agarwal
Spark Summit EU talk by Sameer AgarwalSpark Summit
 
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...Spark Summit
 
Geospatial Analytics at Scale with Deep Learning and Apache Spark
Geospatial Analytics at Scale with Deep Learning and Apache SparkGeospatial Analytics at Scale with Deep Learning and Apache Spark
Geospatial Analytics at Scale with Deep Learning and Apache SparkDatabricks
 
Spark Summit EU 2015: Reynold Xin Keynote
Spark Summit EU 2015: Reynold Xin KeynoteSpark Summit EU 2015: Reynold Xin Keynote
Spark Summit EU 2015: Reynold Xin KeynoteDatabricks
 
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark StreamingTiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark StreamingPaco Nathan
 
On-Prem Solution for the Selection of Wind Energy Models
On-Prem Solution for the Selection of Wind Energy ModelsOn-Prem Solution for the Selection of Wind Energy Models
On-Prem Solution for the Selection of Wind Energy ModelsDatabricks
 
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...Spark Summit
 
Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...
Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...
Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...Spark Summit
 
Spark summit 2019 infrastructure for deep learning in apache spark 0425
Spark summit 2019 infrastructure for deep learning in apache spark 0425Spark summit 2019 infrastructure for deep learning in apache spark 0425
Spark summit 2019 infrastructure for deep learning in apache spark 0425Wee Hyong Tok
 
Cosmos DB Real-time Advanced Analytics Workshop
Cosmos DB Real-time Advanced Analytics WorkshopCosmos DB Real-time Advanced Analytics Workshop
Cosmos DB Real-time Advanced Analytics WorkshopDatabricks
 

Was ist angesagt? (20)

Going Real-Time: Creating Frequently-Updating Datasets for Personalization: S...
Going Real-Time: Creating Frequently-Updating Datasets for Personalization: S...Going Real-Time: Creating Frequently-Updating Datasets for Personalization: S...
Going Real-Time: Creating Frequently-Updating Datasets for Personalization: S...
 
Multiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezMultiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier Dominguez
 
Apache Spark AI Use Case in Telco: Network Quality Analysis and Prediction wi...
Apache Spark AI Use Case in Telco: Network Quality Analysis and Prediction wi...Apache Spark AI Use Case in Telco: Network Quality Analysis and Prediction wi...
Apache Spark AI Use Case in Telco: Network Quality Analysis and Prediction wi...
 
Data Gloveboxes: A Philosophy of Data Science Data Security
Data Gloveboxes: A Philosophy of Data Science Data SecurityData Gloveboxes: A Philosophy of Data Science Data Security
Data Gloveboxes: A Philosophy of Data Science Data Security
 
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
 
Sherlock: an anomaly detection service on top of Druid
Sherlock: an anomaly detection service on top of Druid Sherlock: an anomaly detection service on top of Druid
Sherlock: an anomaly detection service on top of Druid
 
Analyzing 2TB of Raw Trace Data from a Manufacturing Process: A First Use Cas...
Analyzing 2TB of Raw Trace Data from a Manufacturing Process: A First Use Cas...Analyzing 2TB of Raw Trace Data from a Manufacturing Process: A First Use Cas...
Analyzing 2TB of Raw Trace Data from a Manufacturing Process: A First Use Cas...
 
Spark at NASA/JPL-(Chris Mattmann, NASA/JPL)
Spark at NASA/JPL-(Chris Mattmann, NASA/JPL)Spark at NASA/JPL-(Chris Mattmann, NASA/JPL)
Spark at NASA/JPL-(Chris Mattmann, NASA/JPL)
 
Fast and Scalable Python
Fast and Scalable PythonFast and Scalable Python
Fast and Scalable Python
 
Spark Summit EU talk by Sameer Agarwal
Spark Summit EU talk by Sameer AgarwalSpark Summit EU talk by Sameer Agarwal
Spark Summit EU talk by Sameer Agarwal
 
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
 
Distributed Deep Learning on Hadoop Clusters
Distributed Deep Learning on Hadoop ClustersDistributed Deep Learning on Hadoop Clusters
Distributed Deep Learning on Hadoop Clusters
 
Geospatial Analytics at Scale with Deep Learning and Apache Spark
Geospatial Analytics at Scale with Deep Learning and Apache SparkGeospatial Analytics at Scale with Deep Learning and Apache Spark
Geospatial Analytics at Scale with Deep Learning and Apache Spark
 
Spark Summit EU 2015: Reynold Xin Keynote
Spark Summit EU 2015: Reynold Xin KeynoteSpark Summit EU 2015: Reynold Xin Keynote
Spark Summit EU 2015: Reynold Xin Keynote
 
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark StreamingTiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
 
On-Prem Solution for the Selection of Wind Energy Models
On-Prem Solution for the Selection of Wind Energy ModelsOn-Prem Solution for the Selection of Wind Energy Models
On-Prem Solution for the Selection of Wind Energy Models
 
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
 
Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...
Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...
Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...
 
Spark summit 2019 infrastructure for deep learning in apache spark 0425
Spark summit 2019 infrastructure for deep learning in apache spark 0425Spark summit 2019 infrastructure for deep learning in apache spark 0425
Spark summit 2019 infrastructure for deep learning in apache spark 0425
 
Cosmos DB Real-time Advanced Analytics Workshop
Cosmos DB Real-time Advanced Analytics WorkshopCosmos DB Real-time Advanced Analytics Workshop
Cosmos DB Real-time Advanced Analytics Workshop
 

Andere mochten auch

Data Infrastructure Literacy: Reshaping Practices of Measurement, Monitoring ...
Data Infrastructure Literacy: Reshaping Practices of Measurement, Monitoring ...Data Infrastructure Literacy: Reshaping Practices of Measurement, Monitoring ...
Data Infrastructure Literacy: Reshaping Practices of Measurement, Monitoring ...Liliana Bounegru
 
20151125 LAC de architect tussen 44 scrum teams in de 90e sprint
20151125 LAC de architect tussen 44 scrum teams in de 90e sprint20151125 LAC de architect tussen 44 scrum teams in de 90e sprint
20151125 LAC de architect tussen 44 scrum teams in de 90e sprintPeter Paul van de Beek
 
Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...
Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...
Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...DataKitchen
 
Big Data/Hadoop Infrastructure Considerations
Big Data/Hadoop Infrastructure ConsiderationsBig Data/Hadoop Infrastructure Considerations
Big Data/Hadoop Infrastructure ConsiderationsRichard McDougall
 
Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...Big Data Spain
 
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...Big Data Spain
 
Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...Big Data Spain
 
Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
 Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ... Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...Big Data Spain
 
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014Big Data Spain
 
Location analytics by Marc Planaguma at Big Data Spain 2014
 Location analytics by Marc Planaguma at Big Data Spain 2014 Location analytics by Marc Planaguma at Big Data Spain 2014
Location analytics by Marc Planaguma at Big Data Spain 2014Big Data Spain
 
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
 Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data... Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...Big Data Spain
 
Intro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conferenceIntro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conferenceBig Data Spain
 
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012Big Data Spain
 
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...Big Data Spain
 
Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0Big Data Spain
 
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...Big Data Spain
 
Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...Big Data Spain
 

Andere mochten auch (20)

Overview of the Hive Stinger Initiative
Overview of the Hive Stinger InitiativeOverview of the Hive Stinger Initiative
Overview of the Hive Stinger Initiative
 
Data Infrastructure Literacy: Reshaping Practices of Measurement, Monitoring ...
Data Infrastructure Literacy: Reshaping Practices of Measurement, Monitoring ...Data Infrastructure Literacy: Reshaping Practices of Measurement, Monitoring ...
Data Infrastructure Literacy: Reshaping Practices of Measurement, Monitoring ...
 
20151125 LAC de architect tussen 44 scrum teams in de 90e sprint
20151125 LAC de architect tussen 44 scrum teams in de 90e sprint20151125 LAC de architect tussen 44 scrum teams in de 90e sprint
20151125 LAC de architect tussen 44 scrum teams in de 90e sprint
 
Hive Now Sparks
Hive Now SparksHive Now Sparks
Hive Now Sparks
 
Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...
Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...
Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...
 
Big Data/Hadoop Infrastructure Considerations
Big Data/Hadoop Infrastructure ConsiderationsBig Data/Hadoop Infrastructure Considerations
Big Data/Hadoop Infrastructure Considerations
 
October 2014 HUG : Hive On Spark
October 2014 HUG : Hive On SparkOctober 2014 HUG : Hive On Spark
October 2014 HUG : Hive On Spark
 
Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...
 
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
 
Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...
 
Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
 Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ... Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
 
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
 
Location analytics by Marc Planaguma at Big Data Spain 2014
 Location analytics by Marc Planaguma at Big Data Spain 2014 Location analytics by Marc Planaguma at Big Data Spain 2014
Location analytics by Marc Planaguma at Big Data Spain 2014
 
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
 Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data... Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
 
Intro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conferenceIntro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conference
 
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
 
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
 
Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0
 
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
 
Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...
 







Large Infrastructure Monitoring At CERN by Matthias Braeger at Big Data Spain 2015

  • 3. Large Infrastructure Monitoring @ CERN
      Matthias Bräger, CERN
      Thursday, 10/15/2015, Big Data Spain
  • 4. Agenda
      Matthias Bräger, Software Engineer, CERN, matthias.braeger@cern.ch
      ▪ Big Data @ CERN
      ▪ In-Memory Data Grid & Streaming Analytics
      ▪ Concrete CERN Example
  • 5. Physics data (>100 PB), Metadata of physics data, Sensor data of technical installations, Log data, Configuration data, Documents, Media data, Others
  • 6. European Organization for Nuclear Research
      ▪ Founded in 1954 (60 years ago!)
      ▪ 21 Member States
      ▪ ~ 3’360 Staff, fellows, students...
      ▪ ~ 10’000 Scientists from 113 different countries
      ▪ Budget: 1 billion CHF/year
      http://cern.ch
  • 7. From Physics to Industry
  • 8. LHC: The world's biggest machine
      ▪ Experiments: ATLAS, CMS, LHCb, ALICE
      ▪ Generated 30 petabytes in 2012
      ▪ > 100 PB in total!
  • 9. LHC - Large Hadron Collider
      ▪ 27 km ring of superconducting magnets
      ▪ Started operation in 2010 with 3.5 + 3.5 TeV, 4 + 4 TeV in 2012
      ▪ 2013 – 2015 in Long Shutdown 1 (machine upgrade)
      ▪ Restarted in April 2015 with 6.5 + 6.5 TeV max
  • 10. Some ATLAS facts
      ▪ 25 m diameter, 46 m length, 7’000 tons
      ▪ 100 million channels
      ▪ 40 MHz collision rate (~ 1 PB/s)
      ▪ Run 1: 200 Hz (~ 320 MB/s) event rate after filtering
      ▪ Run 2: up to 1 kHz
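The ATLAS rates on slide 10 imply a per-event data size and a filtering factor that the slide does not spell out. The following is a rough back-of-the-envelope sketch, not a figure from the talk; it assumes decimal units (1 PB = 10^15 bytes, 1 MB = 10^6 bytes), and all variable names are made up for illustration.

```python
# Back-of-the-envelope check of the ATLAS figures quoted on slide 10.
# Assumption: decimal units (1 PB = 1e15 bytes, 1 MB = 1e6 bytes).
PB = 1e15
MB = 1e6

collision_rate_hz = 40e6      # 40 MHz collision rate
raw_rate_bytes = 1 * PB       # ~1 PB/s before filtering
run1_event_rate_hz = 200      # Run 1 event rate after filtering
run1_rate_bytes = 320 * MB    # ~320 MB/s after filtering

# Average amount of raw detector data produced per collision.
raw_bytes_per_collision = raw_rate_bytes / collision_rate_hz
# Average size of an event that survives filtering in Run 1.
run1_bytes_per_event = run1_rate_bytes / run1_event_rate_hz
# How many collisions happen for every event that is kept.
collisions_per_kept_event = collision_rate_hz / run1_event_rate_hz

print(f"raw data per collision:    {raw_bytes_per_collision / MB:.0f} MB")
print(f"Run 1 stored event size:   {run1_bytes_per_event / MB:.1f} MB")
print(f"collisions per kept event: {collisions_per_kept_event:,.0f}")
```

Under these assumptions the quoted rates work out to roughly 25 MB of raw data per collision, about 1.6 MB per stored event, and one kept event per 200,000 collisions in Run 1.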
  • 12. Is Hadoop used for storing the ~30 PB/year of physics data? No ;-(
      ▪ Experimental data are mainly stored on tape
      ▪ CERN uses Hadoop for storing the metadata of the experimental data
  • 13. Physics Data Handling
      ▪ Run 1: 30 PB per year, demanding 100’000 processors, with peaks of 20 GB/s writing to tape spread across 80 tape drives
      ▪ Run 2: > 50 PB per year
      CERN’s Computer Center (1st floor)
  • 14. Physics Data Handling
      By 2013, already more than 100 PB stored in total!
      ▪ > 88 PB on 55’000 tapes
      ▪ > 13 PB on disk
      ▪ > 150 PB free tape storage waiting for Run 2
      CERN’s tape robot
  • 15. Why tape storage?
      ▪ Cost of tape storage is a lot less than disk storage
      ▪ No electricity consumption when tapes are not being accessed
      ▪ Tape storage size = Data + Copy; Hadoop storage size = Data + 2 Copies
      ▪ No requirement to have all recorded physics data available within seconds
      CERN’s tape robot
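The storage-size argument on slide 15 can be made concrete with the Run 1 volume from slide 13. This is a minimal sketch rather than a calculation from the talk: it applies the slide's two rules (tape = data + 1 copy, Hadoop = data + 2 copies) to 30 PB per year, and the function and variable names are illustrative assumptions.

```python
def raw_capacity_pb(data_pb: float, extra_copies: int) -> float:
    """Raw capacity needed to hold the data plus its extra copies."""
    return data_pb * (1 + extra_copies)


run1_data_pb = 30  # Run 1: 30 PB of physics data per year (slide 13)

# Slide 15: tape storage size = Data + Copy
tape_pb = raw_capacity_pb(run1_data_pb, extra_copies=1)
# Slide 15: Hadoop storage size = Data + 2 Copies
hadoop_pb = raw_capacity_pb(run1_data_pb, extra_copies=2)

print(f"tape:   {tape_pb:.0f} PB per year")
print(f"Hadoop: {hadoop_pb:.0f} PB per year")
print(f"extra capacity needed with Hadoop: {hadoop_pb - tape_pb:.0f} PB per year")
```

With these numbers, keeping the physics data in Hadoop would need 90 PB of raw capacity per year instead of 60 PB on tape, before any difference in cost per petabyte or electricity is taken into account.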
  • 16. @ CERN 3 HBase Clusters ▪  CASTOR Cluster with ~10 servers -  ~ 100 GB of Logs per day -  > 120 TB of Logs in total ▪  ATLAS Cluster with ~20 servers -  Event index Catalogue for experimental Data in the Grid ▪  Monitoring Cluster with ~10 servers -  Log events from CERN Computer Center
  • 17. Metadata from physics event Metadata are created upon recording of the physics event Example 1: ▪  Tape Storage event log -  On which tape is my file stored? -  Is there a copy on disk? -  List all events for a given tape or drive -  Was the tape repacked?
  • 18. Example 1: Tape Storage event log
  • 19. Metadata from physics event Metadata are created upon recording of the physics event Example 2: ▪  Information about -  Event number -  Run number -  Timestamp -  Luminosity block number -  Trigger that selected the event, etc.
  • 20. Example 2: ATLAS EventIndex catalogue Prototype of an event-level metadata catalogue for all ATLAS events ▪  In 2011 and 2012, ATLAS produced 2 billion real events and 4 billion simulated events ▪  Migration to Hadoop for run 2 of the LHC Data are read from the brokers, decoded and stored into Hadoop.
  • 21. Example 2: ATLAS EventIndex catalogue The major use cases of the EventIndex project are: ▪  Event picking: give me the reference (pointer) to "this" event in "that" format for a given processing cycle. ▪  Production consistency checks: technical checks that processing cycles are complete (event counts match). ▪  Event service: give me the references (pointers) for “this” list of events, or for the events satisfying given selection criteria
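The event picking and consistency check use cases can be illustrated with a toy in-memory index. This is only a sketch: the key fields, the pointer strings and the function names are assumptions made for illustration and do not reflect the actual EventIndex schema or API.

```python
from typing import NamedTuple, Optional

class EventKey(NamedTuple):
    """Hypothetical key identifying one event in one format and processing cycle."""
    run_number: int
    event_number: int
    data_format: str
    processing_cycle: str

# Maps a key to a reference (pointer) into the stored physics data.
index: dict = {}

def register(key: EventKey, pointer: str) -> None:
    index[key] = pointer

def pick_event(run: int, event: int, data_format: str, cycle: str) -> Optional[str]:
    """Event picking: return the pointer to 'this' event in 'that' format."""
    return index.get(EventKey(run, event, data_format, cycle))

def count_events(run: int, data_format: str, cycle: str) -> int:
    """Count indexed events of a run for one format and processing cycle."""
    return sum(
        1 for k in index
        if (k.run_number, k.data_format, k.processing_cycle) == (run, data_format, cycle)
    )

# Example data: two events stored in two hypothetical formats.
for event_number in (1, 2):
    register(EventKey(1000, event_number, "RAW", "cycle-1"), f"raw-file#{event_number}")
    register(EventKey(1000, event_number, "AOD", "cycle-1"), f"aod-file#{event_number}")

pointer = pick_event(1000, 2, "AOD", "cycle-1")
counts_match = count_events(1000, "RAW", "cycle-1") == count_events(1000, "AOD", "cycle-1")
```

A production catalogue would hold billions of entries in a distributed store, but the lookup by (run, event, format, cycle) and the count comparison follow the same idea.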
  • 22. Agenda Matthias Bräger Software Engineer CERN matthias.braeger@cern.ch ▪  Big Data @ CERN ▪  In-Memory Data Grid & Streaming Analytics ▪  Concrete CERN Example
  • 23. Physics data(>100 PB) Metadata of physics data Sensor Data of technical installations Log data Configuration data Documents Media data Others
  • 24. Growth of Data Transactions, Sensors, Logs, M2M, ..
  • 25. Big Data is first of all a cost factor! The infrastructure has to be put in place to store the data. To get the maximum return on investment, you need good analytic tools and well-defined target goals to harvest the precious insights in your data
  • 26. The value of real time Latency Matters
  • 28. The Shift: from 90% of Data in Disk-based Databases to 90% of Data In-Memory. RAM is 58,000 times faster than disk and 2,000 times faster than solid-state drives (SSD)
  • 29. Tiered Storage (speed in TPS, latency, capacity) ▪  Process Memory: 2,000,000+ TPS, microseconds, 4 GB ▪  Local off-heap Memory: 1,000,000 TPS, microseconds, 32 GB – 12 TB ▪  Distributed memory (server RAM or Flash/SSD): 100,000 TPS, milliseconds, 100s GB – 100s TB ▪  External Data Source (e.g., Database, Hadoop, Data Warehouse): 1,000s TPS, seconds
  • 30. Achieving High Availability with RAM? In-memory data grids replicate the data to one or more nodes
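Replication to one or more nodes can be sketched with a toy in-memory map. This is a minimal illustration and not how any particular data grid product works: the `Node` and `ReplicatedMap` classes, the CRC-based placement and the `backups` parameter are all assumptions.

```python
import zlib

class Node:
    """One grid member holding its share of the data in RAM."""
    def __init__(self, name: str):
        self.name = name
        self.data = {}
        self.alive = True

class ReplicatedMap:
    """Stores every entry on a primary node and on `backups` further nodes."""
    def __init__(self, nodes, backups: int = 1):
        if backups >= len(nodes):
            raise ValueError("need more nodes than backups")
        self.nodes = nodes
        self.backups = backups

    def owners(self, key: str):
        # Deterministic placement: the primary is chosen by hash, backups follow it.
        start = zlib.crc32(key.encode("utf-8")) % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)]
                for i in range(self.backups + 1)]

    def put(self, key: str, value) -> None:
        for node in self.owners(key):
            if node.alive:
                node.data[key] = value

    def get(self, key: str):
        # Read from the first owner that is still alive.
        for node in self.owners(key):
            if node.alive and key in node.data:
                return node.data[key]
        return None

grid = ReplicatedMap([Node("a"), Node("b"), Node("c")], backups=1)
grid.put("sensor:42", 21.5)

# Simulate losing the primary: the backup copy still serves the value.
primary, backup = grid.owners("sensor:42")
primary.alive = False
value_after_failure = grid.get("sensor:42")
```

The value survives the loss of one node because a second copy exists in another node's RAM; losing both owners at once would lose the entry, which is why grids let you choose the number of backups.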
  • 31. Why now? Explosion in volume and velocity of data Steep drop in price of RAM
  • 32. In-Memory Data Grid (IMDG) Platforms -  Scale of NoSQL -  Low latency of In-Memory databases -  Reliability & Fault Tolerance -  Transactional Guarantees Fast Big Data
  • 33. Scale with data and processing needs Increase Data in Memory Reduce database dependency DB Application DB Application In-Memory Distributed In-Memory DB Application In-Memory Application In-Memory
  • 34. Use Cases for IMDG •  Cache to overcome legacy data bottlenecks •  Cache for transient data •  Primary store for modern apps •  NoSQL database at in-memory speed •  Data services fabric for real-time data integration •  Compute grid at in-memory speed
  • 35. In-Memory Data Grid solutions Forrester Wave™: In-Memory Data Grids, Q3 2015
  • 36. Analyzing data In-Motion Streaming analytics and filtering with Complex Event Processing (CEP) CEP engine … Events … Actions In-Memory Event stack Hadoop Storage Alarming
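The idea of analysing events in motion can be illustrated with a tiny sliding-window rule. This sketch is not a CEP engine and does not use any product's query language; the class name, the window size and the threshold are assumptions chosen for the example.

```python
from collections import deque
from typing import Optional

class SlidingWindowRule:
    """Raises an action when the mean of the last N events exceeds a threshold."""
    def __init__(self, window_size: int, threshold: float):
        self.window = deque(maxlen=window_size)  # keeps only the newest N values
        self.threshold = threshold

    def on_event(self, value: float) -> Optional[str]:
        self.window.append(value)
        if len(self.window) < self.window.maxlen:
            return None  # not enough events yet to evaluate the rule
        mean = sum(self.window) / len(self.window)
        return "ALARM" if mean > self.threshold else None

rule = SlidingWindowRule(window_size=3, threshold=50.0)
events = [40.0, 45.0, 60.0, 70.0, 80.0]  # e.g. temperature readings
actions = [rule.on_event(v) for v in events]
```

Each event is evaluated as it arrives and only the last three values are kept in memory, so the rule reacts without first writing the stream to storage.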
  • 37. Streaming Analytics •  Many existing products, but still no standards •  : Open Source, SQL like query language •  JEPC: An attempt to standardize event processing, from Database Research Group, University of Marburg: http://www.mathematik.uni-marburg.de/~bhossbach/jepc/
  • 39. Agenda Matthias Bräger Software Engineer CERN matthias.braeger@cern.ch ▪  Big Data @ CERN ▪  In-Memory Data Grid & Streaming Analytics ▪  Concrete CERN Example
  • 40. Cooling Access Control Safety Systems Network and Hardware Controls Cryogenics Electricity
  • 41. TIM – Technical Infrastructure Monitoring ▪ Operational since 2005 ▪ Used to monitor and control infrastructure at CERN ▪ 24/7 service ▪ ~ 100 different main users at CERN ▪ Since Jan. 2012 based on new server architecture with C2MON CERN Control Center at LHC startup
  • 42. Cooling Safety Systems Electricity Access Network and Hardware Controls Cryogenics TIM Server based on C2MON Client Tier Data Analysis Video Viewer TIM Viewer Access Management Alarm Console Data Acquisition & Filtering > 1200 commands > 1300 rules > 91k data sensors > 41k alarms Web Apps
  • 43. TIM Server based on C2MON Client Tier Data Analysis Video Viewer TIM Viewer Access Management Alarm Console Data Acquisition & Filtering ca. 400 million raw values per day Filtering ca. 2 million updates > 1200 commands > 1300 rules > 91k data sensors > 41k alarms Web Apps
  • 44. C2MON - CERN Control and Monitoring Platform C2MON server C2MON client API my app C2MON DAQ API my DAQ … ▪  Allows the rapid implementation of high-performance monitoring solutions ▪  Modular and scalable at all layers ▪  Optimized for High Availability & big data volume ▪  Based on In-Memory solution ▪  All written in Java Currently used by two big monitoring systems @CERN: TIM & DIAMON Central LHC alarm system (“LASER”) in migration phase http://cern.ch/c2mon
  • 45. C2MON server C2MON architecture Application Tier C2MON server DAQs History / Backup In-Memory -  Configuration -  Rule logic -  Latest sensor values
  • 46. C2MON Server C2MON server core In-Memory Store (JCache - JSR-107) DAQ out DAQ in DAQ supervision Cache persistence Cache loading Lifecycle Configuration Cache DB access Logging Alarm Rules Benchmark Video access C2MON server modules Client communication Authentication
  • 47. Open Source time-series databases ▪  OpenTSDB: Uses as storage model ▪  : Uses Apache as storage model ▪  : Natively time-series, using LMDB storage engine ▪  : Built on top of Apache Lucene™
  • 50. Scenario 1: High availability •  moderate data size •  average throughput •  min service interrupts •  high availability C2MON SERVER DAQ process DAQ process DAQ process DAQ process Clustered JMS brokers JMS broker JMS broker JMS broker JMS broker C2MON client C2MON client C2MON client C2MON SERVER standby
  • 51. Raw data filtering on DAQ layer GIQO: Garbage In, Quality Out
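Filtering raw values close to the source is what reduces ca. 400 million raw values per day to ca. 2 million updates, a factor of about 200. The slides do not say which filters the DAQ layer applies; the sketch below assumes a simple value deadband, a common technique in monitoring systems, and the class name and threshold are hypothetical.

```python
class DeadbandFilter:
    """Forwards a value only if it moved at least `deadband` since the last forwarded one."""
    def __init__(self, deadband: float):
        self.deadband = deadband
        self.last_sent = {}  # sensor id -> last forwarded value

    def accept(self, sensor_id: str, value: float) -> bool:
        last = self.last_sent.get(sensor_id)
        if last is None or abs(value - last) >= self.deadband:
            self.last_sent[sensor_id] = value
            return True   # significant change: send to the server
        return False      # noise: drop at the DAQ layer

daq_filter = DeadbandFilter(deadband=1.0)
raw_values = [20.0, 20.1, 20.2, 21.5, 21.6, 19.0]
forwarded = [v for v in raw_values if daq_filter.accept("temp-1", v)]

# Reduction factor implied by the figures quoted for TIM.
tim_reduction = 400e6 / 2e6
```

Here three of the six raw readings are forwarded; the others differ too little from the last forwarded value to carry new information, so the server and its clients never see them.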
  • 52. Scenario 2: High requirements •  large data set •  high throughput •  min service interrupts •  high availability DAQ process DAQ process DAQ process DAQ process server array C2MON SERVER CLUSTER C2MON SERVER CLUSTER JMS broker cluster JMS broker cluster C2MON client C2MON client JMS broker cluster JMS broker cluster C2MON client C2MON client
  • 53. C2MON Roadmap ▪  Offering C2MON to the Open Source community http://cern.ch/c2mon ▪  Introduction of Complex Event Processing (CEP) module ▪  Migrating historical event store from relational database to time series database
  • 54. IoT = Internet of Things or … Intranet of Things? ▪ Creating a smarter world happens first in the Intranet ▪ Challenge: Integrating heterogeneous systems and protocols ▪ Many IoT solutions available, but often closed products which are not compatible with each other Internet of Things: ▪ Integrating and analysing monitoring data from a variety of installations of the same device type throughout the industry is essential.
  • 55. Takeaways ▪  Data and High Availability services are more important than ever before for all modern organizations. ▪  Deriving value from collected data is key to success. ▪  In-Memory platforms are essential for high value & high velocity data storage and processing.
  • 56. Credits & References Many thanks to CERN & Software AG: -  Sebastien Ponce (CERN), for providing information about CASTOR -  Rainer Toebbicke (CERN), for providing information about CERN HBASE service -  Jan Iven (CERN), for being helpful finding information about existing CERN Hadoop projects -  Software AG/Terracotta Product & Engineering Team References: -  C2MON: http://cern.ch/c2mon -  The ATLAS EventIndex: https://cds.cern.ch/record/1690609 -  Agile Infrastructure at CERN - Moving 9'000 Servers into a Private Cloud, Helge Meinhard (CERN): http://vimeo.com/93247922 -  CRAN, The Comprehensive R Archive Network: http://cran.r-project.org -  Software AG Terracotta: http://www.terracotta.org