3. About 11,850 amps to generate fields of around 8.4 tesla (about 150,000 times the Earth's magnetic field), but the magnets operate at low voltage.
A lot of what the LHC is about is electricity flow management.
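The "150,000 times" ratio above can be sanity-checked with one line of arithmetic; a sketch assuming an Earth surface field of 56 microtesla (the real value varies between roughly 25 and 65 µT by location):

```python
# Quick check of the ratio quoted above.
lhc_field = 8.4        # tesla, LHC dipole magnets
earth_field = 5.6e-5   # tesla; ~56 microtesla assumed for Earth's surface field

print(round(lhc_field / earth_field))  # -> 150000
```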
4. How BIG?
BIG data is like the LHC combined with gold extraction.
- Huge amount of data -> 6.6 zettabytes/year by 2016 (Cisco Cloud Index)
- Big flow of data -> 400 TB/day (Facebook)
- The LHC generates 10-15 petabytes/year of data for each experiment
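To put the three figures above on one scale, a little unit arithmetic helps; a sketch using the numbers as quoted:

```python
# Rough comparison of the data volumes quoted above (decimal units).
TB = 10**12  # terabyte, bytes
PB = 10**15  # petabyte
ZB = 10**21  # zettabyte

facebook_per_year = 400 * TB * 365       # Facebook's daily flow, annualized
lhc_per_experiment = 15 * PB             # upper LHC figure, per experiment/year
cloud_traffic_2016 = 6.6 * ZB            # Cisco Cloud Index projection

print(facebook_per_year / PB)  # -> 146.0  (petabytes/year)
```

So one year of Facebook's flow is on the order of ten LHC experiments, and the projected global cloud traffic dwarfs both.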
5. The essence of new service providers' BI-based revenue models (e.g. advertisement)
[Mindmap: the user consumes a service, and the service improves the user experience; the service produces a data set, which a core semantic enriches into value-enriched data that generates revenue. Revenue from existing data services will shrink; additional revenue comes from new services. One data set, many free services and a common semantic: the more context, the more efficient and the more value.]
Examples: search/information management; rated auction/selling.
6. Classic Approach
• Structured Data
• Data in the range of Gigabytes to Terabytes
• Centralized (data is imported into the analytics)
• Batch based
• Data silos
[Diagram: Transaction Database -> ETL -> Relational Data Warehouse -> ETL -> Analyse]
Where is the data that answers my questions?
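The classic flow (transaction database -> ETL -> warehouse -> analysis) can be sketched as a minimal batch job; all table and column names below are illustrative, not from the slides:

```python
import sqlite3

# Minimal batch ETL sketch: extract rows from a transactional store,
# transform them, and load the result into a warehouse-style table.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
src.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])

dwh = sqlite3.connect(":memory:")
dwh.execute("CREATE TABLE fact_orders (id INTEGER, amount_cents INTEGER)")

# Extract
rows = src.execute("SELECT id, amount FROM orders").fetchall()
# Transform (normalize currency to integer cents)
rows = [(oid, int(round(amount * 100))) for oid, amount in rows]
# Load
dwh.executemany("INSERT INTO fact_orders VALUES (?, ?)", rows)

total = dwh.execute("SELECT SUM(amount_cents) FROM fact_orders").fetchone()[0]
print(total)  # -> 2950
```

The defining trait is that data moves to the analytics in scheduled batches, which is exactly what creates the silos the slide complains about.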
7. Big Data Approach
• Multi Structured Data
• Data in the range of Terabytes to Petabytes
• Distributed/Federated (Analytics grab the data)
• Streaming based
• Holistic Data Clusters
[Diagram: Streams 1, 2, 3, ..., n -> Organize -> Analyse]
Here are the questions and the data for the answers
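The inversion described above (the analytics grab the data from n federated streams instead of importing it first) can be sketched with generators; the events and sources are illustrative:

```python
from collections import Counter

# Streaming sketch: analytics consume n source streams as the data flows.
def stream(source_id, events):
    for e in events:
        yield {"source": source_id, "event": e}

def organize(*streams):
    # Merge the incoming streams (round-robin here, for simplicity).
    iters = [iter(s) for s in streams]
    while iters:
        for it in list(iters):
            try:
                yield next(it)
            except StopIteration:
                iters.remove(it)

def analyse(merged):
    # The analysis pulls records as they arrive; nothing is bulk-imported.
    return Counter(rec["event"] for rec in merged)

s1 = stream(1, ["login", "click"])
s2 = stream(2, ["click", "click"])
result = analyse(organize(s1, s2))
print(result)  # -> Counter({'click': 3, 'login': 1})
```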
8. A new pattern
• Many different data structures (structured, unstructured, relational, graph, proprietary)
• Many different ways to extract the data
• Many different locations (even for the same type of data)
• Batch and real-time based
• Buffered or stream
• Correlation parameters
• Streaming: taping at source or taping on stream
• Buffering, routing, filtering; consumption to store
• Event collector; batch process/multi-structure stream
• Multi-stage store/process
[Diagram: sources — knowledge references, APIs, services, content sources, applications, social networks, RAN, data card, SIM card, connected things (consumer, enterprise), connected devices, IT infrastructure, premise gateway, network core — feed buffered and streamed data into a Data-as-a-Service layer. A real-time path (cheap storage, low-level semantic) and a non-real-time path (highly efficient storage, rich semantic) feed stream/graph network analysis and neural network analysis, ending in consumption: reports and statistics.]
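The buffering/routing/filtering stage in this pattern can be sketched as an event collector that routes each event either to a real-time stream or to buffered batch storage; the class, field names, and priority rule are illustrative assumptions:

```python
# Sketch of an event collector: buffer low-priority events for the
# batch/store path, route high-priority events to the real-time path.
class EventCollector:
    def __init__(self, batch_size=3):
        self.batch_size = batch_size
        self.buffer = []    # events waiting for the batch path
        self.realtime = []  # events routed straight to the stream
        self.store = []     # "cheap storage" for batch processing

    def collect(self, event):
        if event.get("priority") == "high":
            self.realtime.append(event)  # taping on stream, low latency
        else:
            self.buffer.append(event)    # consumption to store
            if len(self.buffer) >= self.batch_size:
                self.flush()

    def flush(self):
        self.store.extend(self.buffer)
        self.buffer.clear()

c = EventCollector(batch_size=2)
for ev in [{"priority": "low"}, {"priority": "high"}, {"priority": "low"}]:
    c.collect(ev)
print(len(c.realtime), len(c.store))  # -> 1 2
```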
9. With added security
• Strong access control based on industry standards (user, dev, application)
• Strong authorization control based on open standards
• Securing the infrastructure (public, private)
  • Policy (internal/external)
  • On-going assessment (DDOS, penetration ...)
  • Data leakage
  • Migration
• Securing the identity
  • Validating ID
  • Anonymization
• Securing the access
  • Distributed permission/preference
  • 3rd party permission
• Analytics applied to analytics
[Diagram: the same source-to-consumption landscape as the previous slide, with these security controls overlaid.]
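The anonymization point above is often implemented as keyed pseudonymization before records enter the analytics: identifiers are replaced by tokens so joins across data sets still work without exposing raw identities. A sketch; the key and record fields are illustrative:

```python
import hashlib
import hmac

# Anonymization sketch: replace user identifiers with keyed hashes.
# The key must be kept secret and rotated; this value is illustrative.
SECRET = b"rotate-this-key"

def anonymize(user_id: str) -> str:
    return hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()[:16]

record = {"user": "alice@example.com", "event": "login"}
safe = {**record, "user": anonymize(record["user"])}
# Same input maps to the same token, so correlation across data sets survives.
assert anonymize("alice@example.com") == safe["user"]
```

A keyed HMAC rather than a plain hash is used so that an attacker who knows the scheme cannot precompute tokens for known identifiers.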
10. Final thoughts
1. We need to eliminate the silos
– Sources or Usage
2. Still very much a collection of technologies
– The assembly is still very complex
3. Is everything about events?
4. We need to handle the CAP theorem more appropriately
5. What is the user experience (not just for the end user but also for the admin)?