TeraLab, A Secure Big Data Platform
Description And Use Cases
Franck Cotton – INSEE
Kamel Gadouche – GENES/CASD
TERALAB: DESCRIPTION
*Deloitte study, 2013
Birth of the TeraLab project
• Call for projects “Cloud computing / Big Data” conducted by the French
Government
• Proposal for the construction and operation of a Big Data platform,
– For innovation, research and education projects
– Submitted by a consortium comprising
• The IMT (Institut Mines-Télécom)
• The GENES, particularly the CASD (secure remote access data center)
• With INSEE partnership
• Project selected and launched
– Budget of 5.7 M€
– Over 5 years
– Contract signed in December 2013
The TeraLab platform
• A state-of-the-art technical infrastructure
– Elastic distributed system + tera-memory server
– With unique security features
• A rich catalogue of software tools
– Data storage (MPP, NoSQL)
– Query, exploration, visualization (Pig, Hive, Mahout…)
– Management and monitoring
• Data sets
– Pre-installed (public data, open data…)
– Brought by the projects, or acquired for them
• A dedicated team
– 6 people
– Platform configuration and operation
– Project advisors
Platform organization
The CASD
• The CASD is a facility including
– A central secure computing infrastructure (IICE): “the bubble”
– Dedicated access devices (SD-Box™), the sole means of accessing the IICE, which guarantee that it stays sealed off
• With the SD-Box, researchers may
– Work remotely on confidential data
• With 64-bit statistics software: SAS, Stata, R, Gauss, Matlab, LaTeX, Excel...
• Soon with Big Data software: Hive, Pig, Mahout, Revolution Analytics, Python…
– Request inputs or outputs
• Scripts or data
• Inputs and outputs are monitored
• With the SD-Box, data owners are sure that
– The authorized researcher is the one behind the SD-Box (smartcard and biometrics)
– No data can be retrieved by researchers (no copy and paste, printing, USB keys...)
The TeraLab platform – planning
• 2014 - 2015
– Incremental platform construction
– Pilot projects
– No cost
• 2016 - 2018
– Professionalization (business model, methodology, client support, etc.)
– Operating expenses recovery
• 2019 and beyond
– Target service offer
– Commercial mode
TERALAB: SOME USE CASES
Use cases in public statistics
• A burning subject
– The statistical community sees Big Data as a high-priority topic
– A few experiments in some pioneering statistical institutes (Estonia, the Netherlands, etc.)
– Several actions launched by international organizations (OECD, UNECE, Eurostat)
• How TeraLab fits in
– Needs: methodological tests, exploration of data sources, process redesign
– A presentation to the French official statistics system aroused much interest
– A specific project on scanner data for the consumer price index
• Currently a 7 TB relational database
– Other ideas expressed
• Telco data for tourism statistics
• Web site log analysis
• Next-generation social declarations
Use case for health data
• French context
– Everyone has a unique personal identifier (the NIR)
• Allowing data matching
• Longitudinal studies
• Using the NIR requires high confidentiality (organized by law)
– A central database with all the health services provided to every citizen
• More than 1.2 billion records with more than a thousand variables
• About 250 terabytes of data generated each year
• Real-time updates
• How TeraLab fits in
– Able to meet the challenges
• Huge volumes
• Real-time analysis
– While ensuring ultra-high security
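As a rough order-of-magnitude check on the volume quoted above, the sketch below converts about 250 terabytes per year into an average ingest rate. It assumes decimal terabytes (10^12 bytes) and a 365-day year; both are assumptions for illustration, not figures from the slides.

```python
# Back-of-the-envelope: average ingest rate implied by ~250 TB per year.
# Assumptions (not from the slides): 1 TB = 10**12 bytes, 365-day year.
TB = 10**12
SECONDS_PER_YEAR = 365 * 24 * 3600


def average_rate_mb_per_s(tb_per_year: float) -> float:
    """Return the average data rate in megabytes (10**6 bytes) per second."""
    bytes_per_second = tb_per_year * TB / SECONDS_PER_YEAR
    return bytes_per_second / 10**6


rate = average_rate_mb_per_s(250)
print(f"{rate:.1f} MB/s on average")
```

This is only a yearly average; actual load is probably uneven, and the accumulated volume and the real-time analysis requirement are likely the harder constraints.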
Use case for data challenges
• The DataScience web site (http://datascience.net)
– Allows data owners to issue public or private challenges based on their data
– Allows data scientists to analyze the data, to submit models and their results and to get
evaluation scores (ranking). The winner gets a prize in euros.
• The goals are to improve knowledge
– On methodological aspects
– On data management aspects
• How TeraLab fits in
– Makes it possible to organize challenges on Big Data
• hosted by TeraLab – standard
• hosted by TeraLab – bubble
– Help disseminate Big Data technologies
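The evaluation step described above (submit results, receive a score, get ranked) can be sketched as follows. This is a minimal illustration and not the DataScience.net implementation: the metric (root-mean-square error), the team names and the data are all hypothetical.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length, non-empty sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def leaderboard(submissions, truth):
    """Rank participants by ascending RMSE (lower is better)."""
    scores = {name: rmse(pred, truth) for name, pred in submissions.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical hidden ground truth held by the challenge organizer.
truth = [3.0, 5.0, 2.0]
# Hypothetical predictions submitted by three teams.
submissions = {
    "team_a": [2.5, 5.0, 2.5],
    "team_b": [3.0, 4.0, 2.0],
    "team_c": [3.0, 5.0, 2.0],
}

ranking = leaderboard(submissions, truth)
for rank, (name, score) in enumerate(ranking, start=1):
    print(rank, name, round(score, 3))
```

In a bubble-hosted challenge, one would expect the ground truth and the scoring to stay inside the secure environment, with only the scores leaving it; that arrangement is an assumption here, not something the slides specify.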
CONCLUSION
Where are we now?
• The story has just begun
– The planning is tight
• A beta version of the distributed service will open in April 2014
• The “tera-memory” server will open in summer 2014
• The ultra-secure compartment will open in September 2014
– The team is currently being set up
– Several pilot projects have been identified
– The methodology for project management is being defined
• Contact us if you have a Big Data project
• Visit us at http://www.teralab-datascience.fr/
Thank you for your attention
franck.cotton@insee.fr
kamel.gadouche@casd.eu
22/03/2014
WWW.TERALAB-DATASCIENCE.FR

EDF2014: Franck Cotton & Kamel Gadouche, France: TeraLab - A Secure Big Data Platform, Description And Use Cases
