Data Organization & Big Data Architecture
Agenda
• Data Organization
• Big Data Architecture
• Recruitment
Data Organization
Line Of Business
HR Finance Sales Customers
Competitors Markets Products Supply
Traffic Acquisition
Communication Security Prospects
* If you read this text, work in the data field and are interested in joining us, please go to: https://www.ovh.com/fr/careers/
Use Line Of Business
• LOB 1 (Customer): BI Team, Data Science Team
• LOB 2 (Support): BI Team, Data Science Team
• LOB 3 (…): BI Team, Data Science Team
Data Office
• Data Centralization
• Datalake
• Cleansing
• Data Integration
Data Office
CRM
BI Team
Data Science Team
• Extracts
Data Analyst
• Events
• Actions
Customer Animation
• Product Analysis
• Global Analysis
BUS
• Country Analysis
SUBS
• PAC
• Ad hoc analysis
Digital
• Onsite
• Partner
BIZDEV
• Campaigns
• Text mining
Traffic Acquisition
• Segmentation
• Normalisation
Targeting
Channel
In case you missed it on the previous slide: if you work in the data field, we are interested in your profile!
Data Maturity
Level 1: POC
• Data are manually created or extracted once
• Data are modified by one data scientist
• Data are assessed by a data analyst and, once checked, manually sent to a business analyst
Level 2: Manual
• Data are manually created on a regular basis
• Data are manually added to the enterprise model with an automated process
• Data can be used by all data scientists, data analysts or business analysts
Level 3: Automatic
• Data are created through a controlled business process
• Data are automatically added to the enterprise model
• Data can be used by all data scientists, data analysts or business analysts
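The three maturity levels can be read as a small decision rule. The sketch below is only an illustration of that reading, not a tool shown in the talk: the class names, field names and the exact conditions are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class MaturityLevel(IntEnum):
    """The three levels named on the slide."""
    POC = 1
    MANUAL = 2
    AUTOMATIC = 3


@dataclass(frozen=True)
class DatasetProfile:
    # Hypothetical fields, introduced only for this illustration.
    produced_regularly: bool           # False: created or extracted once
    controlled_business_process: bool  # True: created through a controlled business process
    in_enterprise_model: bool          # True: added to the enterprise model


def maturity_level(profile: DatasetProfile) -> MaturityLevel:
    """Map a dataset profile to a maturity level (assumed reading of the slide)."""
    if profile.controlled_business_process and profile.in_enterprise_model:
        return MaturityLevel.AUTOMATIC
    if profile.produced_regularly and profile.in_enterprise_model:
        return MaturityLevel.MANUAL
    return MaturityLevel.POC


def usable_by_everyone(level: MaturityLevel) -> bool:
    """From level 2 on, data can be used by all scientists and analysts."""
    return level >= MaturityLevel.MANUAL
```

Under these assumptions, a one-off extract held by a single data scientist stays at level 1, and only data that reaches the enterprise model becomes available to every analyst.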
Data Maturity Matrix
Customers Competitors Products
Advanced 5 Potential Strategy
4 Attrition New Product
3 Churn Rank
2 Adds Event
Basic 1 NIC Pricing …
Exploration: Code First
Industrialisation: Model First
Data preparation: 80%
Data Scientists
Data Analysts
Business Analysts
Technical model
Machine Learning: 20%
Analyse
Test
Validation
Data Analysis / Creation
Data Analysis
Data Committee
Data Management Team (Architect + Data Integrator)
DataViz
Model
Enterprise Model Building
Datamart and report building
Business Intelligence Team
DTM
Data Prepare: industrialise
Build Datamart and Dashboard
POC
Datastore 360
Expose
POC
Enterprise model
POC Mode
Level 2 & 3 mode
Data Lake Team
Data Committee
Objectives
• Define the data that needs to be added to the enterprise data
• Define priorities and owners by subject
• Industrialise new data production: from Excel to a full business process
• Validate the enterprise model
  – Common vocabulary
  – Business and/or functional model
• Be informed of evolutions
Participants
• Data Scientist
• Data Analyst
• Business Analyst
• Data Management Team
Periodicity
• Every month
Datastore 360
EDS 360
History
• Gets all data from
  – Front office applications
  – Back office applications
  – External data
• Stores data in a business-oriented model
• Responsible for historizing data when this makes sense for the business
  – What data do we want to keep? What will I need in 20 years?
• Exposes data to all applications that require it
  – Business Intelligence: reporting or datamart
  – Front office applications
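To make the history/current split concrete, here is a minimal in-memory sketch, assuming a very simple rule: the current view holds the latest record per key, and the previous version moves to history only when the business has asked for it. The class `Datastore360Sketch`, its methods and the `keep_history` flag are hypothetical names for this illustration and do not describe the real Datastore 360.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass
class Datastore360Sketch:
    """Toy store with a 'current' view and an optional 'history' view."""
    current: dict[str, dict[str, Any]] = field(default_factory=dict)
    history: list[tuple[str, datetime, dict[str, Any]]] = field(default_factory=list)

    def upsert(self, key: str, record: dict[str, Any],
               keep_history: bool = True,
               now: Optional[datetime] = None) -> None:
        """Store the latest record; archive the previous one if requested."""
        now = now or datetime.now(timezone.utc)
        previous = self.current.get(key)
        # Historize only when it makes sense for the business (keep_history)
        # and the record actually changed.
        if keep_history and previous is not None and previous != record:
            self.history.append((key, now, previous))
        self.current[key] = dict(record)

    def read_current(self, key: str) -> Optional[dict[str, Any]]:
        """What a front office application or a report would read."""
        return self.current.get(key)

    def read_history(self, key: str) -> list[dict[str, Any]]:
        """Earlier versions of a record, oldest first."""
        return [rec for k, _, rec in self.history if k == key]
```

For example, upserting `{"country": "FR"}` and then `{"country": "DE"}` under the same key leaves the second record in the current view and the first one in history.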
Current
Client Product Activity
Client Product Activity
…
…
Data Scientist
Data Analyst
Business Analyst
DataViz
User APPs
(CRM,
Support
api
api Direct
read
Big Data Architecture
Context
~50 SQL replicas
~700 databases
~300K tables
~100 TB
~500K events/s
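A quick back-of-the-envelope calculation puts these orders of magnitude in perspective. It uses only the approximate figures above, so the results are rough estimates, and it assumes the event rate is sustained over a whole day, which the slide does not state.

```python
# Approximate figures taken from the slide.
EVENTS_PER_SECOND = 500_000
DATABASES = 700
TABLES = 300_000

SECONDS_PER_DAY = 24 * 60 * 60

# Assumes the ~500K events/s rate holds for a full day.
events_per_day = EVENTS_PER_SECOND * SECONDS_PER_DAY
tables_per_database = TABLES / DATABASES

print(f"{events_per_day:,} events per day")
print(f"about {tables_per_database:.0f} tables per database on average")
```

That is roughly 43 billion events per day and a little over 400 tables per database on average.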
Datalake Hardware view
Private network
OVH Dedicated server
OVH Public Cloud
High scalability
Security
Performance
Reliability
Lille Grand Palais – 28 February 2017
Datalake software view
Pig
Flink
Spark
HDFS
HBase
Phoenix
Kafka (Queue)
Couchbase
Jobs
Job | Skills | Output
Data Analyst | Excel; Dataviz: Tableau, Power BI | Data strategy
Data Scientist | Scala, Java, R, Python, Cube | Datasets, Flows, Patterns, Models
Data Integrator | Flink, HBase, Pig, Spark | Data preparation
Data DevOps | Kafka, HBase, Go, Apache Beam, … | Datalake
Thank you !
Join us : ovh.com/fr/careers

E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 

Meetup Data-science OVH

  • 1. Data Organization & Big Data Architecture
  • 2. Agenda: Data Organization; Big Data Architecture; Recruitment
  • 4. Line Of Business: HR, Finance, Sales, Customers, Competitors, Markets, Products, Supply, Traffic Acquisition, Communication, Security, Prospects. * If you read this text, work in the data field and are interested in joining us, please go to: https://www.ovh.com/fr/careers/
  • 5. Use Line Of Business: LOB 1 (Customer): BI Team, Data Science Team; LOB 2 (Support): BI Team, Data Science Team; LOB 3 …: BI Team, Data Science Team
  • 6. Data Office: Data Centralization; Datalake; Cleansing; Data Integration. Data Office CRM: BI Team; Data Science Team. Activities by area: Data Analyst (Extracts); Customer Animation (Events, Actions); BUS (Product Analysis, Global Analysis); SUBS (Country Analysis); Digital (PAC, Ad hoc Analysis); BIZDEV (Onsite, Partner); Traffic Acquisition (Campaigns, Text mining); Targeting (Segmentation, Normalisation); Channel. In case you missed it on the previous slide, if you work in the data field, we are interested in your profile!
  • 7-9. Data Maturity (built up level by level across three slides)
      – Level 1, POC: data are manually created or extracted once; data are modified by one data scientist; data are assessed by a data analyst and manually sent to a business analyst after control.
      – Level 2, Manual: data are manually created on a regular basis; data are manually added to the enterprise model with an automated process; data can be used by all data scientists, data analysts or business analysts.
      – Level 3, Automatic: data are created through a controlled business process; data are automatically added to the enterprise model; data can be used by all data scientists, data analysts or business analysts.
  • 10. Data Maturity Matrix. Columns: Customers, Competitors, Products. Rows, from Advanced (5) to Basic (1): 5: Potential, Strategy; 4: Attrition, New Product; 3: Churn, Rank; 2: Adds, Event; 1: NIC, Pricing, …
  • 11-16. Team workflow diagram (built up step by step across six slides). Labels on the final build:
      – Phases: Exploration (code first); Industrialisation (model first)
      – Effort split: Data preparation 80%; Machine Learning 20%
      – Teams: Data Scientists, Data Analysts, Business Analysts; Data Management Team (Architect + Data Integrator); Business Intelligence Team; Data Lake Team
      – Modes: POC mode; Level 2 & 3 mode
      – Other labels: Tool / Infrastructure; Technical model; Analyse, Test, Validation; Data Analysis / Creation; Data Analysis; Data Committee; DataViz Model; Enterprise Model Building; Datamart and report building; DTM; Data Prepare: industrialise; Build Datamart and Dashboard; POC; Datastore 360; Expose POC; Enterprise model
  • 17. Data Committee
      – Objectives: define the data that needs to be added to enterprise data; define priority and owners by subject; industrialise new data production, from Excel to a full business process; validate the enterprise model (common vocabulary, business and/or functional model); be informed of evolution.
      – Participants: Data Scientist, Data Analyst, Business Analyst, Data Management Team.
      – Periodicity: every month.
  • 18. Datastore 360 (EDS 360)
      – Gets all data from front office applications, back office applications and external data.
      – Stores data in a business-oriented model.
      – Responsible for historizing data when this makes sense for the business: what data do we want to keep? What will I need in 20 years?
      – Exposes data to all applications that require it: Business Intelligence (reporting or datamart) and front office applications.
      – Diagram labels: History (Client, Product, Activity, …); Current (Client, Product, Activity, …); consumers: Data Scientist, Data Analyst, Business Analyst, DataViz User, APPs (CRM, Support); access: api, api, direct read.
  • 20. Context: ~50 SQL replicas; ~700 databases; ~300K tables; ~100 TB; ~500K events/s
  • 21. Datalake hardware view: private network; OVH Dedicated Server; OVH Public Cloud. Properties: high scalability, security, performance, reliability.
  • 22. Lille Grand Palais, 28 February 2017
  • 24. Jobs (job: skills; output)
      – Data Analyst: Excel, dataviz (Tableau, PowerBI); output: data strategy
      – Data Scientist: Scala, Java, R, Python, Cube; output: datasets, flows, patterns, models
      – Data Integrator: Flink, HBase, Pig, Spark; output: data preparation
      – Data DevOps: Kafka, HBase, Go, Apache Beam, …; output: datalake
  • 25. Thank you! Join us: ovh.com/fr/careers
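
The three data maturity levels described in the Data Maturity slides can be read as a simple decision rule. The sketch below is one hypothetical way to encode that rule in Python. The `DatasetProfile` fields and the `maturity_level` function are illustrative names that do not appear in the deck, and the rule is a simplification of the slide text rather than a definition used by the speakers.

```python
from dataclasses import dataclass
from enum import IntEnum


class MaturityLevel(IntEnum):
    """Levels named on the Data Maturity slides."""
    POC = 1
    MANUAL = 2
    AUTOMATIC = 3


@dataclass(frozen=True)
class DatasetProfile:
    """Hypothetical descriptors of a dataset (not from the deck)."""
    produced_regularly: bool           # False: created or extracted once
    controlled_business_process: bool  # True: created through a controlled business process
    in_enterprise_model: bool          # True: added to the enterprise model


def maturity_level(profile: DatasetProfile) -> MaturityLevel:
    """Map a dataset profile to a maturity level (simplified reading of the slides)."""
    if profile.controlled_business_process and profile.in_enterprise_model:
        return MaturityLevel.AUTOMATIC
    if profile.produced_regularly and profile.in_enterprise_model:
        return MaturityLevel.MANUAL
    # One-off extracts, or data not yet in the enterprise model, stay at POC.
    return MaturityLevel.POC


one_off_extract = DatasetProfile(False, False, False)
monthly_upload = DatasetProfile(True, False, True)
business_process_feed = DatasetProfile(True, True, True)

print(maturity_level(one_off_extract).name)
print(maturity_level(monthly_upload).name)
print(maturity_level(business_process_feed).name)
```

Using an `IntEnum` keeps the levels comparable, so a check such as "at least Level 2 before exposing data to all analysts" can be written as `maturity_level(p) >= MaturityLevel.MANUAL`.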

Editor's notes

  1. A secured cluster accessible through a gateway. The computing layer is based on Public Cloud instances in order to scale quickly. Cold storage, on the other hand, is based on dedicated servers for higher performance. vRack technology for the dedicated network; Public Cloud for scalability.
  2. A secured cluster accessible through a gateway. The computing layer is based on Public Cloud instances in order to scale quickly. Cold storage, on the other hand, is based on dedicated servers for higher performance. vRack technology for the dedicated network; Public Cloud for scalability -> datanode.
  3. Hadoop ecosystem with HDFS for data storage, and HBase plus Phoenix for SQL support on columnar storage (the relational data storage layer). Couchbase for document data storage. Key-value data can be stored in either HDFS or Couchbase, depending on access rate. Processing is done with Spark, Flink and Pig. Each of these solutions has its strong points, but Spark and Flink may be abstracted behind an Apache Beam layer in upcoming versions.
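
Note 3 says that key-value data goes to either HDFS or Couchbase depending on access rate, without giving the rule. The sketch below illustrates one possible routing policy. The direction (frequently read keys go to Couchbase, rarely read keys go to HDFS), the threshold value and the names `choose_store` and `HOT_READS_PER_HOUR` are all assumptions made for illustration, not details from the talk.

```python
from enum import Enum


class Store(Enum):
    HDFS = "hdfs"
    COUCHBASE = "couchbase"


# Hypothetical threshold: keys read at least this often count as "hot".
HOT_READS_PER_HOUR = 100.0


def choose_store(reads_per_hour: float, threshold: float = HOT_READS_PER_HOUR) -> Store:
    """Pick a backing store from an observed access rate.

    Assumption: hot keys are served from Couchbase, cold keys are kept on HDFS.
    """
    if reads_per_hour < 0:
        raise ValueError("reads_per_hour must be non-negative")
    return Store.COUCHBASE if reads_per_hour >= threshold else Store.HDFS


# Example access rates, invented for illustration.
observed = {"customer:42": 350.0, "invoice:2015-03": 0.2}
placement = {key: choose_store(rate) for key, rate in observed.items()}
print(placement)
```

A real deployment would also have to decide how the access rate is measured and when data is moved between stores, which the notes do not cover.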