Scaling Face Recognition
with Big Data
Bogdan BOCȘE
Solutions Architect & Co-founder, VisageCloud
• How to learn?
• What to learn?
• Defining learning objectives
• How to scale learning?
• Gotchas
• VisageCloud
–Architecture
–Use Cases
Agenda
• What questions to ask before writing the code?
• How to look at the data before feeding it to the
machine?
• What is the state of the art regarding ML?
• What frameworks to use?
• What are the common traps to avoid?
• How to design for scale?
Objectives
HOW TO LEARN?
Vision
• Convolutional Neural Networks
• Inception Paper
NLP
• Word2Vec
• GloVe: Global Vectors for Word Representation
Generic
• Classification
• Prediction
How to Learn?
Convolutional Neural Networks: Big Picture
• Pooling / Max Pooling
• Convolution
• Fully Connected Layer
– Activation Function, e.g. ReLU
Convolutional Neural Networks: Components
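The three components above can be sketched in a few lines of plain Python. This is a minimal illustration, not how any framework implements them: the function names, the toy 5x5 image, and the 2x2 edge kernel are all assumptions made for the example.

```python
# Illustrative building blocks of a CNN layer (names are hypothetical).

def convolve2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(feature_map):
    """Element-wise ReLU activation: max(0, x)."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]

# Toy image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Kernel that responds to dark-to-bright transitions along a row.
kernel = [[-1, 1], [-1, 1]]

features = max_pool(relu(convolve2d(image, kernel)))
```

The pooled output keeps a strong response only where the edge is, which is the kind of local feature later layers build on.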
• Learning is an optimization problem
–Find parameters of a system (neural network) that
minimize a fixed error function
–Not unlike planning orbital paths
• Defining the network architecture
• Defining the training algorithm
–Stochastic Gradient Descent
• With momentum
• With noise
Taking a Step Back: The Math
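To make "learning is an optimization problem" concrete, here is a minimal sketch of stochastic gradient descent with momentum fitting a line to noise-free points. The model (y = w*x + b), the squared-error loss, the learning rate, and the momentum value are assumptions chosen for the example, not settings from the talk.

```python
import random

def sgd_momentum(data, lr=0.01, momentum=0.9, epochs=200, seed=0):
    """Fit y = w*x + b by minimizing squared error, one sample at a time."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0      # parameters of the "system"
    vw, vb = 0.0, 0.0    # velocity terms that carry the momentum
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)          # the "stochastic" part
        for x, y in samples:
            err = (w * x + b) - y
            grad_w, grad_b = 2 * err * x, 2 * err
            # Velocity blends the previous direction with the new gradient.
            vw = momentum * vw - lr * grad_w
            vb = momentum * vb - lr * grad_b
            w += vw
            b += vb
    return w, b

# Hypothetical training set generated from y = 2x + 1.
data = [(x / 10, 2 * (x / 10) + 1) for x in range(-10, 11)]
w, b = sgd_momentum(data)
```

With this data the fitted parameters should end up close to w = 2 and b = 1.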
• DeepLearning4j
– Independent company
– Java interface with C-bindings for performance
• TensorFlow
– Python & C++ API
– Developed by Google
– Compatible with TPU
• Torch
– Developed by Facebook
– Written in LuaJIT, with Python bindings
Frameworks
WHAT TO LEARN?
• Public data sets
–Labeled Faces in the Wild (LFW)
–YouTube Faces
–Kaggle
• Private data sets
• Build your own
–Outsourcing: Mechanical Turk
–Crowdsourcing: ReCaptcha model
Data Sets
Preparing Data (diagram): clean data, surrounded by cropping, structure, homogeneity, normalization, histograms, filtering
• Machine learning is not magic
• If you can’t understand the data, a machine probably
won’t either
• Preprocessing often makes the difference between good and poor results
• Applying filters, normalization, anomaly detection is
computationally inexpensive
Preparing Data
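As an illustration of how cheap such preprocessing is, the sketch below computes a coarse histogram of grayscale pixel values and applies a min-max contrast stretch. The helper names, the four-bin histogram, and the sample pixel values are assumptions for the example.

```python
def histogram(pixels, bins=4, max_value=255):
    """Count pixels per intensity bin; a quick way to look at the data."""
    counts = [0] * bins
    for p in pixels:
        idx = min(p * bins // (max_value + 1), bins - 1)
        counts[idx] += 1
    return counts

def stretch_contrast(pixels, max_value=255):
    """Min-max normalization: map the darkest pixel to 0, the brightest to max_value."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return list(pixels)  # flat image, nothing to stretch
    return [round((p - lo) * max_value / (hi - lo)) for p in pixels]

# Hypothetical underexposed image: every value sits in the darkest bin.
dark_image = [10, 20, 30, 40, 50, 60]
before = histogram(dark_image)
stretched = stretch_contrast(dark_image)
after = histogram(stretched)
```

Comparing `before` and `after` shows the values spreading from a single bin across the full intensity range, in a single pass over the pixels.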
DEFINING LEARNING OBJECTIVES
• Supervised
–Classification
–Scoring and regression
–Identification
• Unsupervised
–Clustering
Defining learning objectives
• Projecting input onto a fixed set of classes
• “Don’t use a cannon to kill a fly”
–Support Vector Machines
• Linear
• Radial Basis Functions
Classification
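The two SVM variants differ only in the kernel, the function that scores how similar two inputs are. A minimal sketch of both kernels follows; the `gamma` default is an arbitrary assumption, and a real SVM would come from a library rather than these helpers.

```python
import math

def linear_kernel(x, y):
    """Plain dot product: similarity in the original feature space."""
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function: 1.0 for identical points, decaying with distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

near = rbf_kernel([0.0, 0.0], [1.0, 0.0])
far = rbf_kernel([0.0, 0.0], [3.0, 4.0])
```

The RBF kernel lets a linear method draw curved decision boundaries, which is often enough before reaching for a deep network.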
• Embedding
–Projecting input (image) onto a vector space with a
known property
• Triplet Loss Function
Identification
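A triplet loss pushes an embedding of the same person (positive) closer to the anchor than an embedding of a different person (negative), by at least a margin. The sketch below uses squared Euclidean distance and a margin of 0.2; both choices, and the two-dimensional toy embeddings, are assumptions for illustration.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero when the negative is farther than the positive by at least the margin."""
    return max(
        0.0,
        squared_distance(anchor, positive)
        - squared_distance(anchor, negative)
        + margin,
    )

anchor = [0.0, 0.0]
positive = [0.1, 0.0]        # same identity, close to the anchor
easy_negative = [1.0, 0.0]   # different identity, already far away
hard_negative = [0.2, 0.0]   # different identity, still too close

easy_loss = triplet_loss(anchor, positive, easy_negative)
hard_loss = triplet_loss(anchor, positive, hard_negative)
```

Only the hard triplet produces a non-zero loss, so training effort concentrates on the cases the embedding still confuses.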
• Splitting a set of items into non-overlapping subsets,
based on item attributes
• Counting people in video streams
• Algorithms:
–Fixed threshold
–K-means
–Rank-order clustering
Clustering
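The fixed-threshold approach can be sketched as a greedy pass over face embeddings: each one joins the first cluster whose representative is within the threshold, otherwise it starts a new cluster. The function name, the toy 2D embeddings, and the threshold value are assumptions; `math.dist` requires Python 3.8 or later.

```python
import math

def threshold_cluster(embeddings, threshold):
    """Greedy clustering: the first member of each cluster acts as its representative."""
    clusters = []
    for emb in embeddings:
        for cluster in clusters:
            if math.dist(emb, cluster[0]) <= threshold:
                cluster.append(emb)
                break
        else:
            # No existing cluster is close enough: treat this as a new person.
            clusters.append([emb])
    return clusters

# Hypothetical embeddings of faces detected across video frames.
detections = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 4.9), (0.05, 0.1)]
clusters = threshold_cluster(detections, threshold=0.5)
people_count = len(clusters)
```

Counting people in a stream then reduces to counting clusters, at the cost of being sensitive to the threshold and to the order in which detections arrive.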
HOW TO SCALE LEARNING?
• Scaling training
– Requires shared memory space
– Vertical scaling
• GPU
• Soon-to-come: TPU (tensor processing unit)
• Scaling evaluation
– Shared nothing architecture
– Neural network/classifier rarely changes
– Load balancing pattern
– Partitioning data if needed
How to scale learning?
• There is no “reduce” for neural networks
• Averaging weights/parameters
– Usually not a good idea
• Genetic algorithms
– Requires a lot of processing power
– Running independent iterations on different machines
– Crossover between weights/parameters of independently
trained neural networks after each epoch
Ideas for horizontal scaling
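The crossover idea can be sketched as follows: two machines train independently, and after an epoch a child network takes each weight from one parent or the other. Uniform per-weight crossover, the flat weight lists, and the seed are assumptions for the example; the talk does not specify the crossover scheme.

```python
import random

def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each weight is copied from one of the two parents."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]

# Hypothetical flattened weights from two independently trained networks.
weights_a = [0.10, -0.40, 0.75, 0.05]
weights_b = [0.20, -0.35, 0.60, -0.10]

rng = random.Random(42)
child = crossover(weights_a, weights_b, rng)
```

Unlike averaging, every weight in the child is a value that some trained network actually used, but many children must be evaluated to find a good one, which is where the processing cost comes from.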
GOTCHAS
• Our 2D and 3D intuition often fails in high dimensions
• Distances tend to become relatively “the same” as the
number of dimensions increases
• Dimensionality reduction
– Embedding functions
– Principal component analysis
The Curse of Dimensionality
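The claim about distances can be checked with a small experiment: draw random points in a unit hypercube and measure how much farther the farthest point is than the nearest one, relative to the nearest. The uniform distribution, the point count, and the chosen dimensions are assumptions for the sketch; `math.dist` requires Python 3.8 or later.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from a random query point."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    return (max(dists) - min(dists)) / min(dists)

contrast_by_dim = {d: relative_contrast(d) for d in (2, 10, 100, 1000)}
```

The contrast shrinks as the dimension grows, which is why nearest-neighbor comparisons on raw high-dimensional data become unreliable and a lower-dimensional embedding helps.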
• “The bottom of a valley is not necessarily the lowest
point on Earth”
• Learning algorithms may get stuck in local optima
• Using momentum or some random noise reduces
this possibility
• Using genetic algorithms can be even more robust,
but it’s computationally expensive
Local Optima
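A one-dimensional example shows how plain gradient descent ends up in different valleys depending on where it starts. The function f(x) = x^4 - 3x^2 + x, the learning rate, and the starting points are assumptions picked so that one valley is deeper than the other.

```python
def f(x):
    return x**4 - 3 * x**2 + x

def grad(x):
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x, lr=0.01, steps=2000):
    """Follow the negative gradient from the starting point x."""
    for _ in range(steps):
        x -= lr * grad(x)
    return x

right = gradient_descent(2.0)    # settles in the shallower valley (x > 0)
left = gradient_descent(-2.0)    # settles in the deeper valley (x < 0)
stuck_in_local_optimum = f(right) > f(left)
```

Both runs stop where the gradient is zero, yet only the run started on the left finds the lower point; momentum, noise, or restarts are ways to give the other run a chance to escape.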
Visualizing Local Optima (figure: monkey saddle)
“Based on state-of-the-art machine learning, our
weather forecast system can predict tomorrow’s
weather with 72% accuracy”
Evaluation of Learning
You get the same results by saying “it’s going to be the same as today”
• Don’t test on the data you train on
– Use different data set
– Split the data sets you have
• Beware of data biases
– Confirmation bias
– Survivorship bias
– Selection bias
• Compare against a benchmark, even a dummy one
– Coin flip
– Linear algorithms
– “Same-as-before”
Evaluation of Learning
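The "same-as-before" benchmark takes only a few lines to compute, which makes it a cheap sanity check before quoting any accuracy figure. The weather sequence below is made up for illustration and the helper names are assumptions.

```python
def accuracy(predictions, actuals):
    hits = sum(p == a for p, a in zip(predictions, actuals))
    return hits / len(actuals)

def same_as_before(history):
    """Dummy model: predict that each day repeats the previous day."""
    return history[:-1]

# Hypothetical daily observations.
weather = ["sun", "sun", "sun", "rain", "rain", "sun",
           "sun", "sun", "sun", "rain", "rain"]

baseline_accuracy = accuracy(same_as_before(weather), weather[1:])
```

On this sequence the dummy model is right 7 times out of 10, so a learned model reporting a similar number has not demonstrated anything yet.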
Architecture and Use Cases
High Level Architecture
VisageCloud Production (diagram)
• API Consumer (Customer Infrastructure)
• HAProxy (reverse proxy)
• Service (API Controller)
• Neural Networks (OpenCV, Dlib, Torch, pixie magic)
• Cassandra
• Image Storage (AWS S3)
• Containers (Docker)
• Connection labels: HTTPS, HTTP, CQL Binary
Detect faces → Align faces → Pre-processing → Feature extraction → Feature comparison
Processing Pipeline
• The collection
–Slice of data used together
–10K-100K records
• The Cache-Inside Pattern
–Loading / preloading collection in one application server
–Content based routing/balancing to maximize cache hits
–No logic in the database layer
–Requires periodic polling for updates
• Weaker consistency
Partitioning Data: Application Level Logic
Partitioning Data: Application Level Logic (diagram)
• Application Layer: three Application instances behind content-based balancing/routing
• Cassandra (Database Layer): four Cassandra nodes
• Flows between the layers: preload collection, poll for updates, write updates
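Content-based routing can be sketched as a deterministic mapping from a collection identifier to an application server, so that repeated requests for the same collection hit the server that already has it cached. Hashing the identifier with SHA-256 and taking it modulo the server count is an assumption for this sketch, as are the server names; the talk does not say how VisageCloud routes requests.

```python
import hashlib

def route(collection_id, servers):
    """Pick the same server for the same collection on every request."""
    digest = hashlib.sha256(collection_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

# Hypothetical application servers, each caching the collections routed to it.
servers = ["app-1", "app-2", "app-3"]

first = route("customer-42/employees", servers)
second = route("customer-42/employees", servers)
```

Modulo hashing remaps most collections when the number of servers changes; consistent hashing is a common alternative when servers are added or removed often.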
• Perform comparison logic in database
–User Defined Aggregate Functions
• Removes the need to move data around between
application and database
• Harder to deploy/test
• Stronger consistency
Partitioning Data: Database Level Logic
• It’s math, not magic
• If you don’t understand the data, neither will the
machine
• Preprocessing makes the difference
• Test against a benchmark, any benchmark
• Evaluate first, scale later
Key Takeaways
Bogdan@VisageCloud.com
+(40) 724 714 234
https://www.linkedin.com/in/bogdanbocse/
https://twitter.com/bocse
Let’s keep in touch

Weitere ähnliche Inhalte

Was ist angesagt?

Six Data Prep steps to Optimize Cloud Data Lakes - Big Data Expo 2019
Six Data Prep steps to Optimize Cloud Data Lakes - Big Data Expo 2019Six Data Prep steps to Optimize Cloud Data Lakes - Big Data Expo 2019
Six Data Prep steps to Optimize Cloud Data Lakes - Big Data Expo 2019webwinkelvakdag
 
Role of Unified AI and ML in Cloud Technologies. Which Cloud Service Provider...
Role of Unified AI and ML in Cloud Technologies. Which Cloud Service Provider...Role of Unified AI and ML in Cloud Technologies. Which Cloud Service Provider...
Role of Unified AI and ML in Cloud Technologies. Which Cloud Service Provider...Denodo
 
The Python ecosystem for data science - Landscape Overview
The Python ecosystem for data science - Landscape OverviewThe Python ecosystem for data science - Landscape Overview
The Python ecosystem for data science - Landscape OverviewDr. Ananth Krishnamoorthy
 
"Industrializing Machine Learning – How to Integrate ML in Existing Businesse...
"Industrializing Machine Learning – How to Integrate ML in Existing Businesse..."Industrializing Machine Learning – How to Integrate ML in Existing Businesse...
"Industrializing Machine Learning – How to Integrate ML in Existing Businesse...Dataconomy Media
 
Introduction to Azure Machine Learning
Introduction to Azure Machine LearningIntroduction to Azure Machine Learning
Introduction to Azure Machine LearningPaul Prae
 
Deep Learning for Recommender Systems with Nick pentreath
Deep Learning for Recommender Systems with Nick pentreathDeep Learning for Recommender Systems with Nick pentreath
Deep Learning for Recommender Systems with Nick pentreathDatabricks
 
Citizen Data Science Training using KNIME
Citizen Data Science Training using KNIMECitizen Data Science Training using KNIME
Citizen Data Science Training using KNIMEAli Raza Anjum
 
Artificial Intelligence and the Data Center
Artificial Intelligence and the Data CenterArtificial Intelligence and the Data Center
Artificial Intelligence and the Data Centersflaig
 
Self Service Analytics enabled by Data Virtualization from Denodo
Self Service Analytics enabled by Data Virtualization from DenodoSelf Service Analytics enabled by Data Virtualization from Denodo
Self Service Analytics enabled by Data Virtualization from DenodoDenodo
 
An AI Maturity Roadmap for Becoming a Data-Driven Organization
An AI Maturity Roadmap for Becoming a Data-Driven OrganizationAn AI Maturity Roadmap for Becoming a Data-Driven Organization
An AI Maturity Roadmap for Becoming a Data-Driven OrganizationDavid Solomon
 
Tom Martens - Cube Ware - The big data challenge - bo
Tom Martens - Cube Ware - The big data challenge - boTom Martens - Cube Ware - The big data challenge - bo
Tom Martens - Cube Ware - The big data challenge - boSogeti Nederland B.V.
 
IBM Deep Learning Overview
IBM Deep Learning OverviewIBM Deep Learning Overview
IBM Deep Learning OverviewDavid Solomon
 
Big Data Commercialization and associated IoT Platform Implications by Ramnik...
Big Data Commercialization and associated IoT Platform Implications by Ramnik...Big Data Commercialization and associated IoT Platform Implications by Ramnik...
Big Data Commercialization and associated IoT Platform Implications by Ramnik...Data Con LA
 
Building predictive models in Azure Machine Learning
Building predictive models in Azure Machine LearningBuilding predictive models in Azure Machine Learning
Building predictive models in Azure Machine LearningMostafa
 
Dealing with uncertainty in fintech using AI
Dealing with uncertainty in fintech using AIDealing with uncertainty in fintech using AI
Dealing with uncertainty in fintech using AIData Products Meetup
 
Accelerate Cloud Modernization using Data Virtualization
Accelerate Cloud Modernization using Data VirtualizationAccelerate Cloud Modernization using Data Virtualization
Accelerate Cloud Modernization using Data VirtualizationDenodo
 
Codes and standards
Codes and standardsCodes and standards
Codes and standardssflaig
 
Practical Artificial Intelligence: Deep Learning Beyond Cats and Cars
Practical Artificial Intelligence: Deep Learning Beyond Cats and CarsPractical Artificial Intelligence: Deep Learning Beyond Cats and Cars
Practical Artificial Intelligence: Deep Learning Beyond Cats and CarsAlexey Rybakov
 

Was ist angesagt? (20)

Six Data Prep steps to Optimize Cloud Data Lakes - Big Data Expo 2019
Six Data Prep steps to Optimize Cloud Data Lakes - Big Data Expo 2019Six Data Prep steps to Optimize Cloud Data Lakes - Big Data Expo 2019
Six Data Prep steps to Optimize Cloud Data Lakes - Big Data Expo 2019
 
Role of Unified AI and ML in Cloud Technologies. Which Cloud Service Provider...
Role of Unified AI and ML in Cloud Technologies. Which Cloud Service Provider...Role of Unified AI and ML in Cloud Technologies. Which Cloud Service Provider...
Role of Unified AI and ML in Cloud Technologies. Which Cloud Service Provider...
 
Proposed Talk Outline for Pycon2017
Proposed Talk Outline for Pycon2017 Proposed Talk Outline for Pycon2017
Proposed Talk Outline for Pycon2017
 
The Python ecosystem for data science - Landscape Overview
The Python ecosystem for data science - Landscape OverviewThe Python ecosystem for data science - Landscape Overview
The Python ecosystem for data science - Landscape Overview
 
"Industrializing Machine Learning – How to Integrate ML in Existing Businesse...
"Industrializing Machine Learning – How to Integrate ML in Existing Businesse..."Industrializing Machine Learning – How to Integrate ML in Existing Businesse...
"Industrializing Machine Learning – How to Integrate ML in Existing Businesse...
 
Introduction to Azure Machine Learning
Introduction to Azure Machine LearningIntroduction to Azure Machine Learning
Introduction to Azure Machine Learning
 
Deep Learning for Recommender Systems with Nick pentreath
Deep Learning for Recommender Systems with Nick pentreathDeep Learning for Recommender Systems with Nick pentreath
Deep Learning for Recommender Systems with Nick pentreath
 
Citizen Data Science Training using KNIME
Citizen Data Science Training using KNIMECitizen Data Science Training using KNIME
Citizen Data Science Training using KNIME
 
Artificial Intelligence and the Data Center
Artificial Intelligence and the Data CenterArtificial Intelligence and the Data Center
Artificial Intelligence and the Data Center
 
Self Service Analytics enabled by Data Virtualization from Denodo
Self Service Analytics enabled by Data Virtualization from DenodoSelf Service Analytics enabled by Data Virtualization from Denodo
Self Service Analytics enabled by Data Virtualization from Denodo
 
An AI Maturity Roadmap for Becoming a Data-Driven Organization
An AI Maturity Roadmap for Becoming a Data-Driven OrganizationAn AI Maturity Roadmap for Becoming a Data-Driven Organization
An AI Maturity Roadmap for Becoming a Data-Driven Organization
 
Tom Martens - Cube Ware - The big data challenge - bo
Tom Martens - Cube Ware - The big data challenge - boTom Martens - Cube Ware - The big data challenge - bo
Tom Martens - Cube Ware - The big data challenge - bo
 
IBM Deep Learning Overview
IBM Deep Learning OverviewIBM Deep Learning Overview
IBM Deep Learning Overview
 
Msst 2019 v4
Msst 2019 v4Msst 2019 v4
Msst 2019 v4
 
Big Data Commercialization and associated IoT Platform Implications by Ramnik...
Big Data Commercialization and associated IoT Platform Implications by Ramnik...Big Data Commercialization and associated IoT Platform Implications by Ramnik...
Big Data Commercialization and associated IoT Platform Implications by Ramnik...
 
Building predictive models in Azure Machine Learning
Building predictive models in Azure Machine LearningBuilding predictive models in Azure Machine Learning
Building predictive models in Azure Machine Learning
 
Dealing with uncertainty in fintech using AI
Dealing with uncertainty in fintech using AIDealing with uncertainty in fintech using AI
Dealing with uncertainty in fintech using AI
 
Accelerate Cloud Modernization using Data Virtualization
Accelerate Cloud Modernization using Data VirtualizationAccelerate Cloud Modernization using Data Virtualization
Accelerate Cloud Modernization using Data Virtualization
 
Codes and standards
Codes and standardsCodes and standards
Codes and standards
 
Practical Artificial Intelligence: Deep Learning Beyond Cats and Cars
Practical Artificial Intelligence: Deep Learning Beyond Cats and CarsPractical Artificial Intelligence: Deep Learning Beyond Cats and Cars
Practical Artificial Intelligence: Deep Learning Beyond Cats and Cars
 

Ähnlich wie InfoEducatie - Face Recognition Architecture

Introduction to Machine learning
Introduction to Machine learningIntroduction to Machine learning
Introduction to Machine learningNEEVEE Technologies
 
The Art of Intelligence – A Practical Introduction Machine Learning for Orac...
The Art of Intelligence – A Practical Introduction Machine Learning for Orac...The Art of Intelligence – A Practical Introduction Machine Learning for Orac...
The Art of Intelligence – A Practical Introduction Machine Learning for Orac...Lucas Jellema
 
The Diabolical Developers Guide to Performance Tuning
The Diabolical Developers Guide to Performance TuningThe Diabolical Developers Guide to Performance Tuning
The Diabolical Developers Guide to Performance TuningjClarity
 
Using SparkML to Power a DSaaS (Data Science as a Service): Spark Summit East...
Using SparkML to Power a DSaaS (Data Science as a Service): Spark Summit East...Using SparkML to Power a DSaaS (Data Science as a Service): Spark Summit East...
Using SparkML to Power a DSaaS (Data Science as a Service): Spark Summit East...Spark Summit
 
Computer Vision for Beginners
Computer Vision for BeginnersComputer Vision for Beginners
Computer Vision for BeginnersSanghamitra Deb
 
Traditional Machine Learning and Deep Learning on OpenPOWER/POWER systems
Traditional Machine Learning and Deep Learning on OpenPOWER/POWER systemsTraditional Machine Learning and Deep Learning on OpenPOWER/POWER systems
Traditional Machine Learning and Deep Learning on OpenPOWER/POWER systemsGanesan Narayanasamy
 
The Data Science Process - Do we need it and how to apply?
The Data Science Process - Do we need it and how to apply?The Data Science Process - Do we need it and how to apply?
The Data Science Process - Do we need it and how to apply?Ivo Andreev
 
Alex mang patterns for scalability in microsoft azure application
Alex mang   patterns for scalability in microsoft azure applicationAlex mang   patterns for scalability in microsoft azure application
Alex mang patterns for scalability in microsoft azure applicationCodecamp Romania
 
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Lucidworks
 
Fields in computer science
Fields in computer scienceFields in computer science
Fields in computer scienceUC San Diego
 
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine LearningLucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine LearningJoaquin Delgado PhD.
 
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine LearningLucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine LearningS. Diana Hu
 
Introduction to Mahout and Machine Learning
Introduction to Mahout and Machine LearningIntroduction to Mahout and Machine Learning
Introduction to Mahout and Machine LearningVarad Meru
 
Machine Learning for Everyone
Machine Learning for EveryoneMachine Learning for Everyone
Machine Learning for EveryoneAly Abdelkareem
 
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...Joaquin Delgado PhD.
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...S. Diana Hu
 
Machine Learning With ML.NET
Machine Learning With ML.NETMachine Learning With ML.NET
Machine Learning With ML.NETDev Raj Gautam
 
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018Apache MXNet
 
The Analytics Frontier of the Hadoop Eco-System
The Analytics Frontier of the Hadoop Eco-SystemThe Analytics Frontier of the Hadoop Eco-System
The Analytics Frontier of the Hadoop Eco-Systeminside-BigData.com
 
Machine Learning with ML.NET and Azure - Andy Cross
Machine Learning with ML.NET and Azure - Andy CrossMachine Learning with ML.NET and Azure - Andy Cross
Machine Learning with ML.NET and Azure - Andy CrossAndrew Flatters
 

Ähnlich wie InfoEducatie - Face Recognition Architecture (20)

Introduction to Machine learning
Introduction to Machine learningIntroduction to Machine learning
Introduction to Machine learning
 
The Art of Intelligence – A Practical Introduction Machine Learning for Orac...
The Art of Intelligence – A Practical Introduction Machine Learning for Orac...The Art of Intelligence – A Practical Introduction Machine Learning for Orac...
The Art of Intelligence – A Practical Introduction Machine Learning for Orac...
 
The Diabolical Developers Guide to Performance Tuning
The Diabolical Developers Guide to Performance TuningThe Diabolical Developers Guide to Performance Tuning
The Diabolical Developers Guide to Performance Tuning
 
Using SparkML to Power a DSaaS (Data Science as a Service): Spark Summit East...
Using SparkML to Power a DSaaS (Data Science as a Service): Spark Summit East...Using SparkML to Power a DSaaS (Data Science as a Service): Spark Summit East...
Using SparkML to Power a DSaaS (Data Science as a Service): Spark Summit East...
 
Computer Vision for Beginners
Computer Vision for BeginnersComputer Vision for Beginners
Computer Vision for Beginners
 
Traditional Machine Learning and Deep Learning on OpenPOWER/POWER systems
Traditional Machine Learning and Deep Learning on OpenPOWER/POWER systemsTraditional Machine Learning and Deep Learning on OpenPOWER/POWER systems
Traditional Machine Learning and Deep Learning on OpenPOWER/POWER systems
 
The Data Science Process - Do we need it and how to apply?
The Data Science Process - Do we need it and how to apply?The Data Science Process - Do we need it and how to apply?
The Data Science Process - Do we need it and how to apply?
 
Alex mang patterns for scalability in microsoft azure application
Alex mang   patterns for scalability in microsoft azure applicationAlex mang   patterns for scalability in microsoft azure application
Alex mang patterns for scalability in microsoft azure application
 
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
 
Fields in computer science
Fields in computer scienceFields in computer science
Fields in computer science
 
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine LearningLucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning
 
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine LearningLucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning
 
Introduction to Mahout and Machine Learning
Introduction to Mahout and Machine LearningIntroduction to Mahout and Machine Learning
Introduction to Mahout and Machine Learning
 
Machine Learning for Everyone
Machine Learning for EveryoneMachine Learning for Everyone
Machine Learning for Everyone
 
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 
Machine Learning With ML.NET
Machine Learning With ML.NETMachine Learning With ML.NET
Machine Learning With ML.NET
 
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
 
The Analytics Frontier of the Hadoop Eco-System
The Analytics Frontier of the Hadoop Eco-SystemThe Analytics Frontier of the Hadoop Eco-System
The Analytics Frontier of the Hadoop Eco-System
 
Machine Learning with ML.NET and Azure - Andy Cross
Machine Learning with ML.NET and Azure - Andy CrossMachine Learning with ML.NET and Azure - Andy Cross
Machine Learning with ML.NET and Azure - Andy Cross
 

Mehr von Bogdan Bocse

Whatever your question is, math already has a map to the answer
Whatever your question is, math already has a map to the answerWhatever your question is, math already has a map to the answer
Whatever your question is, math already has a map to the answerBogdan Bocse
 
The Intelligence Wars -Neopolitics of so-called ”A.I.” in the Digital Post-tr...
The Intelligence Wars -Neopolitics of so-called ”A.I.” in the Digital Post-tr...The Intelligence Wars -Neopolitics of so-called ”A.I.” in the Digital Post-tr...
The Intelligence Wars -Neopolitics of so-called ”A.I.” in the Digital Post-tr...Bogdan Bocse
 
The deconstruction of the Chinese Room
The deconstruction of the Chinese Room The deconstruction of the Chinese Room
The deconstruction of the Chinese Room Bogdan Bocse
 
#SafeNet - COVID-19 Contact Tracing
#SafeNet - COVID-19 Contact Tracing#SafeNet - COVID-19 Contact Tracing
#SafeNet - COVID-19 Contact TracingBogdan Bocse
 
The Commoditization of Intelligence
The Commoditization of IntelligenceThe Commoditization of Intelligence
The Commoditization of IntelligenceBogdan Bocse
 
Computer Vision - The New Renaissance or 1983?
Computer Vision - The New Renaissance or 1983?Computer Vision - The New Renaissance or 1983?
Computer Vision - The New Renaissance or 1983?Bogdan Bocse
 
Scaling Face Recognition with Big Data
Scaling Face Recognition with Big DataScaling Face Recognition with Big Data
Scaling Face Recognition with Big DataBogdan Bocse
 
The VisageCloud Domain Model
The VisageCloud Domain ModelThe VisageCloud Domain Model
The VisageCloud Domain ModelBogdan Bocse
 
Training and Face Recognition in 5 Easy Steps with VisageCloud
Training and Face Recognition in 5 Easy Steps with VisageCloudTraining and Face Recognition in 5 Easy Steps with VisageCloud
Training and Face Recognition in 5 Easy Steps with VisageCloudBogdan Bocse
 
VisageCloud - Face Recognition meets Big Data.
VisageCloud - Face Recognition meets Big Data.VisageCloud - Face Recognition meets Big Data.
VisageCloud - Face Recognition meets Big Data.Bogdan Bocse
 
Agile Business Analysis - Certificate
Agile Business Analysis - CertificateAgile Business Analysis - Certificate
Agile Business Analysis - CertificateBogdan Bocse
 
Axway - comunicat de presa - Hackathon
Axway  - comunicat de presa - HackathonAxway  - comunicat de presa - Hackathon
Axway - comunicat de presa - HackathonBogdan Bocse
 
ScentSee - Consigliere virtuale per la scoperta fragranza e la raccomandazione
ScentSee - Consigliere virtuale per la scoperta fragranza e la raccomandazioneScentSee - Consigliere virtuale per la scoperta fragranza e la raccomandazione
ScentSee - Consigliere virtuale per la scoperta fragranza e la raccomandazioneBogdan Bocse
 
Certification - Agile Business Analysis
Certification - Agile Business AnalysisCertification - Agile Business Analysis
Certification - Agile Business AnalysisBogdan Bocse
 
ScentSee - Consilier virtual pentru descoperire și recomandare de parfum
ScentSee - Consilier virtual pentru descoperire și recomandare de parfumScentSee - Consilier virtual pentru descoperire și recomandare de parfum
ScentSee - Consilier virtual pentru descoperire și recomandare de parfumBogdan Bocse
 
The Rise of Digital Audio (AdsWizz, DevTalks Bucharest, 2015)
The Rise of Digital Audio (AdsWizz, DevTalks Bucharest, 2015)The Rise of Digital Audio (AdsWizz, DevTalks Bucharest, 2015)
The Rise of Digital Audio (AdsWizz, DevTalks Bucharest, 2015)Bogdan Bocse
 
What is Solution Architecture?
What is Solution Architecture?What is Solution Architecture?
What is Solution Architecture?Bogdan Bocse
 
Certificate for Architect Enterprise Applications with Java EE
Certificate for Architect Enterprise Applications with Java EECertificate for Architect Enterprise Applications with Java EE
Certificate for Architect Enterprise Applications with Java EEBogdan Bocse
 
TimeOP: Automated System for PC Activity Tracking and User Productivity Analysis
TimeOP: Automated System for PC Activity Tracking and User Productivity AnalysisTimeOP: Automated System for PC Activity Tracking and User Productivity Analysis
TimeOP: Automated System for PC Activity Tracking and User Productivity AnalysisBogdan Bocse
 
Performanta si Inovatie
Performanta si InovatiePerformanta si Inovatie
Performanta si InovatieBogdan Bocse
 

Mehr von Bogdan Bocse (20)

Whatever your question is, math already has a map to the answer
Whatever your question is, math already has a map to the answerWhatever your question is, math already has a map to the answer
Whatever your question is, math already has a map to the answer
 
The Intelligence Wars -Neopolitics of so-called ”A.I.” in the Digital Post-tr...
The Intelligence Wars -Neopolitics of so-called ”A.I.” in the Digital Post-tr...The Intelligence Wars -Neopolitics of so-called ”A.I.” in the Digital Post-tr...
The Intelligence Wars -Neopolitics of so-called ”A.I.” in the Digital Post-tr...
 
The deconstruction of the Chinese Room
The deconstruction of the Chinese Room The deconstruction of the Chinese Room
The deconstruction of the Chinese Room
 
#SafeNet - COVID-19 Contact Tracing
#SafeNet - COVID-19 Contact Tracing#SafeNet - COVID-19 Contact Tracing
#SafeNet - COVID-19 Contact Tracing
 
The Commoditization of Intelligence
The Commoditization of IntelligenceThe Commoditization of Intelligence
The Commoditization of Intelligence
 
Computer Vision - The New Renaissance or 1983?
Computer Vision - The New Renaissance or 1983?Computer Vision - The New Renaissance or 1983?
Computer Vision - The New Renaissance or 1983?
 
Scaling Face Recognition with Big Data
Scaling Face Recognition with Big DataScaling Face Recognition with Big Data
Scaling Face Recognition with Big Data
 
The VisageCloud Domain Model
The VisageCloud Domain ModelThe VisageCloud Domain Model
The VisageCloud Domain Model
 
Training and Face Recognition in 5 Easy Steps with VisageCloud
InfoEducatie - Face Recognition Architecture

  • 1. Scaling Face Recognition with Big Data Bogdan BOCȘE Solutions Architect & Co-founder VisageCloud
  • 2. • How to learn? • What to learn? • Defining learning objectives • How to scale learning? • Gotchas • VisageCloud –Architecture –Use Cases Agenda
  • 3. • What questions to ask before writing the code? • How to look at the data before feeding it to the machine? • What is the state of the art regarding ML? • What frameworks to use? • What are the common traps to avoid? • How to design for scale? Objectives
  • 5. Vision • Convolutional Neural Networks • Inception Paper NLP • Word2Vec • GloVe: Global Vectors for Word Representation Generic • Classification • Prediction How to Learn?
  • 7. • Pooling / Max Pooling • Convolution • Fully Connected Activation – Activation Function, e.g. ReLU Convolutional Neural Networks: Components
  • 8. • Learning is an optimization problem –Find parameters of a system (neural network) that minimize a fixed error function –Not unlike planning orbital paths • Defining the network architecture • Defining the training algorithm –Stochastic Gradient Descent • With momentum • With noise Taking a Step Back: The Math
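To make "learning is an optimization problem" concrete, here is a minimal sketch of stochastic gradient descent with momentum fitting a line to toy data. The function name `sgd_momentum`, the learning rate, the momentum factor, and the data are all illustrative assumptions, not values from the talk.

```python
import random

def sgd_momentum(data, lr=0.01, beta=0.9, epochs=200, seed=0):
    """Fit y = w*x + b by minimizing squared error with SGD plus momentum."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0      # parameters to learn
    vw, vb = 0.0, 0.0    # velocities: decaying sums of past gradient steps
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)  # "stochastic": visit samples in random order
        for x, y in samples:
            err = (w * x + b) - y
            grad_w, grad_b = 2 * err * x, 2 * err  # gradient of err**2
            vw = beta * vw - lr * grad_w
            vb = beta * vb - lr * grad_b
            w += vw
            b += vb
    return w, b

# Toy data generated from y = 2x + 1 (made up for illustration).
data = [(i / 10, 2 * (i / 10) + 1) for i in range(-10, 11)]
w, b = sgd_momentum(data)
print(round(w, 3), round(b, 3))
```

The error function here is convex, so there is a single optimum; in a neural network the same update rule is applied to a far larger parameter vector and a non-convex error surface.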
  • 9. • DeepLearning4j – Independent company – Java interface with C-bindings for performance • TensorFlow – Python & C++ API – Developed by Google – Compatible with TPU • Torch – Developed by Facebook – Written in LuaJIT, with Python bindings Frameworks
  • 11. • Public data sets –Labeled Faces in the Wild (LFW) –YouTube Faces –Kaggle • Private data sets • Build your own –Outsourcing: Mechanical Turk –Crowdsourcing: ReCaptcha model Data Sets
  • 13. • Machine learning is not magic • If you can’t understand the data, a machine probably won’t either • Preprocessing makes the difference between results • Applying filters, normalization, anomaly detection is computationally inexpensive Preparing Data
  • 15. • Supervised –Classification –Scoring and regression –Identification • Unsupervised –Clustering Defining learning objectives
  • 16. • Projecting input onto a fixed set of classes • “Don’t use a cannon to kill a fly” –Support Vector Machines • Linear • Radial Basis Functions Classification
  • 17. • Embedding –Projecting input (image) onto a vector space with a known property • Triplet Loss Function Identification
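A triplet loss compares an anchor embedding with a positive (same person) and a negative (different person), and penalizes the case where the negative is not at least a margin farther away than the positive. The sketch below uses squared Euclidean distance; the helper names, the 2-D embeddings, and the margin of 0.2 are assumptions for illustration only.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embeddings of equal length."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero once the negative is at least `margin` farther than the positive."""
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

anchor = [0.0, 0.0]
positive = [0.1, 0.0]        # same identity, close to the anchor
easy_negative = [1.0, 0.0]   # different identity, already far away
hard_negative = [0.2, 0.0]   # different identity, still too close

print(triplet_loss(anchor, positive, easy_negative))
print(triplet_loss(anchor, positive, hard_negative))
```

Only the hard negative contributes a non-zero loss, which is why the choice of triplets matters during training.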
  • 18. • Splitting a set of items into non-overlapping subsets, based on item attributes • Counting people in video streams • Algorithms: –Fixed threshold –K-means –Rank-order clustering Clustering
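The fixed-threshold approach can be sketched as a greedy pass: each embedding joins the first cluster whose representative is within the threshold, otherwise it starts a new cluster. The function name, the 2-D points, and the threshold value are hypothetical; real face embeddings have many more dimensions.

```python
import math

def threshold_clusters(embeddings, threshold):
    """Greedy clustering: the first member of each cluster is its representative."""
    clusters = []
    for e in embeddings:
        for cluster in clusters:
            if math.dist(e, cluster[0]) <= threshold:
                cluster.append(e)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters.append([e])
    return clusters

# Made-up embeddings of faces detected across video frames.
faces = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 4.9), (0.05, 0.05)]
clusters = threshold_clusters(faces, threshold=1.0)
print(len(clusters))  # number of distinct people under this threshold
```

Counting people then reduces to counting clusters. The result depends on both the threshold and the order of the input, which is one reason to consider k-means or rank-order clustering instead.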
  • 19. HOW TO SCALE LEARNING?
  • 20. • Scaling training – Requires shared memory space – Vertical scaling • GPU • Soon-to-come: TPU (tensor processing unit) • Scaling evaluation – Shared nothing architecture – Neural network/classifier rarely change – Load balancing pattern – Partitioning data if needed How to scale learning?
  • 21. • There is no “reduce” for neural networks • Averaging weights/parameters – Usually not a good idea • Genetic algorithms – Requires a lot of processing power – Running independent iterations on different machines – Crossover between weights/parameters of independently trained neural networks after each epoch Ideas for horizontal scaling
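As one possible reading of "crossover between weights", the sketch below applies uniform crossover to two flattened weight vectors: each child weight is copied from one parent or the other at random. The function name and the weight values are assumptions; the slide does not specify which crossover operator was meant.

```python
import random

def uniform_crossover(parent_a, parent_b, seed=None):
    """Build a child by picking each weight from one of the two parents."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same number of weights")
    rng = random.Random(seed)
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]

# Flattened weights of two independently trained networks (made-up values).
weights_a = [0.1, 0.2, 0.3, 0.4]
weights_b = [1.1, 1.2, 1.3, 1.4]
child = uniform_crossover(weights_a, weights_b, seed=42)
print(child)
```

Unlike averaging, every child weight is a value that one of the parents actually learned, but nothing guarantees that the combination performs well; each child still has to be evaluated.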
  • 23. • Our 2D and 3D intuition often fails in high dimensions • Distances tend to become relatively “the same” as number of dimensions increases • Dimensionality reduction – Embedding functions – Principal component analysis The Curse of Dimensionality
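The claim that distances become relatively "the same" can be checked with a small experiment: draw random points in the unit cube and compare the spread of distances from a query point to the nearest distance. The function name, sample size, and dimensions are illustrative assumptions.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from one random query point."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    return (max(dists) - min(dists)) / min(dists)

for dim in (2, 20, 500):
    print(dim, round(relative_contrast(dim), 2))
```

The contrast shrinks as the dimension grows, so "nearest" and "farthest" neighbors become hard to tell apart. This is the motivation for reducing dimensionality before comparing items.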
  • 24. • “The bottom of a valley is not necessarily the lowest point on Earth” • Learning algorithms may get stuck in local optima • Using momentum or some random noise reduces this possibility • Using genetic algorithms can be even more robust, but it’s computationally expensive Local Optima
  • 26. Evaluation of Learning: “Based on state-of-the-art machine learning, our weather forecast system can predict tomorrow’s weather with 72% accuracy.” You get the same results by saying “it’s going to be the same as today.”
  • 27. • Don’t test on the data you train on – Use different data set – Split the data sets you have • Beware of data biases – Confirmation bias – Survivorship bias – Selection bias • Compare against a benchmark, even a dummy one – Coin flip – Linear algorithms – “Same-as-before” Evaluation of Learning
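A "same-as-before" benchmark takes only a few lines, so there is little excuse for skipping it. The sketch below scores that dummy predictor on a made-up weather sequence; the function names and the data are hypothetical.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the observed values."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)

def same_as_before(history):
    """Dummy benchmark: predict each day to be the same as the previous day."""
    return history[:-1]

# Made-up observations, one per day.
weather = ["sun", "sun", "rain", "rain", "sun", "sun", "sun", "rain"]
actual = weather[1:]                  # days that have a previous day
baseline = same_as_before(weather)    # predictions for those days

baseline_accuracy = accuracy(baseline, actual)
print(round(baseline_accuracy, 2))
```

A model's accuracy only becomes meaningful once it is compared with this number on the same held-out days.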
  • 29. High Level Architecture, VisageCloud Production (diagram). Components: API Consumer (Customer Infrastructure), HAProxy (reverse proxy), Service (API Controller), Neural Networks (OpenCV, Dlib, Torch, pixie magic), Containers (Docker), Cassandra, Image Storage (AWS S3). Connection labels: HTTPS, HTTP, HTTP API, CQL Binary.
  • 31. • The collection –Slice of data used together –10K-100K records • The Cache-Inside Pattern –Loading / preloading collection in one application server –Content based routing/balancing to maximize cache hits –No logic in the database layer –Requires periodic polling for updates • Weaker consistency Partitioning Data: Application Level Logic
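Content-based routing can be sketched as a stable hash of the collection identifier, so that requests for the same collection always reach the application server that has it cached. The server names, the collection id, and the hash-modulo scheme are assumptions for illustration, not a description of VisageCloud's actual balancer.

```python
import hashlib

def route(collection_id, servers):
    """Map a collection id to one server, identically on every call."""
    digest = hashlib.sha256(collection_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

# Hypothetical application servers, each caching the collections routed to it.
servers = ["app-1", "app-2", "app-3"]
target = route("customer-42-faces", servers)
print(target)
```

A plain modulo reshuffles most collections when a server is added or removed; consistent hashing is a common alternative when the server pool changes often.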
  • 32. Partitioning Data: Application Level Logic (diagram). Application Layer: three Application instances, with content-based balancing/routing. Cassandra (Database Layer): four Cassandra nodes. Arrow labels: preload collection, poll for updates, write updates.
  • 33. • Perform comparison logic in database –User Defined Aggregate Functions • Removes the need to move data around between application and database • Harder to deploy/test • Stronger consistency Partitioning Data: Database Level Logic
  • 34. • It’s math, not magic • If you don’t understand the data, neither will the machine • Preprocessing makes the difference • Test against a benchmark, any benchmark • Evaluate first, scale later Key Take-away
  • 35. Bogdan@VisageCloud.com +(40) 724 714 234 https://www.linkedin.com/in/bogdanbocse/ https://twitter.com/bocse Let’s keep in touch