3. 1. Overview
• Major topics:
• Face detection
• Face alignment/tracking
• Face recognition
• Discussion:
• Database
• Methods
• Software
• Highlight:
• Recent works
• Deep-learning based works
• My works
4. 1. Overview
• Face analysis:
• Face detection
• Face alignment/tracking
• Face recognition
• Head pose estimation
• Facial expression analysis
6. 2. Face detection
• Overview:
• Even though face detection on upright images is generally
considered a solved problem, face detection under challenging
conditions remains difficult.
• Pose
• Occlusion
• Low-resolution
• Illumination
• Chronology
• Expression
• Decoration
7. 2. Face detection: database
• Overview:
• Face detection databases include images with face location
annotations.
• More “in-the-wild” databases have become available recently.
8. 2.1 Face detection: database
• Databases
Title | # images | Conditions | Availability
1. Face Detection Data Set and Benchmark (FDDB) | 5K | In-the-wild | Public
2. Annotated Facial Landmarks in the Wild (AFLW) | 25K | In-the-wild | Public
3. Annotated Faces in the Wild (AFW) | ~1K | In-the-wild | Public
* Some databases for face recognition and landmark detection can also be
used for face detection.
** Please see the last slides of the chapter for links to the databases.
9. 2.2 Face detection: methods
• Method overview:
• Viola-Jones face detector: cascade framework, effective features.
• Deformable Part Model (DPM): part based method, tree-structured
spatial constraints among parts.
• Exemplar based methods: use training faces as exemplars.
• Deep learning based methods: learn a deep CNN for face detection.
[Figure: Viola-Jones face detector, Deformable Part Model, exemplar based model, deep learning based methods]
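The "effective features" of the Viola-Jones detector are Haar-like rectangle features evaluated on an integral image. The sketch below is a minimal illustration, not the OpenCV implementation: the function names and the toy 3x4 image are assumptions made for this example. It shows how any rectangle sum costs four table lookups, and how a two-rectangle feature is a difference of two such sums.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table for a 2D list of numbers."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum of everything above and to the left, inclusive.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Haar-like feature: left-half sum minus right-half sum (width must be even)."""
    half = width // 2
    left_sum = box_sum(ii, top, left, height, half)
    right_sum = box_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# Toy image (hypothetical values): bright left half, dark right half.
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
table = integral_image(image)
total = box_sum(table, 0, 0, 3, 4)
feature = two_rect_feature(table, 0, 0, 3, 4)
```

A large absolute feature value indicates a strong vertical edge inside the window; the cascade thresholds many such features, selected by boosting in the original detector.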
10. 2.2 Face detection: methods(1)
• Deformable Part Model (DPM):
• X. Zhu and D. Ramanan. “Face detection, pose estimation and
landmark localization in the wild”, CVPR 2012
• Keys:
• Pose-dependent models.
• Template based appearance model
• Tree-structured model that captures the spatial relationships among parts.
• Discriminative training with SVM.
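To make the tree-structured model more concrete, the sketch below scores a single configuration of parts as the sum of per-part appearance scores plus a quadratic "spring" term on each tree edge. This is a simplified illustration under stated assumptions: the part names, appearance scores, and spring coefficients are hypothetical, and the full method additionally searches over all part locations (with dynamic programming on the tree) and over the pose-dependent models.

```python
def configuration_score(appearance, locations, edges, springs, bias=0.0):
    """Score one placement of parts under a tree-structured model.

    appearance: part name -> template response at the chosen location
    locations:  part name -> (x, y)
    edges:      list of (child, parent) pairs forming a tree
    springs:    (child, parent) -> (a, b, c, d) quadratic deformation weights
    """
    total = bias + sum(appearance.values())
    for child, parent in edges:
        a, b, c, d = springs[(child, parent)]
        dx = locations[child][0] - locations[parent][0]
        dy = locations[child][1] - locations[parent][1]
        # Quadratic spring: highest when the offset matches the rest position.
        total += a * dx * dx + b * dx + c * dy * dy + d * dy
    return total


# Hypothetical three-part model: two eyes attached to the nose.
appearance = {"nose": 2.0, "left_eye": 1.5, "right_eye": 1.0}
edges = [("left_eye", "nose"), ("right_eye", "nose")]
springs = {
    ("left_eye", "nose"): (-0.1, -0.6, -0.1, -0.8),   # rest offset (-3, -4)
    ("right_eye", "nose"): (-0.1, 0.6, -0.1, -0.8),   # rest offset (+3, -4)
}

at_rest = {"nose": (10, 10), "left_eye": (7, 6), "right_eye": (13, 6)}
stretched = {"nose": (10, 10), "left_eye": (5, 6), "right_eye": (13, 6)}

score_at_rest = configuration_score(appearance, at_rest, edges, springs)
score_stretched = configuration_score(appearance, stretched, edges, springs)
```

Moving the left eye away from its rest offset lowers the score, which is how the spatial constraints penalize implausible part layouts.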
11. 2.2 Face detection: method (1)
• Deep learning based method:
• H. Li, Z. Lin, X. Shen, J. Brandt, G. Hua, “A convolutional neural
network cascade for face detection”, CVPR 2015
• Keys:
• Cascade + multi-resolution inputs: reject negative samples quickly
and focus on the remaining patches in the later cascade stages.
• Classifier: shallow CNN
• CNN calibration net for box adjustment: changes the scale and location of
the currently predicted boxes.
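The early-rejection idea behind the cascade can be sketched as follows. This is a toy illustration, not the paper's networks: each window carries hypothetical precomputed "cheap" and "fine" scores standing in for the outputs of a small and a larger CNN, and the thresholds are assumptions. The calibration step is omitted.

```python
def run_cascade(windows, stages):
    """Pass windows through stages ordered from cheap to expensive.

    stages: list of (score_fn, threshold); a window survives a stage
    only if score_fn(window) >= threshold.
    Returns the surviving windows and the number of classifier evaluations.
    """
    survivors = list(windows)
    evaluations = 0
    for score_fn, threshold in stages:
        kept = []
        for window in survivors:
            evaluations += 1
            if score_fn(window) >= threshold:
                kept.append(window)
        survivors = kept
    return survivors, evaluations


# Hypothetical candidate windows with made-up classifier scores.
windows = [
    {"id": 0, "cheap": 0.1, "fine": 0.0},
    {"id": 1, "cheap": 0.2, "fine": 0.0},
    {"id": 2, "cheap": 0.9, "fine": 0.95},
    {"id": 3, "cheap": 0.7, "fine": 0.4},
    {"id": 4, "cheap": 0.3, "fine": 0.1},
    {"id": 5, "cheap": 0.6, "fine": 0.85},
]
stages = [
    (lambda w: w["cheap"], 0.5),   # fast stage: rejects most background
    (lambda w: w["fine"], 0.8),    # slower stage: runs only on survivors
]

detections, evaluations = run_cascade(windows, stages)
detected_ids = [w["id"] for w in detections]
```

Here the expensive stage runs on only three of the six windows, which is the source of the speed-up when most windows are background.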
13. 2.2 Face detection: method (1)
• Results
• Top performance on FDDB database
• Real-time
14. 2.3 Face detection: discussion
Discussion:
Deep learning is a promising direction in this area. With more training
data, it should achieve better performance.
If we consider the face as a special class of object, we can learn
from other object detection methods.
15. 2.4 Face detection: resources
• Databases
• FDDB: http://vis-www.cs.umass.edu/fddb/
• AFLW: https://lrs.icg.tugraz.at/research/aflw/
• AFW: http://www.ics.uci.edu/~xzhu/face/
• Software:
• Viola-Jones face detector: OpenCV, MATLAB
• Face detection without bells and whistles:
http://markusmathias.bitbucket.org/2014_eccv_face_detection/
• More commercial software: refer to section 3.3
16. 2.4 Face detection: resources
• If we consider the face as a special class of object, more
resources are available:
• Database:
• Pascal VOC http://host.robots.ox.ac.uk/pascal/VOC/
• ImageNet http://www.image-net.org/
• Good methods:
• R-CNN: R. Girshick, J. Donahue, T. Darrell, J. Malik, “Rich feature
hierarchies for accurate object detection and semantic
segmentation”, arXiv
• Software: https://github.com/rbgirshick/rcnn
18. 3. Face alignment/tracking
• Overview:
• Face alignment identifies the locations of facial key points.
• Face alignment is usually a preliminary step for face recognition.
• Challenge:
• Pose
• Occlusion
• Low-resolution
• Illumination
• Chronology
• Expression
• Decoration
• There are some deep learning based methods for face alignment
20. 3.2 Face alignment/tracking: method
• Overview: holistic methods, constrained local methods (CLM), regression based methods
Method | Appearance | Shape model | Performance | Speed
Holistic | Whole face | Explicit | Poor generalization | Slow
CLM | Local patch | Explicit | Good | Slow/fast
Regression-based * | Local patch | Implicit | Very good | Fast
21. 3.2 Face alignment/tracking: method (1)
• Constrained local methods:
• Yue Wu and Qiang Ji “Discriminative deep face shape model for
facial point detection”, IJCV 2015.
• Keys:
• Discriminative deep face shape models based on Restricted Boltzmann
Machine that decomposes the shape variations into pose related and
expression related parts.
22. 3.2 Face alignment/tracking: method (2)
• Regression based method:
• X. Xiong and F. De la Torre, “Supervised Descent Method and its
applications to face alignment”, CVPR 2013.
• Keys:
• Iteratively estimate the locations of facial landmarks, by starting from a
mean face.
• Predict the difference vector between the current landmark locations
and the target ground truth locations with linear regression methods.
[Figure: landmark estimates refined iteratively from $x_0$ to $x_t$]
$\Delta x_t = x^* - x_t$
$\Delta x_t = R\,\Phi(I, x_t)$
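The iterative scheme above can be sketched as a cascade of linear regressors, each trained to predict the remaining offset from features computed at the current estimate. This is a toy illustration under loud assumptions, not the authors' implementation: there are no real images, the feature function is a synthetic nonlinear stand-in for Phi(I, x) (in a real system it would be a descriptor such as SIFT extracted around each landmark), and NumPy with four stages is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)
n_faces, n_coords = 200, 10            # 5 landmarks, (x, y) each


def features(x_star, x):
    """Toy stand-in for Phi(I, x): nonlinear in the true offset, plus a bias.

    ASSUMPTION: the hidden ground truth x_star plays the role of the image.
    """
    return np.hstack([np.tanh(x_star - x), np.ones((x.shape[0], 1))])


def rms(a):
    return float(np.sqrt((a ** 2).mean()))


x_star = rng.normal(0.0, 1.0, size=(n_faces, n_coords))   # ground-truth shapes
x = np.tile(x_star.mean(axis=0), (n_faces, 1))             # start from the mean face

errors = [rms(x_star - x)]
regressors = []
for stage in range(4):
    phi = features(x_star, x)           # features at the current estimate
    delta = x_star - x                  # target: Delta x_t = x* - x_t
    # Least-squares fit of Delta x_t ~ Phi R (row-vector convention).
    R, *_ = np.linalg.lstsq(phi, delta, rcond=None)
    regressors.append(R)
    x = x + phi @ R                     # apply the predicted update
    errors.append(rms(x_star - x))
```

On the training set the error cannot increase from one stage to the next, because a zero regressor is always an admissible least-squares solution; at test time the learned regressors would be applied in the same order without access to the ground truth.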
23. 3.2 Face alignment/tracking: method (3)
• Regression based method:
• J. Zhang, S. Shan, M. Kan, and X. Chen, “Coarse-to-fine auto-encoder
networks (CFAN) for real-time face alignment”, ECCV 2014.
• Key: Coarse-to-fine search with multi-resolution inputs.
24. 3.3 Face alignment/tracking: discussion
• Discussion:
• Regression based methods have become the most popular techniques.
• Real-time face alignment and tracking is feasible.
• Face alignment/tracking remains challenging under strong
illumination changes, occlusions, and large head poses.
26. 3.4 Face alignment/tracking: resources
• “In-the-wild” databases:
• Annotated Facial Landmarks in the Wild (AFLW):
https://lrs.icg.tugraz.at/research/aflw/
• Labeled Face Parts In the Wild (LFPW):
http://neerajkumar.org/databases/lfpw/
• Helen: http://www.ifp.illinois.edu/~vuongle2/helen/
• Annotated Faces in the Wild (AFW):
http://www.ics.uci.edu/~xzhu/face/
• Caltech Occluded Faces in the Wild (COFW):
http://www.vision.caltech.edu/xpburgos/ICCV13/
• Additional annotations: http://ibug.doc.ic.ac.uk/resources/300-W/
28. 4. Face recognition
• Overview:
• Challenges:
• Pose
• Occlusion
• Low-resolution
• Illumination
• Chronology
• Expression
• Decoration
• Cross modality (e.g. sketch to image, infrared to visible)
• Large database + deep learning techniques = close to human face recognition performance?
29. 4.1 Face recognition: database
• Overview
• Face recognition databases include facial images and identity
labels.
• Publicly available databases are relatively small, while very
large private databases exist.
Data set | # images | # subjects | Availability
1. LFW | 13K | 1.6K | Public
2. YouTube Faces | 4K videos | 1.6K | Public
3. MSRA-CFW | 202K | 1.6K | Public
4. CASIA-WebFace | 494K | 11K | Public
5. WDRef | 100K | 3K | Public (features only)
6. CACD | 163K | 2K | Public (partial annotation)
7. PubFig | 59K | 200 | Public
8. Janus (CVPR 2015) | 6K | 500 | Public
9. People In Photo Albums (PIPA) | 63K | 2K | Public
31. 4.1 Face recognition: database
• Two more new databases (public):
• Janus Benchmark:
• Major novelty: full poses
• 6K images of 500 people
• Annotations: face locations, identity, landmarks, meta-data
• Reference: B. Klare, et al. “Pushing the frontiers of unconstrained face
detection and recognition: IARPA Janus benchmark”, CVPR 2015
• Link: http://www.nist.gov/itl/iad/ig/face.cfm
32. 4.1 Face recognition: database
• People In Photo Albums (PIPA)
• Major novelty: person recognition based on other attributes. Variations
in poses, illuminations, etc.
• 63K faces of 2K people
• The DeepFace method achieves only ~50% accuracy on this dataset.
• Reference: N. Zhang, et al (Berkeley and Facebook) “Beyond frontal
faces: improving person recognition using multiple cues”, CVPR 2015.
• Link: http://www.eecs.berkeley.edu/~nzhang/piper.html
33. 4.2 Face recognition: method
• Overview:
• Deep learning based methods:
• Going deeper
• More data
[Figure: hand-crafted features + SVM classifier; deep learning based methods]
34. 4.2 Face recognition: method(1)
• Y. Taigman, M. Yang, M. Ranzato, and L. Wolf, “DeepFace:
closing the gap to human-level performance in face
verification”, CVPR 2014
• Keys:
• 3D face alignment
• Deep neural network (convolutional and locally connected layers).
• 97.35% accuracy on LFW, compared with 97.53% for humans.
35. 4.2 Face recognition: method(2)
• Y. Taigman, M. Yang, M. Ranzato, and L. Wolf, “Web-scale
training for face identification”, CVPR 2015
• Key:
• Structure: similar to DeepFace.
• Very large database: 500M images of 10M people.
• Size of “fc7” determines the generalization performance of the features.
• Random sampling performs poorly; the authors use bootstrapping to
select hard samples.
• Performance: 98% on LFW
36. 4.2 Face recognition: method(3)
• Xiaogang Wang and Xiaoou Tang's group (CUHK):
• DeepID[1]: loose face alignment, several NNs of face parts.
• DeepID2[2]: joint face verification and identification.
• DeepID2+[3]: supervision in early layers, unshared weights in the last
few layers.
• DeepID3[4]: deeper
• Reference:
• [1] Y. Sun, X. Wang and X. Tang, “Deep learning face representation from predicting 10,000 classes”,
CVPR 2014.
• [2] Y. Sun et al, “Deep learning face representation by joint identification-verification”, NIPS 2014
• [3] Y. Sun, X. Wang and X. Tang, “Deeply learned face representations are sparse, selective and
robust”, CVPR 2015.
• [4] Y. Sun et al, “DeepID3: face recognition with very deep neural network”, arXiv 2015
37. 4.2 Face recognition: method(3)
• DeepID3:
• Loose face alignment
• Models for facial parts
• Deeper model
• Supervision in early layer
• Joint verification and identification
• Ensemble models
• ~200K training images
• 99.53% on LFW
38. 4.2 Face recognition: method(4)
• F. Schroff, D. Kalenichenko, J. Philbin (Google), “FaceNet: a
unified embedding for face recognition and clustering”, CVPR
2015.
• Keys:
• Goal: learn a mapping from face images to a compact Euclidean space
where distances directly correspond to a measure of face similarity.
• Loss: triplet distance loss
• Structure: 20+ layers, loose face alignment
• Very large database: 200M images of 8M people
• Training time: 1000~2000 hours
• Performance: 99.63% on LFW
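The triplet distance loss can be written as max(0, ||a - p||^2 - ||a - n||^2 + margin), where a is an anchor embedding, p an embedding of the same person, and n an embedding of a different person. The sketch below is a minimal plain-Python illustration; the 2D embeddings and the margin value of 0.2 are assumptions chosen for the example, and the embeddings are not L2-normalized as they would be in practice.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero once the negative is farther than the positive by at least the margin."""
    return max(
        0.0,
        squared_distance(anchor, positive)
        - squared_distance(anchor, negative)
        + margin,
    )


# Hypothetical 2D embeddings.
anchor = (0.0, 0.0)
positive = (0.1, 0.0)          # same identity, close to the anchor
easy_negative = (1.0, 0.0)     # different identity, already far away
hard_negative = (0.3, 0.0)     # different identity, too close

easy_loss = triplet_loss(anchor, positive, easy_negative)
hard_loss = triplet_loss(anchor, positive, hard_negative)
```

Only the hard triplet contributes a non-zero loss, which is why selecting informative triplets matters when training on very large databases.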
39. 4.3 Face recognition: Discussion
• Discussion:
• Deep learning techniques have become the dominant approach for
face recognition.
• Training sets can contain up to millions of labeled images.
• Large training sets are generally not shared.
• Face alignment may not be a necessary step.
40. 4.4 Face recognition: resources
Software overview (mainly commercial software):
Title | d: detection, r: recognition | Availability
1. Lambda Lab | d & r | Free for limited usage
2. Animetrics Face Recognition | d & r | Sign up for API
3. SkyBiometry | d & r | Free for limited usage
4. Face++ | d & r | Sign up for SDK
5. FaceMark | d | Free
6. EYERIS | d & r | Sign up for SDK
7. ReKognition | d & r | SDK
8. Betaface | d & r | Free for limited usage
9. Eyedea Recognition | d & r | Free for limited usage
10. Kairos | r | Sign up for SDK
41. 4.4 Face recognition: resources
Publicly available databases:
LFW: http://vis-www.cs.umass.edu/lfw/
Youtube face database: http://www.cs.tau.ac.il/~wolf/ytfaces/
Data Set of Celebrity Faces on the Web (MSRA-CFW):
http://research.microsoft.com/en-us/projects/msra-cfw/
Cross-Age Celebrity Dataset (CACD):
http://bcsiriuschen.github.io/CARC/
WDRef database:
http://home.ustc.edu.cn/~chendong/JointBayesian/
CASIA-WebFace: http://www.cbsr.ia.ac.cn/english/CASIA-WebFace-Database.html
PubFig: http://www.cs.columbia.edu/CAVE/databases/pubfig/
43. Summary
• Face detection
• New deep learning based methods outperform traditional
methods.
• Facial alignment
• Regression based methods, including deep learning methods, are
the popular techniques.
• Face recognition
• Trained on very large databases, deep learning methods have become
the leading techniques in this area.
• Performance is saturated on some databases. New challenging
databases show that face recognition is still an unsolved problem.