This thesis proposes a real-time automatic crowd density estimation method that addresses the non-linearity problem, works across different densities and scales, and reduces the prediction error. To cover most properties of a crowded scene, a new combination of features is proposed that includes segmented region properties, texture, edge, and SIFT keypoints. Edge strength is suggested for use.
M.Sc. Thesis - Automatic People Counting in Crowded Scenes
1. Automatic People Counting in Crowded Scenes
Menoufia University
Faculty of Computers and Information
Information Technology Department
By
Ahmed F. Gad
ahmed.fawzy@ci.menofia.edu.eg
Supervised By
Prof. Khalid M. Amin
Dr. Ahmed M. Hamad
15 August 2018
2. Index
• Introduction
• Problem Definition
• Related Work
• Proposed Method & Experimental Results
• Decrease Prediction Error
• Testing using Overcrowded Scene
• Feature Reduction
• Decrease Computation Complexity
• Conclusion
• Publications
• References
3. Motivation
▰ Crowded scenes are difficult to analyze manually.
Krausz, Barbara, and Christian Bauckhage. "Loveparade 2010: Automatic video analysis of a crowd disaster." Computer Vision and Image Understanding 116.3 (2012): 307-319.
9. Crowd Counting Approaches
Crowd Density Estimation
▰ Removes the requirement to detect and track objects.
▰ Counting is based on groups, not individuals.
[Figure: scene features (X) mapped to count (Y)]
Loy, Chen Change, et al. "Crowd counting and profiling: Methodology and evaluation." Modeling, Simulation and Visual Analysis of Crowds. Springer, New York, NY, 2013. 347-382.
17. Problem Definition
▰ Predicting the people count in a scene is not straightforward.
[Diagram labels: Count, Feature Mining, Texture, Edge, SR Properties, Crowd Levels, Same Size, Scale, Non-Linearity, Variations, Capacity]
20. Related Work
Pixel Count
Region Pixel Count
▰ Pixel count is not a good feature to use in complex environments.
Ma, Ruihua, et al. "On pixel count based crowd density estimation for visual surveillance." IEEE Conference on Cybernetics and Intelligent Systems. Vol. 1. 2004.
21. Related Work
Texture & Edge Features
Segmented Region - Pixel Count
Texture - GLCM
Edge - HOG
Chan, Antoni B., Zhang-Sheng John Liang, and Nuno Vasconcelos. "Privacy preserving crowd monitoring: Counting people without people models or tracking." IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2008.
24. Related Work
Perspective Distortion
[Figure: perspective distortion; labels P and X]
Error: 12.997%
Chan, Antoni B., Zhang-Sheng John Liang, and Nuno Vasconcelos. "Privacy preserving crowd monitoring: Counting people without people models or tracking." IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2008.
26. Related Work
Pixel Count
Texture - GLCM
Edge - HOG
Error: 17.96%
Chen, Ke, et al. "Feature mining for localised crowd counting." BMVC. Vol. 1. No. 2. 2012.
28. Related Work
KeyPoints
KeyPoint - SIFT, FAST
Error: 14.11%
Al-Zaydi, Zeyad QH, et al. "A robust multimedia surveillance system for people counting." Multimedia Tools and Applications 76.22 (2017): 23777-23804.
33. Regression Modelling
▰ A regression model maps the independent variables (features) to a dependent variable (people count).
Features (independent) → Regression Model → Count (dependent)
Regression models: GPR, RF, RPF, LASSO, KNN
Ryan, David, et al. "An evaluation of crowd counting methods, features and regression models." Computer Vision and Image Understanding 130 (2015): 1-17.
Loy, Chen Change, et al. "Crowd counting and profiling: Methodology and evaluation." Modeling, Simulation and Visual Analysis of Crowds. Springer, New York, NY, 2013. 347-382.
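The mapping from features to count can be illustrated with the simplest of the listed models, KNN. The following is a minimal sketch only, not the thesis implementation: the function name `knn_predict`, the three-value feature vectors, and the toy training data are all hypothetical, and a real system would normalize features first.

```python
import math


def knn_predict(train_features, train_counts, query, k=3):
    """Predict a people count as the mean count of the k nearest training samples."""
    if not train_features or len(train_features) != len(train_counts):
        raise ValueError("training features and counts must be non-empty and aligned")
    # Euclidean distance from the query to every training feature vector.
    distances = [
        (math.dist(features, query), count)
        for features, count in zip(train_features, train_counts)
    ]
    # Keep the k closest samples and average their counts.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    return sum(count for _, count in nearest) / len(nearest)


# Hypothetical toy data: [pixel_count, edge_pixels, keypoints] per frame.
train_features = [[100, 40, 12], [210, 85, 25], [320, 120, 37], [405, 160, 50]]
train_counts = [5, 10, 15, 20]

prediction = knn_predict(train_features, train_counts, [215, 80, 24], k=2)
print(prediction)
```

The features are the independent variables and the returned count is the dependent variable; swapping in another regressor changes only the body of the prediction function.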
41. 2nd Experiment Results
Covering All Variations using Cross Validation
Problem: Random sample selection covered only 33 levels.
Solution: Cross validation selects samples from all levels.
Ground Truth: 31
Before CV: 22.49
After CV: 30.99
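The idea of selecting samples from all levels can be sketched as a stratified split. This is an illustration under assumptions, not the procedure used in the thesis: it shows a single split rather than full cross validation, it treats each distinct ground-truth count as one level, and the name `stratified_split` and the toy counts are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(counts, train_fraction=0.7, seed=0):
    """Split sample indices so that every count level contributes to training."""
    by_level = defaultdict(list)
    for index, count in enumerate(counts):
        by_level[count].append(index)

    rng = random.Random(seed)
    train, test = [], []
    for level in sorted(by_level):
        indices = by_level[level]
        rng.shuffle(indices)
        # Always keep at least one sample of each level for training.
        n_train = max(1, round(train_fraction * len(indices)))
        train.extend(indices[:n_train])
        test.extend(indices[n_train:])
    return train, test


# Hypothetical ground-truth counts for ten frames (four distinct levels).
counts = [11, 11, 11, 12, 12, 12, 12, 31, 31, 46]
train, test = stratified_split(counts)
covered = {counts[i] for i in train}
print(sorted(covered))
```

With purely random selection a rare level such as 46 can be missing from training entirely, which is the problem the slide describes.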
44. Applying Proposed Method with Overcrowded Dataset
UCF Crowd Dataset - VERY CHALLENGING
50 Images: 40 Training, 10 Testing
MAE: 338.41
Error Percent: 26.45%
Idrees, Haroon, et al. "Multi-source multi-scale counting in extremely dense crowd images." IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2013.
49. Applying Proposed Method with Overcrowded Dataset
Marathon Crowd Dataset
492 Images: 350 Training, 142 Testing
MAE: 13.88
Error Percent: 3.79%
Ali, Saad, and Mubarak Shah. "Floor fields for tracking in high density crowd scenes." European Conference on Computer Vision. Springer, Berlin, Heidelberg, 2008.
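The slides report MAE and an error percent without defining them. The sketch below shows one plausible way to compute both. MAE is the standard mean absolute error; the error percent is assumed here to be the MAE relative to the mean ground-truth count, which the slides do not confirm. The function names and the toy counts are hypothetical.

```python
def mean_absolute_error(truth, predicted):
    """Average absolute difference between ground-truth and predicted counts."""
    if not truth or len(truth) != len(predicted):
        raise ValueError("truth and predicted must be non-empty and aligned")
    return sum(abs(t - p) for t, p in zip(truth, predicted)) / len(truth)


def error_percent(truth, predicted):
    """Assumed definition: MAE as a percentage of the mean ground-truth count."""
    mae = mean_absolute_error(truth, predicted)
    mean_truth = sum(truth) / len(truth)
    return 100.0 * mae / mean_truth


# Hypothetical ground-truth and predicted counts for four test frames.
truth = [30, 31, 29, 40]
predicted = [28, 31, 32, 37]

mae = mean_absolute_error(truth, predicted)
pct = error_percent(truth, predicted)
print(mae, round(pct, 2))
```

A relative measure makes results comparable across datasets whose crowd sizes differ by orders of magnitude, which is why a large MAE on a very dense dataset can still correspond to a moderate percentage.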
63. Feature Tracking
Computational Time
▰ 85.12% of the feature extraction time is saved (i.e. the feature extractor (FE) does not have to be called for 85.12% of the total regions).
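The saving comes from skipping extractor calls for regions whose features can be reused. The slides do not show how the method decides when reuse is safe, so the sketch below assumes a simple per-region change test supplied by the caller. All names (`extract_with_tracking`, `has_changed`, the region ids) are hypothetical and the data is a toy example.

```python
def extract_with_tracking(frames, extract_features, has_changed):
    """Call the extractor only for new regions or regions that changed.

    frames: list of dicts mapping a region id to that region's data.
    Returns the features per frame and the fraction of extractor calls saved.
    """
    cache = {}  # region id -> (region data at last extraction, features)
    calls = 0
    total = 0
    per_frame = []
    for frame in frames:
        features_now = {}
        for region_id, region in frame.items():
            total += 1
            cached = cache.get(region_id)
            if cached is None or has_changed(cached[0], region):
                # Assumed policy: re-extract only when the change test fires.
                cache[region_id] = (region, extract_features(region))
                calls += 1
            features_now[region_id] = cache[region_id][1]
        per_frame.append(features_now)
    saved = 1 - calls / total if total else 0.0
    return per_frame, saved


# Toy example: each region is summarized by a single pixel count.
frames = [
    {"a": 100, "b": 50},
    {"a": 102, "b": 80},
    {"a": 101, "b": 82},
    {"a": 99, "b": 81},
]
per_frame, saved = extract_with_tracking(
    frames,
    extract_features=lambda region: [region],
    has_changed=lambda old, new: abs(new - old) > 5,
)
print(saved)
```

Here 3 extractor calls serve 8 region visits, so the saved fraction is computed the same way as the percentage on the slide: regions served from the cache divided by total regions.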
70. Conclusion
▰ This work proposed a technique for crowd density estimation based on multiple features.
▰ Lower Prediction Error Compared to Previous Works using All Features.
▰ Enhanced Results using Cross Validation.
▰ Accuracy Demonstrated using Different Datasets.
▰ Increased Model Capacity after Feature Reduction.
▰ Reduced Computational Time using Feature Tracking.
72. Publications
▰ A. Gad, A. Hamad, K. Amin. "Crowd Density Estimation Using
Multiple Features Categories and Multiple Regression
Models." 12th IEEE International Conference on Computer
Engineering & Systems (ICCES), pp. 430-435, Dec. 2017.
▰ Estimating People Count in Crowded Scenes Using Multiple
Features Categories and Multiple Regression Models. Pattern
Analysis and Applications Journal, Springer, Under Review.
▰ Time-Efficient Crowd Density Estimation using Feature Tracking.
Prepared for submission.
74. References
▰ C. C. Loy, K. Chen, S. Gong, and T. Xiang, "Crowd counting and profiling: Methodology and evaluation," Modeling, Simulation and Visual Analysis of Crowds, Springer, pp. 347-382, 2013.
▰ W. Zhen, L. Mao, and Z. Yuan, "Analysis of trample disaster and a case study–Mihong bridge fatality in China in 2004," Safety Science, vol. 46, pp.
1255-1270, 2008.
▰ D. Helbing, A. Johansson, and H. Z. Al-Abideen, "Dynamics of crowd disasters: An empirical study," Physical review E, vol. 75, p. 046109, 2007.
▰ B. Krausz and C. Bauckhage, "Loveparade 2010: Automatic video analysis of a crowd disaster," Computer Vision and Image Understanding, vol. 116,
pp. 307-319, 2012.
▰ B. Wu and R. Nevatia, "Detection and tracking of multiple, partially occluded humans by bayesian combination of edgelet based part detectors,"
International Journal of Computer Vision, vol. 75, pp. 247-266, 2007.
▰ D. Ryan, S. Denman, S. Sridharan, and C. Fookes, "An evaluation of crowd counting methods, features and regression models," Computer Vision and
Image Understanding, vol. 130, pp. 1-17, 2015.
▰ A. B. Chan, Z.-S. J. Liang, and N. Vasconcelos, "Privacy preserving crowd monitoring: Counting people without people models or tracking," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1-7, 2008.
▰ A. B. Chan and N. Vasconcelos, "Counting people with low-level features and Bayesian regression," IEEE Transactions on Image Processing, vol. 21, pp. 2160-2177, 2012.
▰ L. Dong, V. Parameswaran, V. Ramesh, and I. Zoghlami, "Fast crowd segmentation using shape indexing," IEEE 11th International Conference on Computer Vision (ICCV), pp. 1-8, 2007.
▰ Z. Q. Al-Zaydi, D. L. Ndzi, M. L. Kamarudin, A. Zakaria, and A. Y. Shakaff, "A robust multimedia surveillance system for people counting," Multimedia
Tools and Applications, pp. 1-28, 2016.
▰ R. Liang, Y. Zhu, and H. Wang, "Counting crowd flow based on feature points," Neurocomputing, vol. 133, pp. 377-384, 2014.
▰ D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International journal of computer vision, vol. 60, pp. 91-110, 2004.
▰ K. Chen, C. C. Loy, S. Gong, and T. Xiang, "Feature Mining for Localised Crowd Counting," BMVC, p. 3, 2012.
▰ B. Xu and G. Qiu, "Crowd density estimation based on rich features and random projection forest," IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1-8, 2016.
▰ D. Kong, D. Gray, and H. Tao, "A viewpoint invariant approach for crowd counting," 18th International Conference on Pattern Recognition (ICPR), pp. 1187-1190, 2006.
▰ X. Zeng and T. R. Martinez, "Distributed-balanced stratified cross-validation for accuracy estimation," Journal of Experimental & Theoretical Artificial Intelligence, vol. 12, pp. 1-12, 2000.
▰ T. Ojala, M. Pietikainen, and T. Maenpaa, "Multiresolution gray-scale and rotation invariant texture classification with local binary patterns," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, pp. 971-987, 2002.
▰ S. L. Kukreja, J. Löfberg, and M. J. Brenner, "A least absolute shrinkage and selection operator (LASSO) for nonlinear system identification," IFAC
Proceedings Volumes, vol. 39, pp. 814-819, 2006.
▰ D. Kang, D. Dhar, and A. B. Chan, "Crowd Counting by Adapting Convolutional Neural Networks with Side Information," arXiv preprint
arXiv:1611.06748, 2016.
▰ C. Zhang, H. Li, X. Wang, and X. Yang, "Cross-scene crowd counting via deep convolutional neural networks," IEEE Conference on Computer Vision
and Pattern Recognition, pp. 833-841, 2015.