Li charles biometrics analytics & big data 122013a for release
- 1. Dr. Charles Li
Analytics Solution Center
Charles_Li@us.ibm.com
Biometrics, Identity and Big Data Analytics
© 2013 IBM Corporation
- 2. Leveraging Information for Smarter Organizational Outcomes
Topics
Biometrics, Identity & ID Management
Views on Biometrics Technology and System
Big Data Analytics and Challenges
Identity Establishment from All Sources
Identity and Biometrics in the Cloud
Identity and Biometrics Analytics in Motion
Summary
© 2009 IBM Corporation
- 3. Biometrics, Identity and ID Management
[Diagram labels: Identity; Identity Establishment; Entitlement(s); Actions; Reputation (History); Trust (Rules); Status (Environment); Identity Management; Players]
- 4. Views on Biometrics Technology and System
What is missing?
- 5. Big Data Concept
Extract insight from a high volume, variety and velocity of data in a timely and cost-effective manner
Variety: data in many forms (structured, unstructured, text and multimedia)
Velocity: data in motion (analysis of streaming data to enable decisions within fractions of a second)
Volume: data at scale (from terabytes to zettabytes)
- 6. Analytics Concept
[Diagram: analytics over structured data and unstructured content, made consumable and accessible to everyone]
Descriptive Analytics
• Biometrics Reports: What is happening? How many, how often, where? What exactly is the problem?
• Biometrics Quality Monitoring: What actions are needed?
Predictive Analytics
• Simulation: What could happen?
• Forecasting: What if these trends continue?
• Predictive Modelling: What will happen next if?
Prescriptive Analytics
• Optimisation: How can we achieve the best outcome?
• Stochastic Optimisation: How can we achieve the best outcome and address variability?
Content Analytics: extracting insight, concepts and relationships
Visual Analytics: deep insights to improve visualization and marketing interactions
- 7. Biometrics Data at Scale – Static & Single Instance
ID Cards/Border Crossings/Benefits/Multiple
Instances
7,000,000,000x(10 Print 0.5-1MB + Face 200KB +
IRIS KB)
DHS IDENT over 150 million
identities;
125,000 transactions daily
7 Exabytes
~100-300 Terabytes
FBI NGI ~ over 100 million fingerprints & more coming, plus faces/iris
~100-200 Terabytes
1 GigaByte = 1000 MB
1 TeraByte = 1000 GB
1 PetaByte = 1000 TB
1 ExaByte = 1000 PB
1 ZettaByte = 1000 EB
1 YottaByte = 1000 ZB
US DoS has in the range of
100 million faces & Others
~ at least 10-50 Terabytes
EU VIS Biometrics Matching System (BMS) at
70 million individuals and 100K daily enrollment
Prolific Usage of Mobile Phones
6 Billion Mobile Phones
6 Exabytes of behavior data
1 Billion Arrivals 2012 world wide
United States – 100-200 million
international arrivals 2012
1 Exabytes traveling data
Unique Identification Authority of India (UIDAI) plans to enroll 1.2 billion citizens (UID Program; enroll million/day; half billion by 2014)
3-4 Exabytes Biometrics & Biographic Data
~100 Terabyte
many instances, history, transaction, logs… data in reality
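The per-record arithmetic behind these estimates can be sketched as follows. This is a minimal illustration, not the author's calculation: it uses the decimal units from the ladder on this slide, takes the ten-print range (0.5 to 1 MB) and face size (200 KB) from the slide, and treats the iris size as an unknown parameter defaulting to zero because the slide gives no figure for it. The function names are hypothetical.

```python
# Decimal units, matching the slide's ladder (1 GB = 1000 MB, 1 TB = 1000 GB).
KB = 1_000
MB = 1_000 * KB
TB = 1_000_000 * MB


def template_store_bytes(identities, tenprint_bytes, face_bytes, iris_bytes=0):
    """Raw storage for one record per identity (no history, logs or copies)."""
    return identities * (tenprint_bytes + face_bytes + iris_bytes)


def estimate_range_tb(identities, face_bytes=200 * KB, iris_bytes=0):
    """Return (low, high) terabytes for a ten-print size of 0.5 MB to 1 MB.

    iris_bytes defaults to 0 only because the slide does not state a size;
    that default is an assumption, not a measured value.
    """
    low = template_store_bytes(identities, 500 * KB, face_bytes, iris_bytes)
    high = template_store_bytes(identities, 1 * MB, face_bytes, iris_bytes)
    return low / TB, high / TB


if __name__ == "__main__":
    low, high = estimate_range_tb(150_000_000)
    print(f"150 million identities: {low:.0f} to {high:.0f} TB")
```

For 150 million identities (the DHS IDENT figure above) this gives roughly 105 to 180 TB for a single instance per person, before any history, transactions or logs are counted.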
- 8. Big Data Sources
System Transaction, Log and Transition Data – Several Times More!
- 9. Other Big Data Examples
• By 2016, annual Internet traffic will reach 1.3 Zettabytes
• Google processes > 24 Petabytes of data in a single day
• Facebook processes 500+ Terabytes of data daily
• Twitter processes 12 Terabytes of data daily
• AT&T transfers about 30 Petabytes of data through its network daily
• 150 Exabytes global size of “Big Data” in Healthcare, growing between 1.2 and 2.4 Exabytes / year
• Hadron Collider at CERN generates 40 Terabytes of usable data / day
• For every session, NY Stock Exchange captures 1 Terabyte of trade information
We don’t have the most challenging problem!
- 10. Biometric Performance at Giga Scale*
“Brute Force” De-Duplication
• Cumulative de-duplication: total number of checks = N(N-1)/2 (“Combination Problem”)
• De-duplicating a 100 million population enrollment results in 4,999,999,950,000,000 checks!
• 15 years to complete at 10 million matches per second
Biometric Accuracy Challenge
• FMR at 1 identification false match per million
• 500 false matches with a 1 million enrollment population (de-duplication)
• 5 million false matches with a 100 million enrollment population
Prohibitive!
We have some unique challenges!
* Courtesy of Bojan Cukic
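The figures on this slide can be checked with a short calculation. The sketch below is illustrative only. The pair count and the matching rate come from the slide; the per-comparison false match rate of 1e-9 is an assumption chosen because it reproduces the slide's counts of 500 and 5 million (a per-comparison rate of one in a million would give counts a thousand times larger). Function names are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def pair_checks(n):
    """Number of unordered pairs among n enrollments: N(N-1)/2."""
    return n * (n - 1) // 2


def years_to_complete(n, matches_per_second):
    """Wall-clock years for a brute-force de-duplication at a fixed match rate."""
    return pair_checks(n) / matches_per_second / SECONDS_PER_YEAR


def expected_false_matches(n, per_comparison_fmr):
    """Expected false matches if every pair is compared once.

    per_comparison_fmr is an assumed rate per single comparison.
    """
    return pair_checks(n) * per_comparison_fmr


if __name__ == "__main__":
    n = 100_000_000
    print("checks:", pair_checks(n))
    print("years at 10 million matches/s:", round(years_to_complete(n, 10_000_000), 1))
    print("false matches, 1 million enrolled:", round(expected_false_matches(1_000_000, 1e-9)))
    print("false matches, 100 million enrolled:", round(expected_false_matches(n, 1e-9)))
```

Both the work and the false matches grow with the square of the enrolled population, which is why growing the population 100-fold multiplies each by about 10,000.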
- 11. Face the Challenges
Identity Establishment with All Data Sources
- Leverage Entity Resolution Technologies
- Leverage ‘Context Accumulation’
Biometrics Services in the Cloud
- Leverage Big Data Infrastructure, Platforms
- Leverage Software Services
Biometrics and Identity Analytics in Motion
- Monitor quality
- Monitor performance
- 12. Identity Establishment with All Sources
Biometrics (physical and behavioral)
• Reduce search space and computing resources
• Complement to low quality images
• Cost and benefits tradeoff
• Systematic research necessary
• Successful programs
Biographic information
Behavior data (social media usage)
Travel data (API, PNR)
Credit card / banking information
Web or mobile app usage behavior
• Emails
• Multimedia
Spatial and temporal information
Entity / Identity Resolution with All Sources
Entity / Identity Resolution: a complex process involving the application of sophisticated algorithms across multiple heterogeneous data sources to resolve multiple records into a single fused view of an individual
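The idea of resolving multiple records into a single fused view can be sketched with a small union-find grouping. This is a deliberately simplified illustration, not the algorithm any particular product uses: it links records only on exact matches of identifying fields, whereas real entity resolution typically adds fuzzy and probabilistic matching. The field names and sample records are hypothetical.

```python
from collections import defaultdict


def resolve_entities(records, match_keys):
    """Group records that share a value on any of the match keys.

    records: list of dicts; match_keys: field names treated as identifying.
    Returns a list of fused entities, each a dict of field -> set of values.
    """
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    # Link every record to the first record seen with the same (key, value).
    first_seen = {}
    for idx, rec in enumerate(records):
        for key in match_keys:
            value = rec.get(key)
            if value is None:
                continue
            token = (key, value)
            if token in first_seen:
                union(idx, first_seen[token])
            else:
                first_seen[token] = idx

    # Fuse each cluster into one view that keeps every observed value.
    fused = defaultdict(lambda: defaultdict(set))
    for idx, rec in enumerate(records):
        root = find(idx)
        for field, value in rec.items():
            fused[root][field].add(value)
    return [dict(fused[root]) for root in sorted(fused)]


if __name__ == "__main__":
    records = [
        {"source": "enrollment", "name": "A. Example", "passport": "X123"},
        {"source": "travel", "passport": "X123", "email": "a@example.org"},
        {"source": "web", "email": "a@example.org"},
        {"source": "banking", "name": "B. Sample", "email": "b@example.org"},
    ]
    for entity in resolve_entities(records, ["passport", "email"]):
        print(entity)
```

In the sample data, the web record has no passport number but is still attached to the enrollment record through the shared email on the travel record, a small instance of accumulating context across sources.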
- 13. Biometrics Services in the Cloud - Leverage Big Data Infrastructure, Platform and Software Services
[Diagram: cloud service stack]
Cloud Solutions: Software and Business Process as a Service
• Business Process (BPaaS) and Software (SaaS): Business Analytics and Optimization, Social Business, Smarter Commerce, Smarter Cities
• Biometric services behind a Standard Interface, each with its own Process and Data: Enrolment Service, 1:1 Identification Service, ….
Cloud Services: Infrastructure and Platform as a Service
• Platform (PaaS): Application Services, Application Lifecycle, Application Resources, Application Environments, Application Management, Integration
• Infrastructure (IaaS): Infrastructure Platform, Management and Administration, Availability and Performance, Security and Compliance, Usage and Accounting
• Enterprise, Enterprise+
• Biometric Data: Fingerprint, Face, Iris
Deployment: Private, Public and Hybrid Models
Note: Cloud & Big Data not the same
- 14. Exemplary Progress
A Prototype - Leveraging the cloud for Big Data Biometrics
• E. Kohlwey et al., “Leveraging the Cloud for Big Data Biometrics,” 2011
• A prototype system for generalized searching of cloud-scale biometric data, as well as an application of this system to the task of matching a collection of synthetic human iris images
• Implemented with Hadoop (Map/Reduce framework)
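The map/reduce pattern mentioned above can be illustrated in plain Python. This sketch is not the Kohlwey et al. implementation and does not use the Hadoop API. It assumes iris templates are fixed-width bit codes compared by fractional Hamming distance, splits a toy gallery into shards, and simulates the map, shuffle and reduce phases in a single process. All names and data are hypothetical.

```python
from collections import defaultdict


def hamming_fraction(code_a, code_b, n_bits):
    """Fraction of differing bits between two n_bits-wide integer codes."""
    return bin(code_a ^ code_b).count("1") / n_bits


def map_shard(shard, probes, n_bits):
    """Map step: compare every probe with every gallery entry in one shard."""
    for probe_id, probe_code in probes.items():
        for gallery_id, gallery_code in shard:
            distance = hamming_fraction(probe_code, gallery_code, n_bits)
            yield probe_id, (distance, gallery_id)


def reduce_best(candidates):
    """Reduce step: keep the closest gallery entry for one probe."""
    return min(candidates)


def search(gallery, probes, n_bits, n_shards=2):
    """Run map over each shard, group by probe, then reduce per probe."""
    shards = [gallery[i::n_shards] for i in range(n_shards)]
    grouped = defaultdict(list)  # stands in for the shuffle phase
    for shard in shards:
        for probe_id, candidate in map_shard(shard, probes, n_bits):
            grouped[probe_id].append(candidate)
    return {probe_id: reduce_best(c) for probe_id, c in grouped.items()}


if __name__ == "__main__":
    gallery = [("g1", 0b10101010), ("g2", 0b11110000), ("g3", 0b00001111)]
    probes = {"p1": 0b11110001, "p2": 0b00001111}
    print(search(gallery, probes, n_bits=8))
```

Because each shard is processed independently in the map step, the same structure spreads across many machines; only the small per-probe candidate lists need to travel to the reducers.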
Successful deployment of identification algorithms for the India UID program
• Non-traditional matching vendor technologies
Biometrics as a Service
• Business process as a service
• Software as a service
- 15. Challenges
Focus on Parallelism and Scalability
• Excellent research and testing areas
• Bring algorithms into operational environment
Explore defining biometrics as a service program –
new way of thinking about acquisition
• Business process as a service
• Software as a service
Encourage partnership between Big Data & Analytics developers and traditional biometrics solution providers
• Big Data and Analytics players
- 16. Big Data Appliance Examples
IBM Netezza
Oracle Exadata
Teradata
EMC Greenplum
SAP HANA
Schooner Appliance MySQL
Example (CBP): 40 TB of data (per appliance, a few hundred cores) hosted by a little more than a dozen appliances supports 30-40% of DHS’s operations
- 17. Biometrics and Identity Analytics in Motion
ROC curve calibration along the security vs. convenience tradeoff
• Allow systems to dynamically change operating criteria based on the live situation
• This is a real challenge due to the needed ground truth…
Quality Feedback to the Collection
• Avoid collecting ‘bad’ data that degrades the system
Operating Metrics Monitoring
• Rates of enrollment, rejection, etc.
• Geo-location and temporal information
Fuse all data sources based on real time feedback
• Dynamically allocating fusion algorithms and configurations
Provide controlled parallelism
• System and algorithms levels
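The ROC calibration item above amounts to choosing a decision threshold from labeled match scores. The sketch below shows one simple way to do that, assuming higher scores mean a stronger match and that ground-truth genuine and impostor scores are available (the slide notes that obtaining this ground truth is the hard part). The function names and sample scores are hypothetical.

```python
def false_match_rate(impostor_scores, threshold):
    """Share of impostor comparisons accepted (score >= threshold)."""
    return sum(s >= threshold for s in impostor_scores) / len(impostor_scores)


def false_non_match_rate(genuine_scores, threshold):
    """Share of genuine comparisons rejected (score < threshold)."""
    return sum(s < threshold for s in genuine_scores) / len(genuine_scores)


def calibrate_threshold(genuine_scores, impostor_scores, max_fmr):
    """Lowest observed score usable as a threshold with FMR <= max_fmr.

    Lower thresholds favour convenience (fewer false rejections), so the
    lowest admissible one is returned, with the resulting FMR and FNMR.
    """
    candidates = sorted(set(genuine_scores) | set(impostor_scores))
    for t in candidates:
        fmr = false_match_rate(impostor_scores, t)
        if fmr <= max_fmr:
            return t, fmr, false_non_match_rate(genuine_scores, t)
    raise ValueError("no observed score meets the target FMR")


if __name__ == "__main__":
    genuine = [0.9, 0.8, 0.75, 0.6]
    impostor = [0.1, 0.2, 0.3, 0.7]
    print(calibrate_threshold(genuine, impostor, max_fmr=0.25))
    print(calibrate_threshold(genuine, impostor, max_fmr=0.0))
```

Re-running the calibration with a stricter `max_fmr` moves the operating point toward security at the cost of more false rejections, which is the kind of dynamic adjustment the slide describes.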
- 18. One Approach - Streams Technology at Work
Continuous ingestion, continuous analysis
Operators: Filter / Sample, Annotate, Transform, Correlate, Classify
Infrastructure provides services for
• Scheduling analytics across hardware hosts
• Establishing streaming connectivity
Near Real Time on Big Data Platform
Achieve scale:
• By partitioning applications into software components
• By distributing across stream-connected hardware hosts
Where appropriate:
• Elements can be fused together for lower communication latency
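A chain of stream operators can be sketched with Python generators. This is only a conceptual illustration and does not use any streaming product's API: it implements three of the operator types named on the slide (filter, annotate, classify) over hypothetical capture-quality events, with hypothetical thresholds and field names.

```python
def filter_sample(events, min_quality):
    """Filter: drop events whose quality score is below min_quality."""
    for e in events:
        if e["quality"] >= min_quality:
            yield e


def annotate(events, site_names):
    """Annotate: attach a readable site name to each event."""
    for e in events:
        yield {**e, "site_name": site_names.get(e["site"], "unknown")}


def classify(events, good_quality):
    """Classify: label each event as 'good' or 'marginal'."""
    for e in events:
        label = "good" if e["quality"] >= good_quality else "marginal"
        yield {**e, "label": label}


def run_pipeline(events, site_names, min_quality=20, good_quality=60):
    """Chain the operators; each event flows through as soon as it arrives."""
    stage = filter_sample(events, min_quality)
    stage = annotate(stage, site_names)
    stage = classify(stage, good_quality)
    return stage


if __name__ == "__main__":
    events = iter([
        {"site": "A", "quality": 10},
        {"site": "B", "quality": 45},
        {"site": "A", "quality": 80},
    ])
    for out in run_pipeline(events, {"A": "Airport"}):
        print(out)
```

Chaining the generators inside one process is loosely analogous to fusing elements for lower communication latency; distributing the same operators across hosts would trade that latency for scale.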
- 19. Summary
Re-focus on Identity
• Biometrics as an enabling technology
Re-thinking on
• Open architecture
• Vendor agnostic solution via biometrics middleware
Big Impact by Big Data and Cloud Technologies
• Biometrics as a Service to Leverage Cloud Computing
Big Data Real Time Platform
• Near real time analytics requirements
- 21. A New Look - Identity and Biometrics Analytics
[Diagram: identity and biometrics analytics architecture]
• Real-time feeds: Biometrics Capture Data, Biographic Data; streamed in parallel, including many models
• High-volume unstructured data: Social Media, Info on Web, Behavioral Data, Travel Data, Banking Data, Spatial Data, Temporal Data
• Real-time Business Workflow Resolution; Entity / Identity Resolution Pipeline; Identification Services; Predictive Models; Content Analytics
• Big Data Platform; Big Data Solution; Massively Parallel Processing
• Visualization Analytics; Report (Descriptive Analytics)