Exploring Capturable Everyday
Memory for Autobiographical
Authentication
Sauvik Das, Eiji Hayashi, Jason Hong
1
{sauvikda, ehayashi, jasonh}@cs.cmu.edu
Mobile Authentication Today
2
Mobile Authentication Is…
3
• Underutilized
– Up to 50% do not use
authentication.
• Outdated
– Ported from the
desktop world.
http://confidenttechnologies.com/files/Infographic%20small%20image.png
4
Smartphones know a lot about us.
5
We carry them everywhere.
6
They are our communication hubs.
7
They know where we are.
8
They know where we’re going.
9
We use them to browse the web.
10
…and to take photos.
11
And the apps! Oh, the apps we use!
Key Observation
12
(Venn diagram: Capturable Everyday Memory is the intersection of Smartphone Logs and Human Memory.)
Key Observation
• Capturable Everyday Memory can be used as
the basis for a series of autobiographical
challenge-response questions.
• This is what we call Autobiographical
Authentication.
13
14
What did you eat for lunch
yesterday?
15
What application did you use at
1pm?
16
Who did you call yesterday at
4pm?
CONTRIBUTIONS
17
A Model of Capturable Everyday Memory
• How well can users answer questions about
capturable everyday memory?
• What factors affect their performance?
18
A Framework for Autobiographical
Authentication
• How can we move from raw, noisy question-
answer responses to an authentication
decision?
• How do we handle inaccuracies/lapses in
human memory?
19
METHODOLOGY
20
Approach
• Ran 3 studies to get a handle on both the questions that could potentially be asked and the questions that can actually be asked with current sensors and logs.
• 2 Mturk studies based on self-report data.
• 1 Field study with ground truth data.
21
FIELD STUDY
22
myAuth
• Indexed “knowledge” on the phone:
– Sensor readings
– System-level content-providers
• Capable of asking 13 questions.
23
24
QType Description
FBApp What application did you use on <time>?
FBLoc Where were you on <time>?
FBOCall Who did you call on <time>?
FBInCall Who called you on <time>?
FBOSMS Who did you SMS message on <time>?
FBInSMS Who SMS messaged you on <time>?
FBIntSrc What did you search the internet for on <time>?
FBIntVis What website did you visit on <time>?
NAOSMS Name someone you SMS messaged in the last 24 hours.
NAInSMS Name someone who SMS messaged you in the last 24 hours.
NAOCall Name someone you called in the last 24 hours.
NAInCall Name someone who called you in the last 24 hours.
NAApp Name an application you used in the last 24 hours.
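For concreteness, here is one way these 13 templates might be represented in code. This is a minimal sketch: the QType codes and question wording come from the table above, but the Python representation, the {time} placeholder, and the render_question helper are illustrative assumptions, not the authors' implementation.

```python
from datetime import datetime

# The 13 question templates, keyed by QType code (wording from the table above).
# Fact-based (FB*) templates carry a <time> tag, written here as {time};
# name-any (NA*) templates do not.
QUESTION_TEMPLATES = {
    "FBApp":    "What application did you use on {time}?",
    "FBLoc":    "Where were you on {time}?",
    "FBOCall":  "Who did you call on {time}?",
    "FBInCall": "Who called you on {time}?",
    "FBOSMS":   "Who did you SMS message on {time}?",
    "FBInSMS":  "Who SMS messaged you on {time}?",
    "FBIntSrc": "What did you search the internet for on {time}?",
    "FBIntVis": "What website did you visit on {time}?",
    "NAOSMS":   "Name someone you SMS messaged in the last 24 hours.",
    "NAInSMS":  "Name someone who SMS messaged you in the last 24 hours.",
    "NAOCall":  "Name someone you called in the last 24 hours.",
    "NAInCall": "Name someone who called you in the last 24 hours.",
    "NAApp":    "Name an application you used in the last 24 hours.",
}

def render_question(qtype, when=None):
    """Fill in the time tag for fact-based questions; name-any questions pass through."""
    template = QUESTION_TEMPLATES[qtype]
    if qtype.startswith("FB"):
        return template.format(time=when.strftime("%A, %B %d at %I:%M %p"))
    return template

# render_question("FBOSMS", datetime(2013, 8, 21, 13, 34))
# -> 'Who did you SMS message on Wednesday, August 21 at 01:34 PM?'
```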
25
QType Description
FBApp What application did you use on <time>?
FBLoc Where were you on <time>?
FBOCall Who did you call on <time>?
FBInCall Who called you on <time>?
FBOSMS Who did you SMS message on <time>?
FBInSMS Who SMS messaged you on <time>?
FBIntSrc What did you search the internet for on <time>?
FBIntVis What website did you visit on <time>?
NAOSMS Name someone you SMS messaged in the last 24 hours.
NAInSMS Name someone who SMS messaged you in the last 24 hours.
NAOCall Name someone you called in the last 24 hours.
NAInCall Name someone who called you in the last 24 hours.
NAApp Name an application you used in the last 24 hours.
26
QType Description
FBApp What application did you use on <time>?
FBLoc Where were you on <time>?
FBOCall Who did you call on <time>?
FBInCall Who called you on <time>?
FBOSMS Who did you SMS message on <time>?
FBInSMS Who SMS messaged you on <time>?
FBIntSrc What did you search the internet for on <time>?
FBIntVis What website did you visit on <time>?
NAOSMS Name someone you SMS messaged in the last 24 hours.
NAInSMS Name someone who SMS messaged you in the last 24 hours.
NAOCall Name someone you called in the last 24 hours.
NAInCall Name someone who called you in the last 24 hours.
NAApp Name an application you used in the last 24 hours.
27
Recognition vs. Recall: fact-based questions were posed either as multiple-choice (recognition) or free-response (recall) questions.
Study Design
• Answer 5 questions a day for 14 days.
• Incentive to answer questions correctly.
• Could skip questions if they chose.
28
Descriptive Stats
• 24 users
– Average age: 25 (s.d. 6.25, range 18-43)
– 14 male (58%)
• 2167 question-answer responses collected
29
FIELD STUDY RESULTS
30
1381 questions answered
correctly (64%)
+
168 near misses (8%)
31
Recognition > Recall
68% of recognition questions were answered correctly vs. 62% of recall questions (p = 0.008).
32
Performance Stable Over Time
No difference between
first and last 20% of
responses (64.1% vs.
64.8%, p = 0.73).
33
Question Type Matters
34
Time Bucketing Does Not Help
No difference between
“Fact-Based” and
“Name-Any” questions
(64% vs. 63%, p=0.5).
35
AUTOBIOGRAPHICAL
AUTHENTICATION
36
Users only get 64% of answers correct.
But, performance is stable and errors
are systematic.
37
Given the systematic and stable nature of user
errors, we can make an authentication decision
given both correct and incorrect answers.
38
Confidence Estimator
39
C(u | seq, û) = P(u | seq, û) * S(seq | u)
where u is the user model, seq is the observed sequence of question-answer responses, and û is the adversary model.
• C(u | seq, û): confidence that the attempting authenticator is the user. Range 0 – P(u).
• P(u | seq, û): probability that the observed sequence comes from the user, given an adversary model.
• S(seq | u): value, from 0 to 1, indicating how well the observed sequence of responses matches what we expect from the user.
AutoAuth takes in a sequence of
autobiographical question-answer responses
and an adversary model, and outputs a
confidence score from 0-P(u).
40
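To make the estimator concrete, here is a minimal, self-contained sketch of how the pieces fit together. Only the formulas come from the deck (the Bayesian term from slide 67, the bit-string similarity from slide 70, and their product); the dictionary-based user and adversary models, the function names, the 0.95 prior (the notes mention confidence plotted on a 0-95 scale), and the toy numbers are assumptions for illustration.

```python
# Minimal sketch of the AutoAuth confidence estimator (illustrative; not the authors' code).
# A "model" maps a question type to the probability of answering that type correctly.
# seq is a list of (qtype, answered_correctly) pairs observed in one authentication session.

def seq_likelihood(model, seq):
    """P(seq | model): probability of the observed correct/incorrect pattern under a model."""
    p = 1.0
    for qtype, correct in seq:
        p_correct = model[qtype]
        p *= p_correct if correct else (1.0 - p_correct)
    return p

def bit_string_similarity(user_model, seq):
    """S(seq | u) = (n - |E(seq|u) - correct(seq)|) / n (slide 70), reading |.| element-wise."""
    n = len(seq)
    diff = sum(abs(user_model[q] - (1.0 if c else 0.0)) for q, c in seq)
    return (n - diff) / n

def confidence(user_model, adversary_model, seq, prior_u=0.95):
    """C(u | seq, û) = P(u | seq, û) * S(seq | u); ranges from 0 to prior_u."""
    p_user = seq_likelihood(user_model, seq)
    p_adv = seq_likelihood(adversary_model, seq)
    posterior = (p_user * prior_u) / (p_user + p_adv)   # slide 67
    return posterior * bit_string_similarity(user_model, seq)

# Toy example: a user who is good at call questions but bad at app questions.
user = {"FBInCall": 0.8, "FBApp": 0.3}
naive_adversary = {"FBInCall": 0.1, "FBApp": 0.1}   # 1/10 guess chance (slide 44)
observed = [("FBInCall", True), ("FBApp", False)]   # matches the user's usual pattern
print(confidence(user, naive_adversary, observed))  # ~0.61
```

Plugging in per-question-type accuracies estimated from a user's training data, together with one of the adversary models described in the evaluation, yields the confidence scores discussed next.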
EVALUATION
41
Evaluation Plan
• Use field study data to run simulations.
– What confidence scores would users get versus
impersonators?
• Simulate 5 probable adversaries, weak and
strong.
42
5 Adversaries Simulated (weakest to strongest)
Naïve Adversary
Observing Adversary
Always Correct Adversary
Empirical Observing Adversary
Empirical Knows Correct Adversary
43
Naïve Adversary
• Simulates a complete stranger who steals the phone.
• Has a 1/10 chance of guessing the correct answer (a random guess on a recognition question).
44
Always Correct Adversary
• Simulates a stalker, or software that has compromised the knowledge base.
• Always answers correctly.
45
Empirical Knows Correct Adversary
• Simulates an adversary with all correct answers
and an understanding of how an average user
answers questions.
• Purposely gets certain questions wrong to
best simulate the “average” user.
46
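One lightweight way to express these simulated adversaries is as per-question-type probabilities of answering correctly, the same shape as the user model in the earlier sketch. The naïve 1/10 guess rate and the ACA's perfect knowledge come from the slides; the EKCA rates below are rounded from the per-question-type accuracies reported in the field-study results (about 80% for call/SMS questions, roughly 30% and 20% for app and web-search questions) and only stand in for the empirical distribution the EKCA would actually estimate.

```python
# Per-question-type probability of a correct answer, for a few illustrative question types.
QTYPES = ["FBInCall", "FBOSMS", "FBApp", "FBIntSrc"]

# Naive Adversary: a stranger guessing; 1/10 chance on any question (slide 44).
naive = {q: 0.10 for q in QTYPES}

# Always Correct Adversary (ACA): a stalker, or malware that has compromised the
# knowledge base; it knows every answer (slide 45).
always_correct = {q: 1.00 for q in QTYPES}

# Empirical Knows Correct Adversary (EKCA): knows every answer, but deliberately misses
# the question types an "average" user tends to miss (slide 46). The rates below are
# rounded from the field-study accuracies reported earlier in the deck and stand in for
# the empirical distribution the EKCA would estimate from a separate set of users.
average_user_accuracy = {"FBInCall": 0.8, "FBOSMS": 0.8, "FBApp": 0.3, "FBIntSrc": 0.2}
ekca = dict(average_user_accuracy)

# Any of these can be passed as adversary_model to the confidence() sketch shown earlier.
```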
USER PERFORMANCE
Evaluation Results
47
(Slides 48-50: plots of the mean confidence score actual users obtain as they answer more questions, when modeled against the Naïve, Always Correct, and Empirical Knows Correct adversaries.)
ADVERSARY PERFORMANCE
Evaluation Results
51
Against Naïve
52
Against ACA
53
Against EKCA
54
Evaluation Take-Aways
• Users always get relatively high confidence
scores.
• We can easily defend against simple
adversaries.
• Advanced adversaries do better but can also
be detected when modeled against.
55
CONCLUSION
56
Summary
• People answered 64% of autobiographical
questions correctly, on average.
• Their errors can be signals, too!
• Autobiographical Authentication is promising
and robust against some tough adversaries.
57
Limitations
• Autobiographical Authentication is slow (22
seconds on average per question).
• Requires constant device usage to replenish
the knowledge base.
• Remains unclear how users will react to this
sort of authentication in practice.
58
Questions?
59
Practical Use Cases
• Password Reset.
• Scalable, Dynamic Authentication.
• Tiered Authentication.
60
Why might we want this?
• Scalable by context.
• Authentication is dynamic.
– Shoulder-surfing, social engineering, brute-force
attacks are all much harder to execute.
61
Usability Sanity Check
62
DERIVING THE AUTOAUTH
EQUATION
63
Systematic Response Error Model
Given an observed sequence of
responses, seq, and a user, u, we want
something like:
P(u | seq)
64
65
Using Bayes' Law
P(u | seq) = P(seq | u) P(u) / P(seq)
• P(u): prior probability that it is the user.
• P(seq | u): probability that we would observe this sequence of responses from the user.
• P(seq): overall probability that we would observe this sequence of responses.
66
Adopting an Adversary
P(seq) is hard to compute perfectly: it requires knowledge of all possible
impersonators. But we can break it into two components.
P(seq) = P(seq | u) + P(seq | û)
where û is the adversary model.
Modified Bayesian Equation
67
P(u | seq, û) = P(seq | u) P(u) / (P(seq | u) + P(seq | û))
Problem: To Get a High Score…
68
P(u | seq, û) = P(seq | u) P(u) / (P(seq | u) + P(seq | û))
…an impersonator just needs to minimize the P(seq | û) term in the denominator.
Unlike Adversary Model Attack
A clever impersonator can get a high score
simply by being unlike the adversary
model, even if s/he is not like the user.
69
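A quick numeric illustration of this attack, using the modified Bayesian equation from slide 67 (the probabilities below are made up for illustration):

```python
def posterior(p_seq_user, p_seq_adv, prior_u=0.95):
    """P(u | seq, û) = P(seq|u) P(u) / (P(seq|u) + P(seq|û)), per slide 67."""
    return (p_seq_user * prior_u) / (p_seq_user + p_seq_adv)

# Suppose the user gets this question type right 70% of the time, and the adversary
# model is the Always Correct Adversary. An impersonator who deliberately answers
# incorrectly makes P(seq|û) = 0, so the score jumps straight to the prior...
print(posterior(p_seq_user=1 - 0.7, p_seq_adv=0.0))   # 0.95, even though answering
                                                       # incorrectly is unlike the user too.
```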
Add Bit-String Similarity
70
S(seq | u) = (n - |E(seq | u) - correct(seq)|) / n
• S(seq | u): bit-string similarity.
• n: number of responses.
• E(seq | u): expected answer correctness from the user.
• correct(seq): actual answer correctness from the authenticator.
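Continuing the numbers from the attack sketch above, S(seq | u) is what penalizes that behavior: an authenticator whose correctness pattern does not match the user's expected pattern has its score scaled down. This reads |·| as a summed element-wise difference, which is one reasonable interpretation of the slide's formula:

```python
# Expected correctness from the user for two such questions: E(seq|u) = [0.7, 0.7].
# The "unlike the adversary" attacker answered both incorrectly: correct(seq) = [0, 0].
n = 2
diff = abs(0.7 - 0) + abs(0.7 - 0)   # |E(seq|u) - correct(seq)| = 1.4
similarity = (n - diff) / n          # S(seq|u) = 0.3
print(similarity * 0.95)             # the 0.95 posterior from the attack above drops to ~0.29
```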
Final Equation
71
C(u | seq, û) = P(u | seq, û) * S(seq | u)
• C(u | seq, û): confidence that the attempting authenticator is the user. Range 0 – 100.
• P(u | seq, û): probability that the observed sequence comes from the user, given an adversary model.
• S(seq | u): bit-string similarity between the correctness of the observed sequence and the expected correctness of the observed sequence.
Modeling Capturable Everyday
Memory
• Used a Mixed-Effects Logistic Regression
• Users as a random effect
– Each user had his/her own baseline likelihood of
getting a question correct (intercept).
72
Fixed Effect Description
Age Integer age.
Gender Male or Female.
Time to Answer Number of seconds it took the user to answer the question.
Time since Correct Answer Number of hours since the correct answer event occurred.
Day of Study The day number of the study (0-13).
Correct Answer Entropy The Shannon entropy of correct answers for this question type.
Answer Uniqueness Inverse of the percentage of times that the correct answer to
this question was the correct answer to this type of question.
Confidence Self-reported confidence in the answer (1-5).
Ease of Remember Answer Self-reported ease of remembering the answer (1-5).
Difficulty of Others Guessing Self-reported perceived difficulty of others guessing the answer
(1-5).
Answer Type Recognition or Recall.
Question Type The question type.
73
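As a sketch of how this model might be reproduced: the two less self-explanatory fixed effects can be computed directly from the question logs, and a per-user random-intercept logistic regression can be fit with, for example, statsmodels (the deck does not say which software the authors used; R's lme4::glmer would be another common choice). The dataframe columns, file name, and formula below are assumptions.

```python
import math
from collections import Counter

import pandas as pd
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

def correct_answer_entropy(correct_answers):
    """Shannon entropy of the correct answers observed for one question type."""
    counts = Counter(correct_answers)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def answer_uniqueness(correct_answers, this_answer):
    """Inverse of the fraction of times this answer was the correct answer for its type."""
    frac = correct_answers.count(this_answer) / len(correct_answers)
    return 1.0 / frac

# responses.csv is hypothetical: one row per question-answer response with a 0/1
# 'correct' outcome, a 'user' id, and the fixed effects as columns.
df = pd.read_csv("responses.csv")
model = BinomialBayesMixedGLM.from_formula(
    "correct ~ age + time_to_answer + confidence + answer_type + qtype",
    vc_formulas={"user": "0 + C(user)"},   # per-user random intercept
    data=df,
)
result = model.fit_vb()                    # variational Bayes fit
print(result.summary())
```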
Model Coefficients
74
Unique Answers Hard to Remember
75
Systematic Response Error Model
• A simplified system:
– Two question types
– Training data
– An observed authentication attempt
76
Training Data Probability Distribution
QType  Probability Correct
QT1    0.7
QT2    0.4

Observed Question-Answer Response Sequence
#  QType  Correct?
1  QT1    Yes
2  QT2    No
Systematic Response Error Model
• We can calculate the probability that we observe
a sequence of responses given the user:
77
Training Data Probability Distribution
QType  Probability Correct
QT1    0.7
QT2    0.4

Observed Question-Answer Response Sequence
#  QType  Correct?
1  QT1    Yes
2  QT2    No

P(seq | u) = P(correct(QT1) | u) * P(incorrect(QT2) | u) = 0.7 * (1 - 0.4) = 0.42
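The same worked example as a few lines of code, directly mirroring the slide's arithmetic:

```python
# Training-data probability of a correct answer, per question type (slide 76).
p_correct = {"QT1": 0.7, "QT2": 0.4}

# Observed sequence: QT1 answered correctly, QT2 answered incorrectly.
seq = [("QT1", True), ("QT2", False)]

p_seq_given_user = 1.0
for qtype, correct in seq:
    p_seq_given_user *= p_correct[qtype] if correct else (1.0 - p_correct[qtype])

print(p_seq_given_user)   # 0.7 * (1 - 0.4) = 0.42
```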
78
Adopting an Adversary
We can simplify the calculation of
P(seq) by adopting a specific
adversary model, û.
Location Question With Map
79
Editor's Notes

  1. Today, I want to talk about a new form of authentication for smartphones: Autobiographical Authentication. Autobiographical Authentication is rooted in the observation that we can use shared information between user and smartphone as the basis for identifying users. But, you might wonder, why do this at all?
  2. Welllll, no. Presently, mobile authentication looks a lot like this: PINs, passwords, or graphical passwords, all forms of authentication that have worked well in, and been ported from, the desktop and laptop worlds.
  3. But, while smartphones are increasingly used for security-sensitive tasks, as many as 50% of users do not use these forms of authentication on their phones. One problem stifling adoption might be that authentication today is generally a non-mobile experience. Perhaps we can do better by using the extra bits of information provided by smartphone sensors. But how?
  4. Well, smartphones capture incredibly rich information about our day-to-day lives. Frankly, they know a lot about us. Think about it:
  5. We carry them everywhere, and use them in almost all situations.
  6. They are our primary means of remote communication with friends, loved ones, work colleagues and even strangers.
  7. They are equipped with rich sensors that allow them to know where we are at any given time,
  8. and where we’re going.
  9. We also use them extensively, and sometimes exclusively, to browse the web.
  10. And to take photos of people browsing the web.
  11. And the apps. Between ToDo Lists, Games, Maps, and Mail, there’s little about our lives that isn’t encoded somewhere in our phones.
  12. So, yeah. Smartphones know a lot about us. In turn, human memory might accurately recall some subset of these rich smartphone logs. This intersection is what I refer to as “capturable everyday memory.”
  13. In this project, I wanted to explore to what extent capturable everyday memory can be used as the basis for authentication. How? By asking users a series of questions about their day-to-day experiences, such as:
  14. “what did you eat for lunch yesterday?”,
  15. “What application did you use at 1pm?”,
  16. “Who did you call yesterday at 4pm?” This is what I call Autobiographical Authentication. But that’s a mouthful, so I’m going to abbreviate it to AutoAuth for this presentation.
  17. So, in pursuit of this grand vision, I offer two contributions through the present work.
  18. First, I offer a model of capturable everyday memory. I answer how well users can answer questions about capturable everyday memory and what factors affect its retention. There is a plethora of past work on everyday autobiographical memory, and I use this to inform the factors I consider in my model (for example, age, gender, and answer uniqueness). But, as far as I know, no work before mine has tried to model the intersection of smartphone logs and everyday memory.
  19. Second, I offer a framework for AutoAuth that allows for natural user error. In other words, I answer the question: how do we move from raw, noisy question/answer responses to an authentication decision? Human memory is not perfect. For AutoAuth to work, then, inaccuracies should be expected and handled gracefully. While prior work has noted the promise of using dynamic autobiographical questions for authentication, no work to my knowledge has actually developed an end-to-end framework for dealing with error and making authentication decisions.
  20. So that’s neat, but what did I actually do?
  21. Our approach was two-pronged. To get a broad understanding of the questions that could potentially be asked, I ran two open-ended Mechanical Turk studies based on self-report data. Next, to get a more detailed, rigorous handle on the questions that can actually be asked based on present sensors and system logs, I deployed an Android app that actually indexed ground truth smartphone logs and generated questions based on them. For today’s presentation, I will focus only on the results of the field study. Please refer to the paper for the Mturk results.
  22. So let’s talk a little more about the field study itself.
  23. The app I built, myAuth, indexed just about all universally accessible information on Android phones, including pertinent sensor readings (such as location sensor readings), app usage information, users’ call and SMS logs, and web browsing and search history. From this information, myAuth was able to ask 13 different questions.
  24. These are the 13 questions, along with a corresponding shortened code for easy reference. These questions were selected because they were considered memorable in the Mturk self-report studies.
  25. You might notice that the first 8 questions have a dynamically generated time tag. These questions are what I call specific fact-based questions, or just fact-based questions for short. The abbreviations for these questions are prefixed with an FB. These questions were generated with a specific answer in mind. For example, “Who did you SMS message on Wednesday, August 21st at 1:34pm?” There is only one correct answer to this question.
  26. On the other hand, the last five questions have no such time tag. These questions are what I call “name-any” questions, and have abbreviations prefixed with an NA. For these questions, any corresponding event that occurred within the past 24 hours was considered a correct answer. For example, consider the question “Name someone you SMS messaged in the past 24 hours.”
  27. Fact-based questions could also be recognition or recall questions, or multiple-choice vs. free-response questions. For recall questions, users had to enter an open-ended response to answer the question. For recognition questions, users had to select the correct answer out of 10 answers, including some near-miss distractors. Near-miss distractors were answers that were almost right, for example, the name of someone you text messaged at 1:20 instead of 1:34.
  28. Moving on to the study design, participants had to answer at least 5 questions a day for 14 days and were given monetary incentives to answer questions correctly. They were also able to skip questions they found uncomfortable answering.
  29. I recruited 30 users from CBDR, and 24 of them actually followed through with the study. Participants were instructed to download myAuth through Google Play. Participants were, on average, 25 years old and 14 of them were male. Over the course of the 2 weeks, these 24 participants generated 2167 question-answer responses.
  30. Now to the interesting stuff, the results! Note that there is a much more detailed description of the analysis and methods used to model capturable everyday memory in the paper. Here, I focus on high-level takeaways that directly inform AutoAuth.
  31. First, the question on everyone’s mind: how well could users answer these questions? 1381 out of the 2167 responses, or 64%, were answered correctly, with an additional 168 near misses. As expected, participants performed well but not perfectly. They generally get questions right, but it is hardly uncommon for them to get questions wrong.
  32. Moving on, the next question we wanted to answer was how much better recognition is than recall, as recall questions have better security properties. Unsurprisingly, multiple-choice questions are answered significantly more accurately than free-response questions, at 68% versus 62%. Even so, 68% accuracy is hardly stellar. There is still substantial inaccuracy that will need to be dealt with in making authentication decisions.
  33. We also wanted to answer whether or not there was a substantial learning effect. In other words, even if users are initially bad at answering these questions, do they get better at it? In fact, performance appeared to be stable over the short term, with a negligible learning effect. Indeed, there was no statistically significant difference between the accuracy of the first 20% and the last 20% of responses by users.
  34. Next, we wanted a better handle on how well users could answer different questions. (click) It seems that fact-based questions about Incoming Calls and Outgoing SMS Messages were answered most correctly at just over 80%, (click) while questions about what users searched for on the internet or what application they were using at a specific time were answered correctly at much lower rates: just 28% for application questions, and 18% for web search questions. (click) In general, it seems that users are good at answering questions about communication, (click) but bad at answering questions about phone and web usage.
  35. Finally, we wanted to know how much time-bucketing helps, as fact-based questions have better security properties. But, surprisingly, time-bucketing did not help. There was no statistically significant difference between how often users got fact-based and name-any questions correct. This is good news from a security perspective.
  36. So how can we turn all of what we just learned into an authentication system?
  37. Well, we know the straightforward approach would not work. AutoAuth is not like passwords, and we cannot expect that a user will answer every question presented correctly. On the other hand, users’ performance appears to be stable over the short term, and it seems that users make systematic, predictable errors.
  38. And there is the key insight. Given the systematic nature of user errors, incorrect answers can also be a signal. In other words, we can use both correct and incorrect answers as indicators that the attempting authenticator is, in fact, the user. A simple example would be that if I generally get questions about location correct, but questions about outgoing SMS messages incorrect, AutoAuth should look for that pattern in my responses.
  39. So, the operationalization of that insight is this mess that we call the confidence estimator: the cornerstone of AutoAuth. Note that my goal here is to simply provide you with high-level intuition. The details of how we arrived at this equation are specifically outlined in the paper; here I’ll just summarize each of these terms. (click) (click) Our estimator takes in three parameters: a user model derived from the training data, u; the observed sequence of question/answer responses from an authentication session, seq; and a notion of an ‘adversary’, û. The adversary model is a computational representation of an impersonator that we believe might be trying to crack the system. (click) Given these three parameters, our estimator outputs a confidence estimate from 0 to the prior probability that the attempting authenticator is the user. We compute this estimate as the product of two terms: one that encapsulates the relative likelihood that what we observe comes from the user and not the adversary, and a second that encapsulates how well what we observe matches with what we expect from the user’s prior performance. (click) More specifically, the first term is a Bayesian model that encapsulates the likelihood that seq comes from the user and not the adversary. We need an adversary model for this term because in order to compute the likelihood that an observed sequence of responses comes from the user, we also need the likelihood that it does *not* come from the user. Our adversary model is simply a simplification of all possible definitions of “not user”. (click) Finally, the second term encapsulates how well what we observe matches what we expect, given that the attempting authenticator is the user. For example, if a user generally gets questions about location correct, we expect the user to continue to get these questions correct in the observed sequence. If that is what we observe, then this value will be high. If what we expect is not what we observe, this value will be low.
  40. To summarize, AutoAuth takes in a sequence of autobiographical question-answer responses and an adversary model and outputs a confidence score from 0 to the prior probability that the authenticator is the user.
  41. That all seems great in theory, but how well did the model perform, actually?
  42. Specifically, I wanted to answer the question: if our field study users were actually using AutoAuth, what confidence scores would they be getting? What confidence scores would adversaries get? The first step towards answering these questions was to pick the adversaries to simulate, both weak and strong.
  43. In total, I simulated 5 adversaries based on plausible real-life counterparts. For this presentation, I will only talk about 3: 1 simple and 2 advanced. Please refer to the paper for the others.
  44. The Naïve adversary is given a 1/10 chance of guessing any question correct, simulating a random guess on a recognition question. This adversary might represent a complete stranger who steals the user’s phone.
  45. The Always Correct adversary, or ACA, always answers every question correctly. While the concept is simple, this adversary is quite advanced, potentially representing malware that has compromised the knowledge base but not the authenticator.
  46. The Empirical Knows Correct Adversary, or EKCA, is the most advanced adversary I consider. This adversary both knows the correct answer to every question and is smart about when he answers correctly. The EKCA has created an empirical probability distribution that any user will get any type of question correct from a separate set of users, and uses this empirical distribution to selectively answer correctly or incorrectly to best emulate the empirical average.
47. So, how well does AutoAuth do? Specifically, what sort of confidence scores do users get when our confidence estimator adopts the aforementioned adversaries?
48. So, we have this graph. The x-axis represents the number of questions answered in a single session, ranging from 1 to 13. The y-axis represents the authentication confidence output by our confidence estimator, ranging from 0 to 95, where 95 is the prior probability that the attempting authenticator is the user. This graph plots the mean confidence estimate our field-study users get from AutoAuth as they answer an increasing number of questions. The translucent shading around the plot is the 95% confidence interval. There are two things to pay attention to here. One is the direction of the trend: we want the plot to move up and to the right. After all, if real users got decreasing confidence estimates, our estimator would not be doing a very good job! The second is the actual confidence values. We can see that when actual users attempt to authenticate and the adversary is naïve, users quickly get very high confidence scores. Note that 80 is a fairly high confidence score, because the maximum value would imply that the data we observe has a 0% chance of coming from the adversary and a 100% chance of matching our expectations of the user. This is very unlikely in practice!
49. Here, I've superimposed the confidence scores that real users would obtain if modeled against an Always Correct Adversary. This adversary is simple, but represents a very salient threat: a stalker who carefully chronicles a user, or a malicious program that has direct access to the knowledge base. The results are quite promising. We can see that while the initial confidence is markedly lower than when modeled against the naïve adversary, users quickly obtain high confidence estimates even against this powerful adversary.
50. Finally, I've superimposed real users' confidence estimates when modeled against the most powerful adversary: the EKCA. Not surprisingly, the confidence estimates offered by AutoAuth are much more conservative when modeling against this adversary. It is, after all, an incredibly sophisticated adversary. Nevertheless, the trend is still upwards, suggesting that users can still achieve high confidence scores, albeit with a greater time investment. In general, it seems that AutoAuth always yields a reasonably high confidence score when actual users are trying to authenticate. These results are encouraging when you remember that the ACA and EKCA are fairly sophisticated adversaries that already have access to incredibly sensitive data about the user.
51. Next, we wanted to see how well adversaries could impersonate users, so we simulated the confidence scores our three adversaries would obtain when modeled against each other.
52. The next question is how well adversaries can impersonate users. This plot shows how the user and our three adversaries perform after answering five and eight questions in a single session, when modeled against a naïve adversary. The actual user's performance is also shown as a comparison point. We should see that the confidence score for the adversary being modeled against is low, while the user's confidence remains high. The bars for adversaries not modeled against may be low or high: ideally low, but it would not be surprising if they were high. We see exactly what we expect. The user performs very well, while the naïve adversary itself performs very poorly. However, advanced adversaries like the ACA and the EKCA also obtain high confidence scores when AutoAuth adopts the naïve adversary model.
53. But there is hope! Here we see the same graph, except with scores derived when modeled against the Always Correct Adversary. This graph is very encouraging, as we can see the ACA's confidence score drop dramatically: from near 80 to less than 20. In other words, when specifically modeled against, the Always Correct Adversary acquires very low confidence estimates from AutoAuth. Users themselves, however, continue to acquire high scores, suggesting a negligible usability hit when adopting this stronger model.
54. Lastly, we see the same graph with scores modeled against the EKCA. AutoAuth generally produces much more conservative estimates across the board when it adopts this powerful adversary model, as we would expect. Nevertheless, the user still performs well relative to impersonators, while the EKCA performs less well and obtains lower scores as it answers more questions. This is especially encouraging, because it suggests that the EKCA cannot simply answer more questions to get a higher score.
55. In sum, AutoAuth performs reasonably well. Users always get relatively high and increasing confidence estimates, no matter the adversary modeled against. Furthermore, AutoAuth easily holds up against simple adversaries like the naïve adversary, which is great when you consider that these are by far the more common adversaries. On the other hand, advanced adversaries can fool AutoAuth when it adopts a simple adversary model, but even these adversaries encounter roadblocks when specifically modeled against. These are all promising results, suggesting that AutoAuth might actually be useful in practice!
  56. So what have we learned from all of this?
57. One of the key points I hope you take away is that while people are not flawless at answering questions about capturable everyday memory, even their inaccuracies can be used as useful signals for authentication! Indeed, Autobiographical Authentication does just that and shows a lot of promise. It accounts for systematic response error and performs well even against some incredibly sophisticated adversaries.
58. But as with any study, there are limitations with this one that offer fruitful avenues for future work. For one thing, Autobiographical Authentication is slow: it takes users 22 seconds on average to answer each question. Additionally, AutoAuth requires constant device usage to replenish its knowledge base. Most of its nice properties only arise in environments where the same question never has to be asked twice. Finally, it remains unclear how users will react to this sort of authentication. Edge cases, such as when a user genuinely but uncharacteristically answers all questions correctly, might break usability unless they can be handled gracefully.
  59. With that, I’ll open the floor to questions.
60. Finally, a word on the practical use cases for AutoAuth as it stands. As AutoAuth is presently slow, one can imagine that, while it is not a replacement for passwords, it might be useful as a fallback: presently, challenge questions that query static facts are used as password-recovery challenges, and AutoAuth might be a more robust and secure alternative. AutoAuth is also very amenable to context-aware scalable authentication, which scales authentication to be more challenging or simpler based on how risky the context is; one can imagine scaling the minimum confidence threshold required for authentication with AutoAuth. Also, as AutoAuth is dynamic, many threats such as brute-force attacks and shoulder surfing become less potent. Finally, AutoAuth can also conceivably be used for tiered authentication, where not all data is treated equally: read access to text messages might require low confidence, while write access to security settings might require high confidence.
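The last two ideas lend themselves to a simple policy layer on top of the confidence score. Here is a minimal sketch under my own assumptions; the contexts, actions, and threshold values are hypothetical, not from the paper.

```python
# Scale the required confidence both by how risky the context is and by how
# sensitive the requested action is, then grant access only if AutoAuth's
# confidence clears the stricter of the two requirements.
CONTEXT_RISK = {"home": 0.40, "office": 0.55, "unknown_city": 0.80}
ACTION_SENSITIVITY = {"read_sms": 0.45, "install_app": 0.65,
                      "change_security_settings": 0.90}

def required_confidence(context: str, action: str) -> float:
    return max(CONTEXT_RISK[context], ACTION_SENSITIVITY[action])

def authorize(confidence: float, context: str, action: str) -> bool:
    return confidence >= required_confidence(context, action)

# Reading text messages at home needs far less confidence than changing
# security settings from an unfamiliar city.
print(authorize(0.50, "home", "read_sms"))                          # True
print(authorize(0.50, "unknown_city", "change_security_settings"))  # False
```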
61. So you might be wondering at this point: what does AutoAuth offer? Here are two of my favorites. First, AutoAuth is pretty neat in that it can be scaled by context. In other words, if you're in a risky context (for example, in an unknown city for the first time), AutoAuth might ask you to answer 5 or 6 questions; but if you're at home, it might ask you to answer just 1 question. Second, AutoAuth allows authentication to be dynamic and ever-changing. Thus, attacks built with static challenge-response in mind become much harder to execute. For example, even if someone sees me answering a question once, that doesn't necessarily give them insight into the question they would have to answer if they stole my phone.
62. This graph plots participants' mean confidence scores, along with the 95% standard errors, after answering a given number of questions within a single session. The different lines represent the user's confidence scores computed when modeled against different adversaries. The first encouraging result is that, no matter the adversary, the user's confidence score appears to increase as he answers more questions. Unsurprisingly, AutoAuth offers the highest confidence estimates when modeled against the weakest adversary: the naïve adversary. In this case, after just 5 questions, the user achieves a confidence score close to 80. AutoAuth offers the most conservative estimates when it adopts the strongest adversary model: the EKCA. Nevertheless, the trend is still upwards, suggesting that users can answer more questions to achieve higher confidence. In general, it seems that AutoAuth always yields a reasonably high confidence score when actual users are trying to authenticate.
63. So that's a cool idea, but how do we operationalize it? Well, what we have is a user model with training data, u. Let's say we have an observed sequence of question-answer responses, seq, that represents an authentication session, and we're trying to make an authentication decision. Given this information, we want to compute something like P(u | seq): the probability that the attempting authenticator is the user, given the observed sequence of responses.
64. Bayes' Theorem tells us a lot about how to get what we want from what we have. (click) The first term in the numerator is P(seq | u), the probability that we would observe the given sequence of responses from the user. This is easy to calculate from training data. Imagine I'm trying to log in to my phone, and it asks me: what did you eat for lunch yesterday? My phone knows, from the training data, that I answer this question correctly 70% of the time, so P(seq | u) is just that: 0.7. (click) The second term in the numerator, P(u), is simply the prior probability that the authenticator is the user. For personal mobile devices, this is just a high constant: something close to, but not quite equal to, 1. (click) The denominator, P(seq), represents the overall probability that we would observe this sequence of responses.
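Written out, the relationship just described is standard Bayes' rule:

```latex
P(u \mid seq) \;=\; \frac{P(seq \mid u)\, P(u)}{P(seq)}
```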
65. P(seq) is hard to compute because it requires knowledge of all possible impersonators. But we can break P(seq) down into two components that make it more manageable: the probability that the sequence comes from the user, plus the probability that the sequence does not come from the user, or P(seq | u hat). (click) u hat is what I call the adversary model. Rather than try to enumerate all possible non-users, we model a specific kind of non-user and use that model as a representation of who we consider our most likely impersonator.
66. The modified equation takes in an additional parameter, u hat, and breaks the denominator down into the two components I just mentioned.
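One plausible reading of this modified form, consistent with the decomposition just described and with the 0-to-prior range mentioned earlier (the paper gives the exact equation), is:

```latex
P(u \mid seq, \hat{u}) \;\approx\; \frac{P(seq \mid u)\, P(u)}{P(seq \mid u) + P(seq \mid \hat{u})}
```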
67. But you might have noticed that this simplification has introduced a problem. (click) To get a high score, an authenticator need only minimize the second term of the denominator: P(seq | u hat).
  68. In other words, a clever impersonator can get a high score simply by being unlike the adversary, even if he is nothing like the user.
69. We can fix this by adding an additional term: the bit-string similarity between the expected and actual correctness of the observed sequence of responses. I won't spend too much time explaining this term, but the important takeaway is that it encapsulates how well the observed sequence of responses matches our expectations if it were actually the user, on a scale of 0 to 1. Please refer to the paper for more details.
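As one simple instantiation of such a term (the paper defines the exact metric), the similarity between the expected correctness bit-string E and the observed one O over n questions could be written as:

```latex
\mathrm{sim}(E, O) \;=\; 1 - \frac{\mathrm{Hamming}(E, O)}{n} \;\in\; [0, 1]
```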
70. So, we finally arrive at the final equation. Our confidence in the user is the probability that the observed sequence comes from the user, given an adversary model, times the similarity between the expected and observed sequences of responses.
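Putting the pieces together, the estimator sketched in this talk has roughly the form below, where E_u(seq) is the correctness pattern we expect from the user and O(seq) is the one we observe; see the paper for the precise equation.

```latex
\mathrm{conf}(u, seq, \hat{u}) \;\approx\; \frac{P(seq \mid u)\, P(u)}{P(seq \mid u) + P(seq \mid \hat{u})} \times \mathrm{sim}\big(E_u(seq),\, O(seq)\big)
```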
71. To get a more detailed look at what affected a participant's ability to get a response correct or incorrect, I modeled the correctness of a response with a mixed-effects logistic regression, with the user as a random effect. In short, I used a mixed-effects model because I collected multiple responses per user, and because it allows each user to have his or her own baseline likelihood of getting a question correct. For more on the modeling technique, please refer to the paper.
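In standard notation, a mixed-effects logistic regression of this kind (written here as a generic sketch rather than the paper's exact specification) models the probability that user i answers question j correctly as:

```latex
\operatorname{logit} P(\text{correct}_{ij} = 1) \;=\; \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_{ij} + b_i,
\qquad b_i \sim \mathcal{N}(0, \sigma_u^2)
```

Here x_ij collects the fixed-effect covariates (question type, answer uniqueness, demographics), and the per-user random intercept b_i gives each user their own baseline likelihood of answering correctly.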
72. These are the fixed-effect controls I included in the model. As you can see, I controlled for demographic variables, like age and gender, that might affect memory retention, and included several other features in the model, including question type and answer uniqueness. However, I'll primarily be focusing on a few of these factors here.
73. Another curious finding is that questions with more unique answers were answered incorrectly more often than questions with less unique answers. This seems unintuitive; surely, more unique events are more memorable! However, this finding suggests that users may not actually "remember" answers, but simply "know" them. In other words, if asked "Where were you on Thursday at 1pm?", I might say Newell Simon Hall because that's where I generally am on Thursdays at 1pm, whether or not I specifically remember being there at that time. This heuristic works well in the majority of cases, but breaks down when we stray from routine.
74. So that's a cool idea. How do we actually get to it? Imagine a simplified system with only two question types. We have some training data from a user, from which we know how often the user gets both types of questions correct (in the left table). We also have an attempted authentication into this user's smartphone (in the right table). How can we get from the raw question-answer responses to a confidence estimate?
75. The first thing we can do is calculate the probability that we would observe this question-answer response sequence from the user. We do that by simply plugging in and multiplying. The attempting authenticator got the first question, of type QT1, correct, and the second question, of type QT2, incorrect. Our training data tells us that there is a 42% chance we would observe this sequence from the actual user.
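As a worked illustration with hypothetical training values consistent with the 42% figure (say the user answers QT1 questions correctly 70% of the time and QT2 questions correctly 40% of the time):

```latex
P(seq \mid u) \;=\; P(\text{QT1 correct}) \times P(\text{QT2 incorrect}) \;=\; 0.7 \times (1 - 0.4) \;=\; 0.42
```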
76. Now, P(seq | u hat) can potentially be problematic as well, but we can simplify the problem by adopting a specific adversary model. In other words, u hat is not the set of all possible non-users, but a model of the kind of non-user we believe is our "adversary", or most likely impersonator.
77. For the location question, or "Where were you at <time>?", users were asked to select their location on a map. Any location selected within the error radius of the sensor reading at that time was considered correct.