This document describes a study that used smartphone sensors to detect sleep and infer sleep quality without requiring changes in user behavior. The study collected sleep data from 27 participants over one month. It achieved the following:
1. Detected bedtimes within 35 minutes, wake times within 31 minutes, and sleep durations within 49 minutes on average.
2. Inferred daily sleep quality with 84% accuracy by analyzing sound, motion, light and device usage data.
3. Classified participants as good or poor sleepers with 81.5% accuracy based on their inferred daily sleep qualities over the month.
1. Toss ‘N’ Turn: Smartphone as Sleep and Sleep Quality Detector
Jun-Ki Min (loomlike@cs.cmu.edu), Afsaneh Doryab, Jason Wiese, Shahriyar Amini, John Zimmerman, Jason I. Hong
5. How well can a smartphone sense sleep without requiring changes in our behavior?
Task 1. Detect bedtime, waketime and duration
Task 2. Infer daily sleep quality
Task 3. Classify good or poor sleeper
6. Toss ‘N’ Turn (Data Collection Ver.)
• Sensed data: sound amplitude, acceleration, ambient light intensity, screen proximity, running processes, battery state, screen state, and sleep-diary entries
• Pipeline: data preprocessing → feature extraction → database server
7. Modeling
• Sound and motion features are extracted for each 10-minute window, and each window is classified as sleeping (1) or not (0), e.g.
…01000000111111111011101011000
• Every 30 minutes, the sleep detection model is run to estimate bedtime, waketime, and duration, and ultimately to classify the sleeper as good or poor
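The window-labeling step described above can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: the majority-vote smoothing and the longest-run heuristic for extracting bedtime, waketime, and duration are assumptions made for the example.

```python
# Illustrative sketch: each 10-minute window is labeled sleep (1) or
# not-sleep (0); the label sequence is smoothed, and the longest sleep
# run yields bedtime, waketime, and duration. The smoothing scheme and
# run-extraction heuristic are assumptions, not the study's exact method.

def smooth(labels, k=3):
    """Majority-vote filter over a sliding window of k labels."""
    half = k // 2
    out = []
    for i in range(len(labels)):
        lo, hi = max(0, i - half), min(len(labels), i + half + 1)
        window = labels[lo:hi]
        out.append(1 if sum(window) * 2 >= len(window) else 0)
    return out

def longest_sleep_run(labels, window_minutes=10):
    """Return (start_index, end_index, duration_minutes) of the longest run of 1s."""
    best = (0, 0)  # (start index, run length)
    start, length = 0, 0
    for i, lab in enumerate(labels):
        if lab == 1:
            if length == 0:
                start = i
            length += 1
            if length > best[1]:
                best = (start, length)
        else:
            length = 0
    s, n = best
    return s, s + n, n * window_minutes

labels = smooth([0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0])
start, end, minutes = longest_sleep_run(labels)
```

Smoothing removes isolated mislabeled windows (brief awakenings or noise) so the nightly sleep period comes out as one contiguous run.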
8. User Study
• Recruited good and poor sleepers
– Living in the US, age > 18
– Paid $2 USD for each diary entry (a maximum of $72)
• Collected sleep data for a month
• 30 participants signed up and 27 completed
– 795 sleep-diary entries in total
9. Ground Truthing
• Global sleep quality = subjective sleep quality + sleep latency + sleep efficiency + sleep duration + use of medication + sleep disturbances
• A global score > 5 indicates a poor sleeper
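The scoring above can be illustrated with a minimal sketch: component scores are summed into a global score, and a score above 5 labels the participant a poor sleeper. The component names follow the slide; the numeric values here are invented for the example.

```python
# Simplified illustration of the PSQI-style ground-truth scoring:
# component scores are summed into a global score, and global > 5
# marks a poor sleeper. Score values below are made-up examples.

components = {
    "subjective_quality": 1,
    "latency": 2,
    "efficiency": 1,
    "duration": 1,
    "medication": 0,
    "disturbances": 2,
}

global_score = sum(components.values())
label = "poor sleeper" if global_score > 5 else "good sleeper"
```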
10. Demographics
[Bar charts summarizing the 27 participants:
• Share bed with: 11 / 3 / 8 / 3 / 1 / 1
• Disrupting noises in the bedroom: 12 yes, 15 no
• Age: 10 (20s), 10 (30s), 5 (40s), 1 (50s), 1 (unknown)
• Regularly work: 22 yes, 5 no
• Sex: 19 / 8
• Sleeper type: poor sleeper (PSQI global score > 5) vs. good sleeper (PSQI global score ≤ 5)]
11. Evaluation
• Classifier
– Bayesian network (BN) with correlation-based feature selection
• Task 1. Detect bedtime, waketime and duration
• Task 2. Infer daily sleep quality
– Train the model individually (leave-one-day-out cross validation)
• Task 3. Classify good or poor sleeper
– Leave-one-person-out cross validation
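The two evaluation schemes above can be sketched with a generic leave-one-out splitter: held-out days within one participant for Tasks 1 and 2, and held-out participants for Task 3. The data structures are placeholders, not the study's actual format.

```python
# Illustrative sketch of the evaluation protocol: leave-one-day-out
# cross-validation within a participant (Tasks 1-2) and
# leave-one-person-out cross-validation across participants (Task 3).
# Item names are placeholders for the example.

def leave_one_out(items):
    """Yield (train, test) splits, holding out one item at a time."""
    for i in range(len(items)):
        yield items[:i] + items[i + 1:], items[i]

days = ["day1", "day2", "day3"]          # one participant's recorded days
people = ["p01", "p02", "p03", "p04"]    # study participants

day_splits = list(leave_one_out(days))       # Tasks 1-2: individual model
person_splits = list(leave_one_out(people))  # Task 3: cross-person model
```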
12. Task 1: Sleep Detection
• Detect sleep windows → detect sleep time
• 94.5% accuracy in classifying sleep/not-sleep windows
[Bar chart: average minutes of over- (+) and under- (−) estimation error, from −150 to +150, for bedtime detection, waketime detection, and sleep duration inference; our method vs. baseline (avg. time)]
14. Task 3: Good/Poor Sleeper Classification
• Infer daily qualities → classify the sleeper type
• 81.5% accuracy in classifying good/poor sleepers
[Bar chart: accuracy (%) and poor-sleeper detection (F-score), our method vs. random]
15. Discussion
How well can a smartphone sense sleep without requiring changes in our behavior?
Task 1. Detected bedtime, waketime, and duration within 35, 31, and 49 minutes of error, respectively
Task 2. Inferred daily sleep quality with 84% accuracy
Task 3. Classified good or poor sleepers with 81% accuracy
16. Top Five Features
Sleep detection:
– Time
– Battery charging / not-charging
– Min. movement
– Std. sound amplitude
– Q3 sound amplitude
Sleep quality inference:
– Bedtime
– Waketime
– Sleep duration
– Std. movement
– Yesterday’s sleep quality
18. General vs. Individual Models
• Sleep detection: 93.06% vs. 94.52%
– Need 3 days of ground truthing to train an individual model
• Sleep quality inference: 77.23% vs. 83.97%
– Need 3 weeks of ground truthing to train an individual model
19. Limitations
• Subjective vs. objective sleep quality
– “How was your sleep last night? Rate it on a scale of one to five” does not capture the full extent of a sleep session
• People tend to over- or underestimate their sleep
• Small sample size of poor-quality sleep
20. Thanks!
• More info at cmuchimps.org
or email loomlike@cs.cmu.edu
• Special thanks to:
– DARPA, Google
22. Data Collection Frequency
Sensed value (frequency) → data collection cycle:
• Sound amplitude (1 Hz): continuous at night; every other minute during the day; stopped on low battery
• Acceleration (5 Hz): continuous at night; every other minute during the day
• Light & screen proximity (1/5 Hz): continuous at night; every other minute during the day
• List of running apps (1/10 Hz): when the screen is turned on
• Battery states: when the battery level changes or the power cable is plugged in/out
• Screen states: when the screen is turned on/off
• Sleep diary: every morning (notification)
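The duty cycling above can be sketched as a small policy function. This is a hedged illustration: the night hours and the exact battery thresholds at which sampling is reduced or stopped are assumptions for the example, not the app's confirmed policy.

```python
# Hedged sketch of battery-aware duty cycling for the continuous
# sensors: continuous sampling at night, every-other-minute sampling
# by day, and sampling suspended on low battery. The night window
# (22:00-08:00) and the 15% cutoff are illustrative assumptions.

def sampling_mode(hour, battery_pct):
    """Pick a collection cycle for the continuous sensors."""
    if battery_pct < 15:
        return "stop"                    # preserve the remaining charge
    if hour >= 22 or hour < 8:           # treat 22:00-08:00 as "night"
        return "continuous"
    return "every_other_minute"          # daytime low-rate mode
```

A scheduler like this keeps full-rate sensing during the hours that matter for sleep detection while limiting the battery cost of month-long deployment.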
23. Features
Sleep detection (32 features for each window):
• Noise level: sound amplitudes {Min., Q1, Med., Q3, Max., Avg., Std.}
• Movement: acceleration changes {Min., Q1, Med., Q3, Max., Avg., Std.}
• Light intensity: light intensities & screen proximities {Min., Q1, Med., Q3, Max., Avg., Std.}
• Device state & usage: duration of screen-on time {Min., Q1, Med., Q3, Max., Avg., Std.}, battery state {plug-in/out, charging/not-charging} & alarm app usage
• Regular sleep time: timestamp
Daily quality (122 features for each sleep):
• Sleep duration: bedtime, waketime & sleep duration (detected)
• Sleep latency, efficiency & disturbances: sensor values {#peaks, avg. width of peaks, avg. height of peaks, interval of peaks, position of peaks, Min., Q1, Med., Q3, Max., Avg., Std.} & yesterday’s sleep quality (previously inferred)
Global quality (198 features for each participant):
• Sleep regularity: bedtimes, waketimes, sleep durations & qualities for a month of sleeps {Med., Avg., Std.} (previously detected and inferred)
• Global sleep latency, efficiency & disturbances: daily sleep quality features {Med., Avg., Std.}
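The seven summary statistics that recur throughout the feature table ({Min., Q1, Med., Q3, Max., Avg., Std.}) can be computed per window as below. The sample values are invented for the example, and the quartile method is whatever Python's standard library uses, which may differ from the paper's.

```python
# Sketch of the per-window summary statistics from the feature table
# (Min., Q1, Med., Q3, Max., Avg., Std.) over one 10-minute window of
# sensor readings. Sample values are made up for the example.

import statistics

def window_features(samples):
    """Seven summary statistics for one window of sensor readings."""
    q1, med, q3 = statistics.quantiles(samples, n=4)  # quartile cut points
    return {
        "min": min(samples),
        "q1": q1,
        "med": med,
        "q3": q3,
        "max": max(samples),
        "avg": statistics.mean(samples),
        "std": statistics.pstdev(samples),  # population std. deviation
    }

feats = window_features([0.2, 0.4, 0.1, 0.9, 0.3, 0.5])
```

Applying this to each sensor stream in each 10-minute window yields the per-window feature vectors fed to the Bayesian network classifier.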
24. Modeling: Data Processing
• Rationale for the 10-minute window
– Matches the granularity at which participants reported their sleep times
– Median sleep latency = 10.9 minutes
• 90,097 windows; 711 not-sleep and 728 sleep segments
[Diagram: sound and motion streams segmented into 10-minute windows, labeled not-sleep / sleep / not-sleep around bedtime and waketime]
25. Sleep Detection & Quality Inference
• Every 30 minutes: extract sound and motion features for each 10-minute window and run the sleep window detection model, labeling each window 1 = sleep or 0 = not-sleep
…0011000001010000011111111011010100001010000100000…
• Smoothing the label sequence isolates the sleep segment, giving bedtime, waketime, and duration
…0000000000000000011111111111111100000000000000000…
• From the detected sleep, extract features and run the sleep quality inference model → good or poor
26. Infer Other Contexts
• Sleep alone vs. with others: 84.2%
• Phone on the bed vs. near the bed vs. elsewhere: 91.9%