Tutorial @Ubicomp 2015: Bridging the Gap -- Machine Learning for Ubiquitous Computing (machine learning and ubicomp primer session).
A tutorial on the promises and pitfalls of Machine Learning for Ubicomp (and Human-Computer Interaction). From practitioners, for practitioners.
Presenter: Thomas Ploetz <tom.ploetz@gmail.com>
Video recording of the talks as they were held at Ubicomp:
https://youtu.be/LgnnlqOIXJc?list=PLh96aGaacSgXw0MyktFqmgijLHN-aQvdq
Bridging the Gap: Machine Learning for Ubiquitous Computing -- ML and Ubicomp Primer
1. Bridging the Gap:
Machine Learning for Ubicomp
Thomas Ploetz
— ML Primer & ML applications for Ubicomp —
2. What is Machine Learning?
• Develop algorithms (“computer programs” [sic!] …) that adapt
(learn!) towards generalisation through analysing sample data
“Machine learning studies computer algorithms for
learning to do stuff”
[Robert Schapire]
4. The Machine Learning Principles
1. Use parametric models to represent classes of interest
2. Use statistical learning for deriving parameter values from
representative sample sets
[from the Internet …]
5. 3 Postulates of PR / ML (there are more …)
1. Collect information about problem area Ω → representative sample set
$\omega = \{({}^{1}\boldsymbol{f}(\boldsymbol{x}), y_1), ({}^{2}\boldsymbol{f}(\boldsymbol{x}), y_2), \dots, ({}^{N}\boldsymbol{f}(\boldsymbol{x}), y_N)\}$
with $y_i$: additional information, i.e., annotation
2. Features characterise patterns’ affiliation to a specific class
$\boldsymbol{f}(\boldsymbol{x}) \rightarrow \boldsymbol{c}$, with $\dim(\boldsymbol{c}) \ll \dim(\boldsymbol{f})$
3. Features of one class form a compact region in the global feature space (compactness)
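To make the principles and postulates concrete, here is a minimal sketch in Python: a parametric (diagonal-Gaussian) model per class whose parameters are learnt statistically from an annotated sample set. Data and model choice are illustrative assumptions, not material from the tutorial.

```python
import numpy as np

# Represent each class by a parametric model (here: a diagonal Gaussian)
# and learn its parameters from a representative, annotated sample set
# {(f(x)_i, y_i)}. All data below is synthetic.
rng = np.random.default_rng(0)

# Annotated sample set: feature vectors X and class labels y (two classes).
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)),    # samples of class 0
               rng.normal(3.0, 1.0, (50, 2))])   # samples of class 1
y = np.array([0] * 50 + [1] * 50)

# Statistical learning: estimate mean, variance, and prior p_k per class.
classes = np.unique(y)
means = np.array([X[y == k].mean(axis=0) for k in classes])
variances = np.array([X[y == k].var(axis=0) for k in classes])
priors = np.array([np.mean(y == k) for k in classes])

def classify(c):
    """Assign feature vector c to the class maximising p_k * p(c | Omega_k)."""
    # Diagonal-Gaussian log-likelihood per class, plus log prior.
    log_lik = -0.5 * np.sum((c - means) ** 2 / variances
                            + np.log(2 * np.pi * variances), axis=1)
    return classes[np.argmax(log_lik + np.log(priors))]

print(classify(np.array([2.8, 3.1])))  # -> 1
```

The decision rule in this sketch anticipates slide 8: it picks the class maximising prior times class-conditional density.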
6. Principles of PR / ML
Classification represents a mapping:
$\boldsymbol{c} \mapsto k \in \{1, 2, \dots, K\}$ or $k \in \{0, 1, \dots, K\}$ (with rejection)
Classification → costs; optimise the average loss $V(\delta)$:
$\delta^* = \arg\min_{\delta} V(\delta)$
Classification systems:
$\boldsymbol{f}(\boldsymbol{x})$ → recording → preprocessing → feature calculation → $\boldsymbol{c}$ → classification → $k$
7. PR / ML Systems — Overview
[Figure: block diagram of a PR/ML system. A digital pattern (e.g., the digit "1") passes through recording (digitisation, quantisation), preprocessing (yielding an improved pattern for classification), segmentation, and feature extraction. The classifier maps the resulting feature vector to a class ω_i using classification parameters; those parameters come from training or refreshing the classifier via supervised learning on feature vectors with known class ω_i.]
8. Fundamental Elements of
Statistical Classification
1. $p_k$ — prior probabilities of the classes
2. $p(\boldsymbol{c} \mid \Omega_k)$ — class-conditional densities
3. $r_{\lambda k}$ — classification costs → average loss $V(\delta)$
4. $\delta(\Omega_\lambda \mid \boldsymbol{c})$ — decision rule
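Taken together, these four elements define the average loss that the decision rule minimises. In standard statistical decision theory (consistent with the slides' notation, though not spelled out verbatim there):

$$V(\delta) = \sum_{k=1}^{K} p_k \int \sum_{\lambda} r_{\lambda k}\, \delta(\Omega_\lambda \mid \boldsymbol{c})\, p(\boldsymbol{c} \mid \Omega_k)\, d\boldsymbol{c}, \qquad \delta^* = \arg\min_{\delta} V(\delta)$$

For 0/1 costs this reduces to the Bayes rule: decide for $\Omega_\lambda$ with $\lambda = \arg\max_k\, p_k\, p(\boldsymbol{c} \mid \Omega_k)$.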
9. Machine Learning for / in Ubicomp
Automated analysis of sensor data (recorded using opportunistic / parasitic approaches) as prerequisite for …
Context Awareness!
11. Applications — Context Awareness!
Any information that can be used to characterize the situation of
an entity:
➡ Who, what, where, when; novel interaction.
Activity Recognition · Location Awareness · HCI
13. Location Applications
— very biased and non-exhaustive example set —
Identification of meaningful places [e.g., Krumm]
Route prediction from GPS traces [Horvitz, 2012]
Inference of mobility patterns [Ganti et al., 2013]
14. Location Analysis: Methods
• Many methods for robust location sensing
• actual measurement techniques (triangulation and such)
• de-noising (signal processing)
• interpolation for missing data
• Very (!) sophisticated machine learning methods for
• tracking
• classification
• prediction
• Examples:
• bag-of-words features and topic models for classification
• particle filtering for tracking
• Markovian models for sequential analysis and prediction (see the sketch after this list)
• …
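One of the methods named above, as a runnable sketch: a first-order Markov model over symbolic places for next-place prediction. Place names and traces are invented for illustration; this is not the tutorial's code.

```python
from collections import Counter, defaultdict

def fit_transitions(visits):
    """Count transitions place -> next place in a sequence of visits."""
    counts = defaultdict(Counter)
    for here, nxt in zip(visits, visits[1:]):
        counts[here][nxt] += 1
    return counts

def predict_next(counts, here):
    """Return the most frequently observed successor of `here`."""
    if here not in counts:
        return None  # unseen place: no prediction
    return counts[here].most_common(1)[0][0]

# Example: a few days of symbolic location traces.
visits = ["home", "work", "cafe", "work", "home",
          "home", "work", "cafe", "work", "home", "gym", "home"]
model = fit_transitions(visits)
print(predict_next(model, "work"))  # -> 'cafe' (ties broken by first occurrence)
```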
15. Activity Recognition
Activity recognition aims to recognize the actions and
goals of one or more agents from a series of
observations on the agents' actions and the
environmental conditions.
What? When?
18. Indirect Activity Recognition through
Infrastructure Mediated Sensing
HydroSense, ElectriSense, GasSense
[Patel et al.]
19. Event Detection through IMS:
HydroSense
[Figure: a home's plumbing as a closed pressure system. Incoming cold water from the supply line feeds the water tower, hot water heater, thermal expansion tank, and pressure regulator; fixtures include bathroom 1, bathroom 2, the kitchen, a dishwasher, the laundry, and a hose spigot.]
[Froehlich et al., 2009]
20. Event Detection through IMS:
HydroSense
• Event segmentation
• Feature extraction
• Event classification
[Froehlich et al., 2009]
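The three steps above, as an illustrative end-to-end sketch on a synthetic 1-D pressure trace. The thresholding, features, and nearest-neighbour matching are simplifying assumptions; HydroSense's actual method is more sophisticated.

```python
import numpy as np

def segment_events(signal, threshold=0.5):
    """Return indices where the sample-to-sample change exceeds a threshold."""
    diffs = np.abs(np.diff(signal))
    return np.flatnonzero(diffs > threshold)

def extract_features(signal, idx, width=5):
    """Simple event features: magnitude of the pressure step and local mean."""
    window = signal[max(0, idx - width): idx + width]
    return np.array([signal[idx + 1] - signal[idx], window.mean()])

def classify_event(features, templates):
    """Nearest-neighbour match against labelled feature templates."""
    labels, vecs = zip(*templates.items())
    dists = [np.linalg.norm(features - v) for v in vecs]
    return labels[int(np.argmin(dists))]

# Synthetic pressure trace: baseline with a valve-open drop and valve-close rise.
signal = np.concatenate([np.full(20, 10.0), np.full(20, 8.0), np.full(20, 10.0)])
templates = {"valve open": np.array([-2.0, 9.0]),
             "valve close": np.array([+2.0, 9.0])}

for idx in segment_events(signal):
    print(idx, classify_event(extract_features(signal, idx), templates))
```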
21. Activity Recognition using IMS
→ Actual activity recognition on top of event classification
[Thomaz et al., 2012]
Shave, Brush teeth, Wash hands, Flush toilet, Wash hands, Fill up teakettle, Make a salad, Rinse a fruit, Take a glass of water, Do dishes (light load), Do dishes (heavy load)
22. What it all (largely) boils down to …
Analysis of sequential data / time series data!
24. Sequential Data — Challenges
• Segmentation vs Classification
→ “chicken and egg” problem
• Noise, noise, and noise …
• … more noise :-(
• Evaluation — “ground truth”?
25. Noise …
• filtering
• trivial (technically)
• lag
• no higher-level variables (speed)
$$\hat{x}_i = \frac{1}{n} \sum_{j=i-n+1}^{i} z_j \qquad \hat{x}_i = \operatorname{median}\{z_{i-n+1}, z_{i-n+2}, \dots, z_{i-1}, z_i\}$$
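The two filters transcribed into Python (causal sliding window of n samples; the example data is synthetic):

```python
import numpy as np

def mean_filter(z, n):
    """x_hat[i] = mean of z[i-n+1 .. i] (shorter window at the start)."""
    return np.array([z[max(0, i - n + 1): i + 1].mean() for i in range(len(z))])

def median_filter(z, n):
    """x_hat[i] = median of z[i-n+1 .. i] (shorter window at the start)."""
    return np.array([np.median(z[max(0, i - n + 1): i + 1]) for i in range(len(z))])

# Noisy step signal: the median filter preserves the edge better than the mean.
rng = np.random.default_rng(1)
z = np.concatenate([np.zeros(30), np.ones(30)]) + rng.normal(0, 0.1, 60)
print(mean_filter(z, 5)[-1], median_filter(z, 5)[-1])
```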
29. Direct Observations vs State
• Idea: Assume internal “system” state
• Approach: Infer state by exploiting
measurements / observations
• Kalman Filter
→ explicit consideration of
(Gaussian) noise
• Particle Filter
→ no limitation to Gaussian noise
→ prob. model for measurements
• Hidden Markov Model
→ meas. model: conditional probability
→ dynamic model: limited memory,
transition probabilities
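To make the state-estimation idea concrete: a minimal 1-D Kalman filter under a random-walk state model with Gaussian noise. All parameters are illustrative assumptions, not values from the tutorial.

```python
import numpy as np

def kalman_1d(measurements, q=1e-4, r=0.1**2, x0=0.0, p0=1.0):
    """Estimate a hidden state x from noisy measurements z (random-walk model)."""
    x, p = x0, p0            # state estimate and its variance
    estimates = []
    for z in measurements:
        p = p + q            # predict: state unchanged, uncertainty grows
        k = p / (p + r)      # Kalman gain: trust in measurement vs prediction
        x = x + k * (z - x)  # update with the measurement residual
        p = (1 - k) * p      # updated (reduced) estimate variance
        estimates.append(x)
    return np.array(estimates)

# Noisy measurements of a constant true state (0.5).
rng = np.random.default_rng(2)
z = 0.5 + rng.normal(0, 0.1, 100)
print(kalman_1d(z)[-1])  # converges towards 0.5
```

The particle filter generalises this recursion by representing the state belief with weighted samples instead of a single Gaussian, which is what removes the Gaussian-noise limitation noted above.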