Sensors and mobile devices are rapidly becoming intertwined with the fabric of our lives. This has resulted in unprecedented growth in the number of observations from the physical and social worlds reported in the cyber world. Sensing and computational components embedded in the physical world are termed a Cyber-Physical System (CPS). The current science of CPS has yet to integrate citizen observations effectively into CPS analysis. We demonstrate the role of citizen observations in CPS and propose a novel approach to perform a holistic analysis of machine and citizen sensor observations. Specifically, we demonstrate the complementary, corroborative, and timely aspects of citizen sensor observations compared to machine sensor observations in Physical-Cyber-Social (PCS) Systems.
Physical processes are inherently complex and embody uncertainties. They manifest as machine and citizen sensor observations in PCS systems. We propose a generic framework to move from observations to decision-making and actions in PCS systems, consisting of: (a) PCS event extraction, (b) PCS event understanding, and (c) PCS action recommendation. We demonstrate the role of Probabilistic Graphical Models (PGMs) as a unified framework for dealing with the uncertainty, complexity, and dynamism involved in translating observations into actions. Data-driven approaches alone are not guaranteed to synthesize PGMs that accurately reflect real-world dependencies. To overcome this limitation, we propose to empower PGMs using declarative domain knowledge. Specifically, we propose four techniques: (a) automatic creation of massive training data for Conditional Random Fields (CRFs) using domain knowledge of entities, used in PCS event extraction, (b) Bayesian Network structure refinement using causal knowledge from ConceptNet, used in PCS event understanding, (c) knowledge-driven piecewise linear approximation of nonlinear time series dynamics using Linear Dynamical Systems (LDS), used in PCS event understanding, and (d) transformation of knowledge of goals and actions into a Markov Decision Process (MDP) model, used in PCS action recommendation.
We evaluate the benefits of the proposed techniques on real-world applications involving traffic analytics and Internet of Things (IoT).
Knowledge-empowered Probabilistic Graphical Models for Physical-Cyber-Social Systems
1. Knowledge-empowered Probabilistic Graphical Models for Physical-Cyber-Social Systems
Pramod Anantharam
PhD Dissertation Defense, April 14, 2016
The Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Wright State University
Committee: Dr. Payam Barnaghi (University of Surrey), Dr. Shalini Forbis (Boonshoft School of Medicine), Dr. Cory Henson (Bosch Research), Dr. Biplav Srivastava (IBM Research), Prof. Shaojun Wang (Wright State University/Alibaba)
Advisors: Prof. Amit Sheth, Prof. Krishnaprasad Thirunarayan
2. Multimodal Manifestation of Real-World Events: Power Grid Scenario
• August 14, 2003 blackout in the Midwest U.S.: "failed to manage adequately tree growth in its transmission right-of-way."²
• August 10, 1996 blackout in the West U.S.: "… inadequate understanding of the interdependencies present in the system."¹
Power grid related events manifest in physical, cyber, and social (PCS) modalities.
¹ Six Degrees: The Science of a Connected Age, Duncan Watts
² One of four main reasons of failure. Investigation report by the U.S.-Canada Power System Outage Task Force
Image credit: Twitter, http://bit.ly/1SsE924
3. Multimodal Manifestations of Real-World Events: Asthma Scenario
Personal-level signals:
• NODE Sensor (exhaled nitric oxide)
• Fitbit ChargeHR (activity, sleep quality)
• Sensordrone (carbon monoxide, temperature, humidity)
Population-level signals:
• Pollen level
• Temperature & humidity
• Air quality
• Prevalence of asthma
Asthma related events manifest in physical, cyber, and social (PCS) modalities.
Image credit: http://www.rtmagazine.com/2015/10/brown-univ-fight-childhood-asthma-autism-obesity/
4. Multimodal Manifestation of Real-World Events: Traffic Scenario
Traffic related events manifest in physical, cyber, and social (PCS) modalities.
Amit Sheth, Pramod Anantharam, Cory Henson, "Physical-Cyber-Social Computing: An Early 21st Century Approach," IEEE Intelligent Systems, vol. 28, no. 1, pp. 78-82, Jan.-Feb. 2013. http://doi.ieeecomputersociety.org/10.1109/MIS.2013.20
5. Processing Multimodal Manifestations of Real-World Events
"Information is a source of learning. But unless it is organized, processed, and available to the right people in a format for decision making, it is a burden, not a benefit." - William Pollard (1828–1893)
John Boyd's Observe, Orient, Decide, and Act (OODA) Loop for organizing, processing, and decision making:
"…the OODA Loop is an explicit representation of the process that human beings and organizations use to learn, grow, and thrive in a rapidly changing environment, be it in war, business, or life."¹
[Figure: OODA loop with stages Observe, Orient, Decide, Act, connected by three feed-forward links and a feedback link.]
¹ The Tao of Boyd: How to Master the OODA Loop: http://www.artofmanliness.com/2014/09/15/ooda-loop/
6. Processing Multimodal Manifestations in PCS Systems
[Figure: OODA loop (Observe, Orient, Decide, Act, with feed-forward and feedback links) mapped onto PCS Event Extraction, PCS Event Understanding, and PCS Action Recommendation.]
• Observe – Collect as much information as possible from the environment
• Orient – Assimilate all the information to understand the environment
• Decide – Determine the course of action based on an objective
• Act – Follow through the course of action
The Tao of Boyd: How to Master the OODA Loop: http://www.artofmanliness.com/2014/09/15/ooda-loop/
7. Thesis Statement
Observations from diverse modalities can provide complementary, corroborative, and timely information about events in Physical-Cyber-Social systems. Probabilistic Graphical Models with the help of declarative domain knowledge provide an effective mechanism to: (a) uncover and interpret multimodal event manifestations in textual and numerical data, (b) explore event interactions and dynamics, and (c) formalize optimal action recommendation in Physical-Cyber-Social systems.
8. What are Probabilistic Graphical Models (PGMs)?
"Graphical models are a marriage between probability theory and graph theory. They provide a natural tool for dealing with two problems that occur throughout applied mathematics and engineering -- uncertainty and complexity …" - Michael Jordan, UC Berkeley, 1998.
Alex wants to model the reasons for asthma attacks.
Random variables: Attack (A), Medication (M), Steps (S), Pollen (P)
Joint probability distribution: $p(A, M, S, P)$
Parameters: for four binary variables, there are $2^4 = 16$ probability assignments¹
$$
\begin{aligned}
p(A, M, S, P) &= p(A \mid M, S, P)\, p(M, S, P) \\
&= p(A \mid M, S, P)\, p(M \mid S, P)\, p(S, P) \\
&= p(A \mid M, S, P)\, p(M \mid S, P)\, p(S \mid P)\, p(P) \\
&= p(A \mid M, P)\, p(M)\, p(S \mid P)\, p(P)
\end{aligned}
$$
because $(A \perp S)$, $(M \perp S)$, $(M \perp P)$.
Number of parameters $= 2^2 + 1 + 2 + 1 = 8$ probability assignments
Structure: [Figure: directed graph over A, M, P, S, with A depending on M and P, and S depending on P]
Parameters: $p(A \mid M, P)\, p(M)\, p(S \mid P)\, p(P)$ (8 probability assignments)
¹ http://www.freemars.org/jeff/2exp100/powers.htm
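The saving from the factorization can be checked numerically. The following is a minimal sketch, assuming made-up conditional probability values (none of the numbers come from the slides); it only illustrates that the factorized form needs 8 parameters and still defines a valid joint distribution over the 16 assignments.

```python
from itertools import product

# Illustrative conditional probability tables; the values are invented.
P_M = 0.6                         # p(M = 1)
P_P = 0.3                         # p(P = 1)
P_S_GIVEN_P = {0: 0.7, 1: 0.4}    # p(S = 1 | P)
P_A_GIVEN_MP = {                  # p(A = 1 | M, P)
    (0, 0): 0.20, (0, 1): 0.60,
    (1, 0): 0.05, (1, 1): 0.30,
}


def bernoulli(p_one, value):
    """Probability of a binary value, given p(value = 1)."""
    return p_one if value == 1 else 1.0 - p_one


def joint(a, m, s, p):
    """p(A, M, S, P) = p(A | M, P) p(M) p(S | P) p(P)."""
    return (bernoulli(P_A_GIVEN_MP[(m, p)], a) * bernoulli(P_M, m)
            * bernoulli(P_S_GIVEN_P[p], s) * bernoulli(P_P, p))


# 4 + 1 + 2 + 1 = 8 stored parameters, versus 2**4 = 16 joint table entries.
num_parameters = len(P_A_GIVEN_MP) + 1 + len(P_S_GIVEN_P) + 1
# Summing the factorized joint over all 16 assignments should give 1.
total_mass = sum(joint(*x) for x in product((0, 1), repeat=4))
print(num_parameters, round(total_mass, 12))
```

Each factor is a small table indexed only by a variable's parents, which is why the parameter count grows with the size of the largest parent set rather than with the total number of variables.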
9. Example of Declarative Domain Knowledge
[Figure: graph of Linked Open Data (declarative knowledge from ConceptNet 5) relating traffic concepts, e.g., road ice Causes accident. Concepts: road ice, accident, traffic accident, car crash, traffic jam, slow traffic, delay, bad weather, go to baseball game, go to concert, occur twice each day. Classes: ActiveEvent, ScheduledEvent, BadWeather, TimeOfDay. Relations: Causes, CapableOf, HasSubevent, RelatedTo, is_a.]
10. Processing Multimodal Manifestations in PCS Systems
PCS Event Extraction [ACM-TIST-15] [ITS-13]
• What are the events of interest?
• How do they manifest in observational data?
• How can we extract events from observational data?
• What is the role of declarative knowledge in event extraction?
PCS Event Understanding [AAAI-16] [SDM-13] [IEEE-Int.-Sys.-13]
• How do events influence one another?
• How do we infer the interactions from observational data across multiple modalities (numerical and textual data)?
• What is the role of declarative knowledge in event understanding?
PCS Action Recommendation [IBM-Tech.-Rep.-14] [Bosch-Internship-14]
• How can we represent tasks and actions?
• How can we utilize declarative knowledge to recommend actions?
• How can we formalize the notion of optimal action?
11. Processing Multimodal Manifestations in PCS Systems
PCS Event Extraction [ACM-TIST-15] [ITS-13]
• What are the events of interest?
• How do they manifest in observational data?
• How can we extract events from observational data?
• What is the role of declarative knowledge in event extraction?
PCS Event Understanding [AAAI-16] [SDM-13] [IEEE-Int.-Sys.-13]
• How do events influence one another?
• How do we infer the interactions from observational data across multiple modalities (numerical and textual data)?
• What is the role of declarative knowledge in event understanding?
PCS Action Recommendation [ASG-14]
• How do we utilize our understanding to recommend actions?
• How can we recommend the best possible action?
• What is the role of declarative knowledge and PGMs in action recommendation?
[AAAI-16] Understanding City Traffic Dynamics Utilizing Sensor and Textual Observations. The Thirtieth AAAI Conference on Artificial Intelligence, 2016
[SDM-13] Traffic Analytics using Probabilistic Graphical Models Enhanced with Knowledge Bases, 2nd International Workshop on Analytics for Cyber-Physical Systems (ACS-2013) at SIAM International Conference on Data Mining (SDM13), 2013
[IEEE-Int.-Sys.-13] Physical-Cyber-Social Computing: An Early 21st Century Approach, IEEE Intelligent Systems, 2013
12. Multimodal Data Integration: Traffic Scenario
• Why?
  – Explain/interpret average speed and link travel time variations using events provided by city authorities and traffic events shared on Twitter
  – Prior work: predict congestion based on historical sensor data
• What?
  – Combine
    • 511.org data about Bay Area road network traffic
      – E.g., average speed and link travel time data stream (sensor data)
      – E.g., (happened or planned) event reports (textual data)
    • Tweets that report traffic related events (textual data)
13. Multimodal Data Integration: Traffic Scenario
• How?
  o Step 1: Extract textual events from tweets stream
  o Step 2: Build statistical models of normalcy, and thereby anomaly, for sensor time series data
  o Step 3: Correlate multimodal streams, using spatio-temporal information, to explain "anomalies" in sensor time series data with textual events
15. Processing Multimodal Manifestations in PCS Systems
PCS Event Extraction [ACM-TIST-15] [ITS-13]
• What are the events of interest?
• How do they manifest in observational data?
• How can we extract events from observational data?
• What is the role of declarative knowledge in event extraction?
PCS Event Understanding [AAAI-16] [SDM-13] [IEEE-Int.-Sys.-13]
• How do events influence one another?
• How do we infer the interactions from observational data across multiple modalities (numerical and textual data)?
• What is the role of declarative knowledge and PGMs in event understanding?
PCS Action Recommendation [IBM-Tech.-Rep.-14] [Bosch-Internship-14]
• How can we represent tasks and actions?
• How can we utilize declarative knowledge to recommend actions?
• How can we formalize the notion of optimal action?
[ACM-TIST-15] Extracting City Traffic Events from Social Streams. ACM Transactions on Intelligent Systems and Technology Journal 2015.
[ITS-13] City Notifications as a Data Source for Traffic Management, 20th ITS World Congress 2013.
16. People Reporting Various Events in a City on Twitter
[Figure: categories of city events reported on Twitter: Public Safety; Urban planning; Gov. & agency admin.; Energy & water; Environmental; Transportation; Social Programs; Healthcare; Education.]
17. Extracting City Events from Twitter: Proposed Solution
[ACM-TIST-15] Extracting City Traffic Events from Social Streams. ACM Transactions on Intelligent Systems and Technology Journal 2015.
Event Extraction Tool on Open Science Framework: https://osf.io/b4q2t/wiki/home/
18. City Event Annotation: Conditional Random Fields (CRFs) – Intuition
Label image sequence of Justin Bieber's day :)
[Figure: image sequence with labels Sleeping, Driving, Exercising, Driving, Sleeping, Singing. Annotation: "This image of concert was important in labeling the next image."]
Edwin Chen's blog on CRF: http://blog.echen.me/2012/01/03/introduction-to-conditional-random-fields/
Image credit: http://bit.ly/1Th8CgL, http://bit.ly/1Nzk5DR, http://bit.ly/1VBbx7e, http://bit.ly/1QkmBhb, http://bit.ly/1SsyYzd, http://bit.ly/1Nzl7j7
19. City Event Annotation: Conditional Random Fields (CRFs) – Formalism
The global normalization and the discriminative nature of the model distinguish CRFs from other models, allowing them to capture long-distance dependencies.
Tagset = {B-LOCATION, I-LOCATION, B-EVENT, I-EVENT, O}
Example annotation:
Token                    Tag
Last                     O
night                    O
I                        O
was                      O
in                       O
CA...                    O
(@                       O
Half                     B-LOCATION
Moon                     I-LOCATION
Bay                      B-LOCATION
Brewing                  I-LOCATION
Company                  O
w/                       O
8                        O
others)                  O
http://t.co/w0eGEJjApY   O
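Once a CRF has assigned one tag per token, the B-/I-/O tags have to be grouped back into entity spans. The sketch below assumes the tag sequence is already available (it does not train or run a CRF); the function name `bio_spans` is hypothetical, and the example tags are the ones shown on the slide for the tweet fragment.

```python
def bio_spans(tokens, tags):
    """Group BIO-tagged tokens into (label, text) spans."""
    spans, current_label, current_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-")
        continues = tag.startswith("I-") and current_label == tag[2:]
        # Close the open span unless this token continues it.
        if current_label is not None and not continues:
            spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []
        if starts_new or (tag.startswith("I-") and not continues):
            # A stray I- tag without a matching open span starts a new span.
            current_label, current_tokens = tag[2:], [token]
        elif continues:
            current_tokens.append(token)
    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


tokens = ["(@", "Half", "Moon", "Bay", "Brewing", "Company", "w/"]
tags = ["O", "B-LOCATION", "I-LOCATION", "B-LOCATION", "I-LOCATION", "O", "O"]
spans = bio_spans(tokens, tags)
print(spans)
```

With the tags as given, the fragment yields two LOCATION spans, "Half Moon" and "Bay Brewing", because the second B-LOCATION tag opens a new span.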
20. City Event Extraction: Spatio-Temporal-Thematic Aggregation
[Figure: map grid cell of 0.6 miles by 0.38 miles, bounded by Max-lat, Min-lat, Min-long, and Max-long, with corners (37.7545166015625, -122.40966796875), (37.7490234375, -122.40966796875), (37.7545166015625, -122.420654296875), and (37.7490234375, -122.420654296875); the point (37.74933, -122.4106711) lies inside the cell. Nested grids with alphanumeric cell labels illustrate the hierarchy.]
Hierarchical spatial structure of geohash for representing locations with variable precision. Here, the location string is 5H34.
Geohashing wiki: http://wiki.xkcd.com/geohashing/
Image credit: Google Maps
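The hierarchical, variable-precision idea can be illustrated with the widely used base-32 geohash encoding. This is a sketch of that standard scheme, not of the grid labeling shown on the slide (whose location string "5H34" follows a different alphabet); the function name `geohash_encode` is an assumption. Longer strings refine shorter ones, so a shared prefix means a shared coarser cell, which is what makes spatial aggregation of tweets cheap.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a standard base-32 geohash."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, bit_count = [], 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits, lon_lo = (bits << 1) | 1, mid
            else:
                bits, lon_hi = bits << 1, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits, lat_lo = (bits << 1) | 1, mid
            else:
                bits, lat_hi = bits << 1, mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


coarse = geohash_encode(37.74933, -122.4106711, precision=4)
fine = geohash_encode(37.74933, -122.4106711, precision=8)
print(coarse, fine)
```

Truncating a geohash gives the enclosing coarser cell, so grouping events by a fixed-length prefix aggregates them at a chosen spatial resolution.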
21. Evaluation: Extracting City Events from Twitter
• City Event Annotation
  – Automated creation of training data
  – Annotation task (our CRF model vs. baseline CRF model)
• City Event Extraction
  – Use aggregation algorithm for event extraction
  – Extracted events vs. ground truth
• Dataset (Aug – Nov 2013)
  – Over 8 million tweets from San Francisco Bay Area (extracted 1042 events)
  – 311 active events and 170 scheduled events from 511.org (ground truth)
22. Evaluation: City Event Annotation
[Figure: annotation results for the Baseline Annotation Model [Ritter et al. 2012] and Our Annotation Model]
• Baseline CRF model (trained on a large manually created dataset) works well on generic tasks
• Our CRF model trained on automatically generated training data performs on par with the baseline
• Our CRF model does better on the event extraction task due to the availability of event related knowledge
[Ritter et al. 2012] Alan Ritter, Mausam, Oren Etzioni, and Sam Clark. 2012. Open domain event extraction from Twitter. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York, NY, 1104–1112.
23. Textual Events from Tweets vs. 511.org: Complementary
[Figure: complementary events. Labels: traffic incident; road-construction]
24. Textual Events from Tweets vs. 511.org: Corroborative
[Figure: corroborative events. Labels: fog; visibility-air-quality; fog]
26. Extracting Textual Events from Tweets for Data from May-14 to May-15
39,208 traffic related incidents extracted from over 20 million tweets¹
NER – Named Entity Recognition
OSM – Open Street Maps
¹ Event Extraction Tool on Open Science Framework: https://osf.io/b4q2t/wiki/home/
[ACM-TIST-15] Extracting City Traffic Events from Social Streams. ACM Transactions on Intelligent Systems and Technology Journal 2015.
27. Multimodal Data Integration: Traffic Scenario
• How?
  o Step 1: Extract textual events from tweets stream
  o Step 2: Build statistical models of normalcy, and thereby anomaly, for sensor time series data
  o Step 3: Correlate multimodal streams, using spatio-temporal information, to explain "anomalies" in sensor time series data with textual events
28. Building Normalcy Models of Traffic Dynamics*: Challenges
[Figure: speed in km/h against time of day (approx. 1 observation/minute), annotated with the challenges: multiple events, varying influence, event interactions.]
*Traffic dynamics here refers to speed and travel time variations observed in sensor data
Image credit: http://traffic.511.org/index
29. Possible Causes of Nonlinearity in Traffic Dynamics
• Temporal landmarks: peak hour vs. off-peak traffic vs. weekend traffic
• Effect of location
• Scheduled events such as road construction, baseball game, or music concert
• Unexpected events such as accidents, heavy rains, fog
• Random variations (viz., stochasticity) such as people visiting downtown by mere coincidence
30. Modeling City Traffic Dynamics: A Closer Look
[Figure: Events, People Influx, and Vehicle Influx (hidden state) and Vehicle Speed (observed evidence) on road links link1, link2, link3, where road1 = [link1, link2, link3].]
Image credits: http://bit.ly/1N1wu5g, http://bit.ly/1O8d9gn, http://bit.ly/1N8L5•, http://bit.ly/1HLDYui
31. Modeling City Traffic Dynamics: Nature of the Problem
1. There are both hidden states and observed evidence
2. Current observed evidence is indicative of the current hidden state
3. Current hidden states depend on the previous hidden states
T is a discrete time step in the time series data being modeled
[Figure: hidden states Events, People Influx, and Vehicle Influx at time steps T-1 and T; observed evidence Vehicle Speed (T).]
32. Modeling the Problem as Linear Dynamical System (LDS)
1. There are both hidden states and observed evidence
2. Current observed evidence is indicative of the current hidden state
3. Current hidden state depends on the previous hidden state
For simplicity of explanation, we consider vehicle influx as a hidden variable and the observed speed as the evidence variable
• Vehicle influx at a certain point in time t would influence speed of vehicles at the same time t
• Vehicle influx at a certain point in time t depends only on the vehicle influx at time t-1
[Figure: chain-structured model over nodes v1, v2, …, vT and s1, …, sT, repeated for each of the three points above.]
33. Probabilistic Reasoning Over Time: Discrete Variables
[Figure: chain of States (R) and Evidence (U)]
State transition model is given by $P(R_t \mid R_{0:t-1})$
With the first-order Markov assumption, the transition model is $P(R_t \mid R_{t-1})$
Observation model with the sensor Markov assumption is given by $P(U_t \mid R_{0:t}, U_{0:t-1}) = P(U_t \mid R_t)$
Specifying t transition and observation models is impractical. So, another assumption: stationary process
Transition model:
R_{t-1}   P(R_t)
t         0.7
f         0.3
Observation model:
R_t       P(U_t)
t         0.9
f         0.1
Russell, Stuart, and Peter Norvig. "Artificial intelligence: a modern approach." (1995).
Image credits: http://bit.ly/1Q9qmvk, http://bit.ly/1lm9BAs, http://bit.ly/1LXqOFd
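With a stationary transition model and observation model, the belief over the hidden state can be updated one step at a time (filtering). The following minimal sketch uses the table values from this slide (0.7/0.3 for the transition, 0.9/0.1 for the observation); reading R as "rain" and U as "umbrella seen" follows the Russell and Norvig example and is an assumption here, as are the function and parameter names.

```python
def forward_step(prior_r, u_observed,
                 p_r_given_r=0.7, p_r_given_not_r=0.3,
                 p_u_given_r=0.9, p_u_given_not_r=0.1):
    """One filtering step: returns P(R_t = true | evidence up to t)."""
    # Predict: push the previous belief through the transition model.
    predicted = prior_r * p_r_given_r + (1.0 - prior_r) * p_r_given_not_r
    # Update: weight each state by the likelihood of the observation.
    like_r = p_u_given_r if u_observed else 1.0 - p_u_given_r
    like_not_r = p_u_given_not_r if u_observed else 1.0 - p_u_given_not_r
    numerator = like_r * predicted
    return numerator / (numerator + like_not_r * (1.0 - predicted))


belief = 0.5  # uninformative prior on R_0
history = []
for u in (True, True):
    belief = forward_step(belief, u)
    history.append(belief)
print(history)
```

Because the process is assumed stationary, the same two small tables are reused at every time step, which is the point of the stationarity assumption.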
34. Probabilistic Reasoning Over Time: Continuous Variables
[Figure: chain-structured model over nodes v1, v2, …, vT and s1, …, sT]
Linear Dynamical System (LDS): Replacing discrete valued state and observation nodes (previous slide) with continuous valued states and observations, we get an LDS model
The transition model is specified by $A_t$ and the observation model is specified by $B_t$, along with associated Gaussian noise
The joint distribution over all the hidden and observed variables is shown along with the conditional distributions
Barber, David. Bayesian reasoning and machine learning. Cambridge University Press, 2012.
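For an LDS, the likelihood of an observed sequence can be computed with a Kalman filter. The sketch below is a scalar version under assumed notation that is not taken from the slides: hidden state x_t = a·x_{t-1} + Gaussian noise with variance q, observation y_t = b·x_t + Gaussian noise with variance r, and a Gaussian belief (mean0, var0) on the state before the first transition. All parameter values in the usage lines are invented.

```python
import math


def lds_log_likelihood(observations, a, b, q, r, mean0=0.0, var0=1.0):
    """Log-likelihood of a sequence under a scalar linear dynamical system."""
    mean, var = mean0, var0
    log_lik = 0.0
    for y in observations:
        # Predict the hidden state one step ahead.
        mean_pred = a * mean
        var_pred = a * a * var + q
        # Predictive distribution of the observation.
        y_mean = b * mean_pred
        y_var = b * b * var_pred + r
        log_lik += -0.5 * (math.log(2.0 * math.pi * y_var)
                           + (y - y_mean) ** 2 / y_var)
        # Correct the state estimate with the observation (Kalman update).
        gain = var_pred * b / y_var
        mean = mean_pred + gain * (y - y_mean)
        var = (1.0 - gain * b) * var_pred
    return log_lik


# Hypothetical speeds in km/h: a smooth hour versus one with a sudden drop.
smooth = lds_log_likelihood([60, 61, 60, 59], a=1.0, b=1.0, q=1.0, r=1.0, mean0=60.0)
drop = lds_log_likelihood([60, 61, 20, 59], a=1.0, b=1.0, q=1.0, r=1.0, mean0=60.0)
print(smooth, drop)
```

A sequence that departs from the learned dynamics receives a much lower log-likelihood, which is the quantity used later in the deck to separate normal from anomalous hours.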
35. Hourly Link Speed Dynamics Over all Mondays between Aug-14 to Jan-15
[Figure: x-axis: observation number for each hour of day; y-axis: average speed of vehicles in km/h]
36. Switching Linear Dynamical Systems
[Figure: chain-structured models over v1, v2, …, vT and s1, …, sT with switch variables h1, h2, …, hT]
Switching Linear Dynamical System (SLDS): A discrete switch variable at each time t describes the appropriate LDS to be used. SLDS can capture jumps between multiple linear dynamics.
Restricted Switching Linear Dynamical System (RSLDS): Restricting the switch variable transitions in SLDS, we proposed RSLDS [AAAI-16], which captures the switching behavior based on hour of the day and day of the week.
The transition model is specified by $A_t(h_t)$ and the observation model is specified by $B_t(h_t)$
[AAAI-16] Understanding City Traffic Dynamics Utilizing Sensor and Textual Observations. The Thirtieth AAAI Conference on Artificial Intelligence, 2016
37. Modeling City Traffic Dynamics: Choosing a Suitable Model
"All models are wrong, but some are useful." - George Box
• Differentiate various traffic dynamics
  – Gaussian mixture model does not discriminate between increasing speed vs. decreasing speed dynamics
• Account for unobserved factors
  – Autoregressive models cannot capture unobserved factors
    • E.g., "unobservable" traffic volume dictates event manifestations in link speed and travel time variations
  – Linear Dynamical System introduces a latent state-based model
    • E.g., traffic volume, road lane closures, and weather conditions
    • Emission/transition matrix and Gaussian noise capture stochasticity
38. Learning Context Specific LDS Models
Step 1: Index data for each link for day of week and hour of day, utilizing the traffic domain knowledge for piecewise linear approximation
Step 2: Find the "typical" dynamics by computing the mean and choosing the medoid for each hour of day and day of week
Step 3: Learn LDS parameters for the medoid for each hour of day (24 hours) and each day of week (7 days), resulting in 24 × 7 = 168 models for each link
[Figure: speed/travel-time time series data from a link (Mon. to Sun.); time series data for each hour of day (1-24) for each day of week (Monday – Sunday); mean time series computed for each day of week and hour of day along with the medoid; a 7 × 24 matrix of models indexed by day d_i and hour h_j: LDS(1,1), LDS(1,2), …, LDS(1,24); …; LDS(7,1), LDS(7,2), …, LDS(7,24)]
168 LDS models for each link; total models learned = 425,712, i.e., (2,534 links × 168 models per link)
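Steps 1 and 2 can be sketched as a group-by followed by a representative selection. This is an illustration under assumptions: each record is a hypothetical `(day_of_week, hour_of_day, series)` tuple with equal-length series, and "medoid" is interpreted here as the observed series closest (in squared Euclidean distance) to the mean series, which is one plausible reading of the slide rather than a confirmed definition.

```python
from collections import defaultdict


def partition_by_context(records):
    """Group hourly series by (day_of_week, hour_of_day)."""
    groups = defaultdict(list)
    for day, hour, series in records:
        groups[(day, hour)].append(series)
    return groups


def closest_to_mean(series_list):
    """Pick the observed series nearest to the element-wise mean series."""
    n = len(series_list)
    mean = [sum(values) / n for values in zip(*series_list)]

    def squared_distance(series):
        return sum((x - m) ** 2 for x, m in zip(series, mean))

    return min(series_list, key=squared_distance)


# Hypothetical speeds (km/h) for Monday 8 am on three weeks, plus one Tuesday.
records = [
    (1, 8, [50, 40, 30]),
    (1, 8, [52, 42, 32]),
    (1, 8, [20, 20, 20]),
    (2, 8, [60, 60, 60]),
]
groups = partition_by_context(records)
typical = closest_to_mean(groups[(1, 8)])
models_per_link = 7 * 24
total_models = 2534 * models_per_link
print(typical, models_per_link, total_models)
```

Choosing an actually observed series, rather than the mean itself, keeps the representative a realizable trajectory; an LDS would then be fitted to it in Step 3, which is not shown here.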
39. Learning Normalcy for Each Link, Day of Week, and Hour of Day
[Figure: log-likelihood scores]
Five-number summary of log-likelihood scores for a link, day of week, hour of day
40. Tagging Anomalies using Context Specific LDS Models
[Figure: flowchart.
Input: speed and travel-time observations from a link, partitioned based on (d_i, h_j).
Compute the log likelihood L for each hour of observed data (d_i, h_j) using the model LDS(h_j, d_i) from the 7 × 24 matrix LDS(1,1), …, LDS(7,24).
Train? Yes (training phase): store the log-likelihood min. and max. values obtained from the five-number summary in a 7 × 24 matrix Lik(1,1), Lik(1,2), …, Lik(1,24); …; Lik(7,1), Lik(7,2), …, Lik(7,24).
Train? No: tag anomalous hours using the log-likelihood range (min. likelihood) for (d_i, h_j).
Output: anomalies.]
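The tagging step can be sketched as a threshold test per context. The code below assumes that log-likelihood scores have already been computed by the context-specific LDS models, and it treats an hour as anomalous when its score falls below the minimum seen in training for the same (day, hour); using the plain minimum as the lower end of the five-number summary is an assumption, and all names and numbers are illustrative.

```python
def learn_normal_range(training_scores):
    """Map each (day, hour) context to the (min, max) training log-likelihood."""
    return {ctx: (min(scores), max(scores))
            for ctx, scores in training_scores.items()}


def tag_anomalies(observed_scores, normal_range):
    """Return the (day, hour, score) entries scoring below the training minimum."""
    anomalies = []
    for day, hour, score in observed_scores:
        low, _high = normal_range[(day, hour)]
        if score < low:
            anomalies.append((day, hour, score))
    return anomalies


# Hypothetical log-likelihood scores for two contexts.
training = {(1, 8): [-120.0, -110.5, -98.2], (1, 9): [-90.0, -85.0]}
normal_range = learn_normal_range(training)
observed = [(1, 8, -115.0), (1, 8, -300.0), (1, 9, -95.0)]
anomalies = tag_anomalies(observed, normal_range)
print(anomalies)
```

Keeping one range per (day, hour) means that a slow Monday rush hour is judged against other Monday rush hours, not against a quiet Sunday morning.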
41. Multimodal Data Integration: Traffic Scenario
• How?
  o Step 1: Extract textual events from tweets stream
  o Step 2: Build statistical models of normalcy, and thereby anomaly, for sensor time series data
  o Step 3: Correlate multimodal streams, using spatio-temporal information, to explain "anomalies" in sensor time series data with textual events
42. Explaining Anomalies in Sensor Data using Textual Events
• An anomaly in link data during time period $[a_{st}, a_{et}]$ is explained by an event if the event occurs within a 0.5 km radius and during $[a_{st} - 1, a_{et} + 1]$.
• CAVEAT: An anomaly may not be explained because of missing data.
[Figure: link sensor data yields anomalies ⟨a_st, a_et⟩; city tweets yield events ⟨e_t, e_l, e_st, e_et, e_i⟩ through PCS Event Extraction. With Δt_e = e_st ~ e_et and Δt_a = (a_st – 1) ~ (a_et + 1), an event explains an anomaly (Explained_by) if there is an overlap between Δt_e and Δt_a.]
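The explanation rule combines a distance test with an interval-overlap test. The sketch below assumes events and anomalies are plain dictionaries with hypothetical keys `lat`, `lon`, `start`, and `end`, that times are numbers expressed in hours (the slide does not state the unit of the ±1 padding), and that distance is measured with the haversine formula.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(h))


def explains(event, anomaly, radius_km=0.5, pad=1.0):
    """True if the event is nearby and overlaps the padded anomaly window."""
    near = haversine_km(event["lat"], event["lon"],
                        anomaly["lat"], anomaly["lon"]) <= radius_km
    overlap = (event["start"] <= anomaly["end"] + pad
               and event["end"] >= anomaly["start"] - pad)
    return near and overlap


anomaly = {"lat": 37.7749, "lon": -122.4194, "start": 10.0, "end": 11.0}
nearby_event = {"lat": 37.7779, "lon": -122.4194, "start": 8.5, "end": 9.2}
far_event = {"lat": 37.8000, "lon": -122.4194, "start": 8.5, "end": 9.2}
early_event = {"lat": 37.7749, "lon": -122.4194, "start": 5.0, "end": 6.0}
print(explains(nearby_event, anomaly), explains(far_event, anomaly),
      explains(early_event, anomaly))
```

Padding the anomaly window lets an event that ends shortly before the sensor anomaly starts still count as an explanation, since traffic effects can lag the event itself.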
43. Real-World Dataset and Scalability Issues
• Data collected from San Francisco Bay Area between May 2014 and May 2015
  – 511.org: (1) 1,638 traffic incident reports (2) 1.4 billion speed and travel time observations
  – Twitter data: 39,208 traffic related incidents extracted from over 20 million tweets
• Learning the normalcy model for one link takes 40 minutes¹ (~2 months for processing 2,534 links)
• Scalable implementation on Apache Spark² resulted in learning normalcy models for 2,534 links within 24 hours
¹ 2.66 GHz Intel Core 2 Duo machine with 8 GB main memory
² Cluster used for evaluation had 865 cores and 17 TB main memory
45. Multimodal Data Integration: Conclusion
• Examined the theoretical nature of the problem of modeling traffic dynamics to systematically recommend Linear Dynamical Systems (LDS)
• Formalized nonlinear traffic dynamics using piecewise linear approximation derived from traffic domain knowledge
• Created normalcy models based on log-likelihood scores for spotting traffic anomalies in sensor data
• Evaluated our approach over a real-world dataset collected from 511.org and Twitter for over a year (May 2014 to May 2015) with promising results
46. Processing Multimodal Manifestations in PCS Systems
PCS Event Extraction [ACM-TIST-15] [ITS-13]
• What are the events of interest?
• How do they manifest in observational data?
• How can we extract events from observational data?
• What is the role of declarative knowledge and PGMs in event extraction?
PCS Event Understanding [ATMSB-15] [ATS-13] [SAH-13]
• How do events influence one another?
• How do we infer the interactions from observational data across multiple modalities (numerical and textual data)?
• What is the role of declarative knowledge and PGMs in event understanding?
PCS Action Recommendation [IBM-Tech.-Rep.-14] [Bosch-Internship-14]
• How can we represent tasks and actions?
• How can we utilize declarative knowledge to recommend actions?
• How can we formalize the notion of optimal action?
[IBM-Tech.-Rep.-14] Dynamic Update of Public Transport Schedules in Cities Lacking Traffic Instrumentation, IBM Research Technical Report 2014.
[Bosch-Internship-14] Task Assistance within IoTS Network, Bosch Summer Internship Work, 2014.
47. Do-It-Yourself (DIY) Task Recommendation: Bosch Internship, 2014
• Contributed to a language to represent tasks
  – Using Semantic Web based representation for
    • Reusing knowledge on the web
    • Integration of knowledge in distributed environments (like the web and UhU¹ / IoTS network)
• Developed algorithms to recommend tasks
  – Formulated the problem of recommending optimal action toward a goal² by handling task failure in a robust manner
• Developed a framework to evaluate task recommendation
  – Using a simulator for world states and user actions
¹ Bosch IoT middleware
² Optimal action is formulated as a Markov Decision Process with transition and cost matrices initialized using declarative knowledge of tasks
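How an MDP with transition and cost tables yields an optimal action can be sketched with value iteration. The example below is a toy illustration, not the internship system: the states ("todo", "done"), the actions ("careful", "quick"), and all probabilities and costs are invented, and in the described approach such tables would be initialized from declarative task knowledge.

```python
def value_iteration(states, actions, transition, cost, goal, gamma=0.95, tol=1e-9):
    """Minimise expected (discounted) cost to reach the goal state.

    transition[s][a] is a list of (probability, next_state) pairs;
    cost[s][a] is the immediate cost of taking action a in state s.
    """
    def q_value(s, a, values):
        return cost[s][a] + gamma * sum(p * values[s2] for p, s2 in transition[s][a])

    values = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            if s == goal:
                continue  # the goal is absorbing and costs nothing
            best = min(q_value(s, a, values) for a in actions if a in transition[s])
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    policy = {
        s: min((a for a in actions if a in transition[s]),
               key=lambda a: q_value(s, a, values))
        for s in states if s != goal
    }
    return values, policy


# A step that may fail leaves the user in "todo", so the action must be retried.
states = ["todo", "done"]
actions = ["careful", "quick"]
transition = {"todo": {"careful": [(0.9, "done"), (0.1, "todo")],
                       "quick": [(0.5, "done"), (0.5, "todo")]}}
cost = {"todo": {"careful": 2.0, "quick": 1.5}}
values, policy = value_iteration(states, actions, transition, cost,
                                 goal="done", gamma=1.0)
print(values["todo"], policy["todo"])
```

Although "quick" is cheaper per attempt, its higher failure rate makes the expected total cost larger, so the recommended action is "careful"; this is the sense in which the formulation handles task failure.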
48. Revisiting the Thesis Statement
Observations from diverse modalities can provide complementary, corroborative, and timely information about events in Physical-Cyber-Social systems. Probabilistic Graphical Models with the help of declarative domain knowledge provide an effective mechanism to: (a) uncover and interpret multimodal event manifestations in textual and numerical data, (b) explore event interactions and dynamics, and (c) formalize optimal action recommendation in Physical-Cyber-Social systems.
PCS Event Extraction [ACM-TIST-15] [ITS-13]: Utilize declarative knowledge of locations and events to train sequence labeling models for annotation and event extraction
PCS Event Understanding [AAAI-16] [SDM-13] [IEEE-Int.-Sys.-13]: Utilize textual events to explain variations in sensor data modeled using context (link, location, time) specific probabilistic time series models
PCS Action Recommendation [IBM-Tech.-Rep.-14] [Bosch-Internship-14]: Utilize declarative knowledge of actions to formulate the problem of optimal action recommendation as a sequential decision problem
49. Conclusion
• Observations from people can provide complementary, corroborative, and timely information in PCS systems.
• We demonstrated that probabilistic graphical models (PGMs) are a natural fit to deal with PCS challenges.
• We found that declarative domain knowledge can complement PGMs in
  – Automatic creation of large training data for training sequence labeling models
  – Knowledge-driven piecewise linear approximation of nonlinear time series dynamics using Linear Dynamical Systems (LDS)
  – Bayesian Network structure refinement using ConceptNet 5
  – Transforming knowledge of goals and actions into a Markov Decision Process (MDP) formalism
50. Probabilistic Graphical Models, Declarative Knowledge, and PCS Systems
[Figure: declarative knowledge (commonsense knowledge, domain ontologies, and open data) informs PGMs top-down (structure, parameters); multimodal data (textual, numerical) informs PGMs bottom-up (annotate, parameters). PGMs (e.g., CRF, BN, LDS, MDP) support PCS Event Extraction, PCS Event Understanding, and PCS Action Recommendation in PCS applications (e.g., SmartCity, SmartHealth, DIY Task Recommendation). Citations on the figure: [ACM-TIST-15], [AAAI-16], [SDM-13], [Bosch-Internship-14].]
CRF – Conditional Random Field
BN – Bayesian Network
LDS – Linear Dynamical Systems
MDP – Markov Decision Process
51. Personalized Digital Health for Asthma Management in Children
Sensor platforms:
• Android device (w/ kHealth app)
• Sensordrone (carbon monoxide, temperature, humidity)
• Node Sensor (exhaled nitric oxide)
• Fitbit ChargeHR (activity, sleep quality)
Environmental signals: pollen level, air quality, temperature & humidity
kHealth for asthma project page: http://wiki.knoesis.org/index.php/Asthma
kHealth project page: http://knoesis.org/projects/khealth
52. PhD @ Kno.e.sis
Awards and Recognition
• 2016 Outstanding Graduate Student Award in the PhD in Computer Science and Engineering Program.
• 2015 Selected to participate in the NSF-funded Data Science Workshop at University of Washington, Seattle, Aug 5–7.
• 2014 Offered the Eric & Wendy Schmidt Data Science for Social Good Fellowship.
• 2013 A short article on my research appeared in Wright State University newsroom.
• 2013 Invited to attend Dagstuhl Seminar on Physical-Cyber-Social Computing.
• 2012 Best research showcase award for my internship work at IBM Research, India.
Professional Experience
• 2014 Internship at Bosch Research and Technology Center
• 2013 Visiting Doctoral Student at University of Surrey
• 2011, 2012 Internships at IBM Research
Published in ACM TIST Journal, AAAI, ACM Web Science, and IEEE Computer
Program Committee (PC) member of conferences such as WWW-16, WWW-15, WWW-14, ISWC-15, ISWC-14, ISWC-13, ESWC-16, IJCAI-13
Tutorials
• Data Processing and Semantics for Advanced Internet of Things (IoT) Applications: modeling, annotation, integration, and perception, Tutorial Presentation at The 3rd International Conference on Web Intelligence, Mining and Semantics (WIMS '13), Madrid, Spain.
• Trust Networks: Interpersonal, Sensor, and Social, Tutorial Presentation at International Conference on Collaborative Technologies and Systems (CTS 2011), Philadelphia, Pennsylvania, USA.
Proposals
• NSF: Contributed to multiple proposals, of which one collaborative proposal was funded ($1.9 million).
• NIH: Led one proposal, which is recommended for funding ($900K).
• EU FP7: Contributed to CityPulse, a multi-institution IoT based Smart City project (€2.5 million).
Patents
• US20150006644 A1: Assessing Impact of Events on Public Transportation Network
• US20140372364 A1: A System and Method for Utility-Based Evolution in a Constrained Ontology
53. Selected Publications
[AAAI-16] Pramod Anantharam, Krishnaprasad Thirunarayan, Surendra Marupudi, Amit Sheth, Tanvi Banerjee. (2016) Understanding City Traffic Dynamics Utilizing Sensor and Textual Observations. The Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16), February 12-17, Phoenix, Arizona, USA (accepted)
[ACM-TIST-15] Pramod Anantharam, Payam Barnaghi, Krishnaprasad Thirunarayan, and Amit Sheth. 2015. Extracting City Traffic Events from Social Streams. ACM Trans. Intell. Syst. Technol. 6, 4, Article 43 (July 2015), 27 pages. DOI=10.1145/2717317 http://doi.acm.org/10.1145/2717317
[IBM-Tech.-Rep.-14] Pramod Anantharam, Biplav Srivastava, Raj Gupta. Dynamic Update of Public Transport Schedules in Cities Lacking Traffic Instrumentation, IBM Research Technical Report 2014.
[ITS-13] Pramod Anantharam and Biplav Srivastava, City Notifications as a Data Source for Traffic Management, In Proceedings of the 20th ITS World Congress 2013, October 14-18, 2013, Tokyo, Japan.
[SDM-13] Pramod Anantharam, Krishnaprasad Thirunarayan, and Amit Sheth, Traffic Analytics using Probabilistic Graphical Models Enhanced with Knowledge Bases, 2nd International Workshop on Analytics for Cyber-Physical Systems (ACS-2013) at SIAM International Conference on Data Mining (SDM13), Texas, USA, May 2-4, 2013.
[ACM-WebScience-12] Pramod Anantharam, Krishnaprasad Thirunarayan, and Amit Sheth, Topical Anomaly Detection from Twitter Stream, Research Note: In the Proceedings of ACM Web Science 2012, Evanston, Illinois, pp. 23-26, June 22-24, 2012.
[IEEE-Int.-Sys.-13] Amit Sheth, Pramod Anantharam, Cory Henson, Physical-Cyber-Social Computing: An Early 21st Century Approach, IEEE Intelligent Systems, vol. 28, no. 1, pp. 78-82, Jan.-Feb., 2013. http://doi.ieeecomputersociety.org/10.1109/MIS.2013.20
[FGCS-13] Krishnaprasad Thirunarayan, Pramod Anantharam, Cory Henson, and Amit Sheth, Comparative Trust Management with Applications: Bayesian Approaches Emphasis, In the Journal of Future Generation Computer Systems (FGCS), Elsevier, 25 pages, May 2013, http://dx.doi.org/10.1016/j.future.2013.05.006
[Bosch-Internship-14] Task Assistance within IoTS Network, Bosch Internship Work, Summer 2014.
54. Acknowledgements
Dr. Payam Barnaghi
Dr. Biplav Srivastava
Dr. Cory Henson
Dr. Shalini Forbis, MD, MPH
Prof. Amit Sheth (Advisor)
Prof. Krishnaprasad Thirunarayan (Advisor)
Prof. Shaojun Wang
55. Thank you :)
kHealth Team
Dr. Tanvi Banerjee
Surendra Marupudi
Vaikunth Sridharan
Dan Vanuch
Sujan Perera
Vahid Taslimi
Kno.e.sis, Data Mining Lab
And all my colleagues and friends…