Rajesh K. Gupta presented work on using controlled mobility of data mules to improve scalability and efficiency in cyber-physical systems for spatiotemporal data processing. The data mule scheduling problem involves selecting a path for the data mule, controlling its speed, and scheduling data collection from sensors to minimize total travel time. This problem generalizes real-time scheduling and allows exploiting mobility to reduce data delivery latency compared to static approaches. He described formulations and algorithms for solving variants of the 1D data mule scheduling problem and extensions that consider energy-latency tradeoffs and uncertainty.
Gupta datamule
1. Rajesh K. Gupta
Department of Computer Science and Engineering
University of California, San Diego
DAC 2011, San Diego.
Controlled Mobility for Scalability and Efficiency in
CPS Spatio-temporal Processing
2. Thought-line
1. Embedded systems when scaled to societal
levels become Cyber-Physical Systems
2. (Spatiotemporal) data gathering and
processing are central to CPS
3. Processing and communication architectures
are central to scalability and efficiency of data
gathering.
Sensor Localization [INFOCOM 2011]
Controlled Mobility for Scale & Efficiency
[TOSN 2010, TMC 2010, INFOCOM 2009]
3. Mobility Often Differentiates Sensor Networks
Habitat monitoring [Mainwaring et al., 2002]
Environmental monitoring [Batalin et al., 2004] Emergency response [Malan et al., 2004]
ZebraNet [Juang et al., 2002]
Farm management [Sikka et al., 2006]
[Kansal et al., 2004] [Todd et al., 2007][Vasilescu et al., 2005]
None Predictable Random Voluntary?
Controlled
Either sensor data moves through a network of sensor nodes (Monitoring), or
a sensed object moves through a network of sensor nodes (Tracking)
4. Our Pet Project with Los Alamos National Labs
http://www.lanl.gov/projects/ei/
• SHM (Structural Health Monitoring) application
– Post-event assessments for large-scale civil structures
– Passive sensors; measuring peak displacement
• Data collection by UAV (Unmanned Aerial Vehicle)
– Communication with sensors via ZigBee radio
– GPS-based autonomous control
• Quick data collection is required
– Limited fuel
5. A network is not always the best way to move data
• Sensor nodes must form a network
• Network needs to be connected
• Need non-trivial resources for the network
– Networking and communication drain the
nodes (some more than others)
– Need to make sure that the nodes stay alive
and stay reachable
Base station
Sensor nodes
6. Data Mule Advantages
• Reduced hops
– Less congestion
– Fewer retransmissions, higher network capacity
– Fewer synchronization errors
• May even be faster to send large data over constrained
bandwidths
• Simpler nodes, energy harvested sensors
• DM as a resource delivery platform.
[Figure: data mule approach, showing the base station, the data mule, and the sensor nodes]
Optimize the motion (path, speed) of the mule to improve data delivery latency
7. About an eight-year-old problem
• Data Mobile Ubiquitous LAN Extensions by Shah+Roy, WSN 2003
– Random mobility
• “Mobile Router” by Kansal+Srivastava, MobiSys 2004: Controlled
mobility along a fixed path
– Controlled mobility on periodic routes, builds upon Directed Diffusion.
Related work (assumptions compared: comm. only at node, speed of data mule, instant move/stop)
Work | Problems | Speed of data mule | Remarks
[Zhao et al., 03] | Path + Speed | Variable | Heuristic algorithm
[Kansal et al., 04] | Speed | Variable | Adaptive heuristic algorithm
[Somasundara et al., 04] | Path | Constant + Stop | NP-hardness; Heuristic algorithm
[Ma, Yang, 06] | Path | Constant | Heuristic; Assume negligible comm. time
[Ma, Yang, 07] | Path | Constant + Stop | Heuristic; Stop to communicate
[Xing et al., 07] | Path | Constant + Stop | Heuristic
MANETs with mobile nodes: Epidemic Routing, Message Ferrying, Many-to-many comms.
8. Data Mule Scheduling (DMS) Problem
• Path Selection Problem (1D DMS)
• Energy-latency tradeoffs with DMS
• DMS as a proxy for other important CPS
problems
– scale-parameterized scheduling problems
(DVFS)
9. Idea of DMS formulation
• Assumptions
– Communication is possible only within the intervals
– Communication takes fixed amount of time
– Node location, communication range/time are given
[Figure: data mule's path through the communication ranges of nodes A, B, and C; on the location axis each node has an interval with communication time eA, eB, eC]
10. Terminology and definitions
In real-time scheduling …
Jobs: release time, deadline, feasible interval, execution time
Simple jobs (A, B, C with execution times eA, eB, eC) and general jobs (D with execution time eD)
In DMS (Data Mule Scheduling) problem …
Location jobs: release location, deadline location, feasible location interval, execution time
Simple location jobs (A, B, C with execution times eA, eB, eC) and general location jobs (D with execution time eD)
11. Idea of DMS formulation (1/2)
• Consider communication as a location job
– Location constraint: Feasible location intervals
– Time constraint: Execution time
[Figure: data mule's path through the communication ranges of nodes A, B, and C, and the corresponding location jobs on the location axis with execution times eA, eB, eC]
12. Idea of DMS formulation (2/2)
[Figure: set of location jobs (A, B, C) + time-speed profile (i.e., change of speed over time) gives a set of (real-time) jobs (A', B', C') with execution times eA, eB, eC; a second panel shows the jobs at a faster speed]
• Location job is mapped to "job" when the speed is given
– Real-time scheduling problem
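The mapping on this slide can be sketched in a few lines. This is only an illustration under a simplifying assumption: the speed is constant, so a location divided by the speed gives a time. The names `LocationJob`, `Job`, and `to_jobs` are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass
class LocationJob:
    name: str
    release_loc: float   # start of the feasible location interval (m)
    deadline_loc: float  # end of the feasible location interval (m)
    exec_time: float     # communication time needed (s)


@dataclass
class Job:
    name: str
    release: float    # release time (s)
    deadline: float   # deadline (s)
    exec_time: float  # execution time (s), unchanged by the mapping


def to_jobs(location_jobs, speed):
    """Map location jobs to real-time jobs, assuming a constant speed in m/s."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return [
        Job(j.name + "'", j.release_loc / speed, j.deadline_loc / speed, j.exec_time)
        for j in location_jobs
    ]


if __name__ == "__main__":
    loc_jobs = [LocationJob("A", 0.0, 100.0, 10.0), LocationJob("B", 50.0, 150.0, 10.0)]
    for v in (5.0, 10.0):
        print(v, to_jobs(loc_jobs, v))
```

Doubling the speed halves every release time and deadline while the execution times stay the same, which is why a faster mule makes the induced real-time scheduling problem harder.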
13. Data Mule Scheduling (DMS) Problem
• Three subproblems
– Path selection
• Which trajectory the data mule follows
– Speed control
• How the data mule changes the speed along the path
– Job scheduling
• From which sensor the data mule collects data at a given time
• Objective
– Minimize the total travel time (≈ data delivery latency)
[Figure: three panels. Path selection: nodes A, B, C with their communication ranges. Speed control (1-D DMS): location jobs A, B, C with execution times e(A), e(B), e(C), and speed vs. time. Job scheduling (1-D DMS): jobs A', B', C' on the time axis with execution times e(A), e(B), e(C)]
14. 1-D DMS: Closer look
Input: Set of location jobs
Solution for the problem: Time-Speed profile
Time-Location profile (determined by Time-Speed profile)
Corresponding Real-time Scheduling problem
[Figure: location jobs A, B, C with execution times e(A), e(B), e(C), the speed vs. time and location vs. time profiles, and the resulting jobs on the time axis]
15. 1-D DMS: Job scheduling
• Simple jobs (i.e., one feasible interval)
– Earliest deadline first (EDF) is an optimal online scheduling algorithm
[Liu, Layland, 1973]
• “Always execute the job with the earliest deadline”
• General jobs (i.e., multiple feasible intervals)
– No optimal online algorithm: Proof by an adversary argument
• An adversary can make any online scheduler fail by releasing a new job
– Offline: Linear programming (LP) formulation
[Figure: LP formulation on the time axis, with variables x11 to x15, x22 to x27, x33 to x36, execution times e1, e2, e3, and z2]
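For the simple-job case, the EDF rule quoted above can be sketched as a small simulation. This is a minimal illustration, not the talk's implementation: it assumes preemptive execution and jobs given as hypothetical `(name, release, deadline, exec_time)` tuples.

```python
import heapq


def edf_schedule(jobs):
    """Simulate preemptive EDF.

    jobs: list of (name, release, deadline, exec_time) tuples.
    Returns (feasible, slices), where slices is a list of (name, start, end).
    """
    pending = sorted(jobs, key=lambda j: j[1])  # ordered by release time
    ready = []   # min-heap of (deadline, name, remaining execution time)
    slices = []
    t = 0.0
    i = 0
    while i < len(pending) or ready:
        if not ready:
            t = max(t, pending[i][1])  # idle until the next release
        while i < len(pending) and pending[i][1] <= t:
            name, _, deadline, exec_time = pending[i]
            heapq.heappush(ready, (deadline, name, exec_time))
            i += 1
        # Always execute the ready job with the earliest deadline.
        deadline, name, remaining = heapq.heappop(ready)
        next_release = pending[i][1] if i < len(pending) else float("inf")
        run = min(remaining, next_release - t)  # run until done or next release
        slices.append((name, t, t + run))
        t += run
        if remaining - run > 1e-12:
            heapq.heappush(ready, (deadline, name, remaining - run))
        elif t > deadline + 1e-12:
            return False, slices  # the job finished after its deadline
    return True, slices


if __name__ == "__main__":
    print(edf_schedule([("A", 0, 10, 4), ("B", 2, 6, 3)]))
```

In the example, job B is released while A is running and has the earlier deadline, so it preempts A; A resumes once B completes.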
16. 1-D DMS: Speed control
• Three different cases:
– Simple cases
• Constant speed
• Variable speed
– General case
• Variable speed with acceleration constraint
17. Speed control: Simple cases
• “Processor demand” (for simple jobs)
– Sum of the execution times of the jobs whose feasible intervals are completely contained in the
interval
• Feasibility test [Baruah et al., 1993] [Yao et al., 1995]
– Optimal offline algorithm from processor speed scaling by YDS.
– Find the minimum maximum speed over all ‘tight’ intervals.
– If the mule moves any faster, at least one interval becomes infeasible.
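(Const. speed, Variable speed)
A set of simple jobs is feasible $\iff$ (processor demand for $[t_1, t_2]$) $\le t_2 - t_1$ for any $t_1 \in P_r$, $t_2 \in P_d$ with $t_1 < t_2$
$P_r$: set of release times
$P_d$: set of deadlines
[Figure: time axis with an interval $[t_1, t_2]$ and the feasible intervals of jobs 1 to 4 with execution times e1, e2, e3, e4]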
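The processor-demand test can be sketched as follows, together with the constant-speed case it leads to. This is a hedged illustration: it assumes preemptive execution, jobs as hypothetical `(release, deadline, exec_time)` tuples, and location jobs as `(release_loc, deadline_loc, exec_time)` tuples. At a constant speed v a location interval of length L lasts L / v, so the test bounds v by L / demand for every interval.

```python
def processor_demand(jobs, t1, t2):
    """Sum of execution times of jobs whose feasible interval lies within [t1, t2]."""
    return sum(e for (r, d, e) in jobs if t1 <= r and d <= t2)


def is_feasible(jobs):
    """Demand test: demand(t1, t2) <= t2 - t1 for every release t1 < deadline t2."""
    releases = sorted({r for r, _, _ in jobs})
    deadlines = sorted({d for _, d, _ in jobs})
    return all(
        processor_demand(jobs, t1, t2) <= (t2 - t1) + 1e-12
        for t1 in releases
        for t2 in deadlines
        if t1 < t2
    )


def max_constant_speed(location_jobs):
    """Largest constant speed at which the induced set of simple jobs stays feasible."""
    starts = sorted({r for r, _, _ in location_jobs})
    ends = sorted({d for _, d, _ in location_jobs})
    best = float("inf")
    for x1 in starts:
        for x2 in ends:
            if x1 < x2:
                demand = processor_demand(location_jobs, x1, x2)
                if demand > 0:
                    # (x2 - x1) / v >= demand  <=>  v <= (x2 - x1) / demand
                    best = min(best, (x2 - x1) / demand)
    return best


if __name__ == "__main__":
    loc_jobs = [(0.0, 100.0, 10.0), (50.0, 150.0, 10.0)]
    v = max_constant_speed(loc_jobs)
    print(v, is_feasible([(r / v, d / v, e) for r, d, e in loc_jobs]))
```

In the example the interval [0, 150] is the tight one: it contains both jobs (20 s of demand over 150 m), so the speed is capped at 7.5 m/s.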
18. Complexity of 1-D DMS problem
Problem | Simple jobs, offline | Simple jobs, online | General jobs, offline | General jobs, online
Real-time scheduling | EDF [Liu, Layland, 1973] | EDF [Liu, Layland, 1973] | LP | Non-existent
1-D DMS, simple case, constant speed | | Non-existent | LP | Non-existent
1-D DMS, simple case, variable speed | | (vmin = 0) ; (vmin > 0) Non-existent | LP | Non-existent
1-D DMS, general case, variable speed with acceleration constraint | Open | Non-existent | (fixed k ≥ 2) NP-hard; (k arbitrary) NP-hard in the strong sense | Non-existent
Contributions
Hard problems: design heuristic algorithm
19. Heuristic algorithm for general case (1/2)
• Simplify
– Convert all general location jobs to simple location jobs
– Proportionally distribute the execution time
• Maximize
– Increase the speed until a tight interval is found
• Tight interval: Processor demand = interval length
[Figure: speed vs. location. A general location job A with its execution time is converted into simple location jobs A1, A2, A3. With full accel/decel at the maximum rate, the speed profile has an accel interval, a plateau interval, and a decel interval around the tight interval, where proc. demand = time duration]
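The "Simplify" step can be sketched as below. The slide only says the execution time is distributed proportionally; this sketch assumes "proportionally" means in proportion to the length of each feasible location interval, which is an assumption of this example. The function name `simplify` and the tuple layout are hypothetical.

```python
def simplify(general_job):
    """Split a general location job into simple location jobs.

    general_job: (name, exec_time, [(start_loc, end_loc), ...]).
    Returns a list of (name, start_loc, end_loc, exec_time) tuples, one per
    feasible location interval, with the execution time shared in proportion
    to interval length (an assumption of this sketch).
    """
    name, exec_time, intervals = general_job
    total = sum(end - start for start, end in intervals)
    if total <= 0:
        raise ValueError("feasible location intervals must have positive total length")
    return [
        (f"{name}{k}", start, end, exec_time * (end - start) / total)
        for k, (start, end) in enumerate(intervals, start=1)
    ]


if __name__ == "__main__":
    # One general job A with three feasible location intervals.
    for job in simplify(("A", 12.0, [(0.0, 10.0), (20.0, 40.0), (50.0, 60.0)])):
        print(job)
```

The resulting jobs A1, A2, A3 each have a single feasible interval, so the simple-case machinery (processor demand, tight intervals) applies to them directly.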
20. Heuristic algorithm for general case (2/2)
• Trim
– Eliminate all fixed intervals from remaining jobs
• Fixed interval: An interval in which the speed is already determined
• Execution time of each location job is changed accordingly
• Recursion
– Repeat from “Maximize” for the remaining intervals
[Figure: speed vs. location; the tight interval with its accel, plateau, and decel intervals is fixed, and the remaining intervals are recursively maximized]
22. Data Mule Scheduling (DMS) Problem
• Three subproblems
– Path selection
• Which trajectory the data mule follows
– Speed control
• How the data mule changes the speed along the path
– Job scheduling
• From which sensor the data mule collects data at a given time
• Objective
– Minimize the total travel time (≈ data delivery latency)
[Figure: three panels. Path selection: nodes A, B, C with their communication ranges. Speed control (1-D DMS): location jobs A, B, C with execution times e(A), e(B), e(C), and speed vs. time. Job scheduling (1-D DMS): jobs A', B', C' on the time axis with execution times e(A), e(B), e(C)]
23. Path selection problem
• Objective:
– Find a path s.t. the induced 1-D DMS problem has the minimum travel time
– Problem: Difficult to find such a path
• Interrelation between path and travel time is unclear
• Infinite choices of path
• Idea: Simplify the problem as a graph problem
– Find the shortest tour that covers all labels
[Figure: a start point s and nodes 1 to 5; three candidate paths, each shown on a location axis from start to end with the location jobs of nodes 1 to 5]
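The graph simplification can be illustrated with a brute-force search on a tiny instance. This is a sketch, not the algorithm from the talk: it assumes straight-line edges between node locations, a tour that returns to the start, a single communication range for all nodes, and that an edge carries the label of every node whose range it passes through. All names are hypothetical, and the exhaustive search is only practical for a handful of nodes.

```python
import itertools
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def shortest_label_covering_tour(start, nodes, comm_range):
    """Exhaustively find the shortest tour whose edges cover every node's label.

    nodes: dict mapping a label to an (x, y) location.
    Returns (length, order), where order is the tuple of visited labels.
    """
    labels = list(nodes)
    best_length, best_order = math.inf, None
    for k in range(1, len(labels) + 1):
        for order in itertools.permutations(labels, k):
            stops = [start] + [nodes[n] for n in order] + [start]
            edges = list(zip(stops, stops[1:]))
            covered = {
                n
                for n in labels
                for a, b in edges
                if point_segment_distance(nodes[n], a, b) <= comm_range
            }
            if len(covered) == len(labels):
                length = sum(math.dist(a, b) for a, b in edges)
                if length < best_length:
                    best_length, best_order = length, order
    return best_length, best_order


if __name__ == "__main__":
    nodes = {1: (10.0, 0.0), 2: (20.0, 0.0), 3: (30.0, 0.0)}
    print(shortest_label_covering_tour((0.0, 0.0), nodes, comm_range=1.0))
```

In the example the tour only needs to visit node 3: the edge from the start to node 3 passes through the ranges of nodes 1 and 2, so all labels are covered without stopping at them.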
25. Experiments
• Methods and parameters
– 20 nodes randomly deployed in circular area (radius: 500m)
– Each node has data that needs 10 secs for transmission
– Data mule movement: m/s
– Average of 100 trials
Proposed algorithm successfully exploits larger communication range
[Plot: total travel time (sec), 0 to 700, vs. communication range, 0 to 120; series listed below]
No remote communication, constant speed
e.g., [Ma, Yang, 2006], [Xing et al., 2007]
Remote communication, constant speed, stop to transmit
e.g., [Ma, Yang, 2007]
Remote communication, visit all nodes, variable speed
e.g., [Zhao et al., 2003]
Label Covering TSP Formulation
[Sugihara TOSN 2010]
26. Energy-Latency Tradeoff
• Objective:
– Add flexibility to energy-latency trade-off
• Idea:
– Combine multihop forwarding and data mule approach
[Figure: data delivery latency vs. energy consumption, with points for the data mule and multihop forwarding approaches]
27. Hybrid approach
• “Forwarding” subproblem
– Determine where to gather data under energy constraint
– Objective:
• Find a forwarding strategy s.t. the induced DMS problem has
the minimum travel time
– Difficult
• Simplified: Forward as close to the base station as possible
29. Preliminary result: Disconnected network
[Plot: time (sec), 0 to 700, vs. energy consumption limit, 0 to 20; series: travel time, lower bound of forwarding time]
30. DMS with uncertainty
[Figure: data mule's path along the location axis, with known and unknown comm. ranges]
• Idea: Semi-online scheduling
– Assumption: “Communication is possible in the vicinity
of node”
• ... plus unknown communication range
– Strategy
• Offline scheduling with known communication range
• Opportunistically exploit unknown communication range
– As in online scheduling
31. Idea of 2-D Semi-online Algorithm
[Figure: data mule's path through points A, B, C, D, E, and P near Node 1 and Node 2, showing known and unknown comm. ranges, job execution in the offline schedule, and actual job execution]
33. Closing Thoughts
• DMS provides a framework for solving
spatiotemporal data collection problems
• Scale-parameterized scheduling
– Real-time scheduling where some parameters are
scaled by a factor
• Discretize location, speed and time
• Find an appropriate step size for each of these three that
guarantees an approximation ratio
• In the discretized configuration space, use dynamic
programming to find the optimal trajectory
• Similarity with speed scaling
– Inverse relationship between data mule's speed and processor
speed
• Ongoing work addresses assumptions made…
Editor's notes
Sensor networks are being used in more and more applications. Mobility is one of the aspects that has been diversifying, and sensor networks deal with various forms of “mobility”. Now, we focus on mobility.
One of the most characteristic features of this application is to use a UAV to collect data from the sensors
We can consider this path as a one-dimensional location axis
For convenience, we introduce some terminology and definitions
“Location on the path”; we can consider this path as a one-dimensional location axis
If it’s too fast, the data mule may not be able to finish all the jobs. "How should we change the speed of the data mule so that it can collect the data from all the sensor nodes in the shortest amount of time?" "Also, in that travel, from what node should the data mule collect data at what time?"
We first talk about 1-D DMS problem and then about path selection
change speed instantaneously
One interesting observation is that the 1-D DMS problem is related to the DVS problem, and we can use the optimal algorithm for DVS to solve the DMS problem
... except this (variable-simple-online) case, in which we can use an EDF-based algorithm
Constant-rate acceleration is represented as a quadratic curve in the location-speed profile
Not clear until we solve each of these 1-D DMS problems. Finding continuous paths is difficult, and it is not clear which path leads to a shorter travel time. Edge (1,3) passes through the communication range of nodes 1, 2, and 3.
Bottom (pink): the data mule goes to each node’s exact location regardless of the size of the communication range
In the multihop forwarding approach, energy consumption is higher and latency is presumably lower than in the data mule approach
The base station is the bottleneck. The result suggests that, in this case, the multihop forwarding approach is not optimal and that we can further improve the data delivery time by using the hybrid approach.