Trajectory Data Warehousing
1. A Study on Trajectory Data Aggregation
Simone Campora
February 4th 2009
2. Introduction
The Purpose:
◦ Build a data warehouse for the Rio de Janeiro Traffic Department
◦ Prototype a trajectory data warehouse (TrDW)
◦ Explore the potential of data warehousing (DW)
Challenges:
◦ How can GPS data be integrated into a TrDW?
◦ What kind of information can be extracted?
◦ Discover the issues
◦ How can the results be presented?
3. Rio de Janeiro: a Metropolis
Population (2007)
  Municipality   7,145,472
  Density        4,781/km²
  Metro          13,782,000
HDI (2000)       0.842 – high
Streets          13,858
Some Facts:
The number of cars has almost doubled during the last year!
4. Problems
Which problems will be considered?
Congestion: a condition that arises on any network as use increases, characterized by slower speeds, longer trip times, and increased queuing
CO2 emissions: how much CO2 do vehicles produce? Calculated from an average production index of 200 g/km
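As a rough illustration of the emission estimate, the sketch below multiplies the distance travelled by the 200 g/km average index quoted above. The function name and the example distance are assumptions made for illustration; they do not come from the original study.

```python
# Average CO2 production index quoted on the slide, in grams per kilometre.
CO2_G_PER_KM = 200.0

def co2_grams(distance_km: float, index_g_per_km: float = CO2_G_PER_KM) -> float:
    """Estimate the CO2 emitted (in grams) over a travelled distance."""
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    return distance_km * index_g_per_km

# Hypothetical example: a 12.5 km trip.
print(co2_grams(12.5))  # 2500.0
```

Summing this value over all trajectory segments on a street would give a per-street total, which is the kind of measure a pollution query could aggregate.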
6. From Velocity to Traffic Density
How to use that information?
◦ We can extract the same information by looking at the vehicles’ average speed
points/km        Traffic density
72 (50 km/h)     9
96               18
144              35
288              70
→ ∞              140
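The points/km column is consistent with a receiver that logs one fix per second: at 50 km/h a vehicle needs 3600 / 50 = 72 s to cover one kilometre, so it leaves 72 points. The sketch below reproduces that relation and looks up the density from the table. The one-fix-per-second rate and the use of the table rows as step thresholds are both assumptions of this sketch, not statements from the slides.

```python
import math

SAMPLES_PER_HOUR = 3600  # assumption: one GPS fix per second

# (points per km, traffic density) rows copied from the table above.
DENSITY_TABLE = [(72, 9), (96, 18), (144, 35), (288, 70), (math.inf, 140)]

def points_per_km(speed_kmh: float) -> float:
    """Number of fixes a vehicle leaves on 1 km of road at a constant speed."""
    if speed_kmh <= 0:
        return math.inf  # a stationary vehicle accumulates points without bound
    return SAMPLES_PER_HOUR / speed_kmh

def traffic_density(speed_kmh: float) -> int:
    """Return the density of the first row whose points/km value is not exceeded.

    Treating the rows as step thresholds is an assumption made for this sketch.
    """
    ppk = points_per_km(speed_kmh)
    for threshold, density in DENSITY_TABLE:
        if ppk <= threshold:
            return density
    return DENSITY_TABLE[-1][1]

for speed in (50, 37.5, 25, 12.5, 0):
    print(speed, points_per_km(speed), traffic_density(speed))
```

Read this way, the table rows correspond to average speeds of 50, 37.5, 25, 12.5 and 0 km/h.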
7. Why a TrDW for Rio?
We would like to run queries like:
◦ Q1: How does traffic congestion evolve during the week? (Spatial)
◦ Q2: Which are the most polluted streets? (Spatio-Temporal)
◦ Q3: Which streets are the most congested? (Numeric)
8. How could Trajectories be helpful?
The trajectory is the unit of work for our traffic management application
Note: we only partially use the Stops-Moves trajectory model
◦ Stops have already been calculated and are represented by an attribute of each trajectory
◦ Trajectory segmentation is constrained by the road network segmentation
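One way to picture segmentation constrained by the road network is to cut a trajectory wherever consecutive points fall on different streets. The sketch below does this with `itertools.groupby`. It assumes the GPS points have already been matched to street identifiers; the `Fix` class and its field names are hypothetical and not taken from the original prototype.

```python
from dataclasses import dataclass
from itertools import groupby

@dataclass
class Fix:
    """One map-matched GPS point (field names are hypothetical)."""
    time: int         # seconds since the start of the trajectory
    street_id: str    # road segment the point was matched to
    speed_kmh: float

def segment_by_street(fixes):
    """Cut a trajectory into consecutive runs of fixes on the same street."""
    return [(street, list(run))
            for street, run in groupby(fixes, key=lambda f: f.street_id)]

# Invented trajectory: the vehicle leaves street A, crosses B, and returns to A.
trajectory = [Fix(0, "A", 30.0), Fix(1, "A", 28.0), Fix(2, "B", 15.0), Fix(3, "A", 20.0)]
for street, run in segment_by_street(trajectory):
    print(street, len(run))
```

Because `groupby` only merges adjacent items, a vehicle that returns to the same street later produces a new segment rather than extending the earlier one.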
9. Our Dataset
GPS Signals
◦ Position
◦ Time
◦ Speed
Street Network
◦ Street segmentation
◦ Street names
$GPRMC, €,V,2253.7009,S,04321.2711,W,,,,021.8,W,N*1C
$GPGGA,,2253.7009,S,04321.2711,W,0,00,00.00,000012.8,M,-005.8,M,,*6E
$GPZDA,103037,11,05,2007,+00,00*65
$GPRMC, €,V,2253.7009,S,04321.2711,W,,,,021.8,W,N*1C
...
$GPRMC,103501,A,2300.0632,S,04319.8165,W,017.2,100.3,110507,021.8,W,A*0C
$GPRMC,103502,A,2300.0642,S,04319.8120,W,015.1,103.3,110507,021.8,W,A*0B
$GPRMC,103503,A,2300.0651,S,04319.8082,W,013.1,103.4,110507,021.8,W,A*00
Trajectories
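The `$GPRMC` sentences above carry the three GPS attributes listed (position, time, speed). A minimal parsing sketch follows, assuming the standard NMEA RMC field order: UTC time, status, latitude, longitude, speed in knots, course, date. The function names are illustrative, checksum validation is omitted, and fixes flagged `V` (void) are skipped.

```python
from datetime import datetime, timezone

KNOTS_TO_KMH = 1.852  # one knot is exactly 1.852 km/h

def _to_degrees(value: str, hemisphere: str) -> float:
    """Convert NMEA (d)ddmm.mmmm plus a hemisphere letter to signed decimal degrees."""
    raw = float(value)
    degrees = int(raw // 100)
    minutes = raw - degrees * 100
    decimal = degrees + minutes / 60
    return -decimal if hemisphere in ("S", "W") else decimal

def parse_gprmc(sentence: str):
    """Return a dict for a valid $GPRMC fix, or None for void or other sentences."""
    body = sentence.strip().split("*")[0]  # drop the checksum suffix
    fields = body.split(",")
    if fields[0] != "$GPRMC" or len(fields) < 10 or fields[2] != "A":
        return None
    # Date is ddmmyy, time is hhmmss (any fractional seconds are ignored).
    timestamp = datetime.strptime(fields[9] + fields[1][:6], "%d%m%y%H%M%S")
    return {
        "time": timestamp.replace(tzinfo=timezone.utc),
        "lat": _to_degrees(fields[3], fields[4]),
        "lon": _to_degrees(fields[5], fields[6]),
        "speed_kmh": float(fields[7]) * KNOTS_TO_KMH,
    }

line = "$GPRMC,103501,A,2300.0632,S,04319.8165,W,017.2,100.3,110507,021.8,W,A*0C"
print(parse_gprmc(line))
```

Sorting the parsed fixes of one vehicle by time would then yield the ordered point sequence from which a trajectory is built.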
14. Third Query: Numeric
Which streets have the worst traffic conditions overall?
Traffic Index   Street
25,131          Rio overall
39,077          AVN AMARO CAVALCANTE
27,886          ACESSO A PTE PRES COSTA E SILVA
24,032          ACESSO AVN GOVERN CARLOS LACERDA (LINHA AMARELA)
15,651          ACESSO DO VTO DE MANGUINHOS
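A numeric query of this kind amounts to a roll-up over the street dimension followed by a ranking. The sketch below shows the idea in plain Python. It assumes a traffic index has already been computed per observation (this section does not give its definition) and that the roll-up is a simple mean; the function name and the sample rows are invented for illustration.

```python
from collections import defaultdict

def rank_streets(observations, top=3):
    """Average the traffic index per street and return the worst streets first."""
    totals = defaultdict(lambda: [0.0, 0])  # street -> [sum of index, count]
    for street, index in observations:
        totals[street][0] += index
        totals[street][1] += 1
    averages = {street: s / n for street, (s, n) in totals.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:top]

# Invented values for illustration only; they are not taken from the Rio dataset.
sample = [("STREET A", 40.0), ("STREET A", 20.0), ("STREET B", 35.0), ("STREET C", 10.0)]
print(rank_streets(sample, top=2))
```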
15. Remarks on Using Oracle OLAP
Positives:
◦ Good Expressive Power for Aggregations
◦ Multi-dimensional representation
◦ SQL interface from MOLAP to Relational
Drawbacks:
◦ Too many catalog tables!
◦ No robust bulk loading methods: fatal errors!
◦ Queries are slow even with a simple mapping to relational
◦ Querying a cube with Street and Time dimensions takes 3-4 minutes
◦ Limited set of supported types:
  ◦ Only TEXT, Number, Date
  ◦ No complex objects
16. Conclusions
The design process is risky!
◦ Lack of error handling
The SQL interface opens up wider uses, e.g. GIS tools
Future work: use OLAP DML to improve running times