Finding the Shortest Path in Transit Networks

                                        Victor Chircu

         Faculty of Computer Science, “Alexandru Ioan Cuza” University, Iasi, Romania
                           victor.chircu@info.uaic.ro



        Abstract. More and more governments provide SPARQL
        endpoints for querying public data, which sometimes
        includes route configuration for transit agencies.
        Knowing the route configuration and the geolocation of
        each route stop in a transit network, one can run an
        algorithm to find the (k-)shortest path(s) from point
        A to point B. Since the Romanian Government does not
        provide this kind of information, this article
        describes a way to acquire this data, store it in
        a triple store, and expose it through a SPARQL endpoint.

        Keywords: semantic web, rdf, sparql, triple store, virtuoso universal server,
        dotNetRdf, transit network, public transportation network, k-shortest path, A*


1       Introduction

The focus of this paper is to provide a way to acquire, store and expose transit data
when this data is not openly available on the Web. In recent years, public transportation
agencies have been providing transit data freely over the internet, data that developers
can use to build useful applications. In some cases1, the agencies even encourage
developers to do so by organizing application contests. This trend started in 2005,
when TriMet2 and Google developed the Google Transit Feed Specification (GTFS)3
format, which is used for finding routes with the Google Transit Trip Planner4.
Over the years, more transit agencies incorporated trip planners in their websites and
provided data in GTFS format to be used with the Google Transit Trip Planner.
    Unfortunately, this is not the case for Romania, where public transportation
(linked) data is still closed. For example, the website of RATB5 (the main bus transit
agency in Bucharest) does not provide a trip planner, but does provide an HTML view
of all the routes and of the stops on each route. On the other hand, Metrorex6 (subway transit

1
    http://mtaappquest.com/
2
    http://trimet.org/
3
    https://developers.google.com/transit/gtfs/reference
4
    http://www.google.com/intl/en/landing/transit/#mdy
5
    http://www.ratb.ro/
6
    http://www.metrorex.ro
agency in Bucharest) does provide a trip planner, but publishes its routes only as a
visual route map, which cannot be parsed by a computer. Metrorex’s route planner also
needs some improvement, since it provides routes only from station to station (the
average user would prefer to enter the origin and destination by clicking on a map)
and does not connect with the RATB routes. Thus, there is a need for a city-level
transit route planner that uses routes from all agencies in a city. This is the main purpose
of my dissertation, but this paper focuses only on getting the data, organizing it
in a manner that fits my purpose and exposing it through a SPARQL endpoint, so that it
can be used by other developers.


2      Getting the data

I have managed to get part of the data I need as an SQL database. This database contains
most of the routes (name, geometry) and stops (name, location) in the Bucharest
transit network. The routes are managed by the two biggest transit agencies in the
city, RATB and Metrorex. This data is incomplete, because I also need to know which
stops belong to each route, and in what sequence. To get this information, I can use a
technique called web scraping, because the RATB website does provide a way for human users
to see what stops are on each route. As defined in [1], web scraping is “a computer
software technique of extracting information from websites”.
Part of the page for one of the RATB routes on the RATB website is shown in
Figure 1.




                         Fig. 1. Route page on the RATB website.
As you can see, you can select the route from one of the drop-down lists (each list
represents a type of route: rail, trolley, bus). The selected route name must be sent as
a parameter to the server. To find out how, we can use a network monitoring tool, like
Firebug7 for Mozilla’s Firefox browser. For example, let’s say we have chosen 106
from the bus route drop-down list (the third from the left). This selection automatically
triggers a postback. The POST parameters are shown in Figure 2.




                         Fig. 2. Using Firebug to detect POST parameters.


As you can see, the route name is indeed sent as a POST parameter, with the name
tlin3. Knowing this, I can build a small (PowerShell) script that makes a GET request
to http://www.ratb.ro/v_trasee.php, gets all the options from the HTML select element
that I am interested in (e.g. the bus route list), and for each option value makes an
HTTP POST request with the required parameters (e.g. if we choose the third drop-down
list, then the tlin1, tlin2, tlin4 and tlin5 parameters are set to 0, and tlin3 is set to the
option value). The POST returns the page in HTML format. I then have to parse the
page, extract the information from the table and write it to disk. The next step is to
write the route stops to the existing SQL database. This can be done manually, or
automatically, by matching the stop name from the file with the stop name in the
database. Following these steps, I have found the complete itinerary for three
routes.
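
The scraping steps above can be sketched as follows. This is only an illustration in Python, not the PowerShell script itself: the sample HTML, the assumption that each select element carries the same name as its POST parameter, and the treatment of "0" as an empty placeholder are all hypothetical. Only the URL and the tlin1 to tlin5 parameter names come from the text.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

ROUTE_PAGE = "http://www.ratb.ro/v_trasee.php"              # URL given in the text
LIST_NAMES = ["tlin1", "tlin2", "tlin3", "tlin4", "tlin5"]  # parameter names given in the text


class SelectOptions(HTMLParser):
    """Collects the option values of one <select> element, chosen by name."""

    def __init__(self, select_name):
        super().__init__()
        self.select_name = select_name
        self.inside = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select":
            # Assumption: the select element is named like its POST parameter.
            self.inside = attrs.get("name") == self.select_name
        elif tag == "option" and self.inside:
            value = attrs.get("value")
            if value and value != "0":  # assumption: "0" is the empty placeholder entry
                self.values.append(value)

    def handle_endtag(self, tag):
        if tag == "select":
            self.inside = False


def route_names(html, select_name):
    """Return every route offered by the given drop-down list."""
    parser = SelectOptions(select_name)
    parser.feed(html)
    return parser.values


def post_body(select_name, route):
    """Form body for one route: the chosen list gets the route, the others get 0."""
    params = {name: (route if name == select_name else "0") for name in LIST_NAMES}
    return urlencode(params)


# Hypothetical fragment standing in for the page returned by the GET request.
sample = """
<select name="tlin2"><option value="0">-</option><option value="61">61</option></select>
<select name="tlin3"><option value="0">-</option><option value="106">106</option>
<option value="116">116</option></select>
"""

for route in route_names(sample, "tlin3"):
    print(route, "->", post_body("tlin3", route))
```

Each printed body would then be sent in an HTTP POST to ROUTE_PAGE, and the returned HTML table parsed in the same way.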




7
    http://getfirebug.com/
3       Modeling the data

3.1     General Information
Taking into account the classes of entities and the relationships between these classes,
the domain I want to model is shown in Figure 3.




                                      Fig. 3. Transit Domain Model.

Because I want the dataset that I am exposing to be extensible and interoperable with
other datasets, I have chosen to model the data using RDF triples, store it in a Triple
Store and expose it through a SPARQL endpoint.
In the Semantic Web, a knowledge base contains a set of RDF (Resource Description
Framework) [2] triples of the form <subject, predicate, object>. Resources are identified
by a unique URI. Subjects and predicates are always resources, while objects can
be either resources or literals (having an associated data type). Using this model, any
kind of domain can be represented.
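
To make the triple model concrete, here is a minimal Python sketch of a triple whose object is a typed literal, written out in N-Triples syntax. The stop URI and the stop name are made-up sample values; the rdfs:label and xsd:string URIs are the standard ones.

```python
from collections import namedtuple

Resource = namedtuple("Resource", "uri")
Literal = namedtuple("Literal", "value datatype")
Triple = namedtuple("Triple", "subject predicate object")

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


def term(node):
    """Render one node in N-Triples syntax."""
    if isinstance(node, Resource):
        return "<%s>" % node.uri
    escaped = node.value.replace("\\", "\\\\").replace('"', '\\"')
    return '"%s"^^<%s>' % (escaped, node.datatype)


def to_ntriples(triple):
    """Render a whole triple as one N-Triples line."""
    return "%s %s %s ." % tuple(term(node) for node in triple)


stop = Resource("http://example.org/stop/42")  # hypothetical stop URI
label = Resource("http://www.w3.org/2000/01/rdf-schema#label")
triple = Triple(stop, label, Literal("Piata Unirii", XSD_STRING))  # sample stop name
print(to_ntriples(triple))
```

The subject and predicate are always Resource values, while the object may be either a Resource or a Literal, which mirrors the rule stated above.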
   Because in the Semantic Web it is strongly advised to reuse existing RDF vocabularies
where possible rather than create a new one, I have searched for one that would suit my
needs. I have found the Transit8 vocabulary, which models exactly the domain I am

8
    http://vocab.org/transit/terms/
trying to describe. Transit is “a vocabulary for describing transit systems and routes”
and is based on the General Transit Feed Specification published by Google. The
Transit vocabulary core classes are shown in Figure 4.
   The Transit vocabulary is already used to expose MTA New York City Transit data9
and bus route data in Southampton, UK10, so it is arguably the best-known transit
RDF vocabulary.




                             Fig. 4. Transit Vocabulary Core Classes.

As you can see, the classes in this vocabulary map onto the classes in my domain
model. There is one class that I do not use in my domain model, the Schedule class,
because in most cities in Romania public transportation does not
run according to a fixed schedule. By eliminating this class, I have greatly reduced
the amount of data I have to manipulate.




9
     http://kasabi.com/dataset/mta-new-york-city-transit
10
     http://data.southampton.ac.uk/bus-routes.html
3.2      Transit Core Classes and Properties
In this section, I will go over the main Transit classes and properties that I am using
and define them according to the Transit vocabulary document11. You may notice that I
use only a subset of the classes and properties defined in this document, because my
model is a little less general: it omits the Schedule and Service classes.

Note: Each resource’s URI is given in the footnotes.


Classes

 • Agency12 - an organization that oversees public transportation for a
   city or region (e.g. RATB, Metrorex).
 • Route13 - a public transportation route; some of its subclasses are:
   ─ Bus Route14
   ─ Rail Route15
   ─ Subway Route16
 • Stop17 - a location where passengers board or disembark from a transit vehicle
 • Route Stop18 - a location where passengers board or disembark from a transit
   vehicle for a specific route.


Properties


 • agency19 - the agency that operates this public transportation route
 • route20 - a route associated with the given resource
 • routeStop21 - links a route to a particular stop and the sequence of that stop in the
   route
 • stop22 - the physical stop associated with this route stop (Note: I do not use this
   property exactly as the Transit vocabulary defines it. There, having this property
   implies being a ServiceStop. In my domain, having this property implies being a
   RouteStop, because I do not need to use the ServiceStop class.)
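
As an illustration of how these classes and properties fit together, the Python sketch below generates the triples for one bus route and its ordered stops. The Transit URIs are the ones listed in the footnotes. The example.org namespace, the identifiers, and in particular the predicate used to record the stop order are hypothetical placeholders, since the text does not name the property that holds the sequence.

```python
TRANSIT = "http://vocab.org/transit/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/transit/"   # hypothetical namespace for the dataset
SEQUENCE = BASE + "terms/sequence"     # hypothetical predicate for the stop order


def route_triples(agency, route, stops):
    """Return (subject, predicate, object) tuples for a bus route and its stops."""
    route_uri = BASE + "route/" + route
    triples = [
        (route_uri, RDF_TYPE, TRANSIT + "BusRoute"),
        (route_uri, TRANSIT + "agency", BASE + "agency/" + agency),
    ]
    for position, stop in enumerate(stops, start=1):
        # One RouteStop per (route, position) pair, pointing at the physical stop.
        route_stop = "%s/stop/%d" % (route_uri, position)
        triples += [
            (route_stop, RDF_TYPE, TRANSIT + "RouteStop"),
            (route_uri, TRANSIT + "routeStop", route_stop),
            (route_stop, TRANSIT + "route", route_uri),
            (route_stop, TRANSIT + "stop", BASE + "stop/" + stop),
            (route_stop, SEQUENCE, str(position)),
        ]
    return triples


# Sample identifiers only; they do not describe the real route 106.
for triple in route_triples("RATB", "106", ["stop-a", "stop-b"]):
    print(triple)
```

The same physical stop can appear in several routes, each time through a different RouteStop, which is why the sequence is attached to the RouteStop and not to the Stop.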


11
     http://vocab.org/transit/terms
12
     http://vocab.org/transit/terms/Agency
13
     http://vocab.org/transit/terms/Route
14
     http://vocab.org/transit/terms/BusRoute
15
     http://vocab.org/transit/terms/RailRoute
16
     http://vocab.org/transit/terms/SubwayRoute
17
     http://vocab.org/transit/terms/Stop
18
     http://vocab.org/transit/terms/RouteStop
19
     http://vocab.org/transit/terms/agency
20
     http://vocab.org/transit/terms/route
21
     http://vocab.org/transit/terms/routeStop
22
     http://vocab.org/transit/terms/stop
3.3      Other Vocabularies

Some of the other RDF vocabularies I am using are:

 • FOAF23, for the name and page properties
 • RDF24, for the type property
 • RDF Schema25, for the label property
 • Geometry Ontology26, for describing the geometry of a route in Well-Known
   Text27 (WKT) format
 • WGS84 Geo Positioning28, for representing latitude and longitude information in
   the WGS84 geodetic reference datum


4        Technical Information

4.1      General Information
Because of my hands-on experience with the Microsoft development stack, I have
chosen to develop the application using .Net/C#. This might be a challenge, because
most semantic web tools target platforms other than .Net, as seen in the pie chart
in Figure 5, taken from [3]. The same source [3] shows that this trend is
holding, since no new semantic tools were developed for the .Net platform in 2011.
I have considered switching to the Java platform, which provides a set of
powerful tools for working with RDF and OWL, most notably the Jena Framework29,
developed by Apache, but after doing some research I have found a tool written for
the .Net platform which suits my needs. This tool is presented in detail in section
4.4 of this paper.




23
     http://xmlns.com/foaf/0.1/
24
     http://www.w3.org/1999/02/22-rdf-syntax-ns#
25
     http://www.w3.org/2000/01/rdf-schema#
26
     http://data.ordnancesurvey.co.uk/ontology/geometry/
27
     http://edndoc.esri.com/arcsde/9.0/general_topics/wkt_representation.htm
28
     http://www.w3.org/2003/01/geo/wgs84_pos#
29
     http://incubator.apache.org/jena/
                      Fig. 5. Semantic Web Tools by Programming Language.


4.2      Knowledge base

One of the disadvantages of using 3rd party triple stores is that I could not find a
suitable open-source product. Because of the nature of my problem, I could not rely on an
in-memory triple store; I needed an efficient one, with a powerful query engine. After
researching different options, I have decided to use OpenLink’s Virtuoso Universal
Server30 as a Triple Store. My choice was based on Virtuoso’s maturity and on its RDF
Graph Model features31:

 • Backward Chaining OWL Reasoner covering: rdfs:subClassOf,
   rdfs:subPropertyOf, owl:sameAs, owl:equivalentClass, owl:equivalentProperty,
   owl:InverseFunctionalProperty, owl:inverseOf, owl:SymmetricProperty, and
   owl:TransitiveProperty
 • SPARQL 1.1 Query Language, Protocol, and Results Serialization support
 • SPARQL Create, Update, and Delete (SPARUL)



30
     http://virtuoso.openlinksw.com/
31
     http://virtuoso.openlinksw.com/rdf-quad-store/
 • Support for a broad range of RDF data representation formats:
   HTML+RDFa, RDF-JSON, N3, Turtle, TriG, TriX, and RDF/XML
 • REST interfaces for Create, Read, Update, and Delete operations
 • RDF data is also accessible via ODBC, JDBC, ADO.NET (Entity
   Frameworks compatible), OLE DB, and XMLA data providers / drivers.


Because the application that I am building is non-commercial, I am not
interested in acquiring a Virtuoso commercial license at the moment. OpenLink does
provide two 15-day trials of the software. I have used the first one while configuring
and testing the Virtuoso Server, and will use the second one for demos later on. While
working on the application, I will use in-memory triple stores, loaded from and saved to
the file system.


4.3      .Net API for working with RDF
While looking for a .Net API for working with RDF, I have found three possible
candidates:

 • Linq2RDF32, a LINQ query provider that converts queries into the SPARQL query
   language. Unfortunately, the API is not mature enough, and the last update for this
   project was in August 2008, so it appears to be no longer under development.
 • Jena .NET33, a flexible .NET port of the Jena semantic web toolkit. Unfortunately,
   this project has been abandoned too, while still at a beta 0.3 release.
 • dotNetRDF34, an open-source semantic web/RDF library for C#/.Net. Even though this
   is just a beta 0.5 release, it is still under development, which is a big advantage
   over the other two options. This API is described in the next section.


4.4      dotNetRDF

General Information
Some of the points of interest regarding the API are:

 • currently a beta release (version 0.5.1)
 • works on .Net 3.5 (but according to the project’s Issue Tracker35, moving the
   library to .Net 4.0 is a top priority)
 • "simple but powerful API for working with RDF"
 • operates primarily with Triples, Graphs and Triple Stores
 • has limited support for Inference
 • no support for OWL


32
     http://code.google.com/p/linqtordf/
33
     http://semanticweb.org/wiki/Jena_.NET
34
     http://www.dotnetrdf.org/
35
     http://www.dotnetrdf.org/tracker/Issues/IssueDetail.aspx?id=22
Known formats
The library can read RDF fragments (including Graphs and Triple Stores) from
strings, files and even URIs. It can also write RDF fragments to files and strings.
Reading and writing can be done in all of the most commonly used RDF formats: RDF/XML,
RDF/JSON, NTriples, Turtle, Notation 3 and XHTML + RDFa.


Graphs
The API has support for getting Nodes and Triples from a Graph by given criteria
(a combination of subject, predicate and object), merging graphs and computing
graph difference and equality.
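
The graph operations described above can be illustrated without the library. The Python sketch below treats a graph as a plain set of triples; it does not use or imitate the dotNetRDF API, and it ignores blank nodes, for which equality would require a graph isomorphism check and not simple set comparison. The prefixed names are sample values.

```python
def match(graph, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every position that is not None."""
    return {
        (s, p, o)
        for (s, p, o) in graph
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    }


g1 = {
    ("ex:r106", "rdf:type", "transit:BusRoute"),
    ("ex:r106", "transit:agency", "ex:RATB"),
}
g2 = {
    ("ex:r106", "transit:agency", "ex:RATB"),
    ("ex:m1", "rdf:type", "transit:SubwayRoute"),
}

merged = g1 | g2    # merge: union of the two triple sets
added = g2 - g1     # difference: triples only in g2
removed = g1 - g2   # difference: triples only in g1
same = g1 == g2     # equality (valid here because there are no blank nodes)

print(sorted(match(merged, predicate="rdf:type")))
print(sorted(added), sorted(removed), same)
```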


Triple Stores
The library can work with both in-memory Triple Stores and native Triple Stores.
It provides support for working with:

 • in-memory triple stores, loaded from and saved to disk in two ways:
   ─ a folder, where each file represents a single Graph, and there is an additional
     index file
   ─ a single file, using one of the following formats: TriG, TriX and NQuads
 • simple SQL-based stores with MySQL and Microsoft SQL Server databases
 • native 3rd party Triple Stores: AllegroGraph36, Dydra37, 4store38, Fuseki39, Joseki40,
   Sesame41 (any Sesame-based store, e.g. BigOWLIM42), SPARQL Graph Store
   HTTP Protocol for compliant stores, Stardog43, the Talis Platform44 and Virtuoso.

By providing an easy way to work with Virtuoso based Triple Stores, dotNetRDF
proves to be the right choice.


Querying
Using dotNetRDF one can query easily over:

 • in-memory Graphs, using the library’s SPARQL implementation
 • remote SPARQL endpoints
 • 3rd party Triple Stores, using native queries (this is very important, since we can
   rely on the more powerful Virtuoso query engine and not on the weaker
   dotNetRDF implementation).

36
     http://www.franz.com/agraph/allegrograph/
37
     http://dydra.com/
38
     http://4store.org/
39
     http://incubator.apache.org/jena/documentation/serving_data/index.html
40
     http://www.joseki.org/
41
     http://www.openrdf.org/
42
     http://www.ontotext.com/owlim
43
     http://stardog.com/
44
     http://www.talis.com/platform/
The query mechanism is compatible with the current draft45 of the SPARQL 1.1
standard.


Inference and Reasoning
The current version provides three types of reasoners:

 • RDFS Reasoner, which does not apply the full range of possible RDFS-based
   inferencing but does do the following:
   ─ asserts additional type triples for anything which has a type which is a sub-class
     of another type
   ─ asserts additional triples where the property (predicate) is a sub-property of
     another property
   ─ asserts additional type triples based on the domain and range of properties
 • SKOS Reasoner, a simple concept hierarchy reasoner which can infer additional
   triples where the subject has an object which is a skos:Concept in the taxonomy by
   following skos:narrower and skos:broader links as appropriate.
 • Simple N3 Rules Reasoner, a reasoner that is able to apply simple N3 rules.
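
To show what the first RDFS rule in the list amounts to, here is a small Python sketch of sub-class type inference over sets of triples. It covers only that one rule and is an illustration of the idea, not a description of how dotNetRDF implements its reasoner; the prefixed names are sample values.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def infer_types(schema, data):
    """Return data plus the type triples implied by rdfs:subClassOf in schema."""
    # Map each class to its direct superclasses.
    parents = {}
    for s, p, o in schema:
        if p == SUBCLASS:
            parents.setdefault(s, set()).add(o)

    inferred = set(data)
    queue = [(s, o) for s, p, o in data if p == RDF_TYPE]
    while queue:
        instance, cls = queue.pop()
        for parent in parents.get(cls, ()):
            triple = (instance, RDF_TYPE, parent)
            if triple not in inferred:  # also guards against cycles in the schema
                inferred.add(triple)
                queue.append((instance, parent))
    return inferred


schema = {("transit:BusRoute", SUBCLASS, "transit:Route")}
data = {("ex:r106", RDF_TYPE, "transit:BusRoute")}
print(sorted(infer_types(schema, data)))
```

With such a rule applied, a query for all resources of type Route would also return the bus routes, even though they were only asserted to be of type BusRoute.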

Unfortunately, there is no API support for using inference with 3rd party Triple
Stores. Because of this, the reasoner that comes with the Virtuoso Universal Server
cannot be used.


Configuration
The library comes with a very useful Configuration API that can be used to dynamically
load commonly used objects (such as Graphs, connections to Triple Stores etc.),
and a couple of tools for deploying RDF-enabled ASP.NET Web Applications. Because
of these two features, exposing a SPARQL endpoint is a trivial task.


5        Implementation Details

5.1      Populating the Virtuoso RDF Triple Store
As mentioned in section 2, I have the data in a MS SQL Server Database, and, as
mentioned in section 3, I have the RDF vocabulary. What I have to do is migrate the
data from the SQL database into the Virtuoso Triple Store. I have done this in two
steps:

 • Write the data from the SQL database into a set of files. To do this in the simplest
   way, I have used the RAD features of the Visual Studio 2010 IDE together with
   Entity Framework: the Entity Framework46 has created a set of classes, based on



45
     http://www.w3.org/TR/sparql11-query/
46
     http://msdn.microsoft.com/en-us/data/ee712906
   the database’s tables. In code, I read the data from the DB using these classes and
   wrote it to a set of files.
 • The second step was to write the data from the files to a Graph. This could not be
   done in the same program as the first step, because EF 4.1 is not compatible with
   .NET 3.5, and the dotNetRDF library is built on .NET 3.5. After this step, I have
   written the data from the Graph to the Virtuoso Triple Store (dotNetRDF makes this
   task easier).
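
The first migration step can be sketched as follows. This Python example uses an in-memory SQLite database as a stand-in for the MS SQL Server database, and N-Triples as the intermediate file format; both are assumptions made for illustration, as are the table name, the column names, the example.org namespace and the sample row. The vocabulary URIs are the ones listed in section 3.

```python
import sqlite3

GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
TRANSIT_STOP = "http://vocab.org/transit/terms/Stop"
BASE = "http://example.org/transit/stop/"  # hypothetical namespace for stop URIs


def stop_lines(connection):
    """Yield N-Triples lines for every row of the (hypothetical) Stop table."""
    query = "SELECT Id, Name, Lat, Lon FROM Stop ORDER BY Id"
    for stop_id, name, lat, lon in connection.execute(query):
        subject = "<%s%d>" % (BASE, stop_id)
        escaped = name.replace("\\", "\\\\").replace('"', '\\"')
        yield "%s <%s> <%s> ." % (subject, RDF_TYPE, TRANSIT_STOP)
        yield '%s <%s> "%s" .' % (subject, RDFS_LABEL, escaped)
        yield '%s <%slat> "%s" .' % (subject, GEO, lat)
        yield '%s <%slong> "%s" .' % (subject, GEO, lon)


# Stand-in database with one sample row.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Stop (Id INTEGER, Name TEXT, Lat REAL, Lon REAL)")
db.execute("INSERT INTO Stop VALUES (1, 'Sample Stop', 44.4268, 26.1025)")

lines = list(stop_lines(db))
print("\n".join(lines))
```

Writing these lines to a file gives an intermediate artifact that the second step can load into a Graph regardless of which framework produced it.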



5.2    Exposing the data through a SPARQL Endpoint
To expose the data in the Virtuoso Triple Store through a SPARQL Endpoint, I
created a new ASP.NET Web Application and added, in the App_Data folder, a
configuration file with the following content:

@prefix dnr: <http://www.dotnetrdf.org/configuration#> .

# Firstly note that our Handler must have a subject which is a special
# dotNetRDF URI, as discussed in Configuration API - HTTP Handlers
<dotnetrdf:/sparql> a dnr:HttpHandler ;
  dnr:type "VDS.RDF.Web.QueryHandler" ; # States that we're using the QueryHandler
  dnr:queryProcessor _:proc .

_:proc a dnr:SparqlQueryProcessor ;
  dnr:type "VDS.RDF.Query.SimpleQueryProcessor" ;
  dnr:usingStore _:store .

_:store a dnr:TripleStore ;
  dnr:type "VDS.RDF.NativeTripleStore" ;
  dnr:genericManager _:manager .

# Register the Virtuoso factory
_:virtuosoFactory a dnr:ObjectFactory ;
  dnr:type "VDS.RDF.Configuration.VirtuosoObjectFactory, dotNetRDF.Data.Virtuoso" .

# Now we define the manager that connects to the Virtuoso instance
_:manager a dnr:GenericIOManager ;
  dnr:type "VDS.RDF.Storage.VirtuosoManager, dotNetRDF.Data.Virtuoso" ;
  dnr:server "myIp" ;
  dnr:port "1111" ;
  dnr:database "DB" ;
  dnr:user "user" ;
  dnr:password <appSettings:VirtuosoPassword> .


As you can see, we register an HTTP Handler47 of type QueryHandler. With this handler
we associate a QueryProcessor. The QueryProcessor uses a Native Triple Store, which
is defined by a manager that points to the Virtuoso Universal Server instance.
All that remains is to register the handlers in the web application’s configuration
file (web.config), which is done automatically by the rdfDeploy tool that comes with
dotNetRDF.
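
Once the endpoint is deployed, other developers can query it over HTTP. The Python sketch below builds a request URL following the SPARQL Protocol convention of passing the query text in a `query` parameter. The endpoint address is hypothetical, and the query assumes the data uses the Transit routeStop and stop properties as described in section 3.2.

```python
from urllib.parse import urlencode, urlparse, parse_qs

ENDPOINT = "http://localhost/sparql"  # hypothetical address of the deployed handler

QUERY = """
PREFIX transit: <http://vocab.org/transit/terms/>
SELECT ?routeStop ?stop WHERE {
  ?route transit:routeStop ?routeStop .
  ?routeStop transit:stop ?stop .
}
LIMIT 10
"""


def query_url(endpoint, query):
    """Build a SPARQL Protocol GET URL with the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = query_url(ENDPOINT, QUERY)
print(url)

# Against a live endpoint, the request could be sent like this (not executed here):
#   import urllib.request
#   request = urllib.request.Request(
#       url, headers={"Accept": "application/sparql-results+json"})
#   body = urllib.request.urlopen(request).read()
```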


6        Future Development

As described in [4], [5], any algorithm that aims to find the shortest path in a transit
network needs some sort of pre-processing. This is needed because web usage
studies have shown that the path computation time should be less than 7 seconds [6],
[7]. The pre-processed data needs to be stored in the knowledge base, and since it is
algorithm-dependent, I might have to extend the Transit vocabulary with a couple of
new classes and properties in order to store the information.
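
As a point of reference for that future work, the following Python sketch runs plain Dijkstra over a graph built from route stop sequences. It is only a baseline without any pre-processing, not the algorithms of [4] or [5], and it ignores transfer and waiting times. The route names, stop names and travel times in minutes are invented sample values, and routes are assumed to run in both directions.

```python
import heapq


def build_graph(routes):
    """routes: {route name: [(stop, minutes to the next stop), ...]} -> adjacency dict."""
    graph = {}
    for stops in routes.values():
        for (a, cost), (b, _) in zip(stops, stops[1:]):
            graph.setdefault(a, []).append((b, cost))
            graph.setdefault(b, []).append((a, cost))  # assumption: routes run both ways
    return graph


def shortest_path(graph, origin, destination):
    """Dijkstra's algorithm; returns (total cost, list of stops) or None."""
    queue = [(0, origin, [origin])]
    settled = set()
    while queue:
        cost, stop, path = heapq.heappop(queue)
        if stop == destination:
            return cost, path
        if stop in settled:
            continue
        settled.add(stop)
        for neighbour, step in graph.get(stop, []):
            if neighbour not in settled:
                heapq.heappush(queue, (cost + step, neighbour, path + [neighbour]))
    return None


# Sample network: a bus route A-B-C and a direct but slower line A-C.
routes = {
    "106": [("A", 4), ("B", 3), ("C", 0)],
    "M1": [("A", 10), ("C", 0)],
}
graph = build_graph(routes)
print(shortest_path(graph, "A", "C"))
```

On a city-sized network, the pre-processing discussed above would aim to reduce the part of the graph that such a search has to visit at query time.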




47
     http://msdn.microsoft.com/en-us/library/aa479332.aspx
7     Bibliography
1. ***, Web Scraping, http://en.wikipedia.org/wiki/Web_scraping
2. Allemang, D., Hendler, J., Semantic Web for the Working Ontologist, Morgan Kaufmann,
   2008
3. ***, The State of Tooling for Semantic Technologies,
   http://www.mkbergman.com/991/the-state-of-tooling-for-semantic-technologies/
4. J. Jariyasunant, D. Work, B. Kerkez, R. Sengupta, S. Glaser, A. Bayen, Mobile Transit
   Trip Planning with Real-Time Data. Presented at the Transportation Research Board,
   2010
5. Qiujin Wu, Joanna Hartley, Using K-Shortest Paths Algorithms To Accommodate User
   Preferences In The Optimization Of Public Transport Travel, Applications of Advanced
   Technologies in Transportation Engineering, Proceedings of the Eighth International
   Conference, pp. 181-186, 2004
6. R. Jain, T. Raleigh, C. Graff, and M. Bereschinsky, Mobile internet access and QoS
   guarantees using Mobile IP and RSVP with location registers, IEEE Int. Conf. Commun.,
   vol. 3, pp. 1690–1695, 1998.
7. T. Erl, Ed., Service-oriented architecture (SOA): concepts, technology, and design.
   Prentice Hall, 2005.

Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 

Kürzlich hochgeladen (20)

Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 

Public Transportation Path Finder

Finding the Shortest Path in Transit Networks

Victor Chircu

Faculty of Computer Science, “Alexandru Ioan Cuza” University, Iasi, Romania
victor.chircu@info.uaic.ro

Abstract. More and more governments provide SPARQL endpoints for querying public data, which sometimes includes route configuration for transit agencies. Knowing the route configuration and the geolocation of each route stop in a transit network, one can run an algorithm to find the (k-)shortest path(s) from point A to point B. Since the Romanian Government does not provide this kind of information, this article describes a way to access this data, store it in a triple store, and expose it through a SPARQL endpoint.

Keywords: semantic web, rdf, sparql, triple store, virtuoso universal server, dotNetRdf, transit network, public transportation network, k-shortest path, A*

1       Introduction

The focus of this paper is to provide a way to acquire, store and expose transit data when this data is not openly available on the Web. In recent years, public transportation agencies have been providing transit data freely over the internet, data that developers can use to build useful applications. In some cases1, the agencies even encourage developers to do so by organizing application contests. This trend started in 2005, when TriMet2 and Google developed the Google Transit Feed Specification (GTFS)3 format, which is used for finding routes with the Google Transit Trip Planner4. Over the years, more transit agencies incorporated trip planners in their websites and provided data in GTFS format to be used with the Google Transit Trip Planner.

Unfortunately, this is not the case for Romania, where public transportation (linked) data is still closed. For example, the website of RATB5 (the main bus transit agency in Bucharest) does not provide a trip planner, but does provide an HTML view of all the routes and of the stops on each route. On the other hand, Metrorex6 (the subway transit agency in Bucharest) does provide a trip planner, but provides only a visual route map, which cannot be parsed by a computer. Metrorex's route planner would also need some improvement, since it provides routes only from station to station (the average user would prefer to enter the origin and destination by clicking on a map) and does not connect with the RATB routes. Thus, there is a need for a city-level transit route planner that uses routes from all agencies in a city. This is the main purpose of my dissertation paper, but this paper focuses only on getting the data, organizing it in a manner that fits my purpose and exposing it through a SPARQL endpoint, so it can be used by other developers.

1 http://mtaappquest.com/
2 http://trimet.org/
3 https://developers.google.com/transit/gtfs/reference
4 http://www.google.com/intl/en/landing/transit/#mdy
5 http://www.ratb.ro/
6 http://www.metrorex.ro

2       Getting the data

I have managed to get part of the data I need as an SQL database. This database contains most of the routes (name, geometry) and stops (name, location) in the Bucharest transit network. The routes are managed by the two biggest transit agencies in the city, RATB and Metrorex. This data is incomplete, because I also need to know the stops on each route and their sequence. To get this information, I can use a technique called web scraping, because the RATB website does provide a way for human users to see what stops are on each route. As defined in [1], web scraping “is a computer software technique of extracting information from websites”. Part of the page for one of the RATB routes on the RATB website is shown in Figure 1.

Fig. 1. Route page on the RATB website.
As you can see, you can select the route from one of the drop-down lists (each list represents a type of route: rail, trolley, bus). The selected route name must be sent as a parameter to the server. To find out how, we can use a network monitoring tool, like Firebug7 for Mozilla's Firefox browser. For example, let's say we have chosen 106 from the bus route drop-down list (the third from the left). This selection automatically triggers a postback. The POST parameters are shown in Figure 2.

Fig. 2. Using Firebug to detect POST parameters.

As you can see, the route name is indeed sent as a POST parameter, with the name tlin3. Knowing this, I can build a small (PowerShell) script that makes a GET request to http://www.ratb.ro/v_trasee.php, gets all the options from the HTML select element that I am interested in (e.g. the bus route list) and, for each option value, makes an HTTP POST request with the required parameters (e.g. if we choose the third drop-down list, then the tlin1, tlin2, tlin4 and tlin5 parameters are set to 0, and tlin3 is set to the option value). The POST returns the page in HTML format. Now I have to parse the page, extract the information from the table and write it to disk. The next step is to write the route stops to the existing SQL database. This can be done manually, or automatically by matching the stop name from the file with the stop name in the database. Following these steps I have found the complete route itinerary for three routes.

7 http://getfirebug.com/
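The request-building part of the scraping steps above can be sketched as follows. This is a minimal illustration in Python rather than the PowerShell script mentioned in the text; it only uses the parameter names reported there (tlin1 to tlin5). The sample HTML fragment, the assumption that the select element's name attribute equals the parameter name, and the assumption that a value of 0 marks a placeholder option are all hypothetical and would have to be checked against the real page.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

# Parameter names reported in the text; tlin3 is the bus route list.
ROUTE_PARAMS = ["tlin1", "tlin2", "tlin3", "tlin4", "tlin5"]


class SelectOptionParser(HTMLParser):
    """Collects the option values of one <select> element, found by its name."""

    def __init__(self, select_name):
        super().__init__()
        self.select_name = select_name
        self.inside = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select" and attrs.get("name") == self.select_name:
            self.inside = True
        elif tag == "option" and self.inside:
            value = attrs.get("value")
            # Assumption: "0" (or a missing value) is a placeholder entry.
            if value and value != "0":
                self.values.append(value)

    def handle_endtag(self, tag):
        if tag == "select":
            self.inside = False


def extract_options(html, select_name):
    """Return the route names offered by the given drop-down list."""
    parser = SelectOptionParser(select_name)
    parser.feed(html)
    return parser.values


def build_post_body(selected_param, route):
    """Set the selected list to the route name and every other list to 0."""
    fields = [(name, route if name == selected_param else "0")
              for name in ROUTE_PARAMS]
    return urlencode(fields)


# Hypothetical page fragment, for illustration only.
SAMPLE = """
<select name="tlin3">
  <option value="0">-</option>
  <option value="104">104</option>
  <option value="106">106</option>
</select>
"""

routes = extract_options(SAMPLE, "tlin3")
bodies = [build_post_body("tlin3", route) for route in routes]
```

Each string in bodies would be sent as the body of one POST request to the route page; fetching the page and parsing the returned stop table are left out because they depend on the live site's markup.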
3       Modeling the data

3.1     General Information

Taking into account the classes of entities and the relationships between these classes, the domain I want to model is shown in Figure 3.

Fig. 3. Transit Domain Model.

Because I want the dataset that I am exposing to be extensible and interoperable with other datasets, I have chosen to model the data using RDF triples, store it in a Triple Store and expose it through a SPARQL endpoint.

In the semantic Web, a knowledge base contains a set of RDF (Resource Description Framework) [2] triples of the form <subject, predicate, object>. Resources are identified by a unique URI. Subjects and predicates are always resources, while objects can be either resources or literals (having an associated data type). Using this model, any kind of domain can be represented.

Because in the semantic web it is strongly advised to reuse existing RDF vocabularies where possible rather than create new ones, I have searched for one that would suit my needs. I have found the Transit8 vocabulary, which models exactly the domain I am trying to describe. Transit is “a vocabulary for describing transit systems and routes” and is based on the General Transit Feed Specification published by Google. The Transit vocabulary core classes are shown in Figure 4.

The Transit vocabulary is already used to expose MTA New York City transit data9 and bus route data in Southampton, UK10, so it is arguably the best-known transit RDF vocabulary.

Fig. 4. Transit Vocabulary Core Classes.

As you can see, the classes in this vocabulary map onto the classes in my domain model. There is one class that I do not use in my domain model, the Schedule class. This is because in most of the cities in Romania, public transportation does not run according to a fixed schedule. By eliminating this class, I have greatly reduced the amount of data I have to manipulate.

8 http://vocab.org/transit/terms/
9 http://kasabi.com/dataset/mta-new-york-city-transit
10 http://data.southampton.ac.uk/bus-routes.html
3.2     Transit Core Classes and Properties

In this section, I will go over the main Transit classes and properties that I am using and define them according to the Transit vocabulary document11. You may notice that I use only a subset of the classes and properties defined in this document, because I use a slightly less general model that eliminates the Schedule and Service classes.

Note: each resource's URI is given in the footnotes.

Classes

• Agency12 - an agency is an organization that oversees public transportation for a city or region (e.g. RATB, Metrorex).
• Route13 - a public transportation route; some of its subclasses are:
  ─ Bus Route14
  ─ Rail Route15
  ─ Subway Route16
• Stop17 - a location where passengers board or disembark from a transit vehicle.
• Route Stop18 - a location where passengers board or disembark from a transit vehicle for a specific route.

Properties

• agency19 - the agency that operates this public transportation route.
• route20 - a route associated with the given resource.
• routeStop21 - links a route to a particular stop and the sequence of that stop in the route.
• stop22 - the physical stop associated with this route stop. (Note: I do not use this property exactly as the Transit vocabulary defines it, because there having this property implies being a ServiceStop. In my domain, having this property implies being a RouteStop, because I do not need the ServiceStop class.)

11 http://vocab.org/transit/terms
12 http://vocab.org/transit/terms/Agency
13 http://vocab.org/transit/terms/Route
14 http://vocab.org/transit/terms/BusRoute
15 http://vocab.org/transit/terms/RailRoute
16 http://vocab.org/transit/terms/SubwayRoute
17 http://vocab.org/transit/terms/Stop
18 http://vocab.org/transit/terms/RouteStop
19 http://vocab.org/transit/terms/agency
20 http://vocab.org/transit/terms/route
21 http://vocab.org/transit/terms/routeStop
22 http://vocab.org/transit/terms/stop
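To make the classes and properties above more concrete, here is a minimal Python sketch that serializes one route and its ordered stops as N-Triples. It is an illustration under stated assumptions, not the data actually published: the http://example.org/transit/ namespace and the identifiers "106", "ratb", "s1" and "s2" are hypothetical, and the use of a transit:sequence property for the position of a stop in the route is an assumption that is not described in the text above.

```python
TRANSIT = "http://vocab.org/transit/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
BASE = "http://example.org/transit/"  # hypothetical namespace for our resources


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return "<" + value + ">"


def triple(subject, predicate, obj):
    """Format one N-Triples statement."""
    return f"{subject} {predicate} {obj} ."


def route_triples(route_id, route_class, agency_id, stop_ids):
    """Describe a route, its agency and its ordered route stops."""
    route = uri(BASE + "route/" + route_id)
    lines = [
        triple(route, uri(RDF_TYPE), uri(TRANSIT + route_class)),
        triple(route, uri(TRANSIT + "agency"), uri(BASE + "agency/" + agency_id)),
    ]
    for seq, stop_id in enumerate(stop_ids, start=1):
        route_stop = uri(f"{BASE}route/{route_id}/stop/{seq}")
        lines += [
            triple(route_stop, uri(RDF_TYPE), uri(TRANSIT + "RouteStop")),
            # The route links to each of its route stops ...
            triple(route, uri(TRANSIT + "routeStop"), route_stop),
            # ... and each route stop links to the physical stop.
            triple(route_stop, uri(TRANSIT + "stop"), uri(BASE + "stop/" + stop_id)),
            # Assumption: the position is stored with a transit:sequence property.
            triple(route_stop, uri(TRANSIT + "sequence"),
                   f'"{seq}"^^{uri(XSD_INTEGER)}'),
        ]
    return lines


ntriples = route_triples("106", "BusRoute", "ratb", ["s1", "s2"])
```

Keeping a separate RouteStop resource per (route, position) pair is what allows the same physical Stop to appear on several routes, each with its own sequence number.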
3.3     Other Vocabularies

Some of the other RDF vocabularies I am using are:

• FOAF23, for the name and page properties
• RDF24, for the type property
• RDF Schema25, for the label property
• Geometry Ontology26, for describing the geometry of a route in Well-Known Text27 (WKT) format
• WGS84 Geo Positioning28, for representing latitude and longitude information in the WGS84 geodetic reference datum

4       Technical Information

4.1     General Information

Because of my hands-on experience with the Microsoft development stack, I have chosen to develop the application using .Net/C#. This might be a challenge, because most of the semantic web tools focus on non-Windows clients, as seen in the pie chart in Figure 5, taken from [3]. The same source [3] shows that this trend is constant, since no new semantic tools were developed for the .Net platform in 2011. I considered switching to the Java platform, which provides a set of powerful tools for working with RDF and OWL, most notably the Jena Framework29, developed by Apache, but after doing some research I found a tool written for the .Net platform which suits my needs. This tool is presented in detail in section 4.4 of this paper.

23 http://xmlns.com/foaf/0.1/
24 http://www.w3.org/1999/02/22-rdf-syntax-ns#
25 http://www.w3.org/2000/01/rdf-schema#
26 http://data.ordnancesurvey.co.uk/ontology/geometry/
27 http://edndoc.esri.com/arcsde/9.0/general_topics/wkt_representation.htm
28 http://www.w3.org/2003/01/geo/wgs84_pos#
29 http://incubator.apache.org/jena/
Fig. 5. Semantic Web Tools by Programming Language.

4.2     Knowledge base

One of the disadvantages of using 3rd party triple stores is that there aren't any open-source products. But because of the nature of my problem, I could not use an in-memory triple store; I needed an efficient one, with a powerful query engine. After researching different options, I have decided to use OpenLink's Virtuoso Universal Server30 as a Triple Store. My choice was based on Virtuoso's maturity and on its RDF Graph Model features31:

• Backward Chaining OWL Reasoner covering: rdfs:subClassOf, rdfs:subPropertyOf, owl:sameAs, owl:equivalentClass, owl:equivalentProperty, owl:InverseFunctionalProperty, owl:inverseOf, owl:SymmetricalProperty, and owl:TransitiveProperty
• SPARQL 1.1 Query Language, Protocol, and Results Serialization support
• SPARQL Create, Update, and Delete (SPARUL)
• Support for a broad range of RDF data representation formats: HTML+RDFa, RDF-JSON, N3, Turtle, TriG, TriX, and RDF/XML
• REST interfaces for Create, Read, Update, and Delete operations
• RDF data is also accessible via ODBC, JDBC, ADO.NET (Entity Frameworks compatible), OLE DB, and XMLA data providers / drivers

Because the application that I'm building is non-commercial, I am not interested in acquiring a Virtuoso commercial license at the moment. OpenLink does provide two 15-day trials of the software. I have used the first one while configuring and testing the Virtuoso Server, and will use the second one for demos later on. While working on the application, I will use in-memory triple stores, loaded from and saved to the file system.

30 http://virtuoso.openlinksw.com/
31 http://virtuoso.openlinksw.com/rdf-quad-store/

4.3     .Net API for working with RDF

While looking for a .Net API for working with RDF, I have found three possible candidates:

• Linq2RDF32, a LINQ query provider that converts queries into the SPARQL query language. Unfortunately, it is not a mature enough API, and the last update for this project was in August 2008, so it is no longer under development.
• Jena .NET33, a flexible .NET port of the Jena semantic web toolkit. Unfortunately, this project is abandoned too, while still at a beta 0.3 release.
• dotNetRDF34, an open-source semantic web/RDF library for C#/.Net. Even though this is just a beta 0.5 release, it is still under development, which is a big advantage over the other two options. This API is described in the next section.

4.4     dotNetRDF

General Information

Some of the points of interest regarding the API are:

• currently a beta release (version 0.5.1)
• works on .Net 3.5 (but according to the project's Issue Tracker35, moving the library to .Net 4.0 is a top priority)
• "simple but powerful API for working with RDF"
• operates primarily with Triples, Graphs and Triple Stores
• has limited support for Inference
• no support for OWL

32 http://code.google.com/p/linqtordf/
33 http://semanticweb.org/wiki/Jena_.NET
34 http://www.dotnetrdf.org/
35 http://www.dotnetrdf.org/tracker/Issues/IssueDetail.aspx?id=22
Known formats

The library can read RDF fragments (including Graphs and Triple Stores) from strings, files and even URIs. It can also write RDF fragments to files and strings. Reading and writing can be done in all of the most used RDF formats: RDF/XML, RDF/JSON, NTriples, Turtle, Notation 3, XHTML + RDFa.

Graphs

The API has support for getting Nodes and Triples from a Graph by given criteria (a combination of subject, predicate and object), merging graphs, and computing graph difference and equality.

Triple Stores

The library can work with both in-memory Triple Stores and native Triple Stores. It provides support for working with:

• in-memory triple stores, loaded from and saved to disk in two ways:
  ─ a folder, where each file represents a single Graph, and there is an additional index file
  ─ a single file, using one of the following formats: TriG, TriX and NQuads
• simple SQL based stores with MySQL and Microsoft SQL Server databases
• native 3rd party Triple Stores: AllegroGraph36, Dydra37, 4store38, Fuseki39, Joseki40, Sesame41 (any Sesame based store, e.g. BigOWLIM42), SPARQL Graph Store HTTP Protocol for compliant stores, Stardog43, the Talis Platform44 and Virtuoso.

By providing an easy way to work with Virtuoso based Triple Stores, dotNetRDF proves to be the right choice.

Querying

Using dotNetRDF one can easily query over:

• in-memory Graphs, using the library's SPARQL implementation
• remote SPARQL endpoints
• 3rd party Triple Stores, using native queries (this is very important, since we can rely on the more powerful Virtuoso query engine and not on the weaker dotNetRDF implementation)

36 http://www.franz.com/agraph/allegrograph/
37 http://dydra.com/
38 http://4store.org/
39 http://incubator.apache.org/jena/documentation/serving_data/index.html
40 http://www.joseki.org/
41 http://www.openrdf.org/
42 http://www.ontotext.com/owlim
43 http://stardog.com/
44 http://www.talis.com/platform/
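Querying a remote SPARQL endpoint, the second option listed above, follows the standard SPARQL protocol regardless of the client library: the query text is sent in a query parameter and, when JSON results are requested, the answer uses the SPARQL JSON results layout (head.vars plus results.bindings). The Python sketch below illustrates that exchange without using dotNetRDF. The endpoint URL, the route URI and the sample response are hypothetical; only the Transit property names come from the text.

```python
import json
from urllib.parse import urlencode

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint URL

# Lists the route stops of one (hypothetical) route and their physical stops.
QUERY = """
PREFIX transit: <http://vocab.org/transit/terms/>
SELECT ?routeStop ?stop WHERE {
  <http://example.org/transit/route/106> transit:routeStop ?routeStop .
  ?routeStop transit:stop ?stop .
}
"""


def build_request_url(endpoint, query):
    """SPARQL protocol, query via GET: the query goes in the 'query' parameter.

    A client would send this with the header
    'Accept: application/sparql-results+json' to ask for JSON results.
    """
    return endpoint + "?" + urlencode({"query": query})


def parse_bindings(results_json):
    """Flatten a SPARQL JSON result document into a list of plain dicts."""
    doc = json.loads(results_json)
    names = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # A variable may be unbound in a given row, so check before reading it.
        rows.append({n: binding[n]["value"] for n in names if n in binding})
    return rows


# Hypothetical response, shaped like the SPARQL JSON results format.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["routeStop", "stop"]},
    "results": {"bindings": [{
        "routeStop": {"type": "uri",
                      "value": "http://example.org/transit/route/106/stop/1"},
        "stop": {"type": "uri", "value": "http://example.org/transit/stop/s1"},
    }]},
})

request_url = build_request_url(ENDPOINT, QUERY)
rows = parse_bindings(SAMPLE_RESPONSE)
```

Because the exchange is plain HTTP, any consumer of the published data can use it without depending on the .Net stack chosen for the server side.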
The query mechanism is compatible with the current draft45 of the SPARQL 1.1 standard.

Inference and Reasoning

The current version provides three types of reasoners:

• RDFS Reasoner, which does not apply the full range of possible RDFS based inferencing but does do the following:
  ─ asserts additional type triples for anything which has a type which is a sub-class of another type
  ─ asserts additional triples where the property (predicate) is a sub-property of another property
  ─ asserts additional type triples based on the domain and range of properties
• SKOS Reasoner, a simple concept hierarchy reasoner which can infer additional triples where the subject has an object which is a skos:Concept in the taxonomy, by following skos:narrower and skos:broader links as appropriate
• Simple N3 Rules Reasoner, a reasoner that is able to apply simple N3 rules

Unfortunately, there is no API support for using inference with 3rd party Triple Stores. Because of this, the reasoner that comes with the Virtuoso Universal Server cannot be used.

Configuration

The library comes with a very useful Configuration API that can be used to dynamically load commonly used objects (such as Graphs, connections to Triple Stores etc.), and a couple of tools for deploying RDF enabled ASP.NET Web Applications. Because of these last two features, exposing a SPARQL endpoint is a trivial task.

5       Implementation Details

5.1     Populating the Virtuoso RDF Triple Store

As mentioned in section 2, I have the data in an MS SQL Server database and, as mentioned in section 3, I have the RDF vocabulary. What I have to do is migrate the data from the SQL database into the Virtuoso Triple Store. I have done this in two steps:

• Write the data from the SQL database into a set of files. To do this in the simplest way, I have used the RAD features of the Visual Studio 2010 IDE together with Entity Framework: the Entity Framework46 has created a set of classes based on the database's tables. In code, I got the data from the DB using these classes, and wrote the data to a set of files.
• Write the data from the files to a Graph. The data could not be written to the Graph directly from the database, because EF 4.1 is not compatible with .NET 3.5, and the dotNetRDF library is built on .NET 3.5. After this step, I have written the data from the Graph to the Virtuoso Triple Store (dotNetRDF makes this task easier).

5.2     Exposing the data through a SPARQL Endpoint

To expose the data in the Virtuoso Triple Store through a SPARQL Endpoint, I created a new ASP.NET Web Application and added, in the App_Data folder, a configuration file with the following content:

    @prefix dnr: <http://www.dotnetrdf.org/configuration#> .

    # Firstly note that our Handler must have a subject which is a special
    # dotNetRDF URI as discussed in Configuration API - HTTP Handlers
    <dotnetrdf:/sparql> a dnr:HttpHandler ;
        dnr:type "VDS.RDF.Web.QueryHandler" ;  # States that we're using the QueryHandler
        dnr:queryProcessor _:proc .

    _:proc a dnr:SparqlQueryProcessor ;
        dnr:type "VDS.RDF.Query.SimpleQueryProcessor" ;
        dnr:usingStore _:store .

    _:store a dnr:TripleStore ;
        dnr:type "VDS.RDF.NativeTripleStore" ;
        dnr:genericManager _:manager .

    # Register the Virtuoso Factory
    _:virtuosoFactory a dnr:ObjectFactory ;
        dnr:type "VDS.RDF.Configuration.VirtuosoObjectFactory, dotNetRDF.Data.Virtuoso" .

    # Now we define the initial dataset
    _:manager a dnr:GenericIOManager ;
        dnr:type "VDS.RDF.Storage.VirtuosoManager, dotNetRDF.Data.Virtuoso" ;
        dnr:server "myIp" ;
        dnr:port "1111" ;
        dnr:database "DB" ;
        dnr:user "user" ;
        dnr:password <appSettings:VirtuosoPassword> .

As you can see, we register an HTTP Handler47 of type QueryHandler. To this handler, we associate a QueryProcessor. The QueryProcessor uses a Native Triple Store, which is defined by a manager that points to the Virtuoso Universal Server instance. Now all I have to do is register the handlers in the web application's configuration file (web.config), which is done automatically by the rdfDeploy tool that comes with dotNetRDF.

6       Future Development

As described in [4], [5], any algorithm that aims to find the shortest path in a transit network needs some sort of pre-processing. This is needed since web usage studies have shown that the path computation time should be less than 7 seconds [6], [7]. The pre-processed data needs to be stored in the knowledge base and, since it is algorithm dependent, I might have to extend the Transit vocabulary with a couple of new classes and properties in order to store the information.

45 http://www.w3.org/TR/sparql11-query/
46 http://msdn.microsoft.com/en-us/data/ee712906
47 http://msdn.microsoft.com/en-us/library/aa479332.aspx
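As a rough illustration of how the published data (stop locations plus the order of stops on each route) could feed a path search, the Python sketch below builds a weighted graph and runs plain Dijkstra on it. This is a generic textbook technique, not the algorithm or the pre-processing discussed in [4], [5], and not the k-shortest-path or A* variants named in the keywords. The stop identifiers, coordinates and routes are hypothetical, and the sketch assumes that every route can be travelled in both directions and that straight-line distance is an acceptable edge weight.

```python
import heapq
from math import asin, cos, radians, sin, sqrt


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) pairs."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(h))


def build_graph(routes, coords):
    """Connect consecutive stops of each route (assumed bidirectional)."""
    graph = {stop: {} for stop in coords}
    for stops in routes.values():
        for u, v in zip(stops, stops[1:]):
            d = haversine_km(coords[u], coords[v])
            graph[u][v] = min(d, graph[u].get(v, d))
            graph[v][u] = min(d, graph[v].get(u, d))
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns (path, distance) or (None, inf)."""
    dist = {source: 0.0}
    prev = {}
    queue = [(0.0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(queue, (nd, v))
    if target not in dist:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[target]


# Hypothetical stops (latitude, longitude) and routes (ordered stop lists).
coords = {"A": (44.40, 26.10), "B": (44.41, 26.10), "C": (44.42, 26.10),
          "D": (44.41, 26.12), "E": (44.50, 26.20)}
routes = {"r1": ["A", "B", "C"], "r2": ["B", "D"]}

graph = build_graph(routes, coords)
path, km = shortest_path(graph, "A", "D")
```

A real planner would also have to account for transfers between routes, walking links between nearby stops and travel time rather than distance, which is where the algorithm-dependent pre-processing mentioned above comes in.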
7       Bibliography

1. ***, Web Scraping, http://en.wikipedia.org/wiki/Web_scraping
2. Allemang, D., Hendler, J., Semantic Web for the Working Ontologist, Morgan Kaufmann, 2008
3. ***, The State of Tooling for Semantic Technologies, http://www.mkbergman.com/991/the-state-of-tooling-for-semantic-technologies/
4. J. Jariyasunant, D. Work, B. Kerkez, R. Sengupta, S. Glaser, A. Bayen, Mobile Transit Trip Planning with Real-Time Data, presented at the Transportation Research Board, 2010
5. Qiujin Wu, Joanna Hartley, Using K-Shortest Paths Algorithms To Accommodate User Preferences In The Optimization Of Public Transport Travel, Applications of Advanced Technologies in Transportation Engineering, Proceedings of the Eighth International Conference, pp. 181-186, 2004
6. R. Jain, T. Raleigh, C. Graff, M. Bereschinsky, Mobile internet access and QoS guarantees using mobile IP and RSVP with location registers, IEEE Int. Conf. Commun., vol. 3, pp. 1690-1695, 1998
7. T. Erl, Ed., Service-oriented architecture (SOA): concepts, technology, and design, Prentice Hall, 2005