WebSpa
Monica Macoveiciuc and Constantin Stan
Faculty of Computer Science, Alexandru Ioan Cuza University, Iasi
Abstract. WebSpa is a tool for quick, intuitive (and even fun) querying of
arbitrary SPARQL endpoints. WebSpa runs in the web browser and does not
require the installation of any additional software. The tool manages a
large variety of pre-defined SPARQL endpoints and allows the addition of new
ones. A user account makes it possible to save both a query and its results
on the local computer, and to edit saved queries later. The application is
written in both Java and Flex. It uses the Jena and ARQ application
programming interfaces to perform the queries, and the results are processed
and displayed using Flex.
Introduction
The Web is a universal medium for information and data exchange. Exploiting
the huge amount of knowledge distributed on the Web is a significant challenge.
Humans can understand the information, but it takes great effort to find and
combine data from such a large number of sources; on the other hand, computers
can easily browse through millions of pages in no time, but they are not capable
of understanding the content. The Semantic Web is an extension of the World
Wide Web, “in which information is given well-defined meaning, better enabling
computers and people to work in cooperation”[1].
RDF (Resource Description Framework), together with SPARQL, provides a
powerful mechanism for describing and interchanging metadata on the Web. RDF
is the W3C standard for encoding knowledge: a structure for describing
metadata in numerous forms and for numerous purposes. SPARQL is both a query
language and a remote access protocol. A SPARQL endpoint enables users (human
or machine) to query a knowledge base via the SPARQL language. Results are
typically returned in one or more machine-processable formats. Therefore, a
SPARQL endpoint is mostly conceived as a machine-friendly interface towards a
knowledge base.
WebSpa is a web-based application that manages SPARQL endpoints and allows
users to run queries against them. It provides pre-defined SPARQL endpoints
and supports the addition of new ones. Anyone can use the application, but a
user account is required in order to perform certain operations, such as
adding endpoints or saving and later editing a query. The results are
displayed in the browser and can also be stored locally, in XML format.
The application is written in both Java and Flex. It uses the Jena and ARQ
application programming interfaces to perform the queries, and the results
are processed and displayed using Flex.
This paper describes the WebSpa application, as well as the way it is built.
The first chapter contains a brief explanation of the concepts of SPARQL and
SPARQL endpoint, and of the technologies used for building the application:
Java (Jena, ARQ) and Flex. The next chapter presents the interface,
functionality and storage capabilities of the application, and then describes
the front-end and the back-end. Conclusions are drawn and perspectives are
suggested in the final chapter.
Technologies
1 SPARQL
SPARQL[2] (pronounced “sparkle”; a recursive acronym for SPARQL Protocol and
RDF Query Language) is an RDF query language. It is a recent W3C
Recommendation, of which Sir Tim Berners-Lee said that it “will make a huge
difference”. RDF is foundational to the Semantic Web. Until SPARQL’s launch,
RDF had a data model, a formal semantics, and a concrete serialization (in
XML), but it did not have a standard query language. SPARQL fills that gap
and offers the Semantic Web and Web 2.0 a common data manipulation language
in the form of expressive queries against the RDF data model. Using WSDL 2.0,
the SPARQL Protocol for RDF describes a very simple web service with a single
operation, query, which is available with both HTTP and SOAP bindings. This
operation is how SPARQL queries are sent to other sites and how the results
are returned. The HTTP bindings are REST-friendly, and a simple SPARQL
protocol client takes little code to implement.
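To illustrate how little code the HTTP binding needs, the following minimal sketch builds the URL of a query operation sent over HTTP GET, where the query text travels in the "query" request parameter. The class name SparqlGetUrl and the example.org endpoint are assumptions made for illustration; they are not part of WebSpa.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of WebSpa): builds the URL of a SPARQL
// protocol "query" operation sent over HTTP GET.
class SparqlGetUrl {

    // The protocol carries the query text in the "query" request parameter.
    static String build(String endpointUrl, String sparqlQuery) {
        String encoded = URLEncoder.encode(sparqlQuery, StandardCharsets.UTF_8);
        // Use '&' when the endpoint URL already carries other parameters.
        String separator = endpointUrl.contains("?") ? "&" : "?";
        return endpointUrl + separator + "query=" + encoded;
    }

    public static void main(String[] args) {
        // The endpoint below is a placeholder, not a real service.
        System.out.println(build("http://example.org/sparql", "ASK { ?s ?p ?o }"));
    }
}
```

Sending an HTTP GET request to the resulting URL is then enough to run the query; the response body carries the results.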
SPARQL consists of three separate specifications. The first is the query
language specification, which makes up the core. The second is the query
results XML format, which describes an XML format for serializing the results
of SELECT and ASK queries. The third is the data access protocol, which uses
WSDL 2.0 to define simple SOAP and HTTP protocols for remotely querying RDF
databases (or any data repository that can be mapped to the RDF model).
Altogether, SPARQL comprises a query language, a means of conveying a query
to a query processor service, and an XML format in which the results are
returned.
Some issues are not yet addressed by SPARQL. The most notable is that it
cannot modify an RDF dataset (it is read-only). RDF is built on the triple, a
3-tuple consisting of a subject, a predicate and an object. SPARQL is
likewise built on the triple pattern, which also consists of a subject, a
predicate and an object. SPARQL matches patterns in an RDF graph using triple
patterns, which are like triples except that they may contain variables in
place of concrete values (the variables act as “wildcards” that match RDF
terms in the dataset).
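The wildcard behaviour of variables can be sketched without any RDF library. The code below is an illustrative toy, not how ARQ evaluates queries: triples and patterns are plain three-element string arrays, a term starting with "?" is treated as a variable, and the class name TriplePatternDemo and the sample data are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration of triple-pattern matching (hypothetical, library-free).
class TriplePatternDemo {

    // A term starting with '?' is a variable; anything else is a concrete value.
    static boolean isVariable(String term) {
        return term.startsWith("?");
    }

    // Matches one pattern against one triple.
    // Returns the variable bindings, or null if the triple does not match.
    static Map<String, String> match(String[] pattern, String[] triple) {
        Map<String, String> bindings = new HashMap<>();
        for (int i = 0; i < 3; i++) {
            if (isVariable(pattern[i])) {
                String previous = bindings.putIfAbsent(pattern[i], triple[i]);
                // A variable used twice must bind to the same term both times.
                if (previous != null && !previous.equals(triple[i])) {
                    return null;
                }
            } else if (!pattern[i].equals(triple[i])) {
                return null;
            }
        }
        return bindings;
    }

    // Collects one solution (set of bindings) per matching triple.
    static List<Map<String, String>> select(String[] pattern, List<String[]> data) {
        List<Map<String, String>> solutions = new ArrayList<>();
        for (String[] triple : data) {
            Map<String, String> bindings = match(pattern, triple);
            if (bindings != null) {
                solutions.add(bindings);
            }
        }
        return solutions;
    }

    // Made-up sample data: subject, predicate, object.
    static List<String[]> sampleData() {
        return List.of(
            new String[] {"ex:alice", "foaf:knows", "ex:bob"},
            new String[] {"ex:alice", "foaf:name", "Alice"},
            new String[] {"ex:bob", "foaf:knows", "ex:carol"});
    }

    public static void main(String[] args) {
        String[] pattern = {"?who", "foaf:knows", "?friend"};
        for (Map<String, String> solution : select(pattern, sampleData())) {
            System.out.println(solution.get("?who") + " knows " + solution.get("?friend"));
        }
    }
}
```

Each solution corresponds to one row of a SELECT result, with one column per variable.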
The SELECT query can be used to extract data from an RDF graph, returning it
as a tabular result set. More complex graph patterns can combine required
parts with OPTIONAL ones, and UNION offers a way of selecting alternatives
from the dataset. It is possible to apply ordering to the results, jump
forward through the results using OFFSET, and LIMIT the amount of data
returned. The SPARQL Query Results XML Format specification includes several
relevant examples. Given its simple and regular structure, manipulating this
format with XSLT or XQuery is fairly straightforward.
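The regular structure of the results format also makes it easy to read with a standard XML parser. The sketch below uses the DOM parser shipped with Java to turn a results document into a list of rows; the class name SparqlResultsReader and the sample document are assumptions made for illustration, and the sketch ignores datatypes, language tags and blank nodes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical reader for the SPARQL Query Results XML Format (SELECT results).
class SparqlResultsReader {

    static final String NS = "http://www.w3.org/2005/sparql-results#";

    // Made-up sample document with two result rows.
    static final String SAMPLE =
        "<sparql xmlns=\"" + NS + "\">"
        + "<head><variable name=\"s\"/><variable name=\"name\"/></head>"
        + "<results>"
        + "<result><binding name=\"s\"><uri>http://example.org/a</uri></binding>"
        + "<binding name=\"name\"><literal>Alice</literal></binding></result>"
        + "<result><binding name=\"s\"><uri>http://example.org/b</uri></binding>"
        + "<binding name=\"name\"><literal>Bob</literal></binding></result>"
        + "</results></sparql>";

    // Returns one map per <result>, keyed by variable name.
    static List<Map<String, String>> read(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<Map<String, String>> rows = new ArrayList<>();
            NodeList results = doc.getElementsByTagNameNS(NS, "result");
            for (int i = 0; i < results.getLength(); i++) {
                Element result = (Element) results.item(i);
                Map<String, String> row = new LinkedHashMap<>();
                NodeList bindings = result.getElementsByTagNameNS(NS, "binding");
                for (int j = 0; j < bindings.getLength(); j++) {
                    Element binding = (Element) bindings.item(j);
                    // The text of the <uri> or <literal> child is the bound value.
                    row.put(binding.getAttribute("name"), binding.getTextContent().trim());
                }
                rows.add(row);
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable results document", e);
        }
    }

    public static void main(String[] args) {
        for (Map<String, String> row : read(SAMPLE)) {
            System.out.println(row);
        }
    }
}
```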
The syntax shortcuts make writing queries much simpler. These are especially
useful with repetitive graph patterns and long URIs. SPARQL presents itself as
the long-awaited missing piece of the Semantic Web and Web 2.0.
A SPARQL endpoint is a conformant SPARQL protocol service as defined in
the SPROT (SPARQL Protocol for RDF) specification. A SPARQL endpoint
enables users (human or other) to query a knowledge base via the SPARQL
language. Results are typically returned in one or more machine-processable for-
mats. Therefore, a SPARQL endpoint is mostly conceived as a machine-friendly
interface towards a knowledge base. Both the formulation of the queries and the
human-readable presentation of the results should typically be implemented by
the calling software, and not be done manually by human users.
At the time of writing, there is no agreed-upon description format for SPARQL
endpoints. Endpoint descriptions could be used to announce endpoint
capabilities and contents, support discovery through service directories, and
supply browsing and federation hints.
2 Jena and ARQ
Jena[3] is an open source Semantic Web framework for Java. It provides an
API to extract data from and write to RDF graphs, which are presented as an
abstract “model”. This model can be queried through SPARQL and updated
through SPARUL[4].
Jena uses the concept of a graph for dealing with the data: the nodes
correspond to resources (identified by URIs) and literals, while the edges
correspond to the triples.
The graphs are represented through the Model interface, which has different
implementations: a memory-based one, one which uses a relational database, etc.
The memory-based model is the simplest and the easiest to use.
A triple is represented through an interface called Statement. A statement
corresponds to an edge in the graph and consists of three parts:
– the subject - the resource from which the arc leaves - implements the
Resource interface;
– the predicate - the property (the label of the arc) - implements the
Property interface;
– the object - the resource or literal to which the arc points - implements
the Resource or the Literal interface.
The components of the statement have a common base - the RDFNode interface.
The object component is more complex. A statement can be used as the object
component of the triple, since RDF allows nested statements. Objects imple-
menting the Container, Alt, Bag, or Seq interface can also be used as objects.
An RDF Model is represented as a set of statements. The components of a
statement can be accessed through the getSubject, getPredicate and getObject
methods of the Statement interface. The API provides methods for the most
common operations:
– addProperty - adds a new statement (triple) to the model;
– listSubjects - lists the subject component of each triple from the model;
– listObjects - lists the object component of each triple from the model;
– write - writes the model in RDF/XML format to the output stream given as
parameter;
– read - reads the statements in RDF/XML format into a model.
ARQ is a query engine for Jena that supports the SPARQL language. In addition
to implementing SPARQL, ARQ’s query engine can also parse queries expressed
in RDQL[5] or its own internal query language.
3 Flex
The Flex framework provides the declarative language, application services, com-
ponents, and data connectivity developers need to rapidly build rich Internet
applications (RIAs) for the browser or desktop. Flex 3 is a powerful framework
that provides enterprise-level components for the Flash Player platform in a
markup language format recognizable to anyone with HTML or XML development
experience. The Flex Framework provides components for visual layout,
visual effects, data grids, server communication, charts, and much more.
MXML is the language developers use to define the layout, appearance, and be-
haviors of a Flex application. ActionScript 3, an object-oriented language based
on industry-standard ECMAScript, is the language that defines the client-side
application logic. Your MXML and ActionScript are compiled together into a
single SWF file that makes up your Flex application. Because the compiler is
available both as a standalone utility in the Flex 3 SDK and as part of Adobe
Flex Builder 3 software, developers can choose to develop in the Eclipse-based
Flex Builder IDE or in an IDE of their choice.
Flex includes a pre-built class library and application services that help develop-
ers assemble and build RIAs. These services include data binding, drag-and-drop
management, the display system that manages the interface layout, the style sys-
tem that manages the look and feel of interface components, and the effects and
animation system that manages motion and transitions. The component library
provides all of the user interface controls that developers need, from simple but-
tons, checkboxes, and radio buttons to complex data grids, combo boxes, and
rich text editors. Use the provided containers to design complex, adaptive lay-
outs with ease, and use (or modify) the visually stunning skins to achieve an
ideal look and feel.
The Adobe AIR runtime extends web applications to the desktop, creating new
opportunities for more engaging, higher performing online/offline applications.
The Flex framework provides native support for the new AIR APIs, and Flex
Builder 3 provides all the tools necessary to build, debug, package, and sign
applications built on Adobe AIR.
Most Flex applications that are designed using just the Flex framework are
structured as follows. On one side, there are properly decoupled and reusable
view components that know nothing about the rest of the application; they
dispatch events and use data binding from parent views and to child views. On
the other side, a main application container (the Application root tag) acts
both as a controller and as a model, sometimes delegating tasks to “utils”
classes that handle things such as RPC communications. In other words, there
is one big master object, which can quickly turn into a spaghetti-code
monster. If the application is quite simple and no long-term maintenance is
required, this might not be such a bad choice.
Using events and data binding, Flex can achieve very good decoupling of view
components. But the main view container will either have to handle all the
logic by itself, or explicitly delegate it to a controller class with which
it will then be very tightly coupled. The main problem is that user
interaction events dispatched by the views cannot directly reach a separate
application controller, unless it is a view. To have a separate controller
handle these events and take actions, the events have to climb the display
list up to the main application container (the root application tag), which
may then pass them on to the controller.
WebSpa
4 Functionality
WebSpa is a tool with which one can query arbitrary SPARQL endpoints in a
more intuitive way. It provides pre-defined SPARQL endpoints and supports the
addition of new ones. Anyone can use the application, but a user account is
required in order to perform certain operations, such as adding endpoints or
saving and later editing a query. The results are displayed in the browser
and can also be stored locally, in XML format.
The interface is easy to use and intuitive.
The top bar contains user-related information: the name of the currently
logged-in user and a form for registering, logging in or logging out. The top
right menu can be used for both login and registration, by ticking or
un-ticking the “I’m new” checkbox. The submit button changes accordingly,
into either “create” or “log in”.
The menu contains three groups of actions:
Query
Result
More
The first group contains actions related to the queries. A logged-in user can
choose to create, open or save a query. The same actions are present in the
second menu group, but they apply to the results of the query. The last menu
item provides information about the application.
The screen is divided into two parts: the top window is bound to the query,
while the bottom window is used for managing the results.
The top window contains a menu which helps a less experienced user to write
queries. The “select endpoint” dropdown contains predefined SPARQL endpoints.
A user who has created an account can add and remove their own endpoints.
These actions can be performed through the group of options next to the menu:
“Refresh”, “Save”, “Delete”. The next dropdown contains predefined RDF
Schemas that are automatically inserted in a query when the user selects one.
The “template” menu contains four options, corresponding to the four types
of SPARQL queries supported - SELECT, ASK, DESCRIBE, CONSTRUCT.
When a user selects one of these items, a pattern of the query is written in the
query window. The user can then personalize this pattern. For example, choosing
the “SELECT” option inserts the following content:
SELECT DISTINCT ?s ?p ?o
WHERE
{
?s ?p ?o .
}
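One way such templates could be kept is a simple lookup table keyed by query form. The sketch below is hypothetical and written in Java for consistency with the back-end, although the actual front-end is built in Flex; only the SELECT skeleton comes from the text above, while the ASK, DESCRIBE and CONSTRUCT skeletons and the class name QueryTemplates are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical template table; only the SELECT skeleton is taken from the paper.
class QueryTemplates {

    static final Map<String, String> TEMPLATES = new LinkedHashMap<>();

    static {
        TEMPLATES.put("SELECT", "SELECT DISTINCT ?s ?p ?o\nWHERE\n{\n?s ?p ?o .\n}");
        // The three skeletons below are assumed, chosen to be valid SPARQL.
        TEMPLATES.put("ASK", "ASK\n{\n?s ?p ?o .\n}");
        TEMPLATES.put("DESCRIBE", "DESCRIBE ?s\nWHERE\n{\n?s ?p ?o .\n}");
        TEMPLATES.put("CONSTRUCT", "CONSTRUCT\n{\n?s ?p ?o .\n}\nWHERE\n{\n?s ?p ?o .\n}");
    }

    // Returns the skeleton for a query form, ignoring letter case.
    static String templateFor(String queryForm) {
        String template = TEMPLATES.get(queryForm.toUpperCase(Locale.ROOT));
        if (template == null) {
            throw new IllegalArgumentException("Unknown query form: " + queryForm);
        }
        return template;
    }

    public static void main(String[] args) {
        System.out.println(templateFor("select"));
    }
}
```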
The “Statement” menu contains code editing features (comment/uncomment,
indent/remove indent), statements applicable to the current selection, and
example (fill-in) statements.
The last item of the bar is a group of buttons that provide quick access to
the most important actions - creating a new query, opening an existing one,
saving a query.
A bar with options divides the screen into two windows. It contains items
that help users manage the queries. The “Run Query” button runs a query, and
the results are displayed in the window below. The query can be given a name
and saved in the database for later use. A user can load their saved queries,
which can then be edited or saved as new queries.
5 Front-End
The front-end, built in Adobe Flex Builder, uses the Adobe Flex Framework
3.2.0 as its backbone. The framework does not impose a strong MVC constraint,
which has both advantages and disadvantages, as discussed earlier.
Like most projects built with Adobe Flex Builder, WebSpa is a Web application
(meaning that it runs in Flash Player, the Adobe runtime for web browsers).
The project targets version 10.0.0 of Flash Player. This is a technical
requirement, because this version of the Player is the first to offer an API
for working with files directly: saving and opening local files on the
client’s machine. Previous versions of Flash Player made this possible only
via server-side scripts (the server was used as a proxy). From now on we
refer to Adobe Flex Builder simply as Flex.
The project has the WebSpa.mxml file as its “entry point”. This MXML file
describes the layout and content of the header and of the footer of the
application. This class also manages users (creates, logs in or logs out
users) and offers an application menu. Within the project we created a Config
class which acts as a holder for configuration data, such as the server
paths, project internal data and other settings. Changing the location of the
project’s server-side application is much easier this way: after a one-line
edit and a recompile, the application is “good to go”. All properties within
this class are static.
All styles within the application (if altered) are managed from a single CSS
file (Main.css).
For the other pages within the project we used MXML components, which makes
it easy to extend the application in the future if desired. There are two
such components: About.mxml and ClassicWebSpa.mxml. The About page is
self-explanatory. ClassicWebSpa is our SPARQL querying tool, which interacts
with the Java server side to provide results for the queries written by
users.
On this page the user can refresh / save / delete endpoints and save / load
queries on the server side, with access to them via their account. These
options are available only if the user is logged in. The user can also create
new / open / save queries or results locally. Two dedicated file formats are
offered (*.wsq, which stands for WebSpaQuery, and *.wsr, which stands for
WebSpaResult), so that users can manage and locate the files related to our
application more easily.
In short, our application helps the user execute a SPARQL query against an
endpoint and retrieve the result of that query for later manipulation. The
application menu offers much the same functionality as the ClassicWebSpa
page, plus the link to the About page.
For all communication between the front-end and the back-end we use a wrapper
class that automates the server calls (it takes the server-side script, the
GET parameters, the POST parameters and the callback to run on success). This
makes things much more readable and easier to manage in the rest of the
classes.
For local file management we use two static classes, which handle saving and
opening queries and/or results.
The project can be extended easily in the future, because its components are
decoupled from one another.
6 Back-End
The back-end is written in Java, using the Jena framework and the ARQ support
for SPARQL that it provides. The code is organized in three packages:
webspa.persistance
webspa.servlet
webspa.sparql
The package webspa.servlet contains the central point of the application, the
class WebSpaServlet. It is a servlet that communicates with the Flex
front-end of the application and redirects each call to the appropriate
method. The servlet is also responsible for starting the SQL transaction, as
well as for committing the changes. The servlet contains a HibernateBean
object, whose methods are called for each action. The requests coming from
the front-end are analyzed in order to determine which method to call, based
on the value of the “fn” parameter. Once the method is identified, it is
passed some general parameters: the request, the response and the
EntityManager object (used for managing the communication with the database
tables). Some of the methods require additional parameters, such as boolean
values or certain parameters specified in the request. Each of the
HibernateBean’s methods returns the response as a String, which is then
passed to Flex for further processing, using the HttpServletResponse’s
PrintWriter object:
if (method.equalsIgnoreCase("create-user")) {
    response.getWriter().write(
        hBean.registerUser(request, response, entityManager));
}
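The dispatch on the “fn” parameter can also be expressed as a lookup table instead of a chain of if statements. The following is a minimal, library-free sketch of that idea and not the actual WebSpaServlet code: the class name FnDispatcher, the use of a plain parameter map in place of the servlet request, and the error strings are all assumptions.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical dispatcher: maps the value of the "fn" parameter to a handler.
class FnDispatcher {

    // Each handler receives the request parameters and returns the response text.
    private final Map<String, Function<Map<String, String>, String>> handlers = new HashMap<>();

    void register(String fn, Function<Map<String, String>, String> handler) {
        // Keys are stored in lower case so that lookups ignore letter case,
        // mirroring the equalsIgnoreCase comparison in the snippet above.
        handlers.put(fn.toLowerCase(Locale.ROOT), handler);
    }

    String dispatch(Map<String, String> params) {
        String fn = params.get("fn");
        if (fn == null) {
            return "error: missing fn parameter";
        }
        Function<Map<String, String>, String> handler = handlers.get(fn.toLowerCase(Locale.ROOT));
        if (handler == null) {
            return "error: unknown fn " + fn;
        }
        return handler.apply(params);
    }

    public static void main(String[] args) {
        FnDispatcher dispatcher = new FnDispatcher();
        // Stand-in for a real handler such as hBean.registerUser(...).
        dispatcher.register("create-user", p -> "created " + p.get("username"));
        System.out.println(dispatcher.dispatch(Map.of("fn", "create-user", "username", "ana")));
    }
}
```

With this layout, supporting a new action means registering one more handler rather than adding another branch.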
The method that executes the SPARQL query is accessed differently. A SparQL
object is created and its executeSpaQuery method is called with the following
parameters: the query’s text, the endpoint’s URL and the PrintWriter object.
The method itself is responsible for writing the response to this object.
Package webspa.persistance contains the classes used for the persistence layer
of the application, which is implemented with Hibernate and the Java
Persistence API (JPA). The data is stored in a MySQL database. There are five
classes mapped to the database tables:
class User - table USERS - manages the users of the application.
class EndPoint - table ENDPOINT - stores endpoint information, such as the
URL, the type (default or added by user), the status etc.
class UserEndPoint - table USERENDPOINT - contains a one-to-many mapping
between a user and their additional endpoints.
class Model - table MODEL - stores RDF Schemas.
class Query - table QUERY - stores saved queries.
Each class has an associated .hbm.xml file that specifies the mapping between
class variables and table columns. For example, the User class has the following
variables:
@Id
@GeneratedValue(strategy = GenerationType.AUTO)
private Integer uid;
private String username;
private String password;
private String lastIp;
@Column(name="created")
private Timestamp creationDate;
@Column(name="lastlogin")
private Timestamp lastLoginDate;
The tokens preceded by “@” are JPA annotations. The corresponding .hbm.xml
file contains the mapping:
<hibernate-mapping>
  <class dynamic-insert="false" dynamic-update="false"
         mutable="true" optimistic-lock="version"
         polymorphism="implicit" select-before-update="false"
         name="webspa.persistance.User"
         table="users">
    <id column="UID" name="uid">
      <generator class="increment"/>
    </id>
    <property column="username" name="username"/>
    <property column="password" name="password"/>
    <property column="lastip" name="lastIp"/>
    <property column="created" name="creationDate"/>
    <property column="lastlogin" name="lastLoginDate"/>
  </class>
</hibernate-mapping>
A class named HibernateBean is responsible for the communication with the
database. HibernateBean contains methods for retrieving, saving and updating
endpoints, queries and/or results. Each of the methods follows a similar
pattern: validation of the parameters, creation and execution of the query,
and formatting and sending of the result. The methods implemented are:
registerUser
loginUser
logoutUser
createEndPoint
hideShowEndPoint
saveQuery
updateQuery
checkUserById
checkField
checkEndPointByUserUrl
getLoggedUserId
getEndPoints
getModels
getQueries
Besides the class that communicates with the database, the back-end contains
the class SparQL, in the package webspa.sparql. Using the Jena and ARQ APIs,
the class runs queries over SPARQL endpoints. The process of sending a query
and getting a result is simple. First, a QueryExecution object is obtained
using ARQ’s QueryExecutionFactory class and its sparqlService method. This
method takes two arguments, the endpoint URL and the query, both provided by
the user through the interface.
QueryExecution queryExecution = QueryExecutionFactory.
sparqlService(endpoint, query);
A pattern is used in order to determine the type of the query - SELECT, ASK,
DESCRIBE, CONSTRUCT. Based on the type, different methods are called:
execSelect
execAsk
execDescribe
execConstruct
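The paper does not show the pattern used to determine the query type, so the sketch below is only one plausible way of doing it. It uses a regular expression that skips any leading BASE and PREFIX declarations and then reads the query form keyword; the class name QueryFormDetector is an assumption, and the sketch does not handle "#" comments placed before the keyword.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical detector for the form of a SPARQL query (not the WebSpa code).
class QueryFormDetector {

    enum QueryForm { SELECT, ASK, DESCRIBE, CONSTRUCT }

    // Optional prologue (BASE / PREFIX declarations), then the form keyword.
    private static final Pattern FORM = Pattern.compile(
        "^\\s*(?:(?:BASE\\s*<[^>]*>|PREFIX\\s+[^\\s:]*:\\s*<[^>]*>)\\s*)*"
            + "(SELECT|ASK|DESCRIBE|CONSTRUCT)\\b",
        Pattern.CASE_INSENSITIVE);

    static QueryForm detect(String query) {
        Matcher matcher = FORM.matcher(query);
        if (!matcher.lookingAt()) {
            throw new IllegalArgumentException("Not a SELECT, ASK, DESCRIBE or CONSTRUCT query");
        }
        return QueryForm.valueOf(matcher.group(1).toUpperCase(Locale.ROOT));
    }

    // Name of the QueryExecution method listed above for each query form.
    static String execMethodName(QueryForm form) {
        switch (form) {
            case SELECT:   return "execSelect";
            case ASK:      return "execAsk";
            case DESCRIBE: return "execDescribe";
            default:       return "execConstruct";
        }
    }

    public static void main(String[] args) {
        String query = "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
            + "SELECT ?name WHERE { ?p foaf:name ?name }";
        System.out.println(execMethodName(detect(query)));
    }
}
```

Anchoring the match after the prologue avoids being misled by a keyword that merely appears inside a prefix name or IRI.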
The ResultSetFormatter’s static method - asXMLString - is used for returning
the results of the SELECT and ASK queries as XML.
ResultSet results = queryExecution.execSelect();
printWriter.write(ResultSetFormatter.asXMLString(results));
The other two query types return their results as a Model object. These
objects write directly to the PrintWriter that communicates with Flex, via
their “write” method.
Model model = queryExecution.execDescribe();
model.write(printWriter);
Conclusion and Perspectives
WebSpa is a web-based application that manages SPARQL endpoints and allows
users to run queries against them. It provides pre-defined SPARQL endpoints
and supports the addition of new ones. Anyone can use the application, but a
user account is required in order to perform certain operations, such as
adding endpoints or saving and later editing a query. The results are
displayed in the browser and can also be stored locally, in XML format.
The application is written in both Java and Flex. It uses the Jena and ARQ
application programming interfaces to perform the queries, and the results
are processed and displayed using Flex.
There are two versions of WebSpa - the classic one, which has been presented in
this paper, and the graphical one, which is still under development. The latter
allows the user to handle the RDF graph by visualizing its nodes and arcs and
directly manipulating them.
Limitations of the Classic WebSpa refer mainly to the way content is displayed;
e.g., it might be a little difficult for users to manually browse to the XML
file in order to view the results.
References
[1] Tim Berners-Lee, James Hendler and Ora Lassila. The Semantic Web. Scientific
American, May 2001.
[2] W3C. SPARQL Query Language for RDF. http://www.w3.org/TR/rdf-sparql-query/
[3] Jena Project Page. http://jena.sourceforge.net/
[4] W3C. SPARQL Update. http://www.w3.org/Submission/SPARQL-Update/
[5] W3C. RDQL - A Query Language for RDF.
http://www.w3.org/Submission/RDQL/
[6] Adobe Flex 3. http://www.adobe.com/ro/products/flex/
[7] ARQ - A SPARQL Processor for Jena.
http://jena.sourceforge.net/ARQ/
[8] Monica Macoveiciuc, Constantin Stan. Flex Framework Presentation.