
Comparative Study That Aims Rdf Processing For The Java Platform


Comparative study that aims RDF processing for the Java platform

Burada Alexandru and Prelipcean Gheorghe
Computer Science Faculty, "Al. I. Cuza" University of Iasi
Software Systems Engineering specialization

Abstract. The Semantic Web represents the next evolution of the World Wide Web, in which information is identified by its semantics and can be understood by both software and human beings. RDF is a metadata framework that helps software understand information. Processing RDF on the Java platform is an important topic today because compiled Java code is portable. The aim of this document is to present a comparative study of the most popular APIs for RDF processing available for the Java platform.

Keywords: java, RDF

1 Introduction

The World Wide Web was built as a network of documents meant to be understood by humans. Its success came from the possibility of posting documents and retrieving them from other people's servers. In the beginning, HTTP servers and browsers were responsible only for delivering and rendering content as HTML and images. All processing and understanding was left to humans. Support for server-side processing, such as CGI and the Java Servlet API, came later. Browsers evolved to support JavaScript and other scripting languages, so a great deal of processing can now be done either on the server side or on the client side.

Today, with a wide variety of information, browsers and end-user devices, it is practically impossible for web application developers to keep up by constantly rewriting server-side and client-side code. The major problem is the inability of web applications to understand content. Learning to understand information about content, or metadata, is a major step toward a solution.

The Resource Description Framework (RDF) is a standard designed for Web applications that depend on machine-understandable metadata, and it supports interoperability between such applications. RDF is used to create models of metadata that can be understood by processing agents. RDF is also a simple graph-based data model for representing information on the Web.
An RDF graph is a set of triples; each triple consists of a subject, a predicate and an object. A triple states a relationship between its subject and its object, and the predicate names that relationship.

2 JRDF - An RDF Library in Java

JRDF [2] is a standard set of APIs that tries to cover the most important features needed by Java developers. It also includes a base implementation of RDF. Its main features are persistent graphs, a SPARQL GUI, distributed and local SPARQL servers, RDF parsers, a relational query model, SPARQL support, BDB MapFactory, an RDF writer and global graphs.

JRDF supports the following Java versions:
- Java 1.5, which requires the StAX library (ref). StAX is an XML parsing interface that emerged alongside DOM and SAX (Simple API for XML). Using StAX, an application can progressively load only what it needs for processing.
- Java 1.6

To store triples, JRDF uses in-memory and disk-based graphs behind a standard system-level interface. Persistence follows a simple pattern: load the RDF, make some changes, then save. Changes are not written to disk until the factory is closed. This operation covers all graphs, which means that a single specific graph cannot be stored on its own.

The JRDF model uses a hierarchy of interfaces whose base is Node. Node is specialized by Subject, Predicate and Object, and each of these triple elements is in turn specialized by blank node, URI and literal. By default, JRDF requires a URI reference to be an absolute URI. Literals are created through the Element Factory.
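The interface hierarchy described above can be sketched in plain Java. This is only an illustration under stated assumptions: the type names (SubjectNode, UriRef, Triple and so on) are hypothetical and are not JRDF's actual API, and the absolute-URI check simply mirrors the default behavior mentioned in the text.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch of a node hierarchy for triples; not JRDF's real classes.
public class TripleModelSketch {

    // Base of the hierarchy: every element of a triple is a node.
    interface Node {}
    interface SubjectNode extends Node {}
    interface PredicateNode extends Node {}
    interface ObjectNode extends Node {}

    // A URI reference may appear in any position of a triple.
    static final class UriRef implements SubjectNode, PredicateNode, ObjectNode {
        final URI uri;

        UriRef(String value) {
            URI parsed = URI.create(value);
            // Assumption taken from the text: only absolute URIs are accepted.
            if (!parsed.isAbsolute()) {
                throw new IllegalArgumentException("URI reference must be absolute: " + value);
            }
            this.uri = parsed;
        }

        @Override public boolean equals(Object o) {
            return o instanceof UriRef && ((UriRef) o).uri.equals(uri);
        }

        @Override public int hashCode() { return uri.hashCode(); }
    }

    // A blank node has no name; two blank nodes are equal only if they are the same object.
    static final class BlankNode implements SubjectNode, ObjectNode {}

    // A literal can only be the object of a triple.
    static final class Literal implements ObjectNode {
        final String lexicalForm;

        Literal(String lexicalForm) { this.lexicalForm = lexicalForm; }

        @Override public boolean equals(Object o) {
            return o instanceof Literal && ((Literal) o).lexicalForm.equals(lexicalForm);
        }

        @Override public int hashCode() { return lexicalForm.hashCode(); }
    }

    static final class Triple {
        final SubjectNode subject;
        final PredicateNode predicate;
        final ObjectNode object;

        Triple(SubjectNode subject, PredicateNode predicate, ObjectNode object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Triple)) return false;
            Triple t = (Triple) o;
            return subject.equals(t.subject) && predicate.equals(t.predicate) && object.equals(t.object);
        }

        @Override public int hashCode() { return Objects.hash(subject, predicate, object); }
    }

    // A graph is a set of triples, so adding the same triple twice has no effect.
    static Set<Triple> sampleGraph() {
        Set<Triple> graph = new LinkedHashSet<>();
        UriRef doc = new UriRef("http://example.org/doc");
        UriRef title = new UriRef("http://example.org/terms/title");
        UriRef author = new UriRef("http://example.org/terms/author");
        graph.add(new Triple(doc, title, new Literal("RDF processing in Java")));
        graph.add(new Triple(doc, author, new BlankNode()));
        graph.add(new Triple(doc, title, new Literal("RDF processing in Java"))); // duplicate
        return graph;
    }

    public static void main(String[] args) {
        System.out.println("Triples in graph: " + sampleGraph().size());
    }
}
```

Because the position interfaces are separate types, the compiler rejects a literal used as a subject or predicate, which is the main benefit of modelling the hierarchy this way.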
SPARQL support

JRDF provides a simple GUI that allows users to load RDF/XML and N3 documents and execute SPARQL queries. Suppose we have just downloaded the latest version of JRDF (the build from Nov 20 2009) and want to execute some SPARQL queries on our RDF documents. We can simply type the command

    java -jar jrdf-gui-0.5.6.jar

to start the JRDF GUI (a Java Swing application). The user can load an RDF document and execute SPARQL queries over this file. The application shows the result of the query and the time taken by the operation, or the cause of the exception if the query fails.

The documentation that comes with the library is not substantial: it exists only on the wiki page, which explains how to use the most important features of JRDF. It is oriented to developers, since it provides many examples and test cases bundled with the binary distribution. The library is still under development, and its authors plan to add support for transactions, security, event handling for adding or removing nodes from a graph, and an RDF-to-Java-object API similar to Hibernate.

License. The JRDF library is licensed under the Apache License 2.0.

3 Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema

Sesame is a generic architecture for storing and querying large quantities of metadata in RDF and RDF Schema. Sesame is implemented independently of any particular storage device, which means that relational databases, triple stores and object-oriented databases can all be used for storage without changing the code of the query engine or of other functionality. It also offers support for concurrency control, export of RDF and RDF Schema information, and a query engine for RQL.

Storing metadata requires a repository, but this raises a problem: there are many database management systems, each with its own strengths and weaknesses, targeted platforms and APIs. Sesame therefore chose another strategy: to concentrate all DBMS-specific code in a single layer called the Storage And Inference Layer (SAIL). This layer offers RDF-specific methods to its clients and translates them into calls to the specific DBMS.

The RQL query module. The RQL module is one of the modules implemented in Sesame, and it performs a series of steps to execute an RQL query. The execution cycle is as follows: Sesame parses the RQL query into an object model, the Sesame query optimizer transforms that model into an optimized query model, and the optimized model is evaluated as a set of calls to the SAIL. A natural consequence of evaluating queries against the SAIL is that Sesame has to devise its own optimization techniques in the engine and in the SAIL API implementation, since it cannot rely on any given DBMS to do this.
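The idea of concentrating all storage-specific code behind one layer can be illustrated with a small Java sketch. Everything here is an assumption made for illustration: the names StorageLayer, InMemoryLayer and objectsOf are hypothetical and are not Sesame's SAIL API, and a real layer would translate these calls into operations on a specific DBMS rather than on a list in memory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a storage abstraction layer; not Sesame's actual SAIL interfaces.
public class StorageLayerSketch {

    static final class Statement {
        final String subject;
        final String predicate;
        final String object;

        Statement(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    // The only contract that query code is allowed to depend on.
    interface StorageLayer {
        void add(String subject, String predicate, String object);

        // A null argument acts as a wildcard for that position.
        List<Statement> match(String subject, String predicate, String object);

        void clear();
    }

    // One possible backend. A relational or disk-based backend would implement
    // the same interface without any change to the query code below.
    static final class InMemoryLayer implements StorageLayer {
        private final List<Statement> statements = new ArrayList<>();

        @Override public void add(String subject, String predicate, String object) {
            statements.add(new Statement(subject, predicate, object));
        }

        @Override public List<Statement> match(String subject, String predicate, String object) {
            List<Statement> result = new ArrayList<>();
            for (Statement st : statements) {
                if (matches(subject, st.subject)
                        && matches(predicate, st.predicate)
                        && matches(object, st.object)) {
                    result.add(st);
                }
            }
            return result;
        }

        @Override public void clear() { statements.clear(); }

        private static boolean matches(String pattern, String value) {
            return pattern == null || pattern.equals(value);
        }
    }

    // "Query engine" code: it sees only the interface, never the backend.
    static List<String> objectsOf(StorageLayer layer, String subject, String predicate) {
        List<String> objects = new ArrayList<>();
        for (Statement st : layer.match(subject, predicate, null)) {
            objects.add(st.object);
        }
        return objects;
    }

    public static void main(String[] args) {
        StorageLayer layer = new InMemoryLayer();
        layer.add("ex:doc", "ex:title", "RDF in Java");
        layer.add("ex:doc", "ex:author", "ex:alice");
        System.out.println(objectsOf(layer, "ex:doc", "ex:title"));
    }
}
```

Since the backend only answers simple pattern matches in this sketch, any join ordering or other optimization has to live in the code above the layer, which is the trade-off the text describes.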
A shortcoming of the SAIL module is that transaction functionality is left unused.

The admin module. This module is used for inserting RDF data and RDF Schema information into a repository. It offers two functions: adding RDF data or schema information, and clearing the repository. Partial delete is planned as a future function. The module retrieves information from an RDF source using a streaming RDF parser (a parser that is part of Jena). This parser hands the information to the admin module one triple at a time, as subject, predicate and object. The current version of Sesame has no support for versioning.

The RDF export module. The RDF export module is a very simple module. It can export the contents of a repository as XML-serialized RDF. The idea behind this module is that it supplies a basis for using Sesame in combination with other RDF tools, as all RDF tools will at least be able to read this format.

Differences between Sesame and other APIs. APIs like Jena or Redland focus on the RDF triple set and leave the interpretation of these triples as an exercise for the user. In SAIL, those RDFS tasks are handled internally. The main reason is the relationship between efficiency and the actual storage model used. Another interesting feature of SAIL is concurrency handling. It was introduced because a given RQL query is broken down into several operations, and consistency has to be assured across those operations.

The documentation offered by the Sesame project is very rich and includes many examples. The project is available under a BSD-style license.

4 Jena - "a Java framework for writing Semantic Web applications"

Jena [5] is open source software under a BSD license. It implements APIs for dealing with Semantic Web building blocks such as RDF and OWL. Jena manages triples through an API called Model; a model can be created from a file on the local file system or from a remote file. An RDF document is represented as a set of statements. Jena uses JDBC (Java Database Connectivity) to bind to an existing RDBMS.

Jena contains a rich set of features for dealing with RDF: methods for reading and writing RDF as XML, saving an RDF model to a file and loading it when needed, the ability to reason using custom rules, and OWL-based ontology processing. It also offers a complete guide to the operations it supports, but few examples in the binary distribution.

Using SPARQL with Jena. Support for SPARQL in Jena is currently available through a module called ARQ. ARQ is under active development and is not yet part of the standard Jena distribution.
Integrating Jena with the IDEs most popular among Java developers is straightforward: for Eclipse there is already a plugin called "Jena 2.0 Library Plugin", and for NetBeans you only need to add the library to the project in which you want to use it.
Conclusion

A very good report on a set of open source triple store systems, with performance tests run in a common hardware, software and dataset environment, is Ryan Lee's Scalability Report on Triple Store Applications [7].

RDF holds great promise for the future of the Web. As the technology matures, it will become increasingly simple to create new applications by generating RDF models. Such models may be generated from very high-level interactive specifications and data analysis. Perhaps in the future non-programmers will be able to build very sophisticated internet applications.

References

1. http://code.google.com/p/jrdf/
2. http://jrdf.sourceforge.net/
3. http://www.openrdf.org/doc/papers/Sesame-ISWC2002.pdf
4. http://www.ibm.com/developerworks/xml/library/x-atomtordf/index.html?ca=drs-&ca=dgf-ip
5. http://jena.sourceforge.net/documentation.html
6. http://hydrogen.informatik.tu-cottbus.de/wiki/index.php/Advanced_Jena_Rules#example9
7. http://simile.mit.edu/reports/stores/
