STRAYER UNIVERSITY




  NETWORK ARCHITECTURE AND ANALYSIS

             SUBMITTED TO

           DR. BRYANT PAYDEN




                  BY

           ADELMAR ESPLANA

          GRADUATE STUDENT IN

MASTER OF SCIENCE IN INFORMATION SYSTEMS

              MARCH 2009

                                               I.      Overview

        “Knowledge is power (But only if you know how to acquire it)." (The Economist, 2003) Knowledge

is awareness and understanding of facts, truths, or information gained through experience and learning. Today, one of the quickest ways to acquire knowledge is through the Web, or WWW (World Wide Web). Using search engines, we can find almost any kind of information. The internet has become a worldwide network of information resources and a powerful communication tool, and the more specific the words typed into a search engine, the more accurate the results. The internet may be used in many ways, such as attending online classes, paying bills, booking vacations, scheduling medical appointments, finding the latest news, and even taking virtual tours of vacation destinations. In spite of all these capabilities, the analysis of material by search engines is, to date, still based on textual indexing. Although search engines have proven remarkably useful in finding information rapidly, they have proven far less useful in providing quality information. “The biggest problem facing users of web search engines today is the quality of the results they get back.” (Brin S. & Page L.) Currently, search engines account only for the vocabulary of documents and have little concept of document quality, thereby producing a lot of junk.

        Since its inception in 1989, the Web has grown, initially as a medium for broadcasting read-only material from heavily loaded corporate servers to the mass of internet-connected consumers. Recently,

the rise of digital video, blogging, podcasting, and social media has revolutionized the way people socially

interact through the web. Other promising developments include the increasing interactive nature of the

interface to the user, and the increasing use of machine-readable information with defined semantics

allowing more advanced machine processing of global information, including machine-readable signed

assertions. (Berners T., 1996a)

        Much of the data we use every day is not part of the web. These data reside in various silos, in different software applications and different places that cannot easily be connected, such as bank statements, photographs, and calendar appointments. To see bank statements, we have to use online banking; to manage our calendar appointments and organize our photo collections, we have to use

social-oriented websites such as MySpace and Facebook. It would be interesting to see on the web photos displayed on a calendar, showing exactly what we were doing when we took them, and a bank statement showing the list of transactions we incurred. Because personal and business information sit in silos managed by different software, we are forced to learn how to use different programs all the time. In addition, since data are controlled by applications, each application must keep the data to itself.

According to the W3C, what we need is a “web of data” (Herman I., 2009). Fielding’s dissertation supports this view: what we need is a way to store and structure our own information, whether permanent or temporary in nature, both for our own use and for others, and to be able to reference and structure the information stored by others, so that it is not necessary for everyone to keep and maintain local copies (Fielding R., 2000).

                                          II.      Purpose of the Paper


        Understanding the underlying concepts and principles behind the web is essential to current and

future implementation initiatives. For this reason, it is the objective of this paper to uncover the root of its

existence, and to examine the fundamental design notion of the following design principles: Independent

specification design, Hypertext Transfer Protocol (HTTP), Uniform Resource Identifier (URI) and Hypertext

Markup Language (HTML). This study also aims to develop a better understanding of the emerging web

standards such as REST, SOA, and the Semantic Web. The paper discusses some of the misconceptions about URI, HTTP, and XML, and the following issues: a) from the REST and Semantic Web points of view, there is no difference between slash-based and parameter-based URI references; b) HTTP is not a data transfer protocol; it is an application protocol (or a coordination language, if you swing that way), and REST does not "run on top of HTTP"; rather, HTTP is a protocol that displays many of the traits of the REST architectural style; c) what is the function of Extensible Markup Language (XML) in Representational State Transfer (REST) and the Semantic Web? Is it true that most deployed REST services return HTML rather than XML? Is it true that REST has no preference for XML?

                                   III.   Foundation of the World Wide Web


        According to Tim Berners-Lee, “The goal of the Web was to be a shared information space through

which people (and machines) could communicate.” (Berners T., 1996a) It was also the original intent that

this so-called “space” ought to span all sorts of information, from different sources and a wide array of formats, and from carefully designed material to spontaneous ideas. In the original design of the web,

he stated the following fundamental design criteria: (Berners T., 1996a)

a)    An information system must be able to record random associations between any arbitrary objects,

      unlike most database systems

                  Database systems are purposely designed to facilitate the storage, retrieval, and generation of structured data. The web concept of “one universal space of information”, by contrast, is based on the principle that almost anything on the web can be linked to any arbitrary object. The power of the Web is that linkage can be established to any document (or, more generally, resource) of any kind in the universe of information, whereas in a database system one has to understand the data structure to establish the relationship.

b)    If two sets of users started to use the system independently, to make a link from one system to another

      should be an incremental effort, not requiring unscalable operations such as the merging of link

      databases

                  In the business environment, integrating two different kinds of systems requires some degree of integration effort, such as merging, importing, or linking databases. The idea of the web, by contrast, was to be able to integrate systems easily. Most systems built in the past involved a great deal of integration effort because of information silos. For this reason, the idea of machines talking to each other holds out the promise of seamless integration.

c)    Any attempt to constrain users as a whole to the use of particular languages or operating systems was

      always doomed to fail. Information must be available on all platforms, including future ones.

                  Platform and language interoperability supports the principle of universal access

      irrespective of hardware or software platform, network infrastructure, language, culture, geographical

      location, or physical or mental impairment.

d)    Any attempt to constrain the mental model users have of data into a given pattern was always doomed to fail. If information within an organization is to be accurately represented in the system, entering or correcting it must be trivial for the person directly knowledgeable.

                 If the interaction between person and hypertext could be so intuitive that the machine-

      readable information space gave an accurate representation of the state of people's thoughts,

      interactions, and work patterns, then machine analysis could become a very powerful management

      tool, seeing patterns in our work and facilitating our working together through the typical problems

      which beset the management of large organizations.

Independent specification design

        The basic principles of the Web proposed in 1989 to meet these design criteria were adopted from a well-known software design principle, “independent specification design”. This design is based on modularity: when a system is modular, the interfaces between its modules hinge on simplicity and abstraction. This allows existing content to remain compatible with new implementations. As technologies evolve and disappear, specifications for the Web’s languages and protocols should be able to adapt to new hardware and software. Built on this principle are the three main components: URI, HTTP, and HTML.

URI or Uniform Resource Identifier

        A URI is a compact string of characters for identifying an abstract or physical resource. It is a simple and

extensible means of identifying a resource. A URI can be further classified as a locator, a name, or both. The

term "Uniform Resource Locator" (URL) refers to the subset of URI that identifies resources via a

representation of their primary access mechanism, rather than identifying the resource by name or by some

other attribute(s) of that resource (Berners T., Fielding R. & L. Masinter, 2005). URNs (Uniform Resource

Names) are used for identification; URCs (Uniform Resource Characteristics), for including meta-

information; and URLs, for locating or finding resources.

        REST treats a URI as a resource identifier, on the simple premise that identifiers should change as infrequently as possible (Fielding R., 2000). The Semantic Web, meanwhile, uses URIs to identify not just Web documents but real-world objects such as people, cars, and abstract ideas; it calls all of these real-world objects or “things” (W3C, 2008b). The meaning of a URI can be derived from each of its letters, U (Uniform), R (Resource), and I (Identifier), as described below:

        Uniform allows consistent usage even when the internal mechanism for accessing resources changes. It allows a common semantic interpretation of syntactic conventions across different types of resources, and it allows new types of resources to work with existing identifiers.

        A resource is, in general, any real-world “thing” that a URI can identify, for example an electronic document, an image, a service, or a source of information with a consistent purpose. Resources need not be accessible via the internet; they can also represent abstract concepts, such as mathematical equations, relationships (e.g., “parent” or “employee”), and values (e.g., zero, one, and infinity).

        Identifier pertains to the information required to distinguish what is being identified from all other things within its scope of identification. The terms “identify” and “identifying” mean distinguishing one resource from another, regardless of how that purpose is accomplished. One of the capabilities the web popularized is the ability to link to documents of any kind in the universe of information. With this in mind, the concept of “identity” is concerned with identifying objects generically; for example, one URI can represent a book that is available in several languages and several data formats.

        HTTP and URIs are the basis of the World Wide Web, yet they are often misunderstood, and their

implementations and uses are sometimes incomplete or incorrect (W3C, 2003).


    a) A common mistake, responsible for many implementation problems, is to think that a URI is

        equivalent to a filename within a computer system. This is wrong as URIs have, conceptually,

        nothing to do with a file system.

    b) A URI should not reveal the underlying technology (server-side content generation engine, a script written in one language or another) used to serve the resource. Using URIs that expose the specific underlying technology makes one dependent on that technology, which in turn means the technology cannot be changed without either breaking URIs or going through the hassle of "fixing" them.


HTTP

        According to the HTTP 1.0 specification, the Hypertext Transfer Protocol (HTTP) is an application-

level protocol with the lightness and speed necessary for distributed, collaborative, hypermedia information

systems. It is a generic, stateless, object-oriented protocol which can be used for many tasks, such as name

servers and distributed object management systems, through extension of its request methods (Berners T.,

Fielding R. & Frystyk H., 1996b).

        HTTP messages are generic and communication takes place operationally based on the client/server

paradigm of request/response. Messages are all created to comply with the generic message format. Clients

usually send requests and receive responses, while servers receive requests and send responses. It is stateless

and connectionless in nature because after the server has responded to the client's request, the connection

between client and server is dropped and forgotten. There is no "memory" between client connections.

Basically, when you type a URL into the browser, a client-server connection is established over TCP/IP. The URL is internally converted into a request for the server to process; after the server finishes processing, it sends a response message back to the client and closes the connection for both parties. The downside is that this may decrease network performance, because of the increasing amount of overhead data per request and the fact that request state is not stored in a shared context.
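
        To make the request/response cycle concrete, the following minimal sketch issues a single GET request using Python's standard http.client module. The host name and path are illustrative only, and each exchange opens and then closes its own connection, mirroring the stateless, connectionless behavior described above.

        import http.client

        # Open a TCP/IP connection to the server (the host is illustrative).
        conn = http.client.HTTPConnection("www.example.com", 80)

        # The URL typed in the browser becomes a request such as "GET /index.html".
        conn.request("GET", "/index.html")

        # The server processes the request and sends back a response message.
        response = conn.getresponse()
        print(response.status, response.reason)    # e.g. 200 OK
        print(response.getheaders())               # generic message headers
        body = response.read()                     # the entity body

        # The connection is closed; no state is kept between client connections.
        conn.close()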

        The design is patterned and implemented with the idea of object orientation. In general, the objects used internally for each request are as follows: HTTP messages, Request/Response, Entity, Method Definitions, Status Code Definitions, and Header Field Definitions (based on the HTTP 1.0 specification).

        The Method field indicates the method to be performed on the resource identified by the Request-URI.

Methods supported by HTTP 1.1 specification are OPTIONS, GET, HEAD, POST, PUT, DELETE,

TRACE, and CONNECT (Fielding R., 1999). The GET method means to retrieve whatever is identified by

the URI. The HEAD method is the same as GET, but it returns only the HTTP headers and no document body. The

POST method is used to request that the destination server accept the entity enclosed in the request as a new

subordinate of the resource identified by the Request-URI in the Request-Line. POST is designed to allow a

uniform method to cover the following functions: 1) annotation of existing resources; 2) posting a message

to a bulletin board, newsgroup, mailing list, or similar group of articles; 3) providing a block of data, such as

the result of submitting a form, to a data-handling process; and 4) extending a database through an append

operation. (Berners T., Fielding R & H. Frystyk, 1996b).
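
        As a hedged illustration of these method semantics, the sketch below uses Python's standard http.client module to issue HEAD, GET, and POST requests; the host, paths, and form data are assumptions made up for the example, not part of any real service.

        import http.client
        import urllib.parse

        conn = http.client.HTTPConnection("www.example.com")

        # HEAD: identical to GET, but only the HTTP headers come back, no document body.
        conn.request("HEAD", "/articles/42")
        resp = conn.getresponse()
        print(resp.getheaders())
        resp.read()   # finish the (empty) response before reusing the connection

        # GET: retrieve whatever is identified by the URI.
        conn.request("GET", "/articles/42")
        print(conn.getresponse().read())

        # POST: provide a block of data (here, a submitted form) to a data-handling
        # process, creating a new subordinate of the resource named in the Request-URI.
        form = urllib.parse.urlencode({"author": "Ada", "comment": "Very helpful"})
        headers = {"Content-Type": "application/x-www-form-urlencoded"}
        conn.request("POST", "/articles/42/comments", body=form, headers=headers)
        print(conn.getresponse().status)

        conn.close()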


HTML

        The Hypertext Markup Language (HTML) is a markup language used to create hypertext documents

that are platform independent (Yergeau F. et al., 1997). The difference between XML and HTML is that HTML is a fixed vocabulary defined by the W3C that every browser must follow, whereas XML is extensible: its markup is customizable, and it is typically used to hold and describe data.
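
        As a small, hypothetical illustration of XML as customizable, self-describing data storage, the sketch below defines a tiny XML fragment (the tag names are invented for the example) and reads it back with Python's standard xml.etree.ElementTree module.

        import xml.etree.ElementTree as ET

        # XML lets us invent our own markup to describe data; these tags are illustrative.
        xml_data = """
        <bankStatement account="12345">
          <transaction date="2009-03-01" amount="-42.50">Groceries</transaction>
          <transaction date="2009-03-02" amount="1500.00">Salary</transaction>
        </bankStatement>
        """

        root = ET.fromstring(xml_data)
        for tx in root.findall("transaction"):
            print(tx.get("date"), tx.get("amount"), tx.text)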

                                    IV.     Future of the World Wide Web

        Say you had some lingering back pain: a program might determine a specialist's availability, check

an insurance site's database for in-plan status, consult your calendar, and schedule an appointment. Another

program might look up restaurant reviews, check a map database, cross-reference open table times with your

calendar, and make a dinner reservation. Tim Berners-Lee and others describe this as a “web of data”. This will be

the new Web capable of supporting software agents that are able not only to locate data, but also to

“understand” it in ways that will allow computers to perform meaningful tasks with data automatically on the

fly (Updegrove, 2001).

        The Semantic Web is a web of data. It is about common formats for integration and combination of

data drawn from diverse sources, whereas the original Web mainly concentrated on the interchange of

documents. It is also about a language for recording how the data relate to real-world objects, which allows

person, or a machine, to start off in one database, and then move through an unending set of databases which

are connected not by wires but by being about the same thing (Herman I., 2009).

        Representational State Transfer (REST) is an architectural style for distributed hypermedia systems. Fielding’s dissertation describes the software engineering principles guiding REST and the interaction constraints chosen to retain those principles, while contrasting them with the constraints of other architectural styles (Fielding R., 2000).

        The fundamental difference between the two is that the Semantic Web is an integration solution (a solution to information silos), while REST is a set of state transfer operations universal to any data storage and retrieval system (Battle R. & Benson E., 2007). The Semantic Web provides ways to semantically describe and align data from disparate sources, while REST offers resource data access operations commonly known as CRUD (Create, Read, Update, and Delete).

        In moving from the traditional “web of pages” to a “web of data”, the goal of the Semantic Web is to provide a cost-efficient way of sharing machine-readable data. The business of sharing machine-readable data has been around for quite some time, and information silos have long been a challenge that researchers and IT practitioners are keen to solve.

        Service-oriented architecture (SOA) solutions have been created to satisfy business goals that

include easy and flexible integration with legacy systems, streamlined business processes, reduced costs,

innovative service to customers, and agile adaptation and reaction to opportunities and competitive threats.

SOA is a popular architecture paradigm for designing and developing distributed systems (Bianco P. et al.,

2007). In spite of the popularity of SOA and Web Services, confusion among software developers is

prevalent. To shed some light: SOA is an architectural style, whereas Web Services are a technology used to implement SOAs.

        Web services provide a standard means of interoperating between different software applications,

running on a variety of platforms and/or frameworks (W3C, 2004). The Web services technology consists of

several published standards, the most important ones being SOAP, XML (Extensible Markup Language) and

WSDL (Web Services Description Language). There are other technologies, such as CORBA and

Jini, but to limit our discussion we are concerned only with Web Services, as the others do not apply to the Web domain.

          At the heart of the Service Oriented Architecture is the service contract. It answers the question,

"what service is delivered to the customer?" In the current web-services stack, WSDL is used to define this

contract. However, WSDL defines only the operational signature of the service interface and is too brittle to

support discovery in a scalable way. “SOAP” is no longer an acronym. A SOAP message represents the

information needed to invoke a service or reflect the results of a service invocation, and contains the

information specified in the service interface definition (W3C, 2004). Extensible Markup Language (XML)

documents are made up of storage units called entities, which contain either parsed or unparsed data. (Cowan

J., 2008). SOAP and WSDL are good examples of XML documents.

          As mentioned earlier, Web services are just another roadmap toward Service-Oriented Architecture. The concept of a “web of data” was also introduced as a solution to information silos, and it established the rationale for Web-accessible APIs (Application Programming Interfaces). Technically speaking, a Web service is a Web-accessible API. So why is there a need for REST and the Semantic Web?

          A great amount of data is available through REST and SOAP Web Services, published by the private and public sectors; however, these data carry no markup that conforms to semantic standards. It is important to provide markup that Semantic Web applications understand, so as to make the services compatible and to make semantic query operations feasible (Battle R. & Benson E., 2007).

          In the traditional Database Systems, we have SQL (Structured Query Language). It is the language

used to interact with the database. In the Semantic Web world, the corresponding technology is SPARQL

(SPARQL Protocol and RDF Query Language). SPARQL can be used to express queries across diverse data

sources, whether the data is stored natively as RDF or viewed as RDF via middleware. SPARQL contains

capabilities for querying required and optional graph patterns along with their conjunctions and disjunctions.

SPARQL also supports extensible value testing and constraining queries by source RDF graph. The results

of SPARQL queries can be result sets or RDF graphs (W3C, 2008a).
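
          As a hedged sketch of what a SPARQL query looks like in practice, using the third-party Python library rdflib, which the paper does not mention, and a small invented data set, the example below loads a few RDF triples and runs a simple SELECT query over them.

          from rdflib import Graph

          # A tiny RDF graph in Turtle syntax; the vocabulary and people are made up.
          turtle_data = """
          @prefix ex: <http://example.org/> .
          ex:alice ex:worksFor ex:acme ; ex:name "Alice" .
          ex:bob   ex:worksFor ex:acme ; ex:name "Bob" .
          """

          g = Graph()
          g.parse(data=turtle_data, format="turtle")

          # SPARQL plays the role SQL plays for relational databases:
          # a declarative query over (possibly distributed) RDF data.
          query = """
          PREFIX ex: <http://example.org/>
          SELECT ?name WHERE {
              ?person ex:worksFor ex:acme .
              ?person ex:name ?name .
          }
          """

          for row in g.query(query):
              print(row.name)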

          It is now getting more interesting: developers have to learn not only SQL but also SPARQL. Thanks are due to Edgar Frank Codd, the inventor of the relational database; without his invention we would still be using filing cabinets and physically sorting and searching records by hand.

          It is important to provide markup that the Semantic Web can understand, to make the services compatible and to make semantic query operations feasible. With existing SOAP Web services, what needs to be done is to add semantic information to the services, for example with OWL-S and SAWSDL. These provide details for each Web service parameter describing how its value is derived from an ontology (Battle R. & Benson E., 2007). The OWL-S document maps each operation and message defined in the WSDL definition to the ontology. The problem with SPARQL, however, is that it is a query language designed for RDF; since most Web services return plain old XML, a conversion process from XML data to RDF is needed.

          So where does REST come into play? REST is another roadmap toward SOA and a principle that is being applied to quite a few Web services implementations. While SOAP-based services have a WSDL document that defines their operations, there is no standard equivalent for REST services. This is an area that companies who adopted REST early must be aware of. If companies are really convinced that this is the right way of doing Web services, then perhaps in the future there will be a standard way of implementing them.

                                              V.      Discussion

a.   From the REST and Semantic Web points of view, there is no difference between slash-based and parameter-based URI references.

          There are two major requirements that resource naming on the Semantic Web must follow. First, a description of the identified resource should be retrievable with standard Web technologies. Second, a naming scheme should not confuse things with the documents representing them (Battle R. & Benson E., 2007). Both REST and the Semantic Web support the idea that “Cool URIs don't change”. Tim Berners-Lee explained that the best resource identifiers don't just provide descriptions for people and machines, but are designed with simplicity, stability, and manageability in mind. Based on

the W3C standard, the generic URI syntax consists of a hierarchical sequence of components: scheme, authority, path, query, and fragment.

       URI = scheme ":" hier-part ["?" query] ["#" fragment]

       The following are two example URIs and their component parts:

         foo://example.com:8042/over/there?name=ferret#nose
         \_/   \______________/\_________/ \_________/ \__/
          |           |            |            |        |
       scheme     authority       path        query   fragment
          |   _____________________|__
         / \ /                        \
         urn:example:animal:ferret:nose
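
       The same decomposition can be checked programmatically; as a minimal sketch, Python's standard urllib.parse module splits the first example URI into the scheme, authority, path, query, and fragment components of the generic syntax.

       from urllib.parse import urlsplit

       parts = urlsplit("foo://example.com:8042/over/there?name=ferret#nose")

       print(parts.scheme)    # 'foo'
       print(parts.netloc)    # 'example.com:8042'  (the authority component)
       print(parts.path)      # '/over/there'
       print(parts.query)     # 'name=ferret'
       print(parts.fragment)  # 'nose'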

              The W3C recommends the use of standard session mechanisms instead of session-based URIs (W3C, 2003). What does this mean? HTTP/1.1 provides a number of mechanisms for identification, authentication, and session management. Using these mechanisms instead of user-based or session-based URIs guarantees that the URIs used to serve resources are truly universal (allowing, for example, people to share, send, or copy them).

   For example: Bob tries to visit http://www.example.com/resource, but since it's a rainy Monday

morning, he gets redirected to http://www.example.com/rainymondaymorning/resource. The day

after, when Bob tries to access the resource he had bookmarked earlier, the server answers that Bob has made a bad request and serves http://www.example.com/error/thisisnotmondayanymore. Had the server served back http://www.example.com/resource because the Monday session had expired, it would have been, if not acceptable, at least harmless. The problem is that session-based URIs do not guarantee that the URIs used are truly universal. The acceptable practice in this situation is to use modifiers, such as "?" to pass arguments to CGI scripts, or ";" to pass other kinds of arguments or context information.

         Roy Fielding did not say in his dissertation that URIs disallow parameterized references. Similarly, the Semantic Web requirements state that as long as an identifier conforms to the two major requirements above and to the W3C standard specifications, its use is acceptable. Both REST and the Semantic Web consistently raise the need for abstraction in URIs. The key abstraction of information in REST is a resource. Any information that can be named can be a resource: a document or image, a temporal service (e.g. “today’s weather in Los Angeles”), a collection of other resources, a non-virtual object (e.g. a person), and so on (Fielding R., 2000).

         It was mentioned previously that the biggest challenge for search engines today is the quality of results. Search engine spiders do not presently crawl many types of “dynamic” web pages. Typical examples of dynamic pages are the internal and external web applications that companies use to do business, as well as emerging Web 2.0 sites. In line with this, it is important to identify the types of resources and map them to the underlying entities. Conceptual representation means that a resource is an abstraction of some arbitrary concept. Once the conceptual “resource” has been mapped to a physical resource, that mapping should remain stable as long as possible. Think about pages with an .asp extension: links to those pages possibly no longer work, since many companies have moved to .aspx.

b)   HTTP is not a data transfer protocol; it is an application protocol (or a coordination language, if you

     swing that way). REST does not "run on top of HTTP" but rather HTTP is a protocol that displays

     many of the traits of the REST architectural style.

                 HTTP is not designed to be a transport protocol. It is a transfer protocol in which the

        messages reflect the semantics of the Web architecture by performing actions on resources through

        the transfer and manipulation of representations of those resources. It is possible to achieve a wide

        range of functionality using this very simple interface, but following the interface is required in order

        for HTTP semantics to remain visible to intermediaries (Fielding R., 2000).

         Conceivably, it is easy to get the wrong idea that REST sits between the application protocol and the transport protocol when HTTP is cited as a “transfer protocol”. Leveraging HTTP headers to provide request context around CRUD operations (Create for POST, Read for GET, Update for PUT, and Delete for DELETE) allows developers to overlay a programmatic API for a website directly on top of the site exposed to web users and reduce the cost and complexity of providing multi-format access to a site's underlying data.
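
         A minimal sketch of that CRUD overlay, written with the third-party Python requests library (not discussed in the paper) against a hypothetical /orders resource; each CRUD operation maps directly onto one HTTP method.

         import requests

         BASE = "http://api.example.com"   # hypothetical service

         # Create -> POST: add a new subordinate resource under /orders
         created = requests.post(BASE + "/orders", json={"item": "book", "qty": 1})
         print(created.status_code, created.headers.get("Location"))  # e.g. 201 and /orders/123

         # Read -> GET: retrieve a representation of the resource
         order = requests.get(BASE + "/orders/123").json()

         # Update -> PUT: replace the resource's state with a new representation
         requests.put(BASE + "/orders/123", json={"item": "book", "qty": 2})

         # Delete -> DELETE: remove the resource
         requests.delete(BASE + "/orders/123")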

c)   What is the function of Extensible Markup Language (XML) in REST and the Semantic Web? Is it true that most deployed REST services return HTML rather than XML? Is it true that REST has no preference for XML?

                REST’s data elements are summarized in Table 1.

                                           Table 1 REST Data Elements

            Data Element                   Modern Web Examples

            resource                       the intended conceptual target of a hypertext reference

            resource identifier            URL, URN

            representation                 HTML, document, JPEG image

            representation metadata        media type, last-modified time

            resource metadata              source link, alternates, vary

            control data                   if-modified-since, cache-control



                It is true that Roy Fielding did not specify XML as an example of a resource or of resource metadata; on the other hand, he did mention the representation media type, which is the data format of the representation. He described a representation as consisting of data, metadata describing the data, and, on occasion, metadata to describe the metadata. As mentioned previously, XML is an open standard for describing data. So what of the claim that “REST has no preference for XML”? It is not true if you are building Web services, but it is true if you are just creating pages that are not designed for machine

    interpretation: why care about returning XML if you don't need it in the first place? Likewise, if you want to share your data, how do you want it represented? Using delimited text files?

            The idea of REST and the Semantic Web is to coexist with existing web standards, not to disqualify any of them; this is, in fact, the idea of platform and language interoperability.

                                              VI.      Conclusion

    Acquiring data has become easier than ever before, and with the latest technology breakthroughs there is no doubt that in time the internet will be “all in one”. Along the way, there will be adjustments and corrections to be made and misconceptions, intentional or not, to be addressed in order to reach this so-called “web of data”. Support from the industry players is crucial. Data security has always been a primary concern. There is a plethora of questions that must be addressed, such as the following: a) who will annotate the data? b) What is the advantage of giving away data, when we all know that data is a valuable commodity? c) Without any centralized control, how will all this data be connected? d) Will existing AI techniques be sufficient to process this huge amount of data? And is it even practical to pursue this route?

    On the other hand, the web has exploded rapidly, even though in the beginning our only concern was the sharing of documents. Giving credit to the early contributors who brought us this far, I firmly believe that the simple approach of the existing web implementation, particularly the URL (which was originally designed as the URI), opened the door to everybody, computer savvy or not. Explaining URIs to ordinary people, and getting them right the first time, is not simple. Philosophically speaking, isn’t that also what the concept of “universality of access” is about? In the same way, when you write software, it is not always right the first time; writing software is an evolving process.

    Weighing both sides, I believe there is not much we can do about why things are done the URL way, but I do recognize that there is always room for improvement. With a better understanding of what came before and what is about to come, moving towards the future of the Semantic Web is ours to consider.

                                      VII. References

Berners T., Fielding R. & Masinter L. (2005) Uniform Resource Identifier (URI): Generic Syntax.

      Retrieved Feb 13, 2009 from http://labs.apache.org/webarch/uri/rfc/rfc3986.html#URLvsURN

Berners T. (2002). What do HTTP URIs Identify?

      Retrieved Feb 20, 2009 from http://www.w3.org/DesignIssues/Overview.html

Berners T. (1996a). The World Wide Web: Past, Present and Future. Retrieved on Feb 20, 2009 from

      http://www.w3.org/People/Berners-Lee/1996/ppf.html

Berners T., Fielding R. & Frystyk H. (1996b). Hypertext Transfer Protocol -- HTTP/1.0. Retrieved Feb

      13, 2009 from http://www.ietf.org/rfc/rfc1945.txt

Battle R. & Benson E. (2007). Bridging the Semantic Web and Web 2.0 with Representational
      State Transfer (REST). Retrieved Feb 13, 2009 from
      http://omescigil.etu.edu.tr/semanticweb/papers/sw_4.pdf

Bianco P., Kotermanski R. & Merson P. (2007). Evaluating a Service-Oriented Architecture. Retrieved
      Feb 13, 2009 from http://www.sei.cmu.edu/pub/documents/07.reports/07tr015.pdf

Brin S. & Page L. The Anatomy of a Large-Scale Hypertextual Web Search Engine. Retrieved Feb 20,

      2009 from http://infolab.stanford.edu/~backrub/google.html

Cowan J., Fang A., Grosso P., Lanz K., Marcy G., Thompson H., Tobin R., Veillard D., Walsh N.,

      Yergeau F. (2008). Extensible Markup Language (XML) 1.0 (Fifth Edition). Retrieved Feb
      13, 2009 from http://www.w3.org/TR/REC-xml/#sec-intro

Fielding R. (2000). Architectural Styles and the Design of Network-based Software Architectures.

      Retrieved Feb 20, 2009 from http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm

Fielding R., Mogul J., Gettys J., Frystyk H., Masinter L., Leach P. & Berners T. (1999) Hypertext

      Transfer Protocol -- HTTP/1.1. Retrieved Feb 13, 2009 from http://www.ietf.org/rfc/rfc2616.txt

Herman I. (2009). W3C Semantic Web Activity. Retrieved Feb 20, 2009

      from http://www.w3.org/2001/sw/

The Economist (2003). Knowledge is power. Retrieved on Feb 20, 2009 from

     http://www.economist.com/business/globalexecutive/education/displayStory.cfm?story_id=17626

Updegrove A. (2001). The Semantic Web: An Interview with Tim Berners-Lee.
      Retrieved Feb 13, 2009 from http://www.consortiuminfo.org/bulletins/semanticweb.php

W3C (2008a). SPARQL Query Language for RDF. Retrieved Feb 13, 2009 from

     http://www.w3.org/TR/rdf-sparql-query/

W3C (2008b) Cool URIs for the Semantic Web. Retrieved Feb 13, 2009 from

     http://www.w3.org/TR/2008/WD-cooluris-20080321/

W3C (2004). Web Services Architecture. Retrieved Feb 13, 2009 from

     http://www.w3.org/TR/ws-arch/#id2260892

W3C (2003) Common HTTP Implementation Problems.

     http://www.w3.org/TR/2003/NOTE-chips-20030128/

Yergeau F., Nicol G., Adams G. & Duerst M. (1997). Internationalization of the Hypertext Markup
      Language. http://www.rfc-editor.org/rfc/rfc2070.txt

Weitere ähnliche Inhalte

Was ist angesagt?

Impact of trust, security and privacy concerns in social networking: An explo...
Impact of trust, security and privacy concerns in social networking: An explo...Impact of trust, security and privacy concerns in social networking: An explo...
Impact of trust, security and privacy concerns in social networking: An explo...
Anil Dhami
 
New approaches to openness – beyond open educational resources
New approaches to openness – beyond open educational resourcesNew approaches to openness – beyond open educational resources
New approaches to openness – beyond open educational resources
Grainne Conole
 
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
Kumar Goud
 
Obsolete and emerging technologies
Obsolete and emerging technologiesObsolete and emerging technologies
Obsolete and emerging technologies
Duke University
 

Was ist angesagt? (17)

Final Next Generation Content Management
Final    Next  Generation  Content  ManagementFinal    Next  Generation  Content  Management
Final Next Generation Content Management
 
Semantic Web-Linked Data and Libraries
Semantic Web-Linked Data and LibrariesSemantic Web-Linked Data and Libraries
Semantic Web-Linked Data and Libraries
 
Networking online assignment
Networking online assignmentNetworking online assignment
Networking online assignment
 
Semantic web
Semantic webSemantic web
Semantic web
 
Knowledge Sharing in the Sciences - 8JPL
Knowledge Sharing in the Sciences - 8JPLKnowledge Sharing in the Sciences - 8JPL
Knowledge Sharing in the Sciences - 8JPL
 
Analysis, modelling and protection of online private data.
Analysis, modelling and protection of online private data.Analysis, modelling and protection of online private data.
Analysis, modelling and protection of online private data.
 
Using Maltego Tungsten to Explore Cyber-Physical Confluence in Geolocation
Using Maltego Tungsten to Explore Cyber-Physical Confluence in GeolocationUsing Maltego Tungsten to Explore Cyber-Physical Confluence in Geolocation
Using Maltego Tungsten to Explore Cyber-Physical Confluence in Geolocation
 
Information Organisation for the Future Web: with Emphasis to Local CIRs
Information Organisation for the Future Web: with Emphasis to Local CIRs Information Organisation for the Future Web: with Emphasis to Local CIRs
Information Organisation for the Future Web: with Emphasis to Local CIRs
 
Chapter 05 pertemuan 7- donpas - manajemen data
Chapter 05 pertemuan 7- donpas - manajemen dataChapter 05 pertemuan 7- donpas - manajemen data
Chapter 05 pertemuan 7- donpas - manajemen data
 
Impact of trust, security and privacy concerns in social networking: An explo...
Impact of trust, security and privacy concerns in social networking: An explo...Impact of trust, security and privacy concerns in social networking: An explo...
Impact of trust, security and privacy concerns in social networking: An explo...
 
New approaches to openness – beyond open educational resources
New approaches to openness – beyond open educational resourcesNew approaches to openness – beyond open educational resources
New approaches to openness – beyond open educational resources
 
Informatics moments
Informatics momentsInformatics moments
Informatics moments
 
Network literacy-high-res
Network literacy-high-resNetwork literacy-high-res
Network literacy-high-res
 
Cb4301449454
Cb4301449454Cb4301449454
Cb4301449454
 
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
 
Paper24
Paper24Paper24
Paper24
 
Obsolete and emerging technologies
Obsolete and emerging technologiesObsolete and emerging technologies
Obsolete and emerging technologies
 

Ähnlich wie Cert Overview

Security-Challenges-in-Implementing-Semantic-Web-Unifying-Logic
Security-Challenges-in-Implementing-Semantic-Web-Unifying-LogicSecurity-Challenges-in-Implementing-Semantic-Web-Unifying-Logic
Security-Challenges-in-Implementing-Semantic-Web-Unifying-Logic
Nana Kwame(Emeritus) Gyamfi
 
Nlp and semantic_web_for_competitive_int
Nlp and semantic_web_for_competitive_intNlp and semantic_web_for_competitive_int
Nlp and semantic_web_for_competitive_int
KarenVacca
 
Web 3.0 - Media Theory
Web 3.0 - Media TheoryWeb 3.0 - Media Theory
Web 3.0 - Media Theory
Akshay Iyer
 
Riding The Semantic Wave
Riding The Semantic WaveRiding The Semantic Wave
Riding The Semantic Wave
Kaniska Mandal
 
Katasonov icinco08
Katasonov icinco08Katasonov icinco08
Katasonov icinco08
cg19920128
 

Ähnlich wie Cert Overview (20)

Edu.03
Edu.03 Edu.03
Edu.03
 
The Revolution Of Cloud Computing
The Revolution Of Cloud ComputingThe Revolution Of Cloud Computing
The Revolution Of Cloud Computing
 
Security-Challenges-in-Implementing-Semantic-Web-Unifying-Logic
Security-Challenges-in-Implementing-Semantic-Web-Unifying-LogicSecurity-Challenges-in-Implementing-Semantic-Web-Unifying-Logic
Security-Challenges-in-Implementing-Semantic-Web-Unifying-Logic
 
The internet
The internetThe internet
The internet
 
Unit 1.4 Research
Unit 1.4 ResearchUnit 1.4 Research
Unit 1.4 Research
 
Nlp and semantic_web_for_competitive_int
Nlp and semantic_web_for_competitive_intNlp and semantic_web_for_competitive_int
Nlp and semantic_web_for_competitive_int
 
Web Mining
Web MiningWeb Mining
Web Mining
 
Web 3.0 - Media Theory
Web 3.0 - Media TheoryWeb 3.0 - Media Theory
Web 3.0 - Media Theory
 
Riding The Semantic Wave
Riding The Semantic WaveRiding The Semantic Wave
Riding The Semantic Wave
 
1 web programming
1 web programming1 web programming
1 web programming
 
A42020106
A42020106A42020106
A42020106
 
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
 
Semantic Technology. Origins and Modern Enterprise Use
Semantic Technology. Origins and Modern Enterprise UseSemantic Technology. Origins and Modern Enterprise Use
Semantic Technology. Origins and Modern Enterprise Use
 
Cooking up the Semantic Web
Cooking up the Semantic WebCooking up the Semantic Web
Cooking up the Semantic Web
 
Semantic Web Mining of Un-structured Data: Challenges and Opportunities
Semantic Web Mining of Un-structured Data: Challenges and OpportunitiesSemantic Web Mining of Un-structured Data: Challenges and Opportunities
Semantic Web Mining of Un-structured Data: Challenges and Opportunities
 
Introduction abstract
Introduction abstractIntroduction abstract
Introduction abstract
 
A LITERATURE REVIEW ON SEMANTIC WEB – UNDERSTANDING THE PIONEERS’ PERSPECTIVE
A LITERATURE REVIEW ON SEMANTIC WEB – UNDERSTANDING THE PIONEERS’ PERSPECTIVEA LITERATURE REVIEW ON SEMANTIC WEB – UNDERSTANDING THE PIONEERS’ PERSPECTIVE
A LITERATURE REVIEW ON SEMANTIC WEB – UNDERSTANDING THE PIONEERS’ PERSPECTIVE
 
Linked Open Data_mlanet13
Linked Open Data_mlanet13Linked Open Data_mlanet13
Linked Open Data_mlanet13
 
Katasonov icinco08
Katasonov icinco08Katasonov icinco08
Katasonov icinco08
 
Linked Data Generation for the University Data From Legacy Database
Linked Data Generation for the University Data From Legacy Database  Linked Data Generation for the University Data From Legacy Database
Linked Data Generation for the University Data From Legacy Database
 

Cert Overview

  • 1. STRAYER UNIVERSTY NETWORK ARCHITECTURE AND ANALYSIS SUBMITTED TO DR. BRYANT PAYDEN BY ADELMAR ESPLANA GRADUATE STUDENT IN MASTER OF SCIENCE IN INFORMATION SYSTEMS MARCH 2009
  • 2. 2 I. Overview “Knowledge is power (But only if you know how to acquire it)." (The Economist, 2003) Knowledge is awareness and understanding of facts, truth or information in the form of experiences and learning. Today, one of the quickest ways to acquire knowledge is through the Web or WWW (World Wide Web). Using search engine facilities, we can search almost any kind of information. The internet has become a world- wide network of information resource and a powerful communication tool. And the more specific word you type in, in the search engine, the more accurate results you would get. The internet is a powerful tool which may be used in a number of ways such as having online classes, bills payment, booking vacation, scheduling medical appointments, finding the latest news and even taking a virtual tour on vacation destinations. In spite of all these tremendous capabilities that the Web offers, to date, the principle that is being used by search engines in the analysis of material is based on textual indexing. Although search engines have proven remarkably useful in finding information rapidly, they have also been proven remarkably useless in providing quality information. “The biggest problem facing users of web search engines today is the quality of the results they get back.”(Brin S. & Page L.) Currently, search engines can only account for the vocabulary of the documents and has a little concept of document quality thereby producing a lot of junk. Since its inception in 1989, the internet has grown initially as a medium for the broadcast of read- only material, from heavily loaded corporate servers, to the mass of internet-connected consumers. Recently, the rise of digital video, blogging, podcasting, and social media has revolutionized the way people socially interact through the web. Other promising developments include the increasing interactive nature of the interface to the user, and the increasing use of machine-readable information with defined semantics allowing more advanced machine processing of global information, including machine-readable signed assertions. (Berners T., 1996a) A lot of the data that we use everyday are not part of the web. These data which are in various silos, reside in different software applications and different places that cannot be connected easily like bank statements, photographs and calendar appointments. In order for us to see bank statements, we have to use online banking; to manage our calendar appointments and organize our photo collection, we have to use
  • 3. 3 social-oriented websites such as MySpace and Facebook. It is interesting to see on the web photos displayed on a calendar, showing exactly what we were doing when we took them; and bank statement, showing the list of transactions that we incurred. Due to the silo of information where both personal and business information are managed by different software, we are forced to learn how to use different programs all the time. In addition, since data are controlled by applications, each application must keep the data to itself. According to W3C Group, what we need is a “web of data” (Herman I., 2009). Supported by Fielding’s dissertation, he mentioned that what we need is a way to store and structure our own information, whether permanent or temporal in nature, both for our own necessity and for others, be able to reference and structure the information stored by others so that it would not be necessary for everyone to keep and maintain local copies. (Fielding R., 2000) II. Purpose of the Paper Understanding the underlying concepts and principles behind the web is essential to current and future implementation initiatives. For this reason, it is the objective of this paper to uncover the root of its existence, and to examine the fundamental design notion of the following design principles: Independent specification design, Hypertext Transfer Protocol (HTTP), Uniform Resource Identifier (URI) and Hypertext Markup Language (HTML). This study also aims to develop a better understanding of the emerging web standards, such as REST, SOA, and Semantic Web. The paper discusses some of the misconceptions about URI, HTTP and XML and the following issues: a) In REST and Semantic point of view, there is no difference between slash based and parameter based URI reference; b) HTTP is not a data transfer protocol; it is an application protocol (or a coordination language, if you swing that way). REST does not "run on top of HTTP" but rather HTTP is a protocol that displays many of the traits of the REST architectural style; c) What is Extensible Markup Language (XML) function in Representational state transfer (REST) and Semantic Web? Is it true that most REST services in deployment do not return XML but rather HTML? Is it true that REST has no preference for XML?
  • 4. 4 III. Foundation of the World Wide Web According to Tim Berners, “The goal of the Web was to be a shared information space through which people (and machines) could communicate.” (Berners T., 1996a) It was also the original intent that this so called “space” ought to span all sorts of information, from different sources to a wide array of formats, and from highly valued designed material to a spontaneous idea. In the original design of the web, he stated the following fundamental design criteria: (Berners T., 1996a) a) An information system must be able to record random associations between any arbitrary objects, unlike most database systems The concept of database systems has been purposely utilized to facilitate storage, retrieval and information generation of structured data. Unlike, the web concept of “one universal space of information” which is based on the principle that almost anything on the web could be possibly linked to any arbitrary objects. The power of the Web is that linkage can be established to any document (or, more generally, resource) of any kind in the universe of information, whereas in the database systems, one has to understand the data structure to establish the relationship. b) If two sets of users started to use the system independently, to make a link from one system to another should be an incremental effort, not requiring unscalable operations such as the merging of link databases In the business environment, to integrate two different types of systems, it is necessary to perform some degrees of integration efforts such as merging, importing or linking of databases. On the contrary, the idea of web was to be able integrate systems easily. Most of the systems done in the past involve a great deal of integration effort due to the information silo. For this reason, the idea where machine can talk to each other set forth the promise of seamless integration. c) Any attempt to constrain users as a whole to the use of particular languages or operating systems was always doomed to fail. Information must be available on all platforms, including future ones. Platform and language interoperability support the principles of universality of access
  • 5. 5 irrespective of hardware or software platform, network infrastructure, language, culture, geographical location, or physical or mental impairment. d) Any attempt to constrain the mental model users of data into a given pattern was always doomed to fail. If information within an organization is to be accurately represented in the system, entering or correcting it must be trivial for the person directly knowledgeable If the interaction between person and hypertext could be so intuitive that the machine- readable information space gave an accurate representation of the state of people's thoughts, interactions, and work patterns, then machine analysis could become a very powerful management tool, seeing patterns in our work and facilitating our working together through the typical problems which beset the management of large organizations. Independent specification design The basic principles of the Web proposed in 1989 to meet the design criteria were adopted based on the well-known software design principles called “independent specification design”. This design was based on the principle of modularity. Meaning when it is modular in nature, the interfaces between the modules hinge on simplicity and abstraction. This allows seamless compatibility of the existing content, to work with the new implementation. As technology evolves and disappears, specifications for the Web’s languages and protocols should be able to adapt to the new hardware and software changes. Along with this basic principle are the three main components such as URI, HTTP and HTML. URI or Universal Resource Identifier URI is a compact string of characters for identifying abstract or physical resource. It is a simple and extensible means of identifying a resource. A URI can be further classified as a locator, a name, or both. The term "Uniform Resource Locator" (URL) refers to the subset of URI that identifies resources via a representation of their primary access mechanism, rather than identifying the resource by name or by some other attribute(s) of that resource (Berners T., Fielding R. & L. Masinter, 2005). URNs (Uniform Resource Names) are used for identification; URCs (Uniform Resource Characteristics), for including meta- information; and URLs, for locating or finding resources.
  • 6. 6 REST defines URI as a resource based on a simple premise that identifiers should change as infrequently as possible (Fielding R., 2000). While Semantic Web identifies URI’s not just Web documents, but rather real-world objects like people, cars, and abstract ideas. They call all these as real-world objects or things (W3C, 2008b). Deriving URI definitions from the meaning of each letters U -Uniform, R-Resource and I-Identifier as listed below: Uniform allows consistency of its usage, even when the internal mechanism of accessing the resources has changed. It allows common semantic interpretation of syntactic conventions, across different type of resources to work with the existing identifiers. Resources are, in general, any real world “thing” such as electronic documents, images and services, recognized by URI to represent something, for example, electronic document, an image, or a source of information with consistent purpose. Other resources that are not accessible via internet are representation of the abstract concepts, mathematical equations, correlation (e.g., “parent” or “employee”) and values (e.g., zero, one, and infinity). Identifier pertains to information required to distinguish what is being identified from all other things within its scope of identification. The terms “identify” and “identifying” means distinguishing one resource from the other regardless how that purpose is accomplished. One of the capabilities web popularized is the ability of documents to link to any kind, in the universe of information. With this in mind, the concept of “identity” is concerned with the conceptual scheme of identifying objects generically. For example, one URI can represent a book which is available in several languages and several data format. HTTP and URIs are the basis of the World Wide Web, yet they are often misunderstood, and their implementations and uses are sometimes incomplete or incorrect (W3C, 2003). a) A common mistake, responsible for many implementation problems, is to think that a URI is equivalent to a filename within a computer system. This is wrong as URIs have, conceptually, nothing to do with a file system.
  • 7. 7 b) A URI should not show the underlying technology (server-side content generation engine, script written in such or such language) used to serve the resource. Using URIs to show the specific underlying technology means one is dependent on the technology used, which, in turn, means that the technology cannot be changed without either breaking URIs or going through the hassle of "fixing" them. HTTP According to the HTTP 1.0 specification, The Hypertext Transfer Protocol (HTTP) is an application- level protocol with the lightness and speed necessary for distributed, collaborative, hypermedia information systems. It is a generic, stateless, object-oriented protocol which can be used for many tasks, such as name servers and distributed object management systems, through extension of its request methods (Berners T., Fielding R. & Frystyk H., 1996b). HTTP messages are generic and communication takes place operationally based on the client/server paradigm of request/response. Messages are all created to comply with the generic message format. Clients usually send requests and receive responses, while servers receive requests and send responses. It is stateless and connectionless in nature because after the server has responded to the client's request, the connection between client and server is dropped and forgotten. There is no "memory" between client connections. Basically, when you type in the URL in the browser, the client and server Connection takes place over TCP/IP. This URL internally gets converted into a Request for server to process, after the server finished processing, and then the server sends the message Response back to the client and Closes the connection of both parties. The downside of it is that, it may decrease the network-performance due to the increasing amount of overhead data per request, the fact that the state of request is not stored in a shared context. The design is patterned and implemented with the idea of object-orientation. In general, objects used internally for each request are as follows: HTTP messages, Request/Response, Entity, Method Definitions, Status Code Definitions, Status Code Definitions, and Header Field Definitions (based HTTP 1.0 specification).
The Method field indicates the method to be performed on the object identified by the URL. The methods supported by the HTTP/1.1 specification are OPTIONS, GET, HEAD, POST, PUT, DELETE, TRACE, and CONNECT (Fielding R., 1999). The GET method retrieves whatever is identified by the URI. HEAD is the same as GET, but it returns only the HTTP headers and no document body. The POST method requests that the destination server accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI in the Request-Line. POST is designed to allow a uniform method to cover the following functions: 1) annotation of existing resources; 2) posting a message to a bulletin board, newsgroup, mailing list, or similar group of articles; 3) providing a block of data, such as the result of submitting a form, to a data-handling process; and 4) extending a database through an append operation (Berners T., Fielding R. & Frystyk H., 1996b).

HTML

The Hypertext Markup Language (HTML) is a markup language used to create hypertext documents that are platform independent (Yergeau F. et al., 1997). The difference between XML and HTML is that HTML is defined by the W3C as a fixed vocabulary that every browser must support, while XML is extensible: its markup is customizable, and it is typically used to hold and describe data.

IV. Future of the World Wide Web

Say you had some lingering back pain: a program might determine a specialist's availability, check an insurance site's database for in-plan status, consult your calendar, and schedule an appointment. Another program might look up restaurant reviews, check a map database, cross-reference open table times with your calendar, and make a dinner reservation. Tim Berners-Lee and others describe this as a "web of data". This will be the new Web, capable of supporting software agents that are able not only to locate data but also to "understand" it in ways that will allow computers to perform meaningful tasks with data automatically, on the fly (Updegrove, 2001).

The Semantic Web is a web of data. It is about common formats for the integration and combination of data drawn from diverse sources, whereas the original Web mainly concentrated on the interchange of
documents. It is also about a language for recording how the data relates to real-world objects, allowing a person, or a machine, to start off in one database and then move through an unending set of databases that are connected not by wires but by being about the same thing (Herman I., 2009).

Representational State Transfer (REST) is an architectural style for distributed hypermedia systems; Fielding's dissertation describes the software engineering principles guiding REST and the interaction constraints chosen to retain those principles, contrasting them with the constraints of other architectural styles (Fielding R., 2000). The fundamental differences between the two are that the Semantic Web is an integration solution (a solution to the information silo), while REST is a set of state transfer operations universal to any data storage and retrieval system (Battle R. & Benson E., 2007). The Semantic Web provides ways to semantically describe and align data from disparate sources, while REST offers resource data access operations commonly known as CRUD (Create, Read, Update, and Delete).

In moving from the traditional "web of pages" to a "web of data", the Semantic Web's goal is to provide a cost-efficient way of sharing machine-readable data. The business of sharing machine-readable data in general has been around for quite some time, and the information silo has always been a challenge that researchers and IT practitioners are keen to solve. Service-oriented architecture (SOA) solutions have been created to satisfy business goals that include easy and flexible integration with legacy systems, streamlined business processes, reduced costs, innovative service to customers, and agile adaptation and reaction to opportunities and competitive threats. SOA is a popular architecture paradigm for designing and developing distributed systems (Bianco P. et al., 2007). In spite of the popularity of SOA and Web services, confusion among software developers is prevalent. To shed light on this: SOA is an architectural style, whereas Web services are a technology used to implement SOAs. Web services provide a standard means of interoperating between different software applications, running on a variety of platforms and/or frameworks (W3C, 2004). The Web services technology consists of several published standards, the most important ones being SOAP, XML (Extensible Markup Language), and WSDL (Web Services Description Language). There are other technologies, such as CORBA and
Jini, but to limit our discussion we are concerned only with Web services, since the others do not apply to the Web domain.

At the heart of service-oriented architecture is the service contract. It answers the question, "what service is delivered to the customer?" In the current Web services stack, WSDL is used to define this contract. However, WSDL defines only the operational signature of the service interface and is too brittle to support discovery in a scalable way. "SOAP" is no longer an acronym; a SOAP message represents the information needed to invoke a service or reflect the results of a service invocation, and contains the information specified in the service interface definition (W3C, 2004). Extensible Markup Language (XML) documents are made up of storage units called entities, which contain either parsed or unparsed data (Cowan J., 2008). SOAP and WSDL documents are good examples of XML documents. As mentioned earlier, Web services are just one roadmap toward service-oriented architecture.

The concept of a "web of data" was also introduced as a solution to the information silo, and it established the rationale for the Web-accessible API (Application Programming Interface). Technically speaking, a Web service is a Web-accessible API. So why is there a need for REST and Web semantics? There is a great amount of data available through REST and SOAP Web services, published by the private and public sectors; however, these data carry no markup that conforms to semantic standards. It is important to provide markup in a manner that a Semantic Web application suite understands, to make the services compatible and semantic query operations feasible (Battle R. & Benson E., 2007).

In traditional database systems we have SQL (Structured Query Language), the language used to interact with the database. In the semantic world, the corresponding technology is SPARQL (SPARQL Protocol and RDF Query Language). SPARQL can be used to express queries across diverse data sources, whether the data is stored natively as RDF or viewed as RDF via middleware. SPARQL contains capabilities for querying required and optional graph patterns along with their conjunctions and disjunctions. SPARQL also supports extensible value testing and constraining queries by source RDF graph. The results of SPARQL queries can be result sets or RDF graphs (W3C, 2008a).
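To illustrate the SQL/SPARQL analogy, the following sketch runs a simple SPARQL query over a local RDF graph. The file name, the choice of the FOAF vocabulary, and the use of Python's rdflib library are assumptions made for the example; they are not part of the cited specification.

```python
# A minimal sketch of querying RDF data with SPARQL, assuming Python's
# rdflib library and a hypothetical local data file (people.rdf).
from rdflib import Graph

g = Graph()
g.parse("people.rdf")  # load RDF triples from a (hypothetical) file

# Roughly analogous to a SQL "SELECT name, email FROM people" query.
query = """
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name ?email
    WHERE {
        ?person foaf:name ?name .
        OPTIONAL { ?person foaf:mbox ?email }   # an optional graph pattern
    }
"""

for row in g.query(query):
    print(row.name, row.email)
```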
It is now getting even more interesting: developers have to learn not only SQL but also SPARQL. Thanks are due to Edgar Frank Codd, the inventor of the relational database; without his invention we would still be using filing cabinets, physically sorting and searching records by hand.

It is important to provide markup in a manner the Semantic Web can understand, to make the services compatible and semantic query operations feasible. With existing SOAP Web services, what is needed is to add semantic information to the services, for example through OWL-S and SAWSDL. These provide details for each Web service parameter that describe how its value is derived from an ontology (Battle R. & Benson E., 2007). The OWL-S document maps each operation and message defined in the WSDL definition to the ontology. The limitation of SPARQL is that it is a query language designed for RDF; since most Web services return plain old XML, a conversion process from XML data to RDF is needed.

So where does REST come into play? REST is another roadmap toward SOA and a principle that is being applied to quite a few Web services implementations. While SOAP-based services have a WSDL document that defines their operations, there is no standard equivalent for REST services. This is an area that companies who adopted REST early must be aware of. Who knows: if companies are really convinced that this is the right way of doing Web services, then perhaps in the future there will be a standard way of implementing them.

V. Discussion

a) From the REST and Semantic Web points of view, there is no difference between slash-based and parameter-based URI references. There are two major requirements on the Semantic Web that the naming of resources must follow. First, a description of the identified resource should be retrievable with standard Web technologies. Second, a naming scheme should not confuse things with the documents representing them (Battle R. & Benson E., 2007). Both REST and the Semantic Web support the idea that "Cool URIs don't change". Tim Berners-Lee explained that the best resource identifiers don't just provide descriptions for people and machines, but are designed with simplicity, stability and manageability in mind. Based on the
W3C standard, the generic URI syntax consists of a hierarchical sequence of components: scheme, authority, path, query, and fragment.

   URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

The following are two example URIs and their component parts:

   foo://example.com:8042/over/there?name=ferret#nose
   \_/   \______________/\_________/ \_________/ \__/
    |           |            |            |        |
 scheme     authority       path        query   fragment
    |   _____________________|__
   / \ /                        \
   urn:example:animal:ferret:nose

The W3C recommends the use of standard session mechanisms instead of session-based URIs (W3C, 2003). What does this mean? HTTP/1.1 provides a number of mechanisms for identification, authentication and session management. Using these mechanisms instead of user-based or session-based URIs guarantees that the URIs used to serve resources are truly universal (allowing, for example, people to share, send, or copy them). For example: Bob tries to visit http://www.example.com/resource, but since it's a rainy Monday morning, he gets redirected to http://www.example.com/rainymondaymorning/resource. The day after, when Bob tries to access the resource he had bookmarked earlier, the server answers that Bob has made a bad request and serves http://www.example.com/error/thisisnotmondayanymore. Had the server served back http://www.example.com/resource because the Monday session had expired, it would have been, if not acceptable, at least harmless. The problem is that session-based URIs like these do not guarantee that the URIs used are truly universal. The acceptable practice in this situation is to use modifiers, such as "?" to pass arguments to a CGI program, or ";" to pass other kinds of arguments or context information.
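To make the component breakdown above concrete, the following sketch splits the example URI into its generic components. The use of Python's standard urllib.parse module is an assumption made for illustration; it is not part of the cited specifications.

```python
# A minimal sketch of decomposing a URI into the generic components named
# above; the URI is the illustrative example from the syntax discussion,
# not a real resource.
from urllib.parse import urlsplit

parts = urlsplit("foo://example.com:8042/over/there?name=ferret#nose")

print(parts.scheme)    # foo
print(parts.netloc)    # example.com:8042  (the "authority" component)
print(parts.path)      # /over/there
print(parts.query)     # name=ferret
print(parts.fragment)  # nose
```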
Roy Fielding did not state in his dissertation that URIs disallow parameterized references. Similarly, the Semantic Web requirements hold that as long as an identifier conforms to the two major requirements above and to the W3C specifications, its use is acceptable. Both REST and the Semantic Web consistently raise the implementation need to treat the URI as an abstraction. The key abstraction of information in REST is a resource: any information that can be named can be a resource, whether a document or image, a temporal service (e.g. "today's weather in Los Angeles"), a collection of other resources, a non-virtual object (e.g. a person), and so on (Fielding R., 2000). It was mentioned previously that the biggest challenge for search engines today is the quality of results. Search engine spiders do not presently crawl many types of "dynamic" web pages; typical examples of dynamic pages are the internal and external web applications that companies use to do business, as well as emerging Web 2.0 sites. Accordingly, it is important to identify the type of resource and map it to the underlying entity. Conceptual representation means that a resource is an abstraction of some arbitrary concept; once the mapping of the concept "resource" to a physical resource is done, it should remain in place as long as possible. Think about pages with an .asp extension: links to those pages may no longer work, since many companies have moved to .aspx.

b) HTTP is not a data transfer protocol; it is an application protocol (or a coordination language, if you prefer). REST does not "run on top of HTTP"; rather, HTTP is a protocol that displays many of the traits of the REST architectural style. HTTP is not designed to be a transport protocol. It is a transfer protocol in which the messages reflect the semantics of the Web architecture by performing actions on resources through the transfer and manipulation of representations of those resources. It is possible to achieve a wide range of functionality using this very simple interface, but following the interface is required in order for HTTP semantics to remain visible to intermediaries (Fielding R., 2000).
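A short sketch of this uniform interface in action follows: the same resource URI is read, replaced, and removed purely by transferring and manipulating representations. The host, resource path, and payload are hypothetical placeholders, and the use of Python's standard http.client module is an assumption for illustration.

```python
# A minimal sketch of acting on a resource only through its representations;
# the host, resource path, and XML payload are hypothetical.
import http.client

HOST, RESOURCE = "www.example.com", "/orders/42"
conn = http.client.HTTPConnection(HOST)

# Read the current representation of the resource.
conn.request("GET", RESOURCE, headers={"Accept": "application/xml"})
resp = conn.getresponse()
print(resp.status, resp.read())

# Replace the resource by transferring a new representation to the server.
conn.request("PUT", RESOURCE,
             body="<order><status>shipped</status></order>",
             headers={"Content-Type": "application/xml"})
resp = conn.getresponse()
print(resp.status, resp.read())

# Remove the resource; the intent is visible to intermediaries from the
# method alone, which keeps HTTP semantics on the wire.
conn.request("DELETE", RESOURCE)
resp = conn.getresponse()
print(resp.status, resp.read())
conn.close()
```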
Conceivably, it is easy to get the wrong idea that REST sits somewhere between an application protocol and a transport protocol when HTTP is described as a "transfer protocol". Leveraging the HTTP headers to provide request context around CRUD operations (Create for POST, Read for GET, Update for PUT and Delete for DELETE) allows developers to overlay the programmatic API for a website directly on top of the site exposed to web users, and reduces the cost and complexity of providing multi-format access to a site's underlying data.

c) What is the function of Extensible Markup Language (XML) in REST and the Semantic Web? Is it true that most REST services in deployment return not XML but HTML? Is it true that REST has no preference for XML? REST's data elements are summarized in Table 1.

Table 1. REST Data Elements

   Data Element               Modern Web Examples
   resource                   the intended conceptual target of a hypertext reference
   resource identifier        URL, URN
   representation             HTML document, JPEG image
   representation metadata    media type, last-modified time
   resource metadata          source link, alternates, vary
   control data               if-modified-since, cache-control

It is true that Fielding did not list XML as an example of a resource or of resource metadata; on the other hand, he did mention the representation media type, which is the data format of the representation, and he described a representation as consisting of data, metadata describing the data, and, on occasion, metadata to describe the metadata. As mentioned previously, XML is an open standard for describing data. As for the claim that "REST has no preference for XML": it does not hold if you are doing Web services, but if you are just creating pages that are not designed for machine
interpretation, why should you care about returning XML when you do not need it in the first place? Likewise, if you want to share your data, how do you want it represented? Using delimited text files? The idea of REST and the Semantic Web is to coexist with existing web standards, not to disqualify any of them; it is, in fact, the idea of platform and language interoperability.

VI. CONCLUSION

Acquiring data has become easier than ever before, and with the latest technology breakthroughs there is no doubt that, in time, the internet will be "all in one". Along the way there will be adjustments and corrections to make, and misconceptions, intentional or not, to address before this so-called "web of data" is reached. Support from the industry players is crucial. Data security issues have always been our primary concern. There is a plethora of questions that must be addressed, such as the following: a) Who will annotate the data? b) What is the advantage of giving the data away, when we all know that data is a valuable commodity? c) Without any centralized control, how will all this data be connected? d) Will existing AI techniques be sufficient to process this huge amount of data? And is it even practical to pursue this route?

On the other hand, the Web exploded so rapidly that, in the beginning, all we were concerned about was the sharing of documents. Giving credit to the early contributors who brought us this far, I firmly believe that the simple approach of the Web's existing implementation, particularly the URL (which was originally designed as the URI), opened the door to everybody, computer savvy or not. Explaining the URI alone to ordinary people, and getting it right the first time, is not simple. Philosophically speaking, is that not also what the concept of "universality of access" is about? In the same way, software is rarely right the first time it is written; writing software is an evolving process. Meeting both ends, I believe there is not much we can do about why things are done the URL way, but I do recognize that there is always room for improvement. With a better understanding of what went before and what is about to come, moving towards the future of the Semantic Web is ours to consider.
VII. REFERENCES

Berners T., Fielding R. & Masinter L. (2005). Uniform Resource Identifier (URI): Generic Syntax. Retrieved Feb 13, 2009 from http://labs.apache.org/webarch/uri/rfc/rfc3986.html#URLvsURN

Berners T. (2002). What do HTTP URIs Identify? Retrieved Feb 20, 2009 from http://www.w3.org/DesignIssues/Overview.html

Berners T. (1996a). The World Wide Web: Past, Present and Future. Retrieved Feb 20, 2009 from http://www.w3.org/People/Berners-Lee/1996/ppf.html

Berners T., Fielding R. & Frystyk H. (1996b). Hypertext Transfer Protocol -- HTTP/1.0. Retrieved Feb 13, 2009 from http://www.ietf.org/rfc/rfc1945.txt

Battle R. & Benson E. (2007). Bridging the Semantic Web and Web 2.0 with Representational State Transfer (REST). Retrieved Feb 13, 2009 from http://omescigil.etu.edu.tr/semanticweb/papers/sw_4.pdf

Bianco P., Kotermanski R. & Merson P. (2007). Evaluating a Service-Oriented Architecture. Retrieved Feb 13, 2009 from http://www.sei.cmu.edu/pub/documents/07.reports/07tr015.pdf

Brin S. & Page L. The Anatomy of a Large-Scale Hypertextual Web Search Engine. Retrieved Feb 20, 2009 from http://infolab.stanford.edu/~backrub/google.html

Cowan J., Fang A., Grosso P., Lanz K., Marcy G., Thompson H., Tobin R., Veillard D., Walsh N. & Yergeau F. (2008). Extensible Markup Language (XML) 1.0 (Fifth Edition). Retrieved Feb 13, 2009 from http://www.w3.org/TR/REC-xml/#sec-intro

Fielding R. (2000). Architectural Styles and the Design of Network-based Software Architectures. Retrieved Feb 20, 2009 from http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm

Fielding R., Mogul J., Gettys J., Frystyk H., Masinter L., Leach P. & Berners T. (1999). Hypertext Transfer Protocol -- HTTP/1.1. Retrieved Feb 13, 2009 from http://www.ietf.org/rfc/rfc2616.txt

Herman I. (2009). W3C Semantic Web Activity. Retrieved Feb 20, 2009 from http://www.w3.org/2001/sw/
The Economist (2003). Knowledge is power. Retrieved Feb 20, 2009 from http://www.economist.com/business/globalexecutive/education/displayStory.cfm?story_id=17626

Updegrove A. (2001). The Semantic Web: An Interview with Tim Berners-Lee. Retrieved Feb 13, 2009 from http://www.consortiuminfo.org/bulletins/semanticweb.php

W3C (2008a). SPARQL Query Language for RDF. Retrieved Feb 13, 2009 from http://www.w3.org/TR/rdf-sparql-query/

W3C (2008b). Cool URIs for the Semantic Web. Retrieved Feb 13, 2009 from http://www.w3.org/TR/2008/WD-cooluris-20080321/

W3C (2004). Web Services Architecture. Retrieved Feb 13, 2009 from http://www.w3.org/TR/ws-arch/#id2260892

W3C (2003). Common HTTP Implementation Problems. http://www.w3.org/TR/2003/NOTE-chips-20030128/

Yergeau F., Nicol G., Adams G. & Duerst M. (1997). Internationalization of the Hypertext Markup Language. http://www.rfc-editor.org/rfc/rfc2070.txt