Clouds, Grids, and Clusters
By Prof. SYED MUSTAFA, HKBK College Of Engineering
Introduction:
• Grid computing is a distributed computing approach in which the end user is
ubiquitously offered any of the services of a “grid” – a network of
computer systems located either in a Local Area Network (LAN) or a Wide
Area Network (WAN) spread over a geographical area.
• It aims at dynamic or “runtime” selection, allocation and control of
computational resources such as processing power, disk storage or
specialized software systems or even data, according to the demands of the
end-users.
GRID Computing – 1:
Clouds, Grids and Clusters – Prof. A. Syed Mustafa, HKBKCE.
UNIT – 5
Clouds, Grids, and Clusters
Text Books:
1. Anthony T. Velte, Toby J. Velte, Robert Elsenpeter: Cloud Computing, A Practical Approach, McGraw-Hill, 2010.
Chapters: 1,2,3,4,5,7,8, 9,10,12,13.
2. Prabhu: Grid and Cluster Computing, PHI, 2007. Chapters:1,2,3,4,5,6,7,8,9,10,12, 13,14,15,16
Unit 1 : Text 1-Chapters-1,2
Unit 2 : Text 1-Chapters-3,4
Unit 3 : Text 1-Chapters-5,7,8
Unit 4 : Text 1-Chapters-9,10,12,13
Unit 5 : Text 2-Chapters-1,2,5,7
Unit 6 : Text 2-Chapters-3,4,6
Unit 7 : Text 2-Chapters-8,9,10,12
Unit 8 : Text 2-Chapters-13,14,15,16
• Grid computing also means that the end-users connect to
a grid of computing resources the way the end-users of a power
supply or water supply grid connect and draw power or water as they need,
at any time and any location without any knowledge or reference to the
details such as the exact location or nature or quantity of the resource
being drawn.
• IBM :
▫ “A grid is a collection of distributed computing resources over a local or
wide area network, that appear to an end-user or application as one large
virtual computing system”
• Vision:
Create virtual dynamic organizations through secure, coordinated resource
sharing among individuals and institutions.
• Grid Computing:
An approach that spans locations, organizations, machine architectures and
software boundaries to provide unlimited power, collaboration and information
access to everyone connected to the grid.
The Internet is about getting computers connected; grid computing
is about getting computers to work together.
Grid computing combines the QoS of enterprise computing with the ability to share
heterogeneous distributed resources – everything from applications to data
storage and servers.
• Grid computing relies on middleware software that manages and executes all the
activities related to:
▫ Identification of resources
▫ Allocation and deallocation of resources
▫ Consolidation of resources
• Organizations having under-utilized or over-utilized resources need a dynamically
equitable distribution of resources.
The 'grid' refers to an infrastructure that enables the integrated, collaborative use of high
end computers, networks, databases, and also other scientific resources including
instruments owned and managed by various organizations.
“A computational grid is a hardware and software infrastructure that provides
dependable, consistent, pervasive, and inexpensive access to high-end computational
capabilities”
Multiple Virtual Organizations (MVOs) – collaborating agencies in a grid.
The grid is a network of computation, while the Internet is a network of communication.
The grid computing environment provides the tools and protocols for resource sharing
and problem solving in dynamic, multi-institutional virtual organizations or communities
of user organizations.
Grid computing combines a decentralized architecture for resource
management with a layered architecture, in a specific hierarchy, for the implementation
of the various services of the grid.
A grid computing system can have any configuration, starting with a Local Area
Network (LAN), up to a large Wide Area Network (WAN) at the national scale or even an
international network spanning several countries and continents.
It can span a single organization, or many organizations or service providers’ space.
A grid can focus on the pooled assets of one organization or a pool of Multiple Virtual
Organizations (MVOs), all of which use common protocols so as to enable the grid to offer
services and run applications in a secure and controlled way.
1.1 The Data Centre:
• Before the data centre concept emerged, organizations maintained their own servers and specialized
software.
• This approach was expensive and redundant.
• Data centres then shared resources among organizations.
• However, organizations connected to one data centre could not use resources from other data
centres.
• Grid computing enables multiple data centres (of the same or different organizations) to be
networked and shared.
• Grid is a combination of:
▫ Distributed computing
▫ High Performance computing
▫ Disposable computing
• Grid provides a metacomputing environment, which can be a megacomputing facility for the
users.
• Grid provides computational utility to its consumers
1.2 Cluster Computing and Grid computing :
Clusters are aggregations of processors in parallel configurations.
• Resource allocation is performed by a centralized resource manager and scheduling
system.
• A cluster comprises multiple interconnected, independent nodes that cooperatively
work together as a single unified resource.
• A grid, in contrast, has a resource manager for each node.
• A grid does not provide a single system view.
• Some grids are collections of clusters. Example: the NSF TeraGrid.
1.3 Metacomputing
• Metacomputing is all computing and computing-oriented activity which involves
computing knowledge (science and technology) utilized for the research,
development and application of different types of computing. ---- Wikipedia
• Use of powerful computing resources, transparently available to the user via a
networked environment is Metacomputing.
• Three essential steps to achieve goals of metacomputing are:
1. To integrate the large number of individual hardware and software resources
into a combined networked resource
2. To deploy and implement a middleware to provide a transparent view of
resources available
3. To develop and deploy optimal applications on the distributed metacomputing
environment to take advantage of the resources.
• Challenges in metacomputing:
▫ Viability of the link speeds for realistic application execution
▫ Ability and feasibility of executing the components of an application in parallel
▫ Resources and originating points of data are geographically distributed – data may need
to be processed in a distributed manner
• Metacomputing is useful when single-point usage of large, remotely
located resources is required.
• Metacomputing encompasses two broad categories:
▫ Seamless access to high performance computing
▫ Linking of computing resources, instruments and other resources.
1.3 Metacomputer composition
• A metacomputer is a virtual computer with a virtual computing architecture.
• Its constituent components (individual servers and other computational resources)
are individually not important.
• Metacomputer consists of:
1. Processors and memory
2. Network and communication software
3. Remote data access and retrieval
4. Virtual environment
1. Processors and memory
 The primary resources of a metacomputer are processors and their associated memory units.
 A metacomputer presents a single virtual view of several (a large number of) processors
and their associated memory units.
2. Network and communication software
An interconnected network of physically distributed processors.
Links between machines could be via modems, ISDN, Ethernet (fast/
gigabit), FDDI, ATM (Asynchronous Transfer Mode) or any other networking
technology.
High bandwidth and low latency are required to provide rapid and reliable
communication.
3. Remote data access and retrieval
 In a metacomputer comprising a large number of nodes, the data stored on the
secondary storage devices of each node must be accessible remotely and
retrievable on demand.
Data sizes may go up to petabytes.
Retrieval, replication and mirroring support will be required for the purposes of
recovery and business continuity.
The ability to access remote data is a challenge in metacomputing, as is the
ability to manage and manipulate large quantities of remote data.
4. Virtual environment
 Software, like an operating system, that can configure, manage and maintain the
metacomputing environment.
1.3 Evolution of Metacomputing projects
• Project FAFNER (1995) – Factoring via Network-Enabled Recursion
▫ Finding factors of large numbers in parallel, over a large network of mathematicians who
calculated the factors required for prime numbers in the context of encryption for
Public Key Infrastructure (PKI).
▫ PKI is used for secure communication with digital signatures.
▫ In the RSA (Rivest, Shamir, Adleman – Massachusetts Institute of Technology, MIT)
algorithm, keys are generated from extremely large numbers, so breaking them
requires factoring.
▫ RSA keys are typically 154-digit (512-bit) numbers.
▫ Started by Bellcore Labs and Syracuse University.
▫ FAFNER was initiated to factor RSA-130 using a numerical technique called the Number
Field Sieve (NFS) factorization method, using web servers for computation
and to distribute the factoring code and related data.
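The FAFNER idea – splitting one factoring job into independent slices that many machines work on in parallel – can be illustrated with a toy sketch. This uses trial division over local threads rather than the actual Number Field Sieve over web servers; all function names are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

def find_factors(n, start, stop):
    """Check one slice of candidate divisors, like one FAFNER volunteer site."""
    return [d for d in range(max(2, start), stop) if n % d == 0]

def distributed_factor(n, workers=4):
    """Split the candidate range [2, sqrt(n)] into slices and test them in parallel."""
    limit = int(n ** 0.5) + 1
    step = max(1, limit // workers)
    slices = [(s, min(s + step, limit)) for s in range(2, limit, step)]
    factors = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        starts, stops = zip(*slices)
        # Each slice is an independent work unit, as in FAFNER's sieving tasks.
        for part in pool.map(find_factors, [n] * len(slices), starts, stops):
            factors.extend(part)
    return sorted(factors)

print(distributed_factor(91))   # prints [7] (91 = 7 x 13)
```

The key property, as in FAFNER, is that the slices share no state, so any number of loosely connected workers can contribute.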
▫ A web interface for the NFS (Number Field Sieve) was produced.
▫ Web forms with the ‘GET’ and ‘POST’ methods were used to invoke server-side CGI scripts
written in Perl.
▫ Even single workstations with small memory (4 MB) could perform useful work
by sieving with small bounds.
▫ A consortium of sites was deployed to run the CGI script package locally.
▫ Monitoring was done hierarchically by the RSA-130 web servers.
• Project I-WAY (1995) - Information Wide Area Year
▫ An experimental high-performance network linking many servers, which also addressed
virtualization environments.
▫ The objective was to integrate existing high-bandwidth networks with telephone
systems.
▫ The servers, datasets and software environments (for virtualization) located at 17
different US sites were integrated by connecting them over 10 networks of
different bandwidths and protocols, using different routing and switching
technologies.
▫ To standardize I-WAY software interface management, key sites installed a Point of
Presence (POP) computer system to serve as their respective gateway to I-WAY.
▫ These I-POP systems were homogeneously configured UNIX workstations that
contained a software environment, I-Soft, which helped overcome problems and
issues related to heterogeneity, scalability, security and performance.
▫ I-POP machines provided uniform authentication, resource reservation and
process creation.
▫ I-POP also performed communication functions across the resources in I-WAY.
▫ A resource scheduler called the Computational Resource Broker (CRB) was developed
and used. It consists of User-CRB, CRB-User and CRB-Local Scheduler protocols.
▫ A central scheduler maintained queues of jobs and tables indicating the state of
local machines, allocating jobs to machines.
▫ Multiple local schedulers also operated for local scheduling.
▫ The AFS file system was used for file movement and processing functions.
▫ To support user-level tools, a software layer called ‘Nexus’ was used to provide an
automatic configuration mechanism.
▫ Supported applications: supercomputing, virtual reality, multi-virtual reality, web
video.
▫ Grid computing can be used in metacomputing mode for scientific applications.
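The scheduling scheme above – a central scheduler holding a job queue and a table of machine states – can be sketched minimally. The class and the free/busy model are illustrative assumptions, not the actual CRB protocols.

```python
from collections import deque

class CentralScheduler:
    """Toy central scheduler in the spirit of I-WAY's CRB: a job queue plus
    a table of local machine states, dispatching jobs to free machines."""

    def __init__(self, machines):
        self.queue = deque()                        # pending jobs, FIFO
        self.state = {m: "free" for m in machines}  # machine -> free/busy

    def submit(self, job):
        self.queue.append(job)

    def dispatch(self):
        """Allocate queued jobs to free machines; return a job -> machine map."""
        allocation = {}
        for machine, status in self.state.items():
            if status == "free" and self.queue:
                job = self.queue.popleft()
                self.state[machine] = "busy"
                allocation[job] = machine
        return allocation

    def finish(self, machine):
        self.state[machine] = "free"  # a local scheduler reports completion

sched = CentralScheduler(["hpc-a", "hpc-b"])
for job in ["j1", "j2", "j3"]:
    sched.submit(job)
print(sched.dispatch())   # j1 and j2 start on the two machines; j3 stays queued
```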
1.4 Scientific, Business and e-Governance Grids
• The grid computing approach has helped all computing communities – business,
scientific research and government applications.
• Scientific grids – users belong only to scientist groups.
• Business grids/e-governance grids – users may belong to any citizen group using
business services or government services.
• The number of users in business and e-governance is high – hence setting up
such grids is more complex.
• The Internet is the only communication network available to citizens for accessing a
business or an e-governance grid.
• The demands on user interfaces, access speeds and data sizes will be large.
1.5 Web Services and Grid Computing
• Users of business and e-governance grids will need web services over the Internet.
• Users of business grids will not be interested in hardware and software locations,
• nor in resource allocation management.
• Hence the need for integrating web services with grids.
• The Open Grid Services Architecture(OGSA) becomes essential to offer effective,
stateful web services based on Service Oriented Architecture(SOA) on the grid.
1.6 Business computing and the Grid – a Potential Win-win situation
• Grid was initially utilized for solving very long computational problems in Scientific
research, applications such as: weather forecasting models, molecular modelling,
bioinformatics, drug design, etc.
• Computational grids have been used for several years to solve large scale problems
in basic sciences and engineering.
• By harnessing the grid approach businesses can achieve cost reduction and better
QoS.
• Grid leverages its extensive information capabilities to support the processing and
storage requirements to complete a task.
• Hence the grid can provide maximum resource utilization, delivering the fastest and
cheapest service and maximum user satisfaction.
• In a Grid environment , heterogeneous computer systems located over a large area
or around the globe can be integrated and made to appear as a single
computational resource, to be optimally or maximally utilized by the user
community without any loss or wastage of time, investment or resources.
• The goal of the grid computing is to provide the users with a single view
(independent of the location) and a single mechanism that can be utilized to
support any number of computing tasks.
• The participating computer systems of a grid could be located in the same room or
distributed across the globe.
The grid computing for business is based on three factors:
1. The ability of grid to ensure more cost-effective use of a given amount of computer
resources
2. A methodology to solve any difficult or large problem by using grid as a ‘large
computer’
3. All the computing resources of a grid, such as CPUs, disk storage systems and
software packages, can be cooperatively and synergistically harnessed and
managed in collaboration towards a common business objective.
1.7 E-Governance and the Grid
• In the case of e-governance, the citizens become the end users, and citizen services of the
government become the most important and highest-priority application of the grid.
• Citizen services are delivered as e-governance services through the web.
• The new paradigm of technology is Web services based on SOA.
• When e-governance services are delivered as Web services on Service oriented
architecture, the grid has to support the Web services.
• The new grid architecture standard, OGSA (Open Grid Services Architecture), offers
stateful web services for e-governance.
• Grid tools such as the Globus Toolkit (V4) support the OGSA standard on the grid.
• It is possible to develop and offer e-governance services to citizens through web
services on the grid on the OGSA standard.
Key functional requirements in Grid Computing
1. Resource Management
2. Security Management
3. Data Management
4. Services Management
1. Resource Management
The ability to keep track of, allocate and release grid resources.
2. Security Management
The ability to ensure authenticated and authorized access to grid resources by users in
the external world.
3. Data Management
The ability to transport, clean, parcel and process data between any
two nodes in the grid, without the knowledge of the user.
4. Services Management
The ability of users and applications to query, and obtain responses from, the grid
efficiently.
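The first requirement, resource management (keeping track of, allocating and releasing resources), can be modelled as a simple in-memory table. The class and method names below are illustrative, not part of any grid toolkit.

```python
class GridResourceManager:
    """Minimal sketch of the resource-management function:
    track, allocate and release grid resources."""

    def __init__(self):
        self.resources = {}            # resource id -> current user (None = free)

    def register(self, res_id):
        self.resources[res_id] = None  # start keeping track of a new resource

    def allocate(self, res_id, user):
        if res_id in self.resources and self.resources[res_id] is None:
            self.resources[res_id] = user
            return True
        return False                   # unknown resource, or already in use

    def release(self, res_id):
        self.resources[res_id] = None  # resource returns to the free pool

mgr = GridResourceManager()
mgr.register("cpu-01")
print(mgr.allocate("cpu-01", "alice"))  # True: the resource was free
print(mgr.allocate("cpu-01", "bob"))    # False: already allocated
```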
Six Layered Architecture of Grid Computing
Layer 6 – Application Layer: Scientific/Engineering, Commercial, E-Governance applications
Layer 5 – Middleware Layer: Tools, Languages, Libraries
Layer 4 – Resource Management Layer: Resource access and management, scheduling services
Layer 3 – Job Management Layer: Job scheduling, job management, accounting
Layer 2 – Security Layer: Authentication and Authorization
Layer 1 – Infrastructure Layer: Processors, Storage, Software, Data
1. Infrastructure Layer / Grid Fabric:
 All computational resources: processors, disk storage, operating systems and
software (both application software and system software).
 All these resources are to be monitored and dynamically allocated and deallocated
to the various job requirements of the users.
 This function is performed by the middleware software (e.g., the Globus Toolkit).
2. Security Layer:
 Performs authentication and authorization of users who intend to access the grid
and its various computational resources.
3. Job Management Layer:
 Part of the Core Layer of the Grid Middleware software.
 Designed to hide the complexities of partitioning, distributing and load balancing
of jobs.
 Job scheduling of various jobs.
4. Resource Management Layer:
 The core layer of grid management.
 Resource monitoring, allocation and management of the resources drawn from
the lowest layer of the grid architecture.
5. Middleware Layer:
 The Globus Toolkit, which comprises tools, languages and the grid programming
environment, including compilers and libraries.
 It also includes resource brokers, which manage the execution of applications on
distributed resources using appropriate scheduling algorithms.
 It uses strategies for aggregating and allocating/deallocating computational resources
in the grid.
 It also includes various related tools and utilities, such as a grid file transfer protocol and
grid application development tools.
6. Application Layer:
 The layer of the actual grid applications themselves: scientific, engineering,
commercial or e-governance applications.
 Grid architectures are now being built based on Internet protocols and servers, for the
functionalities of communication, routing, name resolution, etc.
Standards for Grid Computing:
 Grid standards are being defined, accepted and implemented by vendors globally.
 Standards-based grids are termed third-generation or 3G grids.
 The first-generation (1G) grids were just local metacomputers, with basic
distributed services such as a distributed file system, single sign-on based distributed
applications and custom communication protocols.
 The 2G grids came as an improvement, with projects such as Condor, I-WAY and
Legion, in all of which the underlying communication protocols and software services
could be used for developing distributed applications and services.
 2G grids offered only a very basic core/kernel for application development, which
required significant customization effort.
 2G grids were not easily interoperable.
 The 3G grids came with standards for interoperability.
 They implement key services.
 OGSA brings together the web services standards – SOAP, XML, WSDL, UDDI – with the
grid computing standards developed by the “Globus Project” (the Globus Toolkit).
 The Globus Project is a joint effort of the Globus community of open-source researchers
and software programmers across the globe, with a focus on grid research, software
tools, test beds and applications.
 OGSA became the backbone of grid computing.
 GT4 (Globus Toolkit 4) supports OGSA and WSRF.
OGSA and WSRF:
 OGSA (Open Grid Services Architecture) is a distributed computing interoperability
standard, based on the notion of a grid service.
 It leverages web services, depending on WSDL (Web Service Description Language)
interfaces (which describe how to use the service).
 The UDDI registry and WSDL documents are used to locate a grid service.
 The transport protocol SOAP is used to connect data and applications for accessing a
grid service.
 OGSA was developed by the Global Grid Forum (GGF) with the objective of defining a
common, standard and open architecture for grid-based applications.
 OGSI (Open Grid Services Infrastructure) provides the basic interoperability, in terms of
RPC (Remote Procedure Call), for the rich set of high-level services and capabilities
that are collectively called OGSA.
 OGSA defines and describes a web services based architecture, composed of a set of
interfaces and their behaviors, to facilitate resource sharing and accessing in
heterogeneous dynamic environments.
OGSA for Resource Distribution:
 OGSA is a web services based architecture, comprising a set of interfaces and their
behaviors, to facilitate resource sharing and access in heterogeneous dynamic
environments. It relies upon the definition of a web service in WSDL, which defines
method names, parameters and their types for grid service access.
 The main theme of OGSA is that it is a service-oriented grid architecture supported by
the grid service – a special web service that provides a set of well-defined interfaces
that follow specific conventions.
 These interfaces of grid services address all matters pertaining to service discovery,
dynamic service instance creation, lifetime management, notification and manageability.
 All communications between the services are secure.
 All services are built on the Globus Toolkit.
 OGSA = grid structure + web services + toolkit.
[Figure: OGSA and SOA – a grid client communicates via secure messaging with grid
servers on Windows, Linux and UNIX platforms, through a standard interface with
multiple language bindings (Java, C#).]
Stateful Web Services in OGSA:
 Web services technology lacked ‘state’: plain web services are stateless.
 OGSA introduced ‘state’ for a web service.
 OGSA defines ‘stateful’ web services using a new framework called the “Web Services
Resource Framework” (WSRF).
WSRF ( Web Services Resource Framework) :
 The Web Services Resource Framework (WSRF) is defined by OASIS (Organization for
the Advancement of Structured Information Standards).
 It specifies how we can make a web service stateful, along with a few other useful
features.
 It is the result of a joint effort of the Web services and grid communities.
 It provides the stateful web services that OGSA needs.
 OGSA is the architecture that requires stateful web services;
WSRF is the infrastructure on which the OGSA architecture is based and built.
[Figure: Stateful web services architecture, WSRF with OGSA – OGSA requires stateful
web services, WSRF specifies them, and stateful web services extend the plain web
server.]
 Globus Toolkit version 4 implements both OGSA and WSRF.
 Plain web services are stateless, i.e., past transaction details are lost.
 Statelessness suffices when a web service need not remember past transactions.
 Stateful web services, in contrast, do remember the last transaction data.
 So we have both stateless and stateful services at hand.
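The stateless/stateful contrast can be sketched as follows; the counter resource is an illustrative model of a stateful service, not the actual WSRF wire protocol.

```python
# Stateless service: each call is independent, nothing is remembered.
def add(a, b):
    return a + b

# Stateful service in the WSRF spirit: state lives in named resources,
# and every request carries the resource key.
class CounterService:
    def __init__(self):
        self.resources = {}          # resource key -> remembered state

    def create(self, key):
        self.resources[key] = 0

    def increment(self, key):
        self.resources[key] += 1
        return self.resources[key]

svc = CounterService()
svc.create("session-1")
svc.increment("session-1")
print(svc.increment("session-1"))   # prints 2: past transactions are remembered
print(add(2, 3))                    # prints 5: no memory of any earlier call
```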
WSRF ( Web Services Resource Framework) Specifications :
 WSRF is a set of specifications, all to be used in the management of WS-Resources.
1. WS-ResourceProperties:
 A resource may have zero or more resource properties.
 For example, a file resource may have 3 properties [file name/number, size and
description].
 It specifies clearly how the properties or attributes are defined in WSDL.
2. WS-ResourceLifetime:
 The beginning and end of a resource’s lifetime.
 It supplies information to manage the lifecycle of resources.
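These first two specifications can be sketched together: a resource carrying a small property document (WS-ResourceProperties) and a scheduled termination time (WS-ResourceLifetime). The class below is an illustrative model, not the actual XML-based specifications.

```python
import time

class WSResource:
    """Illustrative WS-Resource: a property document plus a managed lifetime."""

    def __init__(self, properties, lifetime_seconds):
        self.properties = dict(properties)   # e.g. name, size, description
        self.termination_time = time.time() + lifetime_seconds

    def get_property(self, name):
        return self.properties.get(name)     # WS-ResourceProperties style read

    def is_expired(self, now=None):
        """WS-ResourceLifetime style check against the termination time."""
        return (now if now is not None else time.time()) >= self.termination_time

res = WSResource({"name": "file-1", "size": 42, "description": "demo"}, 3600)
print(res.get_property("size"))   # prints 42
print(res.is_expired())           # prints False: an hour of lifetime remains
```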
3. WS-ServiceGroup:
 Used to manage a group of web services or a group of WS-Resources.
 Operations such as add/remove/find a service in a group are performed.
 It specifies how exactly we should go about grouping the services or WS-Resources.
 GT features such as the Index Service enable the user to group and index the services
and also their references.
4. WS-BaseFaults:
 A mechanism for reporting faults when something goes wrong
during a web service invocation.
5. WS-Notification:
 It enables a web service to be configured as a ‘notification producer’ and certain
clients to be ‘notification consumers’.
 It notifies all the subscribers or consumers whenever a change occurs in a WS or
WS-Resource.
6. WS-Addressing:
 It provides a way to address a web service/resource pair (a WS-Resource).
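WS-Notification's producer/consumer pattern can be sketched with callbacks standing in for the consumers. This is an illustrative model; the real specification defines XML message formats and subscription resources.

```python
class NotificationProducer:
    """Toy notification producer: consumers subscribe per topic and are
    called back whenever a change is published on that topic."""

    def __init__(self):
        self.subscribers = {}                 # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers.setdefault(topic, []).append(callback)

    def notify(self, topic, message):
        for callback in self.subscribers.get(topic, []):
            callback(message)                 # push the change to each consumer

received = []
producer = NotificationProducer()
producer.subscribe("resource-changed", received.append)
producer.notify("resource-changed", "size is now 43")
print(received)   # prints ['size is now 43']
```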
UNIT – 6
GRID Computing – 2:
Web Services and the Service Oriented Architecture (SOA) :
1. History and Background:
 Before the advent of object-oriented programming, all programming was
monolithic (unstructured).
 The structured programming methodology came after monolithic programming,
 resulting in a better, modular programming methodology.
 But it had a procedural paradigm – procedures, subroutines and functions with
global data maintenance – too difficult to manage.
 So a better and simpler abstraction, the object-oriented approach, was invented.
 Reusable individual modules or components can reduce the time and effort of
software development – this resulted in component-based software.
 The nature of software evolved over time – mainframe batch
processing was replaced with ‘online’, ‘interactive’ software.
 Object-oriented, component-based software was used for online, high-transaction-rate,
user-friendly and highly available systems.
 Component-based, object-oriented software addressed the issues of extensibility,
reusability and maintainability.
 Component-based architecture alone was not adequate; distributed deployment,
heterogeneous platforms, interoperability and application integration across diverse
protocols and interfaces also had to be achieved.
 Software development became compartmentalized – from mainframe monolithic
programming to client-server architecture.
 Software could be broken into client-side and server-side software.
 The client – typically a PC with a GUI – connected to a server at the back end, with
shared resources or a database, tightly coupled.
 From single-server LANs to distributed systems supporting WANs, and to interconnect
multiple WANs across the globe, the Internet emerged.
 In a large distributed system or network such as the Internet, the client does not know
the server and the server does not know the client.
 Multi-tier architecture evolved – the front tier (the client user interface) talks to a
middle tier.
 The middle tier takes the request from the front tier and sends it to the back-end server.
This middle tier is known as a ‘broker’.
 This 3-tier design is extended to an ‘n’-tier architecture over the Internet.
 In order to minimize dependence between components, loose coupling is needed.
 Loose coupling makes it possible to relocate a service to a different server without changing the interface used to communicate with it.
 Service will be discovered with its capabilities, policies, location and protocols.
 Before SOA and web services, technologies such as ORBs (Object Request Brokers), based on the CORBA (Common Object Request Broker Architecture) specification, and DCOM, COM and OLE were developed.
 Interoperability across languages and vendors was achieved by CORBA through the IIOP (Internet Inter-ORB Protocol).
 IIOP served as a transport protocol for distributed applications whose interfaces were written in IDL or exposed via RMI.
 CORBA has limitations: parameters and return values are limited to what can be represented in the programming language.
 Argument and return values are actual types, not abstracted types.
2. Service Oriented Architecture(SOA):
 SOA is based on the loosely coupled 'service' as its central abstraction.
 A service in SOA is an exposed piece of behaviour, loosely coupled so that the context in which the service is consumed is independent of the service's platform.
 A service may be dynamically discovered and called.
 Dynamic discovery is possible through a directory service available at run time.
 In principle, a service is an atomic operation encapsulating a particular business process.
 A service should be self-contained (i.e., stateless).
 A Service should be publishable, discoverable, communicable, self-contained, well-
defined and independent of platform or other services.
 Service Providers and consumers communicate via messages.
 The Service contract will clearly specify the invocation procedure.
 Platform independent messages are exchangeable.
 The Service invocation and responses should be platform independent, i.e.
interoperable across heterogeneous platforms, operated through platform
independent messages.
 Web services interact with each other automatically, without human intervention, not even programmer-mediated intervention at run time.
 A web service (A) can interact with another web service (B) in a completely automated manner using UDDI, WSDL, SOAP and XML.
 Every service provider should offer facilities for service registration, dynamic discovery, encryption handling, security and platform-independent interoperability, and should be able to participate in cluster and grid computing.
[Figure: the SOA triangle. The Service Provider registers with the Service Registry; the Service Consumer discovers the service through the registry, then binds to the provider and executes the service.]
 Service Oriented Architecture (SOA) involves service providers, service consumers and a service registry.
 Initially, service providers register themselves with the service registry.
 Then, service consumers find and bind to (engage) service providers.
 With heterogeneous platforms offering web services, data exchange takes place through XML.
 Service invocation takes place through SOAP.
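The register / discover / bind-and-execute cycle above can be sketched in a few lines. This is a toy illustration, not a real UDDI registry; the class, service and city names are all invented:

```python
# Illustrative sketch of the SOA cycle: a dict-backed service registry
# supporting register (provider), discover (consumer) and bind/execute.

class ServiceRegistry:
    def __init__(self):
        self._services = {}                 # service name -> provider endpoint

    def register(self, name, provider):
        """Service provider publishes itself with the registry."""
        self._services[name] = provider

    def discover(self, name):
        """Service consumer looks the provider up at run time."""
        return self._services.get(name)

# A provider exposes a piece of behaviour behind a name, not a platform.
def weather_service(city):
    return f"Forecast for {city}: sunny"

registry = ServiceRegistry()
registry.register("WeatherService", weather_service)   # 1. register

service = registry.discover("WeatherService")          # 2. discover
result = service("Bangalore")                          # 3. bind and execute
print(result)                                          # Forecast for Bangalore: sunny
```

The consumer never names the provider directly; it only knows the service name and the registry, which is the loose coupling the slides describe.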
• A Web service is a software system designed to support interoperable system-to-
system interaction over a network.
• “self-contained, modular business applications that have open, Internet-
oriented, standards-based interfaces” - [ UDDI consortium ]
• “A software application identified by a URI, whose interfaces and
bindings are capable of being defined, described, and discovered as
XML artifacts. A Web service supports direct interactions with other
software agents using XML-based messages exchanged via Internet-
based protocols” - [ W3C ]
Web service
• A Web-Service is an interface that describes a collection of operations that
are network accessible through standardized XML messaging - [ Microsoft ]
• Web services are a new breed of Web applications. They are self-contained,
self-describing, modular applications that can be published, located, and
invoked across the Web. Web services perform functions, which can be
anything from simple request to complicated business processes… Once a
Web service is deployed, other applications (and other Web services) can
discover and invoke the deployed service. [IBM]
Web Service - General characteristics
• Web Services are fully XML based
• Expose interfaces: clients access the service functionality through interfaces; communication is between applications as opposed to between users
• Self-describing modular units
• Accessible over the Web
• Interaction via XML-based messages over standard Web protocols
• Registered and discoverable at the Web service registry
• Platform, language and protocol independence
• Compositions of Web Services are possible
Web Service Platform
• WSDL – Web Services Description Language
• SOAP – Simple Object Access Protocol
• UDDI – Universal Description, Discovery and Integration
• XML – eXtensible Markup Language
XML
•It is used to tag the data
•Extensible Markup Language, a specification developed by the W3C.
•It is used to create customized tags, enabling the definition, transmission,
validation and interpretation of data between applications and between
organizations.
SOAP
•Communication protocol is used to transfer the data
•a lightweight xml-based messaging protocol used to encode the information in Web
service request and response messages before sending them over a network.
• SOAP messages are independent of any operating system or protocol and may be
transported using a variety of Internet protocols, including SMTP, MIME and HTTP.
WSDL
WSDL is used for describing the services available
• Web Services Description Language, an XML-formatted language used to describe
a Web service's capabilities as collections of communication endpoints capable of
exchanging messages.
•WSDL is an integral part of UDDI, an XML-based worldwide business registry.
WSDL is the language that UDDI uses.
•WSDL was developed jointly by Microsoft and IBM.
UDDI
•UDDI is a directory service where companies can register and search for Web
services.
•Universal Description, Discovery and Integration.
•It is a Web-based distributed directory that enables businesses to list themselves
on the Internet and discover each other
Web Service definition revisited
• A more precise definition:
▫ an application component that:
 Communicates via open protocols (HTTP, SMTP, etc.)
 Processes XML messages framed using SOAP
 Describes its messages using XML Schema
 Provides an endpoint description using WSDL
 Can be discovered using UDDI
Web Services Components
• XML – eXtensible Markup Language – A uniform data representation
and exchange mechanism.
• SOAP – Simple Object Access Protocol – A standard way for
communication.
• UDDI – Universal Description, Discovery and Integration specification –
A mechanism to register and locate WS based application.
• WSDL – Web Services Description Language – A standard meta-language used to describe the services offered.
Example – A simple Web Service
• A buyer (which might be a simple client) is ordering goods from a seller service.
• The buyer finds the seller service by searching the UDDI directory.
• The seller service is a Web Service whose interface is defined using Web Services
Description Language (WSDL).
• The buyer is invoking the order method on the seller service using Simple Object
Access Protocol (SOAP) and the WSDL definition for the seller service.
• The buyer knows what to expect in the SOAP reply message because this is defined
in the WSDL definition for the seller service.
The Web Service Model
• The Web Services architecture is based upon the interactions between
three roles:
▫ Service provider
▫ Service registry
▫ Service requestor
• The interactions involve the:
▫ Publish operations
▫ Find operation
▫ Bind operations.
The Web Service Model (cont)
The Web Services model follows the publish, find, and bind paradigm.
[Figure: the Web Service Provider publishes to the Web Service Registry (1. publish); the Web Service Client finds the service in the registry (2. find) and then binds to and invokes it on the provider (3. bind/invoke).]
XML
• XML stands for EXtensible Markup Language.
• XML is a markup language much like HTML.
• XML was designed to describe data.
• XML tags are not predefined. You must define your own tags.
• The perfect choice for enabling cross-platform data communication in
Web Services.
XML vs HTML
An HTML example:
<html>
<body>
<h2>John Doe</h2>
<p>2 Backroads Lane<br>
New York<br>
045935435<br>
john.doe@gmail.com<br>
</p>
</body>
</html>
XML vs HTML
• This will be displayed as:
• HTML specifies how the document is to be displayed,
and not what information is contained in the
document.
• Hard for machine to extract the embedded
information. Relatively easy for human.
John Doe
2 Backroads Lane
New York
045935435
John.doe@gmail.com
XML vs HTML
• Now look at the following:
• In this case:
▫ The information contained is being marked, but not for
displaying.
▫ Readable by both human and machines.
<?xml version="1.0"?>
<contact>
<name>John Doe</name>
<address>2 Backroads Lane</address>
<country>New York</country>
<phone>045935435</phone>
<email>john.doe@gmail.com</email>
</contact>
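Because the tags name the data, a program can pull out any field without guessing at layout. A minimal sketch using Python's standard-library XML parser on the same `<contact>` document:

```python
# Parse the <contact> XML from the slide and extract fields by tag name,
# showing that the markup makes the data machine-readable.
import xml.etree.ElementTree as ET

doc = """<?xml version="1.0"?>
<contact>
  <name>John Doe</name>
  <address>2 Backroads Lane</address>
  <country>New York</country>
  <phone>045935435</phone>
  <email>john.doe@gmail.com</email>
</contact>"""

contact = ET.fromstring(doc)
name = contact.findtext("name")
email = contact.findtext("email")
print(name, email)   # John Doe john.doe@gmail.com
```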
SOAP
• SOAP originally stood for "Simple Object Access Protocol" .
• Web Services expose useful functionality to Web users through a
standard Web protocol called SOAP.
• SOAP is an XML vocabulary standard that enables programs on separate
computers to interact across any network. SOAP is a simple markup
language for describing messages between applications.
• SOAP mainly uses HTTP as a transport protocol. That is, an HTTP message
contains a SOAP message as its payload section.
SOAP Characteristics
• SOAP has three major characteristics:
▫ Extensibility – security and WS-routing are among the extensions under
development.
▫ Neutrality - SOAP can be used over any transport protocol such as HTTP,
SMTP or even TCP.
▫ Independence - SOAP allows for any programming model.
SOAP Building Blocks
A SOAP message is an ordinary XML document containing the following
elements:
▫ A required Envelope element that identifies the XML document as a SOAP
message.
▫ An optional Header element that contains header information.
▫ A required Body element that contains call and response information.
▫ An optional Fault element that provides information about errors that occurred
while processing the message.
SOAP Request
POST /InStock HTTP/1.1
Host: www.stock.org
Content-Type: application/soap+xml; charset=utf-8
Content-Length: 150
<?xml version="1.0"?>
<soap:Envelope
xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
<soap:Body xmlns:m="http://www.stock.org/stock">
<m:GetStockPrice>
<m:StockName>IBM</m:StockName>
</m:GetStockPrice>
</soap:Body>
</soap:Envelope>
SOAP Response
HTTP/1.1 200 OK
Content-Type: application/soap+xml; charset=utf-8
Content-Length: 126
<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
<soap:Body xmlns:m="http://www.stock.org/stock">
<m:GetStockPriceResponse>
<m:Price>34.5</m:Price>
</m:GetStockPriceResponse>
</soap:Body>
</soap:Envelope>
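A client consumes the response above by unwrapping the Body. A short sketch; note the `m:` namespace URI matches the `xmlns:m` declaration in the example (the `encodingStyle` attribute is omitted here for brevity):

```python
# Extract the price from the SOAP response shown in the slide.
import xml.etree.ElementTree as ET

response = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope">
  <soap:Body xmlns:m="http://www.stock.org/stock">
    <m:GetStockPriceResponse>
      <m:Price>34.5</m:Price>
    </m:GetStockPriceResponse>
  </soap:Body>
</soap:Envelope>"""

root = ET.fromstring(response)
# Elements are addressed by {namespace-URI}local-name.
price = root.find(".//{http://www.stock.org/stock}Price").text
print(price)   # 34.5
```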
SOAP Security
• SOAP uses HTTP as a transport protocol and hence can use HTTP
security mainly HTTP over SSL.
• But, since SOAP can run over a number of application protocols (such as
SMTP) security had to be considered.
• The WS-Security specification defines a complete encryption system.
WSDL
• WSDL stands for Web Services Description Language.
• WSDL is an XML vocabulary for describing Web services. It allows developers to
describe Web Services and their capabilities, in a standard manner.
• WSDL specifies what a request message must contain and what the response
message will look like in unambiguous notation. In other words, it is a contract
between the XML Web service and the client who wishes to use this service.
• In addition to describing message contents, WSDL defines where the service is
available and what communications protocol is used to talk to the service.
The WSDL Document Structure
• A WSDL document is just a simple XML document.
• It defines a web service using these major elements:
▫ port type - The operations performed by the web service.
▫ message - The messages used by the web service.
▫ types - The data types used by the web service.
▫ binding - The communication protocols used by the web service.
WSDL Document
<message name="GetStockPriceRequest">
<part name="stock" type="xs:string"/>
</message>
<message name="GetStockPriceResponse">
<part name="value" type="xs:string"/>
</message>
<portType name="StocksRates">
<operation name="GetStockPrice">
<input message="GetStockPriceRequest"/>
<output message="GetStockPriceResponse"/>
</operation>
</portType>
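Because WSDL is plain XML, a client tool can read the operations and messages out of it programmatically. A sketch over the fragment above (wrapped in a `definitions` root so it parses; namespace declarations are omitted as in the slide):

```python
# Read operation and message names out of the WSDL fragment.
import xml.etree.ElementTree as ET

wsdl = """<definitions>
  <message name="GetStockPriceRequest">
    <part name="stock" type="xs:string"/>
  </message>
  <message name="GetStockPriceResponse">
    <part name="value" type="xs:string"/>
  </message>
  <portType name="StocksRates">
    <operation name="GetStockPrice">
      <input message="GetStockPriceRequest"/>
      <output message="GetStockPriceResponse"/>
    </operation>
  </portType>
</definitions>"""

root = ET.fromstring(wsdl)
messages = [m.get("name") for m in root.iter("message")]
operations = [o.get("name") for o in root.iter("operation")]
print(messages)    # ['GetStockPriceRequest', 'GetStockPriceResponse']
print(operations)  # ['GetStockPrice']
```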
UDDI
• UDDI stands for Universal Description, Discovery and Integration.
• UDDI is a directory for storing information about web services , like
yellow pages.
• UDDI is a directory of web service interfaces described by WSDL.
3. How a Web Service Works:
 Initially, a client interested in finding a web service asks the 'Discovery Service'
(on Server 1) for a specific web service (e.g., a weather forecast web service).
 The discovery service on Server 1 replies to the client that there is a web service
for weather forecasts on Server 2.
 The client contacts the web service on Server 2 and asks how exactly it should
interact with the service.
 The service replies that the client should consult the relevant WSDL.
 The WSDL description provides the exact syntax of invocation along with the
parameters.
[Figure: the Client queries the Discovery Service on Server 1, which points it to the Web Service on Server 2.]
 The client then sends a SOAP message to the web server with the parameters
(e.g., the location).
 The web server replies with a SOAP response containing the weather forecast for
the given location (the parameters supplied by the client).
 All interactions between client and server are through SOAP.
4. SOAP and WSDL:
 SOAP and WSDL are the essential parts of Web services Architecture as shown
below:
Discovery, Aggregation Processes
WSDL Description
SOAP Invocation
HTTP Transport
The transport of messages is done by HTTP (HyperText Transfer Protocol).
 Web service invocation involves passing messages between the client and the
server.
 SOAP, or Simple Object Access Protocol, specifies how the request should be
formatted while being sent to the server and how the server should format its
response.
 SOAP is the most popular choice for invocation of a web service.
 HTTP is the most popular transport protocol.
5. Description:
 Web services are self-describing.
 Once a web service is located, it can be asked to describe itself.
 It describes what operations it supports and how to invoke them.
 This is done through a language called WSDL, the Web Services Description Language.
 A web service is addressed through its URI.
 Web service invocation happens automatically, without human intervention.
6. Creating Web Services:
 Web services are created by a web service programmer using the programming
language of choice, along with WSDL.
 Existing tools can be used to create a web service, and the tool will generate the
WSDL itself.
 SOAP code will always be generated and interpreted automatically.
 The piece of code through which a client application invokes a web service is called a 'stub'.
 Based on the WSDL, the tool can generate the stub automatically.
 Stubs can be client-side or server-side stubs.
 A Stub is a piece of code that generates a SOAP request from a client or interprets
a SOAP message which is a server response.
WSDL
Client stub: client-side code that generates SOAP requests and interprets the SOAP responses received from the server.
Server stub: server-side code that interprets SOAP requests received from clients and generates the SOAP responses to be sent to clients.
 Stubs are generated only once.
 The stubs are reused any number of times, until replacements or modifications are
introduced to the web service.
 Whenever a client application needs to invoke a web service, it calls the
client-side stub.
 The client-side stub will generate a proper SOAP request according to the WSDL of
the web service being called by the client application.
 This process is called 'marshalling' or 'serializing'.
 The SOAP request is sent over a network using HTTP transport protocol.
 The server receives this SOAP request from the client.
 The server simply hands over the request to the stub on the server side.
 The server stub will convert the SOAP request into a form which can be
understood by the service implementation.
 This step is called ‘unmarshalling or deserializing’.
 After the SOAP request is deserialized, the server stub will invoke the service
implementation, which then executes the request it has received, generating a
result.
 The result so generated by the execution of the request is handed over to the
server stub, which will now convert the result into a SOAP response.
 The SOAP response is sent over the network again using HTTP transport protocol.
 The Client stub receives this SOAP response and converts it into a form which can
be understood by the client application.
 Finally, the application receives the result of the web service invocation and uses it.
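The whole marshalling/unmarshalling round trip described above can be sketched end-to-end. This is a toy stand-in for a SOAP engine, not a real one; all function, operation and city names are invented for illustration:

```python
# Sketch of the stub pipeline: client stub marshals a call into an XML
# message, server stub unmarshals it, invokes the service implementation,
# and marshals the result back as a response.
import xml.etree.ElementTree as ET

def client_stub_marshal(operation, **params):
    """Client stub: serialize a call into an XML request (marshalling)."""
    body = ET.Element(operation)
    for name, value in params.items():
        ET.SubElement(body, name).text = str(value)
    return ET.tostring(body, encoding="unicode")

def server_stub(request_xml, implementation):
    """Server stub: deserialize, invoke the implementation, serialize the result."""
    call = ET.fromstring(request_xml)                   # unmarshalling
    params = {child.tag: child.text for child in call}
    result = implementation(**params)                   # invoke service code
    response = ET.Element(call.tag + "Response")
    ET.SubElement(response, "result").text = str(result)
    return ET.tostring(response, encoding="unicode")    # marshal response

def get_forecast(city):
    return f"Sunny in {city}"                           # service implementation

request = client_stub_marshal("GetForecast", city="Mysore")
response = server_stub(request, get_forecast)

# The client stub then unmarshals the response for the application:
result = ET.fromstring(response).findtext("result")
print(result)   # Sunny in Mysore
```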
7. Server Side:
Server Side Architecture
[Figure: server-side architecture: the SOAP engine runs inside an application server, which in turn is hosted by an HTTP (Web) server.]
 The HTTP server is commonly called a Web server.
 It is a piece of software which knows how to handle HTTP messages.
 Examples: Apache HTTP Server, BEA WebLogic, WebSphere, Pramati, etc.
 An application server is a piece of software that hosts and provides facilities for
hosting 'applications', which will be accessed by different clients.
 The SOAP engine runs as an application inside the application server.
 Example: Jakarta Tomcat, a Java Servlet and JavaServer Pages (JSP) container that is
frequently used with Apache Axis and the Globus Toolkit.
 We can operationalize web services by installing a SOAP engine in the application
server.
 A SOAP engine is a piece of software which can handle SOAP requests and
SOAP responses.
 It is common to use a SOAP engine which can also generate server stubs for
each individual web service.
 Apache Axis is a good example of a SOAP engine.
 The role of the SOAP engine is limited to manipulating SOAP.
 A web service is a piece of software that exposes a set of operations.
 If the web service is implemented in Java, the web service will be a Java class.
 This web service will receive service requests from various clients through SOAP
messages, which will be analyzed by the SOAP engine.
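The SOAP engine's dispatching role, mapping the operation named in a request body to a method on the exposed service class, can be sketched as below. This is only an illustration (a real engine such as Apache Axis also handles WSDL, faults and encodings); the service class and its data are invented:

```python
# Toy dispatcher: read the operation element from a SOAP-style body and
# call the matching method on the service object.
import xml.etree.ElementTree as ET

class StockService:                     # the exposed web service "class"
    def GetStockPrice(self, StockName):
        prices = {"IBM": "34.5"}        # illustrative data, not a live feed
        return prices.get(StockName, "unknown")

def soap_engine(service, body_xml):
    call = ET.fromstring(body_xml)
    operation = getattr(service, call.tag)              # element name -> method
    args = {child.tag: child.text for child in call}
    return operation(**args)

price = soap_engine(StockService(),
                    "<GetStockPrice><StockName>IBM</StockName></GetStockPrice>")
print(price)   # 34.5
```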
Globus Toolkit Version 4:
GT4 Architecture:
Storage Request Broker (SRB)
UNIT – 7
CLUSTER Computing – 1:
What is cluster computing?
 Cluster computing is a form of computing in which a group of computers
are linked together so that they can act as a single entity.
 It is the technique of linking two or more computers into a network
(usually through a local area network) in order to take advantage of the
parallel processing power of those computers.
 To meet Grand Challenge Applications (GCAs), the CPU power may be
increased or, alternatively, parallel computing approaches are available to
break the job down into several components and allow multiple processors
to process them simultaneously, so that very high overall performance can
be achieved.
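The gain from splitting a job across processors can be estimated with Amdahl's law; the figures below are purely illustrative:

```python
# Amdahl's law: speedup = 1 / ((1 - p) + p / n), where p is the fraction of
# the job that can run in parallel and n is the number of processors.
def speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

# Illustrative: a job that is 95% parallelizable on a 100-node cluster.
s = speedup(0.95, 100)
print(round(s, 1))   # 16.8
```

The serial 5% caps the speedup at 20x no matter how many nodes are added, which is why decomposing the job well matters as much as adding processors.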
Approach to Parallel Computing:
1. Massively Parallel Processors (MPPs)
2. Symmetric Multiprocessors (SMPs)
3. Cache-Coherent Non-Uniform Memory Access (CC-NUMA)
4. Distributed System Processing
5. Cluster Computing
1. Massively parallel Processors(MPPs)
 The MPP approach comprises a large parallel processing system with a
shared-nothing architecture.
 This is known as the multicomputer approach to parallel processing.
 There may be several hundred processing elements or nodes, all
connected with each other through a high speed interconnection
network/switch.
 Each node will contain at least one or more processors and one main
memory unit.
 Special nodes can have additional peripherals such as disks or backup
systems connected.
 Each node will have a separate copy of the operating system.
 All the nodes parallelly execute the tasks assigned to them.
2. Symmetric Multi Processors(SMPs)
 SMP systems can have 2 to 256/512/1024 processors.
 All processors share all the global resources available, such as I/O
systems, memory and the bus.
 Only a single copy of the operating system executes on all the systems
put together.
3. Cache-Coherent Non-Uniform Memory Access(CC-NUMA)
 CC-NUMA is a scalable multiprocessor system architecture having a
cache coherent non-uniform memory access.
 CC-NUMA system has a global view of all the memory.
 NUMA stands for non-uniform time taken to access the nearest and
farthest part of the memory (shortest and the longest time taken to
access the memory).
4. Distributed System Processing
 Distributed Processing, with processors distributed in a network, can be
understood as conventional networking of independent computer
systems.
 Each node is under its own operating system.
 Each node can also be a cluster by itself.
5. Cluster Computing:
 A Cluster is a collection of PCs or Workstations that are connected with
each other, using some networking technology (with an interconnection
network or switch).
 For high performance or parallel computing purposes a cluster will have
a number of high performance PCs or workstations, connected with
each other through a high speed interconnection network.
 A cluster works as a single, integrated collection of resources.
 It can have a single system image spanning all its nodes.
 This contrasts with a grid of distributed processing nodes, where each node
has a separate identity, a separate operating system and a separate set of
resources, which could also be shared.
How to achieve Low cost Parallel computing through clusters:
 The use of a cluster of PCs and workstations to achieve high performance
computing at supercomputer scale has been successful.
 Clustering became practical due to the standardization of tools and
utilities used by parallel applications.
 Message-passing libraries (MPI is a prominent example) and the
data-parallel language HPF are some of these standards.
 These standards enabled the development, testing and debugging of
applications on a cluster (or NOW, Network of Workstations / COW, Cluster
of Workstations).
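The flavour of such message-passing programming can be sketched with a toy stand-in, using threads and queues in place of cluster nodes and the interconnect (a real cluster would use MPI or PVM across separate machines):

```python
# Toy message-passing sketch: each "node" (thread) receives a chunk of the
# job over its inbox, computes a partial result, and sends it back.
import threading
import queue

def worker(rank, inbox, outbox):
    chunk = inbox.get()                     # receive message (blocking)
    outbox.put((rank, sum(chunk)))          # send partial result back

inboxes = [queue.Queue() for _ in range(4)]
results = queue.Queue()
threads = [threading.Thread(target=worker, args=(r, inboxes[r], results))
           for r in range(4)]
for t in threads:
    t.start()

data = list(range(100))                     # the job, split into 4 chunks
for r in range(4):
    inboxes[r].put(data[r * 25:(r + 1) * 25])

for t in threads:
    t.join()

total = sum(results.get()[1] for _ in range(4))
print(total)   # 4950
```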
 For testing and debugging, CPU time is not required to be accounted or
charged.
 On dedicated parallel platforms used for regular execution, CPU time is
accounted and charged.
Components for clusters are based on the basic concepts of parallel
processing: using multiple processors, network interfaces, memory and hard
disks, with facilities as follows:
1. Multiple workstations can be connected to form a cluster that acts as an
MPP (Massively Parallel Processor) with a shared-nothing architecture.
2. The memory associated with each workstation acts as an aggregate DRAM
(Dynamic RAM) cache, improving virtual memory and file system
performance.
3. RAID (Redundant Array of Inexpensive Disks): using arrays of workstation
disks to provide cheap, highly available and scalable file storage. It uses
redundant arrays of workstation disks over the LAN and I/O backplane.
Parallel I/O is also possible with appropriate middleware.
4. Multiple communication paths are possible, using several networks for
parallel data transfer between nodes.
Definition and Architecture of Cluster:
 A Cluster is a specific category of parallel distributed processing system
consisting of a collection of interconnected standalone computers
working together as a single integrated computing resource.
 Each of these stand alone computers can be viewed as a node.
 A node, in general, can be a single processor or multiple processor
system, PC, workstation or SMP with Memory, I/O facilities and an
Operating System.
 A cluster will have 2 or more nodes together.
 These nodes can be in a single cabinet or may be located as standalone
nodes in a LAN.
 The architecture of a cluster includes a cluster middleware, the essential
component of the LAN or cluster cabinet, which provides a single system
view of the cluster to the user.
[Figure: Basic Architecture of a Cluster: sequential and parallel applications run on top of the cluster middleware, which sits above the interconnection network/switch connecting PC-1 through PC-n.]
Architecture of a Cluster:
 In the cluster architecture diagram, a cluster of PCs/workstations is
shown connected by a network/switch, above which a middleware layer
operates over the cluster to provide the illusion of a single system to the
outside world: the users who run applications, which may be sequential as
well as parallel.
 Parallel programming application development environments and tools,
such as parallel compilers, are available.
 The cluster middleware aims to provide a Single System Image (SSI) and
System Availability Infrastructure (SAI).
 The cluster middleware, which aims to provide a Single System Image (SSI)
and System Availability Infrastructure (SAI), comprises:
1. Hardware, such as a memory channel.
2. An operating system kernel or gluing layer, such as Solaris MC or
GLUnix.
3. Application subsystems, such as system management tools and
electronic forms, and runtime systems such as parallel file systems.
4. Resource management and scheduling software, such as LSF (Load
Sharing Facility) and CODINE (Computing in Distributed Network
Environment).
 The network interface hardware acts as a communication processor.
 This interface takes care of transmitting and receiving packets of data
between cluster nodes via a network or switch.
 Fast and reliable data communication among the cluster nodes and to
the outside world is achieved by the communication software.
 The nodes of the cluster work collectively as an integrated computing
resource or they can operate as individual computers.
Functionality of a Cluster:
 A cluster can offer high performance, high throughput and high availability.
 A cluster can be expanded, so it is scalable.
 Cluster computing enables an organization to expand its processing
power using standard existing components, i.e. PCs and workstations,
which are off-the-shelf commodity hardware and software available at low
cost.
 The organization need not procure proprietary hardware with high
performance/supercomputing capabilities.
 The organization can conserve and preserve its existing hardware stock
and assemble it into a cluster for high performance computing.
 The user organization can leverage its existing hardware to build high
performance/supercomputing clusters from its own resources – a 10- to
100-fold cost reduction compared with high performance mainframes and
supercomputers from vendors such as Cray, IBM, SGI and NEC.
Categories of Cluster:
Clusters can be classified on the basis of various criteria, such as:
1. Functionality
2. Performance
3. Availability
4. Node ownership
5. Type of node hardware
6. Node operating system
7. Node configuration
8. Level or layering of clustering
 In terms of performance, a cluster can be classified as either a high
performance cluster or a high availability cluster.
 In terms of node ownership, a cluster can be classified as a dedicated cluster or a
non-dedicated (shared) cluster.
 In a dedicated cluster, all the nodes (PCs/workstations) are dedicated
fully to the cluster, with no independent usage.
 In a non-dedicated or shared cluster, the nodes are used
independently by their respective end users, while idle CPU cycles
are simultaneously stolen for cluster work.
 Classification based on operating system depends on whether the nodes
run Linux, Windows, Solaris, etc.
 Classification based on node configuration depends on whether the cluster
is homogeneous or heterogeneous.
 In a homogeneous cluster, all the nodes have a similar architecture and run
the same operating system.
 In a heterogeneous cluster, the nodes have different architectures and run
different operating systems.
 Classification based on levels of clustering depends on the location of the
nodes and their number.
 In group clusters (2 to 99 nodes), the nodes are connected by a SAN
(System Area Network).
 Departmental clusters have nodes in the tens or hundreds.
 National metacomputers are WAN/Internet based and may have several
thousand nodes (which may themselves be clusters).
 International metacomputers are Internet based and may have tens or
hundreds of thousands, or even millions, of nodes.
 Individual clusters can be connected to form larger systems, or clusters of
clusters.
 The Internet itself can be viewed and utilized as a cluster.
Cluster Middleware:
 The essence of a cluster is to provide a Single System Image (SSI) for the
entire cluster of nodes, which may be individual PCs or workstations.
 The SSI is the most essential aspect of a cluster of PCs/workstations.
 The SSI has to be provided by a middleware, above the operating system layer
and below the user interface layer.
 A cluster middleware consists of two sub-layers of software infrastructure:
1. SSI infrastructure 2. System Availability Infrastructure
Cluster Middleware Architecture:
[Figure: Architecture of a Cluster Middleware – the SSI middleware sits below the user interface and above the operating system, which runs over an interconnection switch linking the nodes PC-1, PC-2, …, PC-n.]
Cluster Middleware:
 The SSI infrastructure glues together the operating systems of all the nodes in the
cluster so as to provide a single system image.
 The System Availability Infrastructure aims at providing a fault-tolerant
computing environment across all the nodes of a cluster, with
failover facilities, recovery from failures, checkpointing and
other facilities.
 Applications can be system management tools, runtime systems such as
parallel file systems, and middleware modules such as resource
management and scheduling software.
Levels and Layers of Single System Image [Parts of SSI]:
1. HARDWARE Layer:
 SSI at the hardware level allows a user to view the cluster as a shared-memory system.
 A memory channel in a dedicated cluster environment provides virtual shared memory
among nodes by means of internodal address space mapping.
2. OPERATING SYSTEM Kernel or ‘gluing layer’:
 A cluster operating system has to support the execution of both parallel applications
and sequential applications.
 The goal is to pool the resources of the cluster for both parallel and sequential applications.
 For this purpose, ‘gang scheduling’ of parallel programs has to be done by the OS.
 The OS has to identify idle resources such as processors, memory and network interfaces in the
cluster system and offer globalized access to all of them.
 It should achieve load balancing by supporting optimal process migration.
 It should also provide fast inter-process communication for applications.
 Operating systems that support SSI include SCO UnixWare and Sun Solaris MC.
 A full-scale cluster-wide SSI enables all physical resources and kernel resources to be
accessible from, and usable by, all nodes within the cluster system.
 Each node should see a uniform single system image.
 A full SSI at the kernel level can be very economical, since applications at each node will
run directly without any modification or code rewriting.
3. Application Layer and middle/sub-system Layer:
 Multiple cooperating components can be presented to the user
or administrator as a single application, i.e. a Single System Image.
 A Single System Image can be provided at all levels.
 A cluster administration tool offers a single point of management and control of SSI
services.
 Cluster sub-systems offer a single means to create easy-to-use GUI tools.
 A file-system SSI ensures that every node in the cluster has the same view of the data.
 Global scheduling of jobs is done by the scheduling modules in an efficient manner,
offering high availability.
 The benefits of SSI are well identifiable.
 From any node, a single image of all the resources is offered, without any need for the end
users, or the operators, to know where a particular application will run or where a
particular resource exists.
 SSI enables the system administrator to control the entire cluster as one system, with a single
familiar interface and command set, with central system management capability.
 It improves system reliability and operational manageability.
Resource Management and Scheduling [RMS]:
 RMS is the act of distributing the application load among the individual computer systems
in a cluster so as to maximize throughput and resource utilization.
 RMS is a software function and has two parts, namely the resource manager and the resource
scheduler.
 The resource manager is responsible for all aspects of locating and allocating computational
resources and authentication, and also for issues relating to process creation and migration.
 The resource scheduler is responsible for scheduling and queuing.
 RMS is based on a client-server architecture.
 RMS functionality is based on server daemons.
 Users can interact with RMS through a client program or a web browser.
 Applications can run in online or batch mode.
 A batch job is submitted along with its details, such as the location of the executable
module and input datasets, the location where the output should be placed, the system type,
the maximum length of the run, and the nature of the resources required – sequential or parallel.
 RMS will then execute the submitted job based on the details supplied.
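A batch submission of this kind can be sketched as a job description plus a validation step. This is only an illustration: the field names below are hypothetical, not those of any particular RMS product.

```python
# Hypothetical sketch of a batch job description submitted to an RMS.
# All field names here are illustrative, not from any specific product.

def submit_batch_job(job, queue):
    """Validate a job description and append it to a submission queue."""
    required = {"executable", "input_data", "output_location",
                "system_type", "max_runtime_hours", "mode"}
    missing = required - job.keys()
    if missing:
        raise ValueError(f"incomplete job description, missing: {sorted(missing)}")
    if job["mode"] not in ("sequential", "parallel"):
        raise ValueError("mode must be 'sequential' or 'parallel'")
    queue.append(job)
    return len(queue) - 1          # job id = position in the queue

queue = []
job_id = submit_batch_job({
    "executable": "/apps/simulate",         # location of the executable module
    "input_data": "/data/run42/input.dat",  # input dataset
    "output_location": "/data/run42/out/",  # where the output should be placed
    "system_type": "linux-x86_64",          # system type
    "max_runtime_hours": 12,                # maximum length of the run
    "mode": "parallel",                     # sequential or parallel resources
}, queue)
```

Once validated and queued, the RMS daemons would pick the job up and execute it according to these details.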
 RMS also has to enable heterogeneous environments of PCs/workstations, SMPs
and other dedicated parallel platforms to be easily and efficiently utilized, with the
following services:
1. Load balancing: distributing the application load across all the nodes in the cluster to
achieve efficient and effective use of all the resources; this may involve process migration
from one node to another.
2. Process migration: involving termination/succession, and moving and restarting processes on
another node.
3. Resource scheduling: setting up job queues to easily manage the resources of a
given organization. Each queue can be configured with certain attributes, such as the
priority of short jobs against the priority of long jobs, or based on specialized
resources such as parallel computing platforms.
4. Checkpointing: required for managing abrupt abortions or job terminations, so that
restart/recovery can be easily achieved, thus improving system availability and
reliability.
5. Better CPU utilization: achieved by RMS by engaging idle CPU cycles with jobs; when
CPUs are idling, jobs can be assigned to them by RMS.
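Services 3 and 5 above can be sketched with a toy priority-queue scheduler. This is a minimal illustration only; real RMS packages such as LSF or CODINE are far more elaborate.

```python
import heapq

# Toy sketch: jobs wait in a priority queue, and idle nodes are assigned
# the highest-priority waiting job (lower number = higher priority).

class Scheduler:
    def __init__(self, nodes):
        self.idle_nodes = list(nodes)
        self.queue = []            # entries: (priority, seq, job)
        self.running = {}          # node -> job
        self._seq = 0              # tie-breaker preserving submission order

    def submit(self, job, priority):
        heapq.heappush(self.queue, (priority, self._seq, job))
        self._seq += 1
        self.dispatch()

    def dispatch(self):
        # Engage idle CPUs with queued jobs (service 5 above).
        while self.idle_nodes and self.queue:
            _, _, job = heapq.heappop(self.queue)
            node = self.idle_nodes.pop()
            self.running[node] = job

sched = Scheduler(["pc-1"])
sched.submit("long-job", priority=5)    # takes the only node
sched.submit("short-job", priority=1)   # higher priority, but must wait
```

With a single node, the first job occupies it and the higher-priority job waits at the head of the queue until the node is released.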
Cluster Programming Environment and Tools:
Special parallel programming environments and the corresponding tools are essential for
cluster computing.
They can be classified as:
1. Threads
2. Message passing systems
3. Distributed Shared Memory (DSM) systems
4. Parallel debuggers and profilers
5. Performance analysis tools
6. Cluster administration tools
1. Threads
 Threads are a paradigm for concurrent programming on single or multiple processor
systems.
 On a conventional uniprocessor, multiprocessing is achieved through time sharing.
 This can result in effective, optimal utilization of resources.
 In multiprocessor environments, threads are primarily used to utilize all the available
processors.
 Multithreaded applications offer quicker response to the user and run faster.
 Thread creation is cheaper and easier than handling forked processes.
 Threads communicate using shared variables, as they are created within their parent
process's address space.
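The last point can be illustrated with a short sketch: threads created within one process share that process's address space, so they communicate through a shared variable, guarded by a lock.

```python
import threading

# Sketch: four threads update the same shared variable; a lock
# serializes the updates so no increments are lost.

counter = 0
lock = threading.Lock()

def worker(n):
    global counter
    for _ in range(n):
        with lock:                 # protect the shared variable
            counter += 1

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 4000 — all threads updated the same shared variable
```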
2. Message Passing Systems (MPI and PVM)
1. Libraries:
 To write efficient parallel programs for distributed-memory systems, message passing
libraries are required.
 These libraries provide routines to initiate and configure the messaging environment, and
also for sending and receiving packets of data.
 Such systems for message passing are PVM (Parallel Virtual Machine) and MPI (Message
Passing Interface).
 PVM is an environment and also a library for message passing, and can be used for
running parallel applications on systems ranging from supercomputers to clusters of
workstations.
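The send/receive model these libraries provide can be sketched in plain Python. This is only an illustration of the message-passing style, modeled with threads and queues; it is not the actual PVM or MPI API.

```python
import queue
import threading

# Two "tasks" exchange messages over explicit channels, in the style of
# a message-passing library's blocking send and receive primitives.

def send(channel, msg):
    channel.put(msg)

def recv(channel):
    return channel.get()       # blocks until a message arrives

to_worker = queue.Queue()
to_master = queue.Queue()

def worker():
    data = recv(to_worker)     # blocking receive
    send(to_master, sum(data)) # send the partial result back

t = threading.Thread(target=worker)
t.start()
send(to_worker, [1, 2, 3, 4])  # distribute the work
result = recv(to_master)       # collect the result
t.join()
print(result)  # 10
```

In PVM or MPI the channels would connect processes on different cluster nodes, but the send/receive structure of the program is the same.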
2. MPI:
 MPI is a standard message passing specification, produced by the MPI Forum.
 The MPI standard, which aims at portability and efficiency, defines a message passing
library.
 MPI and PVM libraries are available for ANSI C, C++, Fortran 77 & 90, and Java.
3. P-CORBA:
 P-CORBA is a model for parallel programming over CORBA.
 It incorporates the notion of concurrency into conventional CORBA.
 The model provides a way to balance the load on a CORBA-based distributed system.
 It also provides a new approach for achieving object migration in CORBA.
3. Distributed Shared Memory (DSM)
 DSM enables programming on a shared-variable basis and can be implemented using
a software or a hardware solution approach.
 Software DSMs are built as separate layers on top of the communication interface.
 A software DSM can be implemented by compile-time methods only, by run-time methods,
or by combining both.
 The characteristics of hardware DSMs are better speed than software DSMs and no
burden on the application and software layers.
 Examples of hardware DSMs: DASH and Merlin.
 Example of a software DSM: Linda.
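The shared-variable illusion a software DSM maintains can be sketched with a toy invalidation protocol: each "node" caches a copy of a shared page, a write invalidates the other copies, and a stale reader refetches. This only illustrates the idea; real software DSMs are far more involved.

```python
# Toy sketch of a software DSM page with a write-invalidate protocol.
# "Nodes" are just dictionary keys here; in a real DSM they would be
# cluster nodes exchanging pages over the network.

class DSMPage:
    def __init__(self, nodes):
        self.copies = {n: 0 for n in nodes}   # node -> locally cached value
        self.valid = {n: True for n in nodes}

    def write(self, node, value):
        # Writer updates its copy and invalidates every other cached copy.
        self.copies[node] = value
        for n in self.valid:
            self.valid[n] = (n == node)

    def read(self, node):
        if not self.valid[node]:
            owner = next(n for n, ok in self.valid.items() if ok)
            self.copies[node] = self.copies[owner]   # fetch a fresh copy
            self.valid[node] = True
        return self.copies[node]

page = DSMPage(["pc-1", "pc-2"])
page.write("pc-1", 42)
print(page.read("pc-2"))  # 42 — pc-2 refetches after the invalidation
```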
4. Parallel Debuggers
 The High Performance Debugging Forum (HPDF) (under the Parallel Tools Consortium
project) has developed the HPD version specification, defining the functionality, semantics and
syntax for a command-line parallel debugger.
A parallel debugger should be able to:
 manage multiple processes and the threads within a process;
 display each process in its own window, and set source-level and machine-level breakpoints;
 share breakpoints between groups of processes;
 define and watch evaluation points, display arrays, and manipulate code variables and
their contents.
 TotalView is a commercial parallel debugger with a GUI.
 It supports parallel programming in C, C++, Fortran 77/90 and HPF, with MPI & PVM.
 It runs on operating systems such as Sun Solaris, UNIX and IRIX.
5. Performance Analysis Tools
A performance analysis toolkit consists of:
 components for inserting instrumentation calls to the performance monitoring routines
into the user's application;
 a runtime performance library of monitoring routines which measure and record
various aspects of a program's performance;
 a set of tools for processing and displaying the performance data.
Commonly used tools are:
 Pablo – monitoring library and analysis;
 AIMS – instrumentation, monitoring library and analysis;
 MPE – logging library and snapshot performance visualization.
6. Cluster Administration Tools
 It is essential to be able to monitor a cluster in its entirety through a GUI.
 A good administrative tool contributes to good performance of the cluster.
 Examples are Berkeley NOW (Network of Workstations) and SMILE (Scalable Multicomputer
Implementation using Low-cost Equipment).
 Berkeley NOW gathers and stores data in a relational database.
 It uses a Java applet to allow users to monitor the system from a browser.
 SMILE has a monitoring tool called K-CAP.
 K-CAP uses a Java applet to connect to the management node through a predefined URL
in the cluster.
 PARMON is a comprehensive environment for monitoring large clusters.
High Throughput Computing Cluster [HTC]:
HTC is an environment which provides large amounts of processing capacity to consumers
over long periods of time, by fully harnessing the existing computational resources
available on the cluster network.
For achieving high throughput computing over long periods of time, the clustering
environment should be robust, reliable and maintainable – surviving software and
hardware failures through appropriate failover and failback mechanisms, allowing resources to
join and leave the cluster easily, and enabling system upgrades and reconfiguration
without any significant downtime, with a very high MTBF (Mean Time Between Failures).
HTC Cluster System-Condor:
 In Condor, each user/customer is represented by a ‘customer agent’, whose job is to
manage a queue of application descriptions and send requests for resources to the
‘matchmaker’.
 The resource (or donor) is represented by a ‘resource agent’, whose job is to implement
the policies of the resource owner, i.e. the donor, and to send resource offers to
the ‘matchmaker’.
 The job of the matchmaker is to find a match between the resources offered by donors on the
one hand and the requests for resources by customers or users on the other.
 On finding a match, the matchmaker notifies both agents that a match has been
found.
[Figure: Condor matchmaker – customer agents send requests, and the resource (donor) agent sends offers, to the matchmaker; once matched, the two agents run a claiming protocol directly with each other.]
HTC Cluster System-Condor:
 On notification, the customer agent and the resource agent perform a claiming protocol.
 Requests and offers may also contain constraints which need to be satisfied (e.g. a specific
operating system).
 The matchmaker may have its own constraints.
 The matchmaker may impose a specific priority policy on the customers, depending on
their importance, urgency, and the nature and duration of the job, etc.
 The matchmaker implements a customer priority mechanism by matching
resource requests in a specific priority order.
 Customers also have the choice to break the allocation at any given time, depending
on their own priorities.
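The matchmaking described above – requests and offers carrying constraints, customers served in priority order – can be sketched as follows. The attribute names are illustrative only, not actual Condor ClassAd syntax.

```python
# Toy sketch of Condor-style matchmaking. Requests and offers are plain
# dictionaries; constraints on each side must be satisfied by the other.

def satisfies(constraints, attrs):
    return all(attrs.get(k) == v for k, v in constraints.items())

def matchmake(requests, offers):
    matches = []
    # Serve higher-priority customers first (lower number = higher priority).
    for req in sorted(requests, key=lambda r: r["priority"]):
        for offer in offers:
            if offer.get("claimed"):
                continue            # this resource is already matched
            if satisfies(req["constraints"], offer) and \
               satisfies(offer.get("constraints", {}), req):
                offer["claimed"] = True
                matches.append((req["customer"], offer["donor"]))
                break
    return matches

requests = [
    {"customer": "A", "priority": 2, "constraints": {"os": "linux"}},
    {"customer": "B", "priority": 1, "constraints": {"os": "linux"}},
]
offers = [{"donor": "pc-7", "os": "linux"}]

result = matchmake(requests, offers)
print(result)  # [('B', 'pc-7')] — B outranks A for the single resource
```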
[Figure: Layered architecture of Condor – the system agent is built on the process management, network and workstation statistics APIs, which sit on the system call interface above the operating system layer.]
HTC Cluster System-Condor:
 Any user or customer of Condor has to face and overcome several challenges,
such as resources from heterogeneous environments, network protocols, and remote file
access, i.e. the ability of the application to access its data files from any workstation in the
cluster.
 To achieve the objectives of heterogeneous resource utilization and network protocol
interfacing, a layered architecture is adopted in Condor.
 The HTC system depends on the host operating systems of the nodes to provide network
communication, process management and workstation statistics.
 As the interfaces for these functions vary from one operating system to another, the layered
architecture helps to overcome the difficulties involved.
 HTC system management deals with APIs, which are OS dependent.
 APIs for process management, network interfacing and workstation statistics are provided
over lower-level system call interfaces to the various OS environments.
 The network API provides connection-oriented or connectionless, and reliable or unreliable,
interfaces.
 This API performs all conversion between host and network integer byte orders
automatically, checks for overflows, and also provides standard host address lookup
mechanisms.
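The byte-order conversion such an API hides can be shown in a few lines: network order is big-endian, so a 32-bit value is packed most-significant byte first regardless of the host CPU's native order.

```python
import socket
import struct

# Host <-> network byte-order conversion for a 32-bit integer.
value = 0x12345678
net = socket.htonl(value)            # host -> network order
assert socket.ntohl(net) == value    # round-trips back to host order

packed = struct.pack("!I", value)    # '!' = network (big-endian) order
print(packed.hex())  # '12345678' — most-significant byte first
```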
 The process management API provides the ability to create, suspend, continue and kill a
process, enabling the HTC system to control the execution of a customer's application.
 When a cluster process is created, the parent process may pass to its child processes all
aspects of its state, such as network connections, open files, environment variables and
security attributes.
 The workstation statistics API ensures that both the customer application's requests and the
information necessary to implement resource owner policies are well met.
 Examples of such matching are the OS and CPU configurations, and the disk space requests
of the customer being satisfied subject to the owner's constraints on specific times or dates
of availability of the resource.
How To Set Up a Simple Cluster:
 A simple cluster can be set up by the individual user organization.
 The objectives and expected performance characteristics of the cluster should first be well
understood and defined.
 Its expected computational speed and latency should be decided in advance.
 Once these decisions are taken, the network topology and technology suitable for meeting
the proposed cluster's communication requirements should be identified and designed.
 A technology option such as Gigabit Ethernet, optical Fibre Channel or ATM is then chosen.
 Some of the technologies compulsorily require a specific topology, such as switched
point-to-point or ring.
 For Gigabit Ethernet, we can choose between direct links, hubs, switches and
mixtures of these.
 Hubs are to be avoided because of their latencies.
 Segment switching is better than full port switching.
 Direct point-to-point connections with crossed cabling (MDI port to MDI port), forming a
hypercube or a torus mesh, are preferred.
 Using dynamic routing protocols inside the cluster to set up routes automatically introduces
more traffic and congestion complexity.
 Some operating systems provide support for bonding several physical interfaces
into a single virtual interface for higher throughput.
 
15CS81- IoT- VTU- module 3
15CS81- IoT- VTU- module 315CS81- IoT- VTU- module 3
15CS81- IoT- VTU- module 3Syed Mustafa
 
15CS81- IoT Module-2
15CS81- IoT Module-215CS81- IoT Module-2
15CS81- IoT Module-2Syed Mustafa
 
Internet of Things - module 1
Internet of Things -  module 1Internet of Things -  module 1
Internet of Things - module 1Syed Mustafa
 
15CS664- Python Application Programming- Question bank 1
15CS664- Python Application Programming- Question bank 115CS664- Python Application Programming- Question bank 1
15CS664- Python Application Programming- Question bank 1Syed Mustafa
 
15CS664- Python Application Programming VTU syllabus
15CS664- Python Application Programming  VTU syllabus15CS664- Python Application Programming  VTU syllabus
15CS664- Python Application Programming VTU syllabusSyed Mustafa
 
15CS664- Python Application Programming - Module 3
15CS664- Python Application Programming - Module 315CS664- Python Application Programming - Module 3
15CS664- Python Application Programming - Module 3Syed Mustafa
 
15CS664-Python Application Programming - Module 3 and 4
15CS664-Python Application Programming - Module 3 and 415CS664-Python Application Programming - Module 3 and 4
15CS664-Python Application Programming - Module 3 and 4Syed Mustafa
 
15CS664 Python Question Bank-3
15CS664 Python Question Bank-315CS664 Python Question Bank-3
15CS664 Python Question Bank-3Syed Mustafa
 
Data structures lab manual
Data structures lab manualData structures lab manual
Data structures lab manualSyed Mustafa
 
Unix system programming
Unix system programmingUnix system programming
Unix system programmingSyed Mustafa
 
answer-model-qp-15-pcd13pcd
answer-model-qp-15-pcd13pcdanswer-model-qp-15-pcd13pcd
answer-model-qp-15-pcd13pcdSyed Mustafa
 
VTU PCD Model Question Paper - Programming in C
VTU PCD Model Question Paper - Programming in CVTU PCD Model Question Paper - Programming in C
VTU PCD Model Question Paper - Programming in CSyed Mustafa
 
Pointers and call by value, reference, address in C
Pointers and call by value, reference, address in CPointers and call by value, reference, address in C
Pointers and call by value, reference, address in CSyed Mustafa
 
Data structures lab c programs
Data structures lab  c programsData structures lab  c programs
Data structures lab c programsSyed Mustafa
 
Infix prefix postfix expression -conversion
Infix  prefix postfix expression -conversionInfix  prefix postfix expression -conversion
Infix prefix postfix expression -conversionSyed Mustafa
 

Mehr von Syed Mustafa (20)

BPOPS203 PRINCIPLES OF PROGRAMMING USING C LAB Manual.pdf
BPOPS203 PRINCIPLES OF PROGRAMMING USING C LAB Manual.pdfBPOPS203 PRINCIPLES OF PROGRAMMING USING C LAB Manual.pdf
BPOPS203 PRINCIPLES OF PROGRAMMING USING C LAB Manual.pdf
 
18CSL58 DBMS LAB Manual.pdf
18CSL58 DBMS LAB Manual.pdf18CSL58 DBMS LAB Manual.pdf
18CSL58 DBMS LAB Manual.pdf
 
Syed IoT - module 5
Syed  IoT - module 5Syed  IoT - module 5
Syed IoT - module 5
 
IoT - module 4
IoT - module 4IoT - module 4
IoT - module 4
 
15CS81- IoT- VTU- module 3
15CS81- IoT- VTU- module 315CS81- IoT- VTU- module 3
15CS81- IoT- VTU- module 3
 
15CS81- IoT Module-2
15CS81- IoT Module-215CS81- IoT Module-2
15CS81- IoT Module-2
 
Internet of Things - module 1
Internet of Things -  module 1Internet of Things -  module 1
Internet of Things - module 1
 
15CS664- Python Application Programming- Question bank 1
15CS664- Python Application Programming- Question bank 115CS664- Python Application Programming- Question bank 1
15CS664- Python Application Programming- Question bank 1
 
15CS664- Python Application Programming VTU syllabus
15CS664- Python Application Programming  VTU syllabus15CS664- Python Application Programming  VTU syllabus
15CS664- Python Application Programming VTU syllabus
 
15CS664- Python Application Programming - Module 3
15CS664- Python Application Programming - Module 315CS664- Python Application Programming - Module 3
15CS664- Python Application Programming - Module 3
 
15CS664-Python Application Programming - Module 3 and 4
15CS664-Python Application Programming - Module 3 and 415CS664-Python Application Programming - Module 3 and 4
15CS664-Python Application Programming - Module 3 and 4
 
15CS664 Python Question Bank-3
15CS664 Python Question Bank-315CS664 Python Question Bank-3
15CS664 Python Question Bank-3
 
Data structures lab manual
Data structures lab manualData structures lab manual
Data structures lab manual
 
Usp notes unit6-8
Usp notes unit6-8Usp notes unit6-8
Usp notes unit6-8
 
Unix system programming
Unix system programmingUnix system programming
Unix system programming
 
answer-model-qp-15-pcd13pcd
answer-model-qp-15-pcd13pcdanswer-model-qp-15-pcd13pcd
answer-model-qp-15-pcd13pcd
 
VTU PCD Model Question Paper - Programming in C
VTU PCD Model Question Paper - Programming in CVTU PCD Model Question Paper - Programming in C
VTU PCD Model Question Paper - Programming in C
 
Pointers and call by value, reference, address in C
Pointers and call by value, reference, address in CPointers and call by value, reference, address in C
Pointers and call by value, reference, address in C
 
Data structures lab c programs
Data structures lab  c programsData structures lab  c programs
Data structures lab c programs
 
Infix prefix postfix expression -conversion
Infix  prefix postfix expression -conversionInfix  prefix postfix expression -conversion
Infix prefix postfix expression -conversion
 

Kürzlich hochgeladen

Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleAlluxio, Inc.
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)dollysharma2066
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvLewisJB
 
US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionMebane Rash
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptSAURABHKUMAR892774
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girlsssuser7cb4ff
 
welding defects observed during the welding
welding defects observed during the weldingwelding defects observed during the welding
welding defects observed during the weldingMuhammadUzairLiaqat
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catcherssdickerson1
 
Transport layer issues and challenges - Guide
Transport layer issues and challenges - GuideTransport layer issues and challenges - Guide
Transport layer issues and challenges - GuideGOPINATHS437943
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...asadnawaz62
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
lifi-technology with integration of IOT.pptx
lifi-technology with integration of IOT.pptxlifi-technology with integration of IOT.pptx
lifi-technology with integration of IOT.pptxsomshekarkn64
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Earthing details of Electrical Substation
Earthing details of Electrical SubstationEarthing details of Electrical Substation
Earthing details of Electrical Substationstephanwindworld
 
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfgUnit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfgsaravananr517913
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 

Kürzlich hochgeladen (20)

young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at Scale
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvv
 
US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of Action
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.ppt
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
 
welding defects observed during the welding
welding defects observed during the weldingwelding defects observed during the welding
welding defects observed during the welding
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
 
Transport layer issues and challenges - Guide
Transport layer issues and challenges - GuideTransport layer issues and challenges - Guide
Transport layer issues and challenges - Guide
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
lifi-technology with integration of IOT.pptx
lifi-technology with integration of IOT.pptxlifi-technology with integration of IOT.pptx
lifi-technology with integration of IOT.pptx
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Earthing details of Electrical Substation
Earthing details of Electrical SubstationEarthing details of Electrical Substation
Earthing details of Electrical Substation
 
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfgUnit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 

Grid computing notes

  • 1. Clouds, Grids, and Clusters
By Prof. SYED MUSTAFA, HKBK College Of Engineering
  • 2. GRID Computing - 1: Introduction
• Grid computing is a distributed computing approach in which the end user is ubiquitously offered any of the services of a "grid", a network of computer systems located either in a Local Area Network (LAN) or in a Wide Area Network (WAN) spread over a geographical area.
• It aims at dynamic, or "runtime", selection, allocation and control of computational resources such as processing power, disk storage, specialized software systems or even data, according to the demands of the end users.
  • 3. Clouds, Grids, and Clusters
Text Books:
1. Anthony T. Velte, Toby J. Velte, Robert Elsenpeter: Cloud Computing, A Practical Approach, McGraw-Hill, 2010. Chapters: 1,2,3,4,5,7,8,9,10,12,13.
2. Prabhu: Grid and Cluster Computing, PHI, 2007. Chapters: 1,2,3,4,5,6,7,8,9,10,12,13,14,15,16.
Unit 1: Text 1, Chapters 1,2
Unit 2: Text 1, Chapters 3,4
Unit 3: Text 1, Chapters 5,7,8
Unit 4: Text 1, Chapters 9,10,12,13
Unit 5: Text 2, Chapters 1,2,5,7
Unit 6: Text 2, Chapters 3,4,6
Unit 7: Text 2, Chapters 8,9,10,12
Unit 8: Text 2, Chapters 13,14,15,16
  • 4. Introduction:
• Grid computing also means that end users connect to a grid of computing resources the way end users of a power or water supply grid connect and draw power or water as they need it, at any time and any location, without any knowledge of details such as the exact location, nature or quantity of the resource being drawn.
• IBM:
▫ "A grid is a collection of distributed computing resources over a local or wide area network that appears to an end user or application as one large virtual computing system."
  • 5. Introduction:
• Vision: create virtual dynamic organizations through secure, coordinated resource sharing among individuals and institutions.
• Grid computing: an approach that spans locations, organizations, machine architectures and software boundaries to provide unlimited power, collaboration and information access to everyone connected to the grid.
• The Internet is about getting computers connected; grid computing is about getting computers to work together.
• Grid computing combines the QoS of enterprise computing with the ability to share heterogeneous distributed resources, everything from applications to data storage and servers.
  • 6. Introduction:
• Grid computing relies on middleware software that manages and executes all the activities related to:
▫ Identification of resources
▫ Allocation and deallocation of resources
▫ Consolidation of resources
• Organizations having under-utilized or over-utilized resources need a dynamically equitable distribution of resources.
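The three middleware responsibilities listed above (identification, allocation/deallocation, consolidation) can be sketched as a toy resource pool. The class and method names below are illustrative inventions for teaching, not the API of any real grid middleware.

```python
# A toy sketch of grid-middleware resource management: identification,
# allocation, deallocation and consolidation of CPU resources.
# All names here are invented for illustration, not a real grid API.

class ResourcePool:
    def __init__(self):
        self.nodes = {}          # node name -> free CPU count

    def register(self, node, cpus):
        """Identification: a node advertises its capacity to the grid."""
        self.nodes[node] = cpus

    def allocate(self, cpus_needed):
        """Allocation: pick any node with enough free CPUs."""
        for node, free in self.nodes.items():
            if free >= cpus_needed:
                self.nodes[node] = free - cpus_needed
                return node
        return None              # no single node can satisfy the request

    def release(self, node, cpus):
        """Deallocation: return capacity to the pool."""
        self.nodes[node] += cpus

    def total_free(self):
        """Consolidation: present the pool as one aggregate resource."""
        return sum(self.nodes.values())

pool = ResourcePool()
pool.register("lab-node-1", 8)
pool.register("lab-node-2", 4)
node = pool.allocate(6)          # fits on lab-node-1, leaving 2 free there
```

Note how `total_free()` gives the "one large virtual computing system" view of IBM's definition, even though capacity physically lives on separate nodes.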
  • 7. Introduction:
• The 'grid' refers to an infrastructure that enables the integrated, collaborative use of high-end computers, networks, databases and other scientific resources, including instruments, owned and managed by various organizations.
• "A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities."
  • 8. Introduction:
• Multiple Virtual Organizations (MVOs): collaborating agencies in a grid.
• The grid is a network of computation, while the Internet is a network of communication.
• The grid computing environment provides the tools and protocols for resource sharing and problem solving in dynamic, multi-institutional virtual organizations or communities of user organizations.
  • 9. Introduction:
• Grid computing comprises a combination of a decentralized architecture for resource management and a layered architecture with a specific hierarchy for the implementation of the various services of the grid.
• A grid computing system can have any configuration, starting with a Local Area Network (LAN), a large Wide Area Network (WAN) at the national scale, or even an international network spanning several countries and continents. It can span a single organization, many organizations, or service providers' space.
• A grid can focus on the pooled assets of one organization or of Multiple Virtual Organizations (MVOs), all of which use common protocols so as to enable the grid to offer services and run applications in a secure and controlled way.
  • 10. 1.1 The Data Centre:
• Before the data centre concept emerged, organizations maintained their own servers and specialized software.
• This approach was expensive and redundant.
• Data centres allowed resources to be shared among organizations.
• Organizations connected to one data centre may not be able to use resources from other data centres.
• The concept of grid computing enables multiple data centres (of the same or different organizations) to be networked and shared.
• A grid is a combination of:
▫ Distributed computing
▫ High-performance computing
▫ Disposable computing
• A grid provides a metacomputing environment, which can be a megacomputing facility for its users.
• A grid provides computational utility to its consumers.
  • 11. 1.2 Cluster Computing and Grid Computing:
• Clusters are aggregations of processors in parallel configurations.
• In a cluster, resource allocation is performed by a centralized resource manager and scheduling system.
• A cluster comprises multiple interconnected, independent nodes that cooperatively work together as a single unified resource.
• A grid, in contrast, has a resource manager for each node.
• A grid does not provide a single system view.
• Some grids are collections of clusters. Example: the NSF TeraGrid.
  • 12. 1.3 Metacomputing
• Metacomputing is all computing and computing-oriented activity which involves computing knowledge (science and technology) utilized for the research, development and application of different types of computing. (Wikipedia)
• Metacomputing is the use of powerful computing resources transparently available to the user via a networked environment.
• Three essential steps to achieve the goals of metacomputing are:
1. To integrate the large number of individual hardware and software resources into a combined networked resource
2. To deploy and implement middleware that provides a transparent view of the resources available
3. To develop and deploy optimal applications on the distributed metacomputing environment to take advantage of the resources
  • 13. 1.3 Metacomputing
• Challenges in metacomputing:
▫ Viability of the linking speeds for realistic application execution
▫ Ability and feasibility to execute the components of an application in parallel
• Resources and the originating points of data are geographically distributed, and the data may need to be processed in a distributed manner.
• Metacomputing is useful when single-point usage of large, remotely located resources is required.
• Metacomputing encompasses two broad categories:
▫ Seamless access to high-performance computing
▫ Linking of computing resources, instruments and other resources
  • 14. 1.3 Metacomputer Composition
• A metacomputer is a virtual computer with a virtual computing architecture.
• Its constituent components (individual servers and other computational resources) are individually not important.
• A metacomputer consists of:
1. Processors and memory
2. Network and communication software
3. Remote data access and retrieval
4. Virtual environment
  • 15. 1.3 Metacomputer Composition
1. Processors and memory
▫ The primary resources of a metacomputer are processors and their associated memory units.
▫ A metacomputer is a single virtual view of several (a large number of) processors and their associated memory units.
2. Network and communication software
▫ An interconnected network of physically distributed processors.
▫ Links between machines could be via modems, ISDN, Ethernet (fast/gigabit), FDDI, ATM (Asynchronous Transfer Mode) or any other networking technology.
▫ High bandwidth and low latency are required to provide rapid and reliable communication.
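Why bandwidth and latency both matter can be seen with a first-order delivery-time model. This is a teaching simplification (it ignores protocol overhead, congestion and retransmission), and the example link figures are assumptions, not from the text.

```python
# First-order estimate of message delivery time on a link:
#   time = latency + size / bandwidth
# A deliberate simplification: no protocol overhead, queueing or loss.

def transfer_time(size_bytes, bandwidth_bits_per_s, latency_s):
    return latency_s + (size_bytes * 8) / bandwidth_bits_per_s

# A 1 MB message over Fast Ethernet (100 Mbit/s) vs Gigabit Ethernet,
# each with an assumed 1 ms one-way latency.
fast_eth = transfer_time(1_000_000, 100e6, 0.001)   # about 0.081 s
gig_eth  = transfer_time(1_000_000, 1e9,  0.001)    # about 0.009 s
```

For large messages the bandwidth term dominates; for many small messages (common in tightly coupled parallel codes) the latency term dominates, which is why both figure in the requirement above.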
  • 16. 1.3 Metacomputer Composition
3. Remote data access and retrieval
▫ In a metacomputer comprising a large number of nodes, the data stored in the secondary storage devices of each node must be accessed remotely and retrieved on demand. Data sizes may go up to petabytes.
▫ Retrieval, replication and mirroring support will be required for the purposes of recovery and business continuity.
▫ The ability to access, manage and manipulate large quantities of remote data is a challenge in metacomputing.
4. Virtual environment
▫ Software, like an operating system, that can configure, manage and maintain the metacomputing environment.
  • 17. 1.3 Evolution of Metacomputing Projects
• Project FAFNER (1995) - Factoring via Network-Enabled Recursion
▫ Finding factors of large numbers in parallel over a large network of mathematicians, who calculated the factors required for prime numbers in the context of encryption for Public Key Infrastructure (PKI).
▫ PKI is for secure communication with digital signatures.
▫ RSA (Rivest, Shamir, Adleman - Massachusetts Institute of Technology, MIT) algorithm: keys were generated by factoring extremely large numbers.
▫ RSA keys are 154-digit (512-bit) keys.
▫ Started by Bellcore Labs and Syracuse University.
▫ FAFNER was initiated to factor RSA-130 using a numerical technique called the Number Field Sieve (NFS) factorization method, using web servers for computation and to distribute the factoring code and related data.
  • 18. 1.3 Evolution of Metacomputing Projects
• FAFNER (1995) - Factoring via Network-Enabled Recursion
▫ A web interface for NFS (Number Field Sieve) was produced.
▫ Web forms with the 'GET' and 'POST' methods were used to invoke server-side CGI scripts written in the Perl language.
▫ Single workstations with as little as 4 MB of memory were allowed to perform useful work by sieving with small boundaries.
▫ A consortium of sites was deployed to run the CGI script package locally.
▫ Monitoring was done hierarchically by the RSA-130 web servers.
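FAFNER's core idea was carving a huge factoring search space into small tasks that volunteer machines work on independently. The real project used the Number Field Sieve; the sketch below only illustrates the work-splitting pattern with naive trial division, and all function names are invented for illustration.

```python
# Illustration of FAFNER-style work splitting (NOT the real NFS method):
# a coordinator carves the divisor search space into disjoint ranges and
# merges whatever each simulated "volunteer" reports back.

def sieve_range(n, start, stop):
    """One volunteer's task: test candidate divisors in [start, stop)."""
    return [d for d in range(start, stop) if d > 1 and n % d == 0]

def coordinate(n, workers=4):
    """The server-side role: hand out chunks, collect partial results."""
    limit = int(n ** 0.5) + 1          # only need divisors up to sqrt(n)
    chunk = max(1, limit // workers)
    found = []
    for start in range(2, limit, chunk):
        # In FAFNER this dispatch happened over HTTP to remote CGI hosts;
        # here each chunk just runs locally in turn.
        found.extend(sieve_range(n, start, start + chunk))
    return sorted(found)

factors = coordinate(2021)             # 2021 = 43 * 47
```

Because the chunks are independent, a small-memory workstation could take one range at a time, which is exactly why FAFNER could use modest volunteer machines.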
  • 19. 1.3 Evolution of Metacomputing Projects
• Project I-WAY (1995) - Information Wide Area Year
▫ An experimental high-performance network linking many servers, which addressed virtualization environments.
▫ The objective was to integrate existing high-bandwidth networks with telephone systems.
▫ The servers, datasets and software environments (for virtualization) located at 17 different US sites were integrated by connecting them with 10 networks of different bandwidths and protocols, using different routing and switching technologies.
▫ To standardize the I-WAY software interface and management, key sites installed Point of Presence (POP) computer systems to serve as their respective gateways to I-WAY.
  • 20. 1.3 Evolution of Metacomputing Projects
• Project I-WAY (1995) - Information Wide Area Year
▫ These I-POP systems were UNIX workstations, configured homogeneously, that contained a software environment called I-Soft, which helped overcome problems and issues related to heterogeneity, scalability, security and performance.
▫ I-POP machines provided uniform authentication, resource reservation and process creation.
▫ I-POP also performed communication functions across the resources in I-WAY.
▫ A resource scheduler called the Computational Resource Broker (CRB) was developed. It consists of User-to-CRB, CRB-to-User and CRB-to-Local-Scheduler protocols.
  • 21. 1.3 Evolution of Metacomputing Projects
• Project I-WAY (1995) - Information Wide Area Year
▫ A central scheduler maintained queues of jobs and tables indicating the state of local machines, allocating jobs to machines.
▫ Multiple local schedulers also operated for local scheduling.
▫ The AFS file system was used for file movement and processing functions.
▫ To support user-level tools, a software layer called 'Nexus' was used to perform automatic configuration.
▫ Supported applications: supercomputing, virtual reality, multi-virtual reality, web video.
▫ Grid computing can be used in metacomputing mode for scientific applications.
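The central-scheduler idea above (a queue of jobs plus a table of machine states) can be sketched in a few lines. This is an invented toy, not the actual I-WAY CRB code; machine and job names are hypothetical.

```python
# Toy version of I-WAY's central scheduling idea: a job queue and a
# machine-state table, with jobs dispatched to whatever is free.
from collections import deque

class CentralScheduler:
    def __init__(self, machines):
        self.state = {m: "free" for m in machines}   # machine -> free/busy
        self.queue = deque()                          # pending job names
        self.placement = {}                           # job -> machine

    def submit(self, job):
        self.queue.append(job)
        self.dispatch()

    def dispatch(self):
        """Allocate queued jobs to free machines, oldest job first."""
        while self.queue:
            free = [m for m, s in self.state.items() if s == "free"]
            if not free:
                break                                 # everything is busy
            job, machine = self.queue.popleft(), free[0]
            self.state[machine] = "busy"
            self.placement[job] = machine

    def finish(self, job):
        """A machine reports completion; retry the queue."""
        self.state[self.placement.pop(job)] = "free"
        self.dispatch()

sched = CentralScheduler(["popA", "popB"])
sched.submit("job1")
sched.submit("job2")
sched.submit("job3")   # stays queued: both machines are busy
```

The local schedulers mentioned in the slide would sit one level below this, managing each site's own machines; the central table only tracks site-level state.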
  • 22. 1.4 Scientific, Business and e-Governance Grids
• The grid computing approach has helped all computing communities: business, scientific research and government applications.
• Scientific grids: users belong only to scientist groups.
• Business grids / e-governance grids: users may belong to any citizen group using business or government services.
• The numbers of users in business and e-governance are high; hence setting up such grids is more complex.
• The Internet is the only communication network available to citizens for accessing a business or an e-governance grid.
• The demands on user interfaces, access speeds and data sizes will be large.
  • 23. 1.5 Web Services and Grid Computing
• Users of business and e-governance grids will need web services over the Internet.
• Users of business grids will not be interested in hardware and software locations.
• They are not interested in resource allocation management either.
• Hence the need for integrating web services with grids.
• The Open Grid Services Architecture (OGSA) becomes essential to offer effective, stateful web services based on Service-Oriented Architecture (SOA) on the grid.
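The distinction behind "stateful web services" can be made concrete with a toy contrast: a stateless service gives the same answer for the same input every time, while an OGSA-style stateful grid service remembers resource state (such as a running job) between calls. The classes below are illustrative only, not OGSA or Globus APIs.

```python
# "Stateless" vs "stateful" service sketch. Class names are invented;
# this is a concept illustration, not an OGSA/WSRF implementation.

class StatelessEcho:
    def call(self, message):
        # No memory of earlier requests: same input, same output.
        return message.upper()

class StatefulJobService:
    def __init__(self):
        self.jobs = {}                  # job id -> status, kept between calls

    def submit(self, job_id):
        self.jobs[job_id] = "running"   # state created by this call...
        return job_id

    def status(self, job_id):
        # ...and consulted by later calls: the hallmark of statefulness.
        return self.jobs.get(job_id, "unknown")

svc = StatefulJobService()
svc.submit("job-42")
```

A plain stateless web service cannot answer "how is my job doing?" without such retained state, which is why OGSA layers stateful services on top of SOA for grid use.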
  • 24. 1.6 Business Computing and the Grid - a Potential Win-Win Situation
• The grid was initially utilized for solving very long computational problems in scientific research, in applications such as weather forecasting models, molecular modelling, bioinformatics and drug design.
• Computational grids have been used for several years to solve large-scale problems in basic sciences and engineering.
• By harnessing the grid approach, businesses can achieve cost reduction and better QoS.
• The grid leverages its extensive information capabilities to support the processing and storage requirements needed to complete a task.
• Hence the grid can provide maximum resource utilization, delivering the fastest and cheapest service and maximum satisfaction.
• 25. 1.6 Business computing and the Grid – a Potential Win-win situation • In a grid environment, heterogeneous computer systems located over a large area or around the globe can be integrated and made to appear as a single computational resource, to be optimally or maximally utilized by the user community without any loss or wastage of time, investment or resources. • The goal of grid computing is to provide the users with a single view (independent of location) and a single mechanism that can be utilized to support any number of computing tasks. • The participating computer systems of a grid could be located in the same room or distributed across the globe. UNIT – 5 GRID Computing – 1: Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 26. 1.6 Business computing and the Grid – a Potential Win-win situation Grid computing for business is based on three factors: 1. The ability of the grid to ensure more cost-effective use of a given amount of computer resources. 2. A methodology to solve any difficult or large problem by using the grid as one 'large computer'. 3. All the computing resources of a grid – CPUs, disk storage systems and software packages – can be cooperatively and synergistically harnessed and managed in collaboration towards a common business objective. UNIT – 5 GRID Computing – 1: Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 27. 1.7 E-Governance and the Grid • In the case of e-governance, the citizen becomes the end user, and citizen services of the government become the most important and highest priority application of the grid. • Citizen services are delivered as e-governance services through the web. • The new technology paradigm is Web services based on SOA. • When e-governance services are delivered as Web services on a service oriented architecture, the grid has to support Web services. • The new grid architecture standard, OGSA (Open Grid Services Architecture), offers stateful web services for e-governance. • Grid tools such as the Globus Toolkit (V4) support the OGSA standard on the grid. • It is thus possible to develop and offer e-governance services to citizens through web services on the grid, on the OGSA standard. UNIT – 5 GRID Computing – 1: Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 28. Key functional requirements in Grid Computing 1. Resource Management 2. Security Management 3. Data Management 4. Services Management UNIT – 5 GRID Computing – 1: Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 29. Key functional requirements in Grid Computing 1. Resource Management The ability to keep track of, allocate and release grid resources. 2. Security Management The ability to ensure authenticated and authorized access to grid resources by users from the external world. 3. Data Management The ability to transport, clean, parcel and process data between any two nodes in the grid, without the knowledge of the user. 4. Services Management The ability of users and applications to query the grid and obtain responses efficiently. UNIT – 5 GRID Computing – 1: Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 30. Six Layered Architecture of Grid Computing: Layer 6 – Application Layer: Scientific/Engineering, Commercial, E-Governance applications; Layer 5 – Middleware Layer: Tools, Languages, Libraries; Layer 4 – Resource Management Layer: Resource access and management, scheduling services; Layer 3 – Job Management Layer: Job scheduling, job management, accounting; Layer 2 – Security Layer: Authentication and authorization; Layer 1 – Infrastructure Layer: Processors, Storage, Software, Data. UNIT – 5 GRID Computing – 1: Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 31. UNIT – 5 GRID Computing – 1: Six Layered Architecture of Grid Computing 1. Infrastructure Layer/Grid Fabric:  All computational resources: processors, disk storage, operating systems and software (both application software and system software).  All these resources are monitored and dynamically allocated and deallocated to the various job requirements from the users.  This function is performed by the middleware software (e.g. the Globus Toolkit). Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 32. UNIT – 5 GRID Computing – 1: Six Layered Architecture of Grid Computing 2. Security Layer:  Performs authentication and authorization of users who intend to access the grid and its various computational resources. 3. Job Management Layer:  Part of the core layer of the grid middleware software.  Designed to hide the complexities of partitioning, distributing and load balancing of jobs.  Performs scheduling of the various jobs. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 33. UNIT – 5 GRID Computing – 1: Six Layered Architecture of Grid Computing 4. Resource Management Layer:  The core layer of grid management.  Resource monitoring, allocation and management of the resources drawn from the lowest layer of the grid architecture. 5. Middleware Layer:  The Globus Toolkit, which comprises tools, languages and the grid programming environment, including compilers and libraries.  It also includes resource brokers, which manage the execution of applications on distributed resources using appropriate scheduling algorithms.  It uses strategies for aggregating and allocating/deallocating computational resources in the grid.  It also includes various related tools and utilities such as a grid file transfer protocol and grid application development tools.
• 34. UNIT – 5 GRID Computing – 1: Six Layered Architecture of Grid Computing 6. Application Layer:  The layer of the actual grid applications themselves: Scientific, Engineering, Commercial or E-Governance applications.  Grid architectures are now being built based on Internet protocols and services for the functionalities of communication, routing, name resolution, etc. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 35. UNIT – 5 GRID Computing – 1: Standards for Grid Computing:  Grid standards are being defined, accepted and implemented by vendors globally.  Standards based grids are termed third generation or 3G grids.  The first generation (1G) grids were just local metacomputers, with basic distributed services such as a distributed file system, single sign-on based distributed applications and custom communication protocols.  The 2G grids came as an improvement, with projects such as Condor, I-WAY and Legion, in all of which the underlying communication protocols and software services could be used for developing distributed applications and services. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 36. UNIT – 5 GRID Computing – 1: Standards for Grid Computing:  2G grids could offer only a very basic core/kernel for application development, which required significant customization effort.  2G grids were not easily interoperable.  The 3G grids came with standards for interoperability.  They implement key services. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 37. UNIT – 5 GRID Computing – 1: Standards for Grid Computing:  OGSA combined the web standards of web services – SOAP, XML, WSDL, UDDI – with the standards of grid computing, realized in the project called the Globus Project (Globus Toolkit).  The Globus Project is a joint effort of the Globus community of open source researchers and software programmers across the globe, with a focus on grid research, software tools, test beds and applications.  OGSA has become the backbone of grid computing.  GT4 (Globus Toolkit 4) supports OGSA and WSRF. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 38. UNIT – 5 GRID Computing – 1: OGSA and WSRF:  OGSA (Open Grid Services Architecture) is a distributed computing interoperability standard, based on the grid service.  It leverages web services, depending on WSDL (Web Service Description Language) interfaces (which describe how to use a service).  The UDDI registry and WSDL documents are used to locate a grid service.  The transport protocol SOAP is used to connect data and applications for accessing a grid service.  OGSA was developed by the Global Grid Forum (GGF) with the objective of defining a common, standard and open architecture for grid based applications. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 39. UNIT – 5 GRID Computing – 1: OGSA and WSRF:  OGSI (Open Grid Services Infrastructure) provides the basic interoperability, in terms of RPC (Remote Procedure Call), for the rich set of high level services and capabilities that are collectively called OGSA.  OGSA defines and describes a web services based architecture, composed of a set of interfaces and their behaviors, to facilitate resource sharing and access in heterogeneous dynamic environments. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 40. UNIT – 5 GRID Computing – 1: OGSA for Resource Distribution:  OGSA is a web services based architecture, comprising a set of interfaces and their behaviors, to facilitate resource sharing and access in heterogeneous dynamic environments, relying upon the definition of a web service in WSDL, which defines the method names, parameters and their types for grid service access.  The main theme of OGSA is that it is a service oriented grid architecture supported by the grid service – a special web service that provides a set of well defined interfaces following specific conventions.  These interfaces of grid services address all matters pertaining to service discovery, dynamic service instance creation, lifetime management, notification and manageability.  All communications between the services are secure.  All services are built on the Globus Toolkit.  OGSA = grid structure + web services + toolkit. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 41. UNIT – 5 GRID Computing – 1: OGSA for Resource Distribution: OGSA and SOA [Figure: a grid client exchanges secure messages with grid servers on Windows, Linux and UNIX platforms through a standard interface with multiple bindings (Java, C#).] Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 42. UNIT – 5 GRID Computing – 1: Stateful Web Services in OGSA:  Web services technology lacked 'state', because web services are stateless.  OGSA introduced 'state' for a web service.  OGSA defines 'stateful' web services using a new framework called the Web Services Resource Framework (WSRF). Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 43. UNIT – 5 GRID Computing – 1: WSRF (Web Services Resource Framework):  The Web Services Resource Framework (WSRF) is defined by OASIS (Organization for the Advancement of Structured Information Standards).  It specifies how we can make a web service stateful, along with a few other useful features.  It is the result of a joint effort of the Web services and grid communities.  It provides the stateful web services that OGSA needs.  OGSA is the architecture that requires stateful web services.  WSRF is the infrastructure on which the OGSA architecture is based and built. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 44. UNIT – 5 GRID Computing – 1: WSRF (Web Services Resource Framework): [Figure: stateful web services architecture, WSRF with OGSA – OGSA requires stateful web services, WSRF specifies them, and they extend the plain web server model.] Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 45. UNIT – 5 GRID Computing – 1: WSRF (Web Services Resource Framework):  Globus Toolkit version 4 implements both OGSA and WSRF.  Plain web services are stateless, i.e. past transaction details are lost.  This suffices when a web service need not remember past transactions.  Stateful web services, however, do remember the data of earlier transactions.  So we have both stateless and stateful services at hand. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
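The stateless/stateful distinction above can be illustrated with a small sketch (hypothetical Python classes for illustration only, not any WSRF or Globus API): a stateless service forgets everything between calls, while a stateful one keeps a resource alive across calls.

```python
# Hypothetical sketch: stateless vs stateful service objects.
class StatelessCounterService:
    """Each invocation is independent; nothing is remembered."""
    def add(self, a, b):
        return a + b  # no state survives the call


class StatefulCounterService:
    """Keeps a 'resource' (the running total) between invocations,
    the way a WS-Resource keeps state between web service calls."""
    def __init__(self):
        self.total = 0  # resource state
    def add(self, value):
        self.total += value
        return self.total


stateless = StatelessCounterService()
stateful = StatefulCounterService()
print(stateless.add(2, 3))  # 5 every time; past calls are forgotten
print(stateful.add(2))      # 2
print(stateful.add(3))      # 5 -- remembers the previous transaction
```

In WSRF terms, the `total` attribute plays the role of a resource property that persists between service invocations.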
• 46. UNIT – 5 GRID Computing – 1: WSRF (Web Services Resource Framework) Specifications:  WSRF is a set of specifications, all used in the management of WS-Resources. 1. WS-ResourceProperties:  A resource may have zero or more resource properties.  A file resource, for example, may have 3 properties [file name/no., size and description].  It specifies clearly how the properties or attributes are defined in WSDL. 2. WS-ResourceLifetime:  Governs the beginning and end of a resource's lifetime.  It supplies the information needed to manage the lifecycle of resources. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 47. UNIT – 5 GRID Computing – 1: WSRF (Web Services Resource Framework) Specifications: 3. WS-ServiceGroup:  Utilized to manage a group of web services or a group of WS-Resources.  Operations such as add/remove/find a service in a group are performed.  It specifies exactly how we should go about grouping services or WS-Resources.  GT features such as the Index Service enable the user to group and index services and their references. 4. WS-BaseFaults:  A mechanism for reporting faults when something goes wrong during a web service invocation. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 48. UNIT – 5 GRID Computing – 1: WSRF (Web Services Resource Framework) Specifications: 5. WS-Notification:  It enables a web service to be configured as a 'notification producer' and certain clients to be 'notification consumers'.  It notifies all subscribers (consumers) whenever a change occurs in a WS-Resource. 6. WS-Addressing:  It provides a way to address a web service/resource pair (WS-Resource). Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
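The WS-Notification pattern described above (a producer notifies subscribed consumers of resource-property changes) can be sketched minimally in plain Python; the class and method names here are illustrative only, not part of the WS-Notification standard.

```python
# Hypothetical sketch of the WS-Notification idea: a producer notifies
# subscribed consumers whenever a resource property changes.
class NotificationProducer:
    def __init__(self):
        self.consumers = []
    def subscribe(self, consumer):
        self.consumers.append(consumer)
    def property_changed(self, name, value):
        # push the change to every subscriber
        for c in self.consumers:
            c.notify(name, value)


class NotificationConsumer:
    def __init__(self):
        self.received = []
    def notify(self, name, value):
        self.received.append((name, value))


producer = NotificationProducer()
consumer = NotificationConsumer()
producer.subscribe(consumer)
producer.property_changed("Size", 2048)  # a WS-Resource property changed
print(consumer.received)                 # [('Size', 2048)]
```

In the real standard the subscription and the notification messages are SOAP exchanges; the control flow is the same publish/subscribe shape shown here.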
• 49. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 1. History and Background:  Before the advent of object oriented programming, all programming was monolithic (unstructured).  Structured programming methodology came after monolithic programming and resulted in a better, modular programming methodology.  But it followed the procedural paradigm – procedures, subroutines and functions with global data – which was too difficult to manage.  So a better and simpler abstraction, the object oriented approach, was invented.  Reusable individual modules or components can reduce the time and effort of software development; this resulted in component based software. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 50. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 1. History and Background:  The nature of software evolved over time – mainframe batch processing was replaced with 'online', 'interactive' software.  Object oriented, component based software was used for online, high transaction rate, user friendly and highly available systems.  It addressed the issues of extensibility, reusability and maintainability.  Component based architecture alone was not adequate; distributed deployment, heterogeneous platforms, interoperability and application integration across diverse protocols and interfaces also had to be achieved. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 51. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 1. History and Background:  Software development became compartmentalized – from mainframe monolithic programming to client-server architecture.  Software could be broken into client side and server side software.  The client – typically a PC with a GUI – connected to a server at the back end with shared resources or a database, tightly coupled.  From single server LANs to distributed systems supporting WANs, and then to interaction among multiple WANs across the globe, the Internet emerged. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 52. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 1. History and Background:  In a large distributed system or network such as the Internet, the client does not know the server and the server does not know the client.  Multi-tier architecture evolved: the front tier – the client user interface – talks to the middle tier.  The middle tier takes the request from the front tier and sends it to the backend server; this middle tier is known as the 'broker'.  This 3-tier model is extended to an 'n'-tier architecture over the Internet. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 53. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 1. History and Background:  To achieve minimal dependence between components, loose coupling is needed.  Loose coupling helps relocate a service to a different server without changing the interface for communication with the server.  A service is discovered along with its capabilities, policies, location and protocols.  Before SOA and web services, technologies such as ORBs (Object Request Brokers) – based on the CORBA (Common Object Request Broker Architecture) specification – and DCOM, COM and OLE were developed. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 54. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 1. History and Background:  Interoperability across languages and vendors was achieved by CORBA through the IIOP (Internet Inter-ORB Protocol).  IIOP was a transport protocol for distributed applications, written using either IDL or RMI.  CORBA has limitations: parameter/return values are limited to the representational capability of the programming language.  Argument values/return values are actual types, not abstracted types. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 55. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 2. Service Oriented Architecture (SOA):  SOA is based on the loosely coupled 'service' as an abstraction.  A service in SOA is an exposed piece of behaviour, loosely coupled so that access to the service is platform independent.  A service may be dynamically discovered and called.  Dynamic discovery is possible through a directory service available at run time.  In principle, a service is an atomic operation encapsulating a particular business process. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 56. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 2. Service Oriented Architecture (SOA):  A service should be self-contained (i.e. stateless).  A service should be publishable, discoverable, communicable, well-defined and independent of platform or other services.  Service providers and consumers communicate via messages.  The service contract clearly specifies the invocation procedure.  Service invocation and responses should be platform independent, i.e. interoperable across heterogeneous platforms, operated through platform independent messages. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 57. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 2. Service Oriented Architecture (SOA):  Web services interact with each other automatically, without any human intervention – not even programming based human intervention.  A web service (A) can interact with another web service (B) in a completely automated manner using UDDI, WSDL, SOAP and XML.  Every service provider should have facilities for service registration, dynamic discovery, encryption handling, security and platform independent interoperability, and should be able to support cluster and grid computing. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 58. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 2. Service Oriented Architecture (SOA): [Figure: the SOA triangle – the service provider registers with the service registry; the service consumer discovers the service there, then binds to the provider and executes the service.] Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 59–62. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 2. Service Oriented Architecture (SOA): [Slides 59–62 contained figures only; no text content was extracted.] Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 63. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 2. Service Oriented Architecture (SOA):  The Service Oriented Architecture (SOA) contains service providers, service consumers and a service registry.  Initially, service providers register themselves with the service registry.  Then, service consumers discover and bind to (engage with) service providers.  With heterogeneous platforms offering web services, data exchange takes place through XML.  Service invocation takes place through SOAP. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
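The register/discover/bind cycle above can be sketched as an in-process analogy (all names hypothetical; a real SOA uses UDDI, WSDL and SOAP rather than direct Python calls):

```python
# Hypothetical in-process sketch of the SOA triangle: providers register
# with a registry, consumers discover a service by name and bind to it.
class ServiceRegistry:
    def __init__(self):
        self._services = {}
    def register(self, name, endpoint):
        self._services[name] = endpoint
    def discover(self, name):
        return self._services.get(name)  # the provider's endpoint, or None


# Provider side: a callable standing in for a deployed web service.
def weather_service(city):
    return f"Forecast for {city}: sunny"


registry = ServiceRegistry()
registry.register("WeatherService", weather_service)  # 1. register

endpoint = registry.discover("WeatherService")        # 2. discover
print(endpoint("Bangalore"))                          # 3. bind and execute
```

The design point is that the consumer never hard-codes the provider: it only knows the registry and a service name, so the provider can be replaced or relocated without changing the consumer.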
• 64. • A Web service is a software system designed to support interoperable system-to-system interaction over a network. • “self-contained, modular business applications that have open, Internet-oriented, standards-based interfaces” - [ UDDI consortium ] • “A software application identified by a URI, whose interfaces and bindings are capable of being defined, described, and discovered as XML artifacts. A Web service supports direct interactions with other software agents using XML-based messages exchanged via Internet-based protocols” - [ W3C ] Web service 64
  • 65. • A Web-Service is an interface that describes a collection of operations that are network accessible through standardized XML messaging - [ Microsoft ] • Web services are a new breed of Web applications. They are self-contained, self-describing, modular applications that can be published, located, and invoked across the Web. Web services perform functions, which can be anything from simple request to complicated business processes… Once a Web service is deployed, other applications (and other Web services) can discover and invoke the deployed service. [IBM] Web service 65
• 66. • Web Services are fully XML based • Expose interfaces – clients access the service functionality through interfaces; communication is between applications as opposed to between users • Self-describing modular units • Accessible over the Web • Interaction via XML based messages over standard Web protocols • Registered and discoverable at the Web service registry • Platform, language and protocol independence • Compositions of Web Services are possible 66 Web Service- General characteristics
• 67. Web Service Platform: • WSDL – Web Services Description Language • SOAP – Simple Object Access Protocol • UDDI – Universal Description, Discovery and Integration • XML – EXtensible Markup Language 67
• 68. 68 XML • It is used to tag the data • Extensible Markup Language, a specification developed by the W3C. • It is used to create customized tags, enabling the definition, transmission, validation and interpretation of data between applications and between organizations. SOAP • A communication protocol used to transfer the data • A lightweight XML-based messaging protocol used to encode the information in Web service request and response messages before sending them over a network. • SOAP messages are independent of any operating system or protocol and may be transported using a variety of Internet protocols, including SMTP, MIME and HTTP.
• 69. 69 WSDL • WSDL is used for describing the services available • Web Services Description Language, an XML-formatted language used to describe a Web service's capabilities as collections of communication endpoints capable of exchanging messages. • WSDL is an integral part of UDDI, an XML-based worldwide business registry; WSDL is the language that UDDI uses. • WSDL was developed jointly by Microsoft and IBM. UDDI • UDDI is a directory service where companies can register and search for Web services. • Universal Description, Discovery and Integration. • It is a Web-based distributed directory that enables businesses to list themselves on the Internet and discover each other
  • 70. Web Service definition revisited • A more precise definition: ▫ an application component that:  Communicates via open protocols (HTTP, SMTP, etc.)  Processes XML messages framed using SOAP  Describes its messages using XML Schema  Provides an endpoint description using WSDL  Can be discovered using UDDI
  • 71. Web Services Components • XML – eXtensible Markup Language – A uniform data representation and exchange mechanism. • SOAP – Simple Object Access Protocol – A standard way for communication. • UDDI – Universal Description, Discovery and Integration specification – A mechanism to register and locate WS based application. • WSDL – Web Services Description Language – A standard meta language to described the services offered.
  • 72. Example – A simple Web Service • A buyer (which might be a simple client) is ordering goods from a seller service. • The buyer finds the seller service by searching the UDDI directory. • The seller service is a Web Service whose interface is defined using Web Services Description Language (WSDL). • The buyer is invoking the order method on the seller service using Simple Object Access Protocol (SOAP) and the WSDL definition for the seller service. • The buyer knows what to expect in the SOAP reply message because this is defined in the WSDL definition for the seller service.
  • 73. The Web Service Model • The Web Services architecture is based upon the interactions between three roles: ▫ Service provider ▫ Service registry ▫ Service requestor • The interactions involve the: ▫ Publish operations ▫ Find operation ▫ Bind operations.
  • 74. The Web Service Model (cont) The Web Services model follows the publish, find, and bind paradigm. 1. publish 2. find 3. bind/invoke Web Service Registry Web Service Provider Web Service Client
• 75. XML • XML stands for EXtensible Markup Language. • XML is a markup language much like HTML. • XML was designed to describe data. • XML tags are not predefined. You must define your own tags. • The perfect choice for enabling cross-platform data communication in Web Services.
  • 76. XML vs HTML An HTML example: <html> <body> <h2>John Doe</h2> <p>2 Backroads Lane<br> New York<br> 045935435<br> john.doe@gmail.com<br> </p> </body> </html>
  • 77. XML vs HTML • This will be displayed as: • HTML specifies how the document is to be displayed, and not what information is contained in the document. • Hard for machine to extract the embedded information. Relatively easy for human. John Doe 2 Backroads Lane New York 045935435 John.doe@gmail.com
• 78. XML vs HTML • Now look at the following: • In this case: ▫ The information contained is being marked, but not for displaying. ▫ Readable by both human and machines. <?xml version="1.0"?> <contact> <name>John Doe</name> <address>2 Backroads Lane</address> <country>New York</country> <phone>045935435</phone> <email>john.doe@gmail.com</email> </contact>
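Because the tags mark the data rather than its display, a program can extract the fields mechanically. A short sketch using Python's standard library, with the contact record above as the input document:

```python
# Parse the contact record from the slide with the standard library.
import xml.etree.ElementTree as ET

doc = """<?xml version="1.0"?>
<contact>
  <name>John Doe</name>
  <address>2 Backroads Lane</address>
  <country>New York</country>
  <phone>045935435</phone>
  <email>john.doe@gmail.com</email>
</contact>"""

root = ET.fromstring(doc)
print(root.find("name").text)   # John Doe
print(root.find("email").text)  # john.doe@gmail.com
```

This is exactly the "readable by machines" property the slide refers to: no screen-scraping of presentation markup is needed.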
• 79. SOAP • SOAP originally stood for "Simple Object Access Protocol". • Web Services expose useful functionality to Web users through a standard Web protocol called SOAP. • SOAP is an XML vocabulary standard enabling programs on separate computers to interact across any network. SOAP is a simple markup language for describing messages between applications. • SOAP mainly uses HTTP as a transport protocol; that is, the HTTP message contains a SOAP message as its payload.
• 80. SOAP Characteristics • SOAP has three major characteristics: ▫ Extensibility – security and WS-routing are among the extensions under development. ▫ Neutrality – SOAP can be used over any transport protocol such as HTTP, SMTP or even TCP. ▫ Independence – SOAP allows for any programming model.
  • 81. SOAP Building Blocks A SOAP message is an ordinary XML document containing the following elements: ▫ A required Envelope element that identifies the XML document as a SOAP message. ▫ An optional Header element that contains header information. ▫ A required Body element that contains call and response information. ▫ An optional Fault element that provides information about errors that occurred while processing the message.
• 82. SOAP Request POST /InStock HTTP/1.1 Host: www.stock.org Content-Type: application/soap+xml; charset=utf-8 Content-Length: 150 <?xml version="1.0"?> <soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope" soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding"> <soap:Body xmlns:m="http://www.stock.org/stock"> <m:GetStockPrice> <m:StockName>IBM</m:StockName> </m:GetStockPrice> </soap:Body> </soap:Envelope>
• 83. SOAP Response HTTP/1.1 200 OK Content-Type: application/soap+xml; charset=utf-8 Content-Length: 126 <?xml version="1.0"?> <soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope" soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding"> <soap:Body xmlns:m="http://www.stock.org/stock"> <m:GetStockPriceResponse> <m:Price>34.5</m:Price> </m:GetStockPriceResponse> </soap:Body> </soap:Envelope>
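A client can pull the price out of a SOAP response like the one above with a namespace-aware XML parser; a sketch using Python's standard library, with the envelope reproduced from the slide:

```python
# Extract the stock price from the SOAP response body.
import xml.etree.ElementTree as ET

response = """<?xml version="1.0"?>
<soap:Envelope
 xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
 soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
  <soap:Body xmlns:m="http://www.stock.org/stock">
    <m:GetStockPriceResponse>
      <m:Price>34.5</m:Price>
    </m:GetStockPriceResponse>
  </soap:Body>
</soap:Envelope>"""

# Map the prefixes used in the document to their namespace URIs.
ns = {
    "soap": "http://www.w3.org/2001/12/soap-envelope",
    "m": "http://www.stock.org/stock",
}
root = ET.fromstring(response)
price = root.find("soap:Body/m:GetStockPriceResponse/m:Price", ns).text
print(price)  # 34.5
```

The namespace map is what makes the lookup robust: the client matches elements by namespace URI, not by whatever prefix the server happened to choose.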
  • 84. SOAP Security • SOAP uses HTTP as a transport protocol and hence can use HTTP security mainly HTTP over SSL. • But, since SOAP can run over a number of application protocols (such as SMTP) security had to be considered. • The WS-Security specification defines a complete encryption system.
  • 85. WSDL • WSDL stands for Web Services Description Language. • WSDL is an XML vocabulary for describing Web services. It allows developers to describe Web Services and their capabilities, in a standard manner. • WSDL specifies what a request message must contain and what the response message will look like in unambiguous notation. In other words, it is a contract between the XML Web service and the client who wishes to use this service. • In addition to describing message contents, WSDL defines where the service is available and what communications protocol is used to talk to the service.
  • 86. The WSDL Document Structure • A WSDL document is just a simple XML document. • It defines a web service using these major elements: ▫ port type - The operations performed by the web service. ▫ message - The messages used by the web service. ▫ types - The data types used by the web service. ▫ binding - The communication protocols used by the web service.
• 87. WSDL Document <message name="GetStockPriceRequest"> <part name="stock" type="xs:string"/> </message> <message name="GetStockPriceResponse"> <part name="value" type="xs:string"/> </message> <portType name="StocksRates"> <operation name="GetStockPrice"> <input message="GetStockPriceRequest"/> <output message="GetStockPriceResponse"/> </operation> </portType>
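A client or tool can read the available operations out of such a WSDL fragment mechanically; a minimal sketch in Python against the (namespace-free) portType fragment above:

```python
# List the operations a portType offers, with their input/output messages.
import xml.etree.ElementTree as ET

wsdl = """<portType name="StocksRates">
  <operation name="GetStockPrice">
    <input message="GetStockPriceRequest"/>
    <output message="GetStockPriceResponse"/>
  </operation>
</portType>"""

port = ET.fromstring(wsdl)
for op in port.findall("operation"):
    print(op.get("name"),
          "in:", op.find("input").get("message"),
          "out:", op.find("output").get("message"))
```

This mirrors what a WSDL-aware toolkit does when it generates client stubs: enumerate the operations and wire each one to its request and response message definitions (a full WSDL document would also carry namespaces, types and bindings, omitted here for brevity).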
  • 88. UDDI • UDDI stands for Universal Description, Discovery and Integration. • UDDI is a directory for storing information about web services , like yellow pages. • UDDI is a directory of web service interfaces described by WSDL.
• 89. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 3. How a Web Service Works:  Initially, a client interested in finding a specific web service (e.g. a weather forecast web service) queries the discovery service (in Server 1).  The discovery service in Server 1 replies to the client that there is a web service on weather forecasts in Server 2.  The client then interacts with Server 2, asking how exactly it should interact with the weather forecast web service.  The service replies saying that the client should see the concerned WSDL.  The WSDL description provides the exact syntax of invocation along with the parameters. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 90. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 3. How a Web Service Works: Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE. Client Discovery Service Web Service Server 1 Server 2 Web Service
• 91. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 3. How a Web Service Works:  The client sends a SOAP message to the web server with the parameters (i.e. the location).  The web server replies with a SOAP response containing the weather forecast for the given location (the parameter supplied by the client).  All interactions between client and server are through SOAP.
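The discovery step above can be sketched as a simple lookup table that maps service names to endpoint and WSDL locations. This is only a toy illustration of the idea, not a real UDDI client; the service name and URLs below are hypothetical.

```python
# Toy service registry illustrating the discovery step: a client asks the
# registry where a service lives, then contacts that endpoint directly.
# All names and URLs here are hypothetical examples.

registry = {
    "WeatherForecast": {
        "endpoint": "http://server2.example.com/weather",
        "wsdl": "http://server2.example.com/weather?wsdl",
    },
}

def discover(service_name):
    """Return the endpoint and WSDL location for a named service."""
    entry = registry.get(service_name)
    if entry is None:
        raise LookupError(f"No such service: {service_name}")
    return entry

info = discover("WeatherForecast")
print(info["endpoint"], info["wsdl"])
```

A real deployment would replace the dictionary with a UDDI registry query, but the contract is the same: discovery returns where the service is and where its WSDL can be fetched.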
• 92. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 4. SOAP and WSDL:  SOAP and WSDL are the essential parts of the Web services architecture, layered as follows: Discovery, Aggregation (Processes) / WSDL (Description) / SOAP (Invocation) / HTTP (Transport).  The transport of messages is done by HTTP (HyperText Transfer Protocol).
• 93. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 4. SOAP and WSDL:  Web service invocation involves passing messages between the client and the server.  SOAP, or Simple Object Access Protocol, specifies how the request should be formatted while being sent to the server, and how the server should format its response.  SOAP is the most popular choice for invoking a web service.  HTTP is the most popular transport protocol.
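A SOAP request is just an XML envelope carried over an HTTP POST. The sketch below builds a minimal SOAP 1.1 envelope for the weather-forecast example and parses it back with the Python standard library; the operation and parameter names are illustrative, not taken from a real WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

def build_soap_request(operation, params):
    """Build a minimal SOAP 1.1 envelope invoking `operation` with `params`."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, operation)
    for name, value in params.items():
        ET.SubElement(op, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")

# The client would POST this string to the service endpoint over HTTP,
# with a Content-Type of text/xml and a SOAPAction header.
request = build_soap_request("GetWeatherForecast", {"location": "Bangalore"})

# The server parses the body to recover the operation and its parameters.
root = ET.fromstring(request)
body = root.find(f"{{{SOAP_NS}}}Body")
op = list(body)[0]
print(op.tag, {child.tag: child.text for child in op})
```

The response travels back the same way: another envelope whose body carries the result element.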
• 94. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 5. Description:  Web services are self-describing.  Once a web service is located, it can be asked to describe itself.  It describes what operations it supports and how to invoke it.  This is done by a language called WSDL, the Web Service Description Language.  A web service is addressed through its URI.  Web service invocation happens automatically, without human intervention.
• 95. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 6. Creating Web Services:  Web services are created by a web service programmer using the programming language of choice, along with WSDL.  Existing tools can be used to create a web service, and the tool will itself generate the WSDL.  SOAP code is always generated and interpreted automatically.  The piece of code through which a client application invokes a web service is called a 'stub'.  Based on the WSDL, the tool can generate the stub automatically.  Stubs can be client side or server side stubs.
• 96. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 6. Creating Web Services:  A stub is a piece of code that generates a SOAP request from a client, or interprets a SOAP message which is a server response. From the WSDL, two stubs are generated:  Client stub: client-side code that generates SOAP requests and interprets the SOAP responses received from the server.  Server stub: server-side code that interprets the SOAP requests received from clients and generates the SOAP responses to be sent to clients.
• 97. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 6. Creating Web Services:  Stubs are generated only once.  The stubs are reused any number of times, until replacements or modifications are introduced to the web service.  Whenever a client application needs to invoke a web service, it calls the client side stub.  The client side stub generates a proper SOAP request according to the WSDL of the web service being called by the client application.  This process is called 'marshalling' or 'serializing'.
  • 98. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 6. Creating Web Services:  The SOAP request is sent over a network using HTTP transport protocol.  The server receives this SOAP request from the client.  The server simply hands over the request to the stub on the server side.  The server stub will convert the SOAP request into a form which can be understood by the service implementation.  This step is called ‘unmarshalling or deserializing’. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 99. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 6. Creating Web Services:  After the SOAP request is deserialized, the server stub invokes the service implementation, which executes the request it has received, generating a result.  The result generated by executing the request is handed over to the server stub, which now converts it into a SOAP response.  The SOAP response is sent over the network, again using the HTTP transport protocol.  The client stub receives this SOAP response and converts it into a form which can be understood by the client application.  Finally, the application receives the result of the web service invocation and uses it.
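The marshalling/unmarshalling round trip described above can be modelled in a few lines. In this sketch the "network" is just a function call and the payload is simplified XML; a real stub would use the WSDL-derived types and HTTP transport, and the stock symbol and price are invented example data.

```python
import xml.etree.ElementTree as ET

# --- service implementation on the server (a plain function) ---
def get_stock_price(stock):
    prices = {"ACME": "42.50"}          # illustrative data only
    return prices.get(stock, "unknown")

# --- server stub: unmarshals the request, invokes the implementation,
#     then marshals the result into a SOAP-like response ---
def server_stub(request):
    stock = ET.fromstring(request).findtext("stock")      # deserialize
    result = get_stock_price(stock)                       # invoke service
    return f"<GetStockPriceResponse><value>{result}</value></GetStockPriceResponse>"

# --- client stub: marshals arguments into a SOAP-like request ---
def client_stub(stock):
    request = f"<GetStockPriceRequest><stock>{stock}</stock></GetStockPriceRequest>"
    response = server_stub(request)     # stands in for the HTTP round trip
    return ET.fromstring(response).findtext("value")      # deserialize

print(client_stub("ACME"))
```

Note how the application code on both ends only sees ordinary function calls and values; all XML handling lives in the stubs, which is exactly why the tool-generated stubs can be regenerated from the WSDL without touching the application.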
• 100. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 7. Server Side: Server Side Architecture: SOAP Engine / Application Server / HTTP Server / Web Server
• 101. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 7. Server Side:  The HTTP server is commonly called the web server.  It is a piece of software which knows how to handle HTTP messages.  E.g.: Apache HTTP Server, BEA WebLogic, WebSphere, Pramati, etc.  An application server is a piece of software that hosts and provides facilities for hosting 'applications', which will be accessed by different clients.  The SOAP engine runs as an application inside the application server.  E.g.: Jakarta Tomcat, a Java Servlet and JavaServer Pages (JSP) container that is frequently used with Apache Axis and the Globus Toolkit.
• 102. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 7. Server Side:  We can operationalize web services by installing a SOAP engine in the application server.  A SOAP engine is a piece of software that can process SOAP requests and SOAP responses.  It is common to use a SOAP engine which can actually generate server stubs for each individual web service.  Apache Axis is a good example of a SOAP engine.  The role of the SOAP engine is limited to manipulating SOAP.
• 103. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 7. Server Side:  A web service is a piece of software that exposes a set of operations.  If the web service is implemented in Java, the web service will be a Java class.  This web service will be requested for service by various clients through SOAP messages, which will be analyzed by the SOAP engine.
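The dispatch role of the SOAP engine can be sketched as a table mapping operation names to methods on deployed service objects; engines such as Apache Axis derive this routing (far more elaborately) from deployment descriptors and the WSDL. The service class and operation names below are hypothetical.

```python
# A toy "SOAP engine": it owns a table of deployed services and routes an
# incoming operation name to the right handler. The WeatherService class
# and its response text are invented for illustration.

class WeatherService:                      # hypothetical deployed service
    def get_forecast(self, location):
        return f"Sunny in {location}"      # illustrative response

class SoapEngine:
    def __init__(self):
        self.operations = {}               # operation name -> handler

    def deploy(self, operation, handler):
        self.operations[operation] = handler

    def dispatch(self, operation, **params):
        """Route a parsed SOAP request to the deployed service method."""
        handler = self.operations.get(operation)
        if handler is None:
            raise LookupError(f"Unknown operation: {operation}")
        return handler(**params)

engine = SoapEngine()
engine.deploy("GetForecast", WeatherService().get_forecast)
print(engine.dispatch("GetForecast", location="Bangalore"))
```

In the real stack, the HTTP server hands the raw request to the application server, the SOAP engine parses the envelope, and only then does this kind of lookup select the Java class method to invoke.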
  • 104. UNIT – 6 GRID Computing – 2: Web Services and the Service Oriented Architecture (SOA) : 7. Server Side: Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 105. UNIT – 6 GRID Computing – 2: Globus Toolkit Version 4: GT4 Architecture: Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 106. UNIT – 6 GRID Computing – 2: Globus Toolkit Version 4: Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 107. UNIT – 6 GRID Computing – 2: Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 108. UNIT – 6 GRID Computing – 2: Storage Request Broker (SRB) Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 109. UNIT – 7 CLUSTER Computing – 1: What is cluster computing?  Cluster computing is a form of computing in which a group of computers are linked together so that they can act like a single entity.  It is the technique of linking two or more computers into a network (usually through a local area network) in order to take advantage of the parallel processing power of those computers.  To meet Grand Challenge Applications (GCA), the CPU power may be increased or, alternatively, parallel computing approaches may be used to break a job down into several components and allow multiple processors to process them simultaneously, so that very high overall performance can be achieved.
  • 110. UNIT – 7 CLUSTER Computing – 1: Approach to Parallel Computing: 1. Massively parallel Processors(MPPs) 2. Symmetric Multi Processors(SMPs) 3. Cache-Coherent Non-Uniform Memory Access(CC-NUMA) 4. Distributed System Processing 5. Cluster Computing Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 111. UNIT – 7 CLUSTER Computing – 1: Approach to Parallel Computing: 1. Massively parallel Processors(MPPs)  MPP approach comprises a large parallel processing system with shared nothing architecture.  This is known as multi computer approach to parallel processing.  There may be several hundred processing elements or nodes, all connected with each other through a high speed interconnection network/switch.  Each node will contain at least one or more processors and one main memory unit. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 112. UNIT – 7 CLUSTER Computing – 1: Approach to Parallel Computing: 1. Massively parallel Processors(MPPs)  Special nodes can have additional peripherals such as disks or backup systems connected.  Each node will have a separate copy of the operating system.  All the nodes parallelly execute the tasks assigned to them. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 113. UNIT – 7 CLUSTER Computing – 1: Approach to Parallel Computing: 2. Symmetric Multi Processors (SMPs)  SMP systems can have 2 to 256/512/1024 processors.  All processors share all the global resources available, such as I/O systems, memory and the bus.  Only a single copy of the operating system executes across all the processors put together.
  • 114. UNIT – 7 CLUSTER Computing – 1: Approach to Parallel Computing: 3. Cache-Coherent Non-Uniform Memory Access(CC-NUMA)  CC-NUMA is a scalable multiprocessor system architecture having a cache coherent non-uniform memory access.  CC-NUMA system has a global view of all the memory.  NUMA stands for non-uniform time taken to access the nearest and farthest part of the memory (shortest and the longest time taken to access the memory). Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 115. UNIT – 7 CLUSTER Computing – 1: Approach to Parallel Computing: 4. Distributed System Processing  Distributed Processing, with processors distributed in a network, can be understood as conventional networking of independent computer systems.  Each node is under its own operating system.  Each node can also be a cluster by itself. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 116. UNIT – 7 CLUSTER Computing – 1: Approach to Parallel Computing: 5. Cluster Computing:  A Cluster is a collection of PCs or Workstations that are connected with each other, using some networking technology (with an interconnection network or switch).  For high performance or parallel computing purposes a cluster will have a number of high performance PCs or workstations, connected with each other through a high speed interconnection network. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 117. UNIT – 7 CLUSTER Computing – 1: Approach to Parallel Computing: 5. Cluster Computing:  A cluster works as a single, integrated collection of resources.  It can have a single system image spanning all its nodes.  This is in contrast with a grid of distributed processing nodes, where each node has a separate identity, a separate operating system and a separate set of resources, which could also be shared.
• 118. UNIT – 7 CLUSTER Computing – 1: How to achieve Low cost Parallel computing through clusters:  The use of a cluster of PCs and workstations to achieve high performance computing at supercomputer scale is successful.  Clustering became practical due to the standardization of tools and utilities used by parallel applications.  Message Passing Libraries (MPL) and the data parallel language HPF are some of the standards.  These standards enabled the development, testing and debugging of applications in a cluster (or NOW, Network of Workstations / COW, Cluster of Workstations).
• 119. UNIT – 7 CLUSTER Computing – 1: How to achieve Low cost Parallel computing through clusters:  For testing and debugging, CPU time is not required to be accounted or charged.  For dedicated parallel platforms in regular execution, CPU time is accounted and charged.
• 120. UNIT – 7 CLUSTER Computing – 1: How to achieve Low cost Parallel computing through clusters: Components for clusters are based on the basic concepts of parallel processing – to use multiple processors, network interfaces, memory and hard disks, with facilities as follows: 1. Multiple workstations can be connected to form a cluster to act as an MPP (Massively Parallel Processor) with shared nothing architecture. 2. The memory associated with each workstation acts as an aggregate DRAM (Dynamic RAM) cache, so as to improve virtual memory and file system performance.
• 121. UNIT – 7 CLUSTER Computing – 1: How to achieve Low cost Parallel computing through clusters: 3. RAID (Redundant Array of Inexpensive Disks) – using arrays of workstation disks to provide cheap, highly available and scalable file storage. It uses redundant arrays of workstation disks with the LAN as the I/O backplane. Parallel I/O is also possible with appropriate middleware. 4. Multiple communication paths are possible by using many networks for parallel data transfer between nodes.
• 122. UNIT – 7 CLUSTER Computing – 1: Definition and Architecture of Cluster:  A cluster is a specific category of parallel distributed processing system consisting of a collection of interconnected standalone computers working together as a single integrated computing resource.  Each of these standalone computers can be viewed as a node.  A node, in general, can be a single processor or multiple processor system – a PC, workstation or SMP – with memory, I/O facilities and an operating system.  A cluster will have 2 or more nodes together.  These nodes can be in a single cabinet, or may be located as standalone nodes in a LAN.
  • 123. UNIT – 7 CLUSTER Computing – 1: Definition and Architecture of Cluster:  The architecture of a Cluster will have a cluster middleware which provides a single system view to the user.  This middleware is the essential component of the LAN or cluster cabinet, which provides a single system view of the cluster. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 124. UNIT – 7 CLUSTER Computing – 1: Definition and Architecture of a Cluster: Basic Architecture of a Cluster:
  Sequential Applications | Parallel Applications
  Cluster Middleware
  Interconnection network/switch
  PC-1  PC-2  PC-3  PC-4  …  PC-n
• 125. UNIT – 7 CLUSTER Computing – 1: Architecture of a Cluster:  In the cluster architecture diagram, a cluster of PCs/workstations is shown connected through a network/switch, above which a middleware software layer operates over the cluster to provide the illusion of a single system to the outside world – the users who will be running applications, which may be sequential or parallel.  Parallel programming application development environments and tools, such as parallel compilers, are available.  The cluster middleware aims to provide a Single System Image (SSI) and a System Availability Infrastructure (SAI).
• 126. UNIT – 7 CLUSTER Computing – 1: Architecture of a Cluster:  The cluster middleware, which aims to provide a Single System Image (SSI) and System Availability Infrastructure (SAI), comprises: 1. Hardware, such as a memory channel. 2. An operating system kernel or gluing layer, such as Solaris MC and GLUnix. 3. Application subsystems, such as system management tools and electronic forms, and runtime systems such as parallel file systems. 4. Resource management and scheduling software, such as LSF (Load Sharing Facility) and CODINE (Computing in Distributed Network Environment).
  • 127. UNIT – 7 CLUSTER Computing – 1: Architecture of a Cluster:  The network interface hardware acts as a communication processor.  This interface takes care of transmitting and receiving packets of data between cluster nodes via a network or switch.  Fast and reliable data communication among the cluster nodes and to the outside world is achieved by the communication software.  The nodes of the cluster work collectively as an integrated computing resource or they can operate as individual computers. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
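The communication layer's job of moving packets between cluster nodes can be illustrated with an ordinary connected socket pair; in a real cluster the two endpoints would be network interfaces on different machines joined through the interconnection switch, and the payloads below are invented examples.

```python
import socket

# Two connected endpoints standing in for two cluster nodes. In practice
# these would be sockets on separate machines joined by the interconnect;
# socket.socketpair() gives us the same send/receive semantics locally.
node_a, node_b = socket.socketpair()

node_a.sendall(b"task: multiply block 7")   # node A ships a work packet
packet = node_b.recv(1024)                   # node B's interface receives it
node_b.sendall(b"ack: block 7 done")         # and replies
reply = node_a.recv(1024)

print(packet.decode(), "/", reply.decode())
node_a.close()
node_b.close()
```

The communication software of a cluster wraps exactly this kind of transmit/receive primitive with reliability, naming and buffering so that nodes can cooperate as one integrated resource.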
• 128. UNIT – 7 CLUSTER Computing – 1: Functionality of a Cluster:  A cluster can offer high performance, high throughput and high availability.  A cluster can be expanded, so it is scalable.  Cluster computing enables an organization to expand its processing power using standard existing components, i.e. PCs and workstations, which are off-the-shelf commodity hardware and software available at low cost.  The organization need not procure proprietary hardware with high performance/supercomputing capabilities.
• 129. UNIT – 7 CLUSTER Computing – 1: Functionality of a Cluster:  The organization can conserve and preserve its existing hardware stock and assemble it into a cluster for high performance computing.  The user organization can leverage its own existing hardware to build high performance/supercomputing clusters from its own resources – a factor of 10 to 100 cost reduction in comparison with high performance supercomputing mainframes and supercomputers from vendors such as Cray, IBM, SGI, NEC, etc.
  • 130. UNIT – 7 CLUSTER Computing – 1: Categories of Cluster: Classification of Clusters can be made on the basis of various criteria such as 1. Functionality 2. Performance 3. Availability 4. Node ownership 5. Type of node hardware 6. Node operating system 7. Node configuration 8. Level or layering of clustering Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 131. UNIT – 7 CLUSTER Computing – 1: Categories of Cluster:  In terms of performance, a cluster can be classified as either a high performance cluster or a high availability cluster.  In terms of node ownership, a cluster can be classified as a dedicated cluster or a non-dedicated (shared) cluster.  In dedicated clusters, all the nodes (PCs/workstations) are dedicated fully to the cluster, with no independent usage.  In non-dedicated or shared clusters, the nodes are used independently by their respective end users, while idle CPU cycles are simultaneously 'stolen' for cluster work.
  • 132. UNIT – 7 CLUSTER Computing – 1: Categories of Cluster:  Classification based on Operating System is on the basis whether the node has LINUX, windows, Solaris, etc.  Classification based on node configuration is based on whether the cluster is homogeneous or heterogeneous.  In a homogeneous cluster, all the nodes have similar architecture and have the same operating system.  In a heterogeneous cluster all the nodes have different architecture and run on different operating systems. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 133. UNIT – 7 CLUSTER Computing – 1: Categories of Cluster:  Classification based on levels of clustering depends on the location of the nodes and their number.  In group clusters (2 to 99 nodes), nodes are connected by a SAN (Storage Area Network).  Departmental clusters have nodes in the 10s or 100s.  National metacomputers are WAN/Internet based and may have several thousand nodes (which may themselves be clusters).  International metacomputers are Internet based and may have tens or hundreds of thousands, or millions, of nodes.
  • 134. UNIT – 7 CLUSTER Computing – 1: Categories of Cluster:  Individual clusters can be connected to form larger systems or cluster of clusters.  Internet itself can be viewed and utilized as a cluster. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 135. UNIT – 7 CLUSTER Computing – 1: Cluster Middleware:  The essence of a cluster is to provide a Single System Image(SSI) for the entire cluster of nodes, which may be individual PCs or Workstations.  Single System Image(SSI) is the most essential aspect of a cluster of PCs/workstations.  SSI has to be provided by a middleware, above the operating system layer and below the user interface layer.  A Cluster middleware consists of 2 sub-layers of software infrastructure: 1. SSI infrastructure 2. System Availability Infrastructure Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 136. UNIT – 7 CLUSTER Computing – 1: Cluster Middleware Architecture: Architecture of a Cluster Middleware:
  User Interface
  SSI Middleware
  Operating System
  Interconnection switch
  PC-1  PC-2  PC-3  PC-4  …  PC-n
  • 137. UNIT – 7 CLUSTER Computing – 1: Cluster Middleware:  SSI infrastructure attaches the operating system to all the nodes in the cluster so as to provide a single system image.  The System Availability Infrastructure aims at providing fault tolerant computing environment among all the nodes of a cluster by providing for failover facilities, recovery from failures and requisite check pointing and other facilities.  Applications can be system management tools, run time systems such as parallel file systems and middleware modules such as resource management and scheduling software. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 138. UNIT – 7 CLUSTER Computing – 1: Levels and Layers of Single System Image [ Parts of SSI ]: 1. HARDWARE Layer:  SSI at the hardware level allows a user to view the cluster as a shared-memory system.  A memory channel in a dedicated cluster environment provides virtual shared memory among nodes by means of internodal address space mapping. 2. OPERATING SYSTEM Kernel or 'gluing layer':  A cluster operating system has to support the execution of both parallel and sequential applications.  The goal is to pool the resources of the cluster for both parallel and sequential applications.  For this purpose, 'gang scheduling' of parallel programs has to be done by the OS.
  • 139. UNIT – 7 CLUSTER Computing – 1: Levels and Layers of Single System Image[ Parts of SSI ]: 2. Operating System Kernel or ‘gluing layer’:  The OS has to identify the idle resources such as processors, memory and network in the cluster system and offer globalized access to all these resources.  It should achieve load balancing by supporting optimal process migration.  It should also provide fast inter process communication for the application.  OS which supports SSI are SCO UnixWare and SUN Solaris MC.  A full scale cluster wide SSI enables all physical resources and kernel resources to be accessible from and usable by all nodes within cluster systems.  Each node should see a uniform single system image.  A full SSI at Kernel Level can be very economical, since the applications at each node will run directly without any modifications or code rewriting. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 140. UNIT – 7 CLUSTER Computing – 1: Levels and Layers of Single System Image[ Parts of SSI ]: 3. Application Layer and middle/sub-system Layer:  Multiple cooperating components can be presented as a Single System Image to the user or administrator as a single application.  Single system Image can be at all levels.  A cluster administration tool offers a single point of management and control of SSI services.  Sub-systems of cluster offer single means to create easy-to-use GUI tools.  File System SSI ensures that every node in the cluster has the same view of the data.  Global Scheduling of jobs is done by the scheduling modules in an efficient manner, offering high availability. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
• 141. UNIT – 7 CLUSTER Computing – 1: Levels and Layers of Single System Image [ Parts of SSI ]: 3. Application Layer and middle/sub-system Layer:  The benefits of SSI are well identifiable.  From any node, a single image of all the resources is offered, without any need for the end user, or the operators, to know where a particular application will run or where a particular resource exists.  SSI enables the system administrator to control the entire cluster as one system, with a single familiar interface and commands, and with central system management capability.  It improves system reliability and operational manageability.
• 142. UNIT – 7 CLUSTER Computing – 1: Resource Management and Scheduling [ RMS ]:  RMS is the act of distributing the application load among the individual computer systems in a cluster, so as to maximize throughput by maximizing resource utilization.  RMS is a software function and has two parts, namely the resource manager and the resource scheduler.  The resource manager is responsible for all aspects of locating and allocating computational resources, for authentication, and for issues relating to process creation and migration.  The resource scheduler is responsible for scheduling and queuing.  RMS is based on a client-server architecture.  RMS functionality is based on server daemons.
• 143. UNIT – 7 CLUSTER Computing – 1: Resource Management and Scheduling [ RMS ]:  Users can interact with RMS through a client program or a web browser.  Applications can run in online or batch mode.  A batch job is submitted along with its details, such as the location of the executable module and input datasets, the location where the output should be placed, the system type, the maximum length of the run, and the nature of the resources required – sequential or parallel.  RMS will execute the submitted job based on the details supplied.
• 144. UNIT – 7 CLUSTER Computing – 1: Resource Management and Scheduling [ RMS ]:  RMS also has to enable heterogeneous environments of PCs/workstations, SMPs and other dedicated parallel platforms to be easily and efficiently utilized, with the following services: 1. Load balancing, by distributing the application load across all the nodes in the cluster to achieve efficient and effective use of all the resources; this may involve process migration from one node to another. 2. Process migration, involving termination/suspension, moving and restarting processes on another node. 3. Resource scheduling, involving setting up job queues to easily manage the resources of a given organization. Each queue can be configured with certain attributes, such as the priority of short jobs against the priority of long jobs, or based on specialized resources such as parallel computing platforms.
• 145. UNIT – 7 CLUSTER Computing – 1: Resource Management and Scheduling [ RMS ]: 4. Checkpointing is required for managing abrupt abortions or job terminations, so that restart/recovery can be easily achieved, thus improving system availability and reliability. 5. Better CPU utilization is achieved by RMS by engaging idle CPU cycles with jobs. In case of idling CPUs, jobs can be assigned to them by RMS.
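A minimal sketch of the RMS ideas above: jobs carry a priority and a resource requirement, a queue orders them, and the scheduler places each job on the currently least-loaded node, which is a crude form of load balancing. The class, field and node names are invented for illustration; real systems such as LSF or CODINE add authentication, migration, checkpointing and far richer policies.

```python
import heapq

# Toy resource manager/scheduler. Jobs are (priority, job_id, cpu_need);
# a lower priority number runs first. Nodes track their current load.

class Scheduler:
    def __init__(self, nodes):
        self.queue = []                      # priority queue of pending jobs
        self.load = {n: 0 for n in nodes}    # node name -> assigned CPU units
        self.placement = {}                  # job_id -> node name

    def submit(self, job_id, priority, cpu_need):
        heapq.heappush(self.queue, (priority, job_id, cpu_need))

    def dispatch(self):
        """Assign every queued job, highest priority first, to the
        currently least-loaded node (simple load balancing)."""
        while self.queue:
            priority, job_id, cpu_need = heapq.heappop(self.queue)
            node = min(self.load, key=self.load.get)
            self.load[node] += cpu_need
            self.placement[job_id] = node

rms = Scheduler(["node1", "node2"])
rms.submit("longjob", priority=5, cpu_need=4)
rms.submit("shortjob", priority=1, cpu_need=1)
rms.dispatch()
print(rms.placement, rms.load)
```

The short job is dispatched first because of its higher priority, and the long job then lands on the other, emptier node, mirroring the queue-attribute and load-balancing ideas in the list above.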
• 146. UNIT – 7 CLUSTER Computing – 1: Cluster Programming Environment and Tools: Special parallel programming environments and the corresponding tools are essential for cluster computing. They can be classified as: 1. Threads 2. Message passing systems 3. Distributed Shared Memory (DSM) systems 4. Parallel debuggers and profilers 5. Performance analysis tools 6. Cluster administration tools
• 147. UNIT – 7 CLUSTER Computing – 1: Cluster Programming Environment and Tools: 1. Threads  Threads are a paradigm for concurrent programming on single or multiple processor systems.  On a conventional uniprocessor, multiprocessing is achieved through time sharing.  This results in effective utilization of resources.  In multiprocessor environments, threads are primarily used to utilize all the available processors.  Multithreaded applications offer quicker response to the user and run faster.  Thread creation is cheaper and easier than forking processes.  Threads communicate using shared variables, as they are created within their parent process's address space.
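The points above, threads sharing their parent process's address space and communicating through shared variables, can be seen directly in a few lines:

```python
import threading

counter = 0                      # shared variable: visible to all threads
lock = threading.Lock()          # so concurrent updates must be synchronized

def worker(n):
    global counter
    for _ in range(n):
        with lock:               # without the lock, increments could be lost
            counter += 1

# Threads are cheap to create compared with forking whole processes,
# and they share memory instead of needing explicit message passing.
threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)                   # 4 threads x 1000 increments = 4000
```

On a multiprocessor node the runtime can schedule these threads onto different processors, which is exactly how threads help utilize all available CPUs.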
• 148. UNIT – 7 CLUSTER Computing – 1: Cluster Programming Environment and Tools: 2. Message Passing Systems (MPI and PVM) 1. Libraries:  To write efficient parallel programs for distributed memory systems, message passing libraries are required.  These libraries provide routines to initiate and configure the messaging environment, and also for sending and receiving packets of data.  Two such systems for message passing are PVM (Parallel Virtual Machine) and MPI (Message Passing Interface).  PVM is an environment and also a library for message passing, and can be used for running parallel applications on systems ranging from supercomputers to clusters of workstations.
• 149. UNIT – 7 CLUSTER Computing – 1: Cluster Programming Environment and Tools: 2. Message Passing Systems (MPI and PVM) 2. MPI:  MPI is a standard message passing specification, defined by the MPI Forum.  The MPI standard, which aims at portability and efficiency, defines a message passing library.  MPI and PVM libraries are available for ANSI C, C++, Fortran 77 & 90, and Java.
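The send/receive style these libraries provide can be modelled with thread "ranks" exchanging data through per-rank inboxes. This is only a toy model of the pattern; a real program would use MPI itself (e.g. MPI_Send/MPI_Recv in C, or the mpi4py bindings in Python) or PVM, where the ranks are separate processes on different cluster nodes.

```python
import threading
import queue

# One inbox per "rank", standing in for MPI's per-process message channels.
inbox = {0: queue.Queue(), 1: queue.Queue()}

def send(dest, data):
    """Deliver `data` to the destination rank's inbox (like MPI_Send)."""
    inbox[dest].put(data)

def recv(rank):
    """Block until a message arrives for `rank` (like MPI_Recv)."""
    return inbox[rank].get()

results = {}

def rank0():
    send(1, [1, 2, 3, 4])            # distribute work to rank 1
    results["sum"] = recv(0)         # wait for the partial result

def rank1():
    data = recv(1)                   # receive the work
    send(0, sum(data))               # send back the computed result

t0 = threading.Thread(target=rank0)
t1 = threading.Thread(target=rank1)
t0.start(); t1.start()
t0.join(); t1.join()
print(results["sum"])                # 1+2+3+4 = 10
```

The essential property shown here, and the one MPI standardizes across machines and languages, is that the only way ranks share data is by explicit send and receive, since distributed memory nodes have no common address space.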
• 150. UNIT – 7 CLUSTER Computing – 1: Cluster Programming Environment and Tools: 2. Message Passing Systems (MPI and PVM) 3. P-CORBA:  P-CORBA is a model for parallel programming over CORBA.  It incorporates the notion of concurrency into conventional CORBA.  This model provides a way to balance the load on a CORBA based distributed system.  It provides a new idea for achieving object migration in CORBA.
• 151. UNIT – 7 CLUSTER Computing – 1: Cluster Programming Environment and Tools: 3. Distributed Shared Memory (DSM)  DSM enables programming on a shared variable basis and can be implemented using a software or hardware solution approach.  Software DSMs are built as separate layers on top of the communication interface.  DSM can be implemented by compile time methods only, by run time methods, or by combining both.  The characteristics of hardware DSMs are: better speed than software DSMs, and no burden on the application and software layers.  Examples of hardware DSMs: DASH and Merlin.  Example of a software DSM: Linda.
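A software DSM presents a shared-variable view on top of what is physically message passing. The sketch below fakes that idea with a central store guarded by a lock; a real DSM layer distributes the data across nodes and runs a coherence protocol, and everything here (class, variable names) is invented for illustration.

```python
import threading

# Toy software-DSM layer: every "node" reads and writes named shared
# variables through this object, as if they lived in one shared memory.
# A real DSM distributes the data across nodes and keeps copies coherent;
# here a single dict plus a lock stands in for that machinery.

class SharedMemory:
    def __init__(self):
        self._store = {}
        self._lock = threading.Lock()

    def write(self, name, value):
        with self._lock:
            self._store[name] = value

    def read(self, name):
        with self._lock:
            return self._store.get(name)

dsm = SharedMemory()

def node_a():
    dsm.write("x", 99)               # "node A" writes a shared variable

def node_b(out):
    t = threading.Thread(target=node_a)
    t.start()
    t.join()                         # wait until A's write is done
    out.append(dsm.read("x"))        # "node B" then sees A's write

result = []
node_b(result)
print(result[0])
```

The appeal of DSM is visible even in this toy: the two "nodes" never exchange an explicit message, yet one observes the other's write, which is exactly the programming model DSM systems aim to preserve across physically distributed memories.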
  • 152. UNIT – 7 CLUSTER Computing – 1: Cluster Programming Environment and Tools: 4. Parallel Debuggers  The High Performance Debugging Forum (HPDF), under the Parallel Tools Consortium project, has developed the HPD Version specification defining the functionality, semantics and syntax for a command-line parallel debugger. A parallel debugger should be able to:  manage multiple processes and the multiple threads within a process;  display each process in its own window, and set source-level and machine-level breakpoints;  share breakpoints between groups of processes;  define and watch evaluation points, display arrays, and manipulate code variables and their contents. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 153. UNIT – 7 CLUSTER Computing – 1: Cluster Programming Environment and Tools: 4. Parallel Debuggers  TotalView is a commercial parallel debugger with a GUI.  It supports parallel programming in C, C++, Fortran 77/90, HPF, MPI and PVM.  It runs on UNIX operating systems such as Sun Solaris and SGI IRIX. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 154. UNIT – 7 CLUSTER Computing – 1: Cluster Programming Environment and Tools: 5. Performance Analysis Tools A performance analysis tool typically consists of:  components for inserting instrumentation calls to the performance monitoring routines into the user's application;  a runtime performance library of monitoring routines which measure and record various aspects of program performance;  a set of tools for processing and displaying the performance data. Commonly used tools are:  Pablo – monitoring library and analysis;  AIMS – instrumentation, monitoring library and analysis;  MPE – logging library and snapshot performance visualization. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
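The three parts listed above (instrumentation calls, a monitoring library that records measurements, and a display step) can be sketched in miniature. All names here are illustrative and are not taken from Pablo, AIMS or MPE.

```python
# Minimal instrumentation sketch: the decorator inserts the
# instrumentation call, the records list acts as the monitoring
# library's store, and the final loop is the display step.
import time
from functools import wraps

records = []  # the "runtime performance library" store

def instrument(fn):
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        records.append((fn.__name__, time.perf_counter() - start))
        return result
    return wrapper

@instrument
def compute(n):
    # A stand-in for the user's application code.
    return sum(i * i for i in range(n))

compute(10_000)
for name, elapsed in records:          # the "display" step
    print(f"{name}: {elapsed:.6f}s")
```

Real tools differ mainly in scale: they instrument message-passing calls across all nodes and feed the recorded events to graphical timeline viewers rather than a print loop.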
  • 155. UNIT – 7 CLUSTER Computing – 1: Cluster Programming Environment and Tools: 6. Cluster Administration Tools  It is essential to be able to monitor a cluster in its entirety through a GUI.  A good administrative tool contributes to good performance of the cluster.  Examples are Berkeley NOW (Network of Workstations) and SMILE (Scalable Multicomputer Implementation using Low-cost Equipment).  Berkeley NOW gathers and stores data in a relational database.  It uses a Java applet to allow users to monitor the system from a browser.  SMILE has a monitoring tool called K-CAP.  K-CAP uses a Java applet to connect to the management node through a predefined URL in the cluster.  PARMON is a comprehensive environment for monitoring large clusters. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 156. UNIT – 7 CLUSTER Computing – 1: High Throughput Computing Cluster [HTC]: HTC is an environment which provides large amounts of processing capacity to consumers over long periods of time, by fully harnessing the existing computational resources available on the cluster network. For achieving high throughput computing over long periods of time, the clustering environment should be robust, reliable and maintainable: it must survive software and hardware failures through appropriate failover and failback mechanisms, allow resources to join and leave the cluster easily, and enable system upgrades and reconfiguration without any significant downtime, giving a very high MTBF (Mean Time Between Failures). Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 157. UNIT – 7 CLUSTER Computing – 1: HTC Cluster System-Condor:  In Condor, each user/customer is represented by a 'customer agent', whose job is to manage a queue of application descriptions and send the requests for resources to the 'matchmaker'.  The resource (or donor) is represented by a 'resource agent', whose job is to implement the policies of the resource owner, i.e., to send resource offers on behalf of the donor to the matchmaker.  The job of the matchmaker is to find a match between the resources offered by donors on the one hand and the requests for resources by customers or users on the other.  On finding a match, the matchmaker notifies both agents that a match has been found. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 158. UNIT – 7 CLUSTER Computing – 1: HTC Cluster System-Condor: [Diagram: the Condor matchmaker. Customer agents send requests and the donor's resource agent sends offers to the matchmaker; once matched, the two agents communicate directly via the claiming protocol.] Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 159. UNIT – 7 CLUSTER Computing – 1: HTC Cluster System-Condor:  On notification, the customer agent and the resource agent perform a claiming protocol.  Requests and offers may also contain constraints which need to be satisfied (e.g., a specific operating system).  The matchmaker may have its own constraints.  The matchmaker may impose a specific priority policy on the customers, depending on the importance, urgency, nature and duration of their jobs, etc.  The matchmaker implements a customer priority mechanism by matching resource requests in a specific priority order.  Customers also have the choice to break the allocation at any given time, depending on their own priorities. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
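The request/offer/constraint matching described above can be sketched as a toy matchmaker. The data, the single constraint (a required operating system) and the priority policy are all illustrative assumptions; Condor's real mechanism is its ClassAd language, which expresses far richer constraints on both sides.

```python
# Toy matchmaker in the spirit of Condor: customer agents submit
# requests with constraints, resource agents submit offers, and the
# matchmaker pairs them, serving higher-priority customers first.

requests = [  # (customer, priority, required OS)
    ("alice", 2, "linux"),
    ("bob",   1, "solaris"),
]
offers = [    # (donor workstation, OS provided)
    ("ws1", "linux"),
    ("ws2", "solaris"),
]

def matchmake(requests, offers):
    matches = []
    free = list(offers)
    # Serve customers in priority order (the priority policy).
    for customer, _, need_os in sorted(requests, key=lambda r: -r[1]):
        for offer in free:
            donor, have_os = offer
            if have_os == need_os:                 # constraint check
                matches.append((customer, donor))  # notify both agents
                free.remove(offer)                 # resource is claimed
                break
    return matches

print(matchmake(requests, offers))  # [('alice', 'ws1'), ('bob', 'ws2')]
```

After this step, the matched customer agent and resource agent would contact each other directly to run the claiming protocol; the matchmaker's involvement ends with the notification.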
  • 160. UNIT – 7 CLUSTER Computing – 1: HTC Cluster System-Condor: [Diagram: layered architecture of the Condor system. From top to bottom: the agent; the Process Management API, Network API and Workstation Statistics API; the system call interface; and the operating system layer.] Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 161. UNIT – 7 CLUSTER Computing – 1: HTC Cluster System-Condor:  Any user or customer of Condor has to face and overcome several challenges, such as resources from heterogeneous environments, differing network protocols, and remote file access, i.e., the ability of an application to access its data files from any workstation in the cluster.  To achieve the objectives of heterogeneous resource utilization and a uniform network protocol interface, Condor is designed with a layered architecture.  An HTC system depends on the host operating systems of the nodes to provide network communication, process management and workstation statistics.  As the interfaces to these functions vary from one operating system to another, the layered architecture helps to overcome the difficulties involved. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 162. UNIT – 7 CLUSTER Computing – 1: HTC Cluster System-Condor:  HTC system management deals with APIs, which are OS dependent.  APIs for process management, network interfacing and workstation statistics are provided over the lower-level system call interfaces of the various OS environments.  The Network API provides connection-oriented or connectionless, reliable or unreliable interfaces.  This API performs all conversion between host and network integer byte orders automatically, checks for overflows, and also provides standard host address lookup mechanisms.  The Process Management API provides the ability to create, suspend, continue and kill a process, enabling the HTC system to control the execution of a customer's application. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
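The byte-order conversion and host address lookup that such a Network API wraps can be shown directly with Python's standard socket and struct modules:

```python
# Host/network byte-order conversion and address lookup of the kind
# a portable Network API hides behind a uniform interface.
import socket
import struct

n = socket.htonl(1)           # host -> network byte order (32-bit)
h = socket.ntohl(n)           # network -> host byte order
assert h == 1                 # the round trip is the identity

# The same conversion expressed with struct: '!' selects network
# (big-endian) byte order; 'I' is an unsigned 32-bit integer.
packed = struct.pack("!I", 1)
print(packed)                 # b'\x00\x00\x00\x01'

# Standard host address lookup (the API hides resolver details).
print(socket.gethostbyname("localhost"))
```

Because every message on the wire uses network byte order, a little-endian and a big-endian workstation in the same cluster can exchange integers without corruption, which is exactly the heterogeneity problem the layered design addresses.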
  • 163. UNIT – 7 CLUSTER Computing – 1: HTC Cluster System-Condor:  When a cluster process is created, the parent process may pass to the child process all aspects of its state, such as network connections, open files, environment variables and security attributes.  The Workstation Statistics API ensures that both the customer application's requests and the information necessary to implement resource owner policies are well met.  An example of such matching is the customer's OS, CPU configuration and disk space requests being satisfied subject to the owner's constraints on the specific timings or dates of availability of the resource. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
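Inheritance of parent state by a child process, one of the items listed above, can be demonstrated with environment variables (the variable name CLUSTER_JOB_ID is purely illustrative):

```python
# A child process inherits the parent's environment by default, so
# state set in the parent is visible to the spawned application.
import os
import subprocess
import sys

os.environ["CLUSTER_JOB_ID"] = "job-42"   # state set in the parent
child = subprocess.run(
    [sys.executable, "-c",
     "import os; print(os.environ['CLUSTER_JOB_ID'])"],
    capture_output=True, text=True,
)
print(child.stdout.strip())               # prints: job-42
```

Open file descriptors and network sockets are inherited the same way on fork-based systems, which is how a process manager can hand a ready-made connection to the application it launches.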
  • 164. UNIT – 7 CLUSTER Computing – 1: How To Set Up a Simple Cluster:  A simple cluster can be set up by an individual user organization.  The objectives and expected performance characteristics of the cluster should be well understood and defined first.  Its expected computational speed and latency should be decided in advance.  Once these decisions are taken, the network topology and technology suitable for meeting the proposed cluster's communication requirements should be identified and designed.  A technology option such as Gigabit Ethernet, optical Fibre Channel or ATM is then chosen.  Some of these technologies compulsorily require a specific topology, such as switched point-to-point or ring.  For Gigabit Ethernet, we can choose between direct links, hubs, switches and mixtures of these. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
  • 165. UNIT – 7 CLUSTER Computing – 1: How To Set Up a Simple Cluster:  Hubs are to be avoided because of their latencies.  Segment switching is better than full port switching.  Direct point-to-point connections with crossed cabling (MDI port to MDI port), forming a hypercube or a torus mesh, are preferred.  Using dynamic routing protocols inside the cluster to set up routes automatically introduces more traffic and congestion complexity.  Some operating systems provide support for bonding several physical interfaces into a single virtual interface for higher throughput. Clouds, Grids and Clusters- Prof. A. Syed Mustafa, HKBKCE.
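In the hypercube wiring plan mentioned above, each of the 2^d nodes links directly to the d nodes whose binary IDs differ from its own in exactly one bit. A small sketch computes a node's neighbours, which is exactly the cabling list one would draw up for the point-to-point connections:

```python
# Neighbours of a node in a d-dimensional hypercube: flip each of
# the d bits of the node's binary ID in turn.
def hypercube_neighbors(node, dimensions):
    return [node ^ (1 << bit) for bit in range(dimensions)]

# A 3-dimensional hypercube has 8 nodes, each with 3 direct links.
print(hypercube_neighbors(0, 3))  # [1, 2, 4]
print(hypercube_neighbors(5, 3))  # [4, 7, 1]
```

The appeal of this topology for a small cluster is that every node needs only d network interfaces while any two nodes are at most d hops apart.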