GRID COMPUTING
Sandeep Kumar Poonia
Head Of Dept. CS/IT
B.E., M.Tech., UGC-NET
LM-IAENG, LM-IACSIT,LM-CSTA, LM-AIRCC, LM-SCIEI, AM-UACEE
WHY GRID COMPUTING?
 40% Mainframes are idle
 90% Unix servers are idle
 95% PC servers are idle
 0-15% Mainframes are idle in peak-hour
 70% PC servers are idle in peak-hour
Source: “Grid Computing” Dr Daron G Green
SandeepKumarPoonia
OUTLINE
 Introduction to Grid Computing
 Methods of Grid computing
 Grid Middleware
 Grid Architecture
ELECTRICAL POWER GRID
ANALOGY
Electrical power grid
• Users (or electrical appliances) get access to electricity through wall sockets, with no care or consideration for where or how the electricity is actually generated.
• "The power grid" links together power plants of many different kinds.
The Grid
• Users (or client applications) gain access to computing resources (processors, storage, data, applications, and so on) as needed, with little or no knowledge of where those resources are located or what the underlying technologies, hardware, operating system, and so on are.
• "The Grid" links together computing resources (PCs, workstations, servers, storage elements) and provides the mechanism needed to access them.
WHY DO WE NEED GRID COMPUTING?
• Core networking technology now advances at a much faster rate than microprocessor speeds
• Exploiting underutilized resources
• Parallel CPU capacity
• Virtual resources and virtual organizations for collaboration
• Access to additional resources
WHO NEEDS GRID COMPUTING?
 Not just computer scientists…
 scientists “hit the wall” when faced with situations:
 The amount of data they need is huge and the data is stored in
different institutions.
 The amount of similar calculations the scientist has to do is
huge.
 Other areas:
 Government
 Business
 Education
 Industrial design
 ……
LIVING IN AN EXPONENTIAL WORLD:
(1) COMPUTING & SENSORS
Moore's Law: transistor count doubles every 18 months
[Figure: magnetohydrodynamics, star formation]
LIVING IN AN EXPONENTIAL WORLD:
(2) STORAGE
 Storage density doubles every 12 months
 Dramatic growth in online data (1 petabyte =
1000 terabyte = 1,000,000 gigabyte)
 2000 ~0.5 petabyte
 2005 ~10 petabytes
 2010 ~100 petabytes
 2015 ~1000 petabytes?
 Transforming entire disciplines in physical and,
increasingly, biological sciences; humanities
next?
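The growth estimates on this slide imply a doubling time that can be checked with a little arithmetic. The sketch below assumes smooth exponential growth between the 2000 and 2005 estimates; the function name is illustrative and not taken from the slides.

```python
import math

def doubling_time_years(start_pb, end_pb, years):
    """Doubling time implied by growing from start_pb to end_pb over `years`."""
    doublings = math.log2(end_pb / start_pb)
    return years / doublings

# Slide estimates: ~0.5 PB of online data in 2000, ~10 PB in 2005.
implied = doubling_time_years(0.5, 10, 5)
print(f"implied doubling time: {implied:.2f} years")
```

Under that assumption the online-data estimates double a little more slowly than once per year, which is the same order as the quoted storage-density doubling.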
DATA INTENSIVE PHYSICAL SCIENCES
 High energy & nuclear physics
 Including new experiments at CERN
 Gravity wave searches
 LIGO, GEO, VIRGO
 Time-dependent 3-D systems (simulation, data)
 Earth Observation, climate modeling
 Geophysics, earthquake modeling
 Fluids, aerodynamic design
 Pollutant dispersal scenarios
 Astronomy: Digital sky surveys
ONGOING ASTRONOMICAL MEGA-SURVEYS
 Large number of new surveys
 Multi-TB in size, 100M objects or larger
 In databases
 Individual archives planned and under way
 Multi-wavelength view of the sky
 > 13 wavelength coverage within 5 years
 Impressive early discoveries
 Finding exotic objects by unusual colors
 L,T dwarfs, high redshift quasars
 Finding objects by time variability
 Gravitational micro-lensing
Surveys shown: MACHO, 2MASS, SDSS, DPOSS, GSC-II, COBE, MAP, NVSS, FIRST, GALEX, ROSAT, OGLE, ...
COMING FLOODS OF ASTRONOMY DATA
 The planned Large Synoptic Survey Telescope
will produce over 10 petabytes per year by 2008!
 All-sky survey every few days, so will have fine-grain
time series for the first time
DATA INTENSIVE BIOLOGY AND MEDICINE
 Medical data
 X-Ray, mammography data, etc. (many petabytes)
 Digitizing patient records
 X-ray crystallography
 Molecular genomics and related disciplines
 Human Genome, other genome databases
 Proteomics (protein structure, activities, …)
 Protein interactions, drug delivery
 Virtual Population Laboratory (proposed)
 Simulate likely spread of disease outbreaks
 Brain scans (3-D, time dependent)
A BRAIN IS A LOT OF DATA!
(MARK ELLISMAN, UCSD)
We need to get to one micron to know location of every cell. We’re just now starting to get to 10 microns – Grids will help get us there and further
And comparisons must be made among many
Fastest virtual supercomputers
As of April 2013, Folding@home: 11.4 x86-equivalent (5.8 "native") PFLOPS.
As of March 2013, BOINC: 9.2 PFLOPS on average.
As of April 2010, MilkyWay@Home: over 1.6 PFLOPS, with a large amount of this work coming from GPUs.
As of April 2010, SETI@Home: more than 730 TFLOPS on average.
As of April 2010, Einstein@Home: more than 210 TFLOPS.
As of June 2011, GIMPS: 61 TFLOPS sustained.
HOW GRID COMPUTING WORKS
[Figure in three steps; labels only]
1. Supercomputer, big mainframe…; surrounding machines show idle time and idle CPUs
2. Virtual machine, virtual CPU…; surrounding machines still show idle time and idle CPUs
3. Grid computing: 0% idle on every machine
Source: “The Evolving Computing Model: Grid Computing” Michael Teyssedre
GRID ARCHITECTURE
Autonomous, globally distributed computers/clusters
WHAT IS A GRID?
• Many definitions exist in the literature
• Early definition: Foster and Kesselman, 1998
“A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities”
• Kleinrock 1969:
“We will probably see the spread of ‘computer utilities’, which, like present electric and telephone utilities, will service individual homes and offices across the country.”
3-POINT CHECKLIST (FOSTER 2002)
1. Coordinates resources that are not subject to centralized control
2. Uses standard, open, general-purpose protocols and interfaces
3. Delivers nontrivial qualities of service
• e.g., response time, throughput, availability, security
DEFINITION
Grid computing is…
 A distributed computing system
 Where a group of computers are connected
 To create and work as one large virtual
computing power, storage, database, application,
and service
DEFINITION
Grid computing…
 Allows a group of computers to share the system
securely and
 Optimizes their collective resources to meet
required workloads
 By using open standards
GRID COMPUTING
Grid computing is a form of distributed computing in which a "super and virtual computer" is composed of a cluster of networked, loosely coupled computers acting in concert to perform very large tasks.
Grid computing (Foster and Kesselman, 1999) is a growing technology that facilitates the execution of large-scale, resource-intensive applications on geographically distributed computing resources.
It facilitates flexible, secure, coordinated, large-scale resource sharing among dynamic collections of individuals, institutions, and resources.
It enables communities (“virtual organizations”) to share geographically distributed resources as they pursue common goals.
Ian Foster and Carl Kesselman
A COMPARISON
SERIAL
 Fetch/Store
 Compute
PARALLEL
 Fetch/Store
 Compute/
communicate
 Cooperative game
GRID
 Fetch/Store
 Discovery of Resources
 Interaction with remote
application
 Authentication /
Authorization
 Security
 Compute/Communicate
 Etc
DISTRIBUTED COMPUTING VS. GRID
• Grid is an evolution of distributed computing
– Dynamic
– Geographically independent
– Built around standards
– Internet backbone
• Distributed computing is an “older term”
– Typically built around proprietary software and networks
– Tightly coupled systems/organizations
WEB VS. GRID
• Web: uniform naming access to documents
• Grid: uniform, high-performance access to computational resources
[Figure labels: Colleges/R&D Labs, Software Catalogs, Sensor nets, http://]
IS THE WORLD WIDE WEB A
GRID ?
 Seamless naming? Yes
 Uniform security and Authentication? No
 Information Service? Yes or No
 Co-Scheduling? No
 Accounting & Authorization ? No
 User Services? No
 Event Services? No
 Is the Browser a Global Shell ? No
WHAT DOES THE WORLD WIDE WEB BRING TO THE GRID?
• Uniform naming
• A seamless, scalable information service
• A powerful new metadata language: XML
– XML will be the standard language for describing information in the grid
• SOAP (Simple Object Access Protocol)
– Uses XML for encoding and HTTP as the transport protocol
– SOAP may become a standard RPC mechanism for Grid services
• Portal ideas
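To make the SOAP point concrete, here is a minimal sketch of an XML-encoded remote procedure call of the kind that is typically carried in an HTTP POST. The envelope namespace is the standard SOAP 1.1 one; the service namespace `urn:example:grid-service`, the `submitJob` method, and its parameters are hypothetical and do not belong to any real Grid service.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

def build_soap_request(method, params, service_ns="urn:example:grid-service"):
    """Return a SOAP 1.1 envelope (as bytes) that calls `method` with `params`."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The RPC call is one element named after the method (hypothetical here).
    call = ET.SubElement(body, f"{{{service_ns}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")

request = build_soap_request("submitJob", {"executable": "/bin/hostname", "count": 4})
print(request.decode("utf-8"))
```

The resulting bytes would be sent as the body of an HTTP request; the transport itself is left out of the sketch.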
THE ULTIMATE GOAL
 In future I will not know or care
where my application will be
executed as I will acquire and pay
to use these resources as I need
them
WHY GRIDS?
• Large-scale science and engineering are done through the interaction of people, heterogeneous computing resources, information systems, and instruments, all of which are geographically and organizationally dispersed.
• The overall motivation for “Grids” is to facilitate the routine interactions of these resources in order to support large-scale science and engineering.
AN EXAMPLE VIRTUAL ORGANIZATION:
CERN'S LARGE HADRON COLLIDER
1800 Physicists, 150 Institutes, 32 Countries
100 PB of data by 2010; 50,000 CPUs?
GRID COMMUNITIES & APPLICATIONS:
DATA GRIDS FOR HIGH ENERGY PHYSICS
[Figure: tiered data grid, top to bottom]
• Online System and Offline Processor Farm (~20 TIPS)
• Tier 0: CERN Computer Centre
• Tier 1: FermiLab (~4 TIPS); France, Italy, and Germany Regional Centres
• Tier 2: Tier2 Centres and Caltech (~1 TIPS each)
• Institutes (~0.25 TIPS), with a physics data cache
• Tier 4: Physicist workstations
Link rates labelled in the figure: ~PBytes/sec, ~100 MBytes/sec, ~622 Mbits/sec (or Air Freight, deprecated), ~1 MBytes/sec
• There is a “bunch crossing” every 25 nsecs.
• There are 100 “triggers” per second.
• Each triggered event is ~1 MByte in size.
• Physicists work on analysis “channels”.
• Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
1 TIPS is approximately 25,000 SpecInt95 equivalents
www.griphyn.org www.ppdg.net www.eu-datagrid.org
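The figures quoted on this slide can be cross-checked with simple arithmetic: 100 triggers per second at about 1 MByte each gives the ~100 MBytes/sec link rate shown in the diagram. The sketch below also derives the bunch-crossing rate and a yearly volume; the yearly figure assumes continuous running all year, which is an assumption made here and not a claim from the slides.

```python
BUNCH_SPACING_S = 25e-9   # one bunch crossing every 25 ns
TRIGGERS_PER_S = 100      # events kept per second
EVENT_SIZE_MB = 1         # ~1 MByte per triggered event

crossing_rate_hz = 1 / BUNCH_SPACING_S               # crossings per second
recorded_mb_per_s = TRIGGERS_PER_S * EVENT_SIZE_MB   # data rate after triggering
kept_fraction = TRIGGERS_PER_S / crossing_rate_hz    # share of crossings recorded

# Assumption: the experiment records continuously for a full year.
SECONDS_PER_YEAR = 365 * 24 * 3600
recorded_pb_per_year = recorded_mb_per_s * SECONDS_PER_YEAR / 1e9  # 1 PB = 1e9 MB

print(f"{crossing_rate_hz:.0f} crossings/s, {recorded_mb_per_s} MB/s recorded")
print(f"kept fraction {kept_fraction:.1e}, about {recorded_pb_per_year:.2f} PB/year")
```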
INTELLIGENT INFRASTRUCTURE:
DISTRIBUTED SERVERS AND SERVICES
THE GRID:
A BRIEF HISTORY
• Early 90s
– Gigabit testbeds, metacomputing
• Mid to late 90s
– Early experiments (e.g., I-WAY), academic software projects (e.g., Globus, Legion), application experiments
• 2002
– Dozens of application communities & projects
– Major infrastructure deployments
– Significant technology base (esp. Globus Toolkit™)
– Growing industrial interest
– Global Grid Forum: ~500 people, 20+ countries
HOW IT EVOLVES
Utility computing
Service grid
Data grid
Processing grid
Virtualization
Service-oriented
Open standard
EARLY ADOPTERS
 Academic
 Big science
 Life science
 Nuclear engineering
 Simulation…
MARKET POTENTIAL
 Financial services:
risk management and compliance
 Automotive:
acceleration of product development
 Petroleum:
discovery of oils
Source: “Perspectives on grid: Grid computing - next-generation distributed computing" Matt Haynos, 01/27/04
Criteria for a Grid:
Coordinates resources that are not subject to
centralized control.
Uses standard, open, general-purpose protocols
and interfaces.
Delivers nontrivial qualities of service.
e.g., response time, throughput, availability, security
Benefits
Exploit Underutilized resources
Resource load Balancing
Virtualize resources across an enterprise
Data Grids, Compute Grids
Enable collaboration for virtual organizations
WHY DO WE NEED GRIDS?
 Many large-scale problems cannot be solved by a
single computer
 Globally distributed data and resources
GRID APPLICATIONS
Data and computationally intensive applications:
This technology has been applied to computationally intensive scientific, mathematical, and academic problems such as drug discovery, economic forecasting, and seismic analysis, and to back-office data processing in support of e-commerce.
 A chemist may utilize hundreds of processors to screen
thousands of compounds per hour.
 Teams of engineers worldwide pool resources to analyze
terabytes of structural data.
 Meteorologists seek to visualize and analyze petabytes of
climate data with enormous computational demands.
Resource sharing
 Computers, storage, sensors, networks, …
 Sharing always conditional: issues of trust, policy,
negotiation, payment, …
Coordinated problem solving
 distributed data analysis, computation, collaboration, …
GRID TOPOLOGIES
• Intragrid
– Local grid within an organisation
– Trust based on personal contracts
• Extragrid
– Resources of a consortium of organisations
connected through a (Virtual) Private Network
– Trust based on Business to Business contracts
• Intergrid
– Global sharing of resources through the
internet
– Trust based on certification
COMPUTATIONAL GRID
“A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.”
“The Grid: Blueprint for a New Computing Infrastructure”, Kesselman & Foster
Example : Science Grid (US Department of Energy)
DATA GRID
 A data grid is a grid computing system that deals with
data — the controlled sharing and management of
large amounts of distributed data.
 Data Grid is the storage component of a grid environment.
Scientific and engineering applications require access to
large amounts of data, and often this data is widely
distributed. A data grid provides seamless access to the
local or remote data required to complete compute
intensive calculations.
Examples:
Biomedical Informatics Research Network (BIRN),
the Southern California Earthquake Center (SCEC).
BACKGROUND: RELATED
TECHNOLOGIES
 Cluster computing
 Peer-to-peer computing
 Internet computing
CLUSTER COMPUTING
 Idea: put some PCs together and get them to
communicate
 Cheaper to build than a mainframe
supercomputer
 Different sizes of clusters
 Scalable – can grow a cluster by adding more PCs
CLUSTER ARCHITECTURE
PEER-TO-PEER COMPUTING
 Connect to other computers
 Can access files from any computer on the
network
 Allows data sharing without going through
central server
 Decentralized approach also useful for Grid
PEER TO PEER ARCHITECTURE
METHODS OF GRID COMPUTING
 Distributed Supercomputing
 High-Throughput Computing
 On-Demand Computing
 Data-Intensive Computing
 Collaborative Computing
 Logistical Networking
DISTRIBUTED SUPERCOMPUTING
 Combining multiple high-capacity resources on
a computational grid into a single, virtual
distributed supercomputer.
 Tackle problems that cannot be solved on a
single system.
 Examples: climate modeling, computational
chemistry
 Challenges include:
 Scheduling scarce and expensive resources
 Scalability of protocols and algorithms
 Maintaining high levels of performance across
heterogeneous systems
HIGH-THROUGHPUT COMPUTING
• Uses the grid to schedule large numbers of loosely coupled or independent tasks, with the goal of putting unused processor cycles (e.g., from idle workstations) to work.
• Unlike distributed supercomputing, the tasks are loosely coupled.
• Examples: parameter studies, cryptographic problems
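A parameter study shows the pattern in miniature: every task is independent, so tasks can be handed to whichever worker is free. The sketch below is only an illustration. Local threads stand in for idle grid machines, and `simulate` is a hypothetical placeholder for the real computation.

```python
from concurrent.futures import ThreadPoolExecutor

def simulate(param):
    """Stand-in for one independent task of a parameter study (hypothetical)."""
    return param, sum(i * param for i in range(1000))

def run_parameter_study(params, workers=4):
    """Run every task independently; no task communicates with another."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands tasks to free workers and returns results in input order.
        return dict(pool.map(simulate, params))

results = run_parameter_study(range(1, 9))
print(results)
```

Because no task depends on another, adding workers needs no change to the task code, which is what makes this workload a good fit for harvested idle cycles.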
ON-DEMAND COMPUTING
• Uses grid capabilities to meet short-term requirements for resources that cannot conveniently be located locally.
• Models real-time computing demands.
• Unlike distributed supercomputing, driven by cost-performance concerns rather than absolute performance.
• Dispatches expensive or specialized computations to remote servers.
COLLABORATIVE COMPUTING
 Concerned primarily with enabling and
enhancing human-to-human interactions.
 Enable shared use of data archives and
simulations
 Applications are often structured in terms of a
virtual shared space.
 Examples:
 Collaborative exploration of large geophysical data sets
 Challenges:
 Real-time demands of interactive applications
 Rich variety of interactions
DATA-INTENSIVE COMPUTING
 The focus is on synthesizing new information
from data that is maintained in geographically
distributed repositories, digital libraries, and
databases.
 Particularly useful for distributed data mining.
 Examples:
• High energy physics generates terabytes of distributed data and needs complex queries to detect “interesting” events
• Distributed analysis of Sloan Digital Sky Survey data
LOGISTICAL NETWORKING
 Logistical networks focus on exposing storage
resources inside networks by optimizing the
global scheduling of data transport, and data
storage.
 Contrasts with traditional networking, which
does not explicitly model storage resources in the
network.
 high-level services for Grid applications
 Called "logistical" because of the analogy it bears
with the systems of warehouses, depots, and
distribution channels.
P2P COMPUTING VS GRID
COMPUTING
 Differ in Target Communities
 Grid system deals with more complex, more
powerful, more diverse and highly interconnected
set of resources than
P2P.
A TYPICAL VIEW OF GRID
ENVIRONMENT
[Figure: User, Resource Broker, Grid Information Service, and Grid Resources, linked by four numbered arrows labelled Grid application, Computational jobs, Details of Grid resources, Processed jobs, and Computation result]
• A User sends a computation- or data-intensive application to Global Grids in order to speed up the execution of the application.
• A Resource Broker distributes the jobs in an application to the Grid resources, based on the user's QoS requirements and details of available Grid resources, for further execution.
• Grid Resources (cluster, PC, supercomputer, database, instruments, etc.) in the Global Grid execute the user jobs.
• The Grid Information Service collects the details of the available Grid resources and passes the information to the resource broker.
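The interaction between user, broker, information service, and resources can be sketched in a few lines. Everything below is a toy model under stated assumptions: the class names, the MIPS and cost fields, and the "earliest finish within budget" policy are illustrative choices, not the behaviour of any particular broker.

```python
from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    mips: int               # advertised speed
    cost_per_s: float       # price per second of use
    busy_until: float = 0.0

@dataclass
class Job:
    job_id: int
    length_mi: int          # job length in millions of instructions

class InformationService:
    """Collects details of available resources and reports them to the broker."""
    def __init__(self, resources):
        self._resources = list(resources)

    def available(self):
        return list(self._resources)

class Broker:
    """Maps jobs to resources using the user's QoS requirement (a budget here)."""
    def __init__(self, info_service):
        self.info_service = info_service

    def schedule(self, jobs, budget):
        plan, spent = {}, 0.0
        for job in jobs:
            candidates = []
            for res in self.info_service.available():
                runtime = job.length_mi / res.mips
                cost = runtime * res.cost_per_s
                if spent + cost <= budget:
                    candidates.append((res.busy_until + runtime, cost, res))
            if not candidates:
                raise RuntimeError(f"job {job.job_id} does not fit the budget")
            # Pick the resource that would finish this job earliest.
            finish, cost, best = min(candidates, key=lambda c: c[0])
            best.busy_until = finish
            spent += cost
            plan[job.job_id] = best.name
        return plan, spent
```

For example, with a fast, expensive cluster and a slow, cheap PC, a tight budget pushes the last job onto the PC even though the cluster would finish it sooner.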
GRID MIDDLEWARE
• Grids are typically managed by gridware, a special type of middleware that enables sharing and manages grid components based on user requirements and resource attributes (e.g., capacity, performance)
• Software that connects other software components or applications to provide the following functions:
– Run applications on suitable available resources: brokering, scheduling
– Provide uniform, high-level access to resources: semantic interfaces, Web Services, Service Oriented Architectures
– Address inter-domain issues of security, policy, etc.: federated identities
– Provide application-level status monitoring and control
MIDDLEWARES
• Globus – University of Chicago
• Condor – University of Wisconsin – high-throughput computing
• Legion – University of Virginia – virtual workspaces, collaborative computing
• IBP (Internet Backplane Protocol) – University of Tennessee – logistical networking
• NetSolve – solving scientific problems in heterogeneous environments – high-throughput and data-intensive
TWO KEY GRID COMPUTING GROUPS
The Globus Alliance (www.globus.org)
 Composed of people from:
Argonne National Labs, University of Chicago, University of
Southern California Information Sciences Institute,
University of Edinburgh and others.
 OGSA/I standards initially proposed by the Globus Group
The Global Grid Forum (www.ggf.org)
 Heavy involvement of Academic Groups and Industry
 (e.g. IBM Grid Computing, HP, United Devices, Oracle,
UK e-Science Programme, US DOE, US NSF, Indiana
University, and many others)
 Process
 Meets three times annually
 Solicits involvement from industry, research groups, and
academics
GRID USERS
 Many levels of users
 Grid developers
 Tool developers
 Application developers
 End users
 System administrators
SOME GRID CHALLENGES
 Data movement
 Data replication
 Resource management
 Job submission
SOME OF THE MAJOR GRID PROJECTS
Name | URL / Sponsor | Focus
EuroGrid, Grid Interoperability (GRIP) | eurogrid.org; European Union | Create technology for remote access to supercomputer resources & simulation codes; in GRIP, integrate with Globus Toolkit™
Fusion Collaboratory | fusiongrid.org; DOE Office of Science | Create a national computational collaboratory for fusion research
Globus Project™ | globus.org; DARPA, DOE, NSF, NASA, Microsoft | Research on Grid technologies; development and support of Globus Toolkit™; application and deployment
GridLab | gridlab.org; European Union | Grid technologies and applications
GridPP | gridpp.ac.uk; U.K. eScience | Create & apply an operational grid within the U.K. for particle physics research
Grid Research Integration Dev. & Support Center | grids-center.org; NSF | Integration, deployment, support of the NSF Middleware Infrastructure for research & education
Grid in India-GARUDA
•GARUDA is India's Grid Computing
initiative connecting 17 cities across the
country.
•The 45 participating institutes in this
nationwide project include all the IITs and
C-DAC centers and other major institutes
in India.
GLOBUS GRID TOOLKIT
 Open source toolkit for building Grid systems and
applications
 Enabling technology for the Grid
 Share computing power, databases, and other tools securely
online
 Facilities for:
 Resource monitoring
 Resource discovery
 Resource management
 Security
 File management
DATA MANAGEMENT IN GLOBUS
TOOLKIT
 Data movement
 GridFTP
 Reliable File Transfer (RFT)
 Data replication
 Replica Location Service (RLS)
 Data Replication Service (DRS)
GRIDFTP
 High performance, secure, reliable data transfer protocol
 Optimized for wide area networks
 Superset of Internet FTP protocol
 Features:
 Multiple data channels for parallel transfers
 Partial file transfers
 Third party transfers
 Reusable data channels
 Command pipelining
MORE GRIDFTP FEATURES
 Auto tuning of parameters
 Striping
 Transfer data in parallel among multiple senders and
receivers instead of just one
 Extended block mode
 Send data in blocks
 Know block size and offset
 Data can arrive out of order
 Allows multiple streams
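The idea behind extended block mode, where each block carries its size and offset so that blocks from several streams may arrive in any order, can be illustrated with a short sketch. This models only the reassembly logic. It is not the GridFTP wire format, and the function names are made up for the example.

```python
def split_into_blocks(data, block_size):
    """Yield (offset, payload) pairs, as a sender working in blocks would."""
    for offset in range(0, len(data), block_size):
        yield offset, data[offset:offset + block_size]

def reassemble(total_size, blocks):
    """Place each block at its offset; arrival order does not matter."""
    buffer = bytearray(total_size)
    received = 0
    for offset, payload in blocks:
        buffer[offset:offset + len(payload)] = payload
        received += len(payload)
    if received != total_size:
        raise ValueError("missing or duplicated blocks")
    return bytes(buffer)

data = bytes(range(256)) * 4
blocks = list(split_into_blocks(data, 100))
# Simulate two parallel streams whose blocks arrive interleaved and out of order.
arrived = blocks[1::2] + blocks[0::2][::-1]
restored = reassemble(len(data), arrived)
```

Since every block is self-describing, the receiver needs no ordering guarantee between streams, which is what allows multiple data channels to be used at once.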
STRIPING ARCHITECTURE
• Use “striped” servers
LIMITATIONS OF GRIDFTP
 Not a web service protocol (does not employ
SOAP, WSDL, etc.)
 Requires client to maintain open socket
connection throughout transfer
 Inconvenient for long transfers
 Cannot recover from client failures
GRIDFTP
RELIABLE FILE TRANSFER (RFT)
• Web service with “job-scheduler” functionality for data movement
• User provides source and destination URLs
• Service writes job description to a database and moves files
• Service methods for querying transfer status
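The key design point is that transfer state lives in a database and not in the client's open connection, so the client can disconnect and ask about status later. The sketch below illustrates that idea with SQLite. The class, table layout, and status names are assumptions made for the example and are not the RFT service interface.

```python
import sqlite3

class TransferService:
    """Toy transfer queue whose state is kept in a database, not in the client."""

    def __init__(self, db_path=":memory:"):
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS transfers ("
            "id INTEGER PRIMARY KEY, src TEXT, dst TEXT, status TEXT)"
        )

    def submit(self, src, dst):
        """Record a transfer request and return its identifier."""
        cur = self.db.execute(
            "INSERT INTO transfers (src, dst, status) VALUES (?, ?, 'PENDING')",
            (src, dst),
        )
        self.db.commit()
        return cur.lastrowid

    def status(self, transfer_id):
        row = self.db.execute(
            "SELECT status FROM transfers WHERE id = ?", (transfer_id,)
        ).fetchone()
        return row[0] if row else None

    def run_pending(self, mover):
        """Attempt every pending transfer with the supplied `mover` callable."""
        rows = self.db.execute(
            "SELECT id, src, dst FROM transfers WHERE status = 'PENDING'"
        ).fetchall()
        for transfer_id, src, dst in rows:
            try:
                mover(src, dst)
                outcome = "DONE"
            except OSError:
                outcome = "FAILED"
            self.db.execute(
                "UPDATE transfers SET status = ? WHERE id = ?", (outcome, transfer_id)
            )
        self.db.commit()
```

With a file-backed database path in place of `:memory:`, a restarted service could pick up pending rows where it left off; the `mover` callable is where a real file-transfer client would be plugged in.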
RFT
REPLICA LOCATION SERVICE (RLS)
 Registry to keep track of where replicas exist on physical
storage system
 Users or services register files in RLS when files created
 Distributed registry
 May consist of multiple servers at different sites
 Increase scale
 Fault tolerance
REPLICA LOCATION SERVICE (RLS)
 Logical file name – unique identifier for contents of
file
 Physical file name – location of copy of file on
storage system
 User can provide logical name and ask for replicas
 Or query to find logical name associated with
physical file location
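The two lookups described here, logical name to replicas and physical location back to logical name, amount to a two-way index. A minimal in-memory sketch follows, with hypothetical class and method names; a real deployment distributes this registry across servers.

```python
from collections import defaultdict

class ReplicaCatalog:
    """Maps logical file names to the physical locations of their copies."""

    def __init__(self):
        self._by_logical = defaultdict(set)   # logical name -> physical names
        self._by_physical = {}                # physical name -> logical name

    def register(self, logical_name, physical_name):
        self._by_logical[logical_name].add(physical_name)
        self._by_physical[physical_name] = logical_name

    def replicas(self, logical_name):
        """All known copies of a logical file, in sorted order."""
        return sorted(self._by_logical.get(logical_name, ()))

    def logical_name(self, physical_name):
        """Reverse lookup: which logical file is stored at this location?"""
        return self._by_physical.get(physical_name)

catalog = ReplicaCatalog()
catalog.register("run42.dat", "gsiftp://siteA.example.org/data/run42.dat")
catalog.register("run42.dat", "gsiftp://siteB.example.org/store/run42.dat")
print(catalog.replicas("run42.dat"))
```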
DATA REPLICATION SERVICE (DRS)
• Pull-based replication capability
• Implemented as a web service
• Higher-level data management service built on top of RFT and RLS
• Goal: ensure that a specified set of files exists on a storage site
– First, query RLS to locate the desired files
– Next, create a transfer request using RFT
– Finally, register the new replicas with RLS
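The three steps (locate, transfer, register) can be sketched as a single function. This is an illustration only: the catalog is a plain dictionary standing in for a replica registry, `transfer` is any callable standing in for a transfer service, and the URL layout is assumed.

```python
def replicate(wanted, catalog, site, transfer):
    """Ensure every logical file in `wanted` has a replica at `site`.

    catalog:  dict mapping logical name -> set of physical URLs
    transfer: callable(source_url, target_url) that moves one file
    """
    created = []
    for logical_name in wanted:
        replicas = catalog.get(logical_name, set())
        if any(url.startswith(site) for url in replicas):
            continue                              # already present at the site
        if not replicas:
            raise LookupError(f"no replica registered for {logical_name}")
        source = sorted(replicas)[0]              # step 1: locate an existing copy
        target = f"{site}/{logical_name}"
        transfer(source, target)                  # step 2: request the transfer
        catalog[logical_name].add(target)         # step 3: register the new replica
        created.append(target)
    return created
```

Files already present at the site are skipped, so calling the function again with the same arguments creates nothing new.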
CONDOR
 Original goal: high-throughput computing
 Harvest wasted CPU power from other machines
 Can also be used on a dedicated cluster
 Condor-G – Condor interface to Globus resources
CONDOR
 Provides many features of batch systems:
 job queueing
 scheduling policy
 priority scheme
 resource monitoring
 resource management
 Users submit their serial or parallel jobs
 Condor places them into a queue
 Scheduling and monitoring
 Informs the user upon completion
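The batch-system behaviour listed above (queueing, a priority scheme, notification on completion) can be illustrated with a small priority queue. This is a generic sketch of such a queue, not Condor's own scheduling mechanism; the class and callback names are hypothetical.

```python
import heapq
import itertools

class JobQueue:
    """Minimal batch queue: highest priority first, FIFO among equal priorities."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()   # tie-breaker that preserves submit order

    def submit(self, owner, command, priority=0):
        # heapq is a min-heap, so the priority is negated to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._order), owner, command))

    def run_all(self, execute, notify):
        """Run queued jobs in scheduling order and inform each owner on completion."""
        while self._heap:
            _, _, owner, command = heapq.heappop(self._heap)
            result = execute(command)
            notify(owner, command, result)

queue = JobQueue()
queue.submit("alice", "a", priority=1)
queue.submit("bob", "b", priority=5)
queue.submit("carol", "c", priority=1)
log = []
queue.run_all(lambda cmd: cmd.upper(), lambda owner, cmd, res: log.append((owner, res)))
```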
NIMROD-G
 Tool to manage execution of parametric studies across
distributed computers
 Manages experiment
 Distributing files to remote systems
 Performing the remote computation
 Gathering results
 User submits declarative plan file
 Parameters, default values, and commands necessary for
performing the work
 Nimrod-G takes advantage of Globus toolkit features
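A parametric study boils down to expanding a declarative description into one concrete job per combination of parameter values. The sketch below shows that expansion using a Python dictionary and a `$name` command template; this is not Nimrod-G's plan-file syntax, and the `./simulate` command is hypothetical.

```python
import itertools
from string import Template

def expand_plan(parameters, command_template):
    """Return one concrete command per combination of parameter values."""
    names = sorted(parameters)
    jobs = []
    for values in itertools.product(*(parameters[name] for name in names)):
        binding = dict(zip(names, values))
        jobs.append(Template(command_template).substitute(binding))
    return jobs

plan = {"angle": [0, 45, 90], "speed": [10, 20]}
jobs = expand_plan(plan, "./simulate --angle $angle --speed $speed")
for job in jobs:
    print(job)
```

Each generated command is independent of the others, so the list can be handed to any scheduler that distributes jobs and gathers results.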
NIMROD-G ARCHITECTURE
GRID CASE STUDIES
 Earth System Grid
 LIGO
 TeraGrid
EARTH SYSTEM GRID
 Provide climate studies scientists with access to
large datasets
 Data generated by computational models –
requires massive computational power
 Most scientists work with subsets of the data
 Requires access to local copies of data
ESG INFRASTRUCTURE
 Archival storage systems and disk storage systems at
several sites
 Storage resource managers and GridFTP servers to
provide access to storage systems
 Metadata catalog services
 Replica location services
 Web portal user interface
EARTH SYSTEM GRID
EARTH SYSTEM GRID INTERFACE
LASER INTERFEROMETER
GRAVITATIONAL WAVE
OBSERVATORY (LIGO)
 Instruments at two sites to detect gravitational waves
 Each experiment run produces millions of files
 Scientists at other sites want these datasets on local storage
 LIGO deploys RLS servers at each site to register local
mappings and collect info about mappings at other sites
LARGE SCALE DATA REPLICATION
FOR LIGO
 Goal: detection of gravitational waves
 Three interferometers at two sites
 Generate 1 TB of data daily
 Need to replicate this data across 9 sites to make
it available to scientists
 Scientists need to learn where data items are,
and how to access them
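A quick calculation shows what these numbers mean for the network. The sketch assumes decimal terabytes and a steady transfer spread evenly over 24 hours; the aggregate figure further assumes the source site sends every copy itself, which the slides do not state.

```python
TB = 1e12                      # bytes in a decimal terabyte
SECONDS_PER_DAY = 24 * 3600

daily_bytes = 1 * TB           # ~1 TB generated per day
sites = 9                      # replication targets

# Sustained rate needed to deliver one day's data to one site within a day.
per_site_mbit_s = daily_bytes * 8 / SECONDS_PER_DAY / 1e6
# Assumption: one source feeds all sites directly.
aggregate_mbit_s = per_site_mbit_s * sites

print(f"per site: {per_site_mbit_s:.1f} Mbit/s, aggregate: {aggregate_mbit_s:.1f} Mbit/s")
```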
LIGO
LIGO SOLUTION
 Lightweight data replicator (LDR)
 Uses parallel data streams, tunable TCP windows, and
tunable write/read buffers
 Tracks where copies of specific files can be found
 Stores descriptive information (metadata) in a
database
 Can select files based on description rather than filename
TERAGRID
• NSF high-performance computing facility
• Nine distributed sites, each with different capabilities, e.g., computation power, archiving facilities, visualization software
• Applications may require more than one site
• Data sizes on the order of gigabytes or terabytes
TERAGRID
TERAGRID
 Solution: Use GridFTP and RFT with front end
command line tool (tgcp)
 Benefits of system:
 Simple user interface
 High performance data transfer capability
 Ability to recover from both client and server software
failures
 Extensible configuration
TGCP DETAILS
 Idea: hide low level GridFTP commands from users
 Copy file smallfile.dat in a working directory to another
system:
tgcp smallfile.dat tg-login.sdsc.teragrid.org:/users/ux454332
 GridFTP command:
globus-url-copy -p 8 -tcp-bs 1198372 \
gsiftp://tg-gridftprr.uc.teragrid.org:2811/home/navarro/smallfile.dat \
gsiftp://tg-login.sdsc.teragrid.org:2811/users/ux454332/smallfile.dat
GRID ARCHITECTURE
THE HOURGLASS MODEL
• Focus on architecture issues
– Propose a set of core services as basic infrastructure
– Used to construct high-level, domain-specific solutions (diverse)
• Design principles
– Keep participation cost low
– Enable local control
– Support for adaptation
– “IP hourglass” model
[Figure: hourglass, top to bottom: Applications, Diverse global services, Core services, Local OS]
LAYERED GRID ARCHITECTURE
(BY ANALOGY TO INTERNET ARCHITECTURE)
Grid layers, top to bottom:
• Application
• Collective – “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
• Resource – “Sharing single resources”: negotiating access, controlling use
• Connectivity – “Talking to things”: communication (Internet protocols) & security
• Fabric – “Controlling things locally”: access to, & control of, resources
Internet protocol architecture, top to bottom: Application, Transport, Internet, Link
EXAMPLE:
DATA GRID ARCHITECTURE
• App: discipline-specific Data Grid application
• Collective (App): coherency control, replica selection, task management, virtual data catalog, virtual data code catalog, …
• Collective (Generic): replica catalog, replica management, co-allocation, certificate authorities, metadata catalogs, …
• Resource: access to data, access to computers, access to network performance data, …
• Connect: communication, service discovery (DNS), authentication, authorization, delegation
• Fabric: storage systems, clusters, networks, network caches, …
SIMULATION TOOLS
 GridSim – job scheduling
 SimGrid – single client multiserver
scheduling
 Bricks – scheduling
 GangSim- Ganglia VO
 OptoSim – Data Grid Simulations
 G3S – Grid Security services Simulator –
security services
SIMULATION TOOL
 GridSim is a Java-based toolkit for modeling,
and simulation of distributed resource
management and scheduling for conventional
Grid environment.
 GridSim is based on SimJava, a general
purpose discrete-event simulation package
implemented in Java.
 All components in GridSim communicate with
each other through message passing operations
defined by SimJava.
SALIENT FEATURES OF THE GRIDSIM
• It allows modeling of heterogeneous types of resources.
• Resources can be modeled operating under space- or time-shared mode.
• Resource capability can be defined in the form of a MIPS (Million Instructions Per Second) benchmark.
• Resources can be located in any time zone.
• Weekends and holidays can be mapped depending on the resource's local time to model non-Grid (local) workload.
• Resources can be booked for advance reservation.
• Applications with different parallel application models can be simulated.
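The difference between space-shared and time-shared resources, with capability expressed in MIPS, can be seen in a few lines. The sketch below is a simplified model written for this comparison: it is not GridSim's Java API, it assumes all jobs arrive at time zero, and it treats time sharing as ideal equal sharing of the processing elements (PEs).

```python
def space_shared_finish_times(job_lengths_mi, mips, num_pe):
    """Each PE runs one job at a time; queued jobs start when a PE frees up."""
    pe_free_at = [0.0] * num_pe
    finish = []
    for length in job_lengths_mi:
        pe = min(range(num_pe), key=lambda i: pe_free_at[i])
        pe_free_at[pe] += length / mips
        finish.append(pe_free_at[pe])
    return finish

def time_shared_finish_times(job_lengths_mi, mips, num_pe):
    """All jobs start at t=0 and share the PEs equally until they finish."""
    remaining = sorted((length, idx) for idx, length in enumerate(job_lengths_mi))
    finish = [0.0] * len(job_lengths_mi)
    now, done = 0.0, 0.0        # `done` = work completed so far by every active job
    while remaining:
        active = len(remaining)
        # A job never gets more than one PE; beyond that the PEs are shared.
        rate = mips * min(num_pe, active) / active
        length, idx = remaining.pop(0)   # the shortest remaining job finishes next
        now += (length - done) / rate
        done = length
        finish[idx] = now
    return finish

jobs_mi = [100, 200, 300]       # job lengths in millions of instructions
space = space_shared_finish_times(jobs_mi, mips=100, num_pe=1)
shared = time_shared_finish_times(jobs_mi, mips=100, num_pe=1)
```

On a single 100 MIPS PE both policies finish the whole batch at the same time, but short jobs complete earlier under space sharing, while time sharing gives every job immediate progress.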
SALIENT FEATURES OF THE GRIDSIM
 Application tasks can be heterogeneous and they can
be CPU or I/O intensive.
 There is no limit on the number of application jobs
that can be submitted to a resource.
 Multiple user entities can submit tasks for execution
simultaneously in the same resource, which may be
time-shared or space-shared. This feature helps in
building schedulers that can use different market-
driven economic models for selecting services
competitively.
 Network speed between resources can be specified.
 It supports simulation of both static and dynamic
schedulers.
 Statistics of all or selected operations can be recorded
and they can be analyzed using GridSim statistics
analysis methods.
A MODULAR ARCHITECTURE FOR GRIDSIM
PLATFORM AND COMPONENTS.
[Figure: layered architecture; layers listed in the order they appear on the slide]
• Application, User, Grid Scenario's input and Results: Appn Conf, Res Conf, User Req, Grid Sc, Output
• Grid Resource Brokers or Schedulers
• GridSim Toolkit: Appn modeling, Res entity, Info serv, Job mgmt, Res alloc, Statis
• Resource Modeling and Simulation: Single CPU, SMPs, Clusters, Load, Netw, Reservation
• Basic Discrete Event Simulation Infrastructure: SimJava, Distributed SimJava
• Distributed Resources: PCs, Workstations, Clusters, SMPs
• Virtual Machine
1. GRID COMPUTING

  • 1. GRID COMPUTING Sandeep Kumar Poonia Head Of Dept. CS/IT B.E., M.Tech., UGC-NET LM-IAENG, LM-IACSIT,LM-CSTA, LM-AIRCC, LM-SCIEI, AM-UACEE
  • 2. WHY GRID COMPUTING?  40% Mainframes are idle  90% Unix servers are idle  95% PC servers are idle  0-15% Mainframes are idle in peak-hour  70% PC servers are idle in peak-hour Source: “Grid Computing” Dr Daron G Green SandeepKumarPoonia
  • 3. OUTLINE  Introduction to Grid Computing  Methods of Grid computing  Grid Middleware  Grid Architecture SandeepKumarPoonia
  • 4. SandeepKumarPoonia ELECTRICAL POWER GRID ANALOGY Electrical power grid  users (or electrical appliances) get access to electricity through wall sockets with no care or consideration for where or how the electricity is actually generated.  “The power grid” links together power plants of many different kinds The Grid  users (or client applications) gain access to computing resources (processors, storage, data, applications, and so on) as needed with little or no knowledge of where those resources are located or what the underlying technologies, hardware, operating system, and so on are  "the Grid" links together computing resources (PCs, workstations, servers, storage elements) and provides the mechanism needed to access them.
  • 5. Sandeep Kumar Poonia WHY NEED GRID COMPUTING?  Core networking technology now accelerates at a much faster rate than advances in microprocessor speeds  Exploiting under utilized resources  Parallel CPU capacity  Virtual resources and virtual organizations for collaboration  Access to additional resources
  • 6. Sandeep Kumar Poonia WHO NEEDS GRID COMPUTING?  Not just computer scientists…  scientists “hit the wall” when faced with situations:  The amount of data they need is huge and the data is stored in different institutions.  The amount of similar calculations the scientist has to do is huge.  Other areas:  Government  Business  Education  Industrial design  ……
  • 7. LIVING IN AN EXPONENTIAL WORLD (1) COMPUTING & SENSORS Moore‘s Law: transistor count doubles each 18 months Magnetohydro- dynamics star formation SandeepKumarPoonia
  • 8. LIVING IN AN EXPONENTIAL WORLD: (2) STORAGE  Storage density doubles every 12 months  Dramatic growth in online data (1 petabyte = 1000 terabyte = 1,000,000 gigabyte)  2000 ~0.5 petabyte  2005 ~10 petabytes  2010 ~100 petabytes  2015 ~1000 petabytes?  Transforming entire disciplines in physical and, increasingly, biological sciences; humanities next? SandeepKumarPoonia
  • 9. DATA INTENSIVE PHYSICAL SCIENCES  High energy & nuclear physics  Including new experiments at CERN  Gravity wave searches  LIGO, GEO, VIRGO  Time-dependent 3-D systems (simulation, data)  Earth Observation, climate modeling  Geophysics, earthquake modeling  Fluids, aerodynamic design  Pollutant dispersal scenarios  Astronomy: Digital sky surveys SandeepKumarPoonia
  • 10. ONGOING ASTRONOMICAL MEGA-SURVEYS  Large number of new surveys  Multi-TB in size, 100M objects or larger  In databases  Individual archives planned and under way  Multi-wavelength view of the sky  > 13 wavelength coverage within 5 years  Impressive early discoveries  Finding exotic objects by unusual colors  L,T dwarfs, high redshift quasars  Finding objects by time variability  Gravitational micro-lensing MACHO 2MASS SDSS DPOSS GSC-II COBE MAP NVSS FIRST GALEX ROSAT OGLE ... SandeepKumarPoonia
  • 11. COMING FLOODS OF ASTRONOMY DATA  The planned Large Synoptic Survey Telescope will produce over 10 petabytes per year by 2008!  All-sky survey every few days, so will have fine-grain time series for the first time SandeepKumarPoonia
  • 12. DATA INTENSIVE BIOLOGY AND MEDICINE  Medical data  X-Ray, mammography data, etc. (many petabytes)  Digitizing patient records  X-ray crystallography  Molecular genomics and related disciplines  Human Genome, other genome databases  Proteomics (protein structure, activities, …)  Protein interactions, drug delivery  Virtual Population Laboratory (proposed)  Simulate likely spread of disease outbreaks  Brain scans (3-D, time dependent) SandeepKumarPoonia
  • 13. And comparisons must be made among many We need to get to one micron to know location of every cell. We’re just now starting to get to 10 microns – Grids will help get us there and further A BRAIN IS A LOT OF DATA! (MARK ELLISMAN, UCSD) SandeepKumarPoonia
  • 14. Fastest virtual supercomputers SandeepKumarPoonia As of April 2013, Folding@home – 11.4 x86-equivalent (5.8 "native") PFLOPS. As of March 2013, BOINC – processing on average 9.2 PFLOPS. As of April 2010, MilkyWay@Home computes at over 1.6 PFLOPS, with a large amount of this work coming from GPUs. As of April 2010, SETI@Home computes data averages more than 730 TFLOPS. As of April 2010, Einstein@Home is crunching more than 210 TFLOPS. As of June 2011, GIMPS is sustaining 61 TFLOPS.
  • 15. HOW GRID COMPUTING WORKS Super computer, Big mainframe… Idle time Idle CPU Idle CPU Idle time Source: “The Evolving Computing Model: Grid Computing” Michael Teyssedre
  • 16. HOW GRID COMPUTING WORKS Virtual machine Virtual CPU… Idle time Idle CPU Idle CPU Idle time Source: “The Evolving Computing Model: Grid Computing” Michael Teyssedre
  • 17. HOW GRID COMPUTING WORKS Grid Computing 0% idle 0% idle 0% idle 0% idle Source: “The Evolving Computing Model: Grid Computing” Michael Teyssedre
  • 18. GRID ARCHITECTURE Autonomous, globally distributed computers/clusters SandeepKumarPoonia
  • 19. WHAT IS A GRID?  Many definitions exist in the literature  Early definitions: Foster and Kesselman, 1998: “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational facilities”  Kleinrock 1969: “We will probably see the spread of ‘computer utilities’, which, like present electric and telephone utilities, will service individual homes and offices across the country.”
  • 20. 3-POINT CHECKLIST (FOSTER 2002) 1. Coordinates resources not subject to centralized control 2. Uses standard, open, general purpose protocols and interfaces 3. Delivers nontrivial qualities of service • e.g., response time, throughput, availability, security
  • 21. DEFINITION Grid computing is…  A distributed computing system  Where a group of computers are connected  To create and work as one large virtual computing power, storage, database, application, and service SandeepKumarPoonia
  • 22. DEFINITION Grid computing…  Allows a group of computers to share the system securely and  Optimizes their collective resources to meet required workloads  By using open standards SandeepKumarPoonia
  • 23. GRID COMPUTING Grid computing is a form of distributed computing whereby a "super and virtual computer" is composed of a cluster of networked, loosely coupled computers acting in concert to perform very large tasks. Grid computing (Foster and Kesselman, 1999) is a growing technology that facilitates the execution of large-scale, resource-intensive applications on geographically distributed computing resources. It facilitates flexible, secure, coordinated large-scale resource sharing among dynamic collections of individuals, institutions, and resources. It enables communities (“virtual organizations”) to share geographically distributed resources as they pursue common goals (Ian Foster and Carl Kesselman).
  • 24. A COMPARISON SERIAL  Fetch/Store  Compute PARALLEL  Fetch/Store  Compute/ communicate  Cooperative game GRID  Fetch/Store  Discovery of Resources  Interaction with remote application  Authentication / Authorization  Security  Compute/Communicate  Etc SandeepKumarPoonia
  • 25. DISTRIBUTED COMPUTING VS. GRID  Grid is an evolution of distributed computing  Dynamic  Geographically independent  Built around standards  Internet backbone  Distributed computing is an “older term”  Typically built around proprietary software and networks  Tightly coupled systems/organizations
  • 26. WEB VS. GRID  Web: uniform naming access to documents  Grid: uniform, high-performance access to computational resources [Figure labels: Colleges/R&D Labs, Software Catalogs, Sensor nets]
  • 27. IS THE WORLD WIDE WEB A GRID ?  Seamless naming? Yes  Uniform security and Authentication? No  Information Service? Yes or No  Co-Scheduling? No  Accounting & Authorization ? No  User Services? No  Event Services? No  Is the Browser a Global Shell ? No SandeepKumarPoonia
  • 28. WHAT DOES THE WORLD WIDE WEB BRING TO THE GRID ?  Uniform Naming  A seamless, scalable information service  A powerful new meta-data language: XML  XML will be standard language for describing information in the grid  SOAP – simple object access protocol  Uses XML for encoding, HTTP as the transport protocol  SOAP may become a standard RPC mechanism for Grid services  Portal Ideas
  • 29. THE ULTIMATE GOAL  In future I will not know or care where my application will be executed as I will acquire and pay to use these resources as I need them SandeepKumarPoonia
  • 30. WHY GRIDS?  Large-scale science and engineering are done through the interaction of people, heterogeneous computing resources, information systems, and instruments, all of which are geographically and organizationally dispersed.  The overall motivation for “Grids” is to facilitate the routine interactions of these resources in order to support large-scale science and engineering.
  • 31. AN EXAMPLE VIRTUAL ORGANIZATION: CERN’S LARGE HADRON COLLIDER 1800 Physicists, 150 Institutes, 32 Countries 100 PB of data by 2010; 50,000 CPUs?
  • 32. GRID COMMUNITIES & APPLICATIONS: DATA GRIDS FOR HIGH ENERGY PHYSICS [Figure: tiered data grid. Nodes, top to bottom: Online System; Offline Processor Farm (~20 TIPS); CERN Computer Centre; FermiLab (~4 TIPS) and the France, Italy and Germany Regional Centres; Tier2 Centres (~1 TIPS each) and Caltech (~1 TIPS); Institutes (~0.25 TIPS) with a physics data cache; Physicist workstations. Tier labels: Tier 0, Tier 1, Tier 2, Tier 4. Link labels: ~PBytes/sec, ~100 MBytes/sec, ~622 Mbits/sec or Air Freight (deprecated), ~622 Mbits/sec, ~1 MBytes/sec.] There is a “bunch crossing” every 25 nsecs. There are 100 “triggers” per second. Each triggered event is ~1 MByte in size. Physicists work on analysis “channels”. Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server. 1 TIPS is approximately 25,000 SpecInt95 equivalents. www.griphyn.org www.ppdg.net www.eu-datagrid.org
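The rates quoted on this slide can be combined into a rough data-volume estimate. The sketch below is only back-of-envelope arithmetic: the bunch-crossing interval, trigger rate and event size come from the slide, while the figure of 10^7 seconds of data taking per year is an assumption introduced here for illustration.

```python
# Figures quoted on the slide
BUNCH_CROSSING_NS = 25        # one bunch crossing every 25 ns
TRIGGERS_PER_SECOND = 100     # events kept per second
EVENT_SIZE_MB = 1             # approximate size of one triggered event

# Assumption (not on the slide): about 10^7 seconds of data taking per year
SECONDS_OF_RUNNING_PER_YEAR = 10_000_000


def crossings_per_second(interval_ns):
    """Number of bunch crossings per second for a given interval in ns."""
    return 1_000_000_000 // interval_ns


def stored_rate_mb_per_s(triggers_per_s, event_mb):
    """Data rate after triggering, in MB/s."""
    return triggers_per_s * event_mb


def yearly_volume_pb(rate_mb_per_s, seconds):
    """Yearly volume in decimal petabytes (1 PB = 10^9 MB)."""
    return rate_mb_per_s * seconds / 1_000_000_000


crossings = crossings_per_second(BUNCH_CROSSING_NS)
rate = stored_rate_mb_per_s(TRIGGERS_PER_SECOND, EVENT_SIZE_MB)
volume = yearly_volume_pb(rate, SECONDS_OF_RUNNING_PER_YEAR)

print(crossings)  # 40000000 crossings per second
print(rate)       # 100 MB/s, matching the ~100 MBytes/sec link label
print(volume)     # 1.0 PB per year under the assumed running time
```

Only about one crossing in 400,000 survives the trigger in this estimate, which is why the detector-side link is labelled in PBytes/sec while the stored stream is only ~100 MBytes/sec.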
  • 33. INTELLIGENT INFRASTRUCTURE: DISTRIBUTED SERVERS AND SERVICES SandeepKumarPoonia
  • 34. THE GRID: A BRIEF HISTORY  Early 90s  Gigabit testbeds, metacomputing  Mid to late 90s  Early experiments (e.g., I-WAY), academic software projects (e.g., Globus, Legion), application experiments  2002  Dozens of application communities & projects  Major infrastructure deployments  Significant technology base (esp. Globus Toolkit™)  Growing industrial interest  Global Grid Forum: ~500 people, 20+ countries
  • 35. HOW IT EVOLVES Utility computing Service grid Data grid Processing grid Virtualization Service-oriented Open standard SandeepKumarPoonia
  • 36. EARLY ADOPTERS  Academic  Big science  Life science  Nuclear engineering  Simulation… SandeepKumarPoonia
  • 37. MARKET POTENTIAL  Financial services: risk management and compliance  Automotive: acceleration of product development  Petroleum: discovery of oils Source: “Perspectives on grid: Grid computing - next-generation distributed computing" Matt Haynos, 01/27/04 SandeepKumarPoonia
  • 38. Criteria for a Grid: Coordinates resources that are not subject to centralized control. Uses standard, open, general-purpose protocols and interfaces. Delivers nontrivial qualities of service (e.g., response time, throughput, availability, security). Benefits: Exploit underutilized resources. Resource load balancing. Virtualize resources across an enterprise (Data Grids, Compute Grids). Enable collaboration for virtual organizations.
  • 39. WHY DO WE NEED GRIDS?  Many large-scale problems cannot be solved by a single computer  Globally distributed data and resources SandeepKumarPoonia
  • 40. GRID APPLICATIONS Data and computationally intensive applications: This technology has been applied to computationally intensive scientific, mathematical, and academic problems such as drug discovery, economic forecasting and seismic analysis, and to back-office data processing in support of e-commerce.  A chemist may utilize hundreds of processors to screen thousands of compounds per hour.  Teams of engineers worldwide pool resources to analyze terabytes of structural data.  Meteorologists seek to visualize and analyze petabytes of climate data with enormous computational demands. Resource sharing  Computers, storage, sensors, networks, …  Sharing always conditional: issues of trust, policy, negotiation, payment, … Coordinated problem solving  distributed data analysis, computation, collaboration, …
  • 41. GRID TOPOLOGIES • Intragrid – Local grid within an organisation – Trust based on personal contracts • Extragrid – Resources of a consortium of organisations connected through a (Virtual) Private Network – Trust based on Business to Business contracts • Intergrid – Global sharing of resources through the internet – Trust based on certification SandeepKumarPoonia
  • 42. COMPUTATIONAL GRID “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.” “The Grid: Blueprint for a New Computing Infrastructure”, Kesselman & Foster Example: Science Grid (US Department of Energy)
  • 43. DATA GRID  A data grid is a grid computing system that deals with data: the controlled sharing and management of large amounts of distributed data.  Data Grid is the storage component of a grid environment. Scientific and engineering applications require access to large amounts of data, and often this data is widely distributed. A data grid provides seamless access to the local or remote data required to complete compute-intensive calculations. Examples: Biomedical Informatics Research Network (BIRN), the Southern California Earthquake Center (SCEC).
  • 44. BACKGROUND: RELATED TECHNOLOGIES  Cluster computing  Peer-to-peer computing  Internet computing SandeepKumarPoonia
  • 45. CLUSTER COMPUTING  Idea: put some PCs together and get them to communicate  Cheaper to build than a mainframe supercomputer  Different sizes of clusters  Scalable – can grow a cluster by adding more PCs SandeepKumarPoonia
  • 47. PEER-TO-PEER COMPUTING  Connect to other computers  Can access files from any computer on the network  Allows data sharing without going through central server  Decentralized approach also useful for Grid SandeepKumarPoonia
  • 48. PEER TO PEER ARCHITECTURE SandeepKumarPoonia
  • 49. METHODS OF GRID COMPUTING  Distributed Supercomputing  High-Throughput Computing  On-Demand Computing  Data-Intensive Computing  Collaborative Computing  Logistical Networking SandeepKumarPoonia
  • 50. DISTRIBUTED SUPERCOMPUTING  Combining multiple high-capacity resources on a computational grid into a single, virtual distributed supercomputer.  Tackle problems that cannot be solved on a single system.  Examples: climate modeling, computational chemistry  Challenges include:  Scheduling scarce and expensive resources  Scalability of protocols and algorithms  Maintaining high levels of performance across heterogeneous systems SandeepKumarPoonia
  • 51. HIGH-THROUGHPUT COMPUTING  Uses the grid to schedule large numbers of loosely coupled or independent tasks, with the goal of putting unused processor cycles to work.  Schedule large numbers of independent tasks  Goal: exploit unused CPU cycles (e.g., from idle workstations)  Unlike distributed supercomputing, tasks are loosely coupled  Examples: parameter studies, cryptographic problems
  • 52. On-Demand Computing  Uses grid capabilities to meet short-term requirements for resources that are not locally accessible.  Models real-time computing demands.  Use Grid capabilities to meet short-term requirements for resources that cannot conveniently be located locally  Unlike distributed supercomputing, driven by cost-performance concerns rather than absolute performance  Dispatch expensive or specialized computations to remote servers
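The idea of many independent tasks soaking up idle cycles can be illustrated with a small parameter study. This is a minimal sketch, assuming a local thread pool as a stand-in for idle workstations; `simulate` and `run_parameter_study` are hypothetical names introduced here, not part of any grid middleware.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(param):
    """Hypothetical stand-in for one independent job of a parameter study."""
    return param * param


def run_parameter_study(params, workers=4):
    """Run every job on whichever worker is free.

    The jobs share no state and never talk to each other, so they can be
    executed in any order; map() still returns results in input order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate, params))


results = run_parameter_study(range(5))
print(results)  # [0, 1, 4, 9, 16]
```

Because no task waits on another, adding workers increases throughput without any change to the job code, which is the property high-throughput systems rely on.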
  • 53. COLLABORATIVE COMPUTING  Concerned primarily with enabling and enhancing human-to-human interactions.  Enable shared use of data archives and simulations  Applications are often structured in terms of a virtual shared space.  Examples:  Collaborative exploration of large geophysical data sets  Challenges:  Real-time demands of interactive applications  Rich variety of interactions SandeepKumarPoonia
  • 54. Data-Intensive Computing  The focus is on synthesizing new information from data that is maintained in geographically distributed repositories, digital libraries, and databases.  Particularly useful for distributed data mining.  Examples: • High energy physics generates terabytes of distributed data and needs complex queries to detect “interesting” events • Distributed analysis of Sloan Digital Sky Survey data
  • 55. LOGISTICAL NETWORKING  Logistical networks focus on exposing storage resources inside networks by optimizing the global scheduling of data transport, and data storage.  Contrasts with traditional networking, which does not explicitly model storage resources in the network.  high-level services for Grid applications  Called "logistical" because of the analogy it bears with the systems of warehouses, depots, and distribution channels. SandeepKumarPoonia
  • 56. P2P COMPUTING VS GRID COMPUTING  Differ in Target Communities  Grid system deals with more complex, more powerful, more diverse and highly interconnected set of resources than P2P. SandeepKumarPoonia
  • 57. A TYPICAL VIEW OF GRID ENVIRONMENT [Figure: User, Resource Broker, Grid Resources, Grid Information Service, connected by flows numbered 1 to 4 and labelled Grid application, Details of Grid resources, Computational jobs, Processed jobs, Computation result.] A User sends a computation- or data-intensive application to Global Grids in order to speed up the execution of the application. A Resource Broker distributes the jobs in an application to the Grid resources, based on the user’s QoS requirements and details of the available Grid resources, for further execution. Grid Resources (cluster, PC, supercomputer, database, instruments, etc.) in the Global Grid execute the user jobs. The Grid Information Service collects the details of the available Grid resources and passes the information to the resource broker.
  • 58. GRID MIDDLEWARE  Grids are typically managed by gridware, a special type of middleware that enables sharing and manages grid components based on user requirements and resource attributes (e.g., capacity, performance)  Software that connects other software components or applications to provide the following functions: Run applications on suitable available resources – Brokering, Scheduling Provide uniform, high-level access to resources – Semantic interfaces – Web Services, Service Oriented Architectures Address inter-domain issues of security, policy, etc. – Federated Identities Provide application-level status monitoring and control
  • 59. MIDDLEWARES  Globus – Chicago Univ  Condor – Wisconsin Univ – High throughput computing  Legion – Virginia Univ – virtual workspaces – collaborative computing  IBP – Internet Backplane Protocol – Tennessee Univ – logistical networking  NetSolve – solving scientific problems in heterogeneous environments – high throughput & data intensive
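The interaction between user, broker, information service and resources described on this slide can be sketched in a few lines. This is an illustrative model only: the class names, the `cost_per_job` and `free_slots` attributes, and the QoS rule (fastest resource within a cost budget) are assumptions made for the example, not part of any particular grid middleware.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    mips: int            # advertised speed
    cost_per_job: float  # hypothetical price charged per job
    free_slots: int      # how many more jobs it can accept


class InformationService:
    """Collects details of available resources for the broker."""

    def __init__(self):
        self._resources = []

    def register(self, resource):
        self._resources.append(resource)

    def available(self):
        return [r for r in self._resources if r.free_slots > 0]


class ResourceBroker:
    """Places each job on a resource that satisfies the user's QoS limit."""

    def __init__(self, info_service):
        self.info = info_service

    def dispatch(self, jobs, max_cost):
        placement = {}
        for job in jobs:
            candidates = [r for r in self.info.available()
                          if r.cost_per_job <= max_cost]
            if not candidates:
                placement[job] = None  # nothing affordable is free
                continue
            # Assumed QoS rule: pick the fastest affordable resource.
            best = max(candidates, key=lambda r: r.mips)
            best.free_slots -= 1
            placement[job] = best.name
        return placement


info = InformationService()
info.register(Resource("cluster", mips=500, cost_per_job=2.0, free_slots=1))
info.register(Resource("pc", mips=100, cost_per_job=0.5, free_slots=2))
info.register(Resource("supercomputer", mips=5000, cost_per_job=10.0, free_slots=4))

broker = ResourceBroker(info)
placement = broker.dispatch(["j1", "j2", "j3", "j4"], max_cost=3.0)
print(placement)
# {'j1': 'cluster', 'j2': 'pc', 'j3': 'pc', 'j4': None}
```

The supercomputer is never chosen because it exceeds the cost limit, and the fourth job stays unplaced once the affordable slots run out; a real broker would queue or renegotiate instead.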
  • 60. TWO KEY GRID COMPUTING GROUPS The Globus Alliance (www.globus.org)  Composed of people from: Argonne National Labs, University of Chicago, University of Southern California Information Sciences Institute, University of Edinburgh and others.  OGSA/I standards initially proposed by the Globus Group The Global Grid Forum (www.ggf.org)  Heavy involvement of Academic Groups and Industry  (e.g. IBM Grid Computing, HP, United Devices, Oracle, UK e-Science Programme, US DOE, US NSF, Indiana University, and many others)  Process  Meets three times annually  Solicits involvement from industry, research groups, and academics SandeepKumarPoonia
  • 61. GRID USERS  Many levels of users  Grid developers  Tool developers  Application developers  End users  System administrators SandeepKumarPoonia
  • 62. SOME GRID CHALLENGES  Data movement  Data replication  Resource management  Job submission SandeepKumarPoonia
  • 63. SOME OF THE MAJOR GRID PROJECTS
      Name | URL/Sponsor | Focus
      EuroGrid, Grid Interoperability (GRIP) | eurogrid.org, European Union | Create technology for remote access to supercomputer resources & simulation codes; in GRIP, integrate with Globus Toolkit™
      Fusion Collaboratory | fusiongrid.org, DOE Off. Science | Create a national computational collaboratory for fusion research
      Globus Project™ | globus.org, DARPA, DOE, NSF, NASA, Microsoft | Research on Grid technologies; development and support of Globus Toolkit™; application and deployment
      GridLab | gridlab.org, European Union | Grid technologies and applications
      GridPP | gridpp.ac.uk, U.K. eScience | Create & apply an operational grid within the U.K. for particle physics research
      Grid Research Integration Dev. & Support Center | grids-center.org, NSF | Integration, deployment, support of the NSF Middleware Infrastructure for research & education
  • 64. Grid in India: GARUDA • GARUDA is India's Grid Computing initiative connecting 17 cities across the country. • The 45 participating institutes in this nationwide project include all the IITs and C-DAC centers and other major institutes in India.
  • 65. GLOBUS GRID TOOLKIT  Open source toolkit for building Grid systems and applications  Enabling technology for the Grid  Share computing power, databases, and other tools securely online  Facilities for:  Resource monitoring  Resource discovery  Resource management  Security  File management SandeepKumarPoonia
  • 66. DATA MANAGEMENT IN GLOBUS TOOLKIT  Data movement  GridFTP  Reliable File Transfer (RFT)  Data replication  Replica Location Service (RLS)  Data Replication Service (DRS) SandeepKumarPoonia
  • 67. GRIDFTP  High performance, secure, reliable data transfer protocol  Optimized for wide area networks  Superset of Internet FTP protocol  Features:  Multiple data channels for parallel transfers  Partial file transfers  Third party transfers  Reusable data channels  Command pipelining SandeepKumarPoonia
  • 68. MORE GRIDFTP FEATURES  Auto tuning of parameters  Striping  Transfer data in parallel among multiple senders and receivers instead of just one  Extended block mode  Send data in blocks  Know block size and offset  Data can arrive out of order  Allows multiple streams SandeepKumarPoonia
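Extended block mode works because every block carries its own offset, so the receiver can place it correctly no matter which stream delivered it or in what order. The sketch below illustrates that idea only; it does not reproduce the GridFTP wire format, and `split_into_blocks` and `reassemble` are hypothetical helper names.

```python
def split_into_blocks(data: bytes, block_size: int):
    """Cut data into (offset, payload) pairs, as a sender might."""
    return [(offset, data[offset:offset + block_size])
            for offset in range(0, len(data), block_size)]


def reassemble(blocks, total_size: int) -> bytes:
    """Write each block at its offset; arrival order does not matter."""
    buffer = bytearray(total_size)
    for offset, payload in blocks:
        buffer[offset:offset + len(payload)] = payload
    return bytes(buffer)


data = b"grid computing moves data in blocks"
blocks = split_into_blocks(data, block_size=8)

# Simulate blocks arriving out of order over several parallel streams.
out_of_order = list(reversed(blocks))
restored = reassemble(out_of_order, len(data))

print(restored == data)  # True
```

Since the offset and size of every block are known, several streams can write into the same destination file at once without coordinating with each other.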
  • 69. STRIPING ARCHITECTURE  Use “Striped” servers
  • 70. LIMITATIONS OF GRIDFTP  Not a web service protocol (does not employ SOAP, WSDL, etc.)  Requires client to maintain open socket connection throughout transfer  Inconvenient for long transfers  Cannot recover from client failures SandeepKumarPoonia
  • 72. RELIABLE FILE TRANSFER (RFT)  Web service with “job-scheduler” functionality for data movement  User provides source and destination URLs  Service writes job description to a database and moves files  Service methods for querying transfer status
  • 74. REPLICA LOCATION SERVICE (RLS)  Registry to keep track of where replicas exist on physical storage system  Users or services register files in RLS when files created  Distributed registry  May consist of multiple servers at different sites  Increase scale  Fault tolerance SandeepKumarPoonia
  • 75. REPLICA LOCATION SERVICE (RLS)  Logical file name – unique identifier for contents of file  Physical file name – location of copy of file on storage system  User can provide logical name and ask for replicas  Or query to find logical name associated with physical file location SandeepKumarPoonia
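The two lookups described on the RLS slides, from logical name to replicas and from a physical location back to its logical name, amount to a two-way mapping. The following is a minimal in-memory sketch of that mapping; `ReplicaCatalog` and the example names and hosts are hypothetical and are not the Globus RLS API.

```python
from collections import defaultdict


class ReplicaCatalog:
    """Toy registry mapping logical file names to physical locations."""

    def __init__(self):
        self._by_logical = defaultdict(set)  # logical name -> physical names
        self._by_physical = {}               # physical name -> logical name

    def register(self, logical_name, physical_name):
        """Record that a copy of logical_name exists at physical_name."""
        self._by_logical[logical_name].add(physical_name)
        self._by_physical[physical_name] = logical_name

    def replicas(self, logical_name):
        """All known physical copies of a logical file (sorted)."""
        return sorted(self._by_logical.get(logical_name, ()))

    def logical_name(self, physical_name):
        """Reverse lookup; None if the location is not registered."""
        return self._by_physical.get(physical_name)


catalog = ReplicaCatalog()
catalog.register("lfn://climate/run42.nc",
                 "gsiftp://site-a.example.org/data/run42.nc")
catalog.register("lfn://climate/run42.nc",
                 "gsiftp://site-b.example.org/store/run42.nc")

print(catalog.replicas("lfn://climate/run42.nc"))
print(catalog.logical_name("gsiftp://site-b.example.org/store/run42.nc"))
```

A distributed deployment would spread such registries over several servers; the sketch keeps everything in one process to show only the name mapping.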
  • 76. DATA REPLICATION SERVICE (DRS)  Pull-based replication capability  Implemented as a web service  Higher-level data management service built on top of RFT and RLS  Goal: ensure that a specified set of files exists on a storage site  First, query RLS to locate desired files  Next, creates transfer request using RFT  Finally, new replicas are registered with RLS SandeepKumarPoonia
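The three DRS steps (locate, transfer, register) can be sketched as a single loop. This is a simplified illustration under stated assumptions: the catalog is a plain dictionary standing in for RLS, `transfer` is a caller-supplied function standing in for an RFT request, and the host names are made up.

```python
def replicate(wanted, catalog, local_site, transfer):
    """Ensure each wanted file has a replica under local_site.

    catalog:  dict mapping logical name -> list of physical URLs (RLS stand-in)
    transfer: callable(source_url, dest_url) (RFT stand-in)
    Returns the list of newly created replica URLs.
    """
    created = []
    for logical_name in wanted:
        # Step 1: query the catalog for existing replicas.
        replicas = catalog.get(logical_name, [])
        if any(url.startswith(local_site) for url in replicas):
            continue  # already present locally
        if not replicas:
            continue  # no known source to pull from
        # Step 2: pull a copy from the first known source.
        destination = f"{local_site}/{logical_name}"
        transfer(replicas[0], destination)
        # Step 3: register the new replica.
        catalog[logical_name].append(destination)
        created.append(destination)
    return created


catalog = {
    "run1.dat": ["gsiftp://site-a.example.org/data/run1.dat"],
    "run2.dat": ["gsiftp://site-b.example.org/store/run2.dat"],
}
transfer_log = []

created = replicate(
    wanted=["run1.dat", "run2.dat", "missing.dat"],
    catalog=catalog,
    local_site="gsiftp://site-b.example.org/store",
    transfer=lambda src, dst: transfer_log.append((src, dst)),
)

print(created)
# ['gsiftp://site-b.example.org/store/run1.dat']
```

Only `run1.dat` is copied: `run2.dat` already lives at the local site and `missing.dat` has no registered source, which shows why the catalog query has to come before any transfer request.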
  • 77. CONDOR  Original goal: high-throughput computing  Harvest wasted CPU power from other machines  Can also be used on a dedicated cluster  Condor-G – Condor interface to Globus resources SandeepKumarPoonia
  • 78. CONDOR  Provides many features of batch systems:  job queueing  scheduling policy  priority scheme  resource monitoring  resource management  Users submit their serial or parallel jobs  Condor places them into a queue  Scheduling and monitoring  Informs the user upon completion SandeepKumarPoonia
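The batch-system features listed here (a queue, a priority scheme, and scheduling onto available machines) can be illustrated with a small priority queue. This is a generic sketch, not Condor's actual matchmaking; `JobQueue` and the convention that a lower number means higher priority are assumptions made for the example.

```python
import heapq
import itertools


class JobQueue:
    """Toy batch queue: lower priority number runs first, ties are FIFO."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # preserves submission order

    def submit(self, name, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), name))

    def schedule(self, idle_machines):
        """Assign queued jobs to idle machines, best priority first."""
        assignments = []
        for machine in idle_machines:
            if not self._heap:
                break
            _, _, name = heapq.heappop(self._heap)
            assignments.append((name, machine))
        return assignments

    def __len__(self):
        return len(self._heap)


queue = JobQueue()
queue.submit("render", priority=2)
queue.submit("analysis", priority=1)
queue.submit("backup", priority=2)

assignments = queue.schedule(["ws1", "ws2"])
print(assignments)  # [('analysis', 'ws1'), ('render', 'ws2')]
print(len(queue))   # 1 job ("backup") still waiting
```

Jobs that do not fit stay queued until another machine becomes idle, which is how wasted cycles on otherwise unused workstations get harvested over time.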
  • 79. NIMROD-G  Tool to manage execution of parametric studies across distributed computers  Manages experiment  Distributing files to remote systems  Performing the remote computation  Gathering results  User submits declarative plan file  Parameters, default values, and commands necessary for performing the work  Nimrod-G takes advantage of Globus toolkit features SandeepKumarPoonia
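A declarative plan of parameters, default values and a command can be expanded mechanically into one job per parameter combination. The sketch below shows that expansion using a Python dictionary; the dictionary layout, the `./model` command and the parameter names are hypothetical and do not reproduce Nimrod-G's own plan file syntax.

```python
from itertools import product

# Hypothetical plan: swept parameters, fixed defaults, command template.
plan = {
    "parameters": {"temperature": [280, 300], "pressure": [1, 2, 5]},
    "defaults": {"steps": 1000},
    "command": "./model --temperature {temperature} "
               "--pressure {pressure} --steps {steps}",
}


def expand_plan(plan):
    """Return one command line per combination of swept parameter values."""
    names = list(plan["parameters"])
    value_lists = [plan["parameters"][name] for name in names]
    jobs = []
    for values in product(*value_lists):
        # Defaults first, so swept values override them if names collide.
        binding = {**plan["defaults"], **dict(zip(names, values))}
        jobs.append(plan["command"].format(**binding))
    return jobs


jobs = expand_plan(plan)
print(len(jobs))  # 6 jobs: 2 temperatures x 3 pressures
print(jobs[0])    # ./model --temperature 280 --pressure 1 --steps 1000
```

Each generated command is independent of the others, so the resulting list can be handed to any scheduler that distributes files, runs the computation remotely and gathers the results.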
  • 81. GRID CASE STUDIES  Earth System Grid  LIGO  TeraGrid SandeepKumarPoonia
  • 82. EARTH SYSTEM GRID  Provide climate studies scientists with access to large datasets  Data generated by computational models – requires massive computational power  Most scientists work with subsets of the data  Requires access to local copies of data SandeepKumarPoonia
  • 83. ESG INFRASTRUCTURE  Archival storage systems and disk storage systems at several sites  Storage resource managers and GridFTP servers to provide access to storage systems  Metadata catalog services  Replica location services  Web portal user interface SandeepKumarPoonia
  • 85. EARTH SYSTEM GRID INTERFACE SandeepKumarPoonia
  • 86. LASER INTERFEROMETER GRAVITATIONAL WAVE OBSERVATORY (LIGO)  Instruments at two sites to detect gravitational waves  Each experiment run produces millions of files  Scientists at other sites want these datasets on local storage  LIGO deploys RLS servers at each site to register local mappings and collect info about mappings at other sites SandeepKumarPoonia
  • 87. LARGE SCALE DATA REPLICATION FOR LIGO  Goal: detection of gravitational waves  Three interferometers at two sites  Generate 1 TB of data daily  Need to replicate this data across 9 sites to make it available to scientists  Scientists need to learn where data items are, and how to access them SandeepKumarPoonia
  • 89. LIGO SOLUTION  Lightweight data replicator (LDR)  Uses parallel data streams, tunable TCP windows, and tunable write/read buffers  Tracks where copies of specific files can be found  Stores descriptive information (metadata) in a database  Can select files based on description rather than filename SandeepKumarPoonia
  • 90. TERAGRID  NSF high-performance computing facility  Nine distributed sites, each with different capabilities, e.g., computation power, archiving facilities, visualization software  Applications may require more than one site  Data sizes on the order of gigabytes or terabytes
  • 92. TERAGRID  Solution: Use GridFTP and RFT with front end command line tool (tgcp)  Benefits of system:  Simple user interface  High performance data transfer capability  Ability to recover from both client and server software failures  Extensible configuration SandeepKumarPoonia
  • 93. TGCP DETAILS  Idea: hide low level GridFTP commands from users  Copy file smallfile.dat in a working directory to another system: tgcp smallfile.dat tg-login.sdsc.teragrid.org:/users/ux454332  GridFTP command: globus-url-copy -p 8 -tcp-bs 1198372 gsiftp://tg-gridftprr.uc.teragrid.org:2811/home/navarro/smallfile.dat gsiftp://tg-login.sdsc.teragrid.org:2811/users/ux454332/smallfile.dat SandeepKumarPoonia
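How a front end can hide the low-level command is easy to show by generating it from a handful of arguments. The sketch below only builds the argument list shown on the slide; `build_gridftp_command` is a hypothetical helper, and the default stream count, TCP buffer size and port are simply copied from the slide's example rather than taken from how tgcp actually chooses them.

```python
def build_gridftp_command(src_host, src_path, dst_host, dst_path,
                          streams=8, tcp_buffer=1198372, port=2811):
    """Build a globus-url-copy argument list for a host-to-host copy.

    Defaults mirror the example on the slide; a real front end would
    presumably tune them per site and per file size.
    """
    return [
        "globus-url-copy",
        "-p", str(streams),           # number of parallel data streams
        "-tcp-bs", str(tcp_buffer),   # TCP buffer size in bytes
        f"gsiftp://{src_host}:{port}{src_path}",
        f"gsiftp://{dst_host}:{port}{dst_path}",
    ]


command = build_gridftp_command(
    src_host="tg-gridftprr.uc.teragrid.org",
    src_path="/home/navarro/smallfile.dat",
    dst_host="tg-login.sdsc.teragrid.org",
    dst_path="/users/ux454332/smallfile.dat",
)
print(" ".join(command))
```

Returning a list rather than a single string means the result can be passed directly to `subprocess.run` without shell quoting issues, should the command actually be executed.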
  • 95. THE HOURGLASS MODEL  Focus on architecture issues  Propose set of core services as basic infrastructure  Used to construct high-level, domain-specific solutions (diverse)  Design principles  Keep participation cost low  Enable local control  Support for adaptation  “IP hourglass” model [Figure, top to bottom: Applications; Diverse global services; Core services; Local OS]
  • 96. LAYERED GRID ARCHITECTURE (BY ANALOGY TO INTERNET ARCHITECTURE) Grid layers, top to bottom: Application; Collective (“Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services); Resource (“Sharing single resources”: negotiating access, controlling use); Connectivity (“Talking to things”: communication (Internet protocols) & security); Fabric (“Controlling things locally”: access to, & control of, resources). Internet Protocol Architecture, top to bottom: Application; Transport; Internet; Link.
  • 97. EXAMPLE: DATA GRID ARCHITECTURE App: Discipline-specific Data Grid application. Collective (App): coherency control, replica selection, task management, virtual data catalog, virtual data code catalog, … Collective (Generic): replica catalog, replica management, co-allocation, certificate authorities, metadata catalogs, … Resource: access to data, access to computers, access to network performance data, … Connect: communication, service discovery (DNS), authentication, authorization, delegation. Fabric: storage systems, clusters, networks, network caches, …
  • 98. SIMULATION TOOLS  GridSim – job scheduling  SimGrid – single client multiserver scheduling  Bricks – scheduling  GangSim – Ganglia VO  OptorSim – Data Grid Simulations  G3S – Grid Security Services Simulator – security services
  • 99. SIMULATION TOOL  GridSim is a Java-based toolkit for modeling and simulation of distributed resource management and scheduling in conventional Grid environments.  GridSim is based on SimJava, a general-purpose discrete-event simulation package implemented in Java.  All components in GridSim communicate with each other through message-passing operations defined by SimJava.
  • 100. SALIENT FEATURES OF THE GRIDSIM  It allows modeling of heterogeneous types of resources.  Resources can be modeled operating under space- or time-shared mode.  Resource capability can be defined in the form of a MIPS (Million Instructions Per Second) benchmark.  Resources can be located in any time zone.  Weekends and holidays can be mapped depending on the resource’s local time to model non-Grid (local) workload.  Resources can be booked for advance reservation.  Applications with different parallel application models can be simulated.
  • 101. SALIENT FEATURES OF THE GRIDSIM  Application tasks can be heterogeneous and they can be CPU or I/O intensive.  There is no limit on the number of application jobs that can be submitted to a resource.  Multiple user entities can submit tasks for execution simultaneously in the same resource, which may be time-shared or space-shared. This feature helps in building schedulers that can use different market-driven economic models for selecting services competitively.  Network speed between resources can be specified.  It supports simulation of both static and dynamic schedulers.  Statistics of all or selected operations can be recorded and they can be analyzed using GridSim statistics analysis methods.
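The difference between space-shared and time-shared resources, and the role of the MIPS rating, shows up clearly in job completion times. The sketch below is a simplified calculation written in Python rather than with GridSim's Java API; it assumes a single processing element, job lengths given in millions of instructions (MI), and all jobs arriving at time zero.

```python
def space_shared_finish_times(lengths_mi, mips):
    """Jobs run one at a time in submission order (first come, first served)."""
    clock = 0.0
    finish = []
    for length in lengths_mi:
        clock += length / mips
        finish.append(clock)
    return finish


def time_shared_finish_times(lengths_mi, mips):
    """All jobs run at once, each getting an equal share of the processor."""
    order = sorted(range(len(lengths_mi)), key=lambda i: lengths_mi[i])
    finish = [0.0] * len(lengths_mi)
    clock = 0.0
    work_done = 0.0          # MI already received by every running job
    running = len(order)
    for i in order:
        # Until the shortest remaining job ends, each job gets mips/running.
        clock += (lengths_mi[i] - work_done) * running / mips
        finish[i] = clock
        work_done = lengths_mi[i]
        running -= 1
    return finish


lengths = [100, 200, 300]  # job lengths in MI
space = space_shared_finish_times(lengths, mips=100)
shared = time_shared_finish_times(lengths, mips=100)

print(space)   # [1.0, 3.0, 6.0]
print(shared)  # [3.0, 5.0, 6.0]
```

Both policies finish the whole batch at the same time, but space sharing completes short jobs early while time sharing delays every job in exchange for giving all of them immediate progress.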
  • 102. A MODULAR ARCHITECTURE FOR GRIDSIM PLATFORM AND COMPONENTS [Figure, layers from top to bottom. Application, User, Grid Scenario’s input and Results: Appn Conf, Res Conf, User Req, Grid Sc, Output. Grid Resource Brokers or Schedulers. GridSim Toolkit: Appn modeling, Res entity, Info serv, Job mgmt, Res alloc, Statis. Resource Modeling and Simulation: Single CPU, SMPs, Clusters, Load, Netw, Reservation. Basic Discrete Event Simulation Infrastructure: SimJava, Distributed SimJava. Virtual Machine. Distributed Resources: PCs, Workstation, Clusters, SMPs.]