After an overview of its fundamental technologies, Grid Computing is presented as the platform of choice for scientific High Performance Computing (HPC). The latest offerings in Cloud Computing (CC) could make it the basis for easy-to-deploy, on-demand, and widely accessible grids, putting HPC within the reach of most scientific and research communities. A case study framework is proposed for future development.
CLOUD COMPUTING: AN ALTERNATIVE PLATFORM FOR SCIENTIFIC COMPUTING
1. CLOUD COMPUTING: AN ALTERNATIVE PLATFORM FOR SCIENTIFIC COMPUTING
Presented by: DAVID RAMIREZ
COMP5003 GRADUATE SEMINAR AND PROJECT RESEARCH
Instructor: Dr. A. Lodgher
PRAIRIE VIEW A&M UNIVERSITY, TEXAS
May, 2009
2. Big scientific challenges…
Bioengineering
Bioinformatics
Climate models
Astrophysics models – exploding stars
Aerospace
(Images: Oak Ridge National Laboratory, Business Week, Argonne Labs)
3. The traditional approach…
Supercomputers, clusters
“Ranger” cluster at UT Austin, TX (University of Texas)
Ca. 4,000 nodes (Linux based)
580 Tflops
31 TB local memory
“Hydra” at Texas A&M University
52 nodes, 832 IBM processors (AIX based)
6.3 Tflops
1.6 TB memory, 20 TB storage
IBM Blue Gene
Argonne National Laboratory (Illinois) – US Department of Energy
1 PFLOP
Other installations in progress
(Germany) will reach 4 PFLOP by 2011
64K nodes and more
4. Cray XT5 “JAGUAR”
Cray – Oak Ridge National Laboratory
1.4 PFLOPS
181,000 processing cores
(AMD Opteron, 2- or 4-core)
Linux-based
16 to 32 GB memory per node
Necessity for high-performance visualization:
STALLION visualization center at TACC (Texas Advanced Computing Center), University of Texas, Austin
5. High Demands of Computing Power in Science … some examples
Example 1 (Lawrence Livermore National Lab. – VisIt Gallery):
720x720x1620 point grid
1,620 processors
20 days
20 terabytes of data output
Example 2 – Large-Eddy Simulation of a Rayleigh–Taylor instability (Lawrence Livermore National Lab. – VisIt Gallery):
11 million cells
512 processors of the FROST supercomputer at Lawrence Livermore National Lab.
36 hours
2 terabytes of data output
6. Scientific computing: Some History…
• Scientific computing always a driving force for hardware development.
• “Mainframe” the first platform.
• FORTRAN programming language became the (still dominant) standard.
(Image: IBM)
C     EXAMPLE OF FORTRAN CODE: READ AND SUM 1000 NUMBERS
      PROGRAM SUMMER
      REAL SUM, NUM
      INTEGER CNTR
      SUM = 0
      DO 10 CNTR = 1, 1000
        READ(*,*) NUM
        SUM = SUM + NUM
10    CONTINUE
      WRITE(*,*) SUM
      END
7. The next steps in hardware evolution…
Minicomputer: DEC PDP-8
Desktop minicomputer: Hewlett-Packard
The IBM PC
(Images: www.Xconomy.com, Hewlett-Packard, IBM)
9. The next logical step:
Aggregate the power of networked computers towards the solution of highly demanding computing tasks. Parallelize the solution of problems.
THE GRID CONCEPT WAS BORN!
10. The concepts behind grid computing….
Before… SERIAL COMPUTING
(Source: https://computing.llnl.gov/tutorials/parallel_comp/#Whatis – Lawrence Livermore National Laboratory)
Single computer – single CPU.
A problem is broken into a discrete series of instructions.
Instructions are executed one after another.
Only one instruction may execute at any moment in time.
11. Now … PARALLEL COMPUTING
(Source: https://computing.llnl.gov/tutorials/parallel_comp/#Whatis – Lawrence Livermore National Laboratory)
Software designed to be run using multiple CPUs.
A problem is broken into discrete parts that can be solved concurrently.
Each part is further broken down to a series of instructions.
Instructions from each part execute simultaneously on different CPUs.
12. PARALLEL COMPUTING: DEFINITIONS
Simultaneous use of multiple computing resources to solve a computational problem:
Run using multiple CPUs.
A problem is broken into discrete parts that can be solved concurrently.
Each part is further broken down to a series of instructions.
Instructions from each part execute simultaneously on different CPUs.
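The definitions above can be sketched in miniature. This is an illustrative example (not part of the original slides, and all names are made up): a problem – summing the squares of a large range of integers – is broken into discrete chunks, each chunk is handed to a separate worker process, and the partial results are combined at the end.

```python
# Sketch: break one problem into parts solved concurrently on multiple CPUs.
from multiprocessing import Pool

def partial_sum_of_squares(bounds):
    # One discrete part of the problem: sum squares over [lo, hi).
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

def parallel_sum_of_squares(n, workers=4):
    # Break the range [0, n) into one chunk per worker.
    step = n // workers
    chunks = [(w * step, n if w == workers - 1 else (w + 1) * step)
              for w in range(workers)]
    with Pool(workers) as pool:
        # Each chunk executes concurrently on a different CPU.
        partials = pool.map(partial_sum_of_squares, chunks)
    # Combine the partial results into the final answer.
    return sum(partials)

if __name__ == "__main__":
    print(parallel_sum_of_squares(1000))
```

The decomposition is the essential step: once the chunks are independent, the same pattern scales from one multi-core machine to the nodes of a grid.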
13. PARALLEL COMPUTER CLASSIFICATION: FLYNN’S TAXONOMY (1966)
(Source: https://computing.llnl.gov/tutorials/parallel_comp/#Whatis – Lawrence Livermore National Laboratory)
SISD – Single Instruction, Single Data: the ol’ serial computer.
SIMD – Single Instruction, Multiple Data: e.g. graphics processors.
MISD – Multiple Instruction, Single Data: rare – Space Shuttle flight computer.
MIMD – Multiple Instruction, Multiple Data: most modern parallel computers.
14. COMPUTATIONAL PROBLEMS IN PARALLEL COMPUTING
Perfect for loose
grids
Embarrassingly parallel calculations: (delays not so
important)
• each sub-calculation is independent of all the other
calculations. Subtasks rarely or never communicate
between them. Best for High-throughput computing
Fine-grained calculations More suitable
for
• Each sub-calculation is dependent on the result of supercomputers
another sub-calculation. Subtasks communicate many
times per second. Best for High-performance computing
Coarse-grained calculations More suitable
for
• Subtasks communicate between them less frequently supercomputers
(just several times per second).
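A classic instance of the embarrassingly parallel class is a Monte Carlo estimate of pi. The sketch below (illustrative only; function names are invented here) shows why this class suits loose, high-latency grids: each subtask draws its own random samples and never communicates with the others until the final tally.

```python
# Sketch: an embarrassingly parallel Monte Carlo estimate of pi.
import random

def count_hits(samples, seed):
    # One fully independent subtask with its own random stream.
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:   # point falls inside the quarter circle
            hits += 1
    return hits

def estimate_pi(total_samples=100_000, subtasks=10):
    per_task = total_samples // subtasks
    # Each count_hits() call could run on a different grid node;
    # here they run one after another for simplicity.
    hits = sum(count_hits(per_task, seed) for seed in range(subtasks))
    return 4.0 * hits / (per_task * subtasks)
```

Because a late or slow subtask only delays the final sum, network latency barely matters – exactly the high-throughput profile described above.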
15. SIMPLE EXAMPLE – Heat modeling
Source: Lawrence Livermore National Laboratory
The entire array is partitioned and distributed as subarrays to all tasks. Each task owns a portion of the total array.
Master process sends initial info to workers, checks for convergence, and collects results.
Worker process calculates the solution, communicating as necessary with neighbor processes.
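The master/worker partitioning can be sketched serially. This is not the LLNL code: it assumes a simple 1-D explicit finite-difference heat update (the original example is 2-D), and all names are illustrative. Each “worker” owns a slice of the array and reads one ghost cell on each side – the values that, on a real grid, would arrive as messages from neighbor processes.

```python
# Sketch: domain decomposition for a 1-D heat equation, master/worker style.
def heat_step(u, alpha=0.25):
    # Reference serial update: interior points diffuse, boundaries stay fixed.
    return ([u[0]]
            + [u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1])
               for i in range(1, len(u) - 1)]
            + [u[-1]])

def worker_update(u, lo, hi, alpha=0.25):
    # A worker owns cells [lo, hi); u[lo-1] and u[hi] act as ghost cells.
    out = []
    for i in range(lo, hi):
        if i == 0 or i == len(u) - 1:
            out.append(u[i])                       # fixed boundary value
        else:
            out.append(u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1]))
    return out

def master_step(u, workers=4):
    # Master partitions the array, dispatches subarrays, reassembles results.
    size = len(u) // workers
    result = []
    for w in range(workers):
        lo = w * size
        hi = len(u) if w == workers - 1 else lo + size
        result.extend(worker_update(u, lo, hi))
    return result
```

The point of the sketch: the partitioned update produces exactly the same array as the serial one, so correctness is preserved while the per-worker pieces can run in parallel.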
17. HIGH-THROUGHPUT PROBLEMS
Problems divided into many
independent tasks
Computing grids used to schedule
these tasks, dealing them out to the
different processors in the grid.
As soon as a processor finishes one task, the next task arrives.
Example: Large Hadron Collider Computer Grid (CERN / Geneva)
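The scheduling pattern above can be sketched with a worker pool (illustrative only – the LHC grid middleware is far more elaborate, and the names here are invented): a shared pool deals independent tasks to workers, and an idle worker immediately takes the next task from the queue.

```python
# Sketch: high-throughput scheduling of independent tasks over a worker pool.
from concurrent.futures import ThreadPoolExecutor

def run_task(n):
    # Stand-in for one independent unit of work (e.g. one collision event).
    return n * n

def schedule(tasks, workers=3):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() deals tasks out; as soon as a worker finishes one task,
        # it picks up the next. Results come back in submission order.
        return list(pool.map(run_task, tasks))
```

A real grid scheduler adds fault tolerance and data movement, but the throughput comes from the same idea: keep every processor busy with the next independent task.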
18. APPROACHES FOR PARALLEL COMPUTING IMPLEMENTATION
CLUSTER
Processors are close together
High-speed network, low latency
When big: “Supercomputer”
Ideal for fine-grained, high-performance computation
GRID
Disperse – even over wide distances
The most distributed form of parallel computing
Internet as main transport
Loose connectivity, high latency
Mostly commodity hardware in nodes
Ideal for embarrassingly parallel, high-throughput computation
… or a mix of both approaches
19. GRIDS ALL OVER THE WORLD … CERN LHC Computing Grid
200K processors
11 clusters worldwide
Source: http://www.accessgrid.org
The CERN Large Hadron Collider Computing Grid is currently the most important / powerful scientific grid.
20. Pioneering Grid Vendors
Scientific applications; external or internal grids; pioneer client software; Sun Grid Engine (middleware); Lustre distributed filesystem.
General market: commerce, industry, science; computing-on-demand model.
21. Now…The Cloud meets the Grid…
Grid resources become abstractions (“black boxes”)
23. CASE STUDY: AMAZON WEB SERVICES
EC2 – Elastic Compute Cloud
S3 – Simple Storage Service
SimpleDB – (unstructured database)
SQS – Simple Queue Service
Elastic MapReduce – enables parallelization
24. PROOF-OF-CONCEPT PROPOSAL:
Use AWS cloud infrastructure as a platform for scientific-applications-oriented, high-performance / high-throughput, parallel computing.
25. AWS HOW-TO FOR DOING THIS (“self-service”)
1 • Have an embarrassingly parallel problem at hand.
  • Code the problem solution using parallel techniques such as MPI (Message Passing Interface). Use Fortran, C, C++, Python.
2 • Create a running environment snapshot (complete with OS and software) and store it in S3.
  • MAP & REDUCE using the AWS HADOOP* middleware implementation. Balance loads.
3 • Feed separate tasks to n EC2 nodes. Start nodes on demand using the S3-stored images. Deploy & coordinate with HADOOP.
  • Collect partial results, assemble into the final product (master node).
* HADOOP is an open-source product of the Apache Software Foundation written in Java™
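The MAP & REDUCE step can be sketched in miniature. Hadoop itself is a Java framework, but the programming model boils down to the two phases below; this Python sketch is illustrative only (function names are invented, and the shuffle is collapsed into the reduce phase).

```python
# Sketch: the MapReduce programming model, shown as a tiny word count.
from collections import defaultdict

def map_phase(records):
    # MAP: each input record is processed independently into (key, value)
    # pairs, so this phase can be spread across many EC2 nodes.
    for line in records:
        for word in line.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    # SHUFFLE + REDUCE: pairs are grouped by key, then each group is
    # combined into a single result (here, by summing the counts).
    groups = defaultdict(int)
    for key, value in pairs:
        groups[key] += value
    return dict(groups)

def word_count(records):
    return reduce_phase(map_phase(records))
```

Because every map call is independent and every reduce group is independent, the middleware is free to balance both phases across however many nodes are started on demand.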
26. MAP & REDUCE ARCHITECTURE
Source: http://hadoop.apache.org/core/
Price list for AWS (as of Spring, 2009) – http://aws.amazon.com/elasticmapreduce/#pricing
Service         Cost (using maximum capacity)
EC2             $0.80/hr
MapReduce       $0.12/hr
S3              $0.15/GB per month
Data transfer   $0.10/GB
28. SOME CURRENT ACADEMIC PROJECTS & WORKING IMPLEMENTATIONS OF CLOUD COMPUTING-BASED SCIENTIFIC GRIDS (Nimbus Framework)
•University of Chicago (NIMBUS)
•University of Florida (STRATUS)
•Purdue University (WISPY)
•Masaryk University (KUPA) (Czech Republic)
Source: http://workspace.globus.org/clouds
29. CONCLUSION
By integrating networking, computation and information,
the Grid provides a practical, virtual platform for computing
suitable for scientific research.
AWS cloud services make it easy and affordable to
implement a sufficiently powerful, scalable, and practical
grid computing platform.
•Can be self-serviced.
•On-demand model.
•Very economical.
•Suitable for fast turnkey solutions without the expense of costly infrastructure or computer time.
•Ideal in an academic environment, to foster hands-on research with complex models.
30. FUTURE WORK
GLOBALLY : Grid computing, now more widely enabled by cloud-
computing (Infrastructure-as-a-Service) platforms and the
sponsorship of governments, industries and the scientific
community, is a fundamental component for the future of
computing.
LOCALLY: The goal of this research paper is to provide a basis for a near-future practical, proof-of-concept implementation of a Cloud-based Grid that puts Prairie View A&M University on the list of universities with access to such infrastructure, for the benefit of its students, academic staff, and the community in general.