Operating Systems
CMPSCI 377
Distributed Parallel Programming

Emery Berger
University of Massachusetts Amherst
Outline
- Previously:
  - Programming with threads
  - Shared memory, single machine
- Today:
  - Distributed parallel programming
  - Message passing

(some material adapted from slides by Kathy Yelick, UC Berkeley)
Why Distribute?
- SMP (symmetric multiprocessor): easy to program, but limited
  - Bus becomes bottleneck when processors not operating locally
  - Typically < 32 processors
  - $$$
[Figure: processors P1...Pn, each with its own cache ($), sharing one memory over a network/bus]
Distributed Memory
- Vastly different platforms
  - Networks of workstations
  - Supercomputers
  - Clusters
Distributed Architectures
- Distributed memory machines: local memory but no global memory
  - Individual nodes often SMPs
  - Network interface for all interprocessor communication – message passing
[Figure: nodes P0, P1, ..., Pn, each with its own memory and network interface (NI), connected by an interconnect]
Message Passing
- Program: # independent communicating processes
  - Thread + local address space only
  - Shared data: partitioned
- Communicate by send & receive events
  - Cluster = message sent over sockets
[Figure: processes P0, P1, ..., Pn each hold private copies of s and i; Pn executes “send P1,s” and P1 executes “receive Pn,s” over the network]
Message Passing
- Pros: efficient
  - Makes data sharing explicit
  - Can communicate only what is strictly necessary for computation
    - No coherence protocols, etc.
- Cons: difficult
  - Requires manual partitioning
    - Divide up problem across processors
  - Unnatural model (for some)
  - Deadlock-prone (hurray)
Message Passing Interface
- Library approach to message passing
- Supports most common architectural abstractions
  - Vendors supply optimized versions
    ⇒ programs run on different machines, but with (somewhat) different performance
- Bindings for popular languages
  - Especially Fortran, C
  - Also C++, Java
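Building and running (a sketch, assuming a typical MPI installation whose compiler wrapper is named mpicc; wrapper names vary by vendor, only the mpirun line below comes from the examples that follow):

  % mpicc exampleProgram.c -o exampleProgram   # compile and link against the MPI library
  % mpirun -np 10 exampleProgram               # launch 10 copies of the same program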
MPI execution model
- Spawns multiple copies of the same program (SPMD = single program, multiple data)
  - Each one is a different “process” (different local memory)
- Can act differently by determining which processor “self” corresponds to
An Example
#include <stdio.h>
#include <mpi.h>

int main(int argc, char * argv[]) {
  int rank, size;
  MPI_Init(&argc, &argv);                 /* initializes MPI (passes arguments in) */
  MPI_Comm_size(MPI_COMM_WORLD, &size);   /* returns # of processors in “world” */
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which processor am I? */
  printf("Hello world from process %d of %d\n",
         rank, size);
  MPI_Finalize();                         /* we’re done sending messages */
  return 0;
}

% mpirun -np 10 exampleProgram
An Example
% mpirun -np 10 exampleProgram
Hello world from process 5 of 10
Hello world from process 3 of 10
Hello world from process 9 of 10
Hello world from process 0 of 10
Hello world from process 2 of 10
Hello world from process 4 of 10
Hello world from process 1 of 10
Hello world from process 6 of 10
Hello world from process 8 of 10
Hello world from process 7 of 10
% // what happened?
Message Passing
- Messages can be sent directly to another processor
  - MPI_Send, MPI_Recv
- Or to all processors
  - MPI_Bcast (does send or receive)
Send/Recv Example
- Send data from process 0 to all
- “Pass it along” communication
- Operations:
  - MPI_Send(data *, count, MPI_INT, dest, 0, MPI_COMM_WORLD);
  - MPI_Recv(data *, count, MPI_INT, source, 0, MPI_COMM_WORLD, &status);
Send & Receive
int main(int argc, char * argv[]) {
  int rank, value, size;
  MPI_Status status;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  do {
    if (rank == 0) {
      scanf("%d", &value);
      MPI_Send(&value, 1, MPI_INT, rank + 1,    /* send destination: the next rank */
               0, MPI_COMM_WORLD);              /* message tag */
    } else {
      MPI_Recv(&value, 1, MPI_INT, rank - 1,    /* receive from: the previous rank */
               0, MPI_COMM_WORLD, &status);     /* message tag */
      if (rank < size - 1)
        MPI_Send(&value, 1, MPI_INT, rank + 1,  /* pass it along to the next rank */
                 0, MPI_COMM_WORLD);            /* message tag */
    }
    printf("Process %d got %d\n", rank, value);
  } while (value >= 0);
  MPI_Finalize();
  return 0;
}
- Send integer input in a ring
Exercise
- Compute expensiveComputation(i) on n processors; process 0 computes & prints the sum
// MPI_Send(&value, 1, MPI_INT, dest, 0, MPI_COMM_WORLD);
int main(int argc, char * argv[]) {
  int rank, size;
  MPI_Status status;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  if (rank == 0) {
    int sum = 0;
    /* ... */
    printf("sum = %d\n", sum);
  } else {
    /* ... */
  }
  MPI_Finalize();
  return 0;
}
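One possible solution sketch (an assumption, not the official answer; expensiveComputation here is a hypothetical stand-in for the real per-rank work): every non-zero rank computes its own term and sends it to rank 0, which adds its own term and then accumulates one partial result from each other rank, in whatever order they arrive.

#include <stdio.h>
#include <mpi.h>

int expensiveComputation(int i) { return i * i; }    /* hypothetical placeholder */

int main(int argc, char * argv[]) {
  int rank, size;
  MPI_Status status;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  if (rank == 0) {
    int sum = expensiveComputation(0);               /* rank 0 contributes its own term */
    for (int i = 1; i < size; i++) {
      int value;
      /* accept one partial result per rank, in any arrival order */
      MPI_Recv(&value, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &status);
      sum += value;
    }
    printf("sum = %d\n", sum);
  } else {
    int value = expensiveComputation(rank);
    MPI_Send(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
  }
  MPI_Finalize();
  return 0;
}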
Broadcast
- Send and receive: point-to-point
- Can also broadcast data
  - Source sends to everyone else
Broadcast
#include <stdio.h>
#include <mpi.h>

int main(int argc, char * argv[]) {
  int rank, value;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  do {
    if (rank == 0)
      scanf("%d", &value);
    /* MPI_Bcast arguments: the value to send or receive, how many to
       send/receive, what the datatype is, and who’s “root” for the broadcast */
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("Process %d got %d\n", rank, value);
  } while (value >= 0);
  MPI_Finalize();
  return 0;
}
- Repeatedly broadcast input (one integer) to all
Communication Flavors
- Basic communication
  - blocking = wait until done
  - point-to-point = from me to you
  - broadcast = from me to everyone
- Non-blocking (see the sketch below)
  - Think create & join, fork & wait…
  - MPI_Isend, MPI_Irecv
  - MPI_Wait, MPI_Waitall, MPI_Test
- Collective
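A minimal non-blocking sketch, assuming exactly two processes (mpirun -np 2); the point is that MPI_Isend/MPI_Irecv return immediately, and only MPI_Waitall guarantees the buffers are safe to use again:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char * argv[]) {
  int rank, outgoing, incoming;
  MPI_Request reqs[2];
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  int partner = (rank == 0) ? 1 : 0;      /* assumes exactly two ranks */
  outgoing = rank;

  /* post both operations; neither call blocks */
  MPI_Isend(&outgoing, 1, MPI_INT, partner, 0, MPI_COMM_WORLD, &reqs[0]);
  MPI_Irecv(&incoming, 1, MPI_INT, partner, 0, MPI_COMM_WORLD, &reqs[1]);

  /* ... unrelated computation can overlap with the communication here ... */

  /* wait for both requests before reading incoming or reusing outgoing */
  MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
  printf("Process %d got %d\n", rank, incoming);
  MPI_Finalize();
  return 0;
}

The request/wait pattern is why “think create & join, fork & wait” is a fair analogy: posting the operation is the fork, MPI_Wait* is the join. Collective operations (like the MPI_Bcast seen earlier) instead involve every process in the communicator.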
The End




Scaling Limits
- Kernel used in atmospheric models
  - 99% floating point ops; multiplies/adds
  - Sweeps through memory with little reuse
  - One “copy” of code running independently on varying numbers of procs
[Figure: scaling results, from Pat Worley, ORNL]
