Performance Analysis of IPC Mechanisms
Srivats Ubrani Anantharam Bharadwaj
Department of Computer Science
George Mason University
sbharad3@masonlive.gmu.edu
I. ABSTRACT
Inter-process communication (IPC) refers to the coordination of activities among processes. A common example of this is managing access to a given system resource. Unix-based operating systems feature several forms of IPC mechanisms, including pipes, UNIX domain sockets, named pipes (FIFOs), semaphores, and signals. Although each of these was designed to perform a similar service, they vary in performance. To carry out IPC, some form of active or passive communication is required. Systems for managing communication and synchronization between cooperating processes are essential to many modern operating systems, and IPC has played a major role in UNIX-based operating systems.
The analysis of pipes, message queues, and shared memory involved examining code written to transfer data between two different processes. The programs were run and throughput was measured on an Intel i7 processor. It was observed that shared memory was substantially faster than both pipes and message queues. This paper describes the methodology of profiling the code to analyze the performance of these IPC mechanisms.
II. INTRODUCTION
Among all the vital aspects of an operating system, process management is its foremost responsibility. A good OS must provide methods for processes to communicate with each other. Such methods are known as inter-process communication (IPC) mechanisms. These mechanisms allow communication and sharing of information between processes. They also provide ways to modularize the OS and to parallelize programs for computational speedup [1].
Regular files are not a feasible communication medium for parallel processes. Consider a reader/writer situation in which one process writes data and another process reads it: the file must retain all of the data transmitted, which consumes disk space.
A. PIPES
There is no simpler form of IPC than the pipe [2]. Pipes provide a means of communication between two processes. The pipe call returns a pair of file descriptors, one open in read mode and one in write mode. A pipe behaves like a queue: the first thing written to the pipe is the first thing read from it. Writes (calls to write on the pipe's input descriptor) fill the pipe and block when the pipe is full; they block until another process reads enough data at the other end of the pipe, and return when all the data given to write have been transmitted. Reads (calls to read on the pipe's output descriptor) drain the pipe. If the pipe is empty, a call to read blocks until at least one byte is written at the other end; it then returns immediately, without waiting for the full number of bytes requested by read to be available.
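A minimal sketch of these semantics in C is shown below (error handling is abbreviated, and the message text is illustrative):

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int fd[2];                      /* fd[0] = read end, fd[1] = write end */
    char buf[32];

    if (pipe(fd) == -1) { perror("pipe"); return 1; }

    if (fork() == 0) {              /* child: reader */
        close(fd[1]);               /* close unused write end */
        ssize_t n = read(fd[0], buf, sizeof(buf));  /* blocks until data arrives */
        printf("child read %zd bytes: %.*s\n", n, (int)n, buf);
        close(fd[0]);
        return 0;
    }

    close(fd[0]);                   /* parent: writer; close unused read end */
    write(fd[1], "hello", 5);       /* first write is the first thing read (FIFO) */
    close(fd[1]);
    wait(NULL);
    return 0;
}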
B. MESSAGE QUEUES
Message queues provide an asynchronous
communications protocol, which means that the sender
and receiver do not need to interact with the message
queue at the same time. Message queues have limits on
the size of data that may be transmitted in a single
message and the number of messages that may be sent
on the queue. Different operating systems have different
implementations of message queues are they are
designed for specific purposes. Message queue is created
using msgget() that takes key as an argument and returns
descriptor of the queue if the queue exists. The flag
IPC_CREAT is used to create the message queue. The
receiver function uses msgrcv() to receive messages
from the queue of a specified message type. Finally, the
msgctl() method is used to destroy a message queue after
it by passing IPC_RMID flag[3].
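A minimal sketch of this lifecycle is given below; the key value 1234 is hypothetical (ftok() is the usual way to derive one), and error handling is abbreviated:

#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>

struct msgbuf {
    long mtype;                     /* message type, must be > 0 */
    char mtext[128];                /* message payload */
};

int main(void) {
    key_t key = 1234;               /* hypothetical key */
    int qid = msgget(key, IPC_CREAT | 0666);          /* create (or open) the queue */

    struct msgbuf msg = { .mtype = 1 };
    strcpy(msg.mtext, "hello");
    msgsnd(qid, &msg, strlen(msg.mtext) + 1, 0);      /* send one message */

    struct msgbuf in;
    msgrcv(qid, &in, sizeof(in.mtext), 1, 0);         /* receive a type-1 message */
    printf("received: %s\n", in.mtext);

    msgctl(qid, IPC_RMID, NULL);    /* destroy the queue */
    return 0;
}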
C. SHARED MEMORY
Shared memory is the fastest form of IPC available. It allows processes to share a region of memory. Once the region is mapped into the address space of each process sharing it, no kernel involvement occurs in passing data between the processes. The processes must therefore synchronize their use of the shared region among themselves. The user can create, destroy, and open this memory using a shared memory object.
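A minimal sketch of the System V interface used later in this paper is shown below; the key and segment size are illustrative, and since only one process is involved here, no synchronization is needed:

#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void) {
    key_t key = 5678;                                 /* hypothetical key */
    size_t size = 4096;                               /* one page, for illustration */

    int shmid = shmget(key, size, IPC_CREAT | 0666);  /* create the segment */
    char *shm = shmat(shmid, NULL, 0);                /* map it into this process */

    strcpy(shm, "hello");            /* plain memory access: no kernel involvement */
    printf("segment contains: %s\n", shm);

    shmdt(shm);                      /* unmap the segment */
    shmctl(shmid, IPC_RMID, NULL);   /* destroy the segment */
    return 0;
}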
III. METHODOLOGY
Each IPC mechanism was implemented as a simple program that used the mechanism's interface to transfer data to a different process through a buffer of fixed size. Performance was calculated from the amount of data read by the receiver over different intervals of time. Measuring the amount of data received by a process against real time shows the efficiency of the IPC mechanism.
A. REAL TIME MEASUREMENT
The C programming language offers clock and time functions for measuring time in terms of the system wide clock or CPU clock cycles. One such function, clock_gettime(), is used in this project. These functions are defined in <time.h>. The tp argument points to a timespec struct. The clk_id argument of clock_gettime() is the identifier of the particular clock on which to act; it is in this argument that the type of clock is specified. The system wide clock is accessible to all processes [4].
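A minimal usage sketch follows (on older systems this may require linking with -lrt):

#include <stdio.h>
#include <time.h>

int main(void) {
    struct timespec tp;

    /* CLOCK_REALTIME is the system wide clock accessible to all processes */
    clock_gettime(CLOCK_REALTIME, &tp);

    printf("seconds: %ld, nanoseconds: %ld\n",
           (long)tp.tv_sec, tp.tv_nsec);
    return 0;
}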
IV. DESIGN OF IPC PROGRAMS FOR ANALYSIS
The programs for IPC were written to measure the data transferred between two processes in t seconds, on Ubuntu with 4 GB of RAM running in a virtual machine. The high-level design of each IPC program is discussed below.
A. IMPLEMENTATION OF PIPES
A call to pipe() is made to open a pipe before the fork() call. In the parent process, the read end of the pipe is closed and data is written into the pipe in an infinite loop, one buffer per iteration. After this the parent process waits for the child to terminate. The timespec struct has the precision to measure time in seconds as well as nanoseconds. The structure is defined in <time.h>:
struct timespec {
    time_t tv_sec;   /* seconds */
    long   tv_nsec;  /* nanoseconds */
};
B. REAL TIME MEASUREMENT
In all the implementations of IPC, the process that reads data does so in an infinite loop. The loop is broken out of once the clock exceeds the specified time interval. The pseudocode of this logic is given below:
get_time(start_time);
while (true) {
    read_data(bytes);
    get_time(end_time);
    if (end_time - start_time > t) {
        /* Exit the program */
    }
}
In the child process, the time of the system wide clock is retrieved by the clock_gettime() function. The process reads data from the pipe for the specified amount of time. At the end, the number of bytes read over the entire duration is noted.
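A sketch of how this logic can be realized in C is given below; elapsed_seconds() and timed_read() are helper names introduced here for illustration:

#include <time.h>
#include <unistd.h>

/* Elapsed wall-clock time between two timespec values, in seconds. */
static double elapsed_seconds(struct timespec start, struct timespec end) {
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Read from fd for roughly t seconds and return the number of bytes read. */
static long long timed_read(int fd, double t) {
    char buf[4096];                 /* fixed-size buffer, as in the experiments */
    long long total = 0;
    struct timespec start, now;

    clock_gettime(CLOCK_REALTIME, &start);
    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0)
            total += n;
        clock_gettime(CLOCK_REALTIME, &now);
        if (elapsed_seconds(start, now) > t)
            return total;           /* exit once the interval has elapsed */
    }
}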
Below is the pseudocode of the pipes program:
pipe(p);
fork();
if (parent) {
    for (;;) {
        /* write to the pipe forever */
    }
} else if (child) {
    get_time(start_time);
    for (;;) {
        read_from_pipe(bytes);
        get_time(end_time);
        if (end_time - start_time > t)
            break;   /* exit the program */
    }
}
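Put together, a runnable sketch of this design might look as follows; the interval t, the buffer size, and the output format are illustrative, not necessarily those of the original programs:

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <sys/wait.h>

#define BUFSIZE 4096                /* fixed 4K buffer, as in the experiments */

static double elapsed(struct timespec a, struct timespec b) {
    return (double)(b.tv_sec - a.tv_sec)
         + (double)(b.tv_nsec - a.tv_nsec) / 1e9;
}

int main(void) {
    int p[2];
    char buf[BUFSIZE];
    double t = 2.0;                 /* measurement interval in seconds */

    pipe(p);
    memset(buf, 'a', sizeof(buf));

    if (fork() > 0) {               /* parent: writer */
        signal(SIGPIPE, SIG_IGN);   /* so write() fails instead of killing us */
        close(p[0]);
        while (write(p[1], buf, sizeof(buf)) > 0)
            ;                       /* write to the pipe until the child exits */
        close(p[1]);
        wait(NULL);                 /* wait for the child to terminate */
    } else {                        /* child: reader */
        long long total = 0;
        struct timespec start, now;

        close(p[1]);
        clock_gettime(CLOCK_REALTIME, &start);
        for (;;) {
            ssize_t n = read(p[0], buf, sizeof(buf));
            if (n > 0)
                total += n;
            clock_gettime(CLOCK_REALTIME, &now);
            if (elapsed(start, now) > t)
                break;              /* exit once t seconds have elapsed */
        }
        printf("%lld bytes read in %.1f s\n", total, t);
        exit(0);                    /* closes p[0]; the parent's write then fails */
    }
    return 0;
}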
C. IMPLEMENTATION OF MESSAGE QUEUES
Message queues are implemented in two programs as
a sender and receiver of messages. The sender program
sends messages in an infinite loop. The receiver program
receives the data for a t seconds.
Below is the pseudocode of the sender program:
while (1) {
    /* send the message using msgsnd */
    msgsnd();
}
Below is the pseudocode of the receiver program:
get_time(start_time);
for (;;) {
    /* receive the message using msgrcv */
    msgrcv();
    get_time(end_time);
    if (end_time - start_time > t)
        break;   /* exit the program */
}
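A runnable sketch of the receiver side, under the same timing scheme, is given below; the key 1234 is the same hypothetical value assumed for the sender, and t and the payload size are illustrative:

#include <stdio.h>
#include <time.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

struct msgbuf {
    long mtype;                     /* message type */
    char mtext[4096];               /* fixed 4K payload, as in the experiments */
};

static double elapsed(struct timespec a, struct timespec b) {
    return (double)(b.tv_sec - a.tv_sec)
         + (double)(b.tv_nsec - a.tv_nsec) / 1e9;
}

int main(void) {
    int qid = msgget(1234, IPC_CREAT | 0666);  /* same hypothetical key as sender */
    struct msgbuf msg;
    long long total = 0;
    double t = 2.0;                 /* measurement interval in seconds */
    struct timespec start, now;

    clock_gettime(CLOCK_REALTIME, &start);
    for (;;) {
        ssize_t n = msgrcv(qid, &msg, sizeof(msg.mtext), 1, 0);
        if (n > 0)
            total += n;
        clock_gettime(CLOCK_REALTIME, &now);
        if (elapsed(start, now) > t)
            break;                  /* exit once t seconds have elapsed */
    }
    printf("%lld bytes received in %.1f s\n", total, t);
    return 0;
}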
D. IMPLEMENTATION OF SHARED MEMORY
Shared memory is implemented as a parent and a child process, each of which accesses the memory segment to read or write. The memory region, of SHMSIZE bytes as specified in the program, is created using shmget().
The pseudocode of the program is given below:
/* create the shared memory segment using the IPC_CREAT flag */
shmid = shmget(key, SHMSIZE, IPC_CREAT);

if (fork() > 0) {                   /* parent process */
    /* attach the shared memory to the calling process */
    shm = shmat(shmid, 0, 0);
    for (;;) {
        memset(shm, 'a', SHMSIZE);  /* write into the segment */
    }
} else {                            /* child process */
    /* attach to the memory segment using shmid */
    shm = shmat(shmid, 0, 0);
    get_time(start_time);
    for (;;) {
        read_from_shm();
        get_time(end_time);
        if (end_time - start_time > t)
            break;                  /* exit the program */
    }
}
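A runnable sketch of this design is given below; SHMSIZE and t are illustrative, and, as noted in Section II-C, the sketch omits the synchronization between writer and reader that a production program would need:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>

#define SHMSIZE 4096                /* fixed 4K segment, as in the experiments */

static double elapsed(struct timespec a, struct timespec b) {
    return (double)(b.tv_sec - a.tv_sec)
         + (double)(b.tv_nsec - a.tv_nsec) / 1e9;
}

int main(void) {
    int shmid = shmget(IPC_PRIVATE, SHMSIZE, IPC_CREAT | 0666);
    char out[SHMSIZE];
    double t = 2.0;                 /* measurement interval in seconds */

    if (fork() > 0) {               /* parent: writer */
        char *shm = shmat(shmid, NULL, 0);
        int status;
        while (waitpid(-1, &status, WNOHANG) == 0)
            memset(shm, 'a', SHMSIZE);          /* write until the child exits */
        shmdt(shm);
        shmctl(shmid, IPC_RMID, NULL);          /* destroy the segment */
    } else {                        /* child: reader */
        char *shm = shmat(shmid, NULL, 0);
        long long total = 0;
        struct timespec start, now;

        clock_gettime(CLOCK_REALTIME, &start);
        for (;;) {
            memcpy(out, shm, SHMSIZE);          /* read the segment (unsynchronized) */
            total += SHMSIZE;
            clock_gettime(CLOCK_REALTIME, &now);
            if (elapsed(start, now) > t)
                break;              /* exit once t seconds have elapsed */
        }
        printf("%lld bytes read in %.1f s\n", total, t);
        shmdt(shm);
        exit(0);
    }
    return 0;
}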
V. OBSERVATIONS
After the programs were run for different intervals of time, the number of bytes transferred was noted at each time interval. The results were plotted with time on the X-axis against data transferred on the Y-axis. The outputs are discussed below.
A. ANALYSIS OF PIPES
Although pipes are the simplest form of IPC, being unidirectional channels for transferring data, they are not the fastest. Over a time interval of 2 seconds, pipes transferred approximately 212 MB of data; by the end of 8 seconds, 758 MB had been transferred. The average throughput of pipes was found to be about 95 MB/s.
Fig 5.1: Performance of Pipes
B. ANALYSIS OF SYSTEM V MESSAGE QUEUES
Message queues were faster than pipes, yet much slower than shared memory. After 2 seconds, 238 MB of data had been received by the receiving process. This amounts to an average transfer rate of about 120 MB/s.
Fig 5.2: Performance of Message Queues
C. ANALYSIS OF SHARED MEMORY
Shared memory is the fastest mode of IPC, most likely because there is little kernel involvement in this form of communication beyond allocating the memory segment and attaching it to the calling processes. In 2 seconds, 8 GB of data was written into the memory segment; the average rate of data written was around 4 GB/s.
Fig 5.3: Performance of Shared Memory
VI. CONCLUSION
Using a fixed buffer size of 4K, the data transfer rate of shared memory was found to be the highest. Hence, shared memory is the fastest form of IPC, followed by message queues and pipes.
VII. ACKNOWLEDGEMENT
Thanks go to Professor Harold M. Greenwald for encouraging me to do research on this topic.
VIII. REFERENCES
[1] Kwame Wright, Karthik Gopalan, "Performance Analysis of Various Mechanisms for Inter-process Communication."
[2] Beej's Guide to Unix IPC, http://beej.us/guide/bgipc/output/html/multipage/pipes.html.
[3] Linux man page, http://linux.die.net/man/2/msgrcv.
[4] Linux man page, http://linux.die.net/man/3/clock_gettime.