Amoeba - Heterogeneous Multiprocessor Debugging in a Single Session of GDB
Santosh Kumar and Kalpak Shah
{santosh.raghuram, kalpak.shah}@gmail.com
Pune Institute of Computer Technology, Pune.
ABSTRACT

With the proliferation of both homogeneous and heterogeneous multiprocessor systems, debugging applications on such multiprocessors has become a challenge. Today's SoCs combine the power of multiple RISCs and DSPs to produce powerful handheld devices, but debuggers have not evolved for such SoCs. GDB is an open-source debugger which can currently be configured only for a single architecture, and can only debug one target of that architecture at a time. Hence it cannot debug homogeneous or heterogeneous multiprocessor systems as a unit. We propose a GDB that can support multiple CPU target architectures and ABIs (Application Binary Interfaces) and simultaneously debug all the targets in a single session. This allows harnessing GDB's powerful scripting interface, which could be used in regression suites. Features like barrier breakpoints, lockstep, stop/continue-all, etc. will be provided for multiple processors. This will entail enhancing the design and user interface provided by GDB and will allow GDB to be the preferred debugger for such architectures. Each multiprocessor vendor can then design a GDB port for their processors and use the powerful features offered by GDB. All the targets are multiplexed and deal with the same debugger. Hence the debugger can coordinate their activities and run them as a unit. A single session of GDB will be able to handle tasks such as processor interfacing, inter-processor communication, run-time execution and coordination. The processors can be run in lockstep to debug synchronization errors and mutex and semaphore problems. We also provide the design and implementation of a GUI in Eclipse CDT.

Keywords: Multiprocessor Debugging, GDB, Heterogeneous Multiprocessors, Interprocessor-Coordination, Concurrent Debugging.

1. INTRODUCTION

SoCs today try to maximize power efficiency and processor utilization by using heterogeneous multiprocessor systems employing multiple RISCs and DSPs [4]. The role of the RISC processors is to provide overall control of the system, managing and monitoring the system activity. Hence the RISC processor generally hosts an operating system and can use the thousands of applications written for the operating system it hosts. DSPs, on the other hand, have only a primitive set of functions. However, a DSP can operate faster and more efficiently when it comes to executing specific kinds of calculations/tasks. Typical applications that can benefit from a DSP are media players, in which the DSP can act as a decoder.

Such systems demand the development of concurrent applications. The debugging of such parallel applications poses a great challenge to conventional debuggers. The debugging process, which takes at least 50% of the development effort [3], together with testing poses the following problems when it comes to concurrent debugging.

• The state of different targets and synchronization objects is not visible during debugging due to lack of debugger information. Knowledge about this state is essential to allow inferences about the execution stage of the program and its progress relative to the partial order of synchronization.

• Since each debugger session can control only one processor, no inter-processor control is possible. This is shown in Fig 1, where each session of GDB can communicate with only one processor.

• Also, each debugger session can be configured for only a single architecture and ABI.
• Non-availability of a GUI which can show code, disassembly and breakpoint windows for multiple programs.

[Figure: GDB Session 1 through GDB Session N each control one processor (Processor 1 through Processor N, each with local memory, all attached to shared memory), with no interprocessor control between the sessions.]
Fig 1. Difficulty posed due to different sessions of GDB

Since the debugger could not impose inter-processor control, it was very difficult to debug data races, synchronization errors, etc. An ideal multiprocessor debugger should provide the following features:

• Ability to simultaneously debug multiple targets of varied architectures.

• Interprocessor breakpoints, which block a processor until all processors reach that breakpoint.

• Status inquiries about processors.

• Scheduling control, which provides the means to forcibly suspend and resume processors, and also to stop/continue all processors at a time.

The technical issues of these features are presented in detail in the description of the design.

This paper is structured as follows. Section 2 gives an overview of the design and describes the various components of the multiprocessor debugger. Section 3 explains the asynchronous behaviour of GDB. Section 4 details the design of the single GDB session and its advantages. Section 5 explains the salient features of the debugger. Section 6 describes the design and implementation of the Eclipse CDT GUI. Section 7 presents a motivating use case for the multiprocessor debugger. Section 8 compares our work to that of similar proprietary debuggers. Section 9 concludes the paper.

2. DESIGN OVERVIEW

The framework for multiprocessor debugging comprises two executable components and two interfaces. The executable components are the Eclipse CDT GUI on one side and GDB on the other. The MI is the machine interface, which acts as a parser for GDB and as an output formatter and annotator for Eclipse.

Fig 2. Framework for the debugger

GDB is a widely used open-source debugger with in-built support for embedded debugging. GDB supports a wide range of features, such as:

• Breakpoints
• Single-stepping
• Support for multiple source languages
• Symbol handling
• Disassembly
• Remote debugging facility
• Multi-threaded support
• Scripting facility

Most of GDB's target manipulation passes through an abstraction known as the target vector. A target vector is similar in concept to a C++ class, with about 30-40 methods. GDB, being an open-source debugger, includes support for hundreds of targets. A multiprocessor system vendor just has to write their own target vector and plug it into GDB to use all the existing features of GDB.
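To make the "target vector as a class" analogy concrete, here is a minimal Python sketch. It is not GDB code: the class names and the three methods (`resume`, `wait`, `read_memory`) are hypothetical reductions of the 30-40 operations mentioned above, and the toy target is backed by a byte buffer rather than real hardware.

```python
from abc import ABC, abstractmethod


class TargetVector(ABC):
    """Hypothetical, heavily reduced stand-in for a target vector.

    Only three illustrative operations are modelled here.
    """

    @abstractmethod
    def resume(self):
        """Let the target run."""

    @abstractmethod
    def wait(self):
        """Block until the target reports an event and return it."""

    @abstractmethod
    def read_memory(self, addr, length):
        """Return `length` bytes of target memory starting at `addr`."""


class FakeDspTarget(TargetVector):
    """Toy vendor target that uses a bytearray instead of a real core."""

    def __init__(self, memory):
        self._memory = bytearray(memory)
        self._running = False

    def resume(self):
        self._running = True

    def wait(self):
        # Pretend the core stops immediately at a breakpoint.
        self._running = False
        return "breakpoint-hit"

    def read_memory(self, addr, length):
        return bytes(self._memory[addr:addr + length])


def run_to_stop(target):
    """Debugger-side code that only uses the abstract interface."""
    target.resume()
    return target.wait()
```

The point of the design is visible in `run_to_stop`: the debugger core never names a concrete target, so a vendor only supplies a new subclass.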
GDB/MI is a line-based, machine-oriented text interface to GDB. The Eclipse CDT communicates with GDB through the MI. The MI was added to GDB to provide a seamless and consistent interface to UIs. Hence the MI was chosen as the interface between GDB and Eclipse.

Eclipse [8] was initially developed as a Java IDE and has since been expanded into an IDE for C and C++ programs too. On Linux, the IDE has been developed to combine with GDB and gdbserver installations as debuggers for C programs. Being platform independent, the Eclipse IDE was thus the ideal choice for implementation of the GUI.

3. ASYNCHRONOUS OPERATION

Original GDB could only debug one processor at a time and hence it worked in synchronization with the execution of the program. So once the inferior began its execution, GDB had to wait till control returned to it, i.e. until the program stopped. The algorithm below explains the problem in existing GDB.

algorithm wait_for_inferior()
{
    while (1)
    {
        /* Wait for target to return */
        while (!target_wait())
            ;
        /* GDB handles the event raised by the inferior,
           like breakpoint hit, inferior started, inferior stopped */
        handle_inferior_event();
    }
}

So GDB was forced to stay in this loop and process the events of only one inferior. The following GDB session illustrates this limitation of GDB.

(gdb) file a.out
(gdb) r
Continuing……

Making GDB asynchronous required a complete reorganization of the wait_for_inferior structure. The new wait_for_inferior now periodically queries the processors for events and reports these events to the interface or to a target vector. Here is the design for the new wait_for_inferior.

/* Checks whether any of the targets has raised an event */
algorithm check_inferior()
{
    store current_processor in temp_processor;
    for each processor temp in processor_chain
    {
        /* Usually there is a spurt of communication between GDB and
           gdbserver after long gaps. So we try to increase the
           responsiveness by getting all the events in one go, until
           the inferior times out. */
        while (timeout does not occur)
        {
            /* Set the GDB state to that of the 'temp' processor */
            select_processor(temp);

            /* target_wait() returns the process/thread id (ptid) of the
               inferior if the inferior has raised an event. Else, if the
               inferior is executing, target_wait() returns a 'timeout' */
            ret_ptid = target_wait();

            if ret_ptid == valid ptid then
                /* Check the event and take appropriate action depending
                   on whether the event was a breakpoint hit, inferior
                   exit, signal, etc. */
                handle_inferior_event();

            if (control is with GDB)
                break;
        }
    }
}

4. ADVANTAGES OF A SINGLE SESSION OF GDB

A single session of GDB now needs to maintain the information of all the processors being debugged. Each processor may have a different architecture and ABI (Application Binary Interface).
Hence a single session of GDB had to be configured for more than one architecture.

GDB's target architecture defines what sort of machine-language programs GDB can work with, and how it works with them. GDB provides a mechanism for handling variations in OS ABIs. There are two major components in the OS ABI mechanism: sniffers and handlers. A sniffer examines a file matching a BFD architecture/flavour pair in an attempt to determine the OS ABI of that file. But GDB can only sniff those architectures with which it is configured. So we had to make sure that GDB was configured with multiple architectures. We have added a new target "i386-arm", and when configured with this target GDB can recognize i386 and arm binaries in any format (elf, coff, stabs, etc.).

[Figure: a single GDB session holds one context per processor, each with its own architecture and ABI, and exercises complete interprocessor control over Processor 1 through Processor N, each with local memory, and the shared memory.]
Fig 3. Single session of GDB debugging multiple processors

Since a single session can now seamlessly interact with all the processors, inter-processor control is possible. This allows the programmer to work with the multi-core system as a whole instead of dealing with each processor individually. Hence our debugger has features like lockstep, barrier breakpoints and continue/stop-all. It is possible to pause the entire system to examine the state of each core, preventing data from being processed (lost) while examining the state of the multi-core system.

5. FEATURES

The various features added to enable multiprocessor debugging can be enumerated as follows:

• Ability to maintain processor groups. This allows the programmer the flexibility to handle different processors as units. Processors can be added to or removed from groups at any time, thereby allowing the programmer to let some processors execute unaffected until a certain point and then include them in a group.

• The processors in a group can be run in lockstep. Instead of stepping individual processors separately, a group of processors can be stepped together, easing the job of the programmer.

• Barrier breakpoints help to debug synchronization problems. These breakpoints act as joins for the processors, as a processor in a barrier cannot proceed until all the processors have reached a specific point.

• We can also exercise inter-processor control by being able to stop all the processors in a group when one processor hits a breakpoint, or when a processor crashes.

• The programmer can also continue all processors in a group at the same time, or stop all the processors at almost the same time with very little skid. This is very useful because the programmer sometimes needs to stop all the processors instantaneously, which was not possible with separate sessions of GDB. Now the processors can be stopped with a single command and their state can be studied.
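As a rough illustration of processor groups, lockstep and continue/stop-all, the sketch below models a group as a collection of processors that are driven together. It is a minimal Python sketch and not the GDB implementation: `Processor`, `ProcessorGroup` and their methods are hypothetical names, and a program counter that simply increments stands in for a real single-step.

```python
class Processor:
    """Hypothetical stand-in for one debugged core."""

    def __init__(self, pid):
        self.pid = pid
        self.pc = 0            # fake program counter
        self.running = False

    def step(self):
        # A real debugger would single-step the target; here we just advance.
        self.pc += 1


class ProcessorGroup:
    """A set of processors that can be controlled as one unit."""

    def __init__(self, gid):
        self.gid = gid
        self.members = {}

    def add(self, proc):
        self.members[proc.pid] = proc

    def remove(self, pid):
        self.members.pop(pid, None)

    def lockstep(self):
        """Step every member exactly once, in pid order."""
        for pid in sorted(self.members):
            self.members[pid].step()

    def continue_all(self):
        for proc in self.members.values():
            proc.running = True

    def stop_all(self):
        for proc in self.members.values():
            proc.running = False
```

A processor that joins the group late simply starts taking part in later lockstep calls, which mirrors the idea of letting some processors run unaffected until a certain point.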
NO  COMMAND               DESCRIPTION
1   processor pid         Add a processor pid
2   group gid             Add a group gid
3   select-processor pid  Select processor pid as the current processor to work with
4   select-group gid      Select a processor group gid as the current group to work with
5   lockstep              Lockstep multiple processors in the current group
6   barrier 1.2, 2.4      Add a barrier breakpoint between {processor 1, breakpoint no. 2} and {processor 2, breakpoint no. 4}
7   continue-all          Continues all processors
8   stop-all              Stops all processors in the currently selected group
9   info processors       Get the status of all the processors
10  info groups           Get the status of the groups

Table 1. List of multiprocessor commands added to GDB
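The `barrier 1.2, 2.4` argument in Table 1 pairs each processor with one of its breakpoints. The Python sketch below shows one way such a specification could be parsed and how the join behaviour of a barrier breakpoint could be modelled. The function and class names are hypothetical, and re-arming the barrier after a release is an assumption of this sketch, not something the paper states.

```python
def parse_barrier_spec(spec):
    """Parse '1.2, 2.4' into [(1, 2), (2, 4)] as (processor, breakpoint) pairs."""
    pairs = []
    for item in spec.split(","):
        proc, bp = item.strip().split(".")
        pairs.append((int(proc), int(bp)))
    return pairs


class BarrierBreakpoint:
    """Holds processors at their breakpoints until all members have arrived."""

    def __init__(self, pairs):
        self.expected = set(pairs)
        self.arrived = set()

    def hit(self, proc, bp):
        """Record a breakpoint hit and return the processors to resume.

        An empty list means the processor must keep waiting.
        """
        if (proc, bp) not in self.expected:
            raise ValueError("breakpoint is not part of this barrier")
        self.arrived.add((proc, bp))
        if self.arrived == self.expected:
            released = sorted(p for p, _ in self.arrived)
            self.arrived.clear()   # assumption: the barrier re-arms itself
            return released
        return []
```

With `BarrierBreakpoint(parse_barrier_spec("1.2, 2.4"))`, processor 1 reaching its breakpoint 2 is held, and both processors are released only when processor 2 reaches its breakpoint 4.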
6. ECLIPSE CDT GUI

The main issues which had to be resolved within Eclipse to enable multi-processor debugging can be listed as:

1. Multiple debug launches should be made to correspond to a single GDB session.
2. Information about which processor the user is currently debugging should be conveyed to GDB.
3. Additional commands, which are added to GDB as part of multi-processor GDB, have to be added on the Eclipse side also.

6.1 Single session of GDB

Eclipse in its original form opened a virtual terminal for a debug launch and then exec-ed a GDB process on it. Each debug launch hence possessed a set of streams (error, log, input, output) which Eclipse tapped for communication between Eclipse and GDB. Since Eclipse uses versions of MI to communicate with GDB, a debug launch also possesses an MI session.

In the multi-processor GDB, however, we have only a single GDB process. This GDB process is "exec"-ed on a virtual terminal, as originally. However, multiple debug launches communicate with the same GDB process. Thus, Eclipse should identify the presence of a primary session and should not exec another GDB process but instead share the resources of the primary GDB process and session. Eclipse will now have to communicate with a single GDB process, which implies sharing all the above-mentioned streams. The original and modified launch frameworks are shown in figures 4 and 5 respectively.

[Figure: in the original launch, Launch 1 and Launch 2 each run the steps (1) send command, (2) wait, (3) get output, (4) process output, (5) create events, and each sends to and receives from its own GDB process.]
Fig 4. Normal Launch of Eclipse

Eclipse implements the original model with two threads running concurrently: the receiving thread (rxThread) and the transmitting thread (txThread). This model has now been modified by implementing the framework as in figure 5, which runs three threads concurrently.
The three threads of the modified model are the receiving thread (rxThread), the transmitting thread (txThread) and the multi-processor streamer (multiProcStreamer).

[Figure: in the multi-processor launch, Launch 1 and Launch 2 each run the same five steps (send command, wait, get output, process output, create events). Each launch sends the commands for its processor (Proc 1, Proc 2) to the single GDB process, and the multi-processor streamer passes GDB's output for Proc 1 and Proc 2 back to the launches.]
Fig 5. Multiprocessor Launch of Eclipse

The txThread checks for any commands to be sent to GDB and, if present, sends them to GDB.

The multiProcStreamer reads the output given by GDB line by line and sends it for processing to the appropriate rxThread. How the appropriate rxThread is chosen is explained in section 6.2.

The rxThread originally did what the multiProcStreamer now does: it read the output of GDB and sent it for processing, after which events are generated to be reflected to the user. Now, the rxThread gets the output from the multiProcStreamer and sends it for processing.

Further, a processor with a unique ID is created with every launch and the corresponding MISession is given the same ID. Thus, the rxThread, txThread and multiProcStreamer which belong to the MISession know the ID of the processor which they are helping to debug.

6.2 Token Management

Every MI command is preceded by a number, the token, which identifies the command. For effective multi-processor debugging, we have utilized this token to indicate the current processor/group which the user is debugging. In the GDB implementation, the token is a 32-bit integer. Hence, in the original configuration, a total of 2^32 commands can be sent before the token wraps around.

The token format we propose and implement for multi-processor debugging is:

Bit 31    Bits 30-16    Bits 15-0
g/p!      p/g id        cmd token

Where,
g/p!: if 1, the user is debugging a processor group gid; if 0, the user is debugging a processor pid.
p/g id: id of the processor/group which is currently being debugged.
cmd token: token representing the current command.

Thus, with this new configuration, the token wraps around every 2^16 commands. However, the debug configuration remains unaffected by this, as Eclipse and GDB are both independent of commands once their response is received.

As mentioned in the above section, the multiProcStreamer checks the token for its p/g id. The command response/output which is received by the multiProcStreamer is then forwarded to the rxThread which is part of the MISession whose id is the same as that of the GDB output.

6.3 Additional Commands

The following commands have been added for effective multi-processor debugging:
Command                Corresponding MI Command
Lockstep over          -exec-next(1)
Lockstep into          -exec-step(1)
Continue All           -exec-continue
Barrier Insert         -barrier-insert(procs, lines)
Barrier Remove         -barrier-remove(procs, lines)
Add to Group           -group(proc, group)
Processor              -processor(proc, group)
Script                 -script(path)
File Exec and Symbols  -file-exec-and-symbols(path)

7. MOTIVATING EXAMPLES

Most concurrent programming problems can be attributed to a lack of proper synchronization in the access of shared resources (for example, CPU and bus cycles, memory, and various devices). The problems are manifested in the form of data corruptions, race conditions, deadlocks, stalls, and starvation.

[Figure: a single session of GDB is connected by a serial link and a TCP/IP link to the RISC processor(s) and DSP processor(s), which communicate through IPC and shared memory.]
Fig 6. An example of multicore debugging

Consider the above example, which is prominent in many embedded systems. Generally the RISC processors do the system management while the DSPs perform the calculation-intensive part [9]. The RISCs put the data to be processed in the shared memory and resume their processing. During this time the DSPs process the data and upon completion notify the RISC. There are generally shared IPC (Inter Processor Communication) libraries between the RISCs and DSPs [6]. The debugging of such concurrent environments can be better tackled by this debugger.

In order to showcase the utility of our multiprocessor debugger, we have created the following usecase. We have created a .RAW to .PGM converter as a proof of concept. We have used QEMU [7] as a simulator for the RISC processor. We run a process under QEMU which does the file management and system administration. The DSP processor is simulated by the native x86 processor. The process which runs natively on x86 does the calculations required for the decoding. We have simulated inter-processor communication between the two processors by way of semaphores.

8. RELATED WORK

Each multiprocessor vendor has their own debugger, but these debuggers are proprietary. Cradle has a multiprocessor debugger called Inspector [10], but it is designed to support only the features of Cradle's processors. Hence it cannot be used by any other entity working in the embedded systems field, and neither can it be extended. TotalView is a multiprocess debugger developed by Etnus [5]. It can be used for distributed concurrent debugging but it is proprietary and expensive. Also, it cannot be used in heterogeneous environments. ARM's RealView Developer Suite has a MULTI debugger which can simultaneously develop and debug applications on a system with multiple ARM cores or an ARM core plus a DSP core, within the same debug environment. It is a proprietary product of ARM and does not include support for other processors. GDB, on the other hand, includes support for ARM, x86, MIPS, PowerPC, SPARC, and a lot more. Hence our debugger can be used out of the box for all these processors.

9. CONCLUSION

This paper presents the design and implementation of a heterogeneous multiprocessor debugger. The work is based on the paradigm of concurrent debugging. The paper also describes the extensions implemented in GDB and Eclipse to enable multiprocessor debugging. Sample implementations of the usecases for this debugger have shown its adequacy for debugging multiple processors. Our debugger provides a framework for seamless integration of new targets while simultaneously providing extensibility hitherto unavailable.
REFERENCES

[1] Stan Shebs. "The Future of GDB". www.redhat.com/support/wpapers/cygnus/cygnus_gdb/
[2] www.gnu.org/softwares/gdb
[3] Daniel Shulz. "Thread Aware-Debugging".
[4] Amar Shan. "Heterogeneous Processing: a Strategy for Augmenting Moore's Law". www.linuxjournal.com
[5] http://www.etnus.com/
[6] DSP gateway
[7] Fabrice Bellard. "QEMU, a Fast and Portable Dynamic Translator". In USENIX Conference 2005
[8] www.eclipse.com
[9] http://focus.ti.com/omap/docs/omaphomepage.tsp
[10] www.cradle.com/products/rds_sdk.shtml