Preface
Read This First
About This Manual
This manual is a reference for programming TMS320C6000 digital signal
processor (DSP) devices.
Before you use this book, you should install your code generation and
debugging tools.
This book is organized in five major parts:
- Part I: Introduction includes a brief description of the ’C6000 architecture
and code development flow. It also includes a tutorial that introduces you
to the tools you will use in each phase of development and an optimization
checklist to help you achieve optimal performance from your code.
- Part II: C Code includes C code examples and discusses optimization
methods for the code. This information can help you choose the most
appropriate optimization techniques for your code.
- Part III: Assembly Code describes the structure of assembly code. It
provides examples and discusses optimizations for assembly code. It also
includes a chapter on interrupt subroutines.
- Part IV: C64x Programming Techniques describes programming
considerations for the C64x.
Related Documentation From Texas Instruments
The following books describe the TMS320C6000 devices and related support
tools. To obtain a copy of any of these TI documents, call the Texas Instruments
Literature Response Center at (800) 477-8924. When ordering, please
identify the book by its title and literature number.
TMS320C6000 Assembly Language Tools User’s Guide (literature number
SPRU186) describes the assembly language tools (assembler, linker,
and other tools used to develop assembly language code), assembler
directives, macros, common object file format, and symbolic debugging
directives for the ’C6000 generation of devices.
TMS320C6000 Optimizing C Compiler User’s Guide (literature number
SPRU187) describes the ’C6000 C compiler and the assembly optimizer.
This C compiler accepts ANSI standard C source code and produces
assembly language source code for the ’C6000 generation of devices. The
assembly optimizer helps you optimize your assembly code.
TMS320C6000 CPU and Instruction Set Reference Guide (literature
number SPRU189) describes the ’C6000 CPU architecture, instruction
set, pipeline, and interrupts for these digital signal processors.
TMS320C6000 Peripherals Reference Guide (literature number SPRU190)
describes common peripherals available on the TMS320C6201/6701
digital signal processors. This book includes information on the internal
data and program memories, the external memory interface (EMIF), the
host port interface (HPI), multichannel buffered serial ports (McBSPs),
direct memory access (DMA), enhanced DMA (EDMA), expansion bus,
clocking and phase-locked loop (PLL), and the power-down modes.
TMS320C64x Technical Overview (literature number SPRU395) gives an
introduction to the ’C64x digital signal processor and discusses the
application areas that are enhanced by the ’C64x VelociTI.
Trademarks
Solaris and SunOS are trademarks of Sun Microsystems, Inc.
VelociTI is a trademark of Texas Instruments Incorporated.
Windows and Windows NT are trademarks of Microsoft Corporation.
The Texas Instruments logo and Texas Instruments are registered trademarks
of Texas Instruments Incorporated. Trademarks of Texas Instruments include:
TI, XDS, Code Composer, Code Composer Studio, TMS320, TMS320C6000
and 320 Hotline On-line.
All other brand or product names are trademarks or registered trademarks of
their respective companies or organizations.
Chapter 1
Introduction
This chapter introduces some features of the C6000 microprocessor and
discusses the basic process for creating code and understanding feedback.
Any reference to C6000 pertains to the C62x (fixed-point), C64x (fixed-point),
C64x+ (fixed-point), and C67x (floating-point) devices. Though most
of the examples shown are fixed-point specific, all techniques are applicable
to each device.
Topic
1.1 TMS320C6000 Architecture
1.2 TMS320C6000 Pipeline
1.3 Code Development Flow to Increase Performance
1.1 TMS320C6000 Architecture
The C62x is a fixed-point digital signal processor (DSP) and is the first DSP
to use the VelociTI architecture. VelociTI is a high-performance, advanced
very-long-instruction-word (VLIW) architecture, making it an excellent choice
for multichannel, multifunctional, and performance-driven applications.
The C67x is a floating-point DSP with the same features. It is the second DSP
to use the VelociTI architecture.
The C64x is a fixed-point DSP with the same features. It is the third DSP to use
the VelociTI architecture.
The C6000 DSPs are based on the C6000 CPU, which consists of:
- Program fetch unit
- Instruction dispatch unit
- Instruction decode unit
- Two data paths, each with four functional units
- Thirty-two 32-bit registers (C62x and C67x)
- Sixty-four 32-bit registers (C64x and C64x+)
- Control registers
- Control logic
- Test, emulation, and interrupt logic
1.2 TMS320C6000 Pipeline
The C6000 pipeline has several features that provide optimum performance,
low cost, and simple programming.
- Increased pipelining eliminates traditional architectural bottlenecks in
program fetch, data access, and multiply operations.
- Pipeline control is simplified by eliminating pipeline locks.
- The pipeline can dispatch eight parallel instructions every cycle.
- Parallel instructions proceed simultaneously through the same pipeline
phases.
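As a rough illustration of the figures above, two data paths with four
functional units each give eight units, which matches the eight instructions
the pipeline can dispatch per cycle. The sketch below turns that into a peak
instruction rate. The function names are hypothetical, and any clock
frequency passed in is the caller's assumption, not a figure from this
chapter.

```c
#include <assert.h>

/* Figures taken from the CPU description in this chapter. */
enum {
    DATA_PATHS     = 2,  /* two data paths */
    UNITS_PER_PATH = 4   /* four functional units per path */
};

/* Peak instructions dispatched per cycle: one per functional unit. */
unsigned peak_instructions_per_cycle(void)
{
    return DATA_PATHS * UNITS_PER_PATH;
}

/*
 * Peak millions of instructions per second for a given clock.
 * clock_mhz is an assumed value supplied by the caller.
 */
unsigned long peak_mips(unsigned long clock_mhz)
{
    assert(clock_mhz > 0);
    return clock_mhz * peak_instructions_per_cycle();
}
```

With an assumed 200 MHz clock, peak_mips(200) evaluates to 1600. This is an
upper bound that is reached only when every functional unit has useful work
in every cycle.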
1.3 Code Development Flow to Increase Performance
Traditional development flows in the DSP industry have involved validating a
C model for correctness on a host PC or Unix workstation and then
painstakingly porting that C code to hand-coded DSP assembly language. This
is both time-consuming and error-prone, and the resulting assembly code is
difficult to maintain over several projects.
The recommended code development flow uses the C6000 code generation tools
to aid in optimization rather than forcing you to code by hand in assembly.
The compiler does all the laborious work of instruction selection,
parallelizing, pipelining, and register allocation, which allows you to
focus on getting the product to market quickly. This flow also simplifies
the maintenance of the code, as everything resides in a C framework that is
simple to maintain, support, and upgrade.
The recommended code development flow for the C6000 involves the phases
described below. The tutorial section of this programmer’s guide focuses on
phases 1-3 and shows you when to move to the tuning stage of phase 3. The
tutorial also shows the importance of giving the compiler enough information
to reach its full potential. An added advantage is that the compiler provides
direct feedback on the entire program’s high-MIPS areas (loops). Based on
this feedback, you can take some very simple steps to pass more complete
information to the compiler, which gives you a quicker start in maximizing
compiler performance.
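One common way to pass more information to the compiler is to state that
pointers do not alias. The sketch below uses the standard C99 restrict
qualifier for this. It is an illustrative assumption, not necessarily the
specific technique the tutorial walks through, and the function name vec_add
is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Element-wise sum of two arrays.
 * restrict promises the compiler that out, a, and b never overlap, so
 * loads from a and b for later iterations may be scheduled ahead of the
 * stores to out for earlier iterations.
 */
void vec_add(short *restrict out,
             const short *restrict a,
             const short *restrict b,
             size_t n)
{
    assert(out != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = (short)(a[i] + b[i]);
    }
}
```

The promise is the caller's responsibility: calling vec_add with overlapping
arrays is undefined behavior.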
You can achieve the best performance from your C6000 code if you follow this
code development flow when you are writing and debugging your code:
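Phase 1: Develop C Code
- Write C code, compile, and profile.
- Efficient? Yes: complete. No: refine C code (phase 2).

Phase 2: Refine C Code
- Refine C code, compile, and profile.
- Efficient? Yes: complete. No: consider more C optimization.
- More C optimization? Yes: refine C code again. No: write linear
  assembly (phase 3).

Phase 3: Write Linear Assembly
- Write linear assembly, assembly optimize, and profile.
- Efficient? Yes: complete. No: revise the linear assembly.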
Table 1-1 lists the phases in the 3-step software development flow shown
above, and the goal for each phase:

Table 1-1. Three Phases of Software Development

Phase  Goal
-----  ----------------------------------------------------------------
1      You can develop your C code for phase 1 without any knowledge of
       the C6000. Use the C6000 profiling tools that are described in the
       Code Composer Studio User’s Guide to identify any inefficient areas
       that you might have in your C code. To improve the performance of
       your code, proceed to phase 2.
2      Use techniques described in this book to improve your C code. Use
       the C6000 profiling tools to check its performance. If your code is
       still not as efficient as you would like it to be, proceed to
       phase 3.
3      Extract the time-critical areas from your C code and rewrite the
       code in linear assembly. You can use the assembly optimizer to
       optimize this code.
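The compile-and-profile loop in phases 1 and 2 can be rehearsed on a host
machine before the target tools are involved. The sketch below times a
candidate function with the standard C clock() call. It is a stand-in for
illustration only, not the Code Composer Studio profiler, and the names
sum_of_squares and time_kernel are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical kernel under test: sum of squares of n samples. */
long sum_of_squares(const short *x, size_t n)
{
    long acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += (long)x[i] * x[i];
    }
    return acc;
}

/*
 * Run the kernel reps times and return the elapsed CPU time in seconds.
 * The most recent result is stored through result so the caller can
 * check correctness alongside the timing.
 */
double time_kernel(const short *x, size_t n, unsigned reps, long *result)
{
    assert(x != NULL && result != NULL);
    clock_t start = clock();
    for (unsigned r = 0; r < reps; r++) {
        *result = sum_of_squares(x, n);
    }
    clock_t stop = clock();
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Host timings only indicate which functions dominate run time. Cycle counts
on the device still have to come from the target profiling tools.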
Because most of the millions of instructions per second (MIPS) in DSP
applications occur in tight loops, it is important for the C6000 code generation
tools to make maximal use of all the hardware resources in important loops.
Fortunately, loops inherently have more parallelism than non-looping code
because there are multiple iterations of the same code executing with limited
dependencies between each iteration. Through a technique called software
pipelining, the C6000 code generation tools use the multiple resources of the
VelociTI architecture efficiently and obtain very high performance.
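The idea behind software pipelining can be sketched in plain C. In the first
function below, iterations are independent apart from the running sum, which
is the kind of loop the tools can overlap. The second function is a
hand-written illustration of that overlap: the loads for iteration i+1 are
issued alongside the multiply-accumulate for iteration i, with a prolog and
an epilog around the steady-state kernel. This is an assumption about the
general shape of a pipelined schedule, not output from the C6000 tools, and
both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward dot product: load, multiply, accumulate per iteration. */
int dot_simple(const short *a, const short *b, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* Same result, restructured so the loads run one iteration ahead. */
int dot_pipelined(const short *a, const short *b, size_t n)
{
    assert(a != NULL && b != NULL);
    if (n == 0) {
        return 0;
    }

    int sum = 0;

    /* Prolog: issue the loads for iteration 0. */
    short x = a[0];
    short y = b[0];

    /* Kernel: load iteration i+1 while computing iteration i. */
    for (size_t i = 0; i + 1 < n; i++) {
        short next_x = a[i + 1];
        short next_y = b[i + 1];
        sum += x * y;
        x = next_x;
        y = next_y;
    }

    /* Epilog: finish the last iteration, whose loads are already done. */
    sum += x * y;
    return sum;
}
```

On the device, the overlapped operations would execute on different
functional units in the same cycle. In C, the restructuring only makes the
dependence pattern visible.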
This chapter shows the code development flow recommended to achieve the
highest performance on loops and provides a feedback list, with references
to more detailed documentation, that can be used to optimize loops.