IMRT

9.17 Intensity-Modulated Radiation Therapy Planning
AL Boyer
J Unkelbach, Harvard Medical School, Boston, MA, USA
© 2014 Elsevier B.V. All rights reserved.
9.17.1 The Concept of Intensity-Modulated Radiation Therapy
9.17.1.1 Prerequisites for the Development of Intensity-Modulated Radiation Therapy
9.17.1.2 Rationale for IMRT: Concave Target Volumes
9.17.1.3 Advantage of IMRT over 3D Conformal Techniques
9.17.1.4 Historical Perspective
9.17.2 Optimization of Fluence Distributions
9.17.2.1 Dose-Deposition Matrix
9.17.2.2 IMRT Planning: A Step-by-Step Demonstration
9.17.2.2.1 Initialization and input
9.17.2.2.2 Formulation as an optimization problem
9.17.2.2.3 Solution to the IMRT problem: the optimal treatment plan
9.17.2.2.4 Assessing trade-offs
9.17.2.3 The IMRT Optimization Problem
9.17.2.3.1 Dose-volume effects
9.17.2.3.2 Use of clinical outcome models in IMRT optimization
9.17.2.3.3 Further remarks
9.17.2.4 Optimization Algorithms
9.17.2.4.1 Visualization of the FMO problem
9.17.2.4.2 Gradient descent
9.17.2.4.3 Including second derivatives
9.17.3 The Means to Deliver Optimized Fluence Distributions
9.17.3.1 SMLC or Step-and-Shoot Delivery
9.17.3.1.1 Basic leaf-pair algorithm
9.17.3.1.2 Logarithmic direct aperture decomposition
9.17.3.1.3 Matrix inversion
9.17.3.2 DMLC Delivery
9.17.3.2.1 Leaf-pair speed optimization
9.17.3.2.2 Special quality assurance
9.17.3.3 Dosimetry of the End of the Leaf
9.17.3.4 Practical Dosimetry Considerations
9.17.4 Direct Aperture Optimization
9.17.4.1 Local Leaf Position Optimization
9.17.4.1.1 Approximate dose calculation
9.17.4.1.2 Optimizing leaf positions
9.17.4.2 Aperture Generation Methods
9.17.4.2.1 Generating new apertures
9.17.4.2.2 Solving the pricing problem
9.17.4.3 Extensions
9.17.4.3.1 Integration of improved dose calculation
9.17.4.3.2 Hybrid methods and extensions
9.17.4.3.3 Generalization to constrained optimization
9.17.5 Multicriteria Planning Methods
9.17.5.1 Prioritized Optimization
9.17.5.2 Interactive Pareto-Surface Navigation Methods
9.17.5.2.1 Pareto optimality
9.17.5.2.2 Navigating the Pareto surface
9.17.5.2.3 Approximating the Pareto surface
9.17.5.2.4 Remarks
Comprehensive Biomedical Physics http://dx.doi.org/10.1016/B978-0-444-53632-7.00914-X

9.17.6 Clinical Application of IMRT
9.17.6.1 Prostate
9.17.6.2 Head and Neck
9.17.6.3 Other Sites
9.17.6.4 Comparison of IMRT versus 3D-CRT
9.17.6.5 Quality Assurance
References
Glossary
Beamlet/bixel Refers to a narrow beam segment of an
incident radiation beam.
Fluence map Refers to the discretized version of the lateral
fluence distribution of an incident radiation beam. The
fluence map specifies the intensity of all beamlets.
Objective function Is a mathematical function to quantify
clinical goals in an IMRT optimization problem.
Pareto surface Refers to the collection of all Pareto-optimal
IMRT treatment plans, that is, plans that cannot be
improved in one objective without worsening at least one
other objective.
9.17.1 The Concept of Intensity-Modulated
Radiation Therapy
The central problem for treating cancer with ionizing radiation
is finding a means to expose malignant cells to a tumoricidal
dose without exposing healthy tissue to a dose that will lead to
unacceptable damage. External beam teletherapy and inter-
nally administered brachytherapy can both be exploited to
this end. The well-known skin-sparing and moderate attenua-
tion properties of megavoltage (4–50 MV) x-rays have led to
their widespread use for treatment of tumors other than skin
cancer by teletherapy. The direction and collimation of the
x-ray beam are to be devised to optimize the dose to the target
tumor while protecting normal structures as much as possible
by collimation. If the tumor dose can be delivered at a suffi-
ciently high level without increasing the dose to normal tissue
to damaging levels, a medically useful probability of control of
tumors can be achieved without producing an unacceptable
risk of damage to normal tissues.
9.17.1.1 Prerequisites for the Development of
Intensity-Modulated Radiation Therapy
Two technical developments in the last quarter of the twentieth
century were prerequisites for advanced radiotherapy treatment
planning: computerized tomography (CT) and multileaf collimators
(MLCs). Three-dimensional tomographic imaging of
the patient using CT and the addition of MLCs to medical
linear accelerators made intensity-modulated radiation therapy
(IMRT) technically feasible. Prior to the general access to
CT scanners, teletherapy was planned as treatment fields hav-
ing two-dimensional shapes that could be determined by mea-
suring the anatomy of the patient visible on projection
radiographs. Since these anatomical x-ray shadows were them-
selves projections through the three-dimensional shapes of
the patient’s organs, the locations of invaginations and con-
cave surfaces could not be determined. Even if they could be,
there was no way to cause dose distributions to conform to
these features of three-dimensional target volume surfaces. The
image data sets acquired by CT scanners capable of volume
acquisition are in essence three-dimensional digital models of
patient anatomy, an essential foundation for physical model-
ing of dose delivered by radiotherapy beams. The development
of advanced treatment planning computer systems coupled
with widespread access to fast CT scanners enabled the inves-
tigation of more effective radiotherapy treatment techniques.
These assets led to the development of three-dimensional con-
formal radiotherapy (3D-CRT), techniques that optimize
shaped collimation of multiple fields such that the relatively
uniform dose delivered by each field is confined to the projec-
tion of the target tumor in the direction of each selected x-ray
cone emanating from the treatment machine (Webb, 1993).
Although 3D-CRT is a major improvement on the two-
dimensional treatment planning that preceded it, radiation
oncologists still found that it did not provide the degree of
control over the deposition of radiation that they needed.
9.17.1.2 Rationale for IMRT: Concave Target Volumes
A classic example occurs with treating the prostate (see Figure 1).
The prostate lies in the midplane of the pelvis beneath the
bladder, between the symphysis pubis and the anterior rectal
wall. The seminal vesicles and often the lateral lobes of the
prostate can form a concave volume into which the convex
anterior wall of the rectum fits. Alternatively, in certain patients,
the rectum can extend around the prostate forming a pocket in
the anterior rectal wall in which the prostate fits. Only millime-
ters of tissue separate the malignant glandular acini in the
interior of the prostate from the radiosensitive lining of the
rectum. The development of tools to plan 3D-CRT treatments
enabled radiation oncologists to visualize the target tumors
relative to isodose surfaces in three dimensions (see Figure 2).
The dilemma that the anatomy of the prostate presents can be
illustrated by the problem faced by radiation oncologists
attempting to utilize 3D-CRT to investigate escalation of dose
to early-stage prostate cancer during the last decade of the twen-
tieth century. It was soon appreciated that in order to realize the
goal of increasing doses to tumor volumes while reducing or
keeping constant doses to the radiosensitive normal rectal wall,
some means were needed to cause the distribution of dose to be
shaped around the convex or concave anterior surface of the
rectum. Exploring beam directions and relative intensity
weights, manually optimized by experienced treatment plan-
ners, failed to find a way to avoid overdosing the anterior rectal
wall in order to achieve the high doses to the prostate that the
radiation oncologists were wishing to investigate. Similar situa-
tions abound at other cancer treatment sites. Classical applica-
tions of IMRT include paraspinal tumor geometries and cancers
in the head and neck region. Paraspinal tumors, where the target
volume surrounds the spinal cord that is to be spared from
irradiation, are used to illustrate IMRT planning in
Section 9.17.2.
9.17.1.3 Advantage of IMRT over 3D Conformal Techniques
IMRT refers to radiotherapy delivery methods for which the
fluence distribution in the plane perpendicular to the incident
beam direction is modulated. IMRT can deliver distributions of
dose that flow into concavities and contract from convexities.
Distributions can be made to exhibit diminutions of dose
within the interior of a higher dose volume. Even though
there are limits to the amplitude of these dose modulations,
this feature of IMRT carries a distinct advantage over 3D-CRT in
most instances. The capability of avoiding delivering full doses
to uninvolved sensitive structures (organs at risk or OARs) near
target volumes requiring high doses is arguably the major
advantage of IMRT. Radiation oncologists were quick to inves-
tigate and exploit IMRT for the treatment of cancer of the
prostate. Figure 3 illustrates the advantage of IMRT over 3D-
CRT in a prostate treatment site. The dose delivered by IMRT to
the convex anterior rectal wall and to the base of the bladder can be
controlled to the extent that the dose to the prostate can be
raised to levels that would risk perforations of the rectal wall if
attempted with 3D-CRT.
Since OARs can be made to receive lower doses with IMRT than
with 3D-CRT, it is possible to increase the dose per daily fraction
with IMRT. This ‘accelerated’ pace of dose delivery leads to a
greater probability of tumor cell kill. Modest increases in dose
per fraction can produce benefits worth exploiting.
A secondary advantage is the automated nature of the plan-
ning process with IMRT. A highly experienced planner may be
able to devise complex treatment field arrangements (employ-
ing field-in-field techniques and selecting unique beam angles)
that compete with IMRT plans. But time and training efforts are
required to produce such mastery. There are few opportunities
for quantifying the production of such plans, and the vagaries of
an artistic skill lead to inconsistent results. The IMRT process
lends itself to the use of mathematical optimization techniques,
which automate and optimize the design of incident radiation
beams and thereby consistently produce plans of high quality.
Figure 2 Three-dimensional conformal radiotherapy (3D-CRT)
visualization of anatomical structures relative to isodose surface for a
prostate treatment plan. The bladder (B in blue) and the rectum (R in
green) are visualized using a wire-frame rendering. The prostate (P in red)
and seminal vesicles (S in white) are rendered as solid surfaces. The
pose is similar to Figure 1. A wire frame covers the surface corresponding
to a dose that is 95% as great as the maximum dose within the
volume. The treatment strategy was composed of four large fields to treat
involved lymph nodes followed by smaller fields to treat the prostate
alone. The posterior 95% isodose surface penetrates into the anterior wall
of the rectum.
Figure 1 A sagittal section through the male pelvis illustrating the close
proximity of the prostate (P) to the bladder (B) and the rectum (R).
The anterior rectal wall is covered in the lateral projection by the lateral
lobes of the prostate and the seminal vesicles.
9.17.1.4 Historical Perspective
The development of IMRT occurred over the last decade of the
twentieth century and the first decade of the twenty-first cen-
tury (Bortfeld, 2006; Webb, 2003). As with most successful
modern technologies, after its initial development, additional
refinements and improvements have continued up to the pre-
sent. IMRT was developed by an international collection of
medical physicists from many radiation oncology centers.
Anders Brahme shared his early thoughts on the subject
through publications and symposia (Brahme et al., 1982;
Lind and Brahme, 1985), an example being the Workshop on
Developments in Dose Planning and Treatment Optimization
at the Karolinska Institute in Stockholm, Sweden, in 1991.
Ideas and information were exchanged through formal pre-
sentations and informal face-to-face discussions at the annual
meetings of the major medical physics and radiation oncology
professional societies, such as the European Society for Radi-
otherapy & Oncology (ESTRO), the American Society for
Radiation Oncology (ASTRO), and the American Association
of Physicists in Medicine (AAPM), and through their associa-
tion journals. The willingness of individual investigators to
share their work candidly with their colleagues, to exchange
criticisms, and to maintain friendly rivalries was the driving
force contributing to the rapid growth and sophistication of
the technology. The British government support of the Royal
Marsden Hospital Joint Department of Physics and Institute of
Cancer Research in Sutton, Surrey, United Kingdom
(Webb, Convery, and Rosenbloom); the German support of
the Deutsches Krebsforschungszentrum (DKFZ) in Heidelberg,
Germany (Schlegel, Bortfeld, and Stein); and the support of
multiple investigators in the United States by the National
Cancer Institute (NCI) poured millions of dollars into the
effort. Leaders of commercial entities (notably Varian, Sie-
mens, Elekta, and the NOMOS Corporation) had the foresight
to invest substantial commercial funding, including grants to
academic investigators, into the development of the MLC hard-
ware and IMRT treatment planning software. These webs of
relationships make it difficult to lay out a single linear chro-
nology of the development of ideas and key demonstrations of
the technology.
The history of NCI grant R01-CA43840 (Arthur Boyer, PI)
can be used as an example. The application was written in
September and submitted to the NCI by the MD Anderson
Cancer Center (MDACC) in Houston, TX, in October 1991. It
was awarded for 5 years with a start date of 1 July 1992. The
three research objectives were to explore the development of
three tools:
1. Conformal optimization tools. Optimal fluence distributions
for beams from fixed directions were to be developed as had
been proposed in the inverse planning work of Brahme
(1988a,b). Dose-volume histograms were to be used as a
means of control and evaluation of the optimization. Opti-
mization using simulated annealing, as had been investi-
gated by Webb (1989), was to be compared with ‘beam
ensemble’ optimization proposed by Brahme (Källman
et al., 1988), and techniques were to be borrowed from
CT reconstruction (Bortfeld et al., 1990). The work by
Censor et al. (1988) was referenced. A biological objec-
tive function was to be considered as an alternative.
Section 9.17.2 reviews the foundations of optimizing flu-
ence distributions.
2. Dynamic MLC (DMLC) compensation tools. A sequence of
radiation exposures made with stationary leaf positions was
to be developed to deliver the fluence distributions com-
puted using the first tool (see Section 9.17.3). Film dosim-
etry was to be employed to verify the delivery of the dose
distributions. The proposal for MLC delivery referenced the
earlier work by Brahme (Lind and Brahme, 1987).
Figure 3 Rendering of relative dose by color on the surfaces of the bladder (B), prostate (P), and rectum (R) viewed from the patient’s right side while the
patient is lying on their back. Anterior is toward the top of the figures and posterior is toward the bottom. Shades of red on the bladder and rectum
indicate increasing levels of dose. The right image is rendered from a 3D-CRT plan. The image on the left is rendered from an intensity-modulated
radiotherapy (IMRT) plan for the same patient. In both cases, the uniform red coloring of the prostate surface demonstrates that the target would be uniformly
treated. The reduction in red in the IMRT rendering on the anterior rectal wall and the inferior surface of the bladder compared with the 3D-CRT rendering
demonstrates the calculated reduction of dose to these organs at risk (OAR) using IMRT. Figure reprinted with permission from Varian Medical Systems.

3. Electronic Portal Imaging Device (EPID) field verification tools.
Patient position and the MLC treatment sequences were to
be verified using EPID images. Correlations of image fea-
tures were to use Fourier transform-based correlation.
Clearly, these concepts had not been developed in a vacuum.
The proposal wove together threads of existing ideas along with a
few innovations into a yarn that strung together the whole treat-
ment planning, delivery, and verification process. The reviewers
were convinced by the preliminary data in the application that
the investigative team could carry the project through. The arrival
of Thomas Bortfeld at the MDACC in 1992 within weeks of the
beginning of the project, as a postdoctoral fellow, contributed
inestimably to the early success of the effort. Within 9 months, a
three-dimensional inverse planning algorithm had been derived
from his earlier thesis work in two dimensions, the step-and-
shoot sweeping window algorithm had been refined and dem-
onstrated with a clinical MLC newly installed at the MDACC
(Bortfeld et al., 1994), and dose distributions had been delivered
to a film phantom that proved the feasibility of producing three-
dimensional dose volumes bounded by concave surfaces
(Bortfeld et al., 1994). An agreement was brokered with Clif
Ling, the chair of the physics department, and Radhe Mohan,
the chief of the excellent software development group at the
Memorial Sloan Kettering Cancer Center (MSKCC), that Bortfeld
would tarry in New York in mid-1993 on his way back to
Germany long enough to share the software that had been devel-
oped in Houston. The MSKCC group delivered the first clinical
IMRT treatment using this form of the technology to a prostate
cancer patient (Ling et al., 1996) in 1995. The same year (1995),
the grant was transferred from MDACC to Stanford University
where the PI had accepted an appointment as the director of the
Radiation Physics Division of the Department of Radiation
Oncology. The division was collaborating with the Department
of Neurosurgery at Stanford to develop the Cyberknife Robotic
Radiosurgery System and had just recently treated the first patient
with this device. Soon, postdoctoral fellows and staff funded by
the grant were working on both robotic and cone-beam delivery.
The cone-beam IMRT development was advanced by collabora-
tion with the NOMOS Corporation. Bruce Curran had moved
from academia to industry to work on the implementation of a
cone-beam optimizer and MLC sequence composer within the
NOMOS planning system. Curran installed a prototype treat-
ment planning system at Stanford in 1996 with the able assis-
tance of Stanford faculty physicist Lei Xing (Xing et al., 1999).
The first patient to receive IMRT treatments with the cone-beam
approach using commercial assets received their initial treatment
at Stanford on 11 November 1997. The procedure required
12 min. That the audacious objectives of the grant would be
realized to the extent that patients would be treated by the end
of the last year of the award can only be attributed to the industry
and ingenuity of the medical physicists directly and indirectly
involved. This example demonstrates how medical physicists
from the DKFZ, MSKCC, MDACC, and Stanford worked together
without institutionally initiated formal prearrangements, shared
information and critical software, carried out key clinical devel-
opments, and worked with industry to make IMRT a viable
medical tool. But this example is only a few strands of the global
web of medical physicists who made invaluable contributions.
The list of the many other physicists working on IMRT and their
accomplishments could fill the rest of this chapter. At the risk of
seeming to overlook these worthies, the reader is referred to more
extensive historical discussions (Webb, 2001).
9.17.2 Optimization of Fluence Distributions
IMRT refers to radiotherapy delivery methods for which the
fluence distribution in the plane perpendicular to the incident
beam direction is modulated. To that end, we assume that the
radiation beam is divided into small beam segments, which are
in principle deliverable by an MLC. The lateral fluence distribution
of the beam is thereby discretized into small elements,
which are commonly referred to as beamlets or bixels (see
Figures 4 and 5 for an illustration). A beamlet is a pyramidal
radiation cone whose apex is at the center of the x-ray source
(bremsstrahlung x-ray target) and whose base is a rectangle
(see Section 9.17.3.1 for details). For an MLC with 1 cm leaf
width, the fluence distribution is represented by the intensities
of 1 × 1 cm beamlets. Modern MLCs with a smaller
leaf width often allow for a finer discretization into 5 × 5 mm
beamlets. The discrete representation of the fluence is
commonly referred to as the fluence map.
Figure 4 An opposed pair of multileaf collimator (MLC) leaves (gray)
are driven by electric motors. Their positions are encoded as well. They
form a gap between their ends within which an integer number of
beamlets (yellow) is delivered along their motion path (blue). The length
of the beamlet in the direction of leaf travel is a parameter selectable
for the treatment planning system. The width of the beamlet
(perpendicular to leaf motion) is determined by the leaf width. Note the
curved leaf ends and the tongues and grooves on the sides of the leaves.
Figure 5 Schematic illustration of the beamlet and dose-deposition
matrix concepts in IMRT. The incident radiation beam is divided
into beamlets; the dose-deposition matrix stores the dose contribution of
each beamlet (bixel j) to each voxel (voxel i) in the patient.
In this section, we discuss the concepts and methods to
determine the intensity of each beamlet. This problem is referred
to as the fluence map optimization (FMO) problem. To that
end, we first introduce the concept of the dose-deposition
matrix, which relates the beamlet intensities to the dose distri-
bution in the patient (Section 9.17.2.1). In Section 9.17.2.2,
the concept of IMRT planning will be demonstrated step by step,
using a paraspinal tumor case as an example. The goal of FMO is
to determine the beamlet intensities in such a way that the
chance of tumor cure is maximized, while the probability of
severe normal tissue complications is minimized. We will see
how this notion is translated into mathematical terms by
formulating IMRT planning as a mathematical optimization
problem (Section 9.17.2.3). Section 9.17.2.4 will provide an
introduction to the most basic optimization algorithms that
can be used to solve FMO problems.
9.17.2.1 Dose-Deposition Matrix
The quality of a treatment plan is primarily judged based on
the dose distribution in the patient. Thus, we would like to
determine the fluence maps of the incident beams so as to best
approximate a desired dose distribution. In order to achieve
this, we have to relate the fluence to the dose distribution in the
patient. In this section, we introduce the dose-deposition
matrix concept, which provides exactly this link.
For IMRT planning, the patient is discretized into small
volume elements referred to as voxels. We assume that the
dose-calculation algorithm can provide the dose distribution
of any incident beam. Therefore, the dose-calculation algo-
rithm can be used to obtain the dose distribution in the patient
for every beamlet in the fluence map. Let us denote the dose
that beamlet $j$ contributes to voxel $i$ in the patient for unit
intensity as $D_{ij}$, and let us denote the intensity of beamlet $j$ as
$x_j$. The total dose $d_i$ delivered to voxel $i$ is then simply given by
the superposition of all beamlet contributions:
$$d_i = \sum_j D_{ij} x_j$$
Here, the matrix of dose contributions $D_{ij}$ of beamlets $j$ to
voxels $i$ is referred to as the dose-deposition matrix (also known as
the dose-influence matrix), and its entries are the dose-deposition
coefficients. The dose-deposition matrix concept is
illustrated in Figure 5. In practice, the fluence is commonly
quantified in monitor units (MU). In this case, the natural unit
of the dose-deposition matrix is Gy/MU, such that the resulting
dose distribution in the patient is obtained in Gy. The dose-
deposition matrix concept is convenient since it allows for a
separation of the mathematical optimization of beamlet inten-
sities xj from the dose-calculation algorithm: in IMRT plan-
ning, the dose-deposition matrix is often calculated up front
and held in memory. Subsequently, the dose distribution is
obtained by a simple matrix multiplication $d = Dx$.
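The matrix multiplication above can be illustrated with a minimal sketch in plain Python. The function name `compute_dose` and the toy values of the matrix and the fluence map are assumptions made up for illustration; a clinical dose-deposition matrix has many thousands of rows and columns and is usually stored in a sparse format.

```python
def compute_dose(D, x):
    """Return the voxel doses d = D x.

    D is a list of rows, one per voxel; D[i][j] is the dose in Gy/MU that
    beamlet j deposits in voxel i per unit intensity. x[j] is the intensity
    of beamlet j in MU, so the result is in Gy.
    """
    if any(len(row) != len(x) for row in D):
        raise ValueError("each row of D needs one entry per beamlet")
    return [sum(D_ij * x_j for D_ij, x_j in zip(row, x)) for row in D]


# Hypothetical toy problem: 3 voxels, 2 beamlets (all numbers are made up).
D = [
    [0.5, 0.0],    # voxel 0 is reached only by beamlet 0
    [0.25, 0.25],  # voxel 1 receives dose from both beamlets
    [0.0, 0.5],    # voxel 2 is reached only by beamlet 1
]
x = [4.0, 2.0]     # beamlet intensities in MU

d = compute_dose(D, x)
print(d)
```

Because the matrix is computed once and reused, the optimizer can evaluate many candidate fluence maps without calling the dose-calculation algorithm again.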
9.17.2.2 IMRT Planning: A Step-by-Step Demonstration
In this section, we demonstrate the concepts of IMRT planning
for an example case. We consider the patient shown in Figure 6.
In this case, the target volume to be treated with radiation (red
contour) surrounds the spinal cord (green contour). The latter
is the main dose-limiting organ at risk, which is to be spared
from irradiation. In addition, the kidneys (orange contours)
are located in proximity to the target volume, representing the
secondary OAR to be spared.
9.17.2.2.1 Initialization and input
For IMRT planning, a segmentation of the patient is required,
which specifies the organ or anatomical structure to which each
voxel belongs. In the example in Figure 6, each voxel is
assigned to the spinal cord, the target volume, the kidneys, or
the remaining healthy tissues in the patient. In addition, we
require a setup of the fluence map. Similar to 3D-CRT, this
starts with selecting the location of the isocenter. For IMRT
planning, we determine the set of all beamlets that are poten-
tially helpful in finding the most desirable treatment plan.
Loosely speaking, this corresponds to all beamlets that con-
tribute a significant dose to the target volume. A common
method for initializing the fluence map consists in including
all beamlets for which the central axis of the corresponding
beam segment intersects the target volume. Given the voxel
discretization of the patient, the isocenter, and the beamlet grid
for each incident beam, a dose-calculation algorithm is used to
calculate the dose distribution of each beamlet in the patient,
that is, the dose-deposition matrix, $D_{ij}$.
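The beamlet selection rule described above (keep a beamlet if its central axis intersects the target volume) can be sketched as follows. This is a simplified two-dimensional illustration under stated assumptions: a point source 100 cm from the isocenter, target voxels approximated by circles around their centers, and hypothetical names such as `axis_hits_target`. It is not the method of any particular treatment planning system.

```python
import math


def axis_hits_target(source, bixel_center, target_voxels, voxel_radius):
    """True if the ray from source through bixel_center passes within
    voxel_radius of at least one target voxel center (2D geometry)."""
    sx, sy = source
    dx, dy = bixel_center[0] - sx, bixel_center[1] - sy
    norm = math.hypot(dx, dy)
    dx, dy = dx / norm, dy / norm              # unit vector along the central axis
    for vx, vy in target_voxels:
        rx, ry = vx - sx, vy - sy
        along = rx * dx + ry * dy              # signed distance along the ray
        if along < 0:
            continue                           # voxel lies behind the source
        perp = abs(rx * dy - ry * dx)          # perpendicular distance to the ray
        if perp <= voxel_radius:
            return True
    return False


# Assumed geometry: source 100 cm above the isocenter, which sits at the origin.
source = (0.0, 100.0)
# Ten candidate bixels of 1 cm width, centers on the isocenter line y = 0.
bixels = [(-4.5 + k, 0.0) for k in range(10)]
# Target: voxel centers on a 0.5 cm grid inside a circle of radius 2 cm.
target_voxels = [
    (0.5 * ix, 0.5 * iy)
    for ix in range(-4, 5)
    for iy in range(-4, 5)
    if (0.5 * ix) ** 2 + (0.5 * iy) ** 2 <= 4.0
]

selected = [
    k for k, b in enumerate(bixels)
    if axis_hits_target(source, b, target_voxels, voxel_radius=0.25)
]
print(selected)
```

Only the selected beamlets would then receive a column in the dose-deposition matrix, which keeps the optimization problem small.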
9.17.2.2.2 Formulation as an optimization problem
In order to determine the optimal incident fluence distribu-
tions, we have to specify the desired dose distribution. In other
words, we have to characterize what a ‘good’ treatment plan is.
In the example case in Figure 6, treatment planning aims at
different goals:
1. Deliver a prescribed dose $d^{\mathrm{pres}}$ to the target volume. As in
most cases, the target volume contains tumor cells
embedded in a normal tissue stroma such that treatment
planning aims at a homogeneous dose in the target, avoiding
both underdosing, which would fail to kill all the tumor
cells, and overdosing, which would destroy the normal
tissue stroma along with blood vessels and nerves passing
through it. If enough of the stroma survives, tissue will
regrow in the target volume. Otherwise, the target volume
will contain an abscess leading to muscle, nerve, and
circulation problems.
2. Minimize dose to the kidneys.
3. Aim at a conformal dose distribution and avoid unnecessary
dose to all healthy tissues.
4. Limit the dose to the spinal cord. The maximum dose
delivered to any part of the spinal cord has to stay below a
maximum tolerance dose $d_S^{\max}$.
Figure 6 A typical indication for IMRT: a paraspinal tumor geometry.
The tumor (red) entirely surrounds the spinal cord, which is to be
spared from irradiation. In addition, the kidneys are located in proximity
to the target volume.
For IMRT planning, these goals have to be translated into
mathematical terms. This is done by defining functions, which
represent measures for how good a treatment plan is and
whether it is acceptable at all. In this context, we distinguish
objectives and constraints:
Constraints are conditions that are to be satisfied in any case.
Every treatment plan that does not satisfy the constraints
would be unacceptable. The set of constraints together
defines the feasible region, which corresponds to the set
of treatment plans that satisfy all constraints.
Objectives are functions that measure the quality of a treatment
plan. They may represent measures to quantify how close a
treatment plan is to the ideal or desired treatment plan.
In the previously mentioned example, the first three goals
can be formulated as objectives; the fourth goal of enforcing a
strict maximum on the spinal cord dose represents a constraint.
The goal of delivering a homogeneous dose to the target vol-
ume can be formulated via a quadratic objective function:
$$f_T(d) = \frac{1}{N_T} \sum_{i=1}^{N_T} \left( d_i - d^{\mathrm{pres}} \right)^2$$
where the summation occurs over the $N_T$ voxels located by
three-dimensional indices $i$ that belong to the target volume.
Ideally, every voxel that belongs to the target volume receives
the prescribed dose $d^{\mathrm{pres}}$, which corresponds to a value of zero
for the function $f_T$. Otherwise, $f_T$ yields the average quadratic
deviation from the prescribed dose. The larger the objective
value is, the more the dose deviates from the prescription dose,
corresponding to a worse treatment plan.
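As a minimal sketch, the quadratic target objective can be evaluated in a few lines of Python. The function name `target_objective` and the dose values are illustrative assumptions, not part of the original text.

```python
def target_objective(d_target, d_pres):
    """Mean squared deviation of the target voxel doses from the prescription.

    d_target holds the doses d_i of the N_T target voxels; d_pres is the
    prescribed dose. Both must be in the same unit (for example Gy).
    """
    n_t = len(d_target)
    return sum((d_i - d_pres) ** 2 for d_i in d_target) / n_t


# Hypothetical doses (Gy) in three target voxels, prescription 60 Gy.
f_uniform = target_objective([60.0, 60.0, 60.0], 60.0)  # ideal plan
f_uneven = target_objective([58.0, 60.0, 62.0], 60.0)   # one cold and one hot voxel

print(f_uniform, f_uneven)
```

Note that underdosing and overdosing by the same amount are penalized equally, which matches the goal of a homogeneous target dose.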
Similarly, the goal of minimizing the dose to the kidneys
can be formulated as an objective function. For example, we
can define the objective $f_K$ as
$$f_K(d) = \frac{1}{N_K} \sum_{i=1}^{N_K} d_i$$
that aims at minimizing the mean dose to the kidneys. The goal
of conforming the dose distribution to the target volume can,
for example, be described via a piecewise quadratic penalty
function
$$f_H(d) = \frac{1}{N_H} \sum_{i=1}^{N_H} \left( d_i - d_i^{\max} \right)_+^2$$
where the $+$ operator is defined through $(d_i - d_i^{\max})_+ = d_i - d_i^{\max}$
if $d_i \ge d_i^{\max}$ and zero otherwise. Thus, $d_i^{\max}$ is the maximum dose
that is accepted in voxel $i$; dose values exceeding $d_i^{\max}$ are
penalized quadratically. Clearly, in normal tissue voxels directly
adjacent to the target volume, high doses are unavoidable, whereas at
large distance from the target volume, treatment planning
should aim at avoiding unnecessary dose. Therefore, $d_i^{\max}$ can
be chosen based on the distance between voxel $i$ and the target
volume.
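The mean-dose objective and the piecewise quadratic conformity penalty can be sketched in Python as below. The text only states that the voxel-dependent maximum dose may depend on the distance to the target; the linear falloff in `max_dose_for_distance`, its parameters, and all numbers are assumptions chosen for illustration.

```python
def mean_dose(doses):
    """Mean dose objective, e.g. f_K over the kidney voxels."""
    return sum(doses) / len(doses)


def max_dose_for_distance(distance_cm, d_pres, falloff_cm=1.0, floor_fraction=1.0 / 3.0):
    """Hypothetical rule for the accepted maximum dose in a healthy voxel:
    d_pres at the target surface, falling linearly to floor_fraction * d_pres
    at falloff_cm, and constant beyond that distance."""
    t = min(max(distance_cm / falloff_cm, 0.0), 1.0)
    return d_pres * (1.0 - t * (1.0 - floor_fraction))


def conformity_objective(doses, d_max):
    """Piecewise quadratic penalty f_H: only doses above d_max[i] contribute."""
    total = 0.0
    for d_i, d_max_i in zip(doses, d_max):
        excess = max(d_i - d_max_i, 0.0)   # the (.)_+ operator
        total += excess ** 2
    return total / len(doses)


# Four healthy-tissue voxels at increasing distance from the target (cm).
distances = [0.0, 0.5, 1.0, 2.0]
d_max = [max_dose_for_distance(r, d_pres=60.0) for r in distances]
doses = [55.0, 45.0, 25.0, 10.0]           # made-up voxel doses in Gy

f_H = conformity_objective(doses, d_max)
f_K = mean_dose([10.0, 20.0, 30.0])
print(d_max, f_H, f_K)
```

In this toy example only the second and third voxels exceed their accepted maximum, so only they contribute to the penalty.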
Finally, we would like to ensure that the dose in all voxels
that belong to the spinal cord does not exceed a maximum
tolerance dose $d_S^{\max}$. If we will not accept any treatment plan
that exceeds the maximum dose, this can be implemented as a
constraint, not an objective. In this case, we can formulate the
constraint as
$$d_i \le d_S^{\max} \quad \text{for all } i \in S$$
where $S$ is the set of indices of three-dimensional vectors
pointing to voxels within the spinal cord volume.
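The difference between a constraint and an objective can be made concrete with a small feasibility check. This is only a sketch: the function name `violating_voxels`, the flat voxel indexing, and the dose values are assumptions for illustration.

```python
def violating_voxels(d, structure_indices, d_max):
    """Return the indices in the structure whose dose exceeds d_max."""
    return [i for i in structure_indices if d[i] > d_max]


# Made-up dose distribution (Gy) over five voxels; voxels 2 and 3 form
# the spinal cord set S, and the tolerance dose is assumed to be 40 Gy.
d = [60.0, 58.0, 38.0, 41.0, 12.0]
S = [2, 3]
d_S_max = 40.0

violations = violating_voxels(d, S, d_S_max)
feasible = not violations
print(violations, feasible)
```

A plan with any violating voxel lies outside the feasible region and is rejected, no matter how good its objective values are.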
Treatment planning simultaneously aims at minimizing all
of the previously mentioned objective functions, that is,
ideally, we would like each tumor voxel to receive the pre-
scribed dose, while no dose is delivered to the normal tissues.
It is clear that the objectives associated with different structures
are inherently conflicting. Thus, the treatment planner will
have to weight these conflicting objectives relative to each
other and accept a compromise. The traditional approach in
IMRT planning consists in manually assigning importance
weights w to each objective, using a high weight for the most
important objective and a smaller weight for less important
goals. The best treatment plan is then defined as the one that
minimizes the weighted sum of objectives:
$$w_T f_T(d) + w_K f_K(d) + w_H f_H(d)$$
Given the mathematical formulation of the clinical goals,
IMRT planning uses mathematical optimization algorithms in
order to determine the fluence map x, corresponding to the
dose distribution $d = Dx$, which minimizes the weighted sum
of objectives, subject to all constraints on the dose distribution
and under the condition that all beamlet intensities have to be
nonnegative. We will further discuss optimization algorithms in
Section 9.17.2.4. In the succeeding text, we first take a look
at the result of such an optimization.
9.17.2.2.3 Solution to the IMRT problem: the optimal treatment plan
Figure 7 shows the optimal dose distribution obtained for a
specific choice of optimization parameters: the spinal cord
dose was constrained to two-thirds of the prescription dose.
The voxel-dependent maximum dose $d_i^{\max}$ in the conformity
objective was formulated to provide a dose falloff to one-third
of the prescription dose at 1 cm distance from the target sur-
face. Nine equispaced coplanar incident beams are used. It is
apparent that IMRT is capable of conforming the high-dose
region relatively tightly to the target volume. In particular, the
dose to the spinal cord can be reduced to doses much below
the prescription dose. This would not be possible using 3D
conformal techniques without the possibility of modulating
the intensity of the incident radiation beams.

Figure 8 shows the dose contribution of two out of nine
beam directions, illustrating the use of intensity modulation.
The intensities of the beamlets that intersect with the spinal
cord are reduced to near zero. This allows for a dose reduction
in the spinal cord but at the same time yields an inhomoge-
neous dose distribution in the target volume. All nine beams in
combination deliver the prescribed, homogeneous dose distri-
bution to the target volume.
9.17.2.2.4 Assessing trade-offs
Different objectives in IMRT planning are inherently conflict-
ing. Clearly, there is a trade-off between delivering dose to the
tumor and reducing dose to healthy tissues. In the previously
mentioned example, the dose to the spinal cord is constrained
to two-thirds of the prescription dose, which compromises the
coverage of the target volume. In regions near the spinal cord,
the target volume does not receive the prescribed dose. To
improve the coverage of the target volume, higher doses to
the spinal cord have to be accepted. In addition, IMRT plan-
ning involves trading off the dose burden of adjacent healthy
tissues. In the previously mentioned example, there is a trade-
off between sparing the kidneys from irradiation and the con-
formity of the dose distribution in the remaining normal
tissue. Achieving a very low dose in the kidneys leads to higher
doses in the normal tissue anterior and posterior to the target
volume. This is illustrated in Figure 9. In comparison with
Figure 7, the weighting factor for the kidney mean dose was
increased and the weighting factor for the conformity objective
was decreased. Thereby, the kidney dose could be substantially
reduced. Through the use of mathematical optimization, the
beam segments that penetrate the kidneys are automatically
avoided. However, this comes at the price of a less conformal
dose distribution, that is, higher doses in the normal tissue
anterior and posterior to the target volume.
In today’s clinical practice, the treatment planner chooses
the objective weights manually, based on prior experience and
trial and error with the treatment plan before them.
In Section 9.17.5, we discuss multicriteria optimization
methods that represent a more elaborate approach to control-
ling the trade-off between different objectives.
Figure 7 IMRT dose distribution for the paraspinal case example, demonstrating the ability of IMRT to conform the dose distribution to concave target volumes.
Figure 8 The contribution of two of nine beam directions. The beam in panel (a) directly from the posterior demonstrates the symmetric reduction of intensity of the beamlets that intersect the spinal cord but not the kidney. One side of the posterior oblique beam in panel (b) intersects part of the kidney and is less intense than the opposite side of the beam that does not intersect a kidney.
Figure 9 IMRT dose distribution for the paraspinal case example, demonstrating the trade-off between conformity of the dose distribution and the minimization of the kidney dose.
9.17.2.3 The IMRT Optimization Problem
In the previous section (Section 9.17.2.2), we illustrated IMRT planning step by step for an example case. In this section, we take a more formal look at IMRT planning as a mathematical optimization problem. Mathematically, the FMO problem can be formulated as
$$
\begin{aligned}
\underset{x}{\text{minimize}} \quad & f(d) \\
\text{subject to} \quad & g_k(d) \le c_k \\
& d_i = \sum_j D_{ij} x_j \\
& x_j \ge 0
\end{aligned}
$$
The first line indicates that we minimize an objective function $f$ with respect to the fluence map $x$, which corresponds to dose distribution $d$. The second line indicates that we are restricted to dose distributions that satisfy the constraints $g_k(d) \le c_k$. The third line specifies the relation between fluence and dose, and the last line requests that all beamlet intensities have to be nonnegative in order to be physically meaningful. Treatment planning involves balancing different clinical objectives. Therefore, the objective function $f$ is a weighted sum of individual objectives:
$$f(d) = \sum_n w_n f_n(d)$$
Here, $w_n$ are positive weighting factors that are used to control the relative importance of different terms in the composite objective function.
The objective function that may be the most commonly used in current treatment planning systems is a piecewise quadratic penalty function:
$$f_n(d) = \frac{1}{N_n}\sum_{i=1}^{N_n}\left(d_i - d_i^{\max}\right)_+^2 \quad \text{or} \quad f_n(d) = \frac{1}{N_n}\sum_{i=1}^{N_n}\left(d_i^{\min} - d_i\right)_+^2$$
Here, $d^{\max}$ is a maximum tolerance dose for an organ, which is usually specified by the treatment planner through the graphical user interface in the treatment planning system. Similarly, for target volumes, $d^{\min}$ is a minimum dose that is to be delivered to the target volume. Common constraints are maximum dose values in OARs and minimum doses in target volumes. In the next subsection, additional commonly used objectives and constraints are discussed.
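As a minimal sketch of the piecewise quadratic penalties and their weighted sum, the following Python/NumPy fragment may be helpful. The function names, voxel doses, thresholds, and weights are hypothetical values chosen only for illustration; they are not taken from any planning system.

```python
import numpy as np

def overdose_penalty(dose, d_max):
    """Mean squared excess of dose above d_max (zero where dose <= d_max)."""
    excess = np.maximum(dose - d_max, 0.0)
    return np.mean(excess ** 2)

def underdose_penalty(dose, d_min):
    """Mean squared shortfall of dose below d_min (zero where dose >= d_min)."""
    shortfall = np.maximum(d_min - dose, 0.0)
    return np.mean(shortfall ** 2)

# Hypothetical voxel doses (Gy) for one target and one organ at risk.
target_dose = np.array([58.0, 60.0, 61.0, 59.0])
oar_dose = np.array([10.0, 25.0, 32.0, 18.0])

f_target = underdose_penalty(target_dose, d_min=60.0)  # voxels at 58 and 59 Gy contribute
f_oar = overdose_penalty(oar_dose, d_max=30.0)         # only the voxel at 32 Gy contributes

# Weighted sum of objectives; the weights are illustrative assumptions.
weights = {"target": 10.0, "oar": 1.0}
f_total = weights["target"] * f_target + weights["oar"] * f_oar
print(f_target, f_oar, f_total)
```

Because `d_max` and `d_min` are broadcast by NumPy, the same functions accept either a single threshold per structure or a voxel-dependent array of thresholds.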
9.17.2.3.1 Dose-volume effects
An organ at risk will typically receive an inhomogeneous dose
distribution. Often, the question arises whether it is preferable
to irradiate a small part of the organ to a large dose while
sparing the remaining parts to a large extent or whether it is
better to spread out the dose and avoid large doses in all parts
of the organ. In that context, one distinguishes parallel organs
and serial organs. For organs with a serial structure, the func-
tion of the whole organ will fail if one part of the organ is
damaged. One prominent example of a serial organ is the
spinal cord. For serial organs, it is therefore crucial to limit
the maximum dose delivered to the organ, rather than the
mean dose. For a parallel organ, the function of the organ as
a whole is preserved even if a part of the organ is damaged. An
example of a parallel organ is the lung. The dependence of a
clinical outcome on the irradiated volume of an organ is commonly
referred to as a volume effect or dose-volume effect. For
IMRT planning, clinical knowledge on dose-volume effects has
to be translated into appropriate objective functions. Today,
mainly two types of objective/constraint functions are being
applied: dose-volume histogram (DVH) objectives and the
concept of equivalent uniform dose (EUD).
9.17.2.3.1.1 DVH objectives and constraints
The clinical evaluation of treatment plans often uses the dose-volume histogram. A typical evaluation criterion for the target volume is that at least 95% of the target volume should receive a dose equal to or higher than the prescription dose. Similarly, a criterion for an OAR could be that at most 20% of the organ should receive more than 30 Gy.
From an optimization perspective, it is not straightforward to handle DVH constraints in a rigorous way. A naive implementation of a DVH constraint requires the use of integer variables. For example, the constraint that no more than a fraction $v$ of an organ should receive a dose higher than $d^{crit}$ can formally be written as
$$\frac{1}{N}\sum_{i=1}^{N} b_i \le v, \qquad M\, b_i \ge d_i - d^{crit}, \qquad b_i \in \{0, 1\}$$
where $M$ is a large constant. Here, $b_i$ is a binary integer variable that is introduced for every voxel, which takes the value 1 if the dose $d_i$ exceeds $d^{crit}$ and 0 if $d_i$ is smaller than $d^{crit}$. The use of integer variables represents a different type of optimization problem, is computationally demanding, and requires algorithms that differ considerably from those described in Section 9.17.2.4.
In practice, DVH constraints are therefore handled approximately through a heuristic tactic. Given the current dose distribution, one can identify the fraction of voxels that exceed the dose level $d^{crit}$. If this fraction is smaller than $v$, the DVH constraint is fulfilled. Otherwise, a quadratic penalty function is introduced that aims at reducing the dose to those voxels that exceed $d^{crit}$ by the least amount, neglecting the fraction $v$ that receives the highest dose.
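One possible reading of this heuristic is sketched below in Python/NumPy. The function name `dvh_penalty`, the normalization by the number of voxels, and the example doses are assumptions made for illustration; clinical systems may implement the details differently.

```python
import numpy as np

def dvh_penalty(dose, d_crit, v):
    """Heuristic DVH penalty: at most a fraction v of voxels may exceed d_crit."""
    n = dose.size
    n_allowed = int(np.floor(v * n + 1e-9))      # voxels permitted above d_crit
    if np.count_nonzero(dose > d_crit) <= n_allowed:
        return 0.0                                # DVH criterion already fulfilled
    order = np.argsort(dose)[::-1]                # voxel indices, highest dose first
    penalized = order[n_allowed:]                 # neglect the n_allowed hottest voxels
    excess = np.maximum(dose[penalized] - d_crit, 0.0)
    return np.sum(excess ** 2) / n                # quadratic penalty on the remaining excess

# Hypothetical OAR doses (Gy): five of ten voxels exceed 30 Gy, but only 20% may.
oar_dose = np.array([10.0, 20.0, 31.0, 33.0, 40.0, 45.0, 15.0, 12.0, 28.0, 35.0])
penalty = dvh_penalty(oar_dose, d_crit=30.0, v=0.2)
print(penalty)
```

In this example, the two hottest voxels (45 and 40 Gy) are neglected, and the penalty acts on the voxels at 35, 33, and 31 Gy, which are the ones that can be brought below the threshold with the smallest dose reduction.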
9.17.2.3.1.2 Equivalent uniform dose
An alternative approach to quantifying dose-volume effects consists in using generalized mean values of the dose distribution defined as
$$\mathrm{EUD}(d) = \left[\frac{1}{N}\sum_{i=1}^{N} (d_i)^a\right]^{1/a} \quad \text{for } i \in \text{OAR}$$
where the exponent $a$ is larger than or equal to one for OARs. For the special case $a = 1$, EUD(d) is equivalent to the mean dose in the organ. In the limit of large $a$ values, the value of EUD(d) approaches the maximum dose in the organ. Thus, parallel organs are described via a small value of $a$ close to 1, whereas serial organs are described via large values of $a$ (approximately 10). The generalized mean value is commonly referred to as EUD.
The generalized mean value can also be applied to target volumes by using negative exponents. For a large negative value of $a$, the EUD approaches the minimum dose in the target volume. In practice, exponents in the range of $a = -10$ to $-20$ are considered.
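The limiting behavior of the generalized mean can be checked numerically with a short Python/NumPy sketch. The function name `generalized_eud` and the four voxel doses are hypothetical; the doses are assumed to be strictly positive so that negative exponents are well defined.

```python
import numpy as np

def generalized_eud(dose, a):
    """Generalized mean of the voxel doses with exponent a (a != 0)."""
    dose = np.asarray(dose, dtype=float)
    return np.mean(dose ** a) ** (1.0 / a)

dose = np.array([10.0, 20.0, 30.0, 40.0])   # hypothetical voxel doses (Gy)

eud_parallel = generalized_eud(dose, 1)     # equals the mean dose
eud_serial = generalized_eud(dose, 10)      # pulled toward the maximum dose
eud_target = generalized_eud(dose, -10)     # pulled toward the minimum dose
print(eud_parallel, eud_serial, eud_target)
```

For very large exponents, the powers of the dose can overflow in floating-point arithmetic, so a production implementation would typically rescale the doses (for example, by the maximum dose) before exponentiation.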

9.17.2.3.2 Use of clinical outcome models in IMRT optimization
From the beginning of the development of IMRT, the question
regarding the adequate objective function to be used has per-
sisted. Intuitively, we would like to translate the notion of
‘maximizing the tumor control probability (TCP)’ while
‘minimizing the normal tissue complication probability
(NTCP)’ more directly into mathematical terms (Brahme
et al., 1988; Källman et al., 1992).
9.17.2.3.2.1 Sigmoid outcome models
One of the most common methods for relating treatment out-
come to the dose distribution consists in performing logistic
regression. As an example, we consider NTCP models. However,
the same methodology can be applied to TCP models. The
severity of a radiation side effect is clinically assessed in discrete
stages. Typically, one is interested in avoiding severe complica-
tions. For example, in the treatment of lung cancer, treatment
planning may aim at minimizing the probability for radiation
pneumonitis of grade two or higher. This converts the observed
clinical outcome into a binary outcome label. NTCP modeling
can thus be considered as a classification problem, which aims
at estimating the probability of a complication given features of
the dose distribution. Standard statistical classification methods,
such as logistic regression, can be applied to this problem. In
logistic regression, the NTCP model is given by
$$\mathrm{NTCP}(d) = \frac{1}{1 + \exp\left(-f(d, \theta)\right)}$$
Here, $f$ is a function of the dose distribution $d$ and the model parameters $\theta$. The central problem in statistical analysis and modeling of patient outcome consists in determining the function $f$, that is, selecting features of the dose distribution that are correlated with outcome. One of the most commonly used representations of $f$ is given by
$$\mathrm{NTCP}(d) = \frac{1}{1 + \exp\left[\gamma\left(TD_{50} - \mathrm{EUD}(d)\right)\right]}$$
In this case, $f$ is a linear function of a single feature of the dose distribution, namely, the EUD. For $\mathrm{EUD}(d) = TD_{50}$, the value of NTCP evaluates to 0.5, that is, $TD_{50}$ corresponds to the effective dose that leads to a complication probability of 50%. The parameter $\gamma$ determines the slope of the dose–response relation. The NTCP model has three parameters ($TD_{50}$, $\gamma$, and the EUD exponent $a$) that can be fitted to outcome data, for example, through maximum likelihood methods. This NTCP model is equivalent to the Lyman–Kutcher–Burman (LKB) model, except that the LKB model traditionally uses a different functional form of the sigmoid.
Although phenomenological outcome models may play an increasing role in treatment plan evaluation, their capabilities from a treatment plan optimization perspective have remained limited so far. The previously mentioned NTCP model represents an increasing function of the EUD, that is, higher EUD always leads to higher NTCP, independent of the parameters $TD_{50}$ and $\gamma$ (for positive $\gamma$). As a consequence, the dose distribution that minimizes EUD is the same as the dose distribution that minimizes NTCP. Hence, from an IMRT optimization perspective, minimizing EUD and NTCP is equivalent (Romeijn et al., 2004).
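The monotonic relation between EUD and NTCP can be illustrated with a small Python/NumPy sketch. The parameter values for `td50`, `gamma`, and `a`, as well as the two dose distributions, are hypothetical and are not fitted to any outcome data.

```python
import numpy as np

def generalized_eud(dose, a):
    """Generalized mean of the voxel doses with exponent a."""
    dose = np.asarray(dose, dtype=float)
    return np.mean(dose ** a) ** (1.0 / a)

def ntcp_logistic(dose, td50, gamma, a):
    """Logistic NTCP model driven by a single feature, the EUD."""
    eud = generalized_eud(dose, a)
    return 1.0 / (1.0 + np.exp(gamma * (td50 - eud)))

# Hypothetical model parameters (not fitted to clinical data).
params = dict(td50=30.0, gamma=0.2, a=1.0)

plan_low = np.array([10.0, 20.0, 30.0, 20.0])    # mean dose 20 Gy
plan_high = np.array([20.0, 30.0, 40.0, 30.0])   # mean dose 30 Gy

ntcp_low = ntcp_logistic(plan_low, **params)
ntcp_high = ntcp_logistic(plan_high, **params)   # EUD equals TD50, so NTCP is 0.5
print(ntcp_low, ntcp_high)
```

Ranking the two plans by EUD or by NTCP gives the same order, which is the reason why minimizing either quantity leads to the same optimal dose distribution.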
9.17.2.3.3 Further remarks
9.17.2.3.3.1 Linear programming formulations
One of the fundamentals of FMO is the linear relation between
the dose and the incident fluence. Due to the linearity, it is
possible to formulate IMRT planning as a linear optimization
problem (LP). LPs are optimization problems for which both
the objective f and all constraint functions gk are linear func-
tions of the optimization variables. At first glance, an exclusive
use of linear functions appears restrictive. However, it turns out
that most nonlinear objective functions currently in use can be
mimicked using linear formulations by introducing auxiliary
optimization variables. LP formulations for FMO have mostly been
studied in research environments. In contrast, the first-generation
IMRT planning systems and contemporary commercial planning
systems are primarily built around quadratic objective functions
and DVH- and EUD-based objective and constraint
functions.
9.17.2.3.3.2 Size of the optimization problem
IMRT treatment planning corresponds to a large-scale optimi-
zation problem since it involves a large number of variables.
The number of beamlets for a single incident beam depends on the size of the target volume and the beamlet resolution (typically 5 or 10 mm) and is usually in the order of $10^2$–$10^3$ for each beam direction. Assuming that ten beam directions are used, the total number of beamlets is expected to be in the order of $10^3$–$10^4$. Furthermore, the patient is discretized into voxels, with a typical resolution of 2–4 mm, resulting in the total number of voxels in the order of $10^6$.
If all beamlets contributed a significant dose to all voxels in the patient, the number of elements in the dose-deposition matrix would be $10^9$–$10^{10}$. If each element is stored as a 4-byte integer, the dose-deposition matrix requires roughly 4–40 GB of memory, that is, on the order of 10 GB. However, in practice, the dose contributions to voxels at large distance from a beamlet’s central axis are set to zero. Thereby, the total number of nonzero elements is substantially reduced, and the dose-deposition matrix can be stored in a sparse format.
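The memory estimate can be reproduced with a few lines of Python. The compressed sparse row (CSR) layout, the 4-byte column indices, the 8-byte row pointers, and in particular the nonzero fraction of 1% are assumptions made purely for illustration; the actual sparsity depends on the dose calculation and the chosen cutoff.

```python
def dense_bytes(n_voxels, n_beamlets, bytes_per_element=4):
    """Memory for a fully populated dose-deposition matrix."""
    return n_voxels * n_beamlets * bytes_per_element

def csr_bytes(n_voxels, n_beamlets, nonzero_fraction,
              bytes_per_value=4, bytes_per_index=4):
    """Approximate memory for a CSR matrix with one row per voxel."""
    nnz = int(n_voxels * n_beamlets * nonzero_fraction)
    row_pointers = (n_voxels + 1) * 8
    return nnz * (bytes_per_value + bytes_per_index) + row_pointers

n_voxels = 10**6
for n_beamlets in (10**3, 10**4):
    dense_gb = dense_bytes(n_voxels, n_beamlets) / 1e9
    # Hypothetical assumption: 1% of the matrix elements are nonzero.
    sparse_gb = csr_bytes(n_voxels, n_beamlets, nonzero_fraction=0.01) / 1e9
    print(f"{n_beamlets} beamlets: dense {dense_gb:.1f} GB, sparse {sparse_gb:.2f} GB")
```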
9.17.2.3.3.3 Convexity
Many objective functions commonly applied in IMRT plan-
ning are convex. This is in particular the case for the piecewise
quadratic objective, the linear objectives, and the generalized
EUD for exponents |a|>1. The convexity property of objective
and constraint functions has important implications for the
optimization of fluence maps. An optimization problem
defined through a convex objective function f and convex
constraint functions gk has no local minima that are not also
global minima; every local minimum attains the globally optimal objective value.
Thus, gradient descent-based optimization algorithms as
described in Section 9.17.2.4 will reliably find the optimal
fluence map. The only nonconvex functions commonly
applied in practice are DVH constraints.
9.17.2.4 Optimization Algorithms
Our goal in this chapter is to provide the reader with an
understanding of the most basic optimization algorithms that
do not require advanced knowledge of optimization theory. In
Section 9.17.2.4.1, we start with a geometric visualization
of the IMRT optimization problem. In Section 9.17.2.4.2, the
gradient descent algorithm in the context of IMRT is described,
which in principle is sufficient to optimize fluence maps. In
Section 9.17.2.4.3, extensions of gradient descent methods
toward quasi-Newton algorithms are outlined. Certainly, the
field of IMRT optimization has advanced significantly, and
increasingly complex algorithms for constrained optimization
are being applied. These algorithms require knowledge of opti-
mization theory, which is beyond the scope of this chapter. The
interested reader is referred to the optimization literature (e.g.,
Bertsekas, 1999 or Nocedal and Wright, 2006) or a review of
mathematical optimization problems in radiotherapy by
Ehrgott et al. (2010).
9.17.2.4.1 Visualization of the FMO problem
Due to the large number of beamlets (optimization variables),
it is not possible to visualize directly the objective and con-
straint functions for a full IMRT planning problem. Neverthe-
less, it is helpful to understand the structure of the IMRT
optimization problem. To that end, we consider a simplified
version of an IMRT planning problem in which only two
beamlets and four voxels are considered. We consider the
following dose-deposition matrix:
$$D = \begin{pmatrix} 1.3 & 0.7 & 0.1 & 1.0 \\ 0.7 & 1.3 & 0.5 & 0.3 \end{pmatrix}$$
where the first two columns correspond to the tumor voxels and columns 3 and 4 correspond to OAR voxels (the matrix is displayed with one row per beamlet and one column per voxel). We further assume that we aim to deliver a dose of 2 to both of the tumor voxels, and we impose a maximum dose constraint of 0.8 and 1.0 on the OAR voxels.
The goal of delivering the prescribed dose to the tumor
voxels is expressed via a quadratic objective function. The
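optimization for this illustrative example can be formulated as
$$
\begin{aligned}
\text{minimize} \quad & \frac{1}{2}\sum_{i=1}^{2}\left(d_i - 2\right)^2 \\
\text{subject to} \quad & d_3 \le 0.8 \\
& d_4 \le 1.0 \\
& d_i = \sum_{j=1}^{2} D_{ij} x_j \\
& x_j \ge 0
\end{aligned}
$$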
Since we only have two optimization variables, the objec-
tive and constraint functions can be visualized explicitly. This
is done in Figure 10. The objective function is depicted via
isolines. Since we consider a quadratic objective function, it represents a paraboloid over the two-dimensional beamlet intensity plane. The minimum of the objective function is located at beamlet intensities $x_1 = 1$ and $x_2 = 1$. At this point, both tumor voxels receive the prescribed dose and the objective function is zero.
We now consider the constraints on the OAR voxels. Since
the dose in each voxel is a linear function of the beamlet
intensities, the constraints represent hyperplanes in beamlet
intensity space, that is, lines in two dimensions. In Figure 10,
we show the lines where the constraints $d_3 = 0.8$ and $d_4 = 1.0$
are met exactly. For all beamlet intensities beyond these lines,
the maximum dose to an OAR voxel is exceeded. All beamlet
intensity combinations below the lines form the feasible
region. Thus, the optimal solution to the IMRT planning problem
is given by the point within the feasible region that has the
smallest value of the objective function. In this example, this is
approximately given by $x_1 = 0.7$ and $x_2 = 1.2$ and is indicated
by the red dot in Figure 10. By multiplying this solution with
the dose-deposition matrix, we obtain the corresponding optimal
dose distribution.
In this case, the constraint for OAR voxel 4 is binding, that
is, the OAR voxel receives the maximum dose we allow for. We
further note that the minimum of the objective function is
outside of the feasible region, which means that, in order to
fulfill the maximum OAR dose constraint, we have to compro-
mise in terms of target dose homogeneity.
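The two-beamlet example is small enough to be solved directly. The following Python/NumPy sketch computes the unconstrained minimum and then treats the violated constraint on OAR voxel 4 as an equality, solving the resulting Karush–Kuhn–Tucker (KKT) system. This shortcut is an assumption that is valid for this particular example (it is verified in the last lines) and is not a general-purpose constrained solver; the variable names are illustrative.

```python
import numpy as np

# Toy dose-deposition matrix from the text: rows = beamlets, columns = voxels.
D = np.array([[1.3, 0.7, 0.1, 1.0],
              [0.7, 1.3, 0.5, 0.3]])
A = D[:, :2].T                  # tumor voxels x beamlets
c3, c4 = D[:, 2], D[:, 3]       # dose per unit intensity in OAR voxels 3 and 4
d_pres = np.array([2.0, 2.0])

H = A.T @ A                     # Hessian of the quadratic objective
g = A.T @ d_pres

# Unconstrained minimum and the OAR doses it would produce.
x_unc = np.linalg.solve(H, g)
d3_unc, d4_unc = c3 @ x_unc, c4 @ x_unc   # d4 exceeds its limit of 1.0

# Treat d4 = 1.0 as an equality and solve the KKT system
#   H x + lam * c4 = g,   c4 . x = 1.0
K = np.block([[H, c4[:, None]],
              [c4[None, :], np.zeros((1, 1))]])
sol = np.linalg.solve(K, np.append(g, 1.0))
x_opt, lam = sol[:2], sol[2]

d_opt = D.T @ x_opt             # doses in all four voxels
# Checks: multiplier nonnegative, d3 within its limit, intensities nonnegative.
print(x_opt, lam, d_opt)
```

For the numbers above, this yields roughly $x_1 \approx 0.62$ and $x_2 \approx 1.27$, close to the approximate values read off Figure 10, with a positive multiplier confirming that the constraint on voxel 4 is binding.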
Figure 10 Visualization of the IMRT optimization problem for two beamlets. The quadratic objective function is depicted via isolines; the linear maximum dose constraints of OAR voxels are shown as thick black lines. [Axes: intensity of beamlet x1 and intensity of beamlet x2, each from 0 to 2. Labels: minimum of objective; feasible set (no constraints violated); constraint for OAR voxel 3 (d3 = dmax); desired solution: minimum of constrained problem.]
9.17.2.4.1.1 Approximate handling of constraints through penalty functions
A common approach in IMRT planning consists in approximating the maximum dose constraints in OARs via penalty functions. More specifically, we can consider the composite objective function where a quadratic penalty function, multiplied with a weight $w$, is added to the original objective for target dose homogeneity:
$$f(d) = \frac{1}{2}\sum_{i=1}^{2}\left(d_i - 2\right)^2 + w\left[\left(d_3 - 0.8\right)_+^2 + \left(d_4 - 1.0\right)_+^2\right]$$
Adding the penalty function does not change the objective
function within the feasible region; only the objective function
values outside of the feasible region are increased. This is shown
in Figure 11 for penalty weights of $w = 5$ and $w = 20$. As $w$ is
increased, the unconstrained minimum of the function $f$ moves
closer to the optimal solution of the constrained problem.
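The effect of the penalty weight can be reproduced numerically for the two-beamlet example. The Python/NumPy sketch below assumes that only the penalty on OAR voxel 4 is active at the minimum, in which case the composite objective is quadratic and its minimum follows from a linear system; the assumption is checked by an assertion inside the hypothetical helper `penalized_minimum`.

```python
import numpy as np

D = np.array([[1.3, 0.7, 0.1, 1.0],
              [0.7, 1.3, 0.5, 0.3]])   # rows: beamlets, columns: voxels
A = D[:, :2].T                          # tumor voxels x beamlets
c3, c4 = D[:, 2], D[:, 3]
d_pres = np.array([2.0, 2.0])

def penalized_minimum(w):
    """Minimum of the composite objective, assuming only the d4 penalty is active."""
    H = A.T @ A + 2.0 * w * np.outer(c4, c4)
    rhs = A.T @ d_pres + 2.0 * w * 1.0 * c4
    x = np.linalg.solve(H, rhs)
    # Verify the assumption: d4 above its limit, d3 within its limit, x nonnegative.
    assert c4 @ x > 1.0 and c3 @ x <= 0.8 and np.all(x >= 0.0)
    return x

# Constrained optimum for comparison (d4 = 1.0 binding), from the KKT system.
K = np.block([[A.T @ A, c4[:, None]],
              [c4[None, :], np.zeros((1, 1))]])
x_star = np.linalg.solve(K, np.append(A.T @ d_pres, 1.0))[:2]

distances = {}
for w in (5.0, 20.0, 1000.0):
    x_w = penalized_minimum(w)
    distances[w] = np.linalg.norm(x_w - x_star)
    print(w, x_w, distances[w])
```

The penalized minimum always lies slightly outside the feasible region, and the remaining violation of the dose limit shrinks as the weight grows.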
9.17.2.4.2 Gradient descent
In this section, we introduce the most generic optimization
algorithm, which can in principle be used to generate an IMRT
treatment plan. To that end, we assume that we want to min-
imize an objective function f, subject to the constraint that all
beamlet intensities are positive. We do not consider additional
constraints g on the dose distribution, that is, all treatment
goals are included in the objective function (e.g., through the
use of quadratic penalty functions).
The gradient of the objective function is the vector of partial derivatives of $f$ with respect to the beamlet intensities $x_j$:
$$\nabla f = \begin{pmatrix} \dfrac{\partial f}{\partial x_1} \\ \vdots \\ \dfrac{\partial f}{\partial x_J} \end{pmatrix}$$
The gradient vector is oriented perpendicular to the isolines
of the objective function; it points to the direction of maximum
slope in the objective function landscape. Thus, taking a small
step into the direction of the negative gradient yields a fluence
map x that corresponds to a lower value of the objective
function, that is, an improved plan. This gives rise to the most basic iterative nonlinear optimization algorithm: in each iteration $k$, the current fluence map $x^k$ is updated according to
$$x^{k+1} = x^k - \alpha \nabla f\left(x^k\right)$$
where $\alpha$ is a step size parameter, which has to be sufficiently small in order for the algorithm to converge.
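A minimal Python/NumPy sketch of this iteration is given below, applied to the unconstrained tumor objective of the two-beamlet example. The fixed step size of 0.2, the starting point, and the iteration count are assumptions chosen so that the iteration converges for this small problem.

```python
import numpy as np

def gradient_descent(grad_f, x0, step, n_iter):
    """Repeatedly step along the negative gradient with a fixed step size."""
    x = np.array(x0, dtype=float)
    for _ in range(n_iter):
        x = x - step * grad_f(x)
    return x

# Tumor part of the two-beamlet example: rows = tumor voxels, columns = beamlets.
A = np.array([[1.3, 0.7],
              [0.7, 1.3]])
d_pres = np.array([2.0, 2.0])

def grad_f(x):
    # Gradient of 1/2 * sum_i (d_i - 2)^2 with d = A x.
    return A.T @ (A @ x - d_pres)

x_final = gradient_descent(grad_f, x0=[0.0, 2.0], step=0.2, n_iter=500)
print(x_final)
```

With a step size that is too large, the iterates would oscillate and diverge instead of approaching the minimum at $x_1 = x_2 = 1$.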
9.17.2.4.2.1 Calculation of the gradient
The gradient of the objective function with respect to the beamlet intensities can be calculated by using the chain rule in multiple dimensions: given that the objective is a function of the dose distribution, we have
$$\frac{\partial f}{\partial x_j} = \sum_{i=1}^{N}\frac{\partial f}{\partial d_i}\frac{\partial d_i}{\partial x_j}$$
The partial derivative of the voxel dose $d_i$ with respect to the beamlet weight $x_j$ is simply given by the corresponding element of the dose-deposition matrix:
$$\frac{\partial d_i}{\partial x_j} = D_{ij}$$
The partial derivative of the objective function with respect
to dose in voxel i describes by how much the objective function
changes by varying the dose in voxel $i$. For the quadratic
objective function
$$f(d) = \frac{1}{N}\sum_{i=1}^{N}\left(d_i - d^{pres}\right)^2$$
the components of the gradient vector are given by
$$\frac{\partial f}{\partial x_j} = \frac{1}{N}\sum_{i=1}^{N} 2\left(d_i - d^{pres}\right) D_{ij}$$
which has an intuitive interpretation: the total change in the objective function value due to changing the intensity of beamlet $j$ is obtained by summing over the contributions of all voxels. The contribution of a voxel is given by the dose error $(d_i - d^{pres})$ multiplied by the influence $D_{ij}$ of beamlet $j$ onto the voxel $i$. If the dose $d_i$ exceeds the prescribed dose, the voxel’s contribution is positive; voxels that are underdosed yield a negative contribution to the gradient component. If the gradient component is negative after summing over the contributions of all voxels, the impact of the underdosed voxels dominates. A step in the direction of the negative gradient corresponds to increasing the beamlet weight $x_j$, thus reducing the extent of underdosing.
Figure 11 Visualization of the composite objective function containing quadratic penalty functions to approximate maximum dose constraints. For increasing weights w for the penalty function, the minimum of the composite objective function moves closer to the optimal solution of the constrained problem. [Panels: (a) w = 5; (b) w = 20. Axes: x1 and x2, each from 0 to 2.]
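The chain-rule expression for the gradient of the quadratic objective translates into a matrix-vector product with the transposed dose-deposition matrix. The Python/NumPy sketch below uses a small random matrix (a hypothetical stand-in for a real dose-deposition matrix, with one row per voxel and one column per beamlet) and compares the analytic gradient with central finite differences.

```python
import numpy as np

def objective(x, D, d_pres):
    """Quadratic objective f = (1/N) * sum_i (d_i - d_pres)^2 with d = D x."""
    d = D @ x
    return np.mean((d - d_pres) ** 2)

def gradient(x, D, d_pres):
    """Chain rule: df/dx_j = sum_i (df/dd_i) * D_ij."""
    d = D @ x
    df_dd = 2.0 * (d - d_pres) / d.size   # partial derivatives with respect to voxel doses
    return D.T @ df_dd

rng = np.random.default_rng(0)
D = rng.uniform(0.0, 1.0, size=(5, 3))    # hypothetical: 5 voxels, 3 beamlets
x = rng.uniform(0.5, 1.5, size=3)
d_pres = 2.0

# Central finite differences as an independent check of the analytic gradient.
eps = 1e-6
fd = np.array([
    (objective(x + eps * e, D, d_pres) - objective(x - eps * e, D, d_pres)) / (2 * eps)
    for e in np.eye(3)
])
analytic = gradient(x, D, d_pres)
print(analytic, fd)
```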
9.17.2.4.2.2 Handling the positivity constraint
So far, only the objective function f is considered, not taking
into account the positivity constraint on the beamlet weights.
Applying the gradient descent algorithm without accounting
for the positivity constraint leads to negative intensities for
some of the beamlets, which is not meaningful. Different
extensions of the gradient descent algorithm exist in order to
ensure positive beamlet weights.
One method consists in simply setting all negative beamlet
intensities to zero after each gradient step. Formally, this cor-
responds to a projection algorithm for handling bound con-
straints. An alternative approach is based on a variable
transformation. In this case, a new optimization variable is
introduced for every beamlet, which is defined as the square
root of the intensity. Thus, the beamlet intensity, given by
the squared value of the variable, is always positive, while the
optimization variable can take any value. This way, the con-
strained optimization problem is converted into a fully uncon-
strained problem.
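Both ways of handling the positivity constraint can be compared on a small example. In the Python/NumPy sketch below, the 2 x 2 matrix and prescription are hypothetical values chosen such that the unconstrained least-squares solution, $x = (3, -1)$, contains a negative intensity; the step sizes and iteration counts are assumptions suited to this toy problem.

```python
import numpy as np

# Hypothetical problem: rows = voxels, columns = beamlets.
A = np.array([[1.0, 1.0],
              [1.0, 2.0]])
d_pres = np.array([2.0, 1.0])

def grad_f(x):
    # Gradient of 1/2 * sum_i (d_i - d_pres_i)^2 with d = A x.
    return A.T @ (A @ x - d_pres)

# Option 1: projection. Negative intensities are set to zero after each step.
x = np.array([1.0, 1.0])
for _ in range(2000):
    x = np.maximum(x - 0.1 * grad_f(x), 0.0)
x_projected = x

# Option 2: variable transformation x_j = y_j**2 with unconstrained y.
y = np.array([1.0, 1.0])
for _ in range(5000):
    # Chain rule: df/dy_j = 2 * y_j * df/dx_j.
    y = y - 0.01 * 2.0 * y * grad_f(y ** 2)
x_transformed = y ** 2

print(x_projected, x_transformed)
```

Both variants arrive at approximately the same fluence, with the second beamlet switched off. Note that the transformed problem is no longer quadratic in the new variables, and a variable that starts exactly at zero would remain there.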
9.17.2.4.2.3 Improvements to gradient descent
The generic gradient descent algorithm shows slow conver-
gence in practical IMRT optimization problems. Improvements
to the generic gradient descent algorithms can be made mainly
in three aspects:
1. Selecting an appropriate step size using line search
algorithms.
2. Improving the descent direction by including second-
derivative information.
3. Improving the handling of constraints using more
advanced algorithms for constrained optimization.
For the first and third aspects, the reader is referred to the
advanced optimization literature. The second aspect is out-
lined in Section 9.17.2.4.3.
9.17.2.4.3 Including second derivatives
The generic gradient descent algorithm considers the first deriv-
ative of the objective function at the current fluence map x. This
can be interpreted as finding a hyperplane that is tangential to
the objective function at x. The convergence properties of
iterative optimization algorithms can be improved by includ-
ing second-derivative (i.e., curvature) information. This can be
interpreted as finding a quadratic function that is tangential to
the objective function at x. The iterative optimization algo-
rithm, known as the Newton method, then performs a step
toward the minimum of the quadratic approximation.
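To formalize this concept, we consider a second-order Taylor expansion of the objective function $f$ at the fluence map $x$:
$$\tilde f(x + \Delta x) = f(x) + \sum_{j=1}^{J}\frac{\partial f}{\partial x_j}\Delta x_j + \frac{1}{2}\sum_{j,k=1}^{J}\frac{\partial^2 f}{\partial x_j \partial x_k}\Delta x_j \Delta x_k$$
By defining the Hessian $H$ as the matrix of second derivatives, this can be written as
$$\tilde f(x + \Delta x) = f(x) + \nabla f(x)^T \Delta x + \frac{1}{2}\Delta x^T H(x)\,\Delta x$$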
The idea of the Newton method consists in taking a step $\Delta x$ such that we reach the minimum of the quadratic approximation. For the special case that the original objective function $f$ is a quadratic function, the approximation is exact, and thus, the Newton method finds the optimal solution in a single step. Generally, $f$ will not be a purely quadratic function. However, it is assumed that a Newton step will approach the optimum faster than a step along the gradient direction.
To calculate the Newton step $\Delta x^*$, we set the gradient of $\tilde f$ with respect to $\Delta x$ to zero, which yields the condition
$$\nabla f(x) + H(x)\,\Delta x^* = 0$$
Thus, the Newton step is given by
$$\Delta x^* = -H(x)^{-1}\nabla f(x)$$
This leads to a modified iterative optimization algorithm in which the beamlet intensities are updated according to
$$x^{k+1} = x^k - \alpha H\left(x^k\right)^{-1}\nabla f\left(x^k\right)$$
We can further note that the Newton method has a natural step size $\alpha = 1$.
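For a purely quadratic objective, the single-step property of the Newton method is easy to verify. The Python/NumPy sketch below uses the unconstrained tumor objective of the two-beamlet example; the starting point is arbitrary, and the Newton step is obtained by solving a linear system rather than by forming the inverse Hessian explicitly.

```python
import numpy as np

# Tumor part of the two-beamlet example: rows = tumor voxels, columns = beamlets.
A = np.array([[1.3, 0.7],
              [0.7, 1.3]])
d_pres = np.array([2.0, 2.0])

def gradient(x):
    return A.T @ (A @ x - d_pres)   # gradient of 1/2 * sum_i (d_i - 2)^2

def hessian(x):
    return A.T @ A                  # constant, because the objective is quadratic

x0 = np.array([0.0, 2.0])           # arbitrary starting fluence
dx = -np.linalg.solve(hessian(x0), gradient(x0))   # Newton step
x1 = x0 + dx                        # natural step size of 1
print(x1)
```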
In practical IMRT optimization, the pure Newton method is not applied. A naive computation of the Newton step involves the calculation of the Hessian matrix at point $x$, inverting the Hessian matrix, and multiplying the inverse Hessian $H(x^k)^{-1}$ with the gradient vector. In IMRT optimization, the size of the Hessian matrix is given by the number of beamlets squared. Therefore, the explicit calculation and inversion of the Hessian is computationally prohibitive. Thus, IMRT optimization employs so-called quasi-Newton methods, which rely on an approximation of the Newton step. One of the most popular methods that has been successfully applied in IMRT planning is the limited-memory BFGS (L-BFGS) quasi-Newton algorithm. In this algorithm, the descent direction $H(x^k)^{-1}\nabla f(x^k)$ is approximated based on the fluence maps and gradients evaluated during the previous iterations of the algorithm, which avoids a costly matrix inversion. A comprehensive description of the L-BFGS algorithm can be found in Nocedal and Wright (2006).
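To give an impression of how the quasi-Newton direction is built from stored fluence maps and gradients, the following Python/NumPy sketch implements the standard two-loop recursion together with a simple backtracking line search. It is a bare-bones illustration under several assumptions: bound constraints are ignored, the memory length and tolerances are arbitrary, and the function names are hypothetical. It is not the implementation used in any planning system.

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Two-loop recursion: approximate H^{-1} grad from stored (s, y) pairs."""
    q = grad.copy()
    alphas = []
    for s, y in zip(reversed(s_list), reversed(y_list)):   # newest pair first
        a = (s @ q) / (y @ s)
        alphas.append(a)
        q -= a * y
    if s_list:
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)                              # initial Hessian scaling
    for (s, y), a in zip(zip(s_list, y_list), reversed(alphas)):  # oldest pair first
        b = (y @ q) / (y @ s)
        q += (a - b) * s
    return q

def minimize_lbfgs(f, grad_f, x0, memory=5, n_iter=50):
    x = np.array(x0, dtype=float)
    g = grad_f(x)
    s_list, y_list = [], []
    for _ in range(n_iter):
        if np.linalg.norm(g) < 1e-10:
            break
        p = -lbfgs_direction(g, s_list, y_list)
        t = 1.0
        # Backtracking (Armijo) line search along the quasi-Newton direction.
        while t > 1e-10 and f(x + t * p) > f(x) + 1e-4 * t * (g @ p):
            t *= 0.5
        x_new = x + t * p
        g_new = grad_f(x_new)
        s, y = x_new - x, g_new - g
        if y @ s > 1e-12:                  # keep only pairs with positive curvature
            s_list.append(s)
            y_list.append(y)
            if len(s_list) > memory:
                s_list.pop(0)
                y_list.pop(0)
        x, g = x_new, g_new
    return x

# Unconstrained tumor objective of the two-beamlet example.
A = np.array([[1.3, 0.7], [0.7, 1.3]])
d_pres = np.array([2.0, 2.0])
f = lambda x: 0.5 * np.sum((A @ x - d_pres) ** 2)
grad_f = lambda x: A.T @ (A @ x - d_pres)

x_lbfgs = minimize_lbfgs(f, grad_f, x0=[0.0, 2.0])
print(x_lbfgs)
```

Only vector operations on the stored pairs are needed, so the cost per iteration grows linearly with the number of beamlets instead of quadratically.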
9.17.3 The Means to Deliver Optimized Fluence Distributions
The development of the computer-controlled MLC for field
shaping was a major step forward that set the stage for IMRT.
Beam modulation was first implemented by using computer-
controlled motorized block collimators to deliver a wedged
dose distribution. It is intuitively obvious that by holding one collimator stationary while moving its opposing mate across the radiation field, a triangular beam profile will result. It may not be so obvious that one should be able to deliver a beam profile of an arbitrary form using two opposing leaves of an MLC. However, one can do so simultaneously with each leaf pair in an MLC, thereby modulating the radiation field within the shaped collimation boundary of a cone beam to within certain limits imposed by the attenuation properties of the MLC leaves and the speed of the leaf motion. There are two approaches to the implementation of beam modulation with an MLC: dynamic MLC (DMLC) techniques and segmental MLC (SMLC) techniques (IMRT Therapy Collaborative Working Group, 2001).
Discussion of modulation methods can be facilitated with
some idea of how the MLCs are designed and controlled. The
designs differ among different vendors (Boyer et al., 2001).
This discussion will describe one Varian Medical Systems
design, the space not permitting a description of all the vari-
ants. The MLCs are composed of plates of tungsten, called
leaves, manufactured with a density that strikes a compromise
between high attenuation (brittle with high attenuation) and
efficiency and cost of manufacture and maintenance (more
malleable and lower attenuation). The design using leaves
that move perpendicular to the central axis of the radiation
field and have curved ends is depicted in Figure 12. The curved
ends provide for a constant penumbra as a leaf traverses the
radiation field in a straight line across its range of motion. The
aperture formed by the leaves is visualized on the patient by
a light source in the collimator directed toward the patient
by a thin mirror placed at a 45° angle to the central axis of
the radiation field. The location of the virtual light source is
adjusted to be at the x-ray target. The shadow of the leaves at
the boundary of this light field is an indication of the edge of
the radiation field. Each leaf is driven by an electric motor that
is in turn driven by a sophisticated computer-controlled cur-
rent distribution system. Each leaf is provided with a position
encoder. The leaves are placed in pairs (designated A leaves and
B leaves) that move along a common track with their curved
ends forming a gap between them through which radiation can
pass. A computer system coordinates the application of current
and the signal from the encoders so as to move the leaves to a
required location. Systems with 80 leaf pairs that produce leaf
tracks 5 mm wide at the patient are widely available. The
system sets the leaves to create a shaped treatment aperture.
Once the aperture is formed, a preset fluence is delivered
through the aperture as determined by a transmission ioniza-
tion chamber near the x-ray target that monitors the x-ray
beam. The current from the monitor chamber is digitized and
presented to the computer control system as ‘MU.’ MU are
calibrated to deliver a desired dose. The MLC was originally
introduced to implement 3D-CRT. However, it was soon real-
ized that the computer control of the system was an enabling
technology for beam intensity modulation within the overall
aperture shape.
9.17.3.1 SMLC or Step-and-Shoot Delivery
The SMLC is a fundamentally digital approach to defining and
delivering a modulated beam fluence. To employ the SMLC
approach, the fluence distribution for each gantry angle is
calculated beforehand by an optimization algorithm as
described in Section 9.17.2. Furthermore, the delivery of the
fluence map requires that the fluence distribution is discretized
in increments of spatial position for the MLC leaves. This is
usually accounted for in the optimization of the fluence distri-
bution, which is discretized into beamlets (synonymously bix-
els) that match the resolution of the MLC leaves. A beamlet is a
pyramidal radiation cone whose apex is at the center of the
x-ray source (bremsstrahlung x-ray target) and whose base is a
rectangle (see Figure 4 for an illustration). The beamlet base
can be defined in the plane perpendicular to the axis of rota-
tion of the collimator that passes through the treatment
machine isocenter (the isocenter plane). The width of the
beamlet base is determined by the width of the MLC leaf pair
with which it is associated. The length of the beamlet base is
measured in the direction of leaf travel. It is a parameter
selected for the planning and delivery system to be as small
as practical given the speed of the computers and the electro-
mechanical limitations of the MLC control system. The points
that define the bounds of the beamlet along its length direction
are control points for the SMLC delivery sequence. The faces of
the beamlets created by the ends of the leaves are characterized
by a penumbra that is slightly extended relative to the light
field by attenuation through the curved leaf ends. The other
sides of the beamlets are affected somewhat by the tongues and
grooves in the MLC leaf sides. These interlocking side shapes
reduce interleaf transmission along the sides of the leaf tracks.
[Figure 12 labels: x-ray target; primary collimator; dose monitor ionization chamber; rectangular field collimators; MLC leaves]
Figure 12 Schematic of bremsstrahlung x-ray beam production,
modification, and monitoring. Bremsstrahlung x-rays are produced by a
high-energy beam of electrons striking a metallic target. The resulting
cone of x-rays is truncated by a primary conical collimator. A conical
flattening filter attenuates the forward peak of the bremsstrahlung
radiation pattern. A set of parallel-plate ionization chambers monitor the
intensity and flatness of the beam. Rectangular block collimators
truncate the conical beam to a broad rectangular beam. The tungsten
MLC leaves form the beam to a desired shape. The leaves are also used to
modulate the intensity of the beam inside this shape.

For IMRT delivery using the SMLC approach, the fluence of
each beamlet is further discretized in increments of fluence
intensity (MU). As a result of discretizing the fluence, the
treatment becomes the delivery of a sequence of shaped ‘win-
dows’ composed of gaps between the MLC leaves. The window
formed by the MLC leaves through which radiation can pass is
also referred to as an aperture. Each aperture in the delivery
sequence consists of gaps between the MLC leaves that are an
integer number of beamlet lengths. During delivery, a window
is first formed by the gaps of the first instance in the sequence
without x-ray radiation, a step. Once all the leaves are verified
by the control computer to be in place, all leaf motion is frozen
and a discrete increment of x-ray radiation (e.g., number of
MU) is delivered, a shoot. This cycle composes the instances of
the sequence. This step-and-shoot process is repeated until
dose through all the required instance windows has been
delivered.
9.17.3.1.1 Basic leaf-pair algorithm
To further understand the fundamental concept of SMLC, con-
sider two simple examples. Consider first a fluence profile at
the top of Figure 13 that is to be delivered by a single leaf pair.
The profile is delivered by four beamlets of intensities 1, 2, 3,
and 1 from left to right. In the lower part of the figure are six
sequences that each deliver the desired profile. The order of
instances in each example runs from bottom to top. Sequence
1 is a ‘close-in’ method that starts with the leaves set (first step)
at the outer control points (lowest blue bar), delivers a unit of
fluence (first shoot), and closes down on the profile maxima
with two more steps each followed by shoots of one increment
of fluence moving up the sequence depiction. The accumulated
fluence is the desired fluence. The other sequences deliver the
same fluence profile using instances with different gaps for the
steps. Sequence 6 is an example of the sweeping window
approach. In this type of sequence, the gap between the leaves
begins at the left side of the profile. Each instance moves the
leaf ends progressively toward the right side of the profile.
As this example demonstrates, any profile of discrete con-
trol points and discrete fluence values can be delivered by a
multitude of sequences. The number of sequences that are
possible for a complex profile is very large. A one-dimensional
profile may have multiple maxima with intensity levels of H1,
H2, H3, . . . discrete fluence increments. In general let Max be
the total number of such maxima. Between any two maxima is
a minimum that drops to intensity levels P1, P2, . . . discrete
fluence increments. It has been shown (Webb, 1998a,b) that
the total number of possible sequences that will deliver the
profile is (Boyer et al., 2012).
[Figure 13 panels: Intensity profile; Sequence 1 through Sequence 6]
Figure 13 A simple fluence intensity profile consisting of four beamlets of intensities 1, 2, 3, and 1 from left to right. Below the profile, six
sequences are illustrated that can each deliver the fluence profile. Each sequence is composed of three instances to be delivered in order from the
bottom to the top. Sequence 1 is the close-in approach and sequence 6 is the sliding window approach. Reproduced from Boyer AL, Ezzell GA, and
Yu CX (2012) Treatment Planning in Radiation Oncology. 3rd edn. Philadelphia, PA: Lippincott, Williams & Wilkins, with permission from Lippincott,
Williams & Wilkins.

$$A = \frac{H_1!\,H_2!\,H_3!\cdots H_{Max}!}{P_1!\,P_2!\cdots P_{Max-1}!}$$
Applying this equation to our simple example in Figure 13,
there is a single peak of intensity 3, so that $H_{Max} = 3$, and no
minima, so that $P_{Max-1} = 0$:
$$A = \frac{3!}{0!} = 6$$
consistent with Figure 13.
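The counting formula can be checked with a few lines of Python. This is a minimal sketch: the function name `count_sequences` and its argument layout (peak heights and interior minima, both in units of fluence increments) are assumptions made for illustration, not part of the cited work.

```python
from math import factorial, prod

def count_sequences(maxima, minima):
    """Number of step-and-shoot sequences for one leaf-pair profile.

    maxima: peak heights H_1..H_Max in fluence increments.
    minima: heights P_1..P_(Max-1) of the minima between adjacent peaks.
    An empty list of minima contributes a factor of 1 (empty product).
    """
    numerator = prod(factorial(h) for h in maxima)
    denominator = prod(factorial(p) for p in minima)
    return numerator // denominator

# Profile 1, 2, 3, 1: a single peak of three increments and no interior minima.
print(count_sequences([3], []))
```

Because 0! = 1, passing the minimum as `[0]` instead of an empty list gives the same count.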
In what sense can one sequence be better than another?
Consider the total number of spatial units moved by the leaves
in the six sequences of Figure 13. For each instance, the sum of
the motions of the A and B leaves is three units for sequence 1,
four units for sequence 2, four units for sequence 3, five units for
sequence 4, four units for sequence 5, and three units for
sequence 6. Sequence 1 (the close-in approach) and sequence 6
(the sliding window approach) require less total leaf travel than
all the others. These sequences are more efficient than the others.
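The two efficient sequences can be checked numerically. In the sketch below, each instance is written as a pair of integer control-point indices (leaf A position, leaf B position) for the four-beamlet profile of Figure 13. The specific gap positions are one reading of sequences 1 and 6 and are assumed for illustration; the helper names are hypothetical.

```python
def accumulate(sequence, n_beamlets):
    """Sum one fluence increment per instance over every beamlet inside the gap."""
    fluence = [0] * n_beamlets
    for left, right in sequence:
        for j in range(left, right):
            fluence[j] += 1
    return fluence

def leaf_travel(sequence):
    """Total distance moved by leaf A and leaf B between consecutive instances."""
    return sum(abs(a2 - a1) + abs(b2 - b1)
               for (a1, b1), (a2, b2) in zip(sequence, sequence[1:]))

# Control points 0..4 bound beamlets 0..3 (target profile 1, 2, 3, 1).
close_in = [(0, 4), (1, 3), (2, 3)]   # gap narrows onto the peak
sweeping = [(0, 3), (1, 3), (2, 4)]   # gap moves from left to right

for name, seq in [("close-in", close_in), ("sweeping", sweeping)]:
    print(name, accumulate(seq, 4), leaf_travel(seq))
```

Both sequences reproduce the profile with three units of total leaf travel, in agreement with the count given for sequences 1 and 6.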
A second simple example of the beamlets in a cone beam is
shown in Figure 14. The intensities of the beamlets along five leaf
tracks labeled 18 through 22 are indicated by the height of the
bars. Each leaf track is indicated by a different bar color. In this
example, the beamlet width is 1 cm. The beamlet fluence inten-
sity increment is 10 arbitrary units. Every leaf pair that crosses the
bounding fluence aperture will be tasked with delivering its own
fluence profile during the sequence. Next, consider the algorithm
by which the sweeping window sequence can be automatically
determined for a given profile (see Figure 15). In panel (a) of
Figure 15, the beam profile for leaf track l = 19 from Figure 14 is
reproduced. The abscissa (labeled only on the bottom of panel
(b)) gives the leaf position control points (increments of one in
this example). Leaf A and leaf B will only stop at these points.
Leaf A will come from the left side of the figure and leaf B will
come from the right. The ordinate gives fluence or MU in an
arbitrary scale. This example will use discrete increments of ten
units of fluence. The first two beamlets on the left are to receive
20 units and the next is a maximum required to receive 80
units. Note that there are three maxima separated by two
minima. The algorithm can be described graphically using the
open and closed dots and the black and white numbers centered
within each fluence increment level in panel (a) of Figure 15.
The algorithm consists of producing the black and white num-
bers by tracing the profile from left to right. Each time a fluence
increment of ten units is crossed moving in the upward direction,
a white dot is placed on the profile at the control point on the left
(leaf A) side of the beamlet and numbered sequentially in black.
Each time an increment of ten units is crossed moving in the
downward direction, a black dot is placed on the profile at the
control point of the right (leaf B) side of the beamlet and num-
bered sequentially in white. Since the fluence started on the left at
zero and ended on the right at zero, there must be the same
number of white and black numbered dots. The numbered
[Figure 14 axes: track number (17 to 23); center between control points (−6 to 6); intensity (0 to 80)]
Figure 14 Graphic depiction of beamlet intensities for an optimized fluence distribution. The intensity levels from 0 to 80 are in arbitrary units.
The centers of the beamlets from −6 to +6 cm are indicated as well as MLC leaf-track numbers. The beamlets required to be delivered by each leaf pair
are of different colors. The beamlet intensities along a track are fluence profiles that leaf A and leaf B of that track must deliver. Reproduced from
Boyer AL, Ezzell GA, and Yu CX (2012) Treatment Planning in Radiation Oncology. 3rd edn. Philadelphia, PA: Lippincott, Williams & Wilkins,
with permission from Lippincott, Williams & Wilkins.

sequence of control points forms the instances of the sequence as
depicted in the bottom of the figure. Instance 1 is the lowest bar,
representing a gap produced with leaf A set at −4.5 and leaf B set
at −1.5. A fluence of ten units is delivered through the gap
between them. Moving up the sequence, there is no motion
before the delivery of the second fluence of ten units for Instance
2. Then, leaf A moves to the control point at −2.5 but leaf B
remains stationary. Another fluence of ten units is delivered for
Instance 3. The sequence continues as the leaves move progres-
sively from left to right. The accumulated fluence is the desired
[Figure 15 panels: (a) fluence profile for leaf track l = 19 with numbered A-leaf and B-leaf dots, fluence 10 to 80 in increments ΔΦ = 10; (b) resulting A-leaf and B-leaf sequence, accumulated fluence 10 to 110. Horizontal axis: control points from −5.5 to 5.5]
Figure 15 A simple example of a fluence profile is given in top panel (a). This is the fluence profile for leaf track 19 in Figure 14. The sweeping window
segmental MLC (SMLC) algorithm sweeps the leaves from left to right. The algorithm uses the numbered open and closed dots to construct the
sequence as described in the text. Panel (b) gives the resulting sequence that starts at the lowest bar and progresses upward. Reproduced from Boyer
AL, Ezzell GA, and Yu CX (2012) Treatment Planning in Radiation Oncology. 3rd edn. Philadelphia, PA: Lippincott, Williams & Wilkins, with permission
from Lippincott, Williams & Wilkins.

profile. These steps can be easily programmed into an algorithm
operating on a file (standardized in Digital Imaging and Com-
munication in Medicine-Radiation Therapy (DICOM-RT) for-
mat) containing the control points and the number of fluence
increments that make up the original desired profile.
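The dot-numbering procedure described above can be sketched in Python as follows. The function names and the plain-list input (rather than a DICOM-RT file) are assumptions for illustration. The profile is the track 19 example, expressed in increments of ten units, with control points from −4.5 to 4.5.

```python
def sweep_sequence(profile, edges):
    """Sweeping-window leaf sequence for one leaf pair.

    profile: fluence increments per beamlet (integers).
    edges: control points bounding the beamlets, len(profile) + 1 values.
    Returns one (leaf A position, leaf B position) pair per instance.
    """
    a_stops, b_stops = [], []
    padded = [0] + list(profile) + [0]       # fluence is zero outside the field
    for i in range(len(padded) - 1):
        step = padded[i + 1] - padded[i]
        if step > 0:                         # upward crossings: leaf A (open dots)
            a_stops.extend([edges[i]] * step)
        elif step < 0:                       # downward crossings: leaf B (closed dots)
            b_stops.extend([edges[i]] * (-step))
    return list(zip(a_stops, b_stops))       # k-th open dot pairs with k-th closed dot

def delivered(sequence, edges):
    """Count the instances in which each beamlet lies inside the gap."""
    centers = [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]
    return [sum(1 for a, b in sequence if a < c < b) for c in centers]

profile = [2, 2, 8, 5, 4, 6, 5, 6, 3]        # track 19, units of ten
edges = [x - 4.5 for x in range(10)]         # -4.5, -3.5, ..., 4.5
sequence = sweep_sequence(profile, edges)
for k, (a, b) in enumerate(sequence, start=1):
    print(f"Instance {k}: leaf A at {a}, leaf B at {b}")
```

The first three instances reproduce the positions quoted in the text (leaf A at −4.5, −4.5, and −2.5, with leaf B at −1.5), and `delivered(sequence, edges)` returns the original profile.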
The sweeping window sequence is only one of many
sequences that can be constructed for the profile as described
earlier. For the example in Figure 15, the number of possible
sequences is
$$A = \frac{8!\,6!\,6!}{4!\,5!}$$
The result is over 1.2 million sequences for one profile out
of five in this cone beam. The total number of possibilities for
the whole cone beam is the product of the number of possi-
bilities for each of the five profiles, a number in this example
that is in the trillions. The example given is much simpler than
most sequences in a clinical IMRT plan. In Figure 15, a total of
110 MU is required to deliver a peak beamlet with 80 MU
intensity. Other candidates for this fluence profile would
require more MU. They would be less desirable. A number of
authors have shown that, in general, the sliding window SMLC
sequence is the most efficient for complex sequences required
to deliver profiles with multiple peaks (Ma et al., 1998).
The gaps for each leaf pair in a sequence are put together to
create a sliding window for the sequence (see Figure 16). This
figure depicts the sliding window SMLC sequence that delivers
the whole area of fluence depicted in Figure 14. Note that the
sequence can be simplified by combining the first four instances
into a single instance of 0.20 fractional MU. Note also that the leaf
pair for track 19 did not begin until the fifth step-and-shoot cycle.
The overall sequence can be optimized by the synchronization of
the leaf pairs in each track (Ma et al., 1999). This description of
the algorithm also neglects fluence accumulated by transmission
through the MLC leaves (about 1–2%) and other practical dosim-
etry problems that will be considered in a later section.
9.17.3.1.2 Logarithmic direct aperture decomposition
Another approach to designing the sequences considers all the
beamlets at once in order to group multiple gaps together as
instances directly (Siochi, 1999). One such approach is the
logarithmic aperture decomposition of the optimized fluence
distributions for fixed gantry fields (Xia and Verhey, 1998).
This strategy is based on the notion that proceeding by powers
of 2 leads in some sense to optimal processes. The method will
be described by using it to create a sequence of apertures for the
fluence distribution depicted graphically in Figure 14. A
numerical matrix depiction of the distribution is given as
Instance 1 in Figure 17. The highest value in the distribution
[Figure 16 panels: 20 apertures labeled by cumulative fractional MU, 0.05 through 1.00 in steps of 0.05]
Figure 16 The aperture step sequence created by combining SMLC leaf sequences for the five leaf-track fluence profiles in the intensity distribution
depicted graphically in Figure 14. The central axis of the field is indicated by a red cross. A cumulative fraction of the total monitor units (MU) to be
delivered in 0.05 increments during each shoot part of each instance is given in the upper left beside each subgraph. Leaf track 19 is indicated by a
dashed line. The sequence depicted in Figure 15 starts in the instance labeled 0.25 cumulative fractional MU. The profile for track 19 is completely
delivered by the shoot in the instance labeled 0.75. The total number of MU required to deliver the peak beamlet intensity is 200 MU for this sequence.
Reproduced from Boyer AL, Ezzell GA, and Yu CX (2012) Treatment Planning in Radiation Oncology, 3rd edn. Philadelphia, PA: Lippincott,
Williams & Wilkins, with permission from Lippincott, Williams & Wilkins.

is 80 MU. The highest power of two that can be delivered is
therefore 64 MU, which can be delivered only at the two peak-intensity
beamlets in leaf tracks 19 and 21. Assuming an instance of
64 MU has been delivered, the residual matrix is depicted next
for a second instance. In the second instance, there are more
beamlets to which 32 MU can be delivered. However, some
tracks require more than one contiguous gap. Instances 2, 3,
and 4 are needed to irradiate beamlets in 32 MU increments.
The matrix is updated by subtracting the delivered MU for each
of these instances. The process is continued until only residual
areas of 2 MU are left. Instances 13, 14, and 15 complete the
delivery down to zero MU left to be delivered. The sequence so
constructed is efficient.
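The idea of peeling off powers of two can be illustrated with a simplified sketch. It is not the published algorithm: it assumes that each instance allows one contiguous gap per leaf track and that gaps in a track are assigned to instances in left-to-right order. The function names are hypothetical, and the intensity matrix is the one used in the matrix example of this section.

```python
import numpy as np

# Intensity map of the worked example (rows: leaf tracks 18 to 22), in MU.
I = np.array([
    [ 0, 30,  0, 20,  0, 30, 50,  0,  0],
    [20, 20, 80, 50, 40, 60, 50, 60, 30],
    [10, 40, 30, 40, 40, 20, 30, 60, 30],
    [ 0,  0, 20, 50, 40, 10, 30, 50, 70],
    [ 0,  0,  0,  0, 20, 30, 10, 40, 60],
])

def row_runs(mask_row):
    """(start, stop) index pairs of the contiguous True runs in one row."""
    runs, start = [], None
    for j, on in enumerate(mask_row):
        if on and start is None:
            start = j
        elif not on and start is not None:
            runs.append((start, j))
            start = None
    if start is not None:
        runs.append((start, len(mask_row)))
    return runs

def power_of_two_decomposition(intensity):
    """Return a list of (MU, aperture) pairs whose weighted sum is intensity."""
    residual = np.array(intensity, dtype=int)
    level = 1
    while level * 2 <= residual.max():      # largest power of two <= peak value
        level *= 2
    instances = []
    while level >= 1:
        runs = [row_runs(row) for row in residual >= level]
        for k in range(max(len(r) for r in runs)):
            aperture = np.zeros_like(residual)
            for i, r in enumerate(runs):
                if k < len(r):              # k-th gap of this track, if it has one
                    aperture[i, r[k][0]:r[k][1]] = 1
            instances.append((level, aperture))
            residual -= level * aperture
        level //= 2
    return instances

instances = power_of_two_decomposition(I)
print([mu for mu, _ in instances])
```

Under these assumptions the sketch yields one 64 MU instance followed by three 32 MU instances, matching the first four instances described in the text. The later instances depend on how gaps are grouped and need not match Figure 17.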
9.17.3.1.3 Matrix inversion
The one-dimensional leaf-track sequencing method and the
two-dimensional decomposition leaf sequencing method are
conceptually mathematical operations on a two-dimensional
intensity matrix. In the previous sections, we have used graphic
examples to describe the concepts. The implementation
requires computer algorithms that take these steps mathemat-
ically. An elegant example is a matrix operator method for leaf
sequencing developed by Ma et al. (1998, 1999). The beamlet
intensity map I (the same matrix as x in Section 9.17.2.1) is
treated as a matrix, and the steps to the leaf positioning
sequences are matrix operations on I that lead to an ordered
set of matrices describing the gap sequence. To describe this
method, we will employ the simple example used in the pre-
vious sections (see Figure 14). The matrix representation of
this intensity map is
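$$I = \begin{bmatrix}
0 & 30 & 0 & 20 & 0 & 30 & 50 & 0 & 0 \\
20 & 20 & 80 & 50 & 40 & 60 & 50 & 60 & 30 \\
10 & 40 & 30 & 40 & 40 & 20 & 30 & 60 & 30 \\
0 & 0 & 20 & 50 & 40 & 10 & 30 & 50 & 70 \\
0 & 0 & 0 & 0 & 20 & 30 & 10 & 40 & 60
\end{bmatrix}$$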

Note that we only use the leaf tracks and leaf position
indices that deliver some nonzero intensity. Each row in the
matrix represents a leaf track, and each column represents a
beamlet bounded by leaf-end control positions. The top row of
I contains the MU levels along leaf track 18, the second row
contains the MU levels along leaf track 19 (this row was ana-
lyzed earlier), and the bottom row of I contains the MU levels
along leaf track 22. The next step is to locate positive and
negative gradients in I by means of an ‘increment matrix’ A
that separates out the positive and negative steps:
[Figure 17 panels: Instance 1 – 64 MU; Instance 2 – 32 MU; Instance 14 – 2 MU; Instance 15 – 2 MU. Each panel is a matrix of residual MU indexed by track number (17 to 23) and center between control points (−6 to 6)]
Figure 17 Logarithmic direct aperture decomposition. The first two and last two instances of a step-and-shoot sequence composed by creation of
apertures that deliver exponentially decreasing (powers of 2) increments of MU. The upper left panel depicts the matrix representing the cone beam
intensity pattern of Figure 14. The two elements in which 64 MU may be delivered are indicated by white. The second panel at the top shows the
residual MU to be delivered and indicates in white the window through which 32 MU may be delivered. The same method is repeated for consecutive
instances down to the fourteenth instance depicted in the bottom left panel. The residual intensity matrix consists of elements requiring 2 MU.
The white elements indicate possible gaps for the delivery of 2 MU. The last panel depicts the last delivery instance after which all intensities have been
delivered.

$$A = IW$$
where W is defined as
$$W = \begin{bmatrix}
1 & -1 & 0 & \cdots & 0 \\
0 & 1 & -1 & \cdots & 0 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & 1 & -1 \\
0 & \cdots & 0 & 0 & 1
\end{bmatrix}$$
Applying the gradient detection matrix to our example
yields
$$A = IW = \begin{bmatrix}
+30 & +30 & -20 & +20 & -30 & -20 & +50 & 0 & 0 \\
0 & -60 & +30 & +10 & -20 & +10 & -10 & +30 & +30 \\
+30 & +10 & -10 & 0 & +20 & -10 & -30 & +20 & +40 \\
0 & -20 & -30 & +10 & +30 & -20 & -20 & -20 & +70 \\
0 & 0 & 0 & -20 & -10 & +20 & -30 & -20 & +60
\end{bmatrix}$$

The increment matrix, A, is then decomposed into two
matrices, one holding the positive elements and the other
holding the magnitudes of the negative elements, so that
$A = A^{+} - A^{-}$. The $A^{+}$ and $A^{-}$ matrices are calculated as
$$A^{+} = \tfrac{1}{2}\left(|A| + A\right), \qquad A^{-} = \tfrac{1}{2}\left(|A| - A\right).$$
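An application to our example is
$$A^{+} = \begin{bmatrix}
30 & 30 & 0 & 20 & 0 & 0 & 50 & 0 & 0 \\
0 & 0 & 30 & 10 & 0 & 10 & 0 & 30 & 30 \\
30 & 10 & 0 & 0 & 20 & 0 & 0 & 20 & 40 \\
0 & 0 & 0 & 10 & 30 & 0 & 0 & 0 & 70 \\
0 & 0 & 0 & 0 & 0 & 20 & 0 & 0 & 60
\end{bmatrix}$$
and
$$A^{-} = \begin{bmatrix}
0 & 0 & 20 & 20 & 30 & 20 & 0 & 0 & 0 \\
0 & 60 & 0 & 0 & 20 & 0 & 10 & 0 & 0 \\
0 & 0 & 10 & 0 & 0 & 10 & 30 & 0 & 0 \\
0 & 20 & 30 & 0 & 0 & 20 & 20 & 20 & 0 \\
0 & 0 & 0 & 20 & 10 & 0 & 30 & 20 & 0
\end{bmatrix}$$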
The leaf trajectory (LT) matrices for delivering the intensity
map I are then calculated using the following operations:
$$I_T = A^{+} W^{-1}, \qquad I_L = A^{-} W^{-1}$$

where IT and IL are LT matrices for the trailing and the leading
leaves, respectively. The value of an IL or IT matrix element is
the accumulated MU or intensity delivered at the location of a
leaf checkpoint corresponding to the position of the column in
this matrix. In the example, we obtain the results
$$I_T = A^{+} W^{-1} = \begin{bmatrix}
100 & 100 & 70 & 70 & 50 & 50 & 50 & 0 & 0 \\
110 & 110 & 110 & 80 & 70 & 70 & 60 & 60 & 30 \\
90 & 90 & 80 & 80 & 80 & 60 & 60 & 60 & 40 \\
110 & 110 & 110 & 110 & 110 & 70 & 70 & 70 & 70 \\
80 & 80 & 80 & 80 & 80 & 80 & 60 & 60 & 60
\end{bmatrix}$$
and
$$I_L = A^{-} W^{-1} = \begin{bmatrix}
100 & 70 & 70 & 50 & 50 & 20 & 0 & 0 & 0 \\
90 & 90 & 30 & 30 & 30 & 10 & 10 & 0 & 0 \\
80 & 50 & 50 & 40 & 40 & 40 & 30 & 0 & 0 \\
110 & 110 & 90 & 60 & 60 & 60 & 40 & 20 & 0 \\
80 & 80 & 80 & 80 & 60 & 50 & 50 & 20 & 0
\end{bmatrix}$$
Ma has shown that the total number of leaf segments is
given by the total number of distinct nonzero elements of
the LT matrices under this algorithm. For this example, there
are a total of 11 of them, that is, {10, 20, 30, 40, 50, 60, 70, 80,
90, 100, 110}. These correspond to the equal steps of ten from
0 to the maximum value of 110 used in the original intensity
matrix. Therefore, the total MU for the delivered sequence is
110 and the total number of segments is 11 with 10 MU
delivered for each segment. The shape of a subfield at the
instant when the accumulated MU equals m is composed
using the following expression:
$$O(m) = (I_T - mL)^{(+)} - (I_L - mL)^{(+)}$$
where O(m) is a matrix giving the shape of the open aperture of
the subfield, mapped out by its +1 elements, L is the unit matrix
whose elements are all +1, and the superscript (+) denotes the
operation replacing all nonnegative elements (positive or 0) of the
bracketed matrix with +1 and all other elements with zero. All 11
beam apertures are given in Figure 18. Note that the direction of
motion of the leaves is switched in this depiction, being right to
left as compared to left to right in our earlier examples. If one
compares the gap sequence for leaf track 19 in Figure 15(b) with
the track 19 gap sequence in Figure 18, one finds that they are in
fact identical.
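The matrix operations can be sketched with NumPy. One assumption should be stated plainly: the sketch differences each beamlet against its right-hand neighbor, $A_{ij} = I_{ij} - I_{i,j+1}$, which places the −1 entries of the difference operator just below the diagonal. This convention is chosen because it reproduces the tabulated rows for track 19 and corresponds to the right-to-left sweep of Figure 18. The function names are hypothetical.

```python
import numpy as np

# Intensity map of the worked example (rows: leaf tracks 18 to 22), in MU.
I = np.array([
    [ 0, 30,  0, 20,  0, 30, 50,  0,  0],
    [20, 20, 80, 50, 40, 60, 50, 60, 30],
    [10, 40, 30, 40, 40, 20, 30, 60, 30],
    [ 0,  0, 20, 50, 40, 10, 30, 50, 70],
    [ 0,  0,  0,  0, 20, 30, 10, 40, 60],
])

def leaf_trajectory_matrices(intensity):
    """Return the increment matrix A and the trajectory matrices I_T and I_L."""
    n = intensity.shape[1]
    # Assumed operator: (I @ W)[i, j] = I[i, j] - I[i, j + 1].
    W = np.eye(n, dtype=int) - np.eye(n, k=-1, dtype=int)
    A = intensity @ W
    A_plus = (np.abs(A) + A) // 2            # positive steps
    A_minus = (np.abs(A) - A) // 2           # magnitudes of negative steps
    W_inv = np.rint(np.linalg.inv(W)).astype(int)
    return A, A_plus @ W_inv, A_minus @ W_inv

def aperture(I_T, I_L, m):
    """O(m): 1 where the beamlet is open at accumulated MU m, else 0."""
    return (I_T >= m).astype(int) - (I_L >= m).astype(int)

A, I_T, I_L = leaf_trajectory_matrices(I)
step = 10                                    # MU per segment in this example
levels = range(step, I_T.max() + step, step)
delivered = sum(step * aperture(I_T, I_L, m) for m in levels)
print(len(levels), "segments, total MU =", I_T.max())
```

Summing the 11 apertures, each weighted by 10 MU, returns the intensity map, and the track 19 rows of `A`, `I_T`, and `I_L` agree with the second rows of the matrices given in the text.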
9.17.3.2 DMLC Delivery
The earliest implementation of beam modulation was the
development of algorithms and computer control systems to
deliver beams that mimicked the dose distributions produced
by a physical radiotherapy wedge. The delivery sequence
begins with the x-ray collimating blocks positioned at the
margins of the field to be delivered. As radiation is delivered
at a uniform rate, one collimator is moved with a controlled
velocity across the x-ray beam to a stopping point near its
opposing mate. The side of the x-ray beam at which the moving
collimator stops will obviously receive more dose than the side
at which it started. The system uses position control points
across the track of the moving collimator. Accurate control of
the MU delivered by the times at which the collimator reaches
each control point allows the creation of a beam dose profile
that mimics the physical wedge dose profile. The control sys-
tem requires a feedback loop that compares the MU delivered
and the collimator position. If the collimator begins to fall
behind the MU delivery schedule, its velocity can be increased.
Conversely, if the collimator gets ahead of schedule, its velocity
can be decreased. With closely spaced control points and a
short feedback loop cycle, the block’s trajectory can be pre-
cisely controlled. The application of this technology to the
simultaneous control of all the leaves in an MLC enables the
development of a continuously sweeping DMLC. In this imple-
mentation, the effective velocity of the leaves can be measured
in units of distance moved per MU delivered instead of

distance moved per time elapsed. Given the ability to precisely
control the effective velocity of all the MLC leaves, changing
the effective velocities of the leading leaves and the trailing
leaves determines the accumulated fluence delivered for each
beamlet along the leaf-pair track. The beamlet intensity is
determined by t, the difference in time (in units of MU) at
which the leading leaf crosses the beamlet position and the
time (in units of MU) at which the trailing leaf crosses the
beamlet position. This MU difference determines the beamlet
fluence value and is directly related to the dosimetry of the
treatment.
9.17.3.2.1 Leaf-pair speed optimization
What algorithm will best compose a dynamic LT schedule for a
fluence profile? The ideas behind the algorithm can be dis-
cussed with the aid of Figure 19. Figure 19(a) depicts a fluence
profile against position. This simple arbitrary example contains
three maxima and two minima. Thus, it has six regions deter-
mined by the sign of the gradient of the profile. The modulated
fluence is to be created by a schedule for leaves moving from
left to right with a leading leaf B on the right moving with
velocity VB and a trailing leaf A on the left moving with velocity
VA. The leaf velocities, VA(t) and VB(t), must be chosen
such that the MU delivered between the
time leaf B reaches a position and opens it to the receipt of
radiation and the time leaf A reaches the position and shuts off
the radiation to that point is equal to the beamlet intensity for
that position. The first positive gradient region is similar to a
wedged field with a complex shape created by a gap increasing
between the leaves. It could be delivered with the trailing leaf
being stationary and the velocity of the leading leaf modulated
to form the beam shape. However, the next region has a
negative gradient and must be delivered by a closing gap. The
leading leaf B must therefore race to the position of the first
maximum with maximum velocity, Vmax, so that it can partic-
ipate in a closing gap beginning at that point. The regions of
the graph having negative slope can be rotated about the
vertical (Figure 19(b)) and shifted (Figure 19(c)) to maintain
the same opening time t as the original profile but allow for
closing gaps. The resulting figure can then be skewed with a
slope that corresponds to the maximum leaf velocity
(Figure 19(d)). The graphic operations can be translated into
mathematical operators from which the leaf velocities can be
derived (Xing et al., 2005). The leaf velocities for leaf A, VA, and
leaf B, VB, as a function of the leaf position for the dynamic
leaf sequence were originally derived independently by sev-
eral investigators (Dirkx et al., 1998; Spirou and Chui, 1994;
Svensson et al., 1994).
Ψ gradient    V_A                              V_B
Positive      V_max / [1 + V_max (dΨ/dx)]      V_max
Negative      V_max                            V_max / [1 − V_max (dΨ/dx)]

where dΨ/dx is the gradient of the fluence profile. The numerical
result for our example in Figure 19 using the equations
above is depicted in Figure 20.
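The velocity rules translate directly into a leaf schedule when time is measured in MU. The following sketch is a discrete version under stated assumptions: the fluence is sampled at evenly spaced positions, `v_max` is an illustrative maximum speed in cm per MU, and the function name is hypothetical. Over each interval one leaf travels at `v_max` while the other is slowed by the change in fluence.

```python
def dmlc_trajectories(positions, fluence, v_max):
    """Arrival 'times' (in MU) of trailing leaf A and leading leaf B.

    The fluence delivered at positions[i] is t_a[i] - t_b[i]: the MU that
    elapse between leaf B opening the point and leaf A closing it.
    """
    t_b = [0.0]                      # leaf B opens the first point at 0 MU
    t_a = [float(fluence[0])]        # leaf A closes it fluence[0] MU later
    for i in range(1, len(positions)):
        travel = (positions[i] - positions[i - 1]) / v_max   # MU at full speed
        dphi = fluence[i] - fluence[i - 1]
        if dphi >= 0:                # positive gradient: B at V_max, A slowed
            t_b.append(t_b[-1] + travel)
            t_a.append(t_a[-1] + travel + dphi)
        else:                        # negative gradient: A at V_max, B slowed
            t_a.append(t_a[-1] + travel)
            t_b.append(t_b[-1] + travel - dphi)
    return t_a, t_b

positions = [-4, -3, -2, -1, 0, 1, 2, 3, 4]          # cm
fluence = [20, 20, 80, 50, 40, 60, 50, 60, 30]       # MU, track 19 profile
t_a, t_b = dmlc_trajectories(positions, fluence, v_max=0.5)
print("total MU:", t_a[-1])
```

In each interval the slowed leaf covers the distance `dx` in `dx / v_max + |dphi|` MU, which is the discrete form of V_max / [1 ± V_max (dΨ/dx)]. The total MU exceeds the sum of the positive fluence steps by the time needed to traverse the field at maximum speed.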
9.17.3.2.2 Special quality assurance
The actions sufficient to ensure the safe and accurate performance
of an IMRT treatment system fall into two overlapping
[Figure 18 panels: 11 apertures labeled by accumulated MU, 10 MU through 110 MU in steps of 10 MU, each showing leaf tracks 18 to 22]
Figure 18 The leaf aperture sequence produced by the matrix method. The MU identify the order of the sequence. The leaf-track numbers are
given to the right of each instance of the sequence. Note that the gaps sweep from right to left. The sequence of gap widths and locations are the same as
that depicted for the step-and-shoot sequence given in Figure 16.

divisions of labor. Initially, the installation of the system must
be probed with extensive tests and measurements to demon-
strate acceptable performance and to collect, verify, and install
data for the computer files that will be used routinely. These
tests then evolve into efficient routine tests intended to verify
that the system continues to perform as initially demonstrated.
The IMRT system is extensive, overlapping with more routine
treatment procedures and processes, consisting of
• the general imaging equipment (CT scanners, MRI
scanners, PET/CT scanners, and gamma cameras) used to
[Figure 20 plot: leaf position schedule for leaf A and leaf B. Horizontal axis: position (cm), −6.0 to 6.0; vertical axis: accumulated MU, 0 to 100]
Figure 20 Computed leaf trajectories for the fluence profile in Figure 19. The schedules for leaf A and leaf B are a function of total time that is in
turn proportional to MU. Leaf B begins moving with maximum velocity (red line), while leaf A produces the beam modulation. Then, the roles are
switched back and forth in each gradient region for the rest of the sequence.
[Figure 19 panels (a) to (d): fluence Φ versus position (cm); panel (d) is annotated with the slope 1/Vmax]
Figure 19 Graphic description of the derivation of the dynamic MLC (DMLC) algorithm. Panel (a) depicts the fluence profile to be delivered and
indicates the positive and negative gradient regions. Panel (b) depicts a rotation of the negative gradient regions about the horizontal. Panel (c) depicts a
shift of the regions to remove discontinuities in a leaf opening window. Panel (d) depicts the shear of the trajectories to account for the maximum
leaf speed.
