PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY VOLUME 36 DECEMBER 2008 ISSN 2070-3740
A Soft-Decision Approach for Microcalcification
Mass Identification from Digital Mammogram
B.Surendiran€, A.Vadivel£, Henry Selvaraj¥
Abstract—Breast cancer is one of the major causes of fatality
among women aged 40 and above. Digital mammography is used
by radiologists for the analysis and interpretation of cancer. Visual
reading and interpretation of mammograms is a very demanding
and expensive job. Even well-trained experts may have an
inter-observer variation rate of 65-75 percent. Computer Aided
Diagnosis (CAD) systems have been developed to complement
radiologists in interpreting mammograms for mass detection and
identification of calcification. Thus, it is very important to develop
CAD systems that can identify malignant lesions effectively. A
combination of a CAD scheme and expert knowledge would
effectively improve the detection rate and accuracy for masses.
We use a soft-decision approach for identifying the
microcalcification mass present in digital mammograms. A
suitable clustering algorithm is applied for partitioning the digital
mammogram into various meaningful regions. During the
post-processing phase, the background region is identified and
excluded from further processing. The Coefficient of Variation
(CV) of the various regions of the partitioned mammogram is
calculated and the microcalcification lesion present in the
mammogram is identified. The experimental results are found to be
encouraging.
Keywords - Mammogram, Soft-decision, Microcalcification, Gray Weight
I. INTRODUCTION
It has been found that breast cancer occurs in over eight
percent of women during their lifetime, making it one of the
leading causes of death [4]. Various studies have shown a
positive association between tissue type and potential breast
cancer risk [6],[13]. Further, women who have breast cancer are
prone to developing contralateral cancer in the other breast
[8],[10]. However, distinguishing a new primary from a metastasis
is not always possible due to the similarity of their features. The
asymmetry of breast parenchyma between the two sides has been
found to be one of the useful signs for detecting primary breast
cancer [7]. While various methods have been proposed for early
detection and screening of breast cancer, mammography is
considered one of the most effective [1]. Two important early
signs of the disease are microcalcifications and masses [2]. Of
these two signs, masses are considered more difficult to detect
than microcalcifications, since their low-level features are
usually obscured by, or similar to, normal breast parenchyma. In
addition, masses are quite thin and often present in the dense
areas of the breast tissue. They have smoother boundaries than
microcalcifications and have shapes such as circumscribed,
spiculated, lobulated or ill-defined. The circumscribed ones
usually have distinct boundaries, are 2-30 mm in diameter and are
high-density radiopaque. The spiculated ones have rough,
star-shaped boundaries and the lobulated ones have irregular
shapes [3]. The masses present must be classified as benign or
malignant to improve the biopsy yield ratio. This classification
is based on certain properties of the respective region: masses
that are radiopaque and more irregular in shape are usually
defined as malignant, while regions combined with radiolucent
shapes are defined as benign [14].
The content of a mammogram can be differentiated into four
levels of intensity, namely background, fat tissue, breast
parenchyma and calcifications, in order of increasing intensity.
Masses develop from the epithelial and connective tissues of the
breast, and their densities on mammograms blend inseparably with
the parenchyma pattern. From a medical viewpoint, reading and
interpreting mammograms visually is considered to be a very
demanding job for radiologists. Their judgment depends
essentially on training, experience and subjective criteria. Even
well-trained experts may have an inter-observer variation rate of
65-75 percent. Computer Aided Diagnosis (CAD) systems have been
developed to complement radiologists in interpreting mammograms
for mass detection and identification of calcification. This
gives an edge in diagnosis; it has also been observed that 65-90
percent of the biopsies of suspected cancers turned out to be
benign. Thus, it is very important to develop CAD systems that
can distinguish between benign and malignant lesions effectively.
The combination of a CAD scheme and expert knowledge would
effectively improve the detection accuracy for masses. While the
detection sensitivity without CAD is found to be 80 percent, with
CAD it is found to be up to 90 percent [5].
€ B. Surendiran is currently working as a research scholar at the National
Institute of Technology Tiruchirappalli, India (Email: 405107004@nitt.edu).
£ A. Vadivel is with the Department of Computer Applications, National
Institute of Technology Tiruchirappalli, India (Corresponding Author;
Email: vadi@nitt.edu).
¥ Henry Selvaraj is with the Department of Electrical and Computer
Engineering, University of Nevada, Las Vegas, Nevada, United States of
America, 89154 (Email: selvaraj@unlv.nevada.edu).
Generally, most of the existing mass detection
CAD schemes involve various phases such as digitizing
mammograms, image processing, image segmentation,
feature extraction and selection, classification and
evaluation. First, image preprocessing is carried out on the
digitized mammogram to suppress noise and improve contrast.
Secondly, image segmentation is performed to locate the
suspicious regions, which are taken as the Regions of Interest
(ROI); this differs from the common definition of segmentation
in image processing. In the third phase, low-level features are
extracted and selected for classifying lesion types or removing
false positives. Finally, the detection or classification of
masses is performed. Although the CAD schemes were developed
independently using different data sets of limited size, most of
them yielded similar performance, in the range of 85-95 percent
true positive rate with 1-2 percent false positives.
In this paper, we use a soft-decision approach from
the HSV color space for identifying microcalcification
masses present in digital mammograms. The K-Means
clustering algorithm is applied for partitioning the digital
mammogram into various meaningful regions. The
Coefficient of Variation (CV) of the various regions of the
segmented mammogram is used as a statistical measure for
identifying microcalcification masses. The soft-decision
based gray content estimation from the HSV color space is
presented in Section II. In Section III, we present pixel
grouping by the K-Means clustering algorithm and the
post-processing schemes. The experimental results are presented
in Section IV and we conclude the paper in the last section.
II. SOFT-DECISION BASED GRAY CONTENT ESTIMATION FROM
THE HSV COLOR SPACE
The content of a mammogram can be discriminated with four
intensity levels. However, it is known that a mass present
in a mammogram is surrounded by smooth boundaries. Thus,
it is essential to capture the boundary information through
pixel values for partitioning the masses in the spatial domain.
We do spatial domain processing in the HSV color space,
since this color space is closely related to the human visual
perception of color and gray pixels. For each pixel, a
weighted value is calculated from the soft-decision function,
and it captures the degree of gray content of the pixel. The
weight function is found to be robust against noise [11].
The function is given below:

$$GW(S, I) = 1 - S^{\, r_1 (255/I)^{r_2}} \quad \text{for } !(R = G = B) \qquad (1)$$

The range of GW(S, I) is [0, 1] and it estimates the
degree of gray content of a pixel using both the saturation
and intensity values. Eq. 1 is applied only when
!(R = G = B). For R = G = B, the saturation of a pixel is zero,
so the power term in Eq. 1 vanishes and the gray weight takes
the same value irrespective of the intensity. To capture the
degree of gray content of a pixel effectively, we therefore
slightly perturb the value of R, G or B, which influences the
saturation value of the pixel. Further, the function is smooth
in the saturation value and found to be continuous [12], as
shown in Figure 1. It is evident from Fig. 1 that r1 should take
a value slightly higher than 0.0 and r2 a value slightly less
than 1.0 to obtain a smooth variation of the gray weight values.
From our earlier work, suitable values are found to be r1=0.1
and r2=0.85.
[Figure 1: plot of DS (vertical axis, 0 to 2) against Intensity (horizontal axis, 0 to 240), with one curve each for Sat = 0.0, 0.2, 0.4, 0.6, 0.8 and 1.0.]
Fig. 1. Variation of Partial Derivative of GW (S, I) with S for Different Values of Saturation with r1=0.1, r2=0.85
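The gray weight of Eq. 1 can be illustrated with a minimal sketch. It assumes the reading GW(S, I) = 1 - S^(r1 (255/I)^r2), takes S and I from the standard HSV conversion of an 8-bit RGB pixel (I being the value channel scaled to [0, 255]), and treats a black pixel as fully gray; the function name `gray_weight` and these conventions are illustrative assumptions, not the authors' implementation. Under this reading an exactly gray pixel (S = 0) yields a weight of 1 regardless of intensity.

```python
import colorsys

R1, R2 = 0.1, 0.85  # parameter values reported in the text


def gray_weight(r, g, b, r1=R1, r2=R2):
    """Gray weight of one 8-bit RGB pixel, assuming
    GW(S, I) = 1 - S ** (r1 * (255 / I) ** r2)."""
    _, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    intensity = v * 255.0  # assumption: I is the HSV value channel in [0, 255]
    if intensity == 0:
        return 1.0  # assumption: a black pixel is treated as fully gray
    return 1.0 - s ** (r1 * (255.0 / intensity) ** r2)


# A fully saturated pixel has no gray content; a weakly saturated
# pixel has more gray content than a strongly saturated one.
fully_saturated = gray_weight(255, 0, 0)
weakly_saturated = gray_weight(200, 190, 190)
strongly_saturated = gray_weight(200, 100, 100)
```

Keeping r1 and r2 as keyword arguments makes it easy to see how the weight changes when they move away from the reported 0.1 and 0.85.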
III. PIXEL GROUPING BY K-MEANS CLUSTERING ALGORITHM
AND POST PROCESSING
Fig. 2. Result of K-Means clustering for grouping the pixels and
post-processing: (a) original mammogram, (b) after applying
K-Means clustering, (c) after applying connected component
analysis, (d), (e) and (f) final refined mammogram.
We perform a series of post-processing steps after applying the
clustering algorithm to the weighted gray values of the
mammogram pixels, which are calculated using Eq. 1. The
clustering problem is to represent the mammogram as a set
of n non-overlapping partitions, as given below:

$$I \equiv \{O_1 | O_2 | O_3 | \ldots | O_n\} \qquad (2)$$

Here, each O_i consists of the positions of its image pixels
together with the equivalent gray weight, size and center. We use
K-Means clustering for pixel grouping. We start with K=2 and
adaptively increase the number of clusters until the maximum
number of clusters, which is 8, is reached. The result of
clustering and post-processing is shown in Figure 2. In
Fig. 2(a), we show a mammogram with a large, irregularly shaped
spiculated lesion. In Figure 2(b), we show the transformed image
after it is clustered based on gray weight. It is noticed from
Fig. 2(b) that the clustered pixels do not yet contain sufficient
information about the various regions in the image, because
it is not yet known whether all the pixels that belong to the
same cluster are actually part of the same region. To
ascertain this, we perform a connected component analysis
[9] of the pixels to determine the different regions in the
mammogram. We also identify the connected components
whose size is less than a certain percentage of the size of
the mammogram. These small regions are to be merged
with the surrounding clusters in the next step. Such regions,
which are candidates for merger, are shown in white in Fig.
2(c). In the last post-processing step, the small regions are
merged with the surrounding regions with which they
have maximum overlap. The mammogram at the end of this
step is shown in Fig. 2(d). It is seen that the various
foreground and background objects of the mammogram
have been clearly segmented. The segmented mammograms for
analysis and interpretation are shown in Fig. 2(e)-(f). In
Figure 3, we also show the segmented mammogram images without
performing connected component analysis.
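The pixel grouping step can be sketched as plain Lloyd-style K-Means on scalar gray weights. This is only an illustration under stated assumptions: the evenly spread initial centers, the iteration cap and the name `kmeans_1d` are not from the paper, and the rule used to stop increasing K before 8 is not specified in the text, so the sketch simply shows one run for a given K.

```python
def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm on scalar gray weights.

    Returns (centers, labels). Initial centers are spread evenly
    over the value range (an assumption made for determinism).
    """
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c]))
                  for v in values]
        # Update step: each center becomes the mean of its members;
        # an empty cluster keeps its previous center.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members)
                               if members else centers[c])
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return centers, labels


# Toy gray weights with two obvious groups. In the setting of the
# text, K would be swept from 2 up to 8, e.g. for k in range(2, 9).
weights = [0.10, 0.12, 0.11, 0.90, 0.88, 0.92]
centers, labels = kmeans_1d(weights, k=2)
```

On an image, `values` would be the flattened per-pixel gray weights and `labels` would be reshaped back to the image grid before any region analysis.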
Fig. 3. Mammogram segmentation with different K: (a) original
mammogram, (b) segmented mammogram with K=4, (c)
segmented mammogram with K=6, (d) segmented mammogram
with K=8.
It is observed from Fig. 3 that the microcalcification masses
present are very small, with ill-defined boundaries.
Using connected component analysis to merge smaller regions
with neighboring larger regions would, in this case, result
in loss of information about the microcalcification mass
region. Thus, for our experiments, we adaptively decide
whether to use connected component analysis or not. In
another post-processing approach, we group the regions into
larger and smaller ones by choosing a suitable threshold on
the size of the regions, as given below.

$$I_1 \equiv \{O_1 | O_2 | O_3 | \ldots | O_k\} \qquad (3)$$

$$I_2 \equiv \{O_{k+1} | O_{k+2} | O_{k+3} | \ldots | O_n\} \qquad (4)$$

where k<n. As a result of this partition, the background
information is also eliminated.
In addition, regions whose gray weight is less than the
average gray weight of the entire mammogram are
discarded and not considered for further calculation, so as
to avoid small background regions. Finally, since the
microcalcification mass area in a mammogram is small, it
will fall in the smaller partition.
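The size-based partition of Eqs. 3 and 4 and the gray weight filter can be sketched as follows. The sketch assumes that regions are the 4-connected components of equal cluster label (used here only to enumerate regions, not to merge them), that "smaller" means fewer pixels than a fraction `size_fraction` of the image, and that a region's gray weight is the mean over its pixels. The names `connected_regions`, `candidate_regions` and `size_fraction` are hypothetical, and the threshold value is not given in the text.

```python
from collections import deque


def connected_regions(labels):
    """Return the 4-connected regions of equal label in a 2-D list,
    each region as a list of (row, col) pixel positions."""
    h, w = len(labels), len(labels[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if seen[y][x]:
                continue
            seen[y][x] = True
            queue, pixels = deque([(y, x)]), []
            while queue:  # breadth-first flood fill from the seed pixel
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx]
                            and labels[ny][nx] == labels[y][x]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(pixels)
    return regions


def candidate_regions(labels, weights, size_fraction=0.05):
    """Keep the small regions whose mean gray weight is at least the
    image-wide mean gray weight (size_fraction is an assumed threshold)."""
    total = len(labels) * len(labels[0])
    image_mean = sum(sum(row) for row in weights) / total
    small = [r for r in connected_regions(labels)
             if len(r) < size_fraction * total]
    return [r for r in small
            if sum(weights[y][x] for y, x in r) / len(r) >= image_mean]


# Toy 4x5 image: a large background cluster (label 0) and two
# single-pixel regions, one bright in gray weight and one dark.
labels = [[0, 0, 0, 0, 0],
          [0, 1, 0, 0, 0],
          [0, 0, 0, 2, 0],
          [0, 0, 0, 0, 0]]
weights = [[0.9 if v == 1 else 0.05 if v == 2 else 0.1 for v in row]
           for row in labels]
kept = candidate_regions(labels, weights, size_fraction=0.1)
```

In this toy case the large background region is dropped by the size threshold and the dark single-pixel region is dropped by the gray weight filter, leaving only the bright small region as a candidate.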
IV. EXPERIMENTAL RESULTS
We have used mammograms from the Digital Database for
Screening Mammography (DDSM) of the Computer Vision /
Image Analysis Research Laboratory at the University of
South Florida
(http://marathon.csee.usf.edu/Mammography/DDSM) for
carrying out our experiments and considered the
malignancy category only. We have used only the gray
level information and the degree of grayness captured
through the gray weight function. The Coefficient of Variation
(CV) has been used as the statistical measure for identifying the
microcalcification masses. The CV is the ratio of the standard
deviation to the mean. The advantage of using the CV is that the
standard deviation of data should always be interpreted in the
context of the mean of the data. Since the CV is a
dimensionless number, it should be used instead of the standard
deviation when comparing data sets with different dimensions,
units or widely different means. We found the CV measure to be
more suitable, as the various regions present in a
mammogram have different sizes.
TABLE 1. COEFFICIENT OF VARIATION (CV) OF VARIOUS REGIONS OF MAMMOGRAM WITH CANCER

| Mammogram with cancer | Region 1 | Region 2 | Region 3 | Region 4 | Region 5 | Region 6 | Region 7 | Region 8 | Region 9 | Region 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| C-0001-1 | 0 | 0 | 0 | 0 | 0.09868 | 0.81863 | 0.03808 | 0.00224 | 0.020554 | 1.7683 |
| C-0002-1 | 0.00036 | 0.00597 | 0.0734 | 0.0017 | 0.00068 | 0.00052 | 0.00013 | 0 | 0.000111 | 0.009343 |
| C-0003-1 | 0 | 0 | 0 | 0.0002782 | 0.0158686 | 0 | 0.0002724 | 0.076804 | 0.0005980 | |
| C-0004-1 | 0.00503 | 0 | 0 | 0.0003 | 0 | 0.005735 | 0.014623 | 0.000552 | 0.002074 | 0.05588 |
| C-0006-1 | 0 | 0.03225 | 0.0003 | 0 | 0.00613 | 0 | 0.001753 | 0 | 0.00049 | 0.052279 |
| C-0007-1 | 0.001262 | 0 | 0.003033 | 0.0108 | 0.000244 | 0.000771 | 0.001624 | 0.001999 | 0.000514 | 0.1508 |
| C-0009-1 | 0.01377 | 0.000319 | 0 | 0 | 0.0026 | 0 | 0 | 0 | 0.001878 | 0.07629 |
| C-00010-1 | 0 | 0 | 0.0005 | 0 | 0 | 0.0075 | 0.01261 | 0.000227 | 0.000849 | 0.075549 |

Only nine values are recoverable for C-0003-1; they are listed in order and their column alignment is uncertain.
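As a worked illustration of the measure reported in Table 1, the CV of a region can be computed from its gray weight values as the standard deviation divided by the mean. The sketch below assumes the population standard deviation and records a region with zero mean as 0; both are assumptions, since the text does not state which estimator is used. The function name `coefficient_of_variation` is illustrative.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """CV = standard deviation / mean of a region's gray weights.

    Assumption: population standard deviation. A zero mean makes the
    ratio undefined (infinite), and the region is recorded as 0 here.
    """
    m = mean(values)
    if m == 0:
        return 0.0
    return pstdev(values) / m


# Small numeric check: mean 5, population standard deviation 2.
example_cv = coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9])

# Selecting the region with the highest CV, using the C-0001-1 row
# of Table 1 as input (regions are numbered from 1).
row = [0, 0, 0, 0, 0.09868, 0.81863, 0.03808, 0.00224, 0.020554, 1.7683]
best_region = max(range(len(row)), key=row.__getitem__) + 1
```

For the C-0001-1 row the highest CV falls in Region 10, which is the kind of region the text singles out as the candidate microcalcification area.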
In Table 1, we show the CV for the various regions present in
the cancer category of DDSM mammograms. Only the smaller
regions are considered for the calculation of the CV, and
the region with the highest value of CV is highlighted. From
our experimental results, it can be noticed that the regions
with a higher value of CV are identified as the microcalcification
mass area, compared to the rest of the regions of the
mammogram. This is due to the fact that the gray weight
values of the pixels of the microcalcification mass region are
expected to show smooth texture behavior, which may not
be distinguishable by the human eye. The weighted gray value
captures this smooth variation, which is measured in terms of the
CV. In Table 1, a zero entry indicates that
either the mean or the standard deviation of the respective
region is zero, so that the coefficient of variation is either
infinite or zero respectively.

V. CONCLUSION
Identification of microcalcification masses present in digital
mammograms is considered to be a difficult task. We have
used a soft-decision approach for identifying the
microcalcification mass present in digital mammograms.
The Coefficient of Variation (CV) is used as a measure for
identifying the region. The experimental results are
encouraging. As a future direction for this work, we will use
other statistical measures such as smoothness, entropy,
skewness and uniformity to construct a statistical feature
vector. We will also propose a Neural Network architecture
with a suitable training and learning algorithm to classify the
microcalcification regions.

Acknowledgement: The work done by Dr. A. Vadivel is supported by
research grants from the Indo-US Science and Technology Forum
(IUSSTF), Ref. No. IUSSTF/Fellowship/2007-08/5-2008, and the
Department of Science and Technology, India, under Grant
SR/FTP/ETA-46/07 dated 25th October, 2007.

References
[1] K. Bovis, S. Singh, J. Fieldsend, C. Pinder, “Identification of masses in digital mammograms with MLP and RBF nets”, in: Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks Com, 2000, pp. 342–347.
[2] H.D. Cheng, X.P. Cai, X.W. Chen, L.M. Hu, X.L. Lou, “Computer-aided detection and classification of microcalcifications in mammograms: a survey”, Pattern Recognition, Vol. 36, 2003, pp. 2967–2991.
[3] I. Christoyianni, E. Dermatas, G. Kokkinakis, “Fast detection of masses in computer-aided mammography”, IEEE Signal Process. Mag., Vol. 17 (1), 2000, pp. 54–64.
[4] A.S. Constantinidis, M.C. Fairhurst, F. Deravi, M. Hanson, C.P. Wells, C. Chapman-Jones, “Evaluating classification strategies for detection of circumscribed masses in digital mammograms”, in: Proceedings of 7th International Conference on Image Processing and its Applications, 1999, pp. 435–439.
[5] K. Doi, “Computer-aided diagnosis: potential usefulness in diagnostic radiology and telemedicine”, in: Proceedings of National Forum ’95, 1996, pp. 9–13.
[6] R.L. Egan, R.C. Mosteller, “Breast cancer mammography patterns”, Cancer, Vol. 40, 1977, pp. 2087–2090.
[7] T.J. Rissanen, H.P. Makarainen, M.A. Apaja-Sarkkinen, E.L. Lindholm, “Mammography and ultrasound in the diagnosis of contralateral breast cancer”, Acta Radiol., Vol. 36, 1995, pp. 358–366.
[8] G.F. Robbins, J.W. Berg, “Bilateral primary breast cancers: a prospective clinicopathological study”, Cancer, Vol. 17, 1964, pp. 1501–1527.
[9] G. Stockman and L. Shapiro, “Computer Vision”, Prentice Hall, 2001.
[10] H.H. Storm, O.M. Jensen, “Risk of contralateral breast cancer in Denmark 1943-80”, Br. J. Cancer, Vol. 54, 1986, pp. 483–492.
[11] A. Vadivel, Shamik Sural and A.K. Majumdar, “An Integrated Color and Intensity Co-Occurrence Matrix”, Pattern Recognition Letters, Elsevier Science, Vol. 28(8), pp. 974-983, 2007.
[12] A. Vadivel, Shamik Sural and A.K. Majumdar, “Robust Histogram Generation from the HSV Space based on Visual Colour Perception”, International Journal of Signal and Imaging Systems Engineering (in press).
[13] J.N. Wolfe, “Breast patterns as an index of risk for developing breast cancer”, Am. J. Roentgen. 126 (1976) 1130–1139.
[14] B. Zheng, Y.H. Chang, X.H. Wang, W.F. Good, D. Gur, “Application of a Bayesian belief network in a computer-assisted diagnosis scheme for mass detection”, SPIE Conference on Image Processing, Vol. 3661 (2), 1999, pp. 1553–156
B. Surendiran is a research scholar in the Department of Computer
Applications, National Institute of Technology, Tiruchirappalli, India. His
research interests include digital mammogram analysis, image processing and
computer networks.
Dr. A. Vadivel is currently with the Department of Computer Applications,
National Institute of Technology, Tiruchirappalli, India. He obtained both
his Master's and PhD degrees from the Indian Institute of Technology, Kharagpur, India.
His research interests include medical image processing, content based image
and video retrieval, search engine architecture design with multi-feature
support and SAR image analysis. He is an Indo-US Research Fellow for 2008. He
was also conferred the Young Scientist Award by the Department of Science
and Technology, Govt. of India, in 2007.
Dr. Henry Selvaraj is Chair and Professor of the Department of Electrical
and Computer Engineering, University of Nevada, Las Vegas, USA. He
earned his Ph.D. at Warsaw University of Technology, Warsaw, Poland.
Before joining UNLV in 1998, Dr. Selvaraj was a faculty member at Monash
University, Australia. His research interests include Logic Synthesis, Digital
Design, Programmable Devices, Artificial Intelligence, Multiple Valued
Functions, Digital Signal Processing, Bio-medical Image Processing,
Networks and Path Planning. He was conferred the UNLV Alumni Student
Centered Faculty Award for 2002. He is a member of the Curriculum Committee,
Desert Pines High School (AOIT), Las Vegas, a member of the UNLV Faculty
Senate (since 1999), and an alternate member of the Senate Tenure and Promotion
Committee. He has been general chair for many international conferences.