Computer Aided Automated Lung Nodules Detection
System in Computed Tomography
Tithi Vyas1, Prof. Divyang Patel2
Electronics & Communication Department,
Shakersinh Vaghela Bapu Institute of Technology
Vasan, Gandhinagar, Gujarat
tithi.vyas@ymail.com
Panth Shah3
Masters of Science, Computer Engineering
Illinois Institute of Technology
Chicago, IL-60616, USA
Pshah80@hawk.iit.edu
Abstract: Many people worldwide die of cancer, and lung cancer is
one such cancer. Death is often the result of late diagnosis, because
the malignant cells present are difficult to identify. Nodules or
lumps in the lung can be difficult to observe because ribs and
vessels also appear in the radiograph or tomography image. To
determine the presence of nodules, a Computer Aided Detection (CAD)
system is applied to Computed Tomography (CT) images.
Here, we introduce a methodology that makes it easier to decide
whether a nodule is present, with the aid of a rule-based scheme that
drives the decisions in the approach. The aim is to select effective
features and classify the localized abnormal structural change using
a rule-based system together with connected component labelling.
This makes it practical to remove unwanted objects, i.e. normal
structures, from the area of interest, such as objects that are too
small, too large, or not circular enough. One goal of the proposed
methodology is to reduce the false positive rate of nodule detection.
Eliminating the unwanted objects makes this possible, and a doctor
can then determine whether the observed nodule is malignant or
benign.
Index Terms: Computer-aided diagnosis, lung nodules, rule-based
scheme, Laplacian of Gaussian filter, connected component labeling
algorithm
I. INTRODUCTION
Cancer is one of the most critical groups of diseases and is
characterized by the growth of abnormal groups of cells. It is a
leading cause of death in the United States and worldwide. In the
U.S., around 500,000 deaths are caused by cancer alone. Of the
several cancers, such as breast cancer, blood cancer and lung
cancer, lung cancer is the most critical. Lung cancer alone resulted
in an estimated 150,000 deaths in the United States in 2001 [1],
which makes it a leading cause of cancer death. Among medical
conditions, cancer is a disease that puts the patient in a critical
condition if it is not detected at an early stage. According to one
analysis, around 40% of potentially detectable lung cancers are not
detected by radiologists, because conventional chest radiographs do
not show the lung nodules and their position in the lungs
accurately. Computed tomography is therefore preferable to chest
radiography for detecting lung nodules, owing to its higher
sensitivity. The detection rate of lung cancer using computed
tomography is 2.5 to 10 times higher than that of analog chest
radiography.
The major cause of cancer is an abnormal group of cells that
eventually deforms the shape of nodules and blood vessels; by
detecting this deformation, cancer can be detected. Distinguishing
between nodules and vessels requires comparison of several computed
tomography sections. The CT data has to be analyzed by a
radiologist, and analyzing a large number of CT results to find the
deformations is a heavy workload for a doctor. Due to this, there
are cases [...] data as well. Moreover, when multiple abnormal
structures are present in the vessels, it is hard to determine which
one is a cancerous tumor. This is why a computational system is
required. A Computer Aided Detection (CAD) system is not only quick
and efficient, but it also significantly improves the chances of
distinguishing lung nodules from other elements, using several
mathematical techniques associated with shapes and orientations.
Researchers have developed two main types of computational system,
with the aim of separating true from false lesion candidates and of
achieving 100% sensitivity to lung nodules. The first is CADe
(Computer-Aided Detection) and the second is CADx (Computer-Aided
Diagnosis) [2].
Researchers have worked on this problem for decades and have
developed advanced computer vision techniques to automate the
detection of false and true candidates. A fuzzy clustering algorithm
was developed to identify vessels and potential nodules. A
rule-based approach used the distance from the lung boundary,
together with circularity information about the shape of the nodule,
to distinguish nodules from vessels in the CT section [1]. To
distinguish nodules from vessels within the lung region, geometric
features of nodules are analyzed for the test images, and gray-level
thresholding is applied to enhance the lesion structure. An advanced
MTANN technique was developed for low-dose computed tomography
images [3]. MTANN is a multi-layered artificial neural network that
is trained on a large number of sub-regions extracted from the input
training image. Substantial research currently focuses on the
extraction of lung nodules using their characteristics and structure
ISBN: 978-93-5265-666-0
which can be categorized using imaging algorithms and feature
extraction techniques. In this paper, a CAD system prototype model
is designed using different image processing algorithms, with three
core objectives:
• The CAD system is developed to overcome the problems associated
  with the conventional methods for lung nodule detection.
• CAD also improves the performance of radiologists by providing
  high sensitivity in diagnosis, where the sensitivity depends on
  the true positive and false negative results of the system.
• The system should also detect different types of shapes and
  distinguish them through structural parameters such as area,
  perimeter and roundness, which are used for feature extraction.
II. LITERATURE SURVEY
A literature survey has been carried out on different methods for
detecting nodules in the lungs using a Computer Aided Detection
system. Different algorithms and approaches have been applied to CT
images in order to extract useful information accurately and to
detect and diagnose the presence of nodules. The advantages and
limitations of these techniques were studied and are summarized in
the following table:
Year | Author | Method | Disadvantage
-----+--------+--------+-------------
2011 | S. Arvind Kumar, Dr. J. Ramesh, Dr. P. T. Vanathi, Dr. K. Gunavathi | Mamdani Fuzzy system, two level wavelet decomposition | Sensitivity and specificity is comparatively less
2012 | S. Ashwin, J. Ramesh, S. Arvind, K. Gunavathi | Artificial Neural Network (ANN) | System becomes complex and validation error increases after iteration 4
2013 | Ryoichi Nagata, Tsuyoshi Kawaguchi, Hidetoshi Miyake | Template matching based Classifier | Lower performance due to low sensitivity, higher execution time (8.2 sec)
2013 | Mickias Assefa, Ibrahima Faye, Aamir Saeed Malik, Muhammad Shoaib | Template matching with multi resolution feature analysis | Accuracy is comparatively less
2014 | Jaspinder Kaur, Nidhi Garg, Daljeet Kaur | Back Propagation Neural Network | Result generated operating a few database
2014 | Shuji Tanaka, Yuriko Ikeda, Hyoungseop Kim, Rie Tachibana and others along with them | Temporal subtraction of images | High false positive and problem with increasing versatility of identification
2015 | Ritika Agarwal, Ankit Shankhadhar, Raj Kumar Sagar | Content Based Medical Image Retrieval with SVM classifiers | Efficiency is low
III. PROPOSED ALGORITHM FOR CAD SYSTEM
The flowchart in Figure 1 shows the functionality of the proposed
Computer Aided Detection system. Any one type of imaging modality
from the provided database is used; for this system, Computed
Tomography is used to develop the CAD system. The step-by-step
processing of the algorithm is explained here.
1. Original Medical Image (CT scan)
2. Segmentation of Lung from the Image
3. Hole filling in segmented Lung Images
4. Morphological operations to determine the shape of lung nodule
5. Enhancement of lesion candidate from segmented image
6. Detection and Segmentation of lesion candidates
7. Filtering and Edge Detection for removal of unwanted candidates
8. Labelling of detected lesion components
9. Determine the Pattern Features of characterizing objects
10. Classification of candidates using Rule-based scheme
11. Evaluate the performance of the proposed system
Figure 1. Functional Flowchart of Proposed Algorithm
The major challenges here are to detect the lung nodules and to
separate the nodule and non-nodule components among all the results.
Various combinations of computer vision algorithms are used to
develop the most effective system for cancer detection in medical
imaging. The simulation environment used for the system
implementation is MATLAB.
MATLAB provides a user-friendly interface and a dedicated Image
Processing Toolbox for operations such as segmentation, edge
detection, boundary tracing and labeling of connected components.
Because dedicated libraries are provided for these operations,
developing and implementing this CAD system in MATLAB is more
productive.
In this paper, multiple CT scan medical images of different patients
with positive cancer detection are used. Each CT scan image has a
different lung nodule position and orientation. In this algorithm,
the lung is first segmented from the original image. From the
segmented lung, lesion candidates are detected through various
morphological operations associated with the shape of the lesion.
After this stage, the lesion components are separated using
connected component labeling, so that the individual objects in the
image are identified. The pattern features of the separated
candidates are then determined from properties of each object such
as area, perimeter, roundness and centroid. Once the feature data
are obtained, the rule-based scheme is applied to separate the
nodule and non-nodule components. This process is important for
determining the performance of the system. The performance is
expressed as sensitivity, which depends on the true and false
detections of the candidates.
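The dependence of sensitivity on true and false detections can be made concrete with a small sketch. It assumes the usual definition, sensitivity = TP / (TP + FN), which the paper does not write out. The paper's implementation is in MATLAB; the Python function names below are illustrative and are not the authors' code.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of real nodules that were detected: TP / (TP + FN)."""
    actual_nodules = true_positives + false_negatives
    if actual_nodules == 0:
        raise ValueError("sensitivity is undefined when there are no real nodules")
    return true_positives / actual_nodules


def false_positives_per_scan(false_positives: int, scans: int) -> float:
    """Average number of false detections per CT scan."""
    if scans <= 0:
        raise ValueError("scans must be positive")
    return false_positives / scans
```

With hypothetical counts, `sensitivity(9, 1)` describes a system that found 9 of 10 real nodules. Reducing false positives, the stated goal of the rule-based scheme, lowers the second quantity without changing the first.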
IV. DATABASE AND METHODS
1. Data Sets:
A huge number of people worldwide suffer from lung cancer. For that
reason, many organizations have made acquired radiographs or
tomography reports available to researchers, so that techniques and
methods can be developed to detect and diagnose the presence of
nodules at the earliest stages. Chest radiographs and CT scans are
available for study on several public databases, from which the lung
CT scans of patients can be selected according to the features one
wishes to work on. These databases include the Lung Image Database
Consortium (LIDC), the Early Lung Cancer Action Project (ELCAP) and
the Japanese Society of Radiological Technology (JSRT). The nodules
found in the screening program had been observed and confirmed as
either malignant or benign by biopsy or surgery. Follow-up
examinations showing no growth for 2 or more years were also
included.
Here, in this project
However, Computed Tomography is a more effective data source for a
CAD system because it is generated digitally, whereas chest
radiography is in analog form. Moreover, the sensitivity of chest
radiography is lower than that of Computed Tomography, so Computed
Tomography is preferred in CAD for detecting the true lesion
candidates. In terms of spatial and contrast resolution, Computed
Tomography is the best imaging modality for lung nodule detection.
Examples from the CT and chest radiography databases are given
below:
Figure 2. (a) Computed Tomography (b) Chest Radiography
2. Methods:
A. Segmentation of Lung from CT Scan:
The first step in this algorithm is to segment the lungs from the CT
scan image. This can be done using various techniques. [...]
technique [11]. This thresholding technique selects the optimum
threshold for extracting the lungs from the medical image. It
analyzes the histogram of the image and, from this analysis, obtains
an optimum global threshold. Pixels above the threshold value are
set to 1; all other pixels are set to 0.
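A histogram-based global threshold of this kind can be sketched as follows. The sketch assumes the technique is Otsu's between-class-variance criterion, which the citation details of [11] suggest but the surviving text does not name. It is plain Python for illustration only and is not the MATLAB routine used by the authors.

```python
def otsu_threshold(image, levels=256):
    """Return the gray level t that maximizes the between-class variance
    when pixels are split into {<= t} and {> t}."""
    hist = [0] * levels
    for row in image:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0     # pixel count of the dark class
    sum0 = 0   # sum of gray levels in the dark class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, nothing to compare
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Unnormalized between-class variance; same argmax as the normalized form.
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(image, t):
    """Pixels above the threshold become 1, all others 0."""
    return [[1 if p > t else 0 for p in row] for row in image]
```

The image is assumed to be a list of rows of integer gray levels in `0..levels-1`. On a clearly bimodal image, the returned threshold falls between the two intensity groups.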
Because this algorithm is implemented in MATLAB, an inbuilt function
is available with which the global image threshold is [...]
threshold value, which can be used to segment objects of a certain
pixel intensity. A second function is used together with it to
obtain the binary image with pixel values 1 or 0. Holes in the
segmented image are filled using a morphological filter.
For this, the [...] function fills the holes in the binary image,
that is, the left-over background pixels that cannot be filtered out
using any morphological structuring element. A region-growing filter
with a structuring element such as a line is also used to segment
the lung from the image. The output of the segmented lung for the CT
scan image from the database is shown in Figure 3.
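Hole filling in a binary mask can be illustrated with a border flood fill: every background pixel that cannot be reached from the image border is treated as a hole and set to 1. This is a minimal Python sketch of the general idea, assuming 4-connected background; it is not the MATLAB function the authors call, whose name is missing from the text.

```python
from collections import deque


def fill_holes(mask):
    """Set to 1 every background pixel that is not connected to the image border."""
    rows, cols = len(mask), len(mask[0])
    outside = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed the flood fill with every background pixel on the border.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and mask[r][c] == 0:
                outside[r][c] = True
                queue.append((r, c))
    # Grow the outside region through 4-connected background pixels.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and mask[nr][nc] == 0 and not outside[nr][nc]):
                outside[nr][nc] = True
                queue.append((nr, nc))
    # Everything that is not outside background is foreground or a filled hole.
    return [[0 if outside[r][c] else 1 for c in range(cols)] for r in range(rows)]
```

Applied to a ring-shaped region, the enclosed centre becomes foreground while the surrounding background is left untouched.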
B. Detection and Segmentation of lesion candidates
In this step, the objective of the algorithm is to achieve a high
detection rate, because the lesion candidates have to be detected
and segmented from the segmented lung. After the segmentation, the
obtained image is not very high in resolution, so the lesion
candidates need to be enhanced before they can be identified using
thresholding. A filter is used to enhance the lesion candidates.
Figure 3. (a) Original Lung Image (b) Binary mask generated
(c) Segmented Lung from original image
Of the various filters available, the Laplacian of Gaussian (LOG)
filter is used for this purpose. The LOG filter first removes noise
from the image and then enhances the edges using the Laplacian
filter. This approach is useful for detecting lesions of different
sizes. In MATLAB, a combination of inbuilt functions can be used to
implement the LOG filter for this operation.
The performance of this filter varies with the [...] controls the
size of the objects; in this case, they are the lesion candidates.
By changing this parameter, the lesion enhancement can be varied.
After the enhanced image is obtained with the LOG filter,
thresholding is used to remove everything from the image except the
lesion candidates.
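The enhancement step can be sketched with an explicit Laplacian of Gaussian kernel. The sketch assumes the tunable parameter is the Gaussian standard deviation, called `sigma` here; that name, the kernel radius of three sigma and the zero padding are illustrative choices, not details taken from the paper. It is plain Python rather than the MATLAB function combination the authors mention.

```python
import math


def log_kernel(sigma, radius=None):
    """Zero-sum Laplacian of Gaussian kernel with (2*radius + 1)^2 entries."""
    if radius is None:
        radius = math.ceil(3 * sigma)
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            r2 = x * x + y * y
            row.append((r2 - 2 * sigma ** 2) / sigma ** 4
                       * math.exp(-r2 / (2 * sigma ** 2)))
        kernel.append(row)
    # Subtract the mean so that flat image regions give zero response.
    n = (2 * radius + 1) ** 2
    mean = sum(map(sum, kernel)) / n
    return [[v - mean for v in row] for row in kernel]


def filter2d(image, kernel):
    """Correlate image with kernel, treating pixels outside the image as 0."""
    rows, cols = len(image), len(image[0])
    radius = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    rr, cc = r + dy, c + dx
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += image[rr][cc] * kernel[dy + radius][dx + radius]
            out[r][c] = acc
    return out


def enhance_blobs(image, sigma):
    """Negated LoG response, so that bright blobs show up as positive peaks."""
    response = filter2d(image, log_kernel(sigma))
    return [[-v for v in row] for row in response]
```

The raw kernel is negative at its centre, so the response is negated to turn bright, roughly round structures into peaks that a subsequent threshold can pick out. A larger `sigma` favours larger structures.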
Thresholding converts the gray image into a binary image. The lesion
candidates can be determined after this process. By using an edge
detection algorithm and filling the unwanted holes in the resulting
binary image, more false lesion candidates can be eliminated.
Here, skeletonization of boundaries is done using [...].
Skeletonization removes pixels from the boundaries. The same
function is used to perform the morphological closing operation, to
fill isolated central pixels and to remove interior pixels in order
to trace the boundary of each object in the image. The output of
this step for the CT scan image from the database is shown in
Figure 4:
Figure 4. (a) Lesion Enhancement by LOG (b) Detection of
Lesion candidates (c) Filtering of Lesion candidates using Edge
Detection (d) Filtering using Morphological filter (e) Detected
Lesion candidates
C. Connectivity Analysis and Feature Extraction from
Segmented Lesion Candidates
This is the most important step of the algorithm: the detected
lesion candidates are labeled using connectivity analysis. The
connectivity of the pixels in an object gives information about the
boundary of the object. Here, the lung nodule must be found among
the several lesions detected using this connectivity analysis. With
the labeling method, each lesion candidate can be identified
individually, and its pattern features can then be determined.
Connected-component labeling is the technique used here to identify
the objects. The 4-connectivity and 8-connectivity algorithms are
widely implemented for this purpose. The [...] function labels the
components in an image using connectivity analysis. If
4-connectivity is used, then the algorithm checks the central pixel
and its 4 neighboring pixels and labels them as the same object. If
8-connectivity is used, then the algorithm checks the central pixel
and the 8 pixels connected to it and labels them as one object.
After implementing this technique in MATLAB, it is observed that
8-connectivity analysis gives a better result than 4-connectivity
analysis, because it yields fewer lesions. The result is shown
below.
In the given image, it is clear that the 8-connectivity connected
component algorithm takes much less time to execute than the
4-connectivity algorithm. Moreover, the 4-connectivity algorithm
gives more lesion candidates than the 8-connectivity algorithm.
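The difference between the two neighbourhoods can be seen in a minimal labelling sketch. It uses a breadth-first flood fill in plain Python and is only an illustration of connected-component labelling; it is not the MATLAB function used in the paper, and it says nothing about the execution times reported there.

```python
from collections import deque

NEIGHBORS_4 = ((1, 0), (-1, 0), (0, 1), (0, -1))
NEIGHBORS_8 = NEIGHBORS_4 + ((1, 1), (1, -1), (-1, 1), (-1, -1))


def label_components(mask, connectivity=8):
    """Label foreground pixels; return (labels, number_of_components)."""
    if connectivity not in (4, 8):
        raise ValueError("connectivity must be 4 or 8")
    offsets = NEIGHBORS_4 if connectivity == 4 else NEIGHBORS_8
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 0 or labels[r][c] != 0:
                continue  # background, or already part of a labelled object
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and labels[ny][nx] == 0):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count
```

Pixels that touch only at a corner are merged under 8-connectivity but kept apart under 4-connectivity. For this reason 8-connectivity can never produce more components than 4-connectivity on the same mask, which is consistent with the smaller candidate counts reported for it.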
Here, 17 lesion candidates are detected using the connected
component labeling technique. Of these 17 candidates, 1 must be the
true nodule and the others are false observations. To find which one
is the lung nodule, the pattern features of these lesion candidates
need to be determined. In this project, two pattern features, area
and perimeter, are found using mathematical iterations. The area and
perimeter of all 17 lesion candidates are stored in one array and
later analyzed for true nodule detection. The centroid of every
lesion candidate is also obtained and is used to determine the true
and false detections. The result has been analyzed by taking several
cases into consideration.
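Area, perimeter and centroid can be computed from a labelled image in one pass, as in the sketch below. Two assumptions are made for illustration: area is the pixel count, and perimeter is the number of pixel sides that face a different label or the image edge. Image-processing toolboxes may define perimeter differently, so absolute values need not match the authors' MATLAB results.

```python
def region_features(labels):
    """Return {label: {"area", "perimeter", "centroid"}} for a labelled image."""
    rows, cols = len(labels), len(labels[0])
    stats = {}
    for r in range(rows):
        for c in range(cols):
            lab = labels[r][c]
            if lab == 0:
                continue  # background
            s = stats.setdefault(
                lab, {"area": 0, "perimeter": 0, "sum_r": 0, "sum_c": 0})
            s["area"] += 1
            s["sum_r"] += r
            s["sum_c"] += c
            # Count each pixel side that faces another label or the image edge.
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if not (0 <= nr < rows and 0 <= nc < cols) or labels[nr][nc] != lab:
                    s["perimeter"] += 1
    return {
        lab: {
            "area": s["area"],
            "perimeter": s["perimeter"],
            "centroid": (s["sum_r"] / s["area"], s["sum_c"] / s["area"]),
        }
        for lab, s in stats.items()
    }
```

The input is a list of rows in which 0 is background and each positive integer identifies one candidate. The centroid is returned as (row, column).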
D. Feature Selection and Classification of Lesion Candidates
by using Rule-Based Scheme
This is the last step of the algorithm. Two rule-based
pattern-feature schemes are applied to all lesion candidates
obtained with 8-connectivity analysis in order to remove false
positives: the first over the entire lung region and the second over
the inside and outside regions. Area and perimeter are plotted on a
scatter graph to analyze the range of areas and perimeters of the
lesion candidates against the base threshold limits of area and
perimeter. The objective of the rule-based scheme is to separate the
nodule and non-nodule candidates. Objects that are too small or too
large, and objects that are not circular, are likely to be discarded
because they are very unlikely to be nodules. From this plot it can
be observed that the lung nodule very likely lies in the range 15 to
50 for area and in the range 15 to 30 for perimeter. Using these
criteria, the rule-based scheme gives the number of expected true
nodules with reference to both area and perimeter.
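The range rules above translate directly into a small classifier. The area and perimeter ranges are the ones stated in the text. The circularity measure 4*pi*A/P^2 and its optional cutoff `min_circularity` are assumptions added for illustration, because the paper mentions circularity but gives no formula or threshold. The dictionary keys are likewise hypothetical.

```python
import math


def circularity(area, perimeter):
    """4*pi*A / P^2: equals 1 for an ideal circle, smaller for elongated shapes."""
    return 4 * math.pi * area / perimeter ** 2


def classify_candidates(candidates, area_range=(15, 50),
                        perimeter_range=(15, 30), min_circularity=None):
    """Split candidates into (nodules, non_nodules) using simple range rules."""
    nodules, non_nodules = [], []
    for cand in candidates:
        ok = (area_range[0] <= cand["area"] <= area_range[1]
              and perimeter_range[0] <= cand["perimeter"] <= perimeter_range[1])
        # Optional shape rule; skipped unless a cutoff is supplied.
        if ok and min_circularity is not None:
            ok = circularity(cand["area"], cand["perimeter"]) >= min_circularity
        (nodules if ok else non_nodules).append(cand)
    return nodules, non_nodules
```

Each candidate is a dictionary with at least `"area"` and `"perimeter"`. Candidates that are too small, too large or, when a cutoff is given, not round enough end up in the non-nodule list.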
V. RESULT AND DISCUSSION
For the experimental detection of lesion candidates, 6 CT scan
images were taken from the database. The result for one of the 6 CT
scans is shown in Figure 5. The result is obtained from the MATLAB
software.
Figure 5. Final Algorithm Result for Lesion Detection
After analyzing the performance of the CAD prototype algorithm
developed and proposed in this paper, the following table gives the
statistical analysis of the lesion candidates obtained using the
4-connectivity and the 8-connectivity connected component
algorithms.
CT scan | Lesion candidates obtained using 4 Connectivity CCA | Lesion candidates obtained using 8 Connectivity CCA
300 17
312 16
403 14
166 16
262 16
209 18
The proposed algorithm for lung nodule detection is successfully
able to detect lung cancer. However, some points need to be
discussed regarding the problems associated with each stage of the
algorithm.
In [...] thresholding algorithm. However, in some complex cases the
technique may mis-segment the organ, so that some lesion candidates
are also mis-segmented. If this happens, the true lung nodule can
never be detected. As a result, the number of false candidates
increases relative to the true candidates, which lowers the
sensitivity of cancer detection for that CT scan image. Moreover,
mis-segmentation produces deformed lesions in the segmented image
that are not circular and are therefore filtered out by the
rule-based scheme of the algorithm.
In stage 2, the lesion candidates are classified by lesion
enhancement and thresholding. If misclassification occurs among the
lesion candidates, the true lesion candidates are likely to be
missed. Moreover, the pattern features are then obtained with error,
which gives a wrong plot for the rule-based scheme. Because of this,
the performance degrades and the lung nodule may not be found.
In stage 3, the connectivity of the pixels is examined to label the
objects in the image. Here, lesion candidates are labeled using the
8-connectivity connected component labeling technique. The features
of these candidates are obtained for classification. If there is any
problem in feature extraction, it gives false results for the
candidates, which then cannot be classified accurately. Moreover,
using mathematical iteration to find the object features may
introduce some error in the measurements, so it is preferable to use
the image processing libraries to determine features such as area,
perimeter, roundness and centroid.
In stage 4, the features of the objects are selected to separate the
true and false candidates in the segmented image. But, for some
[...] accurate information. So other features such as solidity,
density and roundness are also worth considering to obtain more
accurate results. Secondly, the advantage of the rule-based scheme
is that it gives a highly accurate demonstration of the
classification of lesion candidates. Moreover, it depends on object
features that can be obtained using mathematical iteration or image
processing libraries. But the major disadvantage of this system is
that it can only classify the candidates, but it is possible that in
some [...]
VI. CONCLUSION
In the proposed algorithm, lesion candidates are successfully
determined. [...] thresholding is used, which is able to segment the
lungs from the medical image with precision. Using hole filling, the
unwanted data are removed from the segmented image. For enhancement
of the lesions, the Laplacian of Gaussian filter is used, which
enhances the edges of the segmented image and enables effective edge
detection. The morphological operation eliminates more unwanted
information. Finally, the connected component labelling technique,
using the 8-connectivity method for lesion candidate detection, has
successfully yielded the detection results with very little
execution time. We have found that 8-connectivity takes less time
and generates an effective result for lesion identification, as it
produces far fewer lesions than 4-connectivity. This labelling can
further help in feature extraction and classification of lung
nodules. Using several image processing algorithms makes the
algorithm more complex, but it is nevertheless robust and fast. The
performance of this system can easily be tuned by varying several
parameters.
VII. REFERENCES
[1] S. G. Armato III, M. L. Giger, and H. MacMahon, "Automated
detection of lung nodules in CT scans: preliminary results,"
Medical Physics, vol. 28, pp. 1552-1561, Aug. 2001.
[2] [...] -Aided Diagnosis for Evaluating [...] Korean Journal of
Radiology, vol. 12, no. 2, Mar.-Apr. 2011.
[3] Kenji Suzuk[...] Specificity in Computer-Aided Diagnosis Using a
Massive [...] Conference on Machine Learning and Applications, 2008.
[4] S. Arvind Kumar, Dr. J. Ramesh, Dr. P. T. Vanathi, Dr. K. [...]
2011.
[5] Ritika Agarwal, Ankit Shankhadhar, Raj Kumar Sagar, [...] IEEE
5th International Conference on Advanced Computing & Communication
Technologies, 2015.
[6] S. Ashwin, J. Ramesh, S. Arvind Kumar, K. Gunavathi, [...]
International Conference on Emerging Trends in Electrical
Engineering and Energy Management, IEEE, 2012.
[7] [...] System for Early Detection of Lung Tumor using Back [...]
Conference on Medical Imaging, IEEE, 2014.
[8] [...] Computer-Aided Diagnosis System for Lung Nodule Detection
in Chest Radiographs using a Two-stage Classification method based
on Radial Gradient and Tem[...] International Conference on
Biomedical Engineering and Informatics (BMEI), 2013.
[9] Mickias Assefa, Ibrahima Faye, Aamir Saeed Malik, [...]
International Conference on Complex Medical Engineering, China,
2013.
[10] Shuji Tanaka, Yuriko Ikeda, Hyoungseop Kim, Joo Kooi Tan, Seiji
Ishiwaka, Seiichi Murakami, Takatoshi Aoki, Rie [...] Identification
of Lung Candidate Nodules on Chest CT Images [...] IEEE
Transactions, 2014.
[11] [...] -level [...] Cybernetics, vol. 9, no. 1, pp. 62-66, 1979.