BRAIN IMAGE CLASSIFICATION USING MACHINE
LEARNING
A PROJECT REPORT
Submitted by
R.VIGNESH KUMAR - 20110106070
P.YUVARAJ - 20110106074
D.ARUN KUMAR - 20110106301
In partial fulfillment for the award of the degree
Of
BACHELOR OF ENGINEERING
In
ELECTRONICS AND COMMUNICATION ENGINEERING
ARIGNAR ANNA INSTITUTE OF SCIENCE AND TECHNOLOGY
SRIPERUMBUDUR-602 117
ANNA UNIVERSITY: CHENNAI 600 025
APRIL 2014
ANNA UNIVERSITY: CHENNAI 600 025
BONAFIDE CERTIFICATE
Certified that this project report “BRAIN IMAGE CLASSIFICATION
USING MACHINE LEARNING” is the bonafide work of “R.VIGNESH
KUMAR (20110106070), P.YUVARAJ (20110106074), D.ARUN KUMAR
(20110106301)” who carried out the project work under my supervision.
SIGNATURE SIGNATURE
Mr. S.KARTHICK BABU M.E., MBA Ms. P. JAYASAKTHI M.E
HEAD OF THE DEPARTMENT ASSISTANT PROFESSOR
DEPARTMENT OF ECE DEPARTMENT OF ECE
Arignar Anna Institute of Science Arignar Anna Institute of Science
&Technology, Chennai- 602117. &Technology, Chennai -602117.
Submitted to Project and Viva Examination held on
…………………………………..
INTERNAL EXAMINER EXTERNAL EXAMINER
ACKNOWLEDGEMENT
We would like to express our deep sense of heartfelt thanks to our beloved
chairman Dr. VIRGAI.G.JAYARAMAN B.A and vice chairman Mr.
J.KUMARAN M.E., MBA for giving us the opportunity to take up and complete
this project.
We thank our Principal Dr. BHAGWAN SHREERAM M.E., Ph.D(Engg)., MBA.,
Ph.D(MBA) and our Vice Principal Mr. M.SUBI STALIN M.E.,(Ph.D) for
creating a beautiful atmosphere, which inspired us to take up this project.
We take this opportunity to convey our heartfelt thanks to our Head of the
Department Mr. S.KARTHICK BABU M.E., MBA and our project coordinator
Ms. P.MOHANAPRIYA M.E, Assistant Professor, for their valuable support,
unflagging attention and direction, which kept this project on track.
We are extremely thankful to our project supervisors Mr. M.SIVAKUMAR M.E.
and Mr. R.DHINESHRAM M.E., Assistant Professors of Electronics and
Communication Engineering, for their much-needed guidance and especially for
entrusting us with this project.
We would like to express our heartfelt thanks to the lab assistants and to all
other faculty and non-teaching members of our department for their support, and
to our peers for having stood by us and helped us complete this project.
ABSTRACT
The Automatic Support Intelligent System detects brain tumors through a
combination of a neural network and a fuzzy logic system. It helps in the
diagnosis and aids in the treatment of brain tumors. Detecting a brain tumor is a
challenging problem due to the structure of the tumor cells in the brain. This
project presents an analytical method that enhances the detection of brain tumor
cells in their early stages and analyzes anatomical structures by training and
classifying the samples in a neural network system and segmenting the tumor
cells in the samples using a fuzzy clustering algorithm. The artificial neural
network is used to train on and classify the stage of a brain tumor as benign,
malignant or normal.
The fast discrete curvelet transform is used to analyze the texture of an
image. In brain structure analysis, the white matter (WM) and gray matter (GM)
tissues are extracted. A Probabilistic Neural Network with a radial basis function
(PNN-RBF) is employed to implement automated brain tumor classification.
Decision making is performed in two stages: feature extraction using the
gray-level co-occurrence matrix (GLCM) and classification using the PNN-RBF
network. Segmentation is performed by the fuzzy logic system, and its result
serves as a basis for early detection of brain tumors, which improves the
patient's chances of survival. The performance of this automated intelligent
system is evaluated in terms of training performance and classification accuracy.
The simulation results show that the classifier and segmentation algorithm
provide better accuracy than previous methodologies.
CHAPTER 1
INTRODUCTION
Automated classification and detection of tumors in different medical images
is motivated by the necessity of high accuracy when dealing with a human life.
Also, computer assistance is in demand in medical institutions because it could
improve on the results of humans in a domain where false negative cases must
occur at a very low rate. It has been proven that double reading of medical
images can lead to better tumor detection, but the cost of double reading is very
high, which is why good software to assist humans in medical institutions is of
great interest nowadays. Conventional methods of monitoring and diagnosing
diseases rely on detecting the presence of particular features by a human
observer. Due to the large number of patients in intensive care units and the
need for continuous observation of such conditions, several techniques for
automated diagnostic systems have been developed in recent years to attempt to
solve this problem. Such techniques work by transforming the mostly qualitative
diagnostic criteria into a more objective, quantitative feature classification
problem. In this project, the automated classification of brain magnetic
resonance images using some prior knowledge, such as pixel intensity and some
anatomical features, is proposed. Currently there are no widely accepted
methods; therefore, automatic and reliable methods for tumor detection are of
great need and interest. The application of PNNs to the classification of MR
image data is not yet fully utilized. This includes clustering and classification
techniques, especially for MR image problems with huge amounts of data that
consume time and energy if processed manually. Thus, fully understanding
recognition, classification and clustering techniques is essential to the
development of neural network systems, particularly for problems in medicine.
Segmentation of brain tissues in gray matter, white matter and tumor on
medical images is not only of high interest in serial treatment monitoring of
“disease burden” in oncologic imaging, but also gaining popularity with the
advance of image-guided surgical approaches. Outlining the brain tumor contour
is a major step in planning spatially localized radiotherapy (e.g., CyberKnife,
IMRT), which is usually done manually on contrast-enhanced T1-weighted
magnetic resonance images (MRI) in current clinical practice. On T1 MR images
acquired after administration of a contrast agent (gadolinium), blood vessels and
the parts of the tumor where the contrast can pass the blood–brain barrier are
observed as hyperintense areas. There are various attempts at brain tumor
segmentation in the literature which use a single modality, combine multiple
modalities, or use priors obtained from population atlases.
1.1 IMAGE SEGMENTATION
1.1.1 OVERVIEW
Segmentation problems are the bottleneck to achieve object extraction,
object specific measurements, and fast object rendering from multi-dimensional
image data. Simple segmentation techniques are based on local pixel-neighborhood
classification. Such methods, however, fail to capture global objects rather than
local appearances, and often require intensive operator assistance. The reason is
that the “logic” of an object does not necessarily follow that of its local image
representation. Local properties such as texture, edginess and ridgeness do
not always represent connected features of a given object.
1.2 REGION GROWING APPROACH
The region growing technique segments the image pixels that belong to an
object into regions. Segmentation is performed based on some predefined criteria.
Two pixels can be grouped together if they have the same intensity characteristics
or if they are close to each other. It is assumed that pixels that are close to each
other and have similar intensity values are likely to belong to the same object. The
simplest form of the segmentation can be achieved through thresholding and
component labeling. Another method is to find region boundaries using edge
detection. Segmentation process, then, uses region boundary information to extract
the regions. The main disadvantage of region growing approach is that it often
requires a seed point as the starting point of the segmentation process. This
requires user interaction. Due to the variations in image intensities and noise,
region growing can result in holes and over segmentation. Thus, it sometimes
requires post-processing of the segmentation result.
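The seeded growth just described can be sketched as follows. This is only a minimal illustration, assuming a single seed point, 4-connectivity and a fixed intensity tolerance; all three are illustrative choices, not part of the method as stated above.

```python
import numpy as np
from collections import deque

def region_grow(image, seed, tol=10):
    """Grow a region from `seed` (row, col): add 4-connected neighbors whose
    intensity is within `tol` of the seed intensity. Returns a boolean mask."""
    h, w = image.shape
    seed_val = float(image[seed])
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < h and 0 <= nc < w and not mask[nr, nc]
                    and abs(float(image[nr, nc]) - seed_val) <= tol):
                mask[nr, nc] = True
                queue.append((nr, nc))
    return mask
```

As noted above, the result depends on the seed and may need post-processing to remove holes.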
1.3 CLUSTERING
Clustering can be considered the most important unsupervised learning
problem; it deals with finding a structure in a collection of unlabeled data. A
cluster is therefore a collection of objects which are “similar” to each other and
“dissimilar” to the objects belonging to other clusters.
Clustering algorithms may be classified as listed below:
(1) Exclusive Clustering
(2) Overlapping Clustering
(3) Hierarchical Clustering
(4) Probabilistic Clustering
In the first case, data are grouped in an exclusive way, so that if a certain datum
belongs to a definite cluster it cannot be included in another cluster. In contrast,
the second type, overlapping clustering, uses fuzzy sets to cluster data, so that
each point may belong to two or more clusters with different degrees of
membership. In this case, each data point is associated with an appropriate
membership value. A hierarchical clustering algorithm is based on the union of
the two nearest clusters. The initial condition is realized by setting every datum
as a cluster; after a few iterations the algorithm reaches the final clusters wanted.
1.4 K-MEANS SEGMENTATION
K-means is one of the simplest unsupervised learning algorithms that solve
the well-known clustering problem. The procedure follows a simple and easy way
to classify a given data set through a certain number of clusters (assume k clusters)
fixed a priori. The main idea is to define k centroids, one for each cluster. These
centroids should be placed carefully, because different locations cause different
results; the better choice is to place them as far away from each other as possible.
The next step is to take each point belonging to the data set and associate it with
the nearest centroid. When no point is pending, the first step is completed and an
early grouping is done. At this point we need to re-calculate k new centroids as
the centers of the clusters resulting from the previous step. After we have these k
new centroids, a new binding has to be done between the same data set points and
the nearest new centroid. A loop has been generated, and as a result of this loop
we may notice that the k centroids change their location step by step until no
more changes are made; in other words, the centroids no longer move. Finally,
this algorithm aims at minimizing an objective function, in this case a squared
error function.
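The assign/re-center loop described above can be sketched as follows; the random initialization and the stopping test are illustrative choices.

```python
import numpy as np

def kmeans(points, k, iters=100, rng=None):
    """Plain k-means on an (n, d) array: assign each point to its nearest
    centroid, recompute centroids as cluster means, loop until they stop
    moving (the squared-error objective decreases at every step)."""
    rng = np.random.default_rng(rng)
    # initialize the k centroids at randomly chosen data points
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # distance of every point to every centroid -> nearest-centroid labels
        d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # re-center each cluster (keep old centroid if a cluster is empty)
        new = np.array([points[labels == j].mean(axis=0)
                        if np.any(labels == j) else centroids[j]
                        for j in range(k)])
        if np.allclose(new, centroids):   # centroids no longer move
            break
        centroids = new
    return labels, centroids
```

The sensitivity to initialization mentioned above is visible here: a different `rng` seed can give a different final grouping.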
1.5 HIERARCHICAL SEGMENTATION
A hierarchical set of image segmentations is a set of several image
segmentations of the same image at different levels of detail in which the
segmentations at coarser levels of detail can be produced from simple merges of
regions at finer levels of detail. A unique feature of hierarchical segmentation is
that the segment or region boundaries are maintained at the full image spatial
resolution for all segmentations. In a hierarchical segmentation, an object of
interest may be represented by multiple image segments in finer levels of detail in
the segmentation hierarchy, and may be merged into a surrounding region at
coarser levels of detail in the segmentation hierarchy. If the segmentation hierarchy
has sufficient resolution, the object of interest will be represented as a single region
segment at some intermediate level of segmentation detail.
A goal of the subject analysis of the segmentation hierarchy is to identify the
hierarchical level at which the object of interest is represented by a single region
segment. The object may then be identified through its spectral and spatial
characteristics. Additional clues for object identification may be obtained from the
behavior of the image segmentations at the hierarchical segmentation level above
and below the level at which the object of interest is represented by a single region.
1.6 THRESHOLDING
The simplest method of image segmentation is called the thresholding
method. This method is based on a clip-level (or a threshold value) to turn a gray-
scale image into a binary image. The key of this method is to select the threshold
value (or values when multiple-levels are selected). Several popular methods are
used in industry including the maximum entropy method, Otsu's method
(maximum variance), and k-means clustering. Recently, methods have been
developed for thresholding computed tomography (CT) images. The key idea is
that, unlike Otsu's method, the thresholds are derived from the radiographs instead
of the (reconstructed) image.
1.7 DESIGN STEPS
(1) Set the initial threshold T = (maximum image brightness + minimum image
brightness) / 2.
(2) Using T, segment the image into two sets of pixels: B (all pixels with values
less than T) and N (all pixels with values greater than T).
(3) Calculate the mean values of B and N separately, ub and un.
(4) Calculate the new threshold: T = (ub + un) / 2.
(5) Repeat steps (2) through (4) until the iterative conditions are met, and
extract the necessary region from the brain image.
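Steps (1) through (5) can be sketched as follows; the tolerance `eps` is an assumed form of the stopping condition, since the steps above do not specify one.

```python
import numpy as np

def iterative_threshold(image, eps=0.5):
    """Iterative mid-point thresholding: start from the mid-range threshold
    and repeat until T changes by less than `eps`."""
    T = (float(image.max()) + float(image.min())) / 2.0   # step (1)
    while True:
        below = image[image < T]                          # set B
        above = image[image > T]                          # set N
        ub = below.mean() if below.size else T            # step (3)
        un = above.mean() if above.size else T
        T_new = (ub + un) / 2.0                           # step (4)
        if abs(T_new - T) < eps:                          # step (5)
            return T_new
        T = T_new
```

The returned T can then be used to binarize the brain image and extract the necessary region.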
1.8 SYSTEM ANALYSIS
1.8.1 EXISTING TECHNIQUES
(1) PCA and wavelet-based texture characterization
(2) K-means clustering and thresholding method
(3) Manual analysis: time consuming, inaccurate, and requiring an intensively
trained person to avoid diagnostic errors.
1.8.2 PRINCIPAL COMPONENT ANALYSIS
PCA is a mathematical procedure that uses an orthogonal transformation to
convert a set of observations of possibly correlated variables into a set of values of
linearly uncorrelated variables called principal components. The number of
principal components is less than or equal to the number of original variables. This
transformation is defined in such a way that the first principal component has the
largest possible variance (that is, accounts for as much of the variability in the data
as possible), and each succeeding component in turn has the highest variance
possible under the constraint that it be orthogonal to (i.e., uncorrelated with) the
preceding components. Principal components are guaranteed to be independent
only if the data set is jointly normally distributed. PCA is sensitive to the relative
scaling of the original variables. Depending on the field of application, it is also
named the discrete Karhunen–Loève transform (KLT), the Hotelling transform or
proper orthogonal decomposition (POD).
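A minimal sketch of the transformation just described, using the eigen-decomposition of the covariance matrix (one of several equivalent ways to compute the principal components):

```python
import numpy as np

def pca(X, n_components):
    """Project (n, d) data onto its first `n_components` principal axes.
    The first component captures the largest possible variance, and each
    succeeding component the largest variance orthogonal to the others."""
    Xc = X - X.mean(axis=0)                        # center each variable
    cov = np.cov(Xc, rowvar=False)                 # covariance matrix
    vals, vecs = np.linalg.eigh(cov)               # ascending eigenvalues
    order = np.argsort(vals)[::-1][:n_components]  # largest-variance axes
    components = vecs[:, order]                    # orthonormal directions
    return Xc @ components                         # projected scores
```

Note the sensitivity to scaling mentioned above: because the covariance matrix is used directly, variables measured on larger scales dominate the first components unless the data are standardized first.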
1.8.3 DRAWBACKS
(1) Poor discriminatory power
(2) High computational load
1.9 DISCRETE WAVELET TRANSFORM
In mathematics, a wavelet series is a representation of a square-integrable
(real- or complex-valued) function by a certain orthonormal series generated by
a wavelet. This section provides a formal, mathematical definition of an
orthonormal wavelet and of the integral wavelet transform.
1.9.1 DEFINITION
A function ψ ∈ L²(ℝ) is called an orthonormal wavelet if it can be used
to define a Hilbert basis, that is, a complete orthonormal system, for the
Hilbert space L²(ℝ) of square-integrable functions. The Hilbert basis is
constructed as the family of functions ψ_{j,k}(x) = 2^{j/2} ψ(2^j x − k) for
integers j, k ∈ ℤ. This family is an orthonormal system if
⟨ψ_{j,k}, ψ_{l,m}⟩ = δ_{jl} δ_{km}, where δ is the Kronecker delta and ⟨·,·⟩
is the standard inner product on L²(ℝ).
The requirement of completeness is that every function f ∈ L²(ℝ) may be
expanded in the basis as
f(x) = Σ_{j,k} c_{j,k} ψ_{j,k}(x),
with convergence of the series understood to be convergence in norm. Such a
representation of a function f is known as a wavelet series. This implies that an
orthonormal wavelet is self-dual.
1.9.2 DRAWBACKS
(1) Loss of edge details due to the shift-variant property.
1.10 KNN CLASSIFIER
In pattern recognition, the k-nearest neighbor algorithm (k-NN) is a method
for classifying objects based on closest training examples in the feature space. K-
NN is a type of instance-based learning, or lazy learning where the function is only
approximated locally and all computation is deferred until classification. The k-
nearest neighbor algorithm is amongst the simplest of all machine
learning algorithms: an object is classified by a majority vote of its neighbors, with
the object being assigned to the class most common amongst its k nearest
neighbors (k is a positive integer, typically small). If k = 1, then the object is
simply assigned to the class of its nearest neighbor.
The same method can be used for regression by simply assigning the
property value for the object to be the average of the values of its k nearest
neighbors. It can be useful to weight the contributions of the neighbors so that
nearer neighbors contribute more to the average than more distant ones. (A
common weighting scheme is to give each neighbor a weight of 1/d, where d is
the distance to the neighbor. This scheme is a generalization of linear
interpolation.)
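The majority-vote rule can be sketched as follows; this is the unweighted version, with plain Euclidean distance as an illustrative choice of metric.

```python
import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x, k=3):
    """Classify point `x` by majority vote among its k nearest training
    points. All computation is deferred until classification (lazy learning):
    there is no training step at all."""
    d = np.linalg.norm(X_train - x, axis=1)   # distance to every example
    nearest = np.argsort(d)[:k]               # indices of k nearest
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]         # most common class among them
```

With k = 1 this reduces to assigning the class of the single nearest neighbor, as stated above.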
1.10.1 DRAWBACKS
(1) Less accuracy in classification
(2) Magnetic resonance images contain noise caused by operator performance,
which can lead to serious inaccuracies in classification.
1.11 PROPOSED METHOD
MRI brain image classification and anatomical structure analysis are proposed
based on:
(1) PNN-RBF network for classification
(2) Fuzzy clustering for tumor detection and structural analysis
1.11.1 METHODOLOGIES
(1) Fast discrete curvelet decomposition
(2) GLCM feature extraction
(3) PNN-RBF training and classification
(4) Brain structural analysis
1.11.2 ADVANTAGES
(1) It can segment the brain regions from the image accurately.
(2) It is useful for classifying brain tumor images for accurate detection.
(3) Brain tumors can be detected at an early stage.
CHAPTER 2
PROPOSED SYSTEM
2.1 FAST DISCRETE CURVELET TRANSFORMATION
Curvelet implementations are based on the original construction, which
uses a pre-processing step involving a special partitioning of phase-space
followed by the ridgelet transform, which is applied to blocks of data that are
well localized in space and frequency.
2.2 BLOCK DIAGRAM
Block diagram: MRI brain image → fast discrete curvelet decomposition →
feature extraction and PNN-RBF training (using training samples) → trained
network classification; if the input is abnormal → segmentation process →
texture feature extraction → tumor and structural part detection →
performance analysis.
In the last two or three years, however, curvelets have been redesigned in an
effort to make them easier to use and understand. As a result, the new
construction is considerably simpler and totally transparent. What is
interesting here is that the new mathematical architecture suggests innovative
algorithmic strategies, and provides the opportunity to improve upon earlier
implementations. There are two new fast discrete curvelet transforms (FDCTs),
which are simpler, faster and less redundant than existing proposals:
(1) Curvelets via USFFT, and
(2) Curvelets via wrapping.
The block size can be changed at each scale level. In the wrapping construction,
the input f[n1, n2] is taken to be a Cartesian array and f̂[n1, n2] denotes its 2-D
discrete Fourier transform. The architecture of the FDCT via wrapping is then as
follows:
1) Apply the 2-D FFT and obtain Fourier samples f̂[n1, n2].
2) For each scale j and angle l, form the product of f̂ with the corresponding
frequency window Ũj,l.
3) Wrap this product around the origin to obtain the localized array f̃j,l.
4) Apply the inverse 2-D FFT to each f̃j,l, hence collecting the discrete
coefficients.
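The four steps can be sketched at a high level as follows. This is only a conceptual illustration, not a faithful FDCT: the frequency window is passed in ready-made (a real implementation constructs the curvelet windows Ũj,l from a tiling of the frequency plane), and the wrapping is simplified to a plain periodization of the windowed spectrum onto a smaller grid.

```python
import numpy as np

def fdct_wrapping_step(f, window, wrapped_shape=(64, 64)):
    """One (scale, angle) step of the FDCT-via-wrapping recipe, sketched.
    `window` is a frequency-domain window for one scale j and angle l
    (assumed given; same shape as `f`)."""
    F = np.fft.fft2(f)                        # step 1: 2-D FFT
    product = window * F                      # step 2: localize scale/angle
    # step 3: "wrap" the localized spectrum onto a smaller grid around the
    # origin by summing aliased copies (periodization)
    m, n = wrapped_shape
    M, N = product.shape
    wrapped = np.zeros((m, n), dtype=complex)
    for i in range(0, M, m):
        for j in range(0, N, n):
            block = product[i:i + m, j:j + n]
            wrapped[:block.shape[0], :block.shape[1]] += block
    return np.fft.ifft2(wrapped)              # step 4: inverse 2-D FFT
```

In a full implementation this step is repeated over all scales j and angles l, yielding the complete set of curvelet coefficients.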
2.3 TEXTURE ANALYSIS
2.3.1OVERVIEW
Texture is that innate property of all surfaces that describes visual patterns,
each having properties of homogeneity. It contains important information about
the structural arrangement of surfaces such as clouds, leaves, bricks and fabric.
It also describes the relationship of the surface to the surrounding environment.
In short, it is a feature that describes the distinctive physical composition of a
surface.
Texture properties include:
(1)Coarseness
(2)Contrast
(3)Directionality
(4)Line-likeness
(5)Regularity
(6)Roughness
Texture is one of the most important defining features of an image. It is
characterized by the spatial distribution of gray levels in a neighborhood. In order
to capture the spatial dependence of gray-level values, which contribute to the
perception of texture, a two-dimensional dependence texture analysis matrix is
taken into consideration. This two-dimensional matrix is obtained by decoding the
image file (JPEG, BMP, etc.).
2.4 METHODS OF REPRESENTATION
There are three principal approaches used to describe texture: statistical,
structural and spectral.
(1) Statistical techniques characterize textures using the statistical properties
of the grey levels of the points/pixels comprising a surface image.
Typically, these properties are computed using the grey-level
co-occurrence matrix of the surface or the wavelet transform of the
surface.
(2) Structural techniques characterize textures as being composed of simple
primitive structures called “texels” (texture elements), arranged
regularly on a surface according to some surface arrangement rules.
(3) Spectral techniques are based on properties of the Fourier spectrum and
describe the global periodicity of the grey levels of a surface by
identifying high-energy peaks in the spectrum.
For classification purposes, what concerns us are the statistical techniques
of characterization, because it is these techniques that yield computable texture
properties. The most popular statistical representations of texture are:
(1) Co-occurrence Matrix
(2) Tamura Texture
(3) Wavelet Transform
2.4.1 CO-OCCURRENCE MATRIX
Originally proposed by R.M. Haralick, the co-occurrence matrix
representation of texture features explores the grey level spatial dependence of
texture. A mathematical definition of the co-occurrence matrix is as follows:
- Given a position operator P(i,j),
- let A be an n x n matrix
- Whose element A[i][j] is the number of times that points with grey level
(intensity) g[i] occur, in the position specified by P, relative to points
with grey level g[j].
- Let C be the n x n matrix that is produced by dividing A with the total
number of point pairs that satisfy P. C[i][j] is a measure of the joint
probability that a pair of points satisfying P will have values g[i], g[j].
- C is called a co-occurrence matrix defined by P.
Examples for the operator P are: “i above j”, or “i one position to the right and two
below j”, etc.
This can also be illustrated as follows. Let t be a translation; then a co-occurrence
matrix Ct of a region is defined for every grey-level pair (a, b) by [1]:
Ct(a, b) = card{(s, s + t) ∈ R² | A[s] = a, A[s + t] = b}
Here, Ct(a, b) is the number of site-couples, denoted by (s, s + t), that are
separated by a translation vector t, with a being the grey-level of s and b being
the grey-level of s + t.
For example, with an 8 grey-level image representation and a vector t that
considers only one neighbor, we would obtain an 8 × 8 co-occurrence matrix [1].
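The construction of Ct can be sketched directly from the definition; a brute-force version, assuming the image A holds integer grey levels 0 .. levels−1:

```python
import numpy as np

def cooccurrence(A, t, levels):
    """C_t(a, b) = number of site pairs (s, s + t) with A[s] = a and
    A[s + t] = b, for a translation t = (dr, dc)."""
    dr, dc = t
    h, w = A.shape
    C = np.zeros((levels, levels), dtype=int)
    for r in range(h):
        for c in range(w):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < h and 0 <= c2 < w:       # pair stays inside the image
                C[A[r, c], A[r2, c2]] += 1
    return C
```

For instance, t = (0, 1) realizes the operator "i one position to the right of j" mentioned above.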
2.5 CLASSICAL CO-OCCURRENCE MATRIX
At first the co-occurrence matrix is constructed, based on the orientation and
distance between image pixels. Then meaningful statistics are extracted from the
matrix as the texture representation. Haralick proposed the following texture
features:
(1) Energy
(2) Contrast
(3) Correlation
(4) Homogeneity
Hence we obtain a co-occurrence matrix, and from it the Haralick texture
features. These co-occurrence matrices represent the spatial distribution and the
dependence of the grey levels within a local area. Each (i, j)-th entry in the
matrices represents the probability of going from one pixel with a grey level of
i to another with a grey level of j under a predefined distance and angle. From
these matrices, sets of statistical measures, called feature vectors, are computed.
2.5.1 ENERGY
Energy is a grey-scale image texture measure of homogeneity, reflecting the
uniformity of the image's grey-level distribution and the coarseness of its texture:
Energy = sum(sum(p(x, y)^2))
where p(x, y) is the GLCM.
2.5.2 CONTRAST
Contrast is the moment of inertia about the main diagonal of the GLCM; it
measures how the matrix values are distributed and the amount of local variation
in the image, reflecting image clarity and the depth of texture grooves:
Contrast = sum(sum((x − y)^2 p(x, y)))
2.5.3 CORRELATION COEFFICIENT
The correlation coefficient measures the joint probability of occurrence of the
specified pixel pairs:
Correlation = sum(sum((x − μx)(y − μy) p(x, y) / (σx σy)))
2.5.4 HOMOGENEITY
Homogeneity measures the closeness of the distribution of elements in the
GLCM to the GLCM diagonal:
Homogeneity = sum(sum(p(x, y) / (1 + |x − y|)))
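The four measures above can be computed from a co-occurrence matrix as follows (p is the matrix normalized to sum to 1; μx, μy, σx, σy are the marginal means and standard deviations used by the correlation formula):

```python
import numpy as np

def haralick_features(C):
    """Energy, contrast, correlation and homogeneity of a co-occurrence
    matrix C (any square array of pair counts)."""
    p = C / C.sum()                             # normalize to probabilities
    x, y = np.indices(p.shape)                  # grey-level coordinates
    energy = np.sum(p ** 2)
    contrast = np.sum((x - y) ** 2 * p)
    mx, my = np.sum(x * p), np.sum(y * p)       # marginal means
    sx = np.sqrt(np.sum((x - mx) ** 2 * p))     # marginal std deviations
    sy = np.sqrt(np.sum((y - my) ** 2 * p))
    correlation = np.sum((x - mx) * (y - my) * p) / (sx * sy)
    homogeneity = np.sum(p / (1 + np.abs(x - y)))
    return energy, contrast, correlation, homogeneity
```

Stacking these values over several distances and angles yields the feature vectors described above.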
CHAPTER 3
PROBABILISTIC NEURAL NETWORKS
3.1 NEURAL NETWORK
Neural networks are predictive models loosely based on the action of
biological neurons. The selection of the name “neural network” was one of the
great PR successes of the Twentieth Century. It certainly sounds more exciting
than a technical description such as “A network of weighted, additive values with
nonlinear transfer functions”. However, despite the name, neural networks are far
from “thinking machines” or “artificial brains”. A typical artificial neural network
might have a hundred neurons. In comparison, the human nervous system is
believed to have about 3×10^10 neurons. We are still light years from “Data”.
The original “Perceptron” model was developed by Frank Rosenblatt in 1958.
Rosenblatt's model consisted of three layers: (1) a “retina” that distributed inputs
to the second layer; (2) “association units” that combine the inputs with weights
and trigger a threshold step function which feeds the output layer; and (3) the
output layer, which combines the values. Unfortunately, the use of a step function
in the neurons made the perceptrons difficult or impossible to train. A critical
analysis of perceptrons published in 1969 by Marvin Minsky and Seymour Papert
pointed out a number of critical weaknesses, and, for a period of time, interest in
perceptrons waned.
Interest in neural networks was revived in 1986 when David Rumelhart,
Geoffrey Hinton and Ronald Williams published “Learning Internal
Representations by Error Propagation”. They proposed a multilayer neural
network with nonlinear but differentiable transfer functions that avoided the
pitfalls of the original perceptron's step functions. They also provided a
reasonably effective training algorithm for neural networks.
In this paper, we investigate the possibility of utilizing the detection
capability of the neural networks as a diagnostic tool for early breast cancer. Breast
cancer is one of leading causes of women death in the world. Detection and
localization of tumor at early stages is the only way to decrease the mortality rate.
X-ray mammography is the most used diagnostic tool for breast cancer screening.
However, it has important limitations. For example, it has high false-negative
and false-positive rates that may reach up to 34%. Thus, the development of imaging
modalities which enhance, complement, or replace X-ray mammography has been
a priority in the medical imaging research. Recently, several microwave imaging
methods have been developed. Microwave ultra-wideband (UWB) imaging is
currently one of promising methods that are under investigation by many research
groups around the world. This method involves transmitting UWB signals through
the breast tissue and records the scattered signals from different locations
surrounding the breast using compact UWB antennas. The basis of this method is
the difference in the electrical properties between the tumors and the healthy
tissues.
It has been shown by different research groups worldwide that the healthy
fat tissues of the breast have a dielectric constant that ranges between 5 and 9 and
a conductivity between 0.02 S/m and 0.2 S/m. On the other hand, the malignant
tissues have a dielectric constant of around 60 and a conductivity of around
2 S/m. Thus, it is clear that there is a significant contrast between the electrical
characteristics of the
healthy and malignant tissues. From the electromagnetic point of view, this
contrast means a significant difference in the scattering properties of those tissues.
From the neural network point of view, it is possible to say that the difference in
the scattering properties of healthy and malignant tissues means that the scattered
signals recorded in different places around the breast have the signature of
existence of tumor. The comparison between those received signals recorded in
different places gives an indication about the place of tumor.
Reviewing the literature shows that most of the research on using neural
networks for breast cancer detection were aimed at enabling radiologist to make
accurate diagnoses of breast cancer on X-ray mammograms. For example, several
neural network methods were used in [15] for automated classification of clustered
micro-calcifications in mammograms.
Concerning the UWB systems, the neural network has been used extensively
to detect the direction of arrival of ultra wideband signals. It is also utilized in a
simple manner to detect and locate a tumor of 2.5 mm radius along a certain axis
(one dimension) and also in two dimensions but by rotating the place of receiving
the signal by steps of 1 degree around the breast.
This paper presents the use of neural network to detect and locate tumors in
breast with sizes down to 1 mm in radius by using four permanent probes around
the breast. A three dimensional breast model is built for the purpose of this
investigation. A pulsed plane electromagnetic wave is sent towards the breast and
the scattered signals are collected by four antennas surrounding the breast. The
transmitted pulse occupies the UWB frequency range from 3.1 GHz to 10.6 GHz.
A simple feed-forward back-propagation neural network is then used to detect the
tumor and locate its position.
3.2 THE USED NEURAL NETWORK
The neural network is the best tool for recognition and discrimination between
different sets of signals. To get the best results using a neural network, it is
necessary to choose a suitable architecture and learning algorithm. Unfortunately
there is no guaranteed method to do that. The best approach is to choose what is
expected to be suitable according to previous experience, and then to expand or
shrink the neural network size until a reasonable output is obtained. In this work
we tried different sizes for the neural network using MATLAB, and we found
that the best in our case is the model shown in Fig. 2. It has an input layer with
2000 inputs, a first hidden layer with 11 nodes and a TANSIG transfer function,
a second hidden layer with 7 nodes and a TANSIG transfer function, and an
output layer with a PURELIN transfer function and 2 outputs. One of the two
outputs is used for the detection of the tumor, and the other for its localization.
The TANSIG transfer function is selected to limit the signal between −1 and 1.
For the output layer, the PURELIN transfer function is chosen so that all
possible tumor locations can be represented. The proposed neural network has
been used to detect and locate the tumor in two cases. In the first case, the
task was to detect and locate a tumor in a two-dimensional sector of the
breast model; the tumor location was chosen randomly at the center or in any
of the four quadrants. In the second case, the task was to detect and locate a
tumor anywhere in the three-dimensional model. The neural network was trained
on 100 input sets using the training function TRAINSCG. An additional 40 input
sets were used to test the performance of each neural network.
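The 2000-11-7-2 forward pass described above can be sketched in Python, under the assumption that TANSIG and PURELIN behave like MATLAB's tanh and identity transfer functions. The weights here are random stand-ins for illustration only, not the TRAINSCG-trained values:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(n_in, n_out):
    # Small random weights and zero biases, standing in for trained values.
    return rng.normal(scale=0.01, size=(n_out, n_in)), np.zeros(n_out)

W1, b1 = layer(2000, 11)   # first hidden layer, TANSIG (tanh)
W2, b2 = layer(11, 7)      # second hidden layer, TANSIG (tanh)
W3, b3 = layer(7, 2)       # output layer, PURELIN (identity)

def forward(x):
    h1 = np.tanh(W1 @ x + b1)   # TANSIG limits each signal to (-1, 1)
    h2 = np.tanh(W2 @ h1 + b2)
    return W3 @ h2 + b3         # PURELIN: unbounded linear outputs

y = forward(rng.normal(size=2000))
print(y.shape)  # → (2,)  one output for detection, one for localization
```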
3.3 TYPES OF NEURAL NETWORKS
1) Artificial Neural Network
2) Probabilistic Neural Networks
3) General Regression Neural Networks
DTREG implements the most widely used types of neural networks:
(1) Multilayer Perceptron Networks (also known as multilayer feed-forward
networks),
(2) Cascade Correlation Neural Networks,
(3) Probabilistic Neural Networks (PNN),
(4) General Regression Neural Networks (GRNN).
3.4 RADIAL BASIS FUNCTION NETWORKS
(1) Functional Link Networks,
(2) Kohonen Networks,
(3) Gram-Charlier Networks,
(4) Hebb Networks,
(5) Adaline Networks,
(6) Hybrid Networks.
3.5 THE MULTILAYER PERCEPTRON NEURAL NETWORK MODEL
The following diagram illustrates a perceptron network with three layers:
This network has an input layer (on the left) with three neurons, one hidden
layer (in the middle) with three neurons and an output layer (on the right) with
three neurons.
There is one neuron in the input layer for each predictor variable. In the case
of categorical variables, N-1 neurons are used to represent the N categories of the
variable.
3.5.1 INPUT LAYER
A vector of predictor variable values (x1...xp) is presented to the input layer.
The input layer (or processing before the input layer) standardizes these values so
that the range of each variable is -1 to 1. The input layer distributes the
values to each of the neurons in the hidden layer. In addition to the
predictor variables, there is a constant input of 1.0, called the bias, that
is fed to each of the hidden neurons; the bias is multiplied by a weight and
added to the sum going into the neuron.
3.5.2 HIDDEN LAYER
Arriving at a neuron in the hidden layer, the value from each input neuron is
multiplied by a weight (wji), and the resulting weighted values are added together
producing a combined value uj. The weighted sum (uj) is fed into a transfer
function, σ, which outputs a value hj. The outputs from the hidden layer are
distributed to the output layer.
3.5.3 OUTPUT LAYER
Arriving at a neuron in the output layer, the value from each hidden-layer
neuron is multiplied by a weight (wkj), and the resulting weighted values are
added together, producing a combined value vk. The weighted sum (vk) is fed
into a transfer function, σ, which outputs a value yk. The y values are the
outputs of the network.
If a regression analysis is being performed with a continuous target variable,
then there is a single neuron in the output layer, and it generates a single y value.
For classification problems with categorical target variables, there are N neurons in
the output layer producing N values, one for each of the N categories of the target
variable.
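The input-hidden-output computation just described can be sketched as follows; the predictor ranges, weights, and bias values here are invented purely for illustration:

```python
import numpy as np

def standardize(x, lo, hi):
    # Map each predictor from its [lo, hi] range into [-1, 1] (Section 3.5.1).
    return 2.0 * (x - lo) / (hi - lo) - 1.0

# Three predictor values with their (assumed) observed ranges.
x = standardize(np.array([3.0, 7.5, 1.0]),
                lo=np.array([0.0, 5.0, 0.0]),
                hi=np.array([10.0, 10.0, 2.0]))

w = np.array([[0.2, -0.4, 0.1],    # wji: one row per hidden neuron
              [0.5, 0.3, -0.2]])
bias_w = np.array([0.1, -0.3])     # weight applied to the constant 1.0 input

u = w @ x + bias_w * 1.0           # combined value uj (Section 3.5.2)
h = np.tanh(u)                     # transfer function σ produces hj
print(h)
```

The same weighted-sum-then-transfer step is repeated at the output layer to produce the yk values.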
3.6 PROBABILISTIC NEURAL NETWORKS (PNN)
Probabilistic (PNN) and General Regression Neural Networks (GRNN) have
similar architectures, but there is a fundamental difference: Probabilistic networks
perform classification where the target variable is categorical, whereas general
regression neural networks perform regression where the target variable is
continuous. If you select a PNN/GRNN network, DTREG will automatically select
the correct type of network based on the type of target variable.
3.6.1 ARCHITECTURE OF A PNN
Fig 3.6
All PNN networks have four layers:
(1)Input layer - There is one neuron in the input layer for each predictor
variable. In the case of categorical variables, N-1 neurons are used where N
is the number of categories. The input neurons (or processing before the
input layer) standardize the range of the values by subtracting the median
and dividing by the interquartile range. The input neurons then feed the
values to each of the neurons in the hidden layer.
(2) Hidden layer - This layer has one neuron for each case in the training data
set. The neuron stores the values of the predictor variables for the case along
with the target value. When presented with the x vector of input values from
the input layer, a hidden neuron computes the Euclidean distance of the test
case from the neuron’s center point and then applies the RBF kernel function
using the sigma value(s). The resulting value is passed to the neurons in the
pattern layer.
(3)Pattern layer / Summation layer - The next layer in the network is different
for PNN networks and for GRNN networks. For PNN networks there is one
pattern neuron for each category of the target variable. The actual target
category of each training case is stored with each hidden neuron; the
weighted value coming out of a hidden neuron is fed only to the pattern
neuron that corresponds to the hidden neuron’s category. The pattern
neurons add the values for the class they represent (hence, it is a weighted
vote for that category).
For GRNN networks, there are only two neurons in the pattern layer.
One neuron is the denominator summation unit; the other is the numerator
summation unit. The denominator summation unit adds up the weight values
coming from each of the hidden neurons. The numerator summation unit
adds up the weight values multiplied by the actual target value for each
hidden neuron.
(4)Decision layer - The decision layer is different for PNN and GRNN
networks. For PNN networks, the decision layer compares the weighted
votes for each target category accumulated in the pattern layer and uses the
largest vote to predict the target category.
For GRNN networks, the decision layer divides the value accumulated
in the numerator summation unit by the value in the denominator summation
unit and uses the result as the predicted target value.
The following diagram shows the actual proposed network used in our
project:
Fig 3.6b
(1)Input Layer:
The input vector, denoted as p, is presented as the black vertical bar. Its
dimension is R × 1. In this paper, R = 3.
(2) Radial Basis Layer:
In the Radial Basis Layer, the distances between the input vector p and the
weight vectors made of each row of the weight matrix W are calculated [8].
Assume the dimension of W is Q×R. The distance between p and the i-th row of
W produces the i-th element of the distance vector ||W − p||, whose dimension
is Q×1. The minus symbol, "−", indicates that it is the distance between
vectors. Then, the bias vector b is combined with ||W − p|| by an
element-by-element multiplication; the result is denoted as n = ||W − p|| .* b.
The transfer function in a PNN has a built-in distance criterion with respect
to a center. In this paper, it is defined as

radbas(n) = e^(−n²) (1)

Each element of n is substituted into Eq. (1) and produces the corresponding
element of a, the output vector of the Radial Basis Layer. The i-th element of
a can be represented as

ai = radbas(||Wi − p|| .* bi) (2)

where Wi is the vector made of the i-th row of W and bi is the i-th element of
the bias vector b.
Some characteristics of the Radial Basis Layer: the i-th element of a equals 1
if the input p is identical to the i-th row of the input weight matrix W. A
radial basis neuron with a weight vector close to the input vector p produces
a value near 1, and its output weights in the competitive layer then pass
their values to the competitive function. It is also possible that several
elements of a are close to 1, since the input pattern may be close to several
training patterns.
(3) Competitive Layer:
There is no bias in the Competitive Layer. In the Competitive Layer, the
vector a is first multiplied by the layer weight matrix M, producing an output
vector d. The competitive function, denoted as C in Fig. 2, produces a 1
corresponding to the largest element of d, and 0s elsewhere. The output vector
of the competitive function is denoted as c. The index of the 1 in c indicates
the tumor class assigned by the system. The dimension of the output vector, K,
is 5 in this paper.
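The two layers above can be sketched end to end in Python. Toy dimensions are used (R = 3, Q = 4 stored patterns, K = 2 classes instead of the report's K = 5), and W, M, and the SPREAD value are invented stand-ins:

```python
import numpy as np

W = np.array([[0.1, 0.2, 0.3],   # Q x R: one stored training pattern per row
              [0.9, 0.8, 0.7],
              [0.1, 0.25, 0.3],
              [0.5, 0.5, 0.5]])
b = np.full(4, 0.8326 / 0.5)     # bias vector: 0.8326 / SPREAD, SPREAD = 0.5
M = np.array([[1, 0, 1, 0],      # K x Q layer weights: row k sums class k votes
              [0, 1, 0, 1]], dtype=float)

def radbas(n):
    return np.exp(-n ** 2)       # Eq. (1): radbas(n) = e^(-n^2)

def pnn(p):
    dist = np.linalg.norm(W - p, axis=1)  # ||Wi - p|| for each stored row
    a = radbas(dist * b)                  # element-wise product with bias, Eq. (2)
    d = M @ a                             # summed votes per class
    c = np.zeros_like(d)
    c[np.argmax(d)] = 1.0                 # competitive function: winner gets 1
    return c

print(pnn(np.array([0.1, 0.2, 0.3])))    # → [1. 0.]
```

An input identical to a stored row drives its radial basis output to exactly 1, so its class wins the competition.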
3.6.2 HOW A PNN NETWORK WORKS
Although the implementation is very different, probabilistic neural networks
are conceptually similar to K-Nearest Neighbor (k-NN) models. The basic idea
is that the predicted target value of an item is likely to be about the same
as that of other items with close values of the predictor variables. Consider
this figure:
Fig 3.6.2a
Assume that each case in the training set has two predictor variables, x and
y. The cases are plotted using their x,y coordinates as shown in the figure. Also
assume that the target variable has two categories, positive which is denoted by a
square and negative which is denoted by a dash. Now, suppose we are trying to
predict the value of a new case represented by the triangle with predictor values
x=6, y=5.1. Should we predict the target as positive or negative?
Notice that the triangle is positioned almost exactly on top of a dash
representing a negative value. But that dash is in a fairly unusual position
compared to the other dashes, which are clustered below the squares and left
of center. So it could be that the underlying negative value is an odd case.
The nearest neighbor classification performed for this example depends on
how many neighboring points are considered. If 1-NN is used and only the closest
point is considered, then clearly the new point should be classified as negative
since it is on top of a known negative point. On the other hand, if 9-NN
classification is used and the closest 9 points are considered, then the
effect of the surrounding 8 positive points may outweigh the close negative
point.
A probabilistic neural network builds on this foundation and generalizes it to
consider all of the other points. The distance is computed from the point being
evaluated to each of the other points, and a radial basis function (RBF) (also called
a kernel function) is applied to the distance to compute the weight (influence) for
each point. The radial basis function is so named because the radius distance is the
argument to the function.
Weight = RBF (distance)
The further some other point is from the new point, the less influence it has.
Fig 3.6.2b
3.7 RADIAL BASIS FUNCTION
Different types of radial basis functions could be used, but the most common
is the Gaussian function:
Fig 3.7
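The weighted-vote idea above can be sketched in Python with a Gaussian RBF. The training points, labels, and sigma are invented to mirror the figure's example of a new case at x=6, y=5.1 sitting on an isolated negative point among clustered positives:

```python
import numpy as np

def gaussian_rbf(dist, sigma):
    # Weight = RBF(distance): influence decays with squared distance.
    return np.exp(-dist ** 2 / (2.0 * sigma ** 2))

points = np.array([[5.9, 5.0],                 # an isolated negative point
                   [3.0, 2.0], [3.5, 2.5],     # other negatives, far away
                   [5.0, 6.0], [5.5, 6.5], [6.5, 6.0]])  # nearby positives
labels = np.array([0, 0, 0, 1, 1, 1])          # 0 = negative, 1 = positive
new = np.array([6.0, 5.1])

w = gaussian_rbf(np.linalg.norm(points - new, axis=1), sigma=1.0)
votes = np.bincount(labels, weights=w)         # summed influence per class
print(votes.argmax())                          # → 1
```

Even though the single closest point is negative, the combined influence of the surrounding positives carries the vote, which is exactly the behavior the text contrasts with 1-NN.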
3.8 ADVANTAGES AND DISADVANTAGES OF PNN NETWORKS
(1) It is usually much faster to train a PNN/GRNN network than a multilayer
perceptron network.
(2) PNN/GRNN networks are often more accurate than multilayer perceptron
networks.
(3) PNN/GRNN networks are relatively insensitive to outliers (wild points).
(4) PNN networks generate accurate predicted target probability scores.
(5) PNN networks approach Bayes optimal classification.
(6) PNN/GRNN networks are slower than multilayer perceptron networks at
classifying new cases.
(7) PNN/GRNN networks require more memory space to store the model.
3.9 REMOVING UNNECESSARY NEURONS
One of the disadvantages of PNN models compared to multilayer perceptron
networks is that PNN models are large due to the fact that there is one neuron for
each training row. This causes the model to run slower than multilayer perceptron
networks when using scoring to predict values for new rows.
DTREG provides an option to remove unnecessary neurons from the model after
the model has been constructed.
Removing unnecessary neurons has three benefits:
(1)The size of the stored model is reduced.
(2)The time required to apply the model during scoring is reduced.
(3)Removing neurons often improves the accuracy of the model.
The process of removing unnecessary neurons is an iterative process. Leave-
one-out validation is used to measure the error of the model with each neuron
removed. The neuron that causes the least increase in error (or possibly the largest
reduction in error) is then removed from the model. The process is repeated with
the remaining neurons until the stopping criterion is reached.
When unnecessary neurons are removed, the “Model Size” section of the
analysis report shows how the error changes with different numbers of neurons.
You can see a graphical chart of this by clicking Chart/Model size.
There are three criteria that can be selected to guide the removal of neurons:
(1)Minimize error – If this option is selected, then DTREG removes neurons as
long as the leave-one-out error remains constant or decreases. It stops when
it finds a neuron whose removal would cause the error to increase above the
minimum found.
(2)Minimize neurons – If this option is selected, DTREG removes neurons until
the leave-one-out error would exceed the error for the model with all
neurons.
(3) # of neurons – If this option is selected, DTREG removes the least
significant neurons until only the specified number of neurons remains.
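The iterative pruning loop described above can be sketched as follows, using the "minimize error" stopping criterion. The error function here is a toy stand-in for DTREG's leave-one-out measure, invented so that only neurons 1 and 3 matter:

```python
def prune(neurons, error_without):
    # error_without(subset) -> validation error of the model on that subset;
    # here an abstract callable standing in for leave-one-out validation.
    best_err = error_without(neurons)
    while len(neurons) > 1:
        # Try removing each neuron and keep the removal that hurts least.
        candidates = [(error_without([n for n in neurons if n != r]), r)
                      for r in neurons]
        err, worst = min(candidates)
        if err > best_err:          # "minimize error" stopping criterion
            break
        best_err = err
        neurons = [n for n in neurons if n != worst]
    return neurons

# Toy error: pretend neurons 1 and 3 are the only useful ones, and every
# extra neuron adds a small penalty (so pruning can also improve accuracy).
useful = {1, 3}
err_fn = lambda s: 1.0 - len(useful & set(s)) / 2 + 0.01 * len(set(s) - useful)
print(prune([0, 1, 2, 3, 4], err_fn))   # → [1, 3]
```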
3.10 PNN
NEWPNN creates a two-layer network. The first layer has RADBAS neurons; it
calculates its weighted inputs with DIST and its net input with NETPROD. The
second layer has COMPET neurons; it calculates its weighted input with
DOTPROD and its net input with NETSUM. Only the first layer has biases.
NEWPNN sets the first-layer weights to P', and the first-layer biases are all
set to 0.8326/SPREAD, resulting in radial basis functions that cross 0.5 at
weighted inputs of +/- SPREAD. The second-layer weights W2 are set to T.
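The 0.8326 constant quoted above can be checked numerically: with bias b = 0.8326/SPREAD, an input at distance SPREAD from a stored pattern gives a weighted input n = SPREAD · b = 0.8326, and radbas(n) = e^(−n²) crosses 0.5 there. The SPREAD value below is an arbitrary choice for the demonstration:

```python
import math

spread = 2.0                  # arbitrary choice for the demonstration
b = 0.8326 / spread           # first-layer bias as set by NEWPNN
n = spread * b                # weighted input at distance +/- SPREAD
print(round(math.exp(-n ** 2), 3))   # → 0.5
```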
3.10.1 METHODOLOGY
A description of the derivation of the PNN classifier was given above. PNNs
have been used for classification problems; the PNN classifier presents good
accuracy, very small training time, robustness to weight changes, and
negligible retraining time. There are 6 stages involved in the proposed model,
running from the data input to the output. The first stage is the image
processing system. In an image processing system, image acquisition and image
enhancement are the usual first steps; in this work these two steps are
skipped and all the images are collected from an available resource. The
proposed model requires converting each image into a format capable of being
manipulated by the computer, so the MR images are converted into matrix form
using MATLAB. Then, the PNN is used to classify the MR images. Lastly,
performance based on the results is analyzed at the end of the development
phase.
CHAPTER 4
SPATIAL FUZZY CLUSTERING MODEL
Fuzzy clustering plays an important role in solving problems in the areas of
pattern recognition and fuzzy model identification. A variety of fuzzy clustering
methods have been proposed and most of them are based upon distance criteria.
One widely used algorithm is the fuzzy c-means (FCM) algorithm. It uses
reciprocal distance to compute fuzzy weights. A more efficient algorithm is the
new FCFM. It computes the cluster center using Gaussian weights, uses large
initial prototypes, and adds processes of eliminating, clustering and merging. In the
following sections we discuss and compare the FCM algorithm and FCFM
algorithm.
The spatial fuzzy c-means method incorporates spatial information, and the
membership weighting of each cluster is altered after the cluster
distribution in the neighborhood is considered. The first pass is the same as
in standard FCM: the membership function is calculated in the spectral
domain. In the second pass, the membership information of each pixel is
mapped to the spatial domain and the spatial function is computed from it.
The FCM iteration then proceeds with the new membership that incorporates the
spatial function. The iteration is stopped when the maximum difference
between cluster centers or membership functions at two successive iterations
is less than a small threshold value.
The fuzzy c-means (FCM) algorithm was introduced by J. C. Bezdek [2].
The idea of FCM is to use the weights that minimize the total weighted
mean-square error:

J(wqk, z(k)) = Σ(q=1,Q) Σ(k=1,K) (wqk)^p ||x(q) − z(k)||²

subject to Σ(k=1,K) wqk = 1 for each q, with the weights given by

wqk = (1/Dqk²)^(1/(p−1)) / Σ(k=1,K) (1/Dqk²)^(1/(p−1)), p > 1

where Dqk = ||x(q) − z(k)|| is the distance from feature vector x(q) to
prototype z(k).
The FCM allows each feature vector to belong to every cluster with a fuzzy
truth value (between 0 and 1), which is computed using Equation (4). The
algorithm assigns a feature vector to a cluster according to the maximum weight of
the feature vector over all clusters.
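The reciprocal-distance weight update and the resulting fuzzy memberships can be sketched in Python; the data points, prototypes, and p = 2 below are invented for illustration:

```python
import numpy as np

def fcm_weights(X, Z, p=2.0):
    # Dqk^2: squared distance from each feature vector q to each prototype k.
    D2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    inv = (1.0 / D2) ** (1.0 / (p - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)   # each row sums to 1

X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1]])
Z = np.array([[0.0, 0.1], [1.0, 0.9]])            # current prototypes

W = fcm_weights(X, Z)
# Update centers as weight^p-weighted means of the feature vectors.
Z_new = (W.T ** 2 @ X) / (W.T ** 2).sum(axis=1, keepdims=True)
print(W.sum(axis=1))   # fuzzy truth values per vector sum to 1
```

Each feature vector belongs to every cluster with a truth value between 0 and 1, and the maximum weight decides its cluster assignment, as the text states.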
4.1 A NEW FUZZY C-MEANS IMPLEMENTATION
4.1.1 ALGORITHM FLOW
4.2 INITIALIZE THE FUZZY WEIGHTS
In order to compare the FCM with the FCFM, our implementation allows the
user to choose between initializing the weights using feature vectors or
randomly. Initializing the weights using feature vectors assigns the first
Kinit (user-given) feature vectors to prototypes and then computes the
weights by Equation (4).
4.3 STANDARDIZE THE WEIGHTS OVER Q
During the FCM iteration, the computed cluster centers get closer and closer.
To avoid rapid convergence and always grouping into one cluster, we use

w[q,k] = (w[q,k] − wmin)/(wmax − wmin)

before standardizing the weights over Q, where wmax and wmin are the maximum
and minimum weights over all feature vectors for the particular class
prototype.
4.4 ELIMINATING EMPTY CLUSTERS
After the fuzzy clustering loop we add a step (Step 8) to eliminate empty
clusters. This step is placed outside the fuzzy clustering loop and before
the calculation of the modified XB validity. Without the elimination, the
minimum prototype-pair distance used in Equation (8) might be the distance
between a pair of empty clusters. We call the small-cluster elimination
routine with a threshold of 0 so that it eliminates only the empty clusters.
After the fuzzy c-means iteration, for the purpose of comparison and to pick
the optimal result, we add Step 9 to calculate the cluster centers and the
modified Xie-Beni clustering validity:
The Xie-Beni validity is a product of compactness and separation measures.
The compactness-to-separation ratio is defined by Equation (6):

XB = {(1/K) Σ(k=1,K) σk²} / Dmin²

σk² = Σ(q=1,Q) wqk ||x(q) − c(k)||²

where Dmin is the minimum distance between the cluster centers.
The variance of each cluster is calculated by summing over only the members
of each cluster rather than over all Q, which contrasts with the original
Xie-Beni validity measure:

σk² = Σ(q: q is in cluster k) wqk ||x(q) − c(k)||²
The spatial function is incorporated into the membership function as given in
Equation
4.5 MORPHOLOGICAL PROCESS
Morphological image processing is a collection of non-linear operations
related to the shape or morphology of features in an image. Morphological
operations rely only on the relative ordering of pixel values, not on their numerical
values, and therefore are especially suited to the processing of binary images.
Morphological operations can also be applied to grey-scale images in cases
where their light transfer functions are unknown and therefore their absolute
pixel values are of no or minor interest.
Morphological techniques probe an image with a small shape or template
called a structuring element. The structuring element is positioned at all possible
locations in the image and it is compared with the corresponding neighbourhood of
pixels. Some operations test whether the element "fits" within the neighbourhood,
while others test whether it "hits" or intersects the neighbourhood:
A morphological operation on a binary image creates a new binary image in which
the pixel has a non-zero value only if the test is successful at that location in the
input image.
The structuring element is a small binary image, i.e. a small matrix of
pixels, each with a value of zero or one:
The matrix dimensions specify the size of the structuring element.
The pattern of ones and zeros specifies the shape of the structuring element.
An origin of the structuring element is usually one of its pixels, although generally
the origin can be outside the structuring element.
fig 4.5 Examples of simple structuring elements.
A common practice is to have odd dimensions of the structuring matrix and
the origin defined as the centre of the matrix. Structuring elements play in
morphological image processing the same role as convolution kernels in linear
image filtering.
When a structuring element is placed in a binary image, each of its pixels is
associated with the corresponding pixel of the neighbourhood under the structuring
element. The structuring element is said to fit the image if, for each of its pixels set
to 1, the corresponding image pixel is also 1. Similarly, a structuring element is
said to hit, or intersect, an image if, at least for one of its pixels set to 1 the
corresponding image pixel is also 1.
fig 4.5a Fitting and hitting of a binary image with structuring elements s1 and s2.
Zero-valued pixels of the structuring element are ignored, i.e. indicate points where
the corresponding image value is irrelevant.
4.6 FUNDAMENTAL OPERATIONS
More formal descriptions and examples of how basic morphological
operations work are given in the Hypermedia Image Processing Reference (HIPR)
developed by Dr. R. Fisher et al. at the Department of Artificial Intelligence in the
University of Edinburgh, Scotland, UK.
4.7 EROSION AND DILATION
The erosion of a binary image f by a structuring element s (denoted f ⊖ s)
produces a new binary image g = f ⊖ s with ones in all locations (x,y) of a
structuring element's origin at which that structuring element s fits the
input image f, i.e. g(x,y) = 1 if s fits f and 0 otherwise, repeating for all
pixel coordinates (x,y).
Erosion with small (e.g. 2×2 to 5×5) square structuring elements shrinks an
image by stripping away a layer of pixels from both the inner and outer
boundaries of regions. The holes and gaps between different regions become
larger, and small details are eliminated:
fig 4.7 Erosion: a 3×3 square structuring element
Larger structuring elements have a more pronounced effect; the result of
erosion with a large structuring element is similar to the result obtained by
iterated erosion using a smaller structuring element of the same shape. If s1
and s2 are a pair of structuring elements identical in shape, with s2 twice
the size of s1, then f ⊖ s2 ≈ (f ⊖ s1) ⊖ s1.
Erosion removes small-scale details from a binary image but simultaneously
reduces the size of the regions of interest, too. By subtracting the eroded
image from the original image, the boundary of each region can be found:
b = f − (f ⊖ s), where f is an image of the regions, s is a 3×3 structuring
element, and b is an image of the region boundaries.
The dilation of an image f by a structuring element s (denoted f ⊕ s)
produces a new binary image g = f ⊕ s with ones in all locations (x,y) of a
structuring element's origin at which that structuring element s hits the
input image f, i.e. g(x,y) = 1 if s hits f and 0 otherwise, repeating for all
pixel coordinates (x,y). Dilation has the opposite effect to erosion: it adds
a layer of pixels to both the inner and outer boundaries of regions.
The holes enclosed by a single region and the gaps between different regions
become smaller, and small intrusions into the boundaries of a region are
filled in:
fig 4.7a Dilation: a 3×3 square structuring element
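The fit/hit definitions above can be sketched in pure NumPy (no SciPy); the 7×7 test image and the 3×3 square element are invented for the demonstration:

```python
import numpy as np

def _scan(f, s, test):
    # Slide the structuring element s over every pixel of f (zero-padded
    # border) and apply the fit/hit test to the pixels under the 1s of s.
    h, w = s.shape
    pad = np.pad(f, ((h // 2,), (w // 2,)))
    out = np.zeros_like(f)
    for y in range(f.shape[0]):
        for x in range(f.shape[1]):
            out[y, x] = int(test(pad[y:y + h, x:x + w][s == 1]))
    return out

def erode(f, s):     # g(x,y) = 1 iff s FITS f at (x,y): all covered pixels are 1
    return _scan(f, s, lambda v: bool(v.all()))

def dilate(f, s):    # g(x,y) = 1 iff s HITS f at (x,y): any covered pixel is 1
    return _scan(f, s, lambda v: bool(v.any()))

f = np.zeros((7, 7), dtype=int)
f[2:5, 2:5] = 1                     # a 3x3 square region of ones
s = np.ones((3, 3), dtype=int)      # 3x3 square structuring element

print(erode(f, s).sum(), dilate(f, s).sum())   # → 1 25
```

Erosion strips the square down to its single center pixel, while dilation grows it to 5×5, matching the layer-of-pixels behavior described above. (Strictly, dilation uses the 180°-rotated element, but a symmetric square is its own rotation.)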
Results of dilation and erosion are influenced both by the size and the shape
of the structuring element. Dilation and erosion are dual operations in that
they have opposite effects. Let f^c denote the complement of an image f, i.e.,
the image produced by replacing 1 with 0 and vice versa. Formally, the
duality is written as

(f ⊖ s)^c = f^c ⊕ srot

where srot is the structuring element s rotated by 180°. If a structuring
element is symmetrical with respect to rotation, then srot does not differ
from s. If a binary image is considered to be a collection of connected
regions of pixels set to 1 on a background of pixels set to 0, then erosion
is the fitting of a structuring element to these regions and dilation is the
fitting of a structuring element (rotated if necessary) into the background,
followed by inversion of the result.
CHAPTER 5
SIMULATED RESULTS
5.1 MRI Brain Image with Benign Case
(1)Input Image with its curvelet Decomposition
IMAGE SEGMENTATION
Fig. 5.1.1 MRI Brain Image with Benign Case
5.2 MRI Brain with Malignant Case
Fig 5.2.1 MRI Brain with Malignant Case
IMAGE SEGMENTATION
Fig 5.2.2 MRI Brain with Malignant Case
5.3 MRI Brain with Normal Case
Fig 5.3.1 MRI Brain with Normal Case
IMAGE SEGMENTATION
Fig 5.3.2 MRI Brain with Normal Case
CONCLUSION
This project presented automated brain image classification for early-stage
abnormality detection using a neural network classifier, and tumor spotting
was performed with image segmentation. Pattern recognition was carried out
using a probabilistic neural network with radial basis functions, and
patterns were characterized with the help of the fast discrete curvelet
transform and Haralick feature analysis. The spatial fuzzy clustering
algorithm was utilized effectively for accurate tumor detection and for
measuring the area of the abnormal region. Experiments showed that the system
provides good classification accuracy across various stages of test samples
while consuming less processing time.
REFERENCES
[1] N. Kwak, and C. H. Choi, “Input Feature Selection for Classification
Problems”, IEEE Transactions on Neural Networks, 13(1), 143–159, 2002.
[2] E. D. Ubeyli and I. Guler, “Feature Extraction from Doppler Ultrasound
Signals for Automated Diagnostic Systems”, Computers in Biology and
Medicine, 35(9), 735–764, 2005.
[3] D.F. Specht, “Probabilistic Neural Networks for Classification, mapping, or
associative memory”, Proceedings of IEEE International Conference on
Neural Networks, Vol.1, IEEE Press, New York, pp. 525-532, June 1988.
[4] D.F. Specht, “Probabilistic Neural Networks”, Neural Networks, vol. 3,
no. 1, pp. 109-118, 1990.
[5] Georgiadis P. et al., “Improving brain tumor characterization on MRI by
probabilistic neural networks and non-linear transformation of textural
features”, Computer Methods and Programs in Biomedicine, vol. 89, pp. 24-32,
2008.
[6] Klaus M., “Automated segmentation of MRI brain tumors”, Journal of
Radiology, vol. 218, pp. 585-591, 2001.
[7] Kernel, P., Bela, M., Rainer, S., Zalan, D., Zsolt, T. and Janos, F.,
“Application of neural networks in medicine”, Diag. Med. Tech., vol. 4,
issue 3, pp. 538-54, 1998.
[8] Mammadagha Mammadov, Engin Tas, “An improved version of the
backpropagation algorithm with effective dynamic learning rate and
momentum”, Inter. Conference on Applied Mathematics, pp. 356-361, 2006.
[9] Messen W., Wehrens R., Buydens L., “Supervised Kohonen networks for
classification problems”, Chemometrics and Intelligent Laboratory Systems,
vol. 83, pp. 99-113, 2006.
 
Object Detection & Tracking
Object Detection & TrackingObject Detection & Tracking
Object Detection & Tracking
 

Andere mochten auch

Brain tumor mri image segmentation and detection
Brain tumor mri image segmentation and detectionBrain tumor mri image segmentation and detection
Brain tumor mri image segmentation and detectioneSAT Publishing House
 
Brain Tumours
Brain TumoursBrain Tumours
Brain Tumoursdocs
 
Brain Tumor Detection Using Artificial Neural Network Fuzzy Inference System ...
Brain Tumor Detection Using Artificial Neural Network Fuzzy Inference System ...Brain Tumor Detection Using Artificial Neural Network Fuzzy Inference System ...
Brain Tumor Detection Using Artificial Neural Network Fuzzy Inference System ...Editor IJCATR
 
An overview of automatic brain tumor detection frommagnetic resonance images
An overview of automatic brain tumor detection frommagnetic resonance imagesAn overview of automatic brain tumor detection frommagnetic resonance images
An overview of automatic brain tumor detection frommagnetic resonance imagesMangesh Lingampalle
 
Segmentation of Tumor Region in MRI Images of Brain using Mathematical Morpho...
Segmentation of Tumor Region in MRI Images of Brain using Mathematical Morpho...Segmentation of Tumor Region in MRI Images of Brain using Mathematical Morpho...
Segmentation of Tumor Region in MRI Images of Brain using Mathematical Morpho...CSCJournals
 
Dilation and erosion
Dilation and erosionDilation and erosion
Dilation and erosionAswin Pv
 
COM2304: Morphological Image Processing
COM2304: Morphological Image ProcessingCOM2304: Morphological Image Processing
COM2304: Morphological Image ProcessingHemantha Kulathilake
 
Morphological image processing
Morphological image processingMorphological image processing
Morphological image processingVinayak Narayanan
 
morphological image processing
morphological image processingmorphological image processing
morphological image processingJohn Williams
 
Morphological image processing
Morphological image processingMorphological image processing
Morphological image processingRaghu Kumar
 
Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Gaurav Mittal
 
Final Project Report on Image processing based intelligent traffic control sy...
Final Project Report on Image processing based intelligent traffic control sy...Final Project Report on Image processing based intelligent traffic control sy...
Final Project Report on Image processing based intelligent traffic control sy...Louise Antonio
 
Brain Tumor And Its Types
Brain Tumor And Its TypesBrain Tumor And Its Types
Brain Tumor And Its TypesManish Vaish
 
Multiscale Anomaly Detection and Image Registration Algorithms for Airborne L...
Multiscale Anomaly Detection and Image Registration Algorithms for Airborne L...Multiscale Anomaly Detection and Image Registration Algorithms for Airborne L...
Multiscale Anomaly Detection and Image Registration Algorithms for Airborne L...Jeff Barnes
 
Registration & Modeling of Shapes with Uncertainties
Registration & Modeling of Shapes with UncertaintiesRegistration & Modeling of Shapes with Uncertainties
Registration & Modeling of Shapes with UncertaintiesMaximeGT
 

Andere mochten auch (20)

Brain tumor mri image segmentation and detection
Brain tumor mri image segmentation and detectionBrain tumor mri image segmentation and detection
Brain tumor mri image segmentation and detection
 
Brain Tumours
Brain TumoursBrain Tumours
Brain Tumours
 
Brain Tumor Detection Using Artificial Neural Network Fuzzy Inference System ...
Brain Tumor Detection Using Artificial Neural Network Fuzzy Inference System ...Brain Tumor Detection Using Artificial Neural Network Fuzzy Inference System ...
Brain Tumor Detection Using Artificial Neural Network Fuzzy Inference System ...
 
An overview of automatic brain tumor detection frommagnetic resonance images
An overview of automatic brain tumor detection frommagnetic resonance imagesAn overview of automatic brain tumor detection frommagnetic resonance images
An overview of automatic brain tumor detection frommagnetic resonance images
 
Segmentation of Tumor Region in MRI Images of Brain using Mathematical Morpho...
Segmentation of Tumor Region in MRI Images of Brain using Mathematical Morpho...Segmentation of Tumor Region in MRI Images of Brain using Mathematical Morpho...
Segmentation of Tumor Region in MRI Images of Brain using Mathematical Morpho...
 
Dilation and erosion
Dilation and erosionDilation and erosion
Dilation and erosion
 
COM2304: Morphological Image Processing
COM2304: Morphological Image ProcessingCOM2304: Morphological Image Processing
COM2304: Morphological Image Processing
 
Morphological image processing
Morphological image processingMorphological image processing
Morphological image processing
 
morphological image processing
morphological image processingmorphological image processing
morphological image processing
 
Morphological image processing
Morphological image processingMorphological image processing
Morphological image processing
 
Deep Learning for Computer Vision: Medical Imaging (UPC 2016)
Deep Learning for Computer Vision: Medical Imaging (UPC 2016)Deep Learning for Computer Vision: Medical Imaging (UPC 2016)
Deep Learning for Computer Vision: Medical Imaging (UPC 2016)
 
Anna University UG Project Report Format
Anna University UG Project Report FormatAnna University UG Project Report Format
Anna University UG Project Report Format
 
Basic approach to brain tumor
Basic approach to brain tumorBasic approach to brain tumor
Basic approach to brain tumor
 
Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)
 
CNS Tumors
CNS TumorsCNS Tumors
CNS Tumors
 
Final Project Report on Image processing based intelligent traffic control sy...
Final Project Report on Image processing based intelligent traffic control sy...Final Project Report on Image processing based intelligent traffic control sy...
Final Project Report on Image processing based intelligent traffic control sy...
 
Brain Tumor And Its Types
Brain Tumor And Its TypesBrain Tumor And Its Types
Brain Tumor And Its Types
 
BRAIN CANCER CLASSIFICATION USING BACK PROPAGATION NEURAL NETWORK AND PRINCIP...
BRAIN CANCER CLASSIFICATION USING BACK PROPAGATION NEURAL NETWORK AND PRINCIP...BRAIN CANCER CLASSIFICATION USING BACK PROPAGATION NEURAL NETWORK AND PRINCIP...
BRAIN CANCER CLASSIFICATION USING BACK PROPAGATION NEURAL NETWORK AND PRINCIP...
 
Multiscale Anomaly Detection and Image Registration Algorithms for Airborne L...
Multiscale Anomaly Detection and Image Registration Algorithms for Airborne L...Multiscale Anomaly Detection and Image Registration Algorithms for Airborne L...
Multiscale Anomaly Detection and Image Registration Algorithms for Airborne L...
 
Registration & Modeling of Shapes with Uncertainties
Registration & Modeling of Shapes with UncertaintiesRegistration & Modeling of Shapes with Uncertainties
Registration & Modeling of Shapes with Uncertainties
 

Ähnlich wie Report (1)

DIRECTIONAL CLASSIFICATION OF BRAIN TUMOR IMAGES FROM MRI USING CNN-BASED DEE...
DIRECTIONAL CLASSIFICATION OF BRAIN TUMOR IMAGES FROM MRI USING CNN-BASED DEE...DIRECTIONAL CLASSIFICATION OF BRAIN TUMOR IMAGES FROM MRI USING CNN-BASED DEE...
DIRECTIONAL CLASSIFICATION OF BRAIN TUMOR IMAGES FROM MRI USING CNN-BASED DEE...IRJET Journal
 
IRJET- Brain Tumor Detection and Classification with Feed Forward Back Propag...
IRJET- Brain Tumor Detection and Classification with Feed Forward Back Propag...IRJET- Brain Tumor Detection and Classification with Feed Forward Back Propag...
IRJET- Brain Tumor Detection and Classification with Feed Forward Back Propag...IRJET Journal
 
BRAIN TUMOR DETECTION
BRAIN TUMOR DETECTIONBRAIN TUMOR DETECTION
BRAIN TUMOR DETECTIONIRJET Journal
 
IRJET - Detection of Heamorrhage in Brain using Deep Learning
IRJET - Detection of Heamorrhage in Brain using Deep LearningIRJET - Detection of Heamorrhage in Brain using Deep Learning
IRJET - Detection of Heamorrhage in Brain using Deep LearningIRJET Journal
 
Survey of various methods used for integrating machine learning into brain tu...
Survey of various methods used for integrating machine learning into brain tu...Survey of various methods used for integrating machine learning into brain tu...
Survey of various methods used for integrating machine learning into brain tu...Drjabez
 
A Review on Brain Disorder Segmentation in MR Images
A Review on Brain Disorder Segmentation in MR ImagesA Review on Brain Disorder Segmentation in MR Images
A Review on Brain Disorder Segmentation in MR ImagesIJMER
 
BRAINREGION.pptx
BRAINREGION.pptxBRAINREGION.pptx
BRAINREGION.pptxVISHALAS9
 
Performance Analysis of SVM Classifier for Classification of MRI Image
Performance Analysis of SVM Classifier for Classification of MRI ImagePerformance Analysis of SVM Classifier for Classification of MRI Image
Performance Analysis of SVM Classifier for Classification of MRI ImageIRJET Journal
 
3D Segmentation of Brain Tumor Imaging
3D Segmentation of Brain Tumor Imaging3D Segmentation of Brain Tumor Imaging
3D Segmentation of Brain Tumor ImagingIJAEMSJORNAL
 
IRJET- A Novel Segmentation Technique for MRI Brain Tumor Images
IRJET- A Novel Segmentation Technique for MRI Brain Tumor ImagesIRJET- A Novel Segmentation Technique for MRI Brain Tumor Images
IRJET- A Novel Segmentation Technique for MRI Brain Tumor ImagesIRJET Journal
 
IRJET- Diversified Segmentation and Classification Techniques on Brain Tu...
IRJET-  	  Diversified Segmentation and Classification Techniques on Brain Tu...IRJET-  	  Diversified Segmentation and Classification Techniques on Brain Tu...
IRJET- Diversified Segmentation and Classification Techniques on Brain Tu...IRJET Journal
 
IRJET- Brain Tumor Detection using Convolutional Neural Network
IRJET- Brain Tumor Detection using Convolutional Neural NetworkIRJET- Brain Tumor Detection using Convolutional Neural Network
IRJET- Brain Tumor Detection using Convolutional Neural NetworkIRJET Journal
 
An Investigation into Brain Tumor Segmentation Techniques
An Investigation into Brain Tumor Segmentation Techniques An Investigation into Brain Tumor Segmentation Techniques
An Investigation into Brain Tumor Segmentation Techniques IIRindia
 
An Ameliorate Technique for Brain Lumps Detection Using Fuzzy C-Means Clustering
An Ameliorate Technique for Brain Lumps Detection Using Fuzzy C-Means ClusteringAn Ameliorate Technique for Brain Lumps Detection Using Fuzzy C-Means Clustering
An Ameliorate Technique for Brain Lumps Detection Using Fuzzy C-Means ClusteringIRJET Journal
 
A deep learning approach for brain tumor detection using magnetic resonance ...
A deep learning approach for brain tumor detection using  magnetic resonance ...A deep learning approach for brain tumor detection using  magnetic resonance ...
A deep learning approach for brain tumor detection using magnetic resonance ...IJECEIAES
 
Brain Tumor Detection Using Deep Learning
Brain Tumor Detection Using Deep LearningBrain Tumor Detection Using Deep Learning
Brain Tumor Detection Using Deep LearningIRJET Journal
 
IMAGE SEGMENTATION USING FCM ALGORITM | J4RV3I12021
IMAGE SEGMENTATION USING FCM ALGORITM | J4RV3I12021IMAGE SEGMENTATION USING FCM ALGORITM | J4RV3I12021
IMAGE SEGMENTATION USING FCM ALGORITM | J4RV3I12021Journal For Research
 
78201916
7820191678201916
78201916IJRAT
 
Brain Image Fusion using DWT and Laplacian Pyramid Approach and Tumor Detecti...
Brain Image Fusion using DWT and Laplacian Pyramid Approach and Tumor Detecti...Brain Image Fusion using DWT and Laplacian Pyramid Approach and Tumor Detecti...
Brain Image Fusion using DWT and Laplacian Pyramid Approach and Tumor Detecti...INFOGAIN PUBLICATION
 

Ähnlich wie Report (1) (20)

DIRECTIONAL CLASSIFICATION OF BRAIN TUMOR IMAGES FROM MRI USING CNN-BASED DEE...
DIRECTIONAL CLASSIFICATION OF BRAIN TUMOR IMAGES FROM MRI USING CNN-BASED DEE...DIRECTIONAL CLASSIFICATION OF BRAIN TUMOR IMAGES FROM MRI USING CNN-BASED DEE...
DIRECTIONAL CLASSIFICATION OF BRAIN TUMOR IMAGES FROM MRI USING CNN-BASED DEE...
 
IRJET- Brain Tumor Detection and Classification with Feed Forward Back Propag...
IRJET- Brain Tumor Detection and Classification with Feed Forward Back Propag...IRJET- Brain Tumor Detection and Classification with Feed Forward Back Propag...
IRJET- Brain Tumor Detection and Classification with Feed Forward Back Propag...
 
BRAIN TUMOR DETECTION
BRAIN TUMOR DETECTIONBRAIN TUMOR DETECTION
BRAIN TUMOR DETECTION
 
IRJET - Detection of Heamorrhage in Brain using Deep Learning
IRJET - Detection of Heamorrhage in Brain using Deep LearningIRJET - Detection of Heamorrhage in Brain using Deep Learning
IRJET - Detection of Heamorrhage in Brain using Deep Learning
 
Survey of various methods used for integrating machine learning into brain tu...
Survey of various methods used for integrating machine learning into brain tu...Survey of various methods used for integrating machine learning into brain tu...
Survey of various methods used for integrating machine learning into brain tu...
 
A Review on Brain Disorder Segmentation in MR Images
A Review on Brain Disorder Segmentation in MR ImagesA Review on Brain Disorder Segmentation in MR Images
A Review on Brain Disorder Segmentation in MR Images
 
BRAINREGION.pptx
BRAINREGION.pptxBRAINREGION.pptx
BRAINREGION.pptx
 
Performance Analysis of SVM Classifier for Classification of MRI Image
Performance Analysis of SVM Classifier for Classification of MRI ImagePerformance Analysis of SVM Classifier for Classification of MRI Image
Performance Analysis of SVM Classifier for Classification of MRI Image
 
3D Segmentation of Brain Tumor Imaging
3D Segmentation of Brain Tumor Imaging3D Segmentation of Brain Tumor Imaging
3D Segmentation of Brain Tumor Imaging
 
IRJET- A Novel Segmentation Technique for MRI Brain Tumor Images
IRJET- A Novel Segmentation Technique for MRI Brain Tumor ImagesIRJET- A Novel Segmentation Technique for MRI Brain Tumor Images
IRJET- A Novel Segmentation Technique for MRI Brain Tumor Images
 
IRJET- Diversified Segmentation and Classification Techniques on Brain Tu...
IRJET-  	  Diversified Segmentation and Classification Techniques on Brain Tu...IRJET-  	  Diversified Segmentation and Classification Techniques on Brain Tu...
IRJET- Diversified Segmentation and Classification Techniques on Brain Tu...
 
IRJET- Brain Tumor Detection using Convolutional Neural Network
IRJET- Brain Tumor Detection using Convolutional Neural NetworkIRJET- Brain Tumor Detection using Convolutional Neural Network
IRJET- Brain Tumor Detection using Convolutional Neural Network
 
An Investigation into Brain Tumor Segmentation Techniques
An Investigation into Brain Tumor Segmentation Techniques An Investigation into Brain Tumor Segmentation Techniques
An Investigation into Brain Tumor Segmentation Techniques
 
An Ameliorate Technique for Brain Lumps Detection Using Fuzzy C-Means Clustering
An Ameliorate Technique for Brain Lumps Detection Using Fuzzy C-Means ClusteringAn Ameliorate Technique for Brain Lumps Detection Using Fuzzy C-Means Clustering
An Ameliorate Technique for Brain Lumps Detection Using Fuzzy C-Means Clustering
 
A deep learning approach for brain tumor detection using magnetic resonance ...
A deep learning approach for brain tumor detection using  magnetic resonance ...A deep learning approach for brain tumor detection using  magnetic resonance ...
A deep learning approach for brain tumor detection using magnetic resonance ...
 
MINI PROJECT (1).pptx
MINI PROJECT (1).pptxMINI PROJECT (1).pptx
MINI PROJECT (1).pptx
 
Brain Tumor Detection Using Deep Learning
Brain Tumor Detection Using Deep LearningBrain Tumor Detection Using Deep Learning
Brain Tumor Detection Using Deep Learning
 
IMAGE SEGMENTATION USING FCM ALGORITM | J4RV3I12021
IMAGE SEGMENTATION USING FCM ALGORITM | J4RV3I12021IMAGE SEGMENTATION USING FCM ALGORITM | J4RV3I12021
IMAGE SEGMENTATION USING FCM ALGORITM | J4RV3I12021
 
78201916
7820191678201916
78201916
 
Brain Image Fusion using DWT and Laplacian Pyramid Approach and Tumor Detecti...
Brain Image Fusion using DWT and Laplacian Pyramid Approach and Tumor Detecti...Brain Image Fusion using DWT and Laplacian Pyramid Approach and Tumor Detecti...
Brain Image Fusion using DWT and Laplacian Pyramid Approach and Tumor Detecti...
 

Report (1)

  • 1. BRAIN IMAGE CLASSIFICATION USING MACHINE LEARNING
  A PROJECT REPORT
  Submitted by
  R.VIGNESH KUMAR - 20110106070
  P.YUVARAJ - 20110106074
  D.ARUN KUMAR - 20110106301
  in partial fulfillment for the award of the degree of
  BACHELOR OF ENGINEERING
  in
  ELECTRONICS AND COMMUNICATION ENGINEERING
  ARIGNAR ANNA INSTITUTE OF SCIENCE AND TECHNOLOGY
  SRIPERUMBUDUR - 602 117
  ANNA UNIVERSITY: CHENNAI 600 025
  APRIL 2014
  • 2. BRAIN IMAGE CLASSIFICATION USING MACHINE LEARNING
  A PROJECT REPORT
  Submitted by
  R.VIGNESH KUMAR 20110106070
  P.YUVARAJ 20110106074
  D.ARUN KUMAR 20110106301
  in partial fulfillment for the award of the degree of
  BACHELOR OF ENGINEERING
  in
  ELECTRONICS AND COMMUNICATION ENGINEERING
  ARIGNAR ANNA INSTITUTE OF SCIENCE AND TECHNOLOGY
  SRIPERUMBUDUR - 602 117
  ANNA UNIVERSITY: CHENNAI 600 025
  APRIL 2014
  • 3. ANNA UNIVERSITY: CHENNAI 600 025
  BONAFIDE CERTIFICATE
  Certified that this project report “BRAIN IMAGE CLASSIFICATION USING MACHINE LEARNING” is the bonafide work of “R.VIGNESH KUMAR (20110106070), P.YUVARAJ (20110106074), D.ARUN KUMAR (20110106301)”, who carried out the project work under my supervision.
  SIGNATURE SIGNATURE
  Mr. S.KARTHICK BABU M.E., MBA Ms. P. JAYASAKTHI M.E
  HEAD OF THE DEPARTMENT ASSISTANT PROFESSOR
  DEPARTMENT OF ECE DEPARTMENT OF ECE
  Arignar Anna Institute of Science Arignar Anna Institute of Science
  & Technology, Chennai - 602 117. & Technology, Chennai - 602 117.
  Submitted to Project and Viva Examination held on …………………………………..
  INTERNAL EXAMINER EXTERNAL EXAMINER
  • 4. ACKNOWLEDGEMENT
  We would like to express our deepest and heartiest thanks to our beloved chairman, Dr. VIRGAI.G.JAYARAMAN B.A, and vice chairman, Mr. J.KUMARAN M.E., MBA, for giving us the opportunity to take up and complete this project.
  We thank our Principal, Dr. BHAGWAN SHREERAM M.E., Ph.D(Engg)., MBA., Ph.D(MBA), and our Vice Principal, Mr. M.SUBI STALIN M.E.,(Ph.D), for creating a beautiful atmosphere, which inspired us to take up this project.
  We take this opportunity to convey our heartiest thanks to our Head of the Department, Mr. S.KARTHICK BABU M.E., MBA, and our project coordinator, Ms. P.MOHANAPRIYA M.E, Assistant Professor, for their valuable support, unflagging attention and direction, which kept this project on track.
  We are extremely thankful to our project supervisors, Mr. M.SIVAKUMAR M.E. and Mr. R.DHINESHRAM M.E., Assistant Professors of Electronics and Communication Engineering, for their much-needed guidance and especially for entrusting us with our project.
  We would also like to express our heartiest thanks to the lab assistants, to all other faculty and non-teaching members of our department for their support, and to our peers for having stood by us and helped us complete this project.
  • 5. ABSTRACT
  The Automatic Support Intelligent System detects brain tumors through a combination of a neural network and a fuzzy logic system, aiding in the diagnosis and treatment of brain tumors. Detecting a brain tumor is a challenging problem due to the structure of the tumor cells in the brain. This project presents an analytical method that enhances the detection of brain tumor cells in their early stages and analyzes anatomical structures by training and classifying samples in a neural network system, with tumor cell segmentation of the samples performed by a fuzzy clustering algorithm. An artificial neural network is trained to classify the stage of a brain tumor as benign, malignant, or normal. The fast discrete curvelet transform is used to analyze image texture. In brain structure analysis, the white matter (WM) and gray matter (GM) tissues are extracted. A Probabilistic Neural Network with a radial basis function (PNN-RBF) is employed to implement automated brain tumor classification. Decision making is performed in two stages: feature extraction using the gray-level co-occurrence matrix (GLCM) and classification using the PNN-RBF network. Segmentation is performed by the fuzzy logic system, and its result serves as a basis for early detection of brain tumors, which improves the patient's chances of survival. The performance of this automated intelligent system is evaluated in terms of training performance and classification accuracy. The simulated results show that the classifier and segmentation algorithm provide better accuracy than previous methodologies.
  • 7. CHAPTER 1
  INTRODUCTION
  Automated classification and detection of tumors in different medical images is motivated by the need for high accuracy when dealing with a human life. Computer assistance is also in demand in medical institutions because it can improve on human results in a domain where false negatives must be kept at a very low rate. It has been proven that double reading of medical images can lead to better tumor detection, but the cost of double reading is very high, which is why good software to assist humans in medical institutions is of great interest nowadays. Conventional methods of monitoring and diagnosing diseases rely on a human observer detecting the presence of particular features. Due to the large number of patients in intensive care units and the need for continuous observation of such conditions, several techniques for automated diagnostic systems have been developed in recent years to address this problem. Such techniques work by transforming mostly qualitative diagnostic criteria into a more objective, quantitative feature-classification problem.
  This project proposes automated classification of brain magnetic resonance images using prior knowledge such as pixel intensity and some anatomical features. Currently no methods are widely accepted, so automatic and reliable methods for tumor detection are of great need and interest. The application of PNNs to the classification of MR image data is not yet fully utilized. This includes clustering and classification techniques, especially for MR imaging problems with huge amounts of data that consume time and energy if done
  • 8. manually. Thus, a full understanding of recognition, classification, and clustering techniques is essential to the development of neural network systems, particularly for problems in medicine.
  Segmentation of brain tissue into gray matter, white matter, and tumor on medical images is not only of high interest for serial monitoring of "disease burden" in oncologic imaging, but is also gaining popularity with the advance of image-guided surgical approaches. Outlining the brain tumor contour is a major step in planning spatially localized radiotherapy (e.g., CyberKnife, IMRT), which is usually done manually on contrast-enhanced T1-weighted magnetic resonance images (MRI) in current clinical practice. On T1 MR images acquired after administration of a contrast agent (gadolinium), blood vessels and the parts of the tumor where the contrast passes the blood-brain barrier appear as hyperintense areas. There are various attempts at brain tumor segmentation in the literature which use a single modality, combine multiple modalities, or use priors obtained from population atlases.
  1.1 IMAGE SEGMENTATION
  1.1.1 OVERVIEW
  Segmentation problems are the bottleneck to achieving object extraction, object-specific measurements, and fast object rendering from multi-dimensional image data. Simple segmentation techniques are based on local pixel-neighborhood classification. Such methods, however, capture local appearance rather than global objects and often require intensive operator assistance. The reason is that the "logic" of an object does not necessarily follow that of its local image
  • 9. representation. Local properties such as texture, edginess, and ridgeness do not always represent connected features of a given object.
  1.2 REGION GROWING APPROACH
  The region growing technique segments image pixels that belong to an object into regions. Segmentation is performed based on some predefined criteria: two pixels can be grouped together if they have the same intensity characteristics or if they are close to each other. It is assumed that pixels that are close to each other and have similar intensity values are likely to belong to the same object. The simplest form of segmentation can be achieved through thresholding and component labeling. Another method is to find region boundaries using edge detection; the segmentation process then uses the region boundary information to extract the regions. The main disadvantage of the region growing approach is that it often requires a seed point as the starting point of the segmentation process, which requires user interaction. Due to variations in image intensities and noise, region growing can result in holes and over-segmentation, so it sometimes requires post-processing of the segmentation result.
  1.3 CLUSTERING
  Clustering can be considered the most important unsupervised learning problem; it deals with finding a structure in a collection of unlabeled data. A cluster is therefore a collection of objects which are "similar" to each other and "dissimilar" to the objects belonging to other clusters. Clustering algorithms may be classified as listed below
  • 10. (1) Exclusive Clustering
  (2) Overlapping Clustering
  (3) Hierarchical Clustering
  (4) Probabilistic Clustering
  In the first case, data are grouped in an exclusive way, so that if a certain datum belongs to a definite cluster it cannot be included in another cluster. The second type, overlapping clustering, on the contrary uses fuzzy sets to cluster data, so that each point may belong to two or more clusters with different degrees of membership; in this case, each datum is associated with an appropriate membership value. A hierarchical clustering algorithm is based on the union of the two nearest clusters. The starting condition is realized by setting every datum as a cluster; after a few iterations the algorithm reaches the desired final clusters.
  1.4 K-MEANS SEGMENTATION
  K-means is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. The procedure follows a simple and easy way to classify a given data set through a certain number of clusters (assume k clusters) fixed a priori. The main idea is to define k centroids, one for each cluster. These centroids should be placed carefully, because different locations cause different results, so the better choice is to place them as far away from each other as possible. The next step is to take each point belonging to the data set and associate it with the nearest centroid. When no point is pending, the first step is completed and an early grouping is done. At this point we need to re-calculate k new centroids as the centers of the clusters resulting from the previous step. After we have these k new centroids, a new binding has to be done between the same data
  • 11. set points and the nearest new centroids. A loop has been generated, and as a result of this loop we may notice that the k centroids change their location step by step until no more changes occur; in other words, the centroids do not move any more. Finally, this algorithm aims at minimizing an objective function, in this case a squared-error function.
  1.5 HIERARCHICAL SEGMENTATION
  A hierarchical set of image segmentations is a set of several segmentations of the same image at different levels of detail, in which the segmentations at coarser levels of detail can be produced from simple merges of regions at finer levels of detail. A unique feature of hierarchical segmentation is that the segment or region boundaries are maintained at the full image spatial resolution for all segmentations. In a hierarchical segmentation, an object of interest may be represented by multiple image segments at finer levels of detail in the segmentation hierarchy, and may be merged into a surrounding region at coarser levels of detail. If the segmentation hierarchy has sufficient resolution, the object of interest will be represented as a single region segment at some intermediate level of segmentation detail. A goal of the subject analysis of the segmentation hierarchy is to identify the hierarchical level at which the object of interest is represented by a single region segment. The object may then be identified through its spectral and spatial characteristics. Additional clues for object identification may be obtained from the behavior of the image segmentations at the hierarchical levels above and below the level at which the object of interest is represented by a single region.
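The K-means procedure described above (spread k centroids across the data, assign each point to its nearest centroid, recompute the centroids, and loop until they stop moving) can be sketched on raw pixel intensities. This is a generic NumPy illustration, not the project's code; the function name and the linearly spaced initialisation are choices made here for the example.

```python
import numpy as np

def kmeans_intensity(image, k=3, n_iter=50):
    """Segment an image by clustering pixel intensities with K-means
    (Lloyd's algorithm). Returns a label map and the final centroids."""
    pixels = image.reshape(-1).astype(float)
    # Place the initial centroids "as far away from each other as possible"
    # by spreading them across the intensity range.
    centroids = np.linspace(pixels.min(), pixels.max(), k)
    for _ in range(n_iter):
        # Associate every pixel with its nearest centroid.
        labels = np.argmin(np.abs(pixels[:, None] - centroids[None, :]), axis=1)
        # Re-calculate each centroid as the mean of its cluster
        # (an empty cluster keeps its old centroid).
        new = np.array([pixels[labels == j].mean() if (labels == j).any()
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):  # centroids no longer move
            break
        centroids = new
    return labels.reshape(image.shape), centroids
```

Minimizing the squared distance of each pixel to its assigned centroid is exactly the squared-error objective mentioned above; the loop terminates when the centroids stop moving.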
  • 12. 1.6 THRESHOLDING
  The simplest method of image segmentation is the thresholding method. This method is based on a clip level (or threshold value) used to turn a gray-scale image into a binary image. The key to this method is selecting the threshold value (or values, when multiple levels are selected). Several popular methods are used in industry, including the maximum entropy method, Otsu's method (maximum variance), and k-means clustering. Recently, methods have been developed for thresholding computed tomography (CT) images; the key idea is that, unlike Otsu's method, the thresholds are derived from the radiographs instead of the (reconstructed) image.
  1.7 DESIGN STEPS
  (1) Set the initial threshold T = (maximum image brightness + minimum image brightness) / 2.
  (2) Using T, segment the image into two sets of pixels: B (all pixels with values less than T) and N (all pixels with values greater than T).
  (3) Calculate the average values of B and N separately, giving means ub and un.
  (4) Calculate the new threshold: T = (ub + un) / 2.
  (5) Repeat steps (2) to (4) until the iterative condition is met, then extract the required region from the brain image.
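Design steps (1) to (5) above translate directly into code. A minimal NumPy sketch follows; the stopping tolerance is an assumption made for this example, since the report does not state the iterative condition numerically, and pixels equal to T are placed in N so that every pixel is counted.

```python
import numpy as np

def iterative_threshold(image, tol=0.5):
    """Iterative threshold selection following design steps (1)-(5)."""
    img = np.asarray(image, dtype=float)
    T = (img.max() + img.min()) / 2.0       # (1) initial threshold
    while True:
        B = img[img < T]                    # (2) pixels below T
        N = img[img >= T]                   #     pixels at or above T
        ub = B.mean() if B.size else T      # (3) class means
        un = N.mean() if N.size else T
        T_new = (ub + un) / 2.0             # (4) new threshold
        if abs(T_new - T) < tol:            # (5) iterative condition met
            return T_new
        T = T_new
```

The returned T can then be applied as `image >= T` to extract the required region as a binary mask.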
  • 13. 13 1.8 SYSTEM ANALYSIS
1.8.1 EXISTING TECHNIQUES
(1) PCA and wavelet-based texture characterization
(2) K-means clustering and thresholding method
(3) Manual analysis: time consuming, inaccurate, and requiring an intensively trained person to avoid diagnostic errors.

1.8.2 PRINCIPAL COMPONENT ANALYSIS
PCA is a mathematical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. The number of principal components is less than or equal to the number of original variables. The transformation is defined in such a way that the first principal component has the largest possible variance (that is, accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to (i.e., uncorrelated with) the preceding components. Principal components are guaranteed to be independent only if the data set is jointly normally distributed. PCA is sensitive to the relative scaling of the original variables. Depending on the field of application, it is also named the discrete Karhunen–Loève transform (KLT), the Hotelling transform, or proper orthogonal decomposition (POD).
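The orthogonal transformation above can be sketched as follows (a minimal sketch via eigendecomposition of the covariance matrix; the function name and NumPy usage are illustrative, not from the report):

```python
import numpy as np

def pca(X, n_components):
    """PCA sketch: center the data, eigendecompose the covariance
    matrix, and project onto the eigenvectors with the largest
    eigenvalues (the principal components, ordered by variance)."""
    Xc = X - X.mean(axis=0)                # center each variable
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)       # ascending eigenvalues
    order = np.argsort(vals)[::-1]         # descending by variance
    W = vecs[:, order[:n_components]]
    return Xc @ W                          # projected observations
```

Because the components are ordered by variance, keeping the first few preserves as much variability as possible.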
  • 14. 14 1.8.3 DRAWBACKS
(1) Poor discriminatory power
(2) High computational load

1.9 DISCRETE WAVELET TRANSFORM
In mathematics, a wavelet series is a representation of a square-integrable (real- or complex-valued) function by a certain orthonormal series generated by a wavelet. This section provides a formal, mathematical definition of an orthonormal wavelet and of the integral wavelet transform.

1.9.1 DEFINITION
A function ψ is called an orthonormal wavelet if it can be used to define a Hilbert basis, that is, a complete orthonormal system, for the Hilbert space L²(R) of square-integrable functions. The Hilbert basis is constructed as the family of functions

ψ_{j,k}(x) = 2^{j/2} ψ(2^j x − k)

for integers j, k. This family is an orthonormal system if it is orthonormal under the inner product,

⟨ψ_{j,k}, ψ_{l,m}⟩ = δ_{jl} δ_{km},

where δ_{jl} is the Kronecker delta and ⟨f, g⟩ is the standard inner product on L²(R). The requirement of completeness is that every function f in L²(R) may be expanded in the basis as

f(x) = Σ_{j,k} c_{j,k} ψ_{j,k}(x),
  • 15. 15 with convergence of the series understood to be convergence in norm. Such a representation of a function f is known as a wavelet series. This implies that an orthonormal wavelet is self-dual.

1.9.2 DRAWBACKS
(1) Loss of edge details due to the shift-variant property.

1.10 KNN CLASSIFIER
In pattern recognition, the k-nearest neighbor algorithm (k-NN) is a method for classifying objects based on the closest training examples in the feature space. k-NN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is deferred until classification. The k-nearest neighbor algorithm is amongst the simplest of all machine learning algorithms: an object is classified by a majority vote of its neighbors, with the object being assigned to the class most common amongst its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the object is simply assigned to the class of its nearest neighbor. The same method can be used for regression, by simply assigning the property value for the object to be the average of the values of its k nearest neighbors. It can be useful to weight the contributions of the neighbors, so that the nearer neighbors contribute more to the average than the more distant ones. (A common weighting scheme is to give each neighbor a weight of 1/d, where d is the distance to the neighbor. This scheme is a generalization of linear interpolation.)
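The weighted majority vote with the 1/d scheme described above can be sketched as follows (a minimal one-dimensional sketch; the small epsilon guarding division by zero is an assumption):

```python
def knn_predict(train, query, k=3):
    """k-NN sketch with 1/d weighting: each of the k nearest
    neighbours votes for its class with weight 1/d, so nearer
    neighbours contribute more to the decision."""
    neighbours = sorted(train, key=lambda t: abs(t[0] - query))[:k]
    votes = {}
    for x, label in neighbours:
        d = abs(x - query) or 1e-12         # avoid division by zero
        votes[label] = votes.get(label, 0.0) + 1.0 / d
    return max(votes, key=votes.get)        # class with largest vote
```

With k = 1 this reduces to assigning the class of the single nearest neighbour, as stated above.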
  • 16. 16 1.10.1 DRAWBACKS
(1) Less accuracy in classification
(2) Magnetic resonance images contain noise caused by operator performance, which can lead to serious inaccuracies in classification.

1.10.2 PROPOSED METHOD
MRI brain image classification and anatomical structure analysis are proposed based on:
(1) PNN-RBF network for classification
(2) Fuzzy clustering for tumor detection and structural analysis

1.10.3 METHODOLOGIES
(1) Fast discrete curvelet decomposition
(2) GLCM feature extraction
(3) PNN-RBF training and classification
(4) Brain structural analysis

1.10.4 ADVANTAGES
(1) It can segment the brain regions from the image accurately.
(2) It is useful for classifying brain tumor images for accurate detection.
(3) Brain tumors can be detected at an early stage.
  • 18. 18 CHAPTER 2 PROPOSED SYSTEM

2.1 FAST DISCRETE CURVELET TRANSFORMATION
Curvelet implementations are based on the original construction, which uses a pre-processing step involving a special partitioning of phase space, followed by the ridgelet transform applied to blocks of data that are well localized in space and frequency.

2.2 BLOCK DIAGRAM
[Block diagram: MRI brain image → segmentation process → fast discrete curvelet decomposition → texture feature extraction (training samples) → feature extraction & PNN-RBF training → trained network → classification; if the input is abnormal → tumor and structural part detection → performance analysis]

In the last two or three years, however, curvelets have been redesigned in an effort to make them easier to use and understand. As a result, the new construction is considerably simpler and totally transparent. What is interesting here is that the new mathematical architecture suggests innovative
  • 19. 19 algorithmic strategies, and provides the opportunity to improve upon earlier implementations. There are two new fast discrete curvelet transforms (FDCTs), which are simpler, faster, and less redundant than existing proposals: (1) curvelets via USFFT, and (2) curvelets via wrapping. The block size can be changed at each scale level. In the wrapping construction, the input is taken to be a Cartesian array and f̂[n1, n2] denotes its 2-D discrete Fourier transform; the architecture of the FDCT via wrapping is then as follows.
1) Apply the 2-D FFT and obtain the Fourier samples f̂[n1, n2].
2) For each scale j and angle l, form the product Ũ_{j,l}[n1, n2] f̂[n1, n2].
3) Wrap this product around the origin and obtain f̃_{j,l}[n1, n2] = W(Ũ_{j,l} f̂)[n1, n2].
4) Apply the inverse 2-D FFT to each f̃_{j,l}, hence collecting the discrete coefficients.
  • 20. 20 2.3 TEXTURE ANALYSIS
2.3.1 OVERVIEW
Texture is that innate property of all surfaces that describes visual patterns, each having properties of homogeneity. It contains important information about the structural arrangement of the surface, such as in clouds, leaves, bricks, fabric, etc. It also describes the relationship of the surface to the surrounding environment. In short, it is a feature that describes the distinctive physical composition of a surface. Texture properties include:
(1) Coarseness
(2) Contrast
(3) Directionality
(4) Line-likeness
(5) Regularity
(6) Roughness
Texture is one of the most important defining features of an image. It is characterized by the spatial distribution of gray levels in a neighborhood. In order to capture the spatial dependence of gray-level values, which contributes to the perception of texture, a two-dimensional texture analysis matrix is taken into consideration. This two-dimensional matrix is obtained by decoding the image file (jpeg, bmp, etc.).

2.4 METHODS OF REPRESENTATION
There are three principal approaches used to describe texture: statistical, structural, and spectral.
  • 21. 21 (1) Statistical techniques characterize textures using the statistical properties of the grey levels of the points/pixels comprising a surface image. Typically, these properties are computed from the grey-level co-occurrence matrix of the surface or the wavelet transformation of the surface.
(2) Structural techniques characterize textures as being composed of simple primitive structures called texels (texture elements). These are arranged regularly on a surface according to some surface arrangement rules.
(3) Spectral techniques are based on properties of the Fourier spectrum and describe the global periodicity of the grey levels of a surface by identifying high-energy peaks in the Fourier spectrum.
For classification purposes, what concerns us are the statistical techniques of characterization, because it is these techniques that yield computable texture properties. The most popular statistical representations of texture are:
(1) Co-occurrence matrix
(2) Tamura texture
(3) Wavelet transform
  • 22. 22 2.4.1 CO-OCCURRENCE MATRIX
Originally proposed by R. M. Haralick, the co-occurrence matrix representation of texture features explores the grey-level spatial dependence of texture. A mathematical definition of the co-occurrence matrix is as follows:
- Given a position operator P(i, j),
- let A be an n x n matrix
- whose element A[i][j] is the number of times that points with grey level (intensity) g[i] occur, in the position specified by P, relative to points with grey level g[j].
- Let C be the n x n matrix produced by dividing A by the total number of point pairs that satisfy P. C[i][j] is a measure of the joint probability that a pair of points satisfying P will have values g[i], g[j].
- C is called a co-occurrence matrix defined by P.
Examples for the operator P are: "i above j", or "i one position to the right and two below j", etc. This can also be illustrated as follows. Let t be a translation; then a co-occurrence matrix Ct of a region is defined for every grey-level pair (a, b) by [1]:

Ct(a, b) = card{(s, s + t) ∈ R² | A[s] = a, A[s + t] = b}
  • 23. 23 Here, Ct(a, b) is the number of site couples, denoted by (s, s + t), that are separated by a translation vector t, with a being the grey level of s, and b being the grey level of s + t. For example, with an 8-grey-level image representation and a vector t that considers only one neighbor, we would obtain the corresponding 8 x 8 co-occurrence matrix [1].

2.5 CLASSICAL CO-OCCURRENCE MATRIX
First the co-occurrence matrix is constructed, based on the orientation and distance between image pixels. Then meaningful statistics are extracted from the matrix as the texture representation. Haralick proposed the following texture features:
(1) Energy
(2) Contrast
(3) Correlation
(4) Homogeneity
Hence, for each Haralick texture feature, we obtain a co-occurrence matrix. These co-occurrence matrices represent the spatial distribution and the dependence of the grey levels within a local area. Each (i, j)-th entry in the matrices represents the probability of going from one pixel with a grey level of 'i' to another with a
  • 24. 24 grey level of 'j' under a predefined distance and angle. From these matrices, sets of statistical measures are computed, called feature vectors.

2.5.1 ENERGY
Energy is a gray-scale image texture measure of homogeneity, reflecting the uniformity of the image's gray-level distribution. Here p(x, y) is the GLCM:

Energy = Σx Σy p(x, y)²

2.5.2 CONTRAST
Contrast is the moment of inertia about the main diagonal of the GLCM. It measures how the matrix values are distributed and the amount of local variation in the image, reflecting image clarity and the depth of texture shadows:

Contrast = Σx Σy (x − y)² p(x, y)

2.5.3 CORRELATION COEFFICIENT
Correlation measures the joint probability of occurrence of the specified pixel pairs:

Correlation = Σx Σy (x − μx)(y − μy) p(x, y) / (σx σy)

2.5.4 HOMOGENEITY
Homogeneity measures the closeness of the distribution of elements in the GLCM to the GLCM diagonal:

Homogeneity = Σx Σy p(x, y) / (1 + |x − y|)
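The construction of the co-occurrence matrix and the four Haralick features can be sketched as follows (a minimal sketch; the function names and the normalisation step are illustrative, not from the report):

```python
import numpy as np

def glcm(image, dx, dy, levels):
    """Build the co-occurrence matrix for translation t = (dx, dy):
    count pixel pairs (s, s + t) with grey levels (a, b), then
    normalise so the entries are joint probabilities p(a, b)."""
    A = np.zeros((levels, levels))
    h, w = image.shape
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                A[image[y, x], image[y2, x2]] += 1
    return A / A.sum()

def haralick(p):
    """Energy, contrast, correlation and homogeneity of a normalised
    GLCM p, following the formulas in 2.5.1-2.5.4."""
    n = p.shape[0]
    x, y = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    mx, my = (x * p).sum(), (y * p).sum()
    sx = np.sqrt((((x - mx) ** 2) * p).sum())
    sy = np.sqrt((((y - my) ** 2) * p).sum())
    energy = (p ** 2).sum()
    contrast = (((x - y) ** 2) * p).sum()
    correlation = (((x - mx) * (y - my) * p).sum()) / (sx * sy)
    homogeneity = (p / (1.0 + np.abs(x - y))).sum()
    return energy, contrast, correlation, homogeneity
```

The translation (dx, dy) plays the role of the position operator P; choosing different offsets and angles yields the different matrices described above.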
  • 26. 26 CHAPTER 3 PROBABILISTIC NEURAL NETWORKS

3.1 NEURAL NETWORK
Neural networks are predictive models loosely based on the action of biological neurons. The selection of the name "neural network" was one of the great PR successes of the twentieth century. It certainly sounds more exciting than a technical description such as "a network of weighted, additive values with nonlinear transfer functions". However, despite the name, neural networks are far from "thinking machines" or "artificial brains". A typical artificial neural network might have a hundred neurons. In comparison, the human nervous system is believed to have about 3×10^10 neurons. We are still light years from "Data". The original perceptron model was developed by Frank Rosenblatt in 1958. Rosenblatt's model consisted of three layers: (1) a "retina" that distributed inputs to the second layer, (2) "association units" that combine the inputs with weights and trigger a threshold step function which feeds to the output layer, and (3) the output layer, which combines the values. Unfortunately, the use of a step function in the neurons made the perceptrons difficult or impossible to train. A critical analysis of perceptrons published in 1969 by Marvin Minsky and Seymour Papert pointed out a number of critical weaknesses of perceptrons, and, for a period of time, interest in perceptrons waned. Interest in neural networks was revived in 1986 when David Rumelhart, Geoffrey Hinton and Ronald Williams published "Learning Internal
  • 27. 27 Representations by Error Propagation". They proposed a multilayer neural network with nonlinear but differentiable transfer functions that avoided the pitfalls of the original perceptron's step functions. They also provided a reasonably effective training algorithm for neural networks. In this paper, we investigate the possibility of utilizing the detection capability of neural networks as a diagnostic tool for early breast cancer. Breast cancer is one of the leading causes of death among women in the world. Detection and localization of the tumor at early stages is the only way to decrease the mortality rate. X-ray mammography is the most used diagnostic tool for breast cancer screening. However, it has important limitations. For example, it has high false-negative and false-positive rates that may reach up to 34%. Thus, the development of imaging modalities which enhance, complement, or replace X-ray mammography has been a priority in medical imaging research. Recently, several microwave imaging methods have been developed. Microwave ultra-wideband (UWB) imaging is currently one of the promising methods under investigation by many research groups around the world. This method involves transmitting UWB signals through the breast tissue and recording the scattered signals from different locations surrounding the breast using compact UWB antennas. The basis of this method is the difference in the electrical properties between the tumors and the healthy tissues. It has been shown by different research groups worldwide that the healthy fat tissues of the breast have a dielectric constant that ranges between 5 and 9 and a conductivity between 0.02 S/m and 0.2 S/m. On the other hand, the malignant tissues have a dielectric constant of around 60 and a conductivity of around 2 S/m. Thus, it is clear that there is a significant contrast between the electrical characteristics of the
  • 28. 28 healthy and malignant tissues. From the electromagnetic point of view, this contrast means a significant difference in the scattering properties of those tissues. From the neural network point of view, it is possible to say that the difference in the scattering properties of healthy and malignant tissues means that the scattered signals recorded in different places around the breast carry the signature of the existence of a tumor. The comparison between those received signals recorded in different places gives an indication of the location of the tumor. Reviewing the literature shows that most of the research on using neural networks for breast cancer detection was aimed at enabling radiologists to make accurate diagnoses of breast cancer on X-ray mammograms. For example, several neural network methods were used in [15] for automated classification of clustered micro-calcifications in mammograms. Concerning UWB systems, neural networks have been used extensively to detect the direction of arrival of ultra-wideband signals. They have also been utilized in a simple manner to detect and locate a tumor of 2.5 mm radius along a certain axis (one dimension), and also in two dimensions, but by rotating the place of receiving the signal in steps of 1 degree around the breast. This paper presents the use of a neural network to detect and locate tumors in the breast with sizes down to 1 mm in radius by using four permanent probes around the breast. A three-dimensional breast model is built for the purpose of this investigation. A pulsed plane electromagnetic wave is sent towards the breast and the scattered signals are collected by four antennas surrounding the breast. The transmitted pulse occupies the UWB frequency range from 3.1 GHz to 10.6 GHz.
  • 29. 29 A simple feed-forward back-propagation neural network is then used to detect the tumor and locate its position.

3.2 THE USED NEURAL NETWORK
The neural network is an excellent tool for recognition and discrimination between different sets of signals. To get the best results using a neural network, it is necessary to choose a suitable architecture and learning algorithm. Unfortunately, there is no guaranteed method to do that. The best approach is to choose what is expected to be suitable according to previous experience, and then to expand or shrink the neural network size until a reasonable output is obtained. In this work we tried different sizes for the neural network using MATLAB, and we found that the best in our case is the model shown in Fig. 2. It has an input layer with 2000 inputs; a first hidden layer with 11 nodes and a TANSIG transfer function; a second hidden layer with 7 nodes and a TANSIG transfer function; and an output layer with a PURELIN transfer function and 2 outputs. One of the two outputs is used for the detection of the tumor, and the other for localization. The TANSIG transfer function is selected to limit the signal between −1 and 1. For the output layer, the PURELIN transfer function is chosen to give all the possible cases for the location of the tumor. The proposed neural network has been used to detect and locate the tumor in two cases. The first case was to detect and locate a tumor in a two-dimensional sector of the breast model. The location of the tumor was considered randomly at the center and in any of the four quadrants. The second case was to detect and locate a tumor anywhere in the three-dimensional model. The neural network has been trained using 100 sets of inputs using the
  • 30. 30 training function (TRAINSCG). An additional 40 sets of inputs were used to test the performance of each neural network.

3.3 TYPES OF NEURAL NETWORKS
1) Artificial Neural Networks
2) Probabilistic Neural Networks
3) General Regression Neural Networks
DTREG implements the most widely used types of neural networks:
(1) Multilayer Perceptron Networks (also known as multilayer feed-forward networks),
(2) Cascade Correlation Neural Networks,
(3) Probabilistic Neural Networks (PNN),
(4) General Regression Neural Networks (GRNN).

3.4 RADIAL BASIS FUNCTION NETWORKS
(1) Functional Link Networks,
(2) Kohonen Networks,
(3) Gram-Charlier Networks,
(4) Hebb Networks,
(5) Adaline Networks,
(6) Hybrid Networks.
  • 31. 31 3.5 THE MULTILAYER PERCEPTRON NEURAL NETWORK MODEL
The following diagram illustrates a perceptron network with three layers: this network has an input layer (on the left) with three neurons, one hidden layer (in the middle) with three neurons, and an output layer (on the right) with three neurons. There is one neuron in the input layer for each predictor variable. In the case of categorical variables, N-1 neurons are used to represent the N categories of the variable.

3.5.1 INPUT LAYER
A vector of predictor variable values (x1...xp) is presented to the input layer. The input layer (or processing before the input layer) standardizes these values so
  • 32. 32 that the range of each variable is −1 to 1. The input layer distributes the values to each of the neurons in the hidden layer. In addition to the predictor variables, there is a constant input of 1.0, called the bias, that is fed to each of the hidden layers; the bias is multiplied by a weight and added to the sum going into the neuron.

3.5.2 HIDDEN LAYER
Arriving at a neuron in the hidden layer, the value from each input neuron is multiplied by a weight (wji), and the resulting weighted values are added together, producing a combined value uj. The weighted sum (uj) is fed into a transfer function, σ, which outputs a value hj. The outputs from the hidden layer are distributed to the output layer.

3.5.3 OUTPUT LAYER
Arriving at a neuron in the output layer, the value from each hidden-layer neuron is multiplied by a weight (wkj), and the resulting weighted values are added together, producing a combined value vk. The weighted sum (vk) is fed into a transfer function, σ, which outputs a value yk. The y values are the outputs of the network. If a regression analysis is being performed with a continuous target variable, then there is a single neuron in the output layer, and it generates a single y value. For classification problems with categorical target variables, there are N neurons in the output layer producing N values, one for each of the N categories of the target variable.
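A single forward pass through the layers described in 3.5.1-3.5.3 can be sketched as follows (a minimal sketch using tanh as the transfer function σ, consistent with TANSIG; the weight layout is an assumption for illustration):

```python
import math

def forward(x, Wh, bh, Wo, bo):
    """One forward pass of the perceptron network above: each hidden
    neuron feeds its weighted sum u_j (plus bias) through a tanh
    transfer function; each output neuron sums the weighted h values
    (plus bias) to produce y. Wh/Wo are lists of weight rows, one per
    neuron; bh/bo are the bias weights."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(Wh, bh)]                  # hidden layer
    y = [sum(w * hi for w, hi in zip(row, h)) + b
         for row, b in zip(Wo, bo)]                  # linear output
    return y
```

A linear (identity) output, as used here, corresponds to the single-y regression case; a classification network would add one output neuron per category.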
  • 33. 33 3.6 PROBABILISTIC NEURAL NETWORKS (PNN)
Probabilistic (PNN) and General Regression Neural Networks (GRNN) have similar architectures, but there is a fundamental difference: probabilistic networks perform classification, where the target variable is categorical, whereas general regression neural networks perform regression, where the target variable is continuous. If you select a PNN/GRNN network, DTREG will automatically select the correct type of network based on the type of target variable.

3.6.1 ARCHITECTURE OF A PNN
Fig 3.6
All PNN networks have four layers:
(1) Input layer - There is one neuron in the input layer for each predictor variable. In the case of categorical variables, N-1 neurons are used, where N is the number of categories. The input neurons (or processing before the input layer) standardize the range of the values by subtracting the median
  • 34. 34 and dividing by the interquartile range. The input neurons then feed the values to each of the neurons in the hidden layer.
(2) Hidden layer - This layer has one neuron for each case in the training data set. The neuron stores the values of the predictor variables for the case along with the target value. When presented with the x vector of input values from the input layer, a hidden neuron computes the Euclidean distance of the test case from the neuron's center point and then applies the RBF kernel function using the sigma value(s). The resulting value is passed to the neurons in the pattern layer.
(3) Pattern layer / Summation layer - The next layer in the network is different for PNN networks and for GRNN networks. For PNN networks there is one pattern neuron for each category of the target variable. The actual target category of each training case is stored with each hidden neuron; the weighted value coming out of a hidden neuron is fed only to the pattern neuron that corresponds to the hidden neuron's category. The pattern neurons add the values for the class they represent (hence, it is a weighted vote for that category). For GRNN networks, there are only two neurons in the pattern layer. One neuron is the denominator summation unit; the other is the numerator summation unit. The denominator summation unit adds up the weight values coming from each of the hidden neurons. The numerator summation unit adds up the weight values multiplied by the actual target value for each hidden neuron.
  • 35. 35 (4) Decision layer - The decision layer is different for PNN and GRNN networks. For PNN networks, the decision layer compares the weighted votes for each target category accumulated in the pattern layer and uses the largest vote to predict the target category. For GRNN networks, the decision layer divides the value accumulated in the numerator summation unit by the value in the denominator summation unit and uses the result as the predicted target value.
The following diagram shows the actual architecture of the proposed network used in our project.
Fig 3.6b
(1) Input Layer: The input vector, denoted as p, is presented as the black vertical bar. Its dimension is R × 1. In this paper, R = 3.
  • 36. 36 (2) Radial Basis Layer: In the radial basis layer, the vector distances between the input vector p and the weight vectors made of each row of the weight matrix W are calculated. Here, the vector distance is defined as the dot product between two vectors [8]. Assume the dimension of W is Q × R. The dot product between p and the i-th row of W produces the i-th element of the distance vector ||W − p||, whose dimension is Q × 1. The minus symbol, "−", indicates that it is the distance between vectors. Then, the bias vector b is combined with ||W − p|| by an element-by-element multiplication; the result is denoted as n = ||W − p|| .* b. The transfer function in the PNN has a distance criterion with respect to a center built in. In this paper, it is defined as

radbas(n) = e^(−n²)     (1)

Each element of n is substituted into Eq. (1) and produces the corresponding element of a, the output vector of the radial basis layer. The i-th element of a can be represented as

ai = radbas(||Wi − p|| · bi)     (2)

where Wi is the vector made of the i-th row of W and bi is the i-th element of the bias vector b. Some characteristics of the radial basis layer: the i-th element of a equals 1 if the input p is identical to the i-th row of the input weight matrix W. A radial basis neuron with a weight vector close to the input vector p produces a value near 1, and then its output weights in the competitive layer will pass their values to the competitive function. It is also possible that several elements of a are close to 1, since the input pattern may be close to several training patterns.
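The radial basis layer of Eqs. (1)-(2), followed by a winner-take-all competitive step, can be sketched as follows (a minimal sketch; W, b and M are toy values standing in for the trained network, not the project's actual weights):

```python
import numpy as np

def pnn_classify(W, b, M, p):
    """PNN layer sketch: the radial basis layer computes
    a_i = radbas(||W_i - p|| * b_i) with radbas(n) = exp(-n^2);
    the competitive layer then multiplies a by the layer weights M
    and outputs a one-hot vector c marking the largest element."""
    n = np.linalg.norm(W - p, axis=1) * b   # element-wise, as in n = ||W - p|| .* b
    a = np.exp(-n ** 2)                     # radbas transfer function, Eq. (1)
    d = M @ a                               # weighted class evidence
    c = np.zeros_like(d)
    c[np.argmax(d)] = 1.0                   # competitive function: 1 at max, 0 elsewhere
    return c
```

A row of W identical to p gives n_i = 0 and a_i = 1, matching the characteristic noted above.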
  • 37. 37 (3) Competitive Layer: There is no bias in the competitive layer. In the competitive layer, the vector a is first multiplied by the layer weight matrix M, producing an output vector d. The competitive function, denoted as C in Fig. 2, produces a 1 corresponding to the largest element of d, and 0's elsewhere. The output vector of the competitive function is denoted as c. The index of the 1 in c is the tumor class that the system assigns. The dimension of the output vector, K, is 5 in this paper.

3.6.2 HOW PNN NETWORKS WORK
Although the implementation is very different, probabilistic neural networks are conceptually similar to k-nearest neighbor (k-NN) models. The basic idea (Fig 3.6.2a) is
  • 38. 38 that a predicted target value of an item is likely to be about the same as for other items that have close values of the predictor variables. Consider this figure: assume that each case in the training set has two predictor variables, x and y. The cases are plotted using their x, y coordinates as shown in the figure. Also assume that the target variable has two categories: positive, which is denoted by a square, and negative, which is denoted by a dash. Now, suppose we are trying to predict the value of a new case represented by the triangle with predictor values x = 6, y = 5.1. Should we predict the target as positive or negative? Notice that the triangle is positioned almost exactly on top of a dash representing a negative value. But that dash is in a fairly unusual position compared to the other dashes, which are clustered below the squares and left of center. So it could be that the underlying negative value is an odd case. The nearest neighbor classification performed for this example depends on how many neighboring points are considered. If 1-NN is used and only the closest point is considered, then clearly the new point should be classified as negative, since it is on top of a known negative point. On the other hand, if 9-NN classification is used and the closest 9 points are considered, then the effect of the surrounding 8 positive points may overbalance the close negative point. A probabilistic neural network builds on this foundation and generalizes it to consider all of the other points. The distance is computed from the point being evaluated to each of the other points, and a radial basis function (RBF) (also called a kernel function) is applied to the distance to compute the weight (influence) for
  • 39. 39 each point. The radial basis function is so named because the radius distance is the argument to the function:

Weight = RBF(distance)

The further some other point is from the new point, the less influence it has.
Fig 3.6.2b

3.7 RADIAL BASIS FUNCTION
Different types of radial basis functions could be used, but the most common is the Gaussian function:
Fig 3.7
  • 40. 40 3.8 ADVANTAGES AND DISADVANTAGES OF PNN NETWORKS
(1) It is usually much faster to train a PNN/GRNN network than a multilayer perceptron network.
(2) PNN/GRNN networks are often more accurate than multilayer perceptron networks.
(3) PNN/GRNN networks are relatively insensitive to outliers (wild points).
(4) PNN networks generate accurate predicted target probability scores.
(5) PNN networks approach Bayes optimal classification.
(6) PNN/GRNN networks are slower than multilayer perceptron networks at classifying new cases.
(7) PNN/GRNN networks require more memory space to store the model.

3.9 REMOVING UNNECESSARY NEURONS
One of the disadvantages of PNN models compared to multilayer perceptron networks is that PNN models are large, due to the fact that there is one neuron for each training row. This causes the model to run slower than multilayer perceptron networks when using scoring to predict values for new rows. DTREG provides an option to cause it to remove unnecessary neurons from the model after the model has been constructed. Removing unnecessary neurons has three benefits:
(1) The size of the stored model is reduced.
(2) The time required to apply the model during scoring is reduced.
(3) Removing neurons often improves the accuracy of the model.
  • 41. 41 The process of removing unnecessary neurons is iterative. Leave-one-out validation is used to measure the error of the model with each neuron removed. The neuron that causes the least increase in error (or possibly the largest reduction in error) is then removed from the model. The process is repeated with the remaining neurons until the stopping criterion is reached. When unnecessary neurons are removed, the "Model Size" section of the analysis report shows how the error changes with different numbers of neurons. You can see a graphical chart of this by clicking Chart/Model size. There are three criteria that can be selected to guide the removal of neurons:
(1) Minimize error - If this option is selected, then DTREG removes neurons as long as the leave-one-out error remains constant or decreases. It stops when it finds a neuron whose removal would cause the error to increase above the minimum found.
(2) Minimize neurons - If this option is selected, DTREG removes neurons until the leave-one-out error would exceed the error for the model with all neurons.
(3) # of neurons - If this option is selected, DTREG removes the least significant neurons until only the specified number of neurons remain.
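The "minimize error" criterion can be sketched as a greedy backward-elimination loop (an illustrative sketch, not DTREG's actual implementation; `loo_error` is a hypothetical callback returning the leave-one-out error of a candidate neuron set):

```python
def prune_neurons(neurons, loo_error):
    """Greedy pruning sketch: repeatedly drop the neuron whose removal
    increases the leave-one-out error least (or decreases it most),
    stopping when any further removal would raise the error above the
    minimum found so far."""
    best = loo_error(neurons)
    while len(neurons) > 1:
        trials = [(loo_error(neurons[:i] + neurons[i + 1:]), i)
                  for i in range(len(neurons))]
        err, i = min(trials)                # cheapest neuron to remove
        if err > best:                      # removal would hurt: stop
            break
        best = min(best, err)
        neurons = neurons[:i] + neurons[i + 1:]
    return neurons
```

Because the pruned model is re-scored after every removal, the loop naturally realizes all three benefits listed above when redundant neurons exist.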
  • 42. 42 3.10 PNN
NEWPNN creates a two-layer network. The first layer has RADBAS neurons, and calculates its weighted inputs with DIST and its net input with NETPROD. The second layer has COMPET neurons, and calculates its weighted input with DOTPROD and its net inputs with NETSUM. Only the first layer has biases. NEWPNN sets the first-layer weights to P', and the first-layer biases are all set to 0.8326/SPREAD, resulting in radial basis functions that cross 0.5 at weighted inputs of +/- SPREAD. The second-layer weights W2 are set to T.

3.10.1 METHODOLOGY
A description of the derivation of the PNN classifier was given. PNNs have been used for classification problems. The PNN classifier presented good accuracy, very small training time, robustness to weight changes, and negligible retraining time. There are 6 stages involved in the proposed model, starting from the data input and ending at the output. The first stage is the image processing system. Basically, in an image processing system, image acquisition and image enhancement are the steps that have to be done. In this paper, these two steps are skipped and all the images are collected from an available resource. The proposed model requires converting the images into a format capable of being manipulated by the computer. The MR images are converted into matrix form by using MATLAB. Then, the PNN is used to classify the MR images. Lastly, performance based on the results will be analyzed at the end of the development phase.
  • 43. 43 CHAPTER 4 SPATIAL FUZZY CLUSTERING MODEL
  • 44. 44 SPATIAL FUZZY CLUSTERING MODEL Fuzzy clustering plays an important role in solving problems in the areas of pattern recognition and fuzzy model identification. A variety of fuzzy clustering methods have been proposed, and most of them are based upon distance criteria. One widely used algorithm is the fuzzy c-means (FCM) algorithm, which uses reciprocal distance to compute fuzzy weights. A more efficient algorithm is the new FCFM. It computes the cluster centers using Gaussian weights, uses large initial prototypes, and adds processes of eliminating, clustering, and merging. In the following sections we discuss and compare the FCM and FCFM algorithms. The spatial fuzzy c-means method incorporates spatial information, and the membership weighting of each cluster is altered after the cluster distribution in the neighborhood is considered. The first pass is the same as in standard FCM: the membership function is calculated in the spectral domain. In the second pass, the membership information of each pixel is mapped to the spatial domain and the spatial function is computed from it. The FCM iteration then proceeds with the new membership that incorporates the spatial function. The iteration stops when the maximum difference between cluster centers or membership functions at two successive iterations falls below a given threshold. The fuzzy c-means (FCM) algorithm was introduced by J. C. Bezdek [2]. The idea of FCM is to use the weights that minimize the total weighted mean-square error: J(wqk, z(k)) = Σ(q=1,Q) Σ(k=1,K) (wqk)^p || x(q) − z(k) ||², subject to Σ(k=1,K) wqk = 1 for each q.
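As a worked illustration, the objective J can be evaluated directly from the feature vectors X, prototypes Z, and weight matrix W. This is a sketch assuming the usual FCM fuzzifier exponent p on the weights:

```python
import numpy as np

def fcm_objective(X, Z, W, p=2.0):
    """Total weighted mean-square error J = sum_q sum_k w_qk^p ||x_q - z_k||^2."""
    # d2[q, k] = squared distance between feature vector q and prototype k
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return float(((W ** p) * d2).sum())
```

FCM alternates between updating W and Z so that each update cannot increase this value.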
  • 45. 45 wqk = (1/Dqk²)^(1/(p−1)) / Σ(k=1,K) (1/Dqk²)^(1/(p−1)), p > 1, where Dqk = || x(q) − z(k) ||. The FCM allows each feature vector to belong to every cluster with a fuzzy truth value (between 0 and 1), which is computed using Equation (4). The algorithm assigns a feature vector to the cluster for which its weight is maximal over all clusters. 4.1 A NEW FUZZY C-MEANS IMPLEMENTATION 4.1.1 ALGORITHM FLOW 4.2 INITIALIZE THE FUZZY WEIGHTS In order to compare the FCM with the FCFM, our implementation allows the user to choose between initializing the weights from feature vectors or randomly. Initializing from feature vectors assigns the first Kinit (user-given) feature vectors to prototypes and then computes the weights by Equation (4).
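Equation (4) and the corresponding prototype update can be sketched in Python (illustrative function names; a small constant guards against a feature vector sitting exactly on a prototype):

```python
import numpy as np

def fcm_memberships(X, Z, p=2.0):
    """Membership update of Equation (4):
    w_qk = (1/D_qk^2)^(1/(p-1)) / sum_k' (1/D_qk'^2)^(1/(p-1))."""
    D2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)  # D_qk^2
    D2 = np.maximum(D2, 1e-12)          # avoid division by zero at a prototype
    inv = (1.0 / D2) ** (1.0 / (p - 1))
    return inv / inv.sum(axis=1, keepdims=True)

def fcm_centers(X, W, p=2.0):
    """Prototype update: z_k = sum_q w_qk^p x_q / sum_q w_qk^p."""
    Wp = W ** p
    return (Wp.T @ X) / Wp.sum(axis=0)[:, None]
```

Each row of the returned weight matrix sums to 1, so every feature vector distributes a unit of membership across the clusters.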
  • 46. 46 4.3 STANDARDIZE THE WEIGHTS OVER Q During the FCM iteration, the computed cluster centers get closer and closer. To avoid rapid convergence and grouping into a single cluster, we rescale the weights using w[q,k] = (w[q,k] − wmin)/(wmax − wmin) before standardizing the weights over Q, where wmax and wmin are the maximum and minimum weights over all feature vectors for the particular class prototype. 4.4 ELIMINATING EMPTY CLUSTERS After the fuzzy clustering loop we add a step (Step 8) to eliminate the empty clusters. This step is placed outside the fuzzy clustering loop and before the calculation of the modified XB validity. Without the elimination, the minimum distance of a prototype pair used in Equation (8) may be the distance of an empty cluster pair. We call the method that eliminates small clusters with a threshold of 0, so that it eliminates only the empty clusters. After the fuzzy c-means iteration, for the purpose of comparison and to pick the optimal result, we add Step 9 to calculate the cluster centers and the modified Xie-Beni clustering validity. The Xie-Beni validity is a product of compactness and separation measures. The compactness-to-separation ratio is defined by Equation (6): ν = {(1/K) Σ(k=1,K) σk²} / Dmin², where σk² = Σ(q=1,Q) wqk || x(q) − c(k) ||² and Dmin is the minimum distance between the cluster centers.
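The modified validity measure, with each cluster's variance summed over its own members only, might be computed as follows (an illustrative sketch; members are assigned by maximum weight):

```python
import numpy as np

def modified_xie_beni(X, W, C):
    """Compactness-to-separation ratio of Equation (6), modified so that
    sigma_k^2 sums w_qk ||x_q - c_k||^2 over members of cluster k only."""
    Q, K = W.shape
    members = W.argmax(axis=1)                  # hard assignment by maximum weight
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    sigma2 = np.array([(W[members == k, k] * d2[members == k, k]).sum()
                       for k in range(K)])
    # minimum squared distance between distinct cluster centers (Dmin^2)
    cd2 = ((C[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    dmin2 = cd2[~np.eye(K, dtype=bool)].min()
    return sigma2.mean() / dmin2
```

Smaller values indicate compact, well-separated clusters, which is why the implementation uses this measure to pick the optimal result.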
  • 47. 47 The variance of each cluster is calculated by summing over only the members of that cluster rather than over all Q, which contrasts with the original Xie-Beni validity measure: σk² = Σ{q: q is in cluster k} wqk || x(q) − c(k) ||². The spatial function is then incorporated into the membership function. 4.5 MORPHOLOGICAL PROCESS Morphological image processing is a collection of non-linear operations related to the shape or morphology of features in an image. Morphological operations rely only on the relative ordering of pixel values, not on their numerical values, and are therefore especially suited to the processing of binary images. Morphological operations can also be applied to grey-scale images whose light transfer functions are unknown, so that their absolute pixel values are of no or minor interest. Morphological techniques probe an image with a small shape or template called a structuring element. The structuring element is positioned at all possible locations in the image and compared with the corresponding neighbourhood of pixels. Some operations test whether the element "fits" within the neighbourhood, while others test whether it "hits" or intersects the neighbourhood. A morphological operation on a binary image creates a new binary image in which a pixel has a non-zero value only if the test is successful at that location in the input image.
  • 48. 48 The structuring element is a small binary image, i.e. a small matrix of pixels, each with a value of zero or one. The matrix dimensions specify the size of the structuring element, and the pattern of ones and zeros specifies its shape. The origin of the structuring element is usually one of its pixels, although in general the origin can lie outside the structuring element. fig 4.5 Examples of simple structuring elements. A common practice is to give the structuring matrix odd dimensions and define the origin as the centre of the matrix. Structuring elements play the same role in morphological image processing as convolution kernels do in linear image filtering. When a structuring element is placed in a binary image, each of its pixels is associated with the corresponding pixel of the neighbourhood under the structuring element. The structuring element is said to fit the image if, for each of its pixels set to 1, the corresponding image pixel is also 1. Similarly, a structuring element is
  • 49. 49 said to hit, or intersect, an image if, for at least one of its pixels set to 1, the corresponding image pixel is also 1. fig 4.5a Fitting and hitting of a binary image with structuring elements s1 and s2. Zero-valued pixels of the structuring element are ignored, i.e. they indicate points where the corresponding image value is irrelevant. 4.6 FUNDAMENTAL OPERATIONS More formal descriptions and examples of how basic morphological operations work are given in the Hypermedia Image Processing Reference (HIPR) developed by Dr. R. Fisher et al. at the Department of Artificial Intelligence, University of Edinburgh, Scotland, UK. 4.7 EROSION AND DILATION The erosion of a binary image f by a structuring element s (denoted f ⊖ s) produces a new binary image g = f ⊖ s with ones in all locations (x,y) of the structuring element's origin at which s fits the input image f, i.e. g(x,y) = 1 if s fits f and 0 otherwise, repeating for all pixel coordinates (x,y). Erosion with small (e.g. 2×2 to 5×5) square structuring elements shrinks an image by stripping away a layer of pixels from both the inner and outer boundaries of regions. The holes and gaps between different regions become larger, and small details are eliminated:
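The fit and hit tests can be made concrete with a small sketch (illustrative helper names; the origin is taken as the centre of the element, and pixels outside the image count as 0):

```python
import numpy as np

def fits(image, se, y, x):
    """True iff every 1-pixel of structuring element se lies on a 1-pixel of image,
    with se's origin (its centre) placed at (y, x)."""
    h, w = se.shape
    oy, ox = h // 2, w // 2
    for i in range(h):
        for j in range(w):
            if se[i, j]:
                yy, xx = y + i - oy, x + j - ox
                inside = 0 <= yy < image.shape[0] and 0 <= xx < image.shape[1]
                if not inside or not image[yy, xx]:
                    return False
    return True

def hits(image, se, y, x):
    """True iff at least one 1-pixel of se overlaps a 1-pixel of image."""
    h, w = se.shape
    oy, ox = h // 2, w // 2
    for i in range(h):
        for j in range(w):
            if se[i, j]:
                yy, xx = y + i - oy, x + j - ox
                if 0 <= yy < image.shape[0] and 0 <= xx < image.shape[1] and image[yy, xx]:
                    return True
    return False
```

Erosion marks the origins where `fits` succeeds; dilation marks those where `hits` succeeds.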
  • 50. 50 fig 4.7 Erosion: a 3×3 square structuring element Larger structuring elements have a more pronounced effect, the result of erosion with a large structuring element being similar to the result obtained by iterated erosion using a smaller structuring element of the same shape. If s1 and s2 are a pair of structuring elements identical in shape, with s2 twice the size of s1, then f ⊖ s2 ≈ (f ⊖ s1) ⊖ s1. Erosion removes small-scale details from a binary image, but it simultaneously reduces the size of the regions of interest. By subtracting the eroded image from the original image, the boundaries of each region can be found: b = f − (f ⊖ s), where f is an image of the regions, s is a 3×3 structuring element, and b is an image of the region boundaries. The dilation of an image f by a structuring element s (denoted f ⊕ s) produces a new binary image g = f ⊕ s with ones in all locations (x,y) of the structuring element's origin at which s hits the input image f, i.e. g(x,y) = 1 if s hits f and 0 otherwise, repeating for all pixel coordinates (x,y). Dilation has the opposite effect to erosion: it adds a layer of pixels to both the inner and outer boundaries of regions. The holes enclosed by a single region and the gaps between different regions become smaller, and small intrusions into the boundaries of a region are filled in:
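Erosion and the boundary-extraction formula b = f − (f ⊖ s) can be sketched directly (an illustrative implementation that treats pixels outside the image as 0):

```python
import numpy as np

def erode(f, s):
    """g(y, x) = 1 iff the structuring element s fits f at (y, x)."""
    h, w = s.shape
    pad = np.pad(f, ((h // 2, h // 2), (w // 2, w // 2)))  # outside pixels are 0
    out = np.zeros_like(f)
    for y in range(f.shape[0]):
        for x in range(f.shape[1]):
            window = pad[y:y + h, x:x + w]
            out[y, x] = np.all(window[s == 1] == 1)        # s fits here?
    return out

# Boundary extraction b = f − (f ⊖ s) with a 3×3 square structuring element
f = np.zeros((9, 9), dtype=int)
f[2:7, 2:7] = 1                  # a 5×5 square region
s = np.ones((3, 3), dtype=int)
boundary = f - erode(f, s)       # one-pixel-thick region boundary
```

On the 5×5 square, erosion strips one layer of pixels, leaving the 3×3 interior; the subtraction recovers the 16-pixel ring around it.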
  • 51. 51 fig 4.7a Dilation: a 3×3 square structuring element Results of dilation or erosion are influenced both by the size and the shape of the structuring element. Dilation and erosion are dual operations in that they have opposite effects. Let f c denote the complement of an image f, i.e. the image produced by replacing 1 with 0 and vice versa. Formally, the duality is written as f ⊖ s = (f c ⊕ srot) c, where srot is the structuring element s rotated by 180°. If a structuring element is symmetrical with respect to rotation, then srot does not differ from s. If a binary image is considered to be a collection of connected regions of pixels set to 1 on a background of pixels set to 0, then erosion is the fitting of a structuring element to these regions, and dilation is the fitting of a structuring element (rotated if necessary) into the background, followed by inversion of the result.
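The duality relation can be checked numerically with small helper implementations (illustrative; with zero padding the identity holds exactly when the image's border pixels are 0):

```python
import numpy as np

def erode(f, s):
    # g(y, x) = 1 iff s fits f at (y, x); outside pixels count as 0
    h, w = s.shape
    pad = np.pad(f, ((h // 2, h // 2), (w // 2, w // 2)))
    out = np.zeros_like(f)
    for y in range(f.shape[0]):
        for x in range(f.shape[1]):
            out[y, x] = np.all(pad[y:y + h, x:x + w][s == 1] == 1)
    return out

def dilate(f, s):
    # g(y, x) = 1 iff s hits f at (y, x)
    h, w = s.shape
    pad = np.pad(f, ((h // 2, h // 2), (w // 2, w // 2)))
    out = np.zeros_like(f)
    for y in range(f.shape[0]):
        for x in range(f.shape[1]):
            out[y, x] = np.any(pad[y:y + h, x:x + w][s == 1] == 1)
    return out

# Duality: f ⊖ s equals the complement of dilating the complement of f
# with s rotated by 180° (a no-op here, since the 3×3 square is symmetric).
f = np.zeros((7, 7), dtype=int)
f[2:5, 2:5] = 1
s = np.ones((3, 3), dtype=int)
s_rot = s[::-1, ::-1]             # 180° rotation
dual = 1 - dilate(1 - f, s_rot)
```

Eroding the 3×3 square leaves only its centre pixel, and the complement-dilate-complement route produces the same image.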
  • 53. 53 5.1 MRI Brain Image with Benign Case (1) Input image with its curvelet decomposition IMAGE SEGMENTATION
  • 54. 54 Fig. 5.1.1 MRI Brain Image with Benign Case 5.2 MRI Brain with Malignant Case Fig 5.2.1 MRI Brain with Malignant Case IMAGE SEGMENTATION
  • 55. 55 Fig 5.2.2 MRI Brain with Malignant Case 5.3 MRI Brain with Normal Case Fig 5.3.1 MRI Brain with Normal Case IMAGE SEGMENTATION
  • 56. 56 Fig 5.3.2 MRI Brain with Normal Case CONCLUSION The project presented automated brain image classification for early-stage abnormality detection using a neural network classifier, with the tumor spotted through image segmentation. Pattern recognition was performed using a probabilistic neural network with a radial basis function, and patterns were characterized with the help of the fast discrete curvelet transform and Haralick feature analysis. The spatial fuzzy clustering algorithm was utilized effectively for accurate tumor detection and for measuring the area of the abnormal region. Experiments showed that the system provides better classification accuracy across various stages of test samples and consumes less processing time.
  • 57. 57 REFERENCES [1] N. Kwak and C. H. Choi, “Input Feature Selection for Classification Problems”, IEEE Transactions on Neural Networks, vol. 13, no. 1, pp. 143–159, 2002. [2] E. D. Ubeyli and I. Guler, “Feature Extraction from Doppler Ultrasound Signals for Automated Diagnostic Systems”, Computers in Biology and Medicine, vol. 35, no. 9, pp. 735–764, 2005. [3] D. F. Specht, “Probabilistic Neural Networks for Classification, Mapping, or Associative Memory”, Proceedings of the IEEE International Conference on Neural Networks, vol. 1, IEEE Press, New York, pp. 525–532, June 1988. [4] D. F. Specht, “Probabilistic Neural Networks”, Neural Networks, vol. 3, no. 1, pp. 109–118, 1990.
  • 58. 58 [5] P. Georgiadis et al., “Improving brain tumor characterization on MRI by probabilistic neural networks and non-linear transformation of textural features”, Computer Methods and Programs in Biomedicine, vol. 89, pp. 24–32, 2008. [6] M. Kaus et al., “Automated segmentation of MRI brain tumors”, Radiology, vol. 218, pp. 585–591, 2001. [7] Kernel, P., Bela, M., Rainer, S., Zalan, D., Zsolt, T. and Janos, F., “Application of neural network in medicine”, Diag. Med. Tech., vol. 4, issue 3, pp. 538–54, 1998. [8] Mammadagha Mammadov and Engin Tas, “An improved version of the backpropagation algorithm with effective dynamic learning rate and momentum”, International Conference on Applied Mathematics, pp. 356–361, 2006. [9] Messen, W., Wehrens, R. and Buydens, L., “Supervised Kohonen networks for classification problems”, Chemometrics and Intelligent Laboratory Systems, vol. 83, pp. 99–113, 2006.