International Journal of Advanced Research in Engineering and Technology (IJARET)
ISSN 0976 - 6480 (Print), ISSN 0976 - 6499 (Online)
Volume 3, Issue 1, January-June (2012), pp. 01-09
© IAEME: www.iaeme.com/ijaret.html
Journal Impact Factor (2011): 0.7315 (Calculated by GISI), www.jifactor.com

SAMPLE SIZE DETERMINATION FOR CLASSIFICATION OF EEG SIGNALS USING POWER ANALYSIS IN MACHINE LEARNING APPROACH

V. Indira1, R. Vasanthakumari2, V. Sugumaran3

1 Research Scholar, Department of Mathematics, Karpagam University, Coimbatore, India. indira_manik@yahoo.co.in
2 Principal, Kasthurba College for Women, Puducherry-605008. vasunthara1@yahoo.com
3 Associate Professor, School of Mechanical and Building Sciences, VIT University Chennai Campus, Vandalur-Kelambakkam Road, Chennai-600048. v_sugu@yahoo.com


ABSTRACT
Recent advances in computer hardware and signal processing have made possible the use
of electroencephalogram (EEG) signals or “brain waves” for communication between
humans and computers. Also, the signals that are extracted from brain through EEG are
required to improve brain-computer interfaces (BCI). EEG signal analysis and classification is one of the prominent research areas in the field of BCI. In recent times, machine learning approaches have been used for the classification of EEG signals. This approach consists of a chain of activities, namely data acquisition, feature extraction, feature selection, and feature classification. In order to extract the features representing the EEG
signals, spectral analysis of the EEG signals was performed using three model-based methods: the Burg autoregressive (AR), Yule–Walker, and least squares modified Yule–Walker autoregressive methods. It is easy to collect EEG signals from healthy volunteers, whereas it is difficult to collect a sizable number of EEG recordings from subjects with a particular disorder. Hence it is essential to know the minimum number of samples
required to train the classifier. The objective of this paper is to determine the minimum
number of signals required for training the classifier in machine learning approach to
EEG signal analysis so as to get the best classification accuracy. A data set consisting of 500 EEG signals of five different types was considered. A method for the determination of the minimum sample size using power analysis has been proposed, and the results are validated using the J48 classifier (a Java implementation of the C4.5 algorithm).
Keywords: Autoregressive method; EEG signals; Machine learning approach; Power analysis; Sample size.

1. Introduction
The study of brain-computer interfaces (BCI) has mainly involved the recording of electroencephalographic signals. The electroencephalogram (EEG), a highly complex signal, is one of the most common sources of information used to study brain function and
neurological diseases and has great potential for the diagnosis and treatment of mental
diseases and abnormalities. In clinical contexts, EEG refers to the recording of the brain’s
spontaneous electrical activity over a short period of time, usually 20-40 minutes, as
recorded from multiple electrodes placed on the scalp. Also EEG signals are widely used
in intensive care units for brain function monitoring. Hence the classification of EEG signals has become very essential. A growing number of techniques are being used to analyze the EEG signal and extract useful information from it. A machine learning approach may be considered as a candidate for analyzing EEG signals. The publicly available dataset described in [1] was used for the present study. This data set consists of
five sets denoted by (A, B, C, D and E), each containing 100 single-channel EEG signals.
Each signal has been selected after visual inspection for artifacts and has passed a weak
stationarity criterion. Sets A and B have been taken from surface EEG recordings of five
healthy volunteers with eyes open and eyes closed, respectively. Signals in two sets have been measured in
seizure-free intervals from five patients in the epileptogenic zone (D) and from
hippocampal formation of the opposite hemisphere of the brain (C). Set E contains
seizure activity, selected from all recording sites exhibiting ictal activity. Sets A and B
have been recorded extracranially, whereas sets C, D and E have been recorded intracranially. Only a short description of the data set is presented here; the reader is referred to [1] for further details.
        Much work has been performed in classification of EEG signals by various
methods namely, linear and non linear methods [2], support vector machine employing
model based method [3-4], artificial neural network method [5-6], Hidden Markov model
[7-8], and a non-iterative method based on the ambiguity function [9]. The cited works searched for better features and classifiers and reported the results. However, little attention has been given to the number of EEG signals to be used to train the classifier. In
the present study, a statistical method is proposed to find the required minimum number
of samples to be trained so as to get good classification accuracy with statistical stability.
        Many works on minimum sample size determination have been reported in the
field of bioinformatics and other clinical studies, to name a few, microarray data [10], cDNA arrays [11], transcription level [12], etc. Based on these works, data-driven hypotheses could be developed, which in turn would further EEG signal analysis research. However, no appropriate guideline has been proposed so far for choosing the minimum sample size of EEG signals for classification using a machine learning approach. Hence, one has to resort to
some thumb rules which lack mathematical reasoning or blindly follow some previous
work as basis for fixing the sample size. This is the fundamental motivation for taking up
this study.
        In this paper, F-test based statistical power analysis is used to find the minimum
number of samples required to separate the classes with statistical stability. The minimum
sample size is also determined using an entropy based algorithm called ‘J48’ algorithm.
The results are compared and sample size guidelines are presented for EEG signals
classification in the conclusion section.

2. Feature Extraction and Feature Selection
Feature extraction is performed through three techniques, namely the Burg AR, Yule–Walker, and least squares modified Yule–Walker methods. Three features were obtained from each technique, and thus nine features are available in total. Following the footsteps of [69], one can perform the feature selection using the J48 decision tree algorithm to reduce the dimension of the dataset. In the present study, the five most important features were selected; they are f2, f4, f6, f7 and f9. For the rest of the study, only these five AR features were considered.
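
As a rough illustration of this step, the sketch below estimates a few autoregressive coefficients per EEG segment with a Yule–Walker fit in Python. The model order, the placeholder data and the array names are assumptions made for illustration; they do not reproduce the authors' exact Burg or modified Yule–Walker settings, nor the specific features f2 to f9.

import numpy as np
from scipy.linalg import solve_toeplitz

def ar_features(signal, order=3):
    # Estimate AR(order) coefficients of one EEG segment via the Yule-Walker equations
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Biased autocorrelation estimates r[0..order]
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
    # Solve the Toeplitz system R a = r[1:] for the AR coefficients
    return solve_toeplitz((r[:-1], r[:-1]), r[1:])

# Placeholder data: 500 EEG segments, labels coded 0..4 for sets A..E
eeg = np.random.randn(500, 4097)
labels = np.repeat(np.arange(5), 100)
features = np.vstack([ar_features(s) for s in eeg])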

3. Determination of Minimum Sample Size
This study uses a set of Burg AR, Yule–Walker, and modified Yule–Walker features extracted from the EEG signals, instead of using the EEG signals directly. The selected set of five features, along with their class labels, forms the data set used to determine the minimum sample size using power analysis. The method of performing power analysis is discussed in section 4. The results of the power analysis were verified with the help of a functional test, namely the J48 algorithm, which is used as a classifier with ten-fold cross validation to validate the results. The results are presented and discussed in section 6.

4. Power analysis
Sample size has a great influence in any experimental study, because the result of an
experiment is based on the sample size. Power analysis has been used in many
applications [13-15]. It is based on two measures of statistical reliability in the hypothesis test, namely the confidence level (1-α) and the power (1-β). The test compares the null hypothesis (H0) against the alternative hypothesis (H1). The null hypothesis states that the means of the classes are the same, whereas the alternative hypothesis states that the means of the classes are not the same. In a hypothesis test, the probability of accepting the null hypothesis when it is true is the confidence level, and the probability of accepting the alternative hypothesis when it is true is the power of the test [10]. The estimation of sample size in power analysis is done such that the
confidence and the power (statistical reliability measures) in hypothesis test could reach
predefined values. The confidence level and the power are calculated from the
distributions of the null hypothesis and alternative hypothesis. In case of multi-class
problem (number of classes greater than two), instead of t-statistic, the F-statistic measure
derived from Pillai’s V formula [16] is used for the estimation of sample size. Pillai’s V
is the trace of the matrix defined by the ratio of between-group variance (B) to total
variance (T). It is a statistical measure often used in multivariate analysis of variance
(MANOVA) [16]. Pillai's V trace is given by

    V = \mathrm{trace}(B T^{-1}) = \sum_{i=1}^{h} \frac{\lambda_i}{\lambda_i + 1},

where λ_i is the ith eigenvalue of W^{-1}B, W is the within-group variance, and h is the number of factors being considered in MANOVA, defined by h = c - 1. A high Pillai's V means a high amount of separation between the samples of the classes, with the between-group variance being relatively large compared to the total variance. The hypothesis test can be designed as follows using the F statistic transformed from Pillai's V.

    H_0: \mu_1 = \mu_2 = \dots = \mu_c;
    H_1: \text{there exist } i, j \text{ such that } \mu_i - \mu_j \neq 0.

Under H_0:

    F = \frac{(V/s)/(ph)}{(1 - V/s)/[s(N - c - p + s)]} \sim F\left(ph,\; s(N - c - p + s)\right),

and under H_1:

    F = \frac{(V/s)/(ph)}{(1 - V/s)/[s(N - c - p + s)]} \sim F\left(ph,\; s(N - c - p + s),\; \Delta = s \Delta_e N\right),

with \Delta_e = V_{crit} / (s - V_{crit}), where p and c are the number of variables and the number of classes, respectively, and s is defined as min(p, h). By using these defined distributions of H0 and
H1, the confidence level and the power could be calculated for a given sample size and
effect size. The same procedure used for the two-class problem is used here to estimate the minimum sample size for statistical stability, whereby the sample size is increased until the calculated power reaches the predefined threshold of 1-β.
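
A minimal sketch of this iterate-until-power-is-reached idea is given below. It treats the test as a simple fixed-effects F test with noncentrality λ = f²N, so it is only an approximation of the MANOVA repeated-measures computation the authors used; the effect size, numerator degrees of freedom and group count are taken from Table 1, while the simplified denominator degrees of freedom are an assumption, so the numbers it produces are illustrative only.

from scipy.stats import f, ncf

def power_for_n(n_total, effect_f, num_df, groups, alpha=0.05):
    # Power of an F test whose noncentrality grows linearly with the total sample size
    lam = effect_f ** 2 * n_total
    den_df = n_total - groups - num_df + 1   # simplified bookkeeping, not the exact MANOVA formula
    if den_df <= 0:
        return 0.0
    f_crit = f.ppf(1.0 - alpha, num_df, den_df)
    return 1.0 - ncf.cdf(f_crit, num_df, den_df, lam)

def minimum_sample_size(effect_f, num_df, groups, alpha=0.05, target_power=0.95):
    # Increase N until the calculated power reaches the predefined threshold 1 - beta
    n = 2 * groups
    while power_for_n(n, effect_f, num_df, groups, alpha) < target_power:
        n += 1
    return n

# e.g. with the Table 1 inputs: effect size f(V) = 0.3046038, 16 numerator df, 5 groups
print(minimum_sample_size(0.3046038, 16, 5, alpha=0.05, target_power=0.95))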

5. J48 algorithm
The classification is done through a decision tree whose leaves represent the different brain conditions of human beings. The sequential branching process ending in the leaves is based on conditional probabilities associated with individual features. It is reported that the C4.5 model introduced by J.R. Quinlan has good predictive accuracy, good speed, minimum computational cost and a good level of interpretability. The decision tree algorithm (C4.5) has two phases: building and pruning. In the building phase, the construction of the decision tree depends very much on how a test attribute X is selected. C4.5 uses an information entropy evaluation function as the selection criterion. Gain Ratio(X) compensates for the weak point of Gain(X), which represents the quantity of information provided by X in the training set. Therefore, the attribute with the highest Gain Ratio(X) is taken as the root of the decision tree. In the pruning phase, the C4.5 algorithm uses an error-based post-pruning strategy to deal with the over-training problem. For each classification node, C4.5 calculates a predicted error rate based on the total aggregate of misclassifications at that particular node. The error-based pruning technique essentially reduces to the replacement of large sub-trees in the classification structure by singleton nodes or simple branch collections, if these actions contribute to a drop in the overall error rate of the root node.
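
For a hedged illustration of how such a classifier can be validated with ten-fold cross validation, the sketch below uses scikit-learn's entropy-criterion decision tree on placeholder feature data. Note that scikit-learn implements CART rather than the Weka J48 (C4.5) classifier used by the authors, so its splitting and pruning behaviour differs; it is a stand-in only.

import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

# Placeholder stand-ins for the selected AR features and the five class labels
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 5))
labels = np.repeat(np.arange(5), 100)

# Entropy-based tree as a rough stand-in for J48 (C4.5)
clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
scores = cross_val_score(clf, features, labels, cv=10)
print("10-fold CV accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))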


6. Results and Discussion
In the present study, F-test based power analysis was used to determine the required
sample size of the EEG signals to train the classifier. An a priori sample size computation for multivariate analysis of variance (MANOVA) with repeated measures and within-between interactions was performed on the data set. The expected power level was set to 95% (corresponding to α = 0.05 and β = 0.05). As a
basis for this problem, a data set consisting of 100 samples from each class was
considered. The data set was tested for normality, homogeneity and independence in
standard error. The central and non-central distributions with critical F value of the result
are shown in Fig. 1. While calculating the sample size in power analysis, an important parameter, namely Pillai's V, was found to be 0.3396. Power analysis was applied and the sample size was found for various sets of confidence and power levels. It was found that 81 samples are required to train the classifier so as to get good statistical stability with a power level (1-β) of 95%. A series of experiments was carried out to calculate the sample sizes for power levels of 90%, 85%, 80% and 75%, and the results are tabulated in Table 1. The results of these experiments are presented as curves in Fig. 2.
In another set of experiments, the sample size was computed for various power levels with an α error probability of 5%, and it was again found to be 16 per class. The experiment was repeated with α error probabilities of 10%, 20%, and so on up to 40%. The values in Table 1 and the graph in Fig. 2 would enable one working in this field to fix the sample size for training the classifier. The required total sample size could be viewed as a function of effect size for various α error probabilities (0.05 to 0.4 in steps of 0.05) and power levels 1-β (0.95 to 0.6 in steps of 0.05). To study the influence of the effect size on the required sample size, a set of experiments was carried out; it was found that the required sample size decreases as the effect size and the α error probability increase, whereas it increases with the required power level for a given effect size. Hence it is clear that the effect size is also an important
parameter while computing the sample size. Alternatively, one may be interested in
knowing the α error probability as a function of effect size for various power levels. For a given power level, as the α error probability increases the detectable effect size decreases. For a given α error probability, as the power level increases the detectable effect size also increases. It is to be noted that these effects are for a fixed sample size (81 in this case).
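
As an aside, the normality and homogeneity checks mentioned above could be carried out, for example, with Shapiro-Wilk and Levene tests; the paper does not state which tests were actually used, so the sketch below is only an illustrative choice applied to placeholder data.

import numpy as np
from scipy.stats import shapiro, levene

rng = np.random.default_rng(1)
features = rng.normal(size=(500, 5))   # placeholder stand-in for the selected AR features
labels = np.repeat(np.arange(5), 100)  # class codes 0..4 for sets A..E

for j in range(features.shape[1]):
    groups = [features[labels == c, j] for c in np.unique(labels)]
    norm_p = min(shapiro(g).pvalue for g in groups)   # worst-case within-class normality
    homo_p = levene(*groups).pvalue                   # homogeneity of variance across classes
    print(f"feature {j}: min Shapiro p = {norm_p:.3f}, Levene p = {homo_p:.3f}")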
Table 1 Power analysis test results with effect size f(V) = 0.3046038, number of groups = 5, repetitions = 5, Pillai's V = 0.339623, numerator df = 16

Output parameter                   α=0.05,      α=0.10,      α=0.15,      α=0.20,      α=0.25,
                                   1-β=0.95     1-β=0.90     1-β=0.85     1-β=0.80     1-β=0.75
Noncentrality parameter λ          30.061846    21.8969      17.072159    13.360820    10.391749
Critical F                         1.676863     1.503896     1.394382     1.3312269    1.245602
Denominator df                     304          216          164          124          92
Total sample size for 5 classes    81           59           46           36           28
Sample size per class              ≈16          ≈12          ≈9           ≈7           ≈6
Actual power                       0.952907     0.903578     0.856716     0.804849     0.751608



        The results of power analysis are validated through a decision tree algorithm, namely the J48 algorithm. The classifier's job here is a two-fold process: feature selection and feature classification. To study the effect of the sample size on classification accuracy, the sample size was progressively reduced in decrements of 5 samples and the corresponding classification accuracies were noted. The variation in classification accuracy with respect to the sample size is shown in Fig. 3. One can observe that the classification accuracy falls abruptly when the sample size is reduced below 15 per class. This means that with just 16 samples per class it is possible to train a classifier; however, the objective of the study goes one step further: what is the minimum sample size that will have statistical stability? One can notice that the oscillation of the classification accuracy is larger when the sample size is low and tends to stabilize as the sample size increases. When the sample size reaches 100, the oscillation is very small. The root mean squared error and mean absolute error of the classifier as functions of sample size are plotted in Fig. 4 and Fig. 5, respectively.
        One should note that the maximum classification accuracy of the J48 algorithm for the 100-samples-per-class data set was found to be 80%. This implies that the reduction in classification accuracy is not due to variation in the input data; it may be due to the limited ability of the classifier for the given data set. For computing the relative change in classification accuracy, 85% was used as the reference value. For comparing the results obtained from power analysis and the J48 algorithm, a representative value from Table 1 corresponding to a 15% α error probability was taken. As per the power analysis result in Table 1, if one can accommodate a 15% α error probability and is willing to accept an 85% power level, then for the given data set the minimum required sample size is 9 per class. This means that if 9 samples were used for training the classifier, the maximum α error probability likely to occur would be 15%. This has to be validated against the J48 classifier's classification results. From Fig. 5, it is evident that the mean absolute error is below 15% for cases whose sample size is greater than or equal to 10. The corresponding root mean squared error is shown in Fig. 4.
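
A sketch of this validation sweep is shown below: the number of training samples per class is progressively reduced in steps of 5 and the cross-validated accuracy and misclassification rate are recorded. The subsampling scheme, the use of scikit-learn's entropy-criterion tree in place of Weka's J48, and the misclassification rate as a rough proxy for the reported mean absolute error are all assumptions made for illustration.

import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(2)
features = rng.normal(size=(500, 5))   # placeholder AR features
labels = np.repeat(np.arange(5), 100)  # 100 signals per class

for per_class in range(100, 4, -5):    # 100, 95, ..., 5 samples per class
    idx = np.hstack([rng.choice(np.where(labels == c)[0], per_class, replace=False)
                     for c in range(5)])
    X, y = features[idx], labels[idx]
    clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
    folds = min(10, per_class)         # keep every class represented in each fold
    pred = cross_val_predict(clf, X, y, cv=folds)
    acc = np.mean(pred == y)
    err = np.mean(pred != y)           # misclassification rate as a rough error proxy
    print(f"{per_class:3d} per class: accuracy = {acc:.3f}, error = {err:.3f}")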






Fig. 1 F test (MANOVA, repeated measures, within-between interaction): sample size calculation with effect size f(V) = 0.3046038, α error probability = 0.05, power (1-β) = 0.95, number of groups = 5, repetitions = 5, using Pillai's V formula




Fig. 2 Total sample size as a function of α error probability for various power levels
Fig. 3 Classification accuracy as a function of sample size



Fig. 4 Root mean squared error as a function of sample size
Fig. 5 Mean absolute error as a function of sample size

        In a classification problem, the mean absolute error of the classifier is a measure of the type I error (α error probability). The type I error here is the error due to misclassification by the classifier; in general, a type I error is rejecting a null hypothesis when it is true. The α error probability is a measure of the type I error in hypothesis testing, and hence the equivalence is apparent. From the above discussion, the results of power analysis hold and the actual error did not exceed the upper bound (15%) found in power analysis. A similar exercise of validating the results at other points also confirms the validity of the power analysis test. Thus one can confidently use the sample size suggested by power analysis for the machine learning approach to classify EEG signals.
7. Conclusion
        In the present study, a machine learning approach was applied to classify the EEG signals of a data set consisting of 500 EEG signals of five different types. A statistical method for the determination of the minimum sample size using power analysis has been proposed and the results were validated using a functional test, the J48 algorithm. From the results and discussion, it is evident that 16 samples per class are required in order to have a power level of 95% and an α error probability of 5%. For other paired values of power level and α error probability, the required minimum sample sizes are tabulated in Table 1. The effects of the α error probability and the power level on the sample size were also discussed in detail. These tables and graphs would serve as a guideline for researchers working with EEG signals to fix the sample size in a machine learning approach.

8. References
[1]. Andrzejak, R.G., Lehnertz, K., Mormann, F., Rieke, C., David, P., Elger, C.E., 2001, 'Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state', Physical Review E, 64, 061907.
[2]. Deon Garrett, David A. Peterson, Charles W. Anderson, Michael H. Thaut, 2003, 'Comparison of linear and nonlinear methods for EEG signal classification', IEEE Transactions on Neural Systems and Rehabilitation Engineering, 11, pp. 141-144.
[3]. Derya Ubeyli, E., 2010, 'Least squares support vector machine employing model-based methods coefficients for analysis of EEG signals', Expert Systems with Applications, 37, pp. 233-239.
[4]. Shiliang Sun, Changshui Zhang, 2006, 'Adaptive feature extraction for EEG signal classification', Medical & Biological Engineering & Computing, 44, pp. 931-935.
[5]. Vijay Khare, Jayashree Santhosh and Sneh Anand, 2008, 'Classification of EEG signals based on neural network to discriminate five mental states', Proceedings of SPIT-IEEE Colloquium and International Conference, 11, Feb. 4-5, Mumbai, India.
[6]. Alexandre Frédéric, Bedoui Mohamed Hedi, Kerkeni Nizar, Ben Khalifa Khaled, 2006, 'Supervised neuronal approaches for EEG signal classification: experimental studies', The 10th IASTED International Conference on Artificial Intelligence and Soft Computing (ASC), Aug. 29, Palma de Mallorca, Spain.
[7]. Jakub Stastny, Pavel Sovka, Andrej Stancak, 2003, 'EEG signal classification: Introduction to the problem', Radioengineering, 12, pp. 51-55.
[8]. Stastny, J., Sovka, P., Stancak, A., 2002, 'EEG signal classification and segmentation by means of hidden Markov models', Biosignal, pp. 415-417.
[9]. Jorge Baztarrica Ochoa, 2002, 'EEG signal classification for brain computer interface applications', Ecole Polytechnique Federale de Lausanne, March 28.
[10]. Daehee Hwang, William A. Schmitt, George Stephanopoulos and Gregory Stephanopoulos, 2002, 'Determination of sample size and discriminatory expression patterns in microarray data', Bioinformatics, 18, pp. 1184-1193.
[11]. Schena, M., Shalon, D., Davis, R.W. and Brown, P.O., 1995, 'Quantitative monitoring of gene-expression patterns with a complementary-DNA microarray', Science, 270, pp. 467-470.
[12]. Lockhart, D.J., et al., 1996, 'Expression monitoring by hybridization to high density oligonucleotide arrays', Nature Biotechnology, 14, pp. 1675-1680.
[13]. Kraemer, H.C. and Thiemann, S., 1987, 'How Many Subjects?', Sage, Newbury Park, CA.
[14]. Mace, A.E., 1974, 'Sample Size Determination', Krieger, Huntington, NY.
[15]. Pillai, K.C.S. and Mijares, T.A., 1959, 'On the moments of the trace of a matrix and approximations to its distribution', Annals of Mathematical Statistics, 30, pp. 1135-1140.
[16]. Cohen, J., 1969, 'Statistical Power Analysis for the Behavioral Sciences', Academic Press, New York.





More Related Content

What's hot

Improved feature exctraction process to detect seizure using CHBMIT-dataset ...
Improved feature exctraction process to detect seizure  using CHBMIT-dataset ...Improved feature exctraction process to detect seizure  using CHBMIT-dataset ...
Improved feature exctraction process to detect seizure using CHBMIT-dataset ...IJECEIAES
 
Hybrid System of Tiered Multivariate Analysis and Artificial Neural Network f...
Hybrid System of Tiered Multivariate Analysis and Artificial Neural Network f...Hybrid System of Tiered Multivariate Analysis and Artificial Neural Network f...
Hybrid System of Tiered Multivariate Analysis and Artificial Neural Network f...IJECEIAES
 
Segmentation
SegmentationSegmentation
SegmentationKavitha A
 
rob 537 final paper(fourth modify)
rob 537 final paper(fourth modify)rob 537 final paper(fourth modify)
rob 537 final paper(fourth modify)Huanchi Cao
 
IRJET- Analysis of Autism Spectrum Disorder using Deep Learning and the Abide...
IRJET- Analysis of Autism Spectrum Disorder using Deep Learning and the Abide...IRJET- Analysis of Autism Spectrum Disorder using Deep Learning and the Abide...
IRJET- Analysis of Autism Spectrum Disorder using Deep Learning and the Abide...IRJET Journal
 
IRJET- Depression Prediction System using Different Methods
IRJET- Depression Prediction System using Different MethodsIRJET- Depression Prediction System using Different Methods
IRJET- Depression Prediction System using Different MethodsIRJET Journal
 
A BINARY BAT INSPIRED ALGORITHM FOR THE CLASSIFICATION OF BREAST CANCER DATA
A BINARY BAT INSPIRED ALGORITHM FOR THE CLASSIFICATION OF BREAST CANCER DATAA BINARY BAT INSPIRED ALGORITHM FOR THE CLASSIFICATION OF BREAST CANCER DATA
A BINARY BAT INSPIRED ALGORITHM FOR THE CLASSIFICATION OF BREAST CANCER DATAIJSCAI Journal
 
Feature Extraction and Classification of NIRS Data
Feature Extraction and Classification of NIRS DataFeature Extraction and Classification of NIRS Data
Feature Extraction and Classification of NIRS DataPritam Mondal
 
Review on classification based on artificial
Review on classification based on artificialReview on classification based on artificial
Review on classification based on artificialijasa
 
Mri brain tumour detection by histogram and segmentation
Mri brain tumour detection by histogram and segmentationMri brain tumour detection by histogram and segmentation
Mri brain tumour detection by histogram and segmentationiaemedu
 
AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...
AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...
AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...ijsc
 
Recognition of Epilepsy from Non Seizure Electroencephalogram using combinati...
Recognition of Epilepsy from Non Seizure Electroencephalogram using combinati...Recognition of Epilepsy from Non Seizure Electroencephalogram using combinati...
Recognition of Epilepsy from Non Seizure Electroencephalogram using combinati...Atrija Singh
 
SUITABLE MOTHER WAVELET SELECTION FOR EEG SIGNALS ANALYSIS: FREQUENCY BANDS D...
SUITABLE MOTHER WAVELET SELECTION FOR EEG SIGNALS ANALYSIS: FREQUENCY BANDS D...SUITABLE MOTHER WAVELET SELECTION FOR EEG SIGNALS ANALYSIS: FREQUENCY BANDS D...
SUITABLE MOTHER WAVELET SELECTION FOR EEG SIGNALS ANALYSIS: FREQUENCY BANDS D...sipij
 
A Binary Bat Inspired Algorithm for the Classification of Breast Cancer Data
A Binary Bat Inspired Algorithm for the Classification of Breast Cancer Data A Binary Bat Inspired Algorithm for the Classification of Breast Cancer Data
A Binary Bat Inspired Algorithm for the Classification of Breast Cancer Data ijscai
 

What's hot (16)

Improved feature exctraction process to detect seizure using CHBMIT-dataset ...
Improved feature exctraction process to detect seizure  using CHBMIT-dataset ...Improved feature exctraction process to detect seizure  using CHBMIT-dataset ...
Improved feature exctraction process to detect seizure using CHBMIT-dataset ...
 
Hybrid System of Tiered Multivariate Analysis and Artificial Neural Network f...
Hybrid System of Tiered Multivariate Analysis and Artificial Neural Network f...Hybrid System of Tiered Multivariate Analysis and Artificial Neural Network f...
Hybrid System of Tiered Multivariate Analysis and Artificial Neural Network f...
 
Segmentation
SegmentationSegmentation
Segmentation
 
rob 537 final paper(fourth modify)
rob 537 final paper(fourth modify)rob 537 final paper(fourth modify)
rob 537 final paper(fourth modify)
 
IRJET- Analysis of Autism Spectrum Disorder using Deep Learning and the Abide...
IRJET- Analysis of Autism Spectrum Disorder using Deep Learning and the Abide...IRJET- Analysis of Autism Spectrum Disorder using Deep Learning and the Abide...
IRJET- Analysis of Autism Spectrum Disorder using Deep Learning and the Abide...
 
IRJET- Depression Prediction System using Different Methods
IRJET- Depression Prediction System using Different MethodsIRJET- Depression Prediction System using Different Methods
IRJET- Depression Prediction System using Different Methods
 
A BINARY BAT INSPIRED ALGORITHM FOR THE CLASSIFICATION OF BREAST CANCER DATA
A BINARY BAT INSPIRED ALGORITHM FOR THE CLASSIFICATION OF BREAST CANCER DATAA BINARY BAT INSPIRED ALGORITHM FOR THE CLASSIFICATION OF BREAST CANCER DATA
A BINARY BAT INSPIRED ALGORITHM FOR THE CLASSIFICATION OF BREAST CANCER DATA
 
Feature Extraction and Classification of NIRS Data
Feature Extraction and Classification of NIRS DataFeature Extraction and Classification of NIRS Data
Feature Extraction and Classification of NIRS Data
 
SAM 40
SAM 40SAM 40
SAM 40
 
Review on classification based on artificial
Review on classification based on artificialReview on classification based on artificial
Review on classification based on artificial
 
Mri brain tumour detection by histogram and segmentation
Mri brain tumour detection by histogram and segmentationMri brain tumour detection by histogram and segmentation
Mri brain tumour detection by histogram and segmentation
 
AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...
AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...
AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...
 
Recognition of Epilepsy from Non Seizure Electroencephalogram using combinati...
Recognition of Epilepsy from Non Seizure Electroencephalogram using combinati...Recognition of Epilepsy from Non Seizure Electroencephalogram using combinati...
Recognition of Epilepsy from Non Seizure Electroencephalogram using combinati...
 
SUITABLE MOTHER WAVELET SELECTION FOR EEG SIGNALS ANALYSIS: FREQUENCY BANDS D...
SUITABLE MOTHER WAVELET SELECTION FOR EEG SIGNALS ANALYSIS: FREQUENCY BANDS D...SUITABLE MOTHER WAVELET SELECTION FOR EEG SIGNALS ANALYSIS: FREQUENCY BANDS D...
SUITABLE MOTHER WAVELET SELECTION FOR EEG SIGNALS ANALYSIS: FREQUENCY BANDS D...
 
I365358
I365358I365358
I365358
 
A Binary Bat Inspired Algorithm for the Classification of Breast Cancer Data
A Binary Bat Inspired Algorithm for the Classification of Breast Cancer Data A Binary Bat Inspired Algorithm for the Classification of Breast Cancer Data
A Binary Bat Inspired Algorithm for the Classification of Breast Cancer Data
 

Similar to Sample size determination for classification of eeg signals using power analysis in machine learning approach

Survey analysis for optimization algorithms applied to electroencephalogram
Survey analysis for optimization algorithms applied to electroencephalogramSurvey analysis for optimization algorithms applied to electroencephalogram
Survey analysis for optimization algorithms applied to electroencephalogramIJECEIAES
 
CLASSIFICATION OF ELECTROENCEPHALOGRAM SIGNALS USING XGBOOST ALGORITHM AND SU...
CLASSIFICATION OF ELECTROENCEPHALOGRAM SIGNALS USING XGBOOST ALGORITHM AND SU...CLASSIFICATION OF ELECTROENCEPHALOGRAM SIGNALS USING XGBOOST ALGORITHM AND SU...
CLASSIFICATION OF ELECTROENCEPHALOGRAM SIGNALS USING XGBOOST ALGORITHM AND SU...IRJET Journal
 
Health electroencephalogram epileptic classification based on Hilbert probabi...
Health electroencephalogram epileptic classification based on Hilbert probabi...Health electroencephalogram epileptic classification based on Hilbert probabi...
Health electroencephalogram epileptic classification based on Hilbert probabi...IJECEIAES
 
A Comparative Study of Machine Learning Algorithms for EEG Signal Classification
A Comparative Study of Machine Learning Algorithms for EEG Signal ClassificationA Comparative Study of Machine Learning Algorithms for EEG Signal Classification
A Comparative Study of Machine Learning Algorithms for EEG Signal Classificationsipij
 
A COMPARATIVE STUDY OF MACHINE LEARNING ALGORITHMS FOR EEG SIGNAL CLASSIFICATION
A COMPARATIVE STUDY OF MACHINE LEARNING ALGORITHMS FOR EEG SIGNAL CLASSIFICATIONA COMPARATIVE STUDY OF MACHINE LEARNING ALGORITHMS FOR EEG SIGNAL CLASSIFICATION
A COMPARATIVE STUDY OF MACHINE LEARNING ALGORITHMS FOR EEG SIGNAL CLASSIFICATIONsipij
 
A Comparative Study of Machine Learning Algorithms for EEG Signal Classification
A Comparative Study of Machine Learning Algorithms for EEG Signal ClassificationA Comparative Study of Machine Learning Algorithms for EEG Signal Classification
A Comparative Study of Machine Learning Algorithms for EEG Signal Classificationsipij
 
A Comparative Study of Machine Learning Algorithms for EEG Signal Classification
A Comparative Study of Machine Learning Algorithms for EEG Signal ClassificationA Comparative Study of Machine Learning Algorithms for EEG Signal Classification
A Comparative Study of Machine Learning Algorithms for EEG Signal Classificationsipij
 
IRJET- Machine Learning Approach for Emotion Recognition using EEG Signals
IRJET-  	  Machine Learning Approach for Emotion Recognition using EEG SignalsIRJET-  	  Machine Learning Approach for Emotion Recognition using EEG Signals
IRJET- Machine Learning Approach for Emotion Recognition using EEG SignalsIRJET Journal
 
The second seminar
The second seminarThe second seminar
The second seminarAhmedMahany
 
IRJET-A Survey on Effect of Meditation on Attention Level Using EEG
IRJET-A Survey on Effect of Meditation on Attention Level Using EEGIRJET-A Survey on Effect of Meditation on Attention Level Using EEG
IRJET-A Survey on Effect of Meditation on Attention Level Using EEGIRJET Journal
 
METHODS FOR IMPROVING THE CLASSIFICATION ACCURACY OF BIOMEDICAL SIGNALS BASED...
METHODS FOR IMPROVING THE CLASSIFICATION ACCURACY OF BIOMEDICAL SIGNALS BASED...METHODS FOR IMPROVING THE CLASSIFICATION ACCURACY OF BIOMEDICAL SIGNALS BASED...
METHODS FOR IMPROVING THE CLASSIFICATION ACCURACY OF BIOMEDICAL SIGNALS BASED...IAEME Publication
 
Effective electroencephalogram based epileptic seizure detection using suppo...
Effective electroencephalogram based epileptic seizure detection  using suppo...Effective electroencephalogram based epileptic seizure detection  using suppo...
Effective electroencephalogram based epileptic seizure detection using suppo...IJECEIAES
 
Efficient electro encephelogram classification system using support vector ma...
Efficient electro encephelogram classification system using support vector ma...Efficient electro encephelogram classification system using support vector ma...
Efficient electro encephelogram classification system using support vector ma...nooriasukmaningtyas
 
Wavelet-Based Approach for Automatic Seizure Detection Using EEG Signals
Wavelet-Based Approach for Automatic Seizure Detection Using EEG SignalsWavelet-Based Approach for Automatic Seizure Detection Using EEG Signals
Wavelet-Based Approach for Automatic Seizure Detection Using EEG SignalsIRJET Journal
 
Classification of physiological diseases using eeg signals and machine learni...
Classification of physiological diseases using eeg signals and machine learni...Classification of physiological diseases using eeg signals and machine learni...
Classification of physiological diseases using eeg signals and machine learni...eSAT Journals
 
EEG Classification using Semi Supervised Learning
EEG Classification using Semi Supervised LearningEEG Classification using Semi Supervised Learning
EEG Classification using Semi Supervised Learningijtsrd
 
5. detection and separation of eeg artifacts using wavelet transform nov 11, ...
5. detection and separation of eeg artifacts using wavelet transform nov 11, ...5. detection and separation of eeg artifacts using wavelet transform nov 11, ...
5. detection and separation of eeg artifacts using wavelet transform nov 11, ...IAESIJEECS
 
Model of Differential Equation for Genetic Algorithm with Neural Network (GAN...
Model of Differential Equation for Genetic Algorithm with Neural Network (GAN...Model of Differential Equation for Genetic Algorithm with Neural Network (GAN...
Model of Differential Equation for Genetic Algorithm with Neural Network (GAN...Sarvesh Kumar
 

Similar to Sample size determination for classification of eeg signals using power analysis in machine learning approach (20)

Survey analysis for optimization algorithms applied to electroencephalogram
Survey analysis for optimization algorithms applied to electroencephalogramSurvey analysis for optimization algorithms applied to electroencephalogram
Survey analysis for optimization algorithms applied to electroencephalogram
 
CLASSIFICATION OF ELECTROENCEPHALOGRAM SIGNALS USING XGBOOST ALGORITHM AND SU...
CLASSIFICATION OF ELECTROENCEPHALOGRAM SIGNALS USING XGBOOST ALGORITHM AND SU...CLASSIFICATION OF ELECTROENCEPHALOGRAM SIGNALS USING XGBOOST ALGORITHM AND SU...
CLASSIFICATION OF ELECTROENCEPHALOGRAM SIGNALS USING XGBOOST ALGORITHM AND SU...
 
Health electroencephalogram epileptic classification based on Hilbert probabi...
Health electroencephalogram epileptic classification based on Hilbert probabi...Health electroencephalogram epileptic classification based on Hilbert probabi...
Health electroencephalogram epileptic classification based on Hilbert probabi...
 
A Comparative Study of Machine Learning Algorithms for EEG Signal Classification
A Comparative Study of Machine Learning Algorithms for EEG Signal ClassificationA Comparative Study of Machine Learning Algorithms for EEG Signal Classification
A Comparative Study of Machine Learning Algorithms for EEG Signal Classification
 
A COMPARATIVE STUDY OF MACHINE LEARNING ALGORITHMS FOR EEG SIGNAL CLASSIFICATION
A COMPARATIVE STUDY OF MACHINE LEARNING ALGORITHMS FOR EEG SIGNAL CLASSIFICATIONA COMPARATIVE STUDY OF MACHINE LEARNING ALGORITHMS FOR EEG SIGNAL CLASSIFICATION
A COMPARATIVE STUDY OF MACHINE LEARNING ALGORITHMS FOR EEG SIGNAL CLASSIFICATION
 
A Comparative Study of Machine Learning Algorithms for EEG Signal Classification
A Comparative Study of Machine Learning Algorithms for EEG Signal ClassificationA Comparative Study of Machine Learning Algorithms for EEG Signal Classification
A Comparative Study of Machine Learning Algorithms for EEG Signal Classification
 
A Comparative Study of Machine Learning Algorithms for EEG Signal Classification
A Comparative Study of Machine Learning Algorithms for EEG Signal ClassificationA Comparative Study of Machine Learning Algorithms for EEG Signal Classification
A Comparative Study of Machine Learning Algorithms for EEG Signal Classification
 
IRJET- Machine Learning Approach for Emotion Recognition using EEG Signals
IRJET-  	  Machine Learning Approach for Emotion Recognition using EEG SignalsIRJET-  	  Machine Learning Approach for Emotion Recognition using EEG Signals
IRJET- Machine Learning Approach for Emotion Recognition using EEG Signals
 
The second seminar
The second seminarThe second seminar
The second seminar
 
IRJET-A Survey on Effect of Meditation on Attention Level Using EEG
IRJET-A Survey on Effect of Meditation on Attention Level Using EEGIRJET-A Survey on Effect of Meditation on Attention Level Using EEG
IRJET-A Survey on Effect of Meditation on Attention Level Using EEG
 
METHODS FOR IMPROVING THE CLASSIFICATION ACCURACY OF BIOMEDICAL SIGNALS BASED...
METHODS FOR IMPROVING THE CLASSIFICATION ACCURACY OF BIOMEDICAL SIGNALS BASED...METHODS FOR IMPROVING THE CLASSIFICATION ACCURACY OF BIOMEDICAL SIGNALS BASED...
METHODS FOR IMPROVING THE CLASSIFICATION ACCURACY OF BIOMEDICAL SIGNALS BASED...
 
Effective electroencephalogram based epileptic seizure detection using suppo...
Effective electroencephalogram based epileptic seizure detection  using suppo...Effective electroencephalogram based epileptic seizure detection  using suppo...
Effective electroencephalogram based epileptic seizure detection using suppo...
 
20120140506005
2012014050600520120140506005
20120140506005
 
Efficient electro encephelogram classification system using support vector ma...
Efficient electro encephelogram classification system using support vector ma...Efficient electro encephelogram classification system using support vector ma...
Efficient electro encephelogram classification system using support vector ma...
 
40120130405012
4012013040501240120130405012
40120130405012
 
Wavelet-Based Approach for Automatic Seizure Detection Using EEG Signals
Wavelet-Based Approach for Automatic Seizure Detection Using EEG SignalsWavelet-Based Approach for Automatic Seizure Detection Using EEG Signals
Wavelet-Based Approach for Automatic Seizure Detection Using EEG Signals
 
Classification of physiological diseases using eeg signals and machine learni...
Classification of physiological diseases using eeg signals and machine learni...Classification of physiological diseases using eeg signals and machine learni...
Classification of physiological diseases using eeg signals and machine learni...
 
EEG Classification using Semi Supervised Learning
EEG Classification using Semi Supervised LearningEEG Classification using Semi Supervised Learning
EEG Classification using Semi Supervised Learning
 
5. detection and separation of eeg artifacts using wavelet transform nov 11, ...
5. detection and separation of eeg artifacts using wavelet transform nov 11, ...5. detection and separation of eeg artifacts using wavelet transform nov 11, ...
5. detection and separation of eeg artifacts using wavelet transform nov 11, ...
 
Model of Differential Equation for Genetic Algorithm with Neural Network (GAN...
Model of Differential Equation for Genetic Algorithm with Neural Network (GAN...Model of Differential Equation for Genetic Algorithm with Neural Network (GAN...
Model of Differential Equation for Genetic Algorithm with Neural Network (GAN...
 

More from iaemedu

Tech transfer making it as a risk free approach in pharmaceutical and biotech in
Tech transfer making it as a risk free approach in pharmaceutical and biotech inTech transfer making it as a risk free approach in pharmaceutical and biotech in
Tech transfer making it as a risk free approach in pharmaceutical and biotech iniaemedu
 
Integration of feature sets with machine learning techniques
Integration of feature sets with machine learning techniquesIntegration of feature sets with machine learning techniques
Integration of feature sets with machine learning techniquesiaemedu
 
Effective broadcasting in mobile ad hoc networks using grid
Effective broadcasting in mobile ad hoc networks using gridEffective broadcasting in mobile ad hoc networks using grid
Effective broadcasting in mobile ad hoc networks using gridiaemedu
 
Effect of scenario environment on the performance of mane ts routing
Effect of scenario environment on the performance of mane ts routingEffect of scenario environment on the performance of mane ts routing
Effect of scenario environment on the performance of mane ts routingiaemedu
 
Adaptive job scheduling with load balancing for workflow application
Adaptive job scheduling with load balancing for workflow applicationAdaptive job scheduling with load balancing for workflow application
Adaptive job scheduling with load balancing for workflow applicationiaemedu
 
Survey on transaction reordering
Survey on transaction reorderingSurvey on transaction reordering
Survey on transaction reorderingiaemedu
 
Semantic web services and its challenges
Semantic web services and its challengesSemantic web services and its challenges
Semantic web services and its challengesiaemedu
 
Website based patent information searching mechanism
Website based patent information searching mechanismWebsite based patent information searching mechanism
Website based patent information searching mechanismiaemedu
 
Revisiting the experiment on detecting of replay and message modification
Revisiting the experiment on detecting of replay and message modificationRevisiting the experiment on detecting of replay and message modification
Revisiting the experiment on detecting of replay and message modificationiaemedu
 
Prediction of customer behavior using cma
Prediction of customer behavior using cmaPrediction of customer behavior using cma
Prediction of customer behavior using cmaiaemedu
 
Performance analysis of manet routing protocol in presence
Performance analysis of manet routing protocol in presencePerformance analysis of manet routing protocol in presence
Performance analysis of manet routing protocol in presenceiaemedu
 
Performance measurement of different requirements engineering
Performance measurement of different requirements engineeringPerformance measurement of different requirements engineering
Performance measurement of different requirements engineeringiaemedu
 
Mobile safety systems for automobiles
Mobile safety systems for automobilesMobile safety systems for automobiles
Mobile safety systems for automobilesiaemedu
 
Efficient text compression using special character replacement
Efficient text compression using special character replacementEfficient text compression using special character replacement
Efficient text compression using special character replacementiaemedu
 
Agile programming a new approach
Agile programming a new approachAgile programming a new approach
Agile programming a new approachiaemedu
 
Adaptive load balancing techniques in global scale grid environment
Adaptive load balancing techniques in global scale grid environmentAdaptive load balancing techniques in global scale grid environment
Adaptive load balancing techniques in global scale grid environmentiaemedu
 
A survey on the performance of job scheduling in workflow application
A survey on the performance of job scheduling in workflow applicationA survey on the performance of job scheduling in workflow application
A survey on the performance of job scheduling in workflow applicationiaemedu
 
A survey of mitigating routing misbehavior in mobile ad hoc networks
A survey of mitigating routing misbehavior in mobile ad hoc networksA survey of mitigating routing misbehavior in mobile ad hoc networks
A survey of mitigating routing misbehavior in mobile ad hoc networksiaemedu
 
A novel approach for satellite imagery storage by classify
A novel approach for satellite imagery storage by classifyA novel approach for satellite imagery storage by classify
A novel approach for satellite imagery storage by classifyiaemedu
 
A self recovery approach using halftone images for medical imagery
A self recovery approach using halftone images for medical imageryA self recovery approach using halftone images for medical imagery
A self recovery approach using halftone images for medical imageryiaemedu
 

More from iaemedu (20)

Tech transfer making it as a risk free approach in pharmaceutical and biotech in
Tech transfer making it as a risk free approach in pharmaceutical and biotech inTech transfer making it as a risk free approach in pharmaceutical and biotech in
Tech transfer making it as a risk free approach in pharmaceutical and biotech in
 
Integration of feature sets with machine learning techniques
Integration of feature sets with machine learning techniquesIntegration of feature sets with machine learning techniques
Integration of feature sets with machine learning techniques
 
Effective broadcasting in mobile ad hoc networks using grid
Effective broadcasting in mobile ad hoc networks using gridEffective broadcasting in mobile ad hoc networks using grid
Effective broadcasting in mobile ad hoc networks using grid
 
Effect of scenario environment on the performance of mane ts routing
Effect of scenario environment on the performance of mane ts routingEffect of scenario environment on the performance of mane ts routing
Effect of scenario environment on the performance of mane ts routing
 
Adaptive job scheduling with load balancing for workflow application
Adaptive job scheduling with load balancing for workflow applicationAdaptive job scheduling with load balancing for workflow application
Adaptive job scheduling with load balancing for workflow application
 
Survey on transaction reordering
Survey on transaction reorderingSurvey on transaction reordering
Survey on transaction reordering
 
Semantic web services and its challenges
Semantic web services and its challengesSemantic web services and its challenges
Semantic web services and its challenges
 
Website based patent information searching mechanism
Website based patent information searching mechanismWebsite based patent information searching mechanism
Website based patent information searching mechanism
 
Revisiting the experiment on detecting of replay and message modification
Revisiting the experiment on detecting of replay and message modificationRevisiting the experiment on detecting of replay and message modification
Revisiting the experiment on detecting of replay and message modification
 
Prediction of customer behavior using cma
Prediction of customer behavior using cmaPrediction of customer behavior using cma
Prediction of customer behavior using cma
 
Performance analysis of manet routing protocol in presence
Performance analysis of manet routing protocol in presencePerformance analysis of manet routing protocol in presence
Performance analysis of manet routing protocol in presence
 
Performance measurement of different requirements engineering
Performance measurement of different requirements engineeringPerformance measurement of different requirements engineering
Performance measurement of different requirements engineering
 
Mobile safety systems for automobiles
Mobile safety systems for automobilesMobile safety systems for automobiles
Mobile safety systems for automobiles
 
Efficient text compression using special character replacement
Efficient text compression using special character replacementEfficient text compression using special character replacement
Efficient text compression using special character replacement
 
Agile programming a new approach
Agile programming a new approachAgile programming a new approach
Agile programming a new approach
 
Adaptive load balancing techniques in global scale grid environment
Adaptive load balancing techniques in global scale grid environmentAdaptive load balancing techniques in global scale grid environment
Adaptive load balancing techniques in global scale grid environment
 
A survey on the performance of job scheduling in workflow application
A survey on the performance of job scheduling in workflow applicationA survey on the performance of job scheduling in workflow application
A survey on the performance of job scheduling in workflow application
 
A survey of mitigating routing misbehavior in mobile ad hoc networks
A survey of mitigating routing misbehavior in mobile ad hoc networksA survey of mitigating routing misbehavior in mobile ad hoc networks
A survey of mitigating routing misbehavior in mobile ad hoc networks
 
A novel approach for satellite imagery storage by classify
A novel approach for satellite imagery storage by classifyA novel approach for satellite imagery storage by classify
A novel approach for satellite imagery storage by classify
 
A self recovery approach using halftone images for medical imagery
A self recovery approach using halftone images for medical imageryA self recovery approach using halftone images for medical imagery
A self recovery approach using halftone images for medical imagery
 

Sample size determination for classification of eeg signals using power analysis in machine learning approach

  • 1. International Journal of Advanced Research in Engineering and Technology (IJARET), ISSNIN – INTERNATIONAL JOURNAL OF ADVANCED RESEARCH 0976 6480(Print), ISSN 0976 – 6499(Online) Volume 3, Number 1, January - June (2012), © IAEME ENGINEERING AND TECHNOLOGY (IJARET) ISSN 0976 - 6480 (Print) ISSN 0976 - 6499 (Online) IJARET Volume 3, Issue 1, January- June (2012), pp. 01-09 © IAEME: www.iaeme.com/ijaret.html ©IAEME Journal Impact Factor (2011): 0.7315 (Calculated by GISI) www.jifactor.com SAMPLE SIZE DETERMINATION FOR CLASSIFICATION OF EEG SIGNALS USING POWER ANALYSIS IN MACHINE LEARNING APPROACH V. Indira1, R. Vasanthakumari2 , V. Sugumaran3 1 2 3 Research Scholar, Principal Associate professor, School Deptrtment of Mathematics, Kasthurba college for of Mechanical and Building Karpagam University women Sciences, VIT university Coimbatore, Puducherry-605008. Chennai campus, Vandalur- India. vasunthara1@yahoo.com keelambakkam road, indira_manik@yahoo.co.in Chennai-600048 v_sugu@yahoo.com ABSTRACT Recent advances in computer hardware and signal processing have made possible the use of electroencephalogram (EEG) signals or “brain waves” for communication between humans and computers. Also, the signals that are extracted from brain through EEG are required to improve brain computer interface (BCI). EEG signal analysis and classification is one of the prominent researches in the field of BCI. In recent days, machine learning approach is used for classification of EEG signals. This method consists of a chain of activities namely, data acquisition, feature extraction, feature selection, and feature classification. In order to extract the features representing the EEG signals, the spectral analysis of the EEG signals was performed by using the three model- based methods (Burg autoregressive – AR, least squares modified Yule–Walker autoregressive methods. It is easy to collect EEG signals from healthy volunteers whereas it is too difficult to collect a sizable number of EEG recordings of those who are having one particular problem. Hence it is essential to know the minimum number of samples required to train the classifier. The objective of this paper is to determine the minimum number of signals required for training the classifier in machine learning approach to EEG signal analysis so as to get the best classification accuracy. A data set consists of 500 EEG signals with five different types was considered. A method for determination of 1
The electroencephalogram (EEG), a highly complex signal, is one of the most common sources of information used to study brain function and neurological diseases, and it has great potential for the diagnosis and treatment of mental diseases and abnormalities. In clinical contexts, EEG refers to the recording of the brain's spontaneous electrical activity over a short period of time, usually 20-40 minutes, from multiple electrodes placed on the scalp. EEG signals are also widely used in intensive care units for brain function monitoring. Hence the classification of EEG signals has become essential. More and more techniques are being used to analyse the EEG signal and extract useful information from it, and a machine learning approach may be considered a candidate for this task.

The publicly available dataset described in [1] was used for the present study. This data set consists of five sets (denoted A, B, C, D and E), each containing 100 single-channel EEG signals. Each signal was selected after visual inspection for artifacts and passed a weak stationarity criterion. Sets A and B were taken from surface EEG recordings of five healthy volunteers with eyes open and closed, respectively. Signals in two further sets were measured in seizure-free intervals from five patients, in the epileptogenic zone (set D) and in the hippocampal formation of the opposite hemisphere of the brain (set C). Set E contains seizure activity, selected from all recording sites exhibiting ictal activity. Sets A and B were recorded extracranially, whereas sets C, D and E were recorded intracranially. Only a short description of the data set is presented here; the reader is referred to [1] for further details.

Much work has been performed on the classification of EEG signals by various methods, namely linear and non-linear methods [2], support vector machines employing model-based methods [3-4], artificial neural network methods [5-6], hidden Markov models [7-8], and a non-iterative method based on the ambiguity function [9]. The cited works searched for better features and classifiers and reported the results. However, little attention has been paid to the number of EEG signals that should be used to train the classifier. In the present study, a statistical method is proposed to find the minimum number of training samples needed to obtain good classification accuracy with statistical stability. Many works on minimum sample size determination have been reported in bioinformatics and other clinical studies, for example for microarray data [10], cDNA arrays [11] and transcription levels [12]. Based on these works, data-driven hypotheses could be developed which in turn would further EEG signal analysis research. No appropriate guideline has been proposed so far for choosing the minimum sample size of EEG signals for classification using the machine learning approach. Hence, one has to resort to thumb rules which lack mathematical reasoning, or blindly follow some previous work as a basis for fixing the sample size. This is the fundamental motivation for taking up this study.

In this paper, F-test based statistical power analysis is used to find the minimum number of samples required to separate the classes with statistical stability. The minimum sample size is also determined using an entropy-based algorithm, the J48 algorithm. The results are compared and sample size guidelines for EEG signal classification are presented in the conclusion section.
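Purely as an illustrative sketch (not taken from the paper or from [1]), the 500-signal data set described above can be arranged as a signal matrix with one integer class label per signal before feature extraction. The directory layout, file naming and loader name below are assumptions.

```python
import numpy as np

# Hypothetical layout: one text file per signal, grouped in folders A-E
# (the actual storage format of the data set in [1] may differ).
SET_NAMES = ["A", "B", "C", "D", "E"]   # five classes
N_PER_SET = 100                         # 100 single-channel signals per class

def load_dataset(root="eeg_data"):
    """Return (signals, labels) with one integer class label per signal."""
    signals, labels = [], []
    for class_idx, set_name in enumerate(SET_NAMES):
        for i in range(N_PER_SET):
            path = f"{root}/{set_name}/{set_name}{i + 1:03d}.txt"  # assumed naming
            signals.append(np.loadtxt(path))
            labels.append(class_idx)
    return np.array(signals), np.array(labels)
```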
2. Feature Extraction and Feature Selection

Feature extraction is performed through three techniques, namely the Burg AR, Yule and Yule–Walker methods. Three features were obtained from each technique, so nine features are available in total. Following the footsteps of [69], feature selection can be performed using the J48 decision tree algorithm to reduce the dimension of the dataset. In the present study, the five most important features were selected: f2, f4, f6, f7 and f9. For the rest of the study, only these five AR features were considered.

3. Determination of Minimum Sample Size

This study uses the set of Burg AR, Yule and Yule–Walker features extracted from the EEG signals, instead of using the EEG signals directly. The selected set of five features, along with their class labels, forms the data set used to determine the minimum sample size using power analysis. The method of performing power analysis is discussed in section 4. The results of the power analysis were verified with the help of a functional test, the J48 algorithm, which can be used as a classifier with the ten-fold cross validation method to validate the results. The results are presented and discussed in section 6.
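The paper does not report the AR model order or estimator settings. As a minimal sketch of how autoregressive features of the kind used in section 2 might be computed, the snippet below estimates Yule–Walker AR coefficients for a single signal; the model order of 3 and the use of the raw coefficients as features are assumptions, and the Burg and "Yule" feature blocks would be obtained analogously with their respective estimators.

```python
import numpy as np
from scipy.linalg import solve, toeplitz

def yule_walker_ar(signal, order=3):
    """Estimate AR(order) coefficients of a signal via the Yule-Walker equations."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Biased autocorrelation estimates r[0] .. r[order]
    r = np.array([x[: n - k] @ x[k:] / n for k in range(order + 1)])
    # Solve R a = r[1:], where R is the Toeplitz matrix built from r[0 .. order-1]
    a = solve(toeplitz(r[:order]), r[1 : order + 1])
    return a  # the AR coefficients serve as features for one signal

# e.g. one signal -> three Yule-Walker features; concatenating the three
# estimator outputs would give the nine features mentioned in section 2.
```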
4. Power Analysis

Sample size has a great influence in any experimental study, because the result of an experiment depends on the sample size. Power analysis has been used in many applications [13-15]. It is based on two measures of statistical reliability in a hypothesis test, namely the confidence level (1 − α) and the power (1 − β). The test compares the null hypothesis (H0) against the alternative hypothesis (H1). The null hypothesis states that the means of the classes are the same, whereas the alternative hypothesis states that the means of the classes are not all the same. In a hypothesis test, the probability of accepting the null hypothesis is the confidence level and that of accepting the alternative hypothesis is the power of the test [10]. The sample size is estimated in power analysis such that the confidence level and the power (the statistical reliability measures) of the hypothesis test reach predefined values. The confidence level and the power are calculated from the distributions of the null hypothesis and the alternative hypothesis.

In the case of a multi-class problem (number of classes greater than two), instead of the t-statistic, the F-statistic derived from Pillai's V formula [16] is used for the estimation of sample size. Pillai's V is the trace of the matrix defined by the ratio of the between-group variance (B) to the total variance (T). It is a statistical measure often used in multivariate analysis of variance (MANOVA) [16]. Pillai's V trace is given by

$$V = \mathrm{trace}(BT^{-1}) = \sum_{i=1}^{h} \frac{\lambda_i}{\lambda_i + 1}$$

where $\lambda_i$ is the $i$-th eigenvalue of $W^{-1}B$, $W$ is the within-group variance, and $h$ is the number of factors being considered in MANOVA, defined by h = c − 1. A high Pillai's V means a high amount of separation between the samples of the classes, with the between-group variance being relatively large compared to the total variance. Using the F statistic transformed from Pillai's V, the hypothesis test can be designed as follows:

$$H_0 : \mu_1 = \mu_2 = \mu_3 = \dots = \mu_c, \qquad H_1 : \text{there exist } i, j \text{ such that } \mu_i - \mu_j \neq 0$$

$$H_0 : F = \frac{(V/s)/(ph)}{\bigl(1 - (V/s)\bigr)/\bigl[s(N - c - p + s)\bigr]} \sim F\bigl(ph,\; s(N - c - p + s)\bigr)$$

$$H_1 : F = \frac{(V/s)/(ph)}{\bigl(1 - (V/s)\bigr)/\bigl[s(N - c - p + s)\bigr]} \sim F\bigl(ph,\; s(N - c - p + s),\; \Delta = s\,\Delta_e\,N\bigr)$$

with $\Delta_e = \dfrac{V_{crit}}{s - V_{crit}}$, where p and c are the number of variables and the number of classes, respectively, and s is defined by min(p, h). Using these distributions of H0 and H1, the confidence level and the power can be calculated for a given sample size and effect size. The same procedure used for the two-class problem is used here to estimate the minimum sample size for statistical stability, whereby the sample size is increased until the calculated power reaches the predefined threshold of 1 − β.
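As a minimal sketch of the sample-size search just described, assuming the F distributions stated above and an observed Pillai's V used in place of V_crit, the snippet below increases the total sample size N until the power P(F > F_crit) under the noncentral F distribution reaches 1 − β. The paper's reported numbers (Table 1) come from a repeated-measures MANOVA procedure (see Fig. 1), so this simplified sketch is not expected to reproduce them exactly.

```python
from scipy.stats import f as f_dist, ncf

def min_sample_size(V, p, c, alpha=0.05, target_power=0.95, n_max=2000):
    """Smallest total N whose power reaches target_power for a given Pillai's V."""
    h = c - 1                      # number of factors, h = c - 1
    s = min(p, h)
    delta_e = V / (s - V)          # effect measure Delta_e from section 4
    for n_total in range(c + p, n_max):
        df1 = p * h
        df2 = s * (n_total - c - p + s)
        f_crit = f_dist.ppf(1.0 - alpha, df1, df2)    # critical value under H0
        nc = s * delta_e * n_total                    # noncentrality Delta = s*Delta_e*N
        power = ncf.sf(f_crit, df1, df2, nc)          # P(F > F_crit) under H1
        if power >= target_power:
            return n_total, power
    return None

# e.g. min_sample_size(V=0.339623, p=5, c=5, alpha=0.05, target_power=0.95)
```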
5. J48 Algorithm

The classification is done through a decision tree whose leaves represent the different brain conditions of the subjects. The sequential branching process ending in the leaves is based on conditional probabilities associated with individual features. The C4.5 model introduced by J. R. Quinlan is reported to have good predictive accuracy, good speed, minimum computational cost and a good level of interpretability. The decision tree algorithm (C4.5) has two phases: building and pruning. In the building phase, the construction of the decision tree depends very much on how a test attribute X is selected. C4.5 uses an information-entropy evaluation function as the selection criterion. GainRatio(X) compensates for the weak point of Gain(X), which represents the quantity of information provided by X in the training set. The attribute with the highest GainRatio(X) is therefore taken as the root of the decision tree. In the pruning phase, the C4.5 algorithm uses an error-based post-pruning strategy to deal with the over-training (over-fitting) problem. For each classification node, C4.5 calculates a predicted error rate based on the total aggregate of misclassifications at that particular node. The error-based pruning technique essentially reduces to the replacement of large sub-trees in the classification structure by singleton nodes or simple branch collections if these actions contribute to a drop in the overall error rate of the root node.
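The paper uses WEKA's J48 implementation of C4.5. As a rough illustration of the gain-ratio criterion described above (not the WEKA code itself), the snippet below computes Gain(X) and GainRatio(X) for a discrete-valued attribute; in C4.5 proper, continuous attributes such as the AR features would first be split at a candidate threshold.

```python
import numpy as np
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label sequence, in bits."""
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def gain_ratio(attribute_values, labels):
    """GainRatio(X) = Gain(X) / SplitInfo(X) for a discrete attribute X."""
    total = len(labels)
    base = entropy(labels)
    cond, split_info = 0.0, 0.0
    for value in set(attribute_values):
        subset = [lab for val, lab in zip(attribute_values, labels) if val == value]
        w = len(subset) / total
        cond += w * entropy(subset)          # conditional entropy after the split
        split_info -= w * np.log2(w)         # penalises many-valued attributes
    gain = base - cond
    return gain / split_info if split_info > 0 else 0.0
```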
6. Results and Discussion

In the present study, F-test based power analysis was used to determine the sample size of EEG signals required to train the classifier. The test performed on the data set was an a priori sample size computation for multivariate analysis of variance (MANOVA) with repeated measures and within-between interactions. The expected power level was set to 95% (equivalent to α = 0.05 and β = 0.05). As a basis for this problem, a data set consisting of 100 samples from each class was considered. The data set was tested for normality, homogeneity and independence of standard errors. The central and non-central distributions with the critical F value are shown in Fig. 1. While calculating the sample size in the power analysis, the key parameter, Pillai's V, was found to be 0.3396. Power analysis was applied and the sample size was found for various combinations of confidence and power level. It was found that 81 samples are required to train the classifier so as to obtain good statistical stability at a power level (1 − β) of 95%. A series of experiments was carried out to calculate the sample sizes for power levels of 90%, 85%, 80% and 75%, and the results are tabulated in Table 1; they are also presented as curves in Fig. 2. In another set of experiments, the sample size was computed for various power levels with an α error probability of 5% and was again found to be 16 per class; the experiment was then repeated with α error probabilities of 10%, 20% and so on up to 40%. The values in Table 1 and the curves in Fig. 2 enable a practitioner in this field to fix the sample size for training the classifier.

The required total sample size can be viewed as a function of effect size for various α error probabilities (0.05 to 0.4 in steps of 0.05) and power levels 1 − β (0.95 to 0.6 in steps of 0.05). To study the influence of the effect size on the required sample size, a set of experiments was carried out; for a given α error probability, the required sample size decreases as the effect size increases, whereas for a given effect size the required sample size increases with the power level. Hence the effect size is also an important parameter while computing the sample size. Alternatively, one may be interested in the α error probability as a function of effect size for various power levels. For a given power level, as the α error probability increases the detectable effect size decreases; for a given α error probability, as the power level increases the detectable effect size also increases. It should be noted that these effects are for a fixed sample size (81 in this case).

Table 1. Power analysis test results with effect size f(V) = 0.3046038, number of groups = 5, repetitions = 5, Pillai's V = 0.339623, numerator df = 16.

Output parameter                     α=0.05,     α=0.10,     α=0.15,     α=0.20,     α=0.25,
                                     1-β=0.95    1-β=0.90    1-β=0.85    1-β=0.80    1-β=0.75
Noncentrality parameter λ            30.061846   21.8969     17.072159   13.360820   10.391749
Critical F                           1.676863    1.503896    1.394382    1.3312269   1.245602
Denominator df                       304         216         164         124         92
Total sample size for 5 classes      81          59          46          36          28
Sample size per class                ≈16         ≈12         ≈9          ≈7          ≈6
Actual power                         0.952907    0.903578    0.856716    0.804849    0.751608

The results of the power analysis were validated through a decision tree algorithm, the J48 algorithm. The classifier's job here is a two-fold process: feature selection and feature classification. To study the effect of the sample size on classification accuracy, the sample size was repeatedly reduced in decrements of 5 samples and the corresponding classification accuracies were noted. The variation in classification accuracy with respect to the sample size is shown in Fig. 3. One can observe that the classification accuracy falls abruptly when the sample size is reduced below 15 per class. This means that with just 16 samples per class it is possible to train a classifier; however, the objective of the study goes one step further: what is the minimum sample size that will have statistical stability? One can also notice that the oscillation of the classification accuracy is larger when the sample size is low and it tends to stabilize as the sample size increases; when the sample size reaches 100, the oscillation is very small. The root mean square error and the mean absolute error of the classification as functions of sample size are plotted in Fig. 4 and Fig. 5 respectively. It should be noted that the maximum classification accuracy of the J48 algorithm for the data set with 100 samples per class was found to be 80%. This implies that the reduction in classification accuracy is not due to variation in the input data; it may be due to the limited ability of the classifier for the given data set. For computing the relative change in classification accuracy, 85% was used as the reference value.

For comparing the results obtained from the power analysis and from the J48 algorithm, a representative value corresponding to a 15% α error probability was taken from Table 1. As per the power analysis result in Table 1, if one can accommodate an α error probability of 15% and is willing to accept a power level of 85%, then for the given data set the minimum required sample size is 9 per class. This means that if 9 samples per class were used for training the classifier, the maximum α error probability that is likely to occur would be 15%. This has to be validated against the J48 classifier's classification results. From Fig. 5, it is evident that the mean absolute error is below 15% for cases whose sample size is greater than or equal to 10. The corresponding root mean square error is shown in Fig. 4.
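A hedged sketch of the validation experiment described above: assuming a feature matrix X (the five selected AR features per signal) and an integer label vector y are already available, the per-class sample size is reduced in steps of five and a C4.5-style decision tree is scored with cross validation. scikit-learn's entropy-based tree is used here as a stand-in for WEKA's J48, and the misclassification rate is used as a simple proxy for the mean absolute error reported in Fig. 5.

```python
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeClassifier

def accuracy_vs_sample_size(X, y, sizes=range(100, 4, -5), seed=0):
    """Decision-tree accuracy and error rate as the samples per class shrink."""
    rng = np.random.default_rng(seed)
    results = []
    for n_per_class in sizes:
        # Draw n_per_class examples from every class
        idx = np.concatenate([
            rng.choice(np.where(y == c)[0], size=n_per_class, replace=False)
            for c in np.unique(y)
        ])
        Xs, ys = X[idx], y[idx]
        clf = DecisionTreeClassifier(criterion="entropy", random_state=seed)
        cv = min(10, n_per_class)                  # ten-fold CV where class counts allow
        pred = cross_val_predict(clf, Xs, ys, cv=cv)
        acc = np.mean(pred == ys)
        err = np.mean(pred != ys)                  # proxy for the paper's error curves
        results.append((n_per_class, acc, err))
    return results
```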
Fig. 1 F test, MANOVA with repeated measures and within-between interaction: sample size calculation with effect size f(V) = 0.3046038, α error probability = 0.05, power (1 − β error probability) = 0.95, number of groups = 5, repetitions = 5, using Pillai's V formula.

Fig. 2 Total sample size as a function of α error probability for various power levels.

Fig. 3 Classification accuracy as a function of sample size.
Fig. 4 Root mean squared error as a function of sample size.

Fig. 5 Mean absolute error as a function of sample size.

In a classification problem, the mean absolute error of the classifier is a measure of the type I error (α error probability). A type I error is an error due to misclassification by the classifier; in general, a type I error is rejecting a null hypothesis when it is true. The α error probability is a measure of the type I error in hypothesis testing, and hence the equivalence is clear. From the above discussion, the results of the power analysis hold, and the actual error did not exceed the upper bound (15%) found in the power analysis. A similar exercise of validating the results at other points also confirms the validity of the power analysis test. Thus one can confidently use the sample size suggested by power analysis in a machine learning approach to classifying EEG signals.

7. Conclusion

In the present study, a machine learning approach was applied to classify the EEG signals of a data set consisting of 500 EEG signals of five different types. A statistical method for determining the minimum sample size using power analysis has been proposed, and the results were validated using a functional test, the J48 algorithm. From the results and discussion, it is evident that 16 samples per class are required in order to have a power level of 95% and an α error probability of 5%. For other paired values of power level and α error probability, the required minimum sample sizes are tabulated in Table 1. The effects of the α error probability and the power level on the sample size were also discussed in detail. These graphs and tables would serve as a guideline for researchers working with EEG signals when fixing the sample size in a machine learning approach.
8. References

[1] Andrzejak, R.G., Lehnertz, K., Mormann, F., Rieke, C., David, P., Elger, C.E., 2001, 'Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state', Physical Review E, 64, 061907.
[2] Garrett, D., Peterson, D.A., Anderson, C.W., Thaut, M.H., 2003, 'Comparison of linear and nonlinear methods for EEG signal classification', IEEE Transactions on Neural Systems and Rehabilitation Engineering, 11, pp. 141-144.
[3] Ubeyli, E.D., 2010, 'Least squares support vector machine employing model-based methods coefficients for analysis of EEG signals', Expert Systems with Applications, 37, pp. 233-239.
[4] Sun, S., Zhang, C., 2006, 'Adaptive feature extraction for EEG signal classification', Medical and Biological Engineering and Computing, 44, pp. 931-935.
[5] Khare, V., Santhosh, J., Anand, S., 2008, 'Classification of EEG signals based on neural network to discriminate five mental states', Proceedings of SPIT-IEEE Colloquium and International Conference, 11, Feb. 4-5, Mumbai, India.
[6] Alexandre, F., Bedoui, M.H., Kerkeni, N., Ben Khalifa, K., 2006, 'Supervised neuronal approaches for EEG signal classification: experimental studies', The 10th IASTED International Conference on Artificial Intelligence and Soft Computing (ASC), Aug. 29, Palma de Mallorca, Spain.
[7] Stastny, J., Sovka, P., Stancak, A., 2003, 'EEG signal classification: Introduction to the problem', Radioengineering, 12, pp. 51-55.
[8] Stastny, J., Sovka, P., Stancak, A., 2002, 'EEG signal classification and segmentation by means of hidden Markov models', Biosignal, pp. 415-417.
[9] Baztarrica Ochoa, J., 2002, 'EEG signal classification for brain computer interface applications', Ecole Polytechnique Federale de Lausanne, March 28.
[10] Hwang, D., Schmitt, W.A., Stephanopoulos, G., Stephanopoulos, G., 2002, 'Determination of sample size and discriminatory expression patterns in microarray data', Bioinformatics, 18, pp. 1184-1193.
[11] Schena, M., Shalon, D., Davis, R.W., Brown, P.O., 1995, 'Quantitative monitoring of gene-expression patterns with a complementary-DNA microarray', Science, 270, pp. 467-470.
[12] Lockhart, D.J., et al., 1996, 'Expression monitoring by hybridization to high density oligonucleotide arrays', Nature Biotechnology, 14, pp. 1675-1680.
[13] Kraemer, H.C., Thiemann, S., 1987, 'How Many Subjects?', Sage, Newbury Park, CA.
[14] Mace, A.E., 1974, 'Sample Size Determination', Krieger, Huntington, NY.
[15] Pillai, K.C.S., Mijares, T.A., 1959, 'On the moments of the trace of a matrix and approximations to its distribution', Annals of Mathematical Statistics, 30, pp. 1135-1140.
[16] Cohen, J., 1969, 'Statistical Power Analysis for the Behavioral Sciences', Academic Press, New York.