WEI_slides_Igarss11_July10-11.mc.final.pptx

This document proposes two active learning methods, SVM-CC and SVM-CCMS, for hyperspectral image classification that focus on identifying and sampling from critical classes. The methods use a shifting hyperplane model to identify critical class pairs with high probability of being difficult to classify. SVM-CC randomly samples from the critical class set, while SVM-CCMS samples points closest to the decision margin within critical classes. Experimental results on two hyperspectral datasets show the proposed methods outperform random sampling and concentrate samples on support vectors, particularly improving performance for hard classes.

Critical Class Oriented Active Learning for Hyperspectral Image Classification
Wei Di and Melba Crawford
School of Civil Engineering, Purdue University, and Laboratory for Applications of Remote Sensing
Email: {wdi, mcrawford}@purdue.edu
July 28, 2011, IEEE International Geoscience and Remote Sensing Symposium

Outline: Critical Class Oriented Active Learning (AL); Guided & Active Learning; Critical Class Oriented; Margin Sampling Based

Conclusions & Future Work

Motivation. [Diagram labels: Sampling Strategy, DL Pool, Intelligent sampling strategy, Training Data, Supervised Classifier]

Economically allocate resources for labeling

Focus on a specific task or requirement Target H

Active Learning (AL): iterative learning circle. [Diagram labels: Passive Learning, Supervised Classifier, Query Strategy, DL Pool, New xL, DU Pool, Output Classifier, Training, xU, f(xU), Uncertainty & Critical Class]

Classification: Tuia et al. [2009], Patra and Bruzzone [2011], Demir et al. [2011], Di and Crawford [2011].

Critical Class Oriented Active Learning: shifting hyperplane by pair-wise SVM

Goal: Provide concept-level guidance for building a training set that favors “difficult” classes

Key Idea: Shifting Hyperplane. [Diagram labels: pair-wise classes A and B, changing hyperplane, hyperplane w, margin, support vectors, new samples]

Critical Class Identification: w_k is the hyperplane vector learned by the SVM for the k-th binary class pair at the t-th query. Measure the cumulative changes in w_k; rank the class pairs; estimate the probability of the k-th class pair at critical level CL.

SVM-CC: random query from the critical class set. SVM-CCMS: query the sample within the critical class set and closest to the margin. [Diagram labels: Critical Class Identification, higher probability, Critical Class Pair, Critical Class Set]

Data Description: Kennedy Space Center & Botswana Data

5. Higher utility, low redundancy

10. Classification: Tuia et al. [2009], Patra and Bruzzone [2011], Demir et al. [2011], Di and Crawford [2011].

11. Segmentation: Jun et al. [2010]

13. Category based query & margin sampling

15. II. PROPOSED METHOD

20. III. EXPERIMENTAL RESULTS

22. Acquired in March 1996

23. 176 of 224 total bands

24. Spectral bandwidth 10 nm

25. Spatial resolution 18 m. * Denotes the hard classes.

27. Per-Class Improvement vs RSDU

29. Ratio of Support Vectors (legend: CCMS, SVMMS, CC, RS)

30. IV. CONCLUSIONS AND FUTURE WORK

32. Shifting Hyperplane – Provides valuable information for identifying difficult classes.

33. Critical Class Oriented Margin Sampling – Focuses on difficult classes as well as informative samples; improves performance in multi-class problems.

34. Support Vectors - Concentrate on samples likely to be support vectors.

35. Future work

36. Investigation of feature subspaces for identifying the critical classes.

Editor's notes

The added earth logo is from the website: http://rst.gsfc.nasa.gov/Sect19/Sect19_2a.html
Background: introduce the motivation and concepts related to active learning and guided learning. Proposed method: two methods are proposed, SVM-CC and SVM-CCMS. Both are guided active learning methods that focus sampling on difficult classes to improve performance in multi-class problems. SVM-CC is simply a critical class oriented class query. SVM-CCMS further incorporates margin sampling into the critical class based category sampling: it uses margin sampling as the uncertainty criterion and thus combines guided learning with informative sampling. Experiments are conducted on two data sets: KSC and BOT.
A supervised classifier depends on the quality of its training data. This raises the question: which sampling strategy best exploits the information in the data and constructs the best training set for a given problem? Simple random sampling, which tends to reproduce the class proportions of the given data, is unlikely to be the best strategy, especially in real applications with complex data sets. The resulting training set may contain a lot of redundancy and may not suit the chosen classifier, as has been shown repeatedly in the active learning field. A good strategy should: achieve better performance at lower cost; allocate labeling resources economically and realistically; and serve a specific task or customer purpose. One example is “active learning”.
Active learning is an iterative learning circle. An additional component, the query strategy, is added to the supervised learning process. It feeds back information about the classifier's performance on the data and guides the sampling strategy to select the most useful training data for labeling. Active learning helps to build a smaller training set with higher training utility, which reduces the cost and time of labeling.
The query strategy: beyond the usual strategies (e.g. an uncertainty criterion), it can also be designed to guide learning toward a specific task or purpose, or to issue concept-level queries. Here we focus on sampling difficult classes in a multi-class problem. The basic assumption is that the classes in a multi-class problem often differ in how hard it is to classify them well. By focusing on the hard classes, we can adjust the proportions of the training data in their favor and potentially improve performance.
Backup: motivation for active learning. Problems in passive learning: labeling is expensive and time consuming; labeled data are limited while unlabeled data are abundant; manual selection of Dtrain is subjective and redundant. Goal of AL: a smaller training set with higher training utility.
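The iterative circle described in this note can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the 1-D nearest-centroid classifier, the `uncertainty` score, and the names `fit_centroids`, `active_learning`, and `oracle` are all hypothetical stand-ins for the SVM, the query strategy, and the human labeler.

```python
def fit_centroids(labeled):
    """Train a toy 1-D nearest-centroid classifier: class -> mean feature value."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def uncertainty(centroids, x):
    """Higher score = less certain: small gap between the two nearest centroids."""
    d = sorted(abs(x - c) for c in centroids.values())
    return -(d[1] - d[0])


def active_learning(labeled, unlabeled, oracle, n_queries):
    """Iterate: train on DL, score DU, query one sample, move it from DU to DL."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(n_queries):
        model = fit_centroids(labeled)                  # supervised classifier
        query = max(unlabeled, key=lambda x: uncertainty(model, x))  # query strategy
        unlabeled.remove(query)                         # leave the unlabeled pool DU
        labeled.append((query, oracle(query)))          # labeled sample joins DL
    return fit_centroids(labeled), labeled


# Hypothetical oracle: the true class boundary sits at x = 5.
oracle = lambda x: "A" if x < 5.0 else "B"
DL = [(0.0, "A"), (10.0, "B")]
DU = [1.0, 2.0, 4.5, 5.5, 8.0, 9.0]
model, DL_final = active_learning(DL, DU, oracle, n_queries=2)
print(DL_final[2:])  # the two queried samples lie next to the boundary
```

Swapping the `uncertainty` function for a class-oriented criterion is the point where a guided strategy such as the one in these slides would plug in.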
Recently, there has been increasing interest in active learning. In many multi-class problems, class complexity is highly skewed. Certain classes have more complex distributions that are hard to represent well. Associated with this individual property, significant differences also exist in how well any given class pair can be discriminated. Certain class pairs may have very complex boundaries that require more training data to model well. Those are often the classes that can dramatically damage the overall classification results, and typically they are of most interest. Thus, rather than guiding which instance to label, guiding the query of additional training samples toward those “critical classes” may help to tighten the worst classification boundary and yield better overall performance [9].
Our proposed method is critical class oriented learning. It uses the shifting hyperplane to identify the “hard classes” in a multi-class classification problem. Class query is combined with margin sampling based uncertainty query (naïve random sampling and margin based uncertainty sampling are used to query candidate samples). In summary: combine “guided learning” and “active learning”; shifting hyperplane by pair-wise SVM; identify “trouble classes”; concept-level, class based query.
Guided learning references:
1. R. Lomasky, C. E. Brodley, M. Aernecke, and S. Bencic, “Guiding class selection for an artificial nose,” in Proc. NIPS, 2006.
2. R. Lomasky, C. E. Brodley, M. Aernecke, D. Walt, and M. Friedl, “Active class selection,” in Proc. ECML, 2007.
3. J. Attenberg and F. Provost, “Why label when you can search? Alternatives to active learning for applying human resources to build classification models under extreme class imbalance,” in Proc. KDD, 2010.
Most recent papers:
[1] S. Patra and L. Bruzzone, “A fast cluster-assumption based active learning technique for classification of remote sensing images,” IEEE Transactions on Geoscience and Remote Sensing, 2011. In press.
[2] B. Demir, C. Persello, and L. Bruzzone, “Batch mode active learning methods for the interactive classification of remote sensing images,” IEEE Transactions on Geoscience and Remote Sensing, vol. 49, no. 3, pp. 1014-1031, 2011.
[3] J. Li, J. Bioucas-Dias, and A. Plaza, “Semi-supervised hyperspectral image segmentation using multinomial logistic regression with active learning,” IEEE Transactions on Geoscience and Remote Sensing, vol. 48, no. 11, pp. 4085-4098, 2011.
1. We use a one-versus-one SVM to learn a binary classifier for every pair of the Nc classes (e.g. 45 classifiers for 10 classes).
2. The hyperplane w of each class pair can be expressed as the weighted sum of the support vectors. Its norm is inversely proportional to the margin, which naturally provides information about the separability of the class pair.
3. In a sequential learning scenario, if the decision boundary between two classes is well established, fewer additional support vectors are required, so the hyperplane changes less. A bigger change in w indicates that the corresponding pair-wise hyperplane is not yet adequate, or that the class pair is more complex.
4. Querying samples from those classes may help to concentrate on the most critical classes in the multi-class problem and raise the lowest classification accuracy, which improves the overall classification accuracy.
5. Our goal is to use the cumulative changes in the hyperplane to estimate the difficulty level and rank each class pair in the multi-class classification problem.
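Points 1 and 2 can be checked with a small worked example. The sketch below assumes a linear kernel so that w can be written out explicitly (the slides refer to a kernel space, where w lives in the feature space instead); the helper names `n_pairs`, `hyperplane`, and `margin_width` and the toy support vectors are illustrative only.

```python
import math


def n_pairs(n_classes):
    """Number of one-versus-one binary classifiers for n_classes classes."""
    return n_classes * (n_classes - 1) // 2


def hyperplane(support_vectors, labels, alphas):
    """w = sum_i alpha_i * y_i * x_i for a linear SVM (labels are +1 / -1)."""
    dim = len(support_vectors[0])
    w = [0.0] * dim
    for x, y, a in zip(support_vectors, labels, alphas):
        for j in range(dim):
            w[j] += a * y * x[j]
    return w


def margin_width(w):
    """Distance between the two margin boundaries: 2 / ||w||."""
    return 2.0 / math.sqrt(sum(v * v for v in w))


print(n_pairs(10), n_pairs(9))  # 45 pairs for KSC (10 classes), 36 for BOT (9 classes)

# Toy class pair: one support vector per class, each with dual weight 0.5.
w = hyperplane([(1.0, 0.0), (-1.0, 0.0)], [1, -1], [0.5, 0.5])
print(w, margin_width(w))  # w = [1.0, 0.0], margin width 2.0
```

A larger norm of w means a narrower margin, which is why changes in w carry information about how separable a class pair currently appears.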
This slide shows the critical class identification process:
1. The scaled change in the hyperplane (w) is computed for each query step.
2. To integrate information from the learning sequence, the accumulated changes are obtained for each class pair.
3. An order statistic is defined to rank all class pairs: phi takes only a limited set of values over the discrete grid of critical levels, and it represents the rank of the hyperplane change among all class pairs. A larger value indicates that class pair k is more critical.
4. The probability that class pair k has difficulty level CL is estimated from the frequency with which it occurs at level CL.
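The four steps above can be sketched as follows. The exact formulas did not survive in this transcript, so several details here are assumptions rather than the authors' definitions: the scaled change is taken as ||w_t - w_(t-1)|| / ||w_(t-1)||, the rank is computed on the cumulative change after every query, and the probability is the fraction of query steps at which a pair held a given level. The function names and the toy history are hypothetical.

```python
import math


def scaled_change(w_prev, w_new):
    """Assumed scaling: norm of the change relative to the previous norm."""
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(w_new, w_prev)))
    return diff / math.sqrt(sum(a * a for a in w_prev))


def critical_level_probabilities(history):
    """history[t][k] is the hyperplane vector of class pair k after query t.

    Returns prob[k][cl - 1]: the fraction of query steps at which pair k was
    ranked at critical level cl, where the highest level (the number of pairs)
    goes to the pair with the largest cumulative change.
    """
    n = len(history[0])
    cumulative = [0.0] * n
    counts = [[0] * n for _ in range(n)]
    steps = 0
    for prev, new in zip(history, history[1:]):
        for k in range(n):
            cumulative[k] += scaled_change(prev[k], new[k])
        # Ascending order: position 1 = least change, position n = most critical.
        order = sorted(range(n), key=lambda k: cumulative[k])
        for level, k in enumerate(order, start=1):
            counts[k][level - 1] += 1
        steps += 1
    return [[c / steps for c in row] for row in counts]


# Toy history: 3 class pairs observed over 3 queries (2 transitions).
history = [
    [(1.0, 0.0), (1.0, 0.0), (1.0, 0.0)],
    [(1.0, 0.0), (1.0, 0.5), (2.0, 0.0)],
    [(1.0, 0.0), (1.0, 3.0), (3.0, 0.0)],
]
prob = critical_level_probabilities(history)
for k, row in enumerate(prob):
    print(k, row)
```

In the toy data pair 0 never changes and always sits at the lowest level, while pairs 1 and 2 swap the top two levels, so each gets probability 0.5 at levels 2 and 3.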
A higher probability at CL indicates that class pair k is likely to have this level (CL) of difficulty in the multi-class classification problem, i.e. it is likely to be ranked at CL among all pairs. Class pairs with a higher probability at the top difficulty level (e.g. CL = 45 for 10 classes) are identified as critical class pairs. The critical class set (CCs) is then obtained as the union of the classes in each selected class pair. Based on this critical class set, the query is conducted in two ways:
a. SVM-CC: randomly select the next query sample from the samples whose estimated labels belong to the critical class set. By querying from this contention pool, we may learn either from samples that truly belong to the critical classes or from samples that were incorrectly classified into them.
b. SVM-CCMS: SVM-CC conducts only a concept-level query, which cannot guarantee that the learner focuses on the most informative samples. We therefore borrow the idea of SVMMS and propose SVM-CCMS, in which the samples in the critical class set are further ranked by their distance to the hyperplane in the kernel space. For each sample, the minimum distance over all hyperplanes represents its uncertainty. The samples with the smallest distance, which indicates the greatest uncertainty, are selected for the next query.
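The two query rules can be sketched side by side. This is a minimal illustration, assuming the predicted labels and the pair-wise decision values f_k(x) are already available from the classifier, and using |f_k(x)| as a proxy for the distance to hyperplane k; the function names and the toy numbers are hypothetical.

```python
import random


def critical_class_set(critical_pairs):
    """Union of the classes that appear in the selected critical class pairs."""
    return set().union(*critical_pairs)


def candidate_pool(predicted, critical_classes):
    """Indices of unlabeled samples whose estimated label is a critical class."""
    return [i for i, y in enumerate(predicted) if y in critical_classes]


def query_svm_cc(predicted, critical_classes, rng=random):
    """SVM-CC: pick a random sample from the critical-class pool."""
    return rng.choice(candidate_pool(predicted, critical_classes))


def query_svm_ccms(predicted, decision_values, critical_classes):
    """SVM-CCMS: within the pool, pick the sample closest to any hyperplane.

    decision_values[i] lists f_k(x_i) for every pair-wise hyperplane k.
    """
    pool = candidate_pool(predicted, critical_classes)
    return min(pool, key=lambda i: min(abs(f) for f in decision_values[i]))


critical = critical_class_set([(3, 4), (4, 5)])
predicted = [3, 10, 4, 5, 1]
decision_values = [[0.9, 1.5], [0.05, 2.0], [0.2, -0.4], [-0.1, 1.1], [0.3, 0.3]]

print(sorted(critical))                                      # [3, 4, 5]
print(candidate_pool(predicted, critical))                   # [0, 2, 3]
print(query_svm_ccms(predicted, decision_values, critical))  # 3
```

Sample 1 is the closest to a hyperplane overall, but it is skipped because its estimated label (class 10) is not in the critical class set; that restriction is what separates SVM-CCMS from plain margin sampling.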
Figure: KSC. Table: KSC and BOT. Hard classes are tagged with *.
KSC: NASA Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data acquired over the Kennedy Space Center (KSC) in 1996; 224 bands of 10 nm width from 400-2500 nm, 176 bands remaining; 18 m spatial resolution.
BOT: NASA EO-1 satellite data acquired over the Okavango Delta; 242 bands of 10 nm width covering 400-2500 nm, 145 bands remaining; 30 m spatial resolution.
Initial data set: KSC: 3 samples x 10 classes = 30; BOT: 6 samples x 9 classes = 54.
Runs: KSC: 870 runs; BOT: 400 runs.
Upper-left figure:
1. Shows examples of the AMI values for the SVM-CC approach on KSC at the 10th and 30th query, with each class pair corresponding to one bar.
2. Higher AMI values correspond to pairs 18 {C3, C4} and 25-28 {C4 vs. C5, C6, C7, C8}, which is consistent with our previous study showing that these are the most difficult classes in this data set.
3. Low values correspond to class pairs such as 9, 15-17, and 22-24, which all involve the easiest classes (8-10) in this data set. Comparing Fig. (a) and Fig. (b), the AMI changes across learning stages, and the values related to the hard classes increase as learning progresses.
Bottom figure: shows the AMI values over the learning process for all class pairs, for KSC and BOT respectively.
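The class pair indices quoted in these notes are consistent with numbering the pairs in lexicographic order, starting at 1. The ordering is an assumption (the slides do not state it), but the short check below reproduces every index mentioned here, including pair 18 = {C3, C6} for the 9-class BOT data; `pair_index_table` is an illustrative helper name.

```python
from itertools import combinations


def pair_index_table(n_classes):
    """Map 1-based pair index -> (class_i, class_j), pairs in lexicographic order."""
    pairs = combinations(range(1, n_classes + 1), 2)
    return {idx: pair for idx, pair in enumerate(pairs, start=1)}


ksc = pair_index_table(10)  # Kennedy Space Center data: 10 classes
bot = pair_index_table(9)   # Botswana data: 9 classes

print(ksc[18])                           # (3, 4)
print([ksc[i] for i in range(25, 29)])   # [(4, 5), (4, 6), (4, 7), (4, 8)]
print([ksc[i] for i in (9, 15, 16, 17, 22, 23, 24)])  # every pair involves class 8, 9 or 10
print(bot[18])                           # (3, 6)
```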
Table:
1. Compares the per-class classification accuracy improvement for DT relative to RS at the 600th and 400th query step for KSC and BOT, respectively.
2. The proposed methods clearly lead to better results than SVMMS and RS, especially for the hard classes.
3. SVM-CCMS performs better than SVM-CC because it also incorporates the uncertainty measure and can therefore target the most informative samples.
4. Some classes got worse because fewer samples were acquired for them, but not significantly.
5. Note that for all AL methods the water class (C10 in KSC, C1 in BOT) gains zero improvement, since it is the easiest class to discriminate from the others.
Figure: shows an example of the learning curves for KSC with SVM-CCMS. Although we are more interested in per-class performance, improvements are also achieved in the overall evaluation.
Table:
1. Per-class sampling ratio for the KSC and BOT data, compared across four methods. The proposed methods concentrate more on the hard classes.
2. The lowest sampling ratio is 23% for KSC and 0% for BOT, indicating that the sampling complexity of the corresponding class is quite low and that far fewer samples are needed to model it well. By querying fewer samples for this class, a lot of redundancy can be eliminated without sacrificing performance, which also frees space in the training set for sampling the hard classes.
Figure: the right-middle figure, SVs Ratio (KSC), plots the ratio of the total number of SVs to the size of the training data over the learning process. The bottom figure shows the number of support vectors (SVs) of each class over the learning process. Our methods clearly yield more SVs for the hard classes, and the overall ratio is high. Since the SVM decision function depends only on the SVs, a higher ratio indicates a higher utility of the training set constructed by the proposed methods [16].
Reference: J. Wang, P. Neskovic, and L. N. Cooper, “Training data selection for support vector machines,” in Proc. ICNC, 2005, pp. 554-564.
Thanks very much.
Fig. 1. Examples of ELsmI over the learning process for all class pairs: (a) KSC, (e) BOT; examples of the estimated probability for each class pair at different critical levels (highest at the bottom) at the last query: (b) KSC, (f) BOT; examples of ELsmI by SVM-CC for KSC at the 10th query (c) and the 30th query (d).
Fig. 1(a) and (e) show examples of the accumulated hyperplane change, represented by ELsmI, over the learning process with SVM-CC for the KSC and BOT data, respectively. Higher ELsmI values correspond to class pairs 18 {C3, C4} and 25-28 {C4 vs. C5, C6, C7, C8} for KSC, and 18 {C3, C6} for BOT, which is consistent with our previous study showing that these are the poorly separated class pairs in each data set (also see Fig. 2).
Fig. 1(c) and (d) show the ELsmI by SVM-CC for KSC at the 10th and 30th query, respectively, with each class pair corresponding to one bar. Several obvious valleys correspond to class pairs such as 9, 15-17, and 22-24, which all involve the easiest classes (8-10) in this data set.
Fig. 1(b) and (f) show examples of the estimated probability in Eq. 5 for each class pair (x-axis) at different critical levels (y-axis) at the last query for the KSC and BOT data, respectively. A higher value (brighter color) indicates a higher probability that a class pair is at a given critical level. Class pair 18 of KSC again has the highest probability at the highest critical level (45th, bottom), followed by class pairs 26 and 25 with high probability at levels 44 and 43, respectively. As the critical level goes down, more class pairs show comparable probabilities. For the BOT data, class pair 18 beats all the others at the highest level.