Reverse nearest neighbors in unsupervised distance based outlier detection

•

0 gefällt mir•1,005 views

Reverse nearest neighbors in unsupervised distance based outlier detection +91-9994232214,8144199666, ieeeprojectchennai@gmail.com, www.projectsieee.com, www.ieee-projects-chennai.com IEEE PROJECTS 2015-2016 ----------------------------------- Contact:+91-9994232214,+91-8144199666 Email:ieeeprojectchennai@gmail.com Support: ------------- Projects Code Documentation PPT Projects Video File Projects Explanation Teamviewer Support

Bildung

Reverse Nearest Neighbors in Unsupervised Distance-Based Outlier
Detection
Abstract:
Outlier detection in high-dimensional data presents various challenges
resulting from the “curse of dimensionality.” A prevailing view is that
distance concentration, i.e., the tendency of distances in high-dimensional
data to become indiscernible, hinders the detection of outliers by making
distance-based methods label all points as almost equally good outliers. In
this paper we provide evidence supporting the opinion that such a view is
too simple, by demonstrating that distance-based methods can produce
more contrasting outlier scores in high-dimensional settings. Furthermore,
we show that high dimensionality can have a different impact, by
reexamining the notion of reverse nearest neighbors in the unsupervised
outlier-detection context. Namely, it was recently observed that the
distribution of points’ reverse-neighbor counts becomes skewed in high
dimensions, resulting in the phenomenon known as hubness. We provide
insight into how some points (antihubs) appear very infrequently in k-NN
lists of other points, and explain the connection between antihubs, outliers,
and existing unsupervised outlier-detection methods. By evaluating the
classic k-NN method, the angle-based technique (ABOD) designed for
high-dimensional data, the density-based local outlier factor (LOF) and

influenced outlierness (INFLO) methods, and antihub-based methods on
various synthetic and real-world data sets, we offer novel insight into the
usefulness of reverse neighbor counts in unsupervised outlier detection.
Existing System:
The task of detecting outliers can be categorized as supervised, semi-
supervised, and unsupervised, depending on the existence of labels for
outliers and/or regular instances. Among these categories, unsupervised
methods are more widely applied, because the other categories require
accurate and representative labels that are often prohibitively expensive to
obtain.
Unsupervised methods include distance-based methods that mainly rely
on a measure of distance or similarity in order to detect outliers.
Proposed System:
It is crucial to understand how the increase of dimensionality impacts
outlier detection. As explained the actual challenges posed by the “curse of
dimensionality” differ from the commonly accepted view that every point
becomes an almost equally good outlier in high-dimensional space [9]. We
will present further evidence which challenges this view, motivating the
(re)examination of methods.
Hardware Requirements:
• System : Pentium IV 2.4 GHz.

• Hard Disk : 40 GB.
• Floppy Drive : 1.44 Mb.
• Monitor : 15 VGA Colour.
• Mouse : Logitech.
• RAM : 256 Mb.
Software Requirements:
• Operating system : - Windows XP.
• Front End : - JSP
• Back End : - SQL Server
Software Requirements:
• Operating system : - Windows XP.
• Front End : - .Net
• Back End : - SQL Server

Weitere ähnliche Inhalte

Andere mochten auch

"Outliers" - Malcolm Gladwell Book ReviewArchit Rathi

OutliersAlexandru Dorobantu

OutliersNikesh Patra

Chapter 12 outlierHouw Liong The

OutliersChidirala Anil Shankar

Final Year IEEE Project 2013-2014 - Cloud Computing Project Title and Abstractelysiumtechnologies

3.7 outlier analysisKrish_ver2

Anomaly detection Meetup SlidesQuantUniversity

Multivariate data analysisSetia Pramana

Deep learning - Part IQuantUniversity

Data Mining: Outlier analysisDataminingTools Inc

Outliers, The story of success by Malcom GladwellPrathamesh Malaikar

Slideshare Powerpoint presentationelliehood

Slideshare pptMandy Suzanne

Andere mochten auch (14)

"Outliers" - Malcolm Gladwell Book Review

Outliers

Chapter 12 outlier

Outliers

Final Year IEEE Project 2013-2014 - Cloud Computing Project Title and Abstract

3.7 outlier analysis

Anomaly detection Meetup Slides

Multivariate data analysis

Deep learning - Part I

Data Mining: Outlier analysis

Outliers, The story of success by Malcom Gladwell

Slideshare Powerpoint presentation

Slideshare ppt

Mehr von ieeepondy

Demand aware network function placementieeepondy

Service description in the nfv revolution trends, challenges and a way forwardieeepondy

Secure optimization computation outsourcing in cloud computing a case study o...ieeepondy

Spatial related traffic sign inspection for inventory purposes using mobile l...ieeepondy

Standards for hybrid cloudsieeepondy

Rfhoc a random forest approach to auto-tuning hadoop's configurationieeepondy

Resource and instance hour minimization for deadline constrained dag applicat...ieeepondy

Reliable and confidential cloud storage with efficient data forwarding functi...ieeepondy

Rebuttal to “comments on ‘control cloud data access privilege and anonymity w...ieeepondy

Scalable cloud–sensor architecture for the internet of thingsieeepondy

Scalable algorithms for nearest neighbor joins on big trajectory dataieeepondy

Robust workload and energy management for sustainable data centersieeepondy

Privacy preserving deep computation model on cloud for big data feature learningieeepondy

Pricing the cloud ieee projects, ieee projects chennai, ieee projects 2016,ie...ieeepondy

Protection of big data privacyieeepondy

Power optimization with bler constraint for wireless fronthauls in c ranieeepondy

Performance aware cloud resource allocation via fitness-enabled auctionieeepondy

Performance limitations of a text search application running in cloud instancesieeepondy

Performance analysis and optimal cooperative cluster size for randomly distri...ieeepondy

Predictive control for energy aware consolidation in cloud datacentersieeepondy

Mehr von ieeepondy (20)

Demand aware network function placement

Service description in the nfv revolution trends, challenges and a way forward

Secure optimization computation outsourcing in cloud computing a case study o...

Spatial related traffic sign inspection for inventory purposes using mobile l...

Standards for hybrid clouds

Rfhoc a random forest approach to auto-tuning hadoop's configuration

Resource and instance hour minimization for deadline constrained dag applicat...

Reliable and confidential cloud storage with efficient data forwarding functi...

Rebuttal to “comments on ‘control cloud data access privilege and anonymity w...

Scalable cloud–sensor architecture for the internet of things

Scalable algorithms for nearest neighbor joins on big trajectory data

Robust workload and energy management for sustainable data centers

Privacy preserving deep computation model on cloud for big data feature learning

Pricing the cloud ieee projects, ieee projects chennai, ieee projects 2016,ie...

Protection of big data privacy

Power optimization with bler constraint for wireless fronthauls in c ran

Performance aware cloud resource allocation via fitness-enabled auction

Performance limitations of a text search application running in cloud instances

Performance analysis and optimal cooperative cluster size for randomly distri...

Predictive control for energy aware consolidation in cloud datacenters

Kürzlich hochgeladen

ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1

Difference Between Search & Browse Methods in Odoo 17Celine George

Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco

call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR

Earth Day Presentation wow hello nice greatYousafMalik24

Global Lehigh Strategic Initiatives (without descriptions)cama23

HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection

FILIPINO PSYCHology sikolohiyang pilipinojohnmickonozaleda

INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña

THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONHumphrey A Beña

USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...Postal Advocate Inc.

Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝9953056974 Low Rate Call Girls In Saket, Delhi NCR

How to do quick user assign in kanban in Odoo 17 ERPCeline George

Influencing policy (training slides from Fast Track Impact)Mark Reed

Choosing the Right CBSE School A Comprehensive Guide for Parentsnavabharathschool99

Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfErwinPantujan2

What is Model Inheritance in Odoo 17 ERPCeline George

ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1

4.16.24 21st Century Movements for Black Lives.pptxmary850239

ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYKayeClaireEstoconing

Kürzlich hochgeladen (20)

ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...

Difference Between Search & Browse Methods in Odoo 17

Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf

call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️

Earth Day Presentation wow hello nice great

Global Lehigh Strategic Initiatives (without descriptions)

HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...

FILIPINO PSYCHology sikolohiyang pilipino

INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx

THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION

USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...

Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝

How to do quick user assign in kanban in Odoo 17 ERP

Influencing policy (training slides from Fast Track Impact)

Choosing the Right CBSE School A Comprehensive Guide for Parents

Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf

What is Model Inheritance in Odoo 17 ERP

ANG SEKTOR NG agrikultura.pptx QUARTER 4

4.16.24 21st Century Movements for Black Lives.pptx

ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY

Reverse nearest neighbors in unsupervised distance based outlier detection

1. Reverse Nearest Neighbors in Unsupervised Distance-Based Outlier Detection Abstract: Outlier detection in high-dimensional data presents various challenges resulting from the “curse of dimensionality.” A prevailing view is that distance concentration, i.e., the tendency of distances in high-dimensional data to become indiscernible, hinders the detection of outliers by making distance-based methods label all points as almost equally good outliers. In this paper we provide evidence supporting the opinion that such a view is too simple, by demonstrating that distance-based methods can produce more contrasting outlier scores in high-dimensional settings. Furthermore, we show that high dimensionality can have a different impact, by reexamining the notion of reverse nearest neighbors in the unsupervised outlier-detection context. Namely, it was recently observed that the distribution of points’ reverse-neighbor counts becomes skewed in high dimensions, resulting in the phenomenon known as hubness. We provide insight into how some points (antihubs) appear very infrequently in k-NN lists of other points, and explain the connection between antihubs, outliers, and existing unsupervised outlier-detection methods. By evaluating the classic k-NN method, the angle-based technique (ABOD) designed for high-dimensional data, the density-based local outlier factor (LOF) and

2. influenced outlierness (INFLO) methods, and antihub-based methods on various synthetic and real-world data sets, we offer novel insight into the usefulness of reverse neighbor counts in unsupervised outlier detection. Existing System: The task of detecting outliers can be categorized as supervised, semi- supervised, and unsupervised, depending on the existence of labels for outliers and/or regular instances. Among these categories, unsupervised methods are more widely applied, because the other categories require accurate and representative labels that are often prohibitively expensive to obtain. Unsupervised methods include distance-based methods that mainly rely on a measure of distance or similarity in order to detect outliers. Proposed System: It is crucial to understand how the increase of dimensionality impacts outlier detection. As explained the actual challenges posed by the “curse of dimensionality” differ from the commonly accepted view that every point becomes an almost equally good outlier in high-dimensional space [9]. We will present further evidence which challenges this view, motivating the (re)examination of methods. Hardware Requirements: • System : Pentium IV 2.4 GHz.

3. • Hard Disk : 40 GB. • Floppy Drive : 1.44 Mb. • Monitor : 15 VGA Colour. • Mouse : Logitech. • RAM : 256 Mb. Software Requirements: • Operating system : - Windows XP. • Front End : - JSP • Back End : - SQL Server Software Requirements: • Operating system : - Windows XP. • Front End : - .Net • Back End : - SQL Server

Reverse nearest neighbors in unsupervised distance based outlier detection

Empfohlen

Empfohlen

Weitere ähnliche Inhalte

Andere mochten auch

Andere mochten auch (14)

Mehr von ieeepondy

Mehr von ieeepondy (20)

Kürzlich hochgeladen

Kürzlich hochgeladen (20)

Reverse nearest neighbors in unsupervised distance based outlier detection