SlideShare ist ein Scribd-Unternehmen logo
1 von 7
Downloaden Sie, um offline zu lesen
Softroniics
Softroniics www.softroniics.com
Calicut||Coimbatore||Palakkad 9037291113,9037061113
Rank-Based Similarity Search: Reducing
the Dimensional Dependence
ABSTRACT
Rank based introduces a data structure for k-NN
search, the Rank Cover Tree (RCT), whose pruning tests rely solely on
the comparison of similarity values; other properties of the underlying
space, such as the triangle inequality, are not employed. Objects are
selected according to their ranks with respect to the query object,
allowing much tighter control on the overall execution costs. A formal
theoretical analysis shows that with very high probability, the RCT
returns a correct query result in time that depends very competitively on
a measure of the intrinsic dimensionality of the data set. The
experimental results for the RCT show that non-metric pruning
strategies for similarity search can be practical even when the
representational dimension of the data is extremely high. They also show
that the RCT is capable of meeting or exceeding the level of
performance of state-of-the-art methods that make use of metric pruning
or other selection tests involving numerical constraints on distance
values.
Softroniics
Softroniics www.softroniics.com
Calicut||Coimbatore||Palakkad 9037291113,9037061113
EXISTING SYSTEM:.
Virtually all existing indices make use of numerical
constraints for pruning and selection. Such constraints include the
triangle inequality (a linear constraint on three distance values), other
bounding surfaces defined in terms of distance (such as hypercubes or
hyperspheres range queries involving approximation factors as in
Locality- Sensitive Hashing (LSH) .or absolute quantities as additive
distance terms . One serious drawback of such operations based on
numerical constraints such as the triangle inequality or distance ranges is
that the number of objects actually examined can be highly variable, so
much so that the overall execution time cannot be easily predicted. In an
attempt to improve the scalability of applications that depend upon
similarity search, researchers and practitioners have investigated
practical methods for speeding up the computation of neighborhood
information at the expense of accuracy. For data mining applications, the
approaches considered have included feature sampling for local outlier
detection data sampling for clustering and approximate similarity search
for k-NN classification approximate similarity search indices include
the BD-Tree, a widely-recognized benchmark for approximate k-NN
search; it makes use of splitting rules and early termination to improve
upon the performance of the basic KD-Tree. One of the most popular
methods for indexing, Locality-Sensitive Hashing can also achieve
good practical search performance for range queries by managing
parameters that influence a tradeoff between accuracy and time. The
spatial approximation sample hierarchy (SASH) similarity search index
has had practical success in accelerating the performance of a shared-
neighbor clustering algorithm , for a variety of data types.
Softroniics
Softroniics www.softroniics.com
Calicut||Coimbatore||Palakkad 9037291113,9037061113
PROPOSED SYSTEM:
The proposed Rank Cover Tree blends some of the design features of the
SASH similarity search structure and the Cover Tree. Like the SASH
(and unlike the Cover Tree), we shall see that its use of ordinal pruning
allows for tight control on the execution costs associated with
approximate search queries. By restricting the number of neighboring
nodes to be visited at each level of the structure, the user can reduce the
average execution time at the expense of query accuracy. The
algorithmic descriptions of RCT construction and query processing are
outlined in respectively. Tree-based strategies for proximity search
typically use a distance metric in two different ways as a numerical
constraint on the distances among three data objects as exemplified by
the triangle inequality, or as an numerical (absolute) constraint on the
distance of candidates from a reference point. The proposed Rank
Cover Tree differs from most other search structures in that it makes use
of the distance metric solely for ordinal pruning, thereby avoiding many
of the difficulties associated with traditional approaches in high-
dimensional settings, such as the loss of effectiveness of the triangle
inequality for pruning search paths. proposes a rankbased hashing
scheme, in which similarity computation relies on rank averaging and
other arithmetic operations. No experimental results for this method
have appeared in the research literature as yet. Another algorithm we
considered, the combinatorial random connection graph search method
RanWalk , is mainly of theoretical interest, since the preprocessing time
and space required is quadratic in the data set size.
Softroniics
Softroniics www.softroniics.com
Calicut||Coimbatore||Palakkad 9037291113,9037061113
MODULE DESCRIPTION:
1. User.
2. Ranking based search.
3. Admin.
4. Nearest neighbor search.
5. Intrinsic dimensionality.
1. User.
User registration ,after login file upload view to specify
their requirements and assign their priorities at varied levels of the
hierarchical representation in order and view for Map.The performance
of different approaches to rank provider allocation.
2. Ranking based Search.
The RCT is the first practical rank-based similarity
search for small choices of parameter h, its fixed-height variant achieves
a polynomial dependence on the expansion rate of much smaller degree
than attained by the only other practical polynomially-dependent
structure known to date (the Cover Tree).
3. Admin.
Admin provider for location details and allocation map
search to place images.the chart using for different date sum ot time
based on used.
Softroniics
Softroniics www.softroniics.com
Calicut||Coimbatore||Palakkad 9037291113,9037061113
4. Nearest neighbor search.
Nearest neighbor search (NNS), also known as
proximity search, similarity search , is an optimization problem for
finding closest points. Closeness is typically expressed in terms of a
dissimilarity function: the less similar the objects, the larger the function
values. Formally, the nearest-neighbor (NN) search problem is defined
as follows: given a set S of points in a space M and a query point find
the closest point. Referring to an application of assigning to a residence
the nearest post office. A direct generalization of this problem is a k-NN
search, where we need to find the k closest points.
5. Intrinsic dimensionality.
The cover tree has a theoretical bound that is based on the dataset's
doubling constant . The bound on search time expansion constant of
the dataset.
Softroniics
Softroniics www.softroniics.com
Calicut||Coimbatore||Palakkad 9037291113,9037061113
Scope:
The accuracy of FLANN may further be influenced by
varying a parameter that governs the maximum number of leaf nodes
that can be searched. As a baseline, we also tested the performance of
sequential search over random samples of the data, of varying sizes.
.
Problem Statement:
Similarity search on problems in data mining, machine learning, pattern
recognition, and statistics, the design and analysis of scalable and
effective similarity search structures has been the subject of intensive
research for many decades. Until relatively recently, most data structures
for similarity search targeted low-dimensional real vector space
representations and the euclidean or other Lp distance metrics. However,
many public and commercial data sets available today are more naturally
represented as vectors spanning many hundreds or thousands of feature
attributes, that can be real or integer-valued, ordinal or categorical, or
even a mixture of these types.
Softroniics
Softroniics www.softroniics.com
Calicut||Coimbatore||Palakkad 9037291113,9037061113
CONCLUSION:
We have presented a new structure for similarity search, the Rank Cover
Tree, whose ordinal pruning strategy makes use only of direct
comparisons between distance values. The RCT construction and query
execution costs do not explicitly depend on the representational
dimension of the data, but can be analyzed probabilistically in terms of a
measure of intrinsic dimensionality, the expansion rate. The RCT is the
first practical rank-based similarity search index with a formal
theoretical performance analysis in terms of the expansion rate; for small
choices of parameter h, its fixed-height variant achieves a polynomial
dependence on the expansion rate of much smaller degree than attained
by the only other practical polynomially-dependent structure known to
date (the Cover Tree), while still maintaining sublinear dependence on
the number of data objects . An estimation of the values of the expansion
rates several of the data sets considered in the experimentation they
show that in most cases, the ability to trade away many factors of the
expansion rate more than justifies the acceptance of a polynomial cost in
terms of n. The experimental results support the theoretical analysis, as
they clearly indicate that the RCT outperforms its two closest relatives
the Cover Tree and SASH structures in many cases, and consistently
outperforms the E2LSH implementation of LSH, classical indices such
as the KD-Tree and BD-Tree, and for data sets of high dimensionality
the KD-Tree ensemble method FLANN.

Weitere ähnliche Inhalte

Andere mochten auch

IDCC 3105 Avenant n°3 annexe 2 clause de sauvegarde - v2
IDCC 3105 Avenant n°3 annexe 2   clause de sauvegarde - v2IDCC 3105 Avenant n°3 annexe 2   clause de sauvegarde - v2
IDCC 3105 Avenant n°3 annexe 2 clause de sauvegarde - v2Société Tripalio
 
The role of radio broadcasting in public enlightenment (a case study of port...
The role of radio broadcasting in public enlightenment  (a case study of port...The role of radio broadcasting in public enlightenment  (a case study of port...
The role of radio broadcasting in public enlightenment (a case study of port...Newman Enyioko
 
November hero of the month: Paul
November hero of the month: PaulNovember hero of the month: Paul
November hero of the month: PaulMyWonderStudio
 
Nivea - world largest skin care brand
Nivea -  world largest skin care brandNivea -  world largest skin care brand
Nivea - world largest skin care brandKatharina_H
 
Supply chain management ppt
Supply chain management pptSupply chain management ppt
Supply chain management pptShivani Singh
 
Airtightness and insulation with solar study
Airtightness and insulation with solar studyAirtightness and insulation with solar study
Airtightness and insulation with solar studyJonathan Flanagan
 
IBM WebSphere Application Server version to version comparison
IBM WebSphere Application Server version to version comparisonIBM WebSphere Application Server version to version comparison
IBM WebSphere Application Server version to version comparisonejlp12
 

Andere mochten auch (10)

IDCC 3105 Avenant n°3 annexe 2 clause de sauvegarde - v2
IDCC 3105 Avenant n°3 annexe 2   clause de sauvegarde - v2IDCC 3105 Avenant n°3 annexe 2   clause de sauvegarde - v2
IDCC 3105 Avenant n°3 annexe 2 clause de sauvegarde - v2
 
The role of radio broadcasting in public enlightenment (a case study of port...
The role of radio broadcasting in public enlightenment  (a case study of port...The role of radio broadcasting in public enlightenment  (a case study of port...
The role of radio broadcasting in public enlightenment (a case study of port...
 
Jefff mpob
Jefff mpobJefff mpob
Jefff mpob
 
November hero of the month: Paul
November hero of the month: PaulNovember hero of the month: Paul
November hero of the month: Paul
 
Nivea - world largest skin care brand
Nivea -  world largest skin care brandNivea -  world largest skin care brand
Nivea - world largest skin care brand
 
Supply chain management ppt
Supply chain management pptSupply chain management ppt
Supply chain management ppt
 
Airtightness and insulation with solar study
Airtightness and insulation with solar studyAirtightness and insulation with solar study
Airtightness and insulation with solar study
 
IBM WebSphere Application Server version to version comparison
IBM WebSphere Application Server version to version comparisonIBM WebSphere Application Server version to version comparison
IBM WebSphere Application Server version to version comparison
 
Internal Branding
Internal BrandingInternal Branding
Internal Branding
 
とある色の決め方
とある色の決め方とある色の決め方
とある色の決め方
 

Mehr von shanofa sanu

Dynamic google-remote-data-collection-docx
Dynamic google-remote-data-collection-docxDynamic google-remote-data-collection-docx
Dynamic google-remote-data-collection-docxshanofa sanu
 
Mobile phone-based-drunk-driving-detection-system-docx
Mobile phone-based-drunk-driving-detection-system-docxMobile phone-based-drunk-driving-detection-system-docx
Mobile phone-based-drunk-driving-detection-system-docxshanofa sanu
 
Bluetooth based-chatting-system-using-android-docx
Bluetooth based-chatting-system-using-android-docxBluetooth based-chatting-system-using-android-docx
Bluetooth based-chatting-system-using-android-docxshanofa sanu
 
Face to-face proximity estimation using bluetooth on smartphones
Face to-face proximity estimation using bluetooth on smartphonesFace to-face proximity estimation using bluetooth on smartphones
Face to-face proximity estimation using bluetooth on smartphonesshanofa sanu
 
Context based access control systems for mobile devices
Context based access control systems for mobile devicesContext based access control systems for mobile devices
Context based access control systems for mobile devicesshanofa sanu
 
Collaborative policy administration
Collaborative policy administrationCollaborative policy administration
Collaborative policy administrationshanofa sanu
 
Mobile data gathering with load balanced clustering and dual data uploading i...
Mobile data gathering with load balanced clustering and dual data uploading i...Mobile data gathering with load balanced clustering and dual data uploading i...
Mobile data gathering with load balanced clustering and dual data uploading i...shanofa sanu
 
electronics-embedded-project-topics-list-softroniics
electronics-embedded-project-topics-list-softroniicselectronics-embedded-project-topics-list-softroniics
electronics-embedded-project-topics-list-softroniicsshanofa sanu
 
11.paqcs physical design aware fault-tolerant quantum circuit synthesis
11.paqcs physical design aware fault-tolerant quantum circuit synthesis11.paqcs physical design aware fault-tolerant quantum circuit synthesis
11.paqcs physical design aware fault-tolerant quantum circuit synthesisshanofa sanu
 
11.1 automatic moving object extraction (1)
11.1 automatic moving object extraction  (1)11.1 automatic moving object extraction  (1)
11.1 automatic moving object extraction (1)shanofa sanu
 
272465451 raspberry-pi-based-project-abstracts
272465451 raspberry-pi-based-project-abstracts272465451 raspberry-pi-based-project-abstracts
272465451 raspberry-pi-based-project-abstractsshanofa sanu
 
Discovering latent semantics in web documents
Discovering latent semantics in web documentsDiscovering latent semantics in web documents
Discovering latent semantics in web documentsshanofa sanu
 

Mehr von shanofa sanu (14)

Dynamic google-remote-data-collection-docx
Dynamic google-remote-data-collection-docxDynamic google-remote-data-collection-docx
Dynamic google-remote-data-collection-docx
 
Mobile phone-based-drunk-driving-detection-system-docx
Mobile phone-based-drunk-driving-detection-system-docxMobile phone-based-drunk-driving-detection-system-docx
Mobile phone-based-drunk-driving-detection-system-docx
 
Bluetooth based-chatting-system-using-android-docx
Bluetooth based-chatting-system-using-android-docxBluetooth based-chatting-system-using-android-docx
Bluetooth based-chatting-system-using-android-docx
 
Face to-face proximity estimation using bluetooth on smartphones
Face to-face proximity estimation using bluetooth on smartphonesFace to-face proximity estimation using bluetooth on smartphones
Face to-face proximity estimation using bluetooth on smartphones
 
Context based access control systems for mobile devices
Context based access control systems for mobile devicesContext based access control systems for mobile devices
Context based access control systems for mobile devices
 
Collaborative policy administration
Collaborative policy administrationCollaborative policy administration
Collaborative policy administration
 
A novel high step
A novel high stepA novel high step
A novel high step
 
Mobile data gathering with load balanced clustering and dual data uploading i...
Mobile data gathering with load balanced clustering and dual data uploading i...Mobile data gathering with load balanced clustering and dual data uploading i...
Mobile data gathering with load balanced clustering and dual data uploading i...
 
electronics-embedded-project-topics-list-softroniics
electronics-embedded-project-topics-list-softroniicselectronics-embedded-project-topics-list-softroniics
electronics-embedded-project-topics-list-softroniics
 
Power decoupling
Power decouplingPower decoupling
Power decoupling
 
11.paqcs physical design aware fault-tolerant quantum circuit synthesis
11.paqcs physical design aware fault-tolerant quantum circuit synthesis11.paqcs physical design aware fault-tolerant quantum circuit synthesis
11.paqcs physical design aware fault-tolerant quantum circuit synthesis
 
11.1 automatic moving object extraction (1)
11.1 automatic moving object extraction  (1)11.1 automatic moving object extraction  (1)
11.1 automatic moving object extraction (1)
 
272465451 raspberry-pi-based-project-abstracts
272465451 raspberry-pi-based-project-abstracts272465451 raspberry-pi-based-project-abstracts
272465451 raspberry-pi-based-project-abstracts
 
Discovering latent semantics in web documents
Discovering latent semantics in web documentsDiscovering latent semantics in web documents
Discovering latent semantics in web documents
 

Kürzlich hochgeladen

Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...Pooja Nehwal
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
The byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxThe byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxShobhayan Kirtania
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 

Kürzlich hochgeladen (20)

Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
The byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxThe byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptx
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 

Rank based similarity search reducing

  • 1. Softroniics Softroniics www.softroniics.com Calicut||Coimbatore||Palakkad 9037291113,9037061113 Rank-Based Similarity Search: Reducing the Dimensional Dependence ABSTRACT Rank based introduces a data structure for k-NN search, the Rank Cover Tree (RCT), whose pruning tests rely solely on the comparison of similarity values; other properties of the underlying space, such as the triangle inequality, are not employed. Objects are selected according to their ranks with respect to the query object, allowing much tighter control on the overall execution costs. A formal theoretical analysis shows that with very high probability, the RCT returns a correct query result in time that depends very competitively on a measure of the intrinsic dimensionality of the data set. The experimental results for the RCT show that non-metric pruning strategies for similarity search can be practical even when the representational dimension of the data is extremely high. They also show that the RCT is capable of meeting or exceeding the level of performance of state-of-the-art methods that make use of metric pruning or other selection tests involving numerical constraints on distance values.
  • 2. Softroniics Softroniics www.softroniics.com Calicut||Coimbatore||Palakkad 9037291113,9037061113 EXISTING SYSTEM:. Virtually all existing indices make use of numerical constraints for pruning and selection. Such constraints include the triangle inequality (a linear constraint on three distance values), other bounding surfaces defined in terms of distance (such as hypercubes or hyperspheres range queries involving approximation factors as in Locality- Sensitive Hashing (LSH) .or absolute quantities as additive distance terms . One serious drawback of such operations based on numerical constraints such as the triangle inequality or distance ranges is that the number of objects actually examined can be highly variable, so much so that the overall execution time cannot be easily predicted. In an attempt to improve the scalability of applications that depend upon similarity search, researchers and practitioners have investigated practical methods for speeding up the computation of neighborhood information at the expense of accuracy. For data mining applications, the approaches considered have included feature sampling for local outlier detection data sampling for clustering and approximate similarity search for k-NN classification approximate similarity search indices include the BD-Tree, a widely-recognized benchmark for approximate k-NN search; it makes use of splitting rules and early termination to improve upon the performance of the basic KD-Tree. One of the most popular methods for indexing, Locality-Sensitive Hashing can also achieve good practical search performance for range queries by managing parameters that influence a tradeoff between accuracy and time. The spatial approximation sample hierarchy (SASH) similarity search index has had practical success in accelerating the performance of a shared- neighbor clustering algorithm , for a variety of data types.
  • 3. Softroniics Softroniics www.softroniics.com Calicut||Coimbatore||Palakkad 9037291113,9037061113 PROPOSED SYSTEM: The proposed Rank Cover Tree blends some of the design features of the SASH similarity search structure and the Cover Tree. Like the SASH (and unlike the Cover Tree), we shall see that its use of ordinal pruning allows for tight control on the execution costs associated with approximate search queries. By restricting the number of neighboring nodes to be visited at each level of the structure, the user can reduce the average execution time at the expense of query accuracy. The algorithmic descriptions of RCT construction and query processing are outlined in respectively. Tree-based strategies for proximity search typically use a distance metric in two different ways as a numerical constraint on the distances among three data objects as exemplified by the triangle inequality, or as an numerical (absolute) constraint on the distance of candidates from a reference point. The proposed Rank Cover Tree differs from most other search structures in that it makes use of the distance metric solely for ordinal pruning, thereby avoiding many of the difficulties associated with traditional approaches in high- dimensional settings, such as the loss of effectiveness of the triangle inequality for pruning search paths. proposes a rankbased hashing scheme, in which similarity computation relies on rank averaging and other arithmetic operations. No experimental results for this method have appeared in the research literature as yet. Another algorithm we considered, the combinatorial random connection graph search method RanWalk , is mainly of theoretical interest, since the preprocessing time and space required is quadratic in the data set size.
  • 4. Softroniics Softroniics www.softroniics.com Calicut||Coimbatore||Palakkad 9037291113,9037061113 MODULE DESCRIPTION: 1. User. 2. Ranking based search. 3. Admin. 4. Nearest neighbor search. 5. Intrinsic dimensionality. 1. User. User registration ,after login file upload view to specify their requirements and assign their priorities at varied levels of the hierarchical representation in order and view for Map.The performance of different approaches to rank provider allocation. 2. Ranking based Search. The RCT is the first practical rank-based similarity search for small choices of parameter h, its fixed-height variant achieves a polynomial dependence on the expansion rate of much smaller degree than attained by the only other practical polynomially-dependent structure known to date (the Cover Tree). 3. Admin. Admin provider for location details and allocation map search to place images.the chart using for different date sum ot time based on used.
  • 5. Softroniics Softroniics www.softroniics.com Calicut||Coimbatore||Palakkad 9037291113,9037061113 4. Nearest neighbor search. Nearest neighbor search (NNS), also known as proximity search, similarity search , is an optimization problem for finding closest points. Closeness is typically expressed in terms of a dissimilarity function: the less similar the objects, the larger the function values. Formally, the nearest-neighbor (NN) search problem is defined as follows: given a set S of points in a space M and a query point find the closest point. Referring to an application of assigning to a residence the nearest post office. A direct generalization of this problem is a k-NN search, where we need to find the k closest points. 5. Intrinsic dimensionality. The cover tree has a theoretical bound that is based on the dataset's doubling constant . The bound on search time expansion constant of the dataset.
  • 6. Softroniics Softroniics www.softroniics.com Calicut||Coimbatore||Palakkad 9037291113,9037061113 Scope: The accuracy of FLANN may further be influenced by varying a parameter that governs the maximum number of leaf nodes that can be searched. As a baseline, we also tested the performance of sequential search over random samples of the data, of varying sizes. . Problem Statement: Similarity search on problems in data mining, machine learning, pattern recognition, and statistics, the design and analysis of scalable and effective similarity search structures has been the subject of intensive research for many decades. Until relatively recently, most data structures for similarity search targeted low-dimensional real vector space representations and the euclidean or other Lp distance metrics. However, many public and commercial data sets available today are more naturally represented as vectors spanning many hundreds or thousands of feature attributes, that can be real or integer-valued, ordinal or categorical, or even a mixture of these types.
  • 7. Softroniics Softroniics www.softroniics.com Calicut||Coimbatore||Palakkad 9037291113,9037061113 CONCLUSION: We have presented a new structure for similarity search, the Rank Cover Tree, whose ordinal pruning strategy makes use only of direct comparisons between distance values. The RCT construction and query execution costs do not explicitly depend on the representational dimension of the data, but can be analyzed probabilistically in terms of a measure of intrinsic dimensionality, the expansion rate. The RCT is the first practical rank-based similarity search index with a formal theoretical performance analysis in terms of the expansion rate; for small choices of parameter h, its fixed-height variant achieves a polynomial dependence on the expansion rate of much smaller degree than attained by the only other practical polynomially-dependent structure known to date (the Cover Tree), while still maintaining sublinear dependence on the number of data objects . An estimation of the values of the expansion rates several of the data sets considered in the experimentation they show that in most cases, the ability to trade away many factors of the expansion rate more than justifies the acceptance of a polynomial cost in terms of n. The experimental results support the theoretical analysis, as they clearly indicate that the RCT outperforms its two closest relatives the Cover Tree and SASH structures in many cases, and consistently outperforms the E2LSH implementation of LSH, classical indices such as the KD-Tree and BD-Tree, and for data sets of high dimensionality the KD-Tree ensemble method FLANN.