SlideShare ist ein Scribd-Unternehmen logo
1 von 12
An Adaptive Algorithm for Detection of Duplicate Records Presented By: Rama kanta Behera  IT200127207 Under the guidance of : Miss Ipsita Mishra
INTRODUCTION ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
OBJECTIVES ,[object Object],[object Object],[object Object],[object Object]
PREVALENT METHODS   ,[object Object],[object Object],[object Object],[object Object]
OUTLINE OF THE PROPOSED SOLUTION   The central idea behind the present algorithm is based on the fundamental property of primality of numbers  I f(x) Record set Integer number space Fig: hashing I P Record set Integer number  Prime number  f(x) g(x) Fig: Extended hashing into prime space
r1 r2 … rn I1 I2 … In P1 P2 … Pn PRODUCT( P prior) f(x) g(x) P1*p2 …*pn= P prior Fig: The complete algorithm
REALIZATION OF THE ALGORITHM  ,[object Object],[object Object],[object Object]
STEPS OF THE ALGORITHM   Step 1  : For each new record, hash is performed and unique hash value (Hnew) for each distinct record is obtained.  Step 2  : Hnew is mapped to its corresponding unique prime (Pnew). Step 3  : Pprior is divided with Pnew. If Pnew exactly divides Pprior, then the corresponding record to Pnew is a duplicate and already exists in Pprior. Else, Pnew is a distinct record. Step 4  : If Pnew is a distinct record, Pprior is multiplied with Pnew and the result is stored back in Pprior. Thus updating Pprior renders the algorithm adaptive.
Fig: Flowchart
IMPLEMENTATIONS There are three important implementation details that need to be discussed  ,[object Object],[object Object],[object Object]
CONCLUSION ,[object Object],[object Object]
THANK YOU !!!

Weitere ähnliche Inhalte

Was ist angesagt?

Computer Science Engineering : Data structure & algorithm, THE GATE ACADEMY
Computer Science Engineering : Data structure & algorithm, THE GATE ACADEMYComputer Science Engineering : Data structure & algorithm, THE GATE ACADEMY
Computer Science Engineering : Data structure & algorithm, THE GATE ACADEMYklirantga
 
Standardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for PythonStandardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for PythonRalf Gommers
 
150970116028 2140705
150970116028 2140705150970116028 2140705
150970116028 2140705Manoj Shahu
 
CNIT 127 Ch 5: Introduction to heap overflows
CNIT 127 Ch 5: Introduction to heap overflowsCNIT 127 Ch 5: Introduction to heap overflows
CNIT 127 Ch 5: Introduction to heap overflowsSam Bowne
 
DCC2014 - Fully Online Grammar Compression in Constant Space
DCC2014 - Fully Online Grammar Compression in Constant SpaceDCC2014 - Fully Online Grammar Compression in Constant Space
DCC2014 - Fully Online Grammar Compression in Constant SpaceYasuo Tabei
 
05 heap 20161110_jintaeks
05 heap 20161110_jintaeks05 heap 20161110_jintaeks
05 heap 20161110_jintaeksJinTaek Seo
 
Ch 5: Introduction to heap overflows
Ch 5: Introduction to heap overflowsCh 5: Introduction to heap overflows
Ch 5: Introduction to heap overflowsSam Bowne
 
Faster persistent data structures through hashing
Faster persistent data structures through hashingFaster persistent data structures through hashing
Faster persistent data structures through hashingJohan Tibell
 
High Performance Python - Marc Garcia
High Performance Python - Marc GarciaHigh Performance Python - Marc Garcia
High Performance Python - Marc GarciaMarc Garcia
 

Was ist angesagt? (20)

Computer Science Engineering : Data structure & algorithm, THE GATE ACADEMY
Computer Science Engineering : Data structure & algorithm, THE GATE ACADEMYComputer Science Engineering : Data structure & algorithm, THE GATE ACADEMY
Computer Science Engineering : Data structure & algorithm, THE GATE ACADEMY
 
Essential NumPy
Essential NumPyEssential NumPy
Essential NumPy
 
Standardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for PythonStandardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for Python
 
Big O Notation
Big O NotationBig O Notation
Big O Notation
 
150970116028 2140705
150970116028 2140705150970116028 2140705
150970116028 2140705
 
Apriori algorithm
Apriori algorithmApriori algorithm
Apriori algorithm
 
CNIT 127 Ch 5: Introduction to heap overflows
CNIT 127 Ch 5: Introduction to heap overflowsCNIT 127 Ch 5: Introduction to heap overflows
CNIT 127 Ch 5: Introduction to heap overflows
 
Stack Data structure
Stack Data structureStack Data structure
Stack Data structure
 
S1140183 Presentation
S1140183 PresentationS1140183 Presentation
S1140183 Presentation
 
Plotting data with python and pylab
Plotting data with python and pylabPlotting data with python and pylab
Plotting data with python and pylab
 
Stack
StackStack
Stack
 
DCC2014 - Fully Online Grammar Compression in Constant Space
DCC2014 - Fully Online Grammar Compression in Constant SpaceDCC2014 - Fully Online Grammar Compression in Constant Space
DCC2014 - Fully Online Grammar Compression in Constant Space
 
Sortsearch
SortsearchSortsearch
Sortsearch
 
05 heap 20161110_jintaeks
05 heap 20161110_jintaeks05 heap 20161110_jintaeks
05 heap 20161110_jintaeks
 
Ch 5: Introduction to heap overflows
Ch 5: Introduction to heap overflowsCh 5: Introduction to heap overflows
Ch 5: Introduction to heap overflows
 
Lo18
Lo18Lo18
Lo18
 
Faster persistent data structures through hashing
Faster persistent data structures through hashingFaster persistent data structures through hashing
Faster persistent data structures through hashing
 
High Performance Python - Marc Garcia
High Performance Python - Marc GarciaHigh Performance Python - Marc Garcia
High Performance Python - Marc Garcia
 
Heap_Sort1.pptx
Heap_Sort1.pptxHeap_Sort1.pptx
Heap_Sort1.pptx
 
Cs 62
Cs 62Cs 62
Cs 62
 

Andere mochten auch

Progressive duplicate detection
Progressive duplicate detectionProgressive duplicate detection
Progressive duplicate detectionieeepondy
 
A study and survey on various progressive duplicate detection mechanisms
A study and survey on various progressive duplicate detection mechanismsA study and survey on various progressive duplicate detection mechanisms
A study and survey on various progressive duplicate detection mechanismseSAT Journals
 
The Duplicitous Duplicate
The Duplicitous DuplicateThe Duplicitous Duplicate
The Duplicitous DuplicateAnish Raivadera
 
Duplicate detection
Duplicate detectionDuplicate detection
Duplicate detectionjonecx
 
Tutorial 4 (duplicate detection)
Tutorial 4 (duplicate detection)Tutorial 4 (duplicate detection)
Tutorial 4 (duplicate detection)Kira
 
Record matching over query results from Web Databases
Record matching over query results from Web DatabasesRecord matching over query results from Web Databases
Record matching over query results from Web Databasestusharjadhav2611
 
novel and efficient approch for detection of duplicate pages in web crawling
novel and efficient approch for detection of duplicate pages in web crawlingnovel and efficient approch for detection of duplicate pages in web crawling
novel and efficient approch for detection of duplicate pages in web crawlingVipin Kp
 
Linking data without common identifiers
Linking data without common identifiersLinking data without common identifiers
Linking data without common identifiersLars Marius Garshol
 
Indexing Techniques for Scalable Record Linkage and Deduplication
Indexing Techniques for Scalable Record Linkage and DeduplicationIndexing Techniques for Scalable Record Linkage and Deduplication
Indexing Techniques for Scalable Record Linkage and DeduplicationPradeeban Kathiravelu, Ph.D.
 
Efficient Duplicate Detection Over Massive Data Sets
Efficient Duplicate Detection Over Massive Data SetsEfficient Duplicate Detection Over Massive Data Sets
Efficient Duplicate Detection Over Massive Data SetsPradeeban Kathiravelu, Ph.D.
 

Andere mochten auch (12)

Progressive duplicate detection
Progressive duplicate detectionProgressive duplicate detection
Progressive duplicate detection
 
A study and survey on various progressive duplicate detection mechanisms
A study and survey on various progressive duplicate detection mechanismsA study and survey on various progressive duplicate detection mechanisms
A study and survey on various progressive duplicate detection mechanisms
 
The Duplicitous Duplicate
The Duplicitous DuplicateThe Duplicitous Duplicate
The Duplicitous Duplicate
 
Duplicate detection
Duplicate detectionDuplicate detection
Duplicate detection
 
Tutorial 4 (duplicate detection)
Tutorial 4 (duplicate detection)Tutorial 4 (duplicate detection)
Tutorial 4 (duplicate detection)
 
Progressive Texture
Progressive TextureProgressive Texture
Progressive Texture
 
Record matching over query results from Web Databases
Record matching over query results from Web DatabasesRecord matching over query results from Web Databases
Record matching over query results from Web Databases
 
novel and efficient approch for detection of duplicate pages in web crawling
novel and efficient approch for detection of duplicate pages in web crawlingnovel and efficient approch for detection of duplicate pages in web crawling
novel and efficient approch for detection of duplicate pages in web crawling
 
Linking data without common identifiers
Linking data without common identifiersLinking data without common identifiers
Linking data without common identifiers
 
Indexing Techniques for Scalable Record Linkage and Deduplication
Indexing Techniques for Scalable Record Linkage and DeduplicationIndexing Techniques for Scalable Record Linkage and Deduplication
Indexing Techniques for Scalable Record Linkage and Deduplication
 
Deduplication
DeduplicationDeduplication
Deduplication
 
Efficient Duplicate Detection Over Massive Data Sets
Efficient Duplicate Detection Over Massive Data SetsEfficient Duplicate Detection Over Massive Data Sets
Efficient Duplicate Detection Over Massive Data Sets
 

Ähnlich wie An adaptive algorithm for detection of duplicate records

A peek on numerical programming in perl and python e christopher dyken 2005
A peek on numerical programming in perl and python  e christopher dyken  2005A peek on numerical programming in perl and python  e christopher dyken  2005
A peek on numerical programming in perl and python e christopher dyken 2005Jules Krdenas
 
Intellectual technologies
Intellectual technologiesIntellectual technologies
Intellectual technologiesPolad Saruxanov
 
Introduction to data structures and complexity.pptx
Introduction to data structures and complexity.pptxIntroduction to data structures and complexity.pptx
Introduction to data structures and complexity.pptxPJS KUMAR
 
Hashing and File Structures in Data Structure.pdf
Hashing and File Structures in Data Structure.pdfHashing and File Structures in Data Structure.pdf
Hashing and File Structures in Data Structure.pdfJaithoonBibi
 
PyTorch for Deep Learning Practitioners
PyTorch for Deep Learning PractitionersPyTorch for Deep Learning Practitioners
PyTorch for Deep Learning PractitionersBayu Aldi Yansyah
 
Analytical Study and Newer Approach towards Frequent Pattern Mining using Boo...
Analytical Study and Newer Approach towards Frequent Pattern Mining using Boo...Analytical Study and Newer Approach towards Frequent Pattern Mining using Boo...
Analytical Study and Newer Approach towards Frequent Pattern Mining using Boo...iosrjce
 
9 big o-notation
9 big o-notation9 big o-notation
9 big o-notationirdginfo
 
An Overview of Hadoop
An Overview of HadoopAn Overview of Hadoop
An Overview of HadoopAsif Ali
 
Improvement in Traditional Set Partitioning in Hierarchical Trees (SPIHT) Alg...
Improvement in Traditional Set Partitioning in Hierarchical Trees (SPIHT) Alg...Improvement in Traditional Set Partitioning in Hierarchical Trees (SPIHT) Alg...
Improvement in Traditional Set Partitioning in Hierarchical Trees (SPIHT) Alg...AM Publications
 
Rapport_Cemracs2012
Rapport_Cemracs2012Rapport_Cemracs2012
Rapport_Cemracs2012Jussara F.M.
 
QUEUE in data-structure (classification, working procedure, Applications)
QUEUE in data-structure (classification, working procedure, Applications)QUEUE in data-structure (classification, working procedure, Applications)
QUEUE in data-structure (classification, working procedure, Applications)Mehedi Hasan
 
Hadoop Map-Reduce To Generate Frequent Item Set on Large Datasets Using Impro...
Hadoop Map-Reduce To Generate Frequent Item Set on Large Datasets Using Impro...Hadoop Map-Reduce To Generate Frequent Item Set on Large Datasets Using Impro...
Hadoop Map-Reduce To Generate Frequent Item Set on Large Datasets Using Impro...BRNSSPublicationHubI
 
data structures using C 2 sem BCA univeristy of mysore
data structures using C 2 sem BCA univeristy of mysoredata structures using C 2 sem BCA univeristy of mysore
data structures using C 2 sem BCA univeristy of mysoreambikavenkatesh2
 
Data Structures- Part2 analysis tools
Data Structures- Part2 analysis toolsData Structures- Part2 analysis tools
Data Structures- Part2 analysis toolsAbdullah Al-hazmy
 
FPGA Implementation of A New Chien Search Block for Reed-Solomon Codes RS (25...
FPGA Implementation of A New Chien Search Block for Reed-Solomon Codes RS (25...FPGA Implementation of A New Chien Search Block for Reed-Solomon Codes RS (25...
FPGA Implementation of A New Chien Search Block for Reed-Solomon Codes RS (25...IJERA Editor
 

Ähnlich wie An adaptive algorithm for detection of duplicate records (20)

A peek on numerical programming in perl and python e christopher dyken 2005
A peek on numerical programming in perl and python  e christopher dyken  2005A peek on numerical programming in perl and python  e christopher dyken  2005
A peek on numerical programming in perl and python e christopher dyken 2005
 
Intellectual technologies
Intellectual technologiesIntellectual technologies
Intellectual technologies
 
Introduction to data structures and complexity.pptx
Introduction to data structures and complexity.pptxIntroduction to data structures and complexity.pptx
Introduction to data structures and complexity.pptx
 
final
finalfinal
final
 
Hashing and File Structures in Data Structure.pdf
Hashing and File Structures in Data Structure.pdfHashing and File Structures in Data Structure.pdf
Hashing and File Structures in Data Structure.pdf
 
PyTorch for Deep Learning Practitioners
PyTorch for Deep Learning PractitionersPyTorch for Deep Learning Practitioners
PyTorch for Deep Learning Practitioners
 
R01732105109
R01732105109R01732105109
R01732105109
 
Analytical Study and Newer Approach towards Frequent Pattern Mining using Boo...
Analytical Study and Newer Approach towards Frequent Pattern Mining using Boo...Analytical Study and Newer Approach towards Frequent Pattern Mining using Boo...
Analytical Study and Newer Approach towards Frequent Pattern Mining using Boo...
 
9 big o-notation
9 big o-notation9 big o-notation
9 big o-notation
 
An Overview of Hadoop
An Overview of HadoopAn Overview of Hadoop
An Overview of Hadoop
 
Improvement in Traditional Set Partitioning in Hierarchical Trees (SPIHT) Alg...
Improvement in Traditional Set Partitioning in Hierarchical Trees (SPIHT) Alg...Improvement in Traditional Set Partitioning in Hierarchical Trees (SPIHT) Alg...
Improvement in Traditional Set Partitioning in Hierarchical Trees (SPIHT) Alg...
 
Rapport_Cemracs2012
Rapport_Cemracs2012Rapport_Cemracs2012
Rapport_Cemracs2012
 
QUEUE in data-structure (classification, working procedure, Applications)
QUEUE in data-structure (classification, working procedure, Applications)QUEUE in data-structure (classification, working procedure, Applications)
QUEUE in data-structure (classification, working procedure, Applications)
 
Hadoop Map-Reduce To Generate Frequent Item Set on Large Datasets Using Impro...
Hadoop Map-Reduce To Generate Frequent Item Set on Large Datasets Using Impro...Hadoop Map-Reduce To Generate Frequent Item Set on Large Datasets Using Impro...
Hadoop Map-Reduce To Generate Frequent Item Set on Large Datasets Using Impro...
 
data structures using C 2 sem BCA univeristy of mysore
data structures using C 2 sem BCA univeristy of mysoredata structures using C 2 sem BCA univeristy of mysore
data structures using C 2 sem BCA univeristy of mysore
 
Data Structures- Part2 analysis tools
Data Structures- Part2 analysis toolsData Structures- Part2 analysis tools
Data Structures- Part2 analysis tools
 
Analysis.ppt
Analysis.pptAnalysis.ppt
Analysis.ppt
 
FPGA Implementation of A New Chien Search Block for Reed-Solomon Codes RS (25...
FPGA Implementation of A New Chien Search Block for Reed-Solomon Codes RS (25...FPGA Implementation of A New Chien Search Block for Reed-Solomon Codes RS (25...
FPGA Implementation of A New Chien Search Block for Reed-Solomon Codes RS (25...
 
Chapter two
Chapter twoChapter two
Chapter two
 
Cs2251 daa
Cs2251 daaCs2251 daa
Cs2251 daa
 

Mehr von Likan Patra

Sewn Product Machinary & Equipments
Sewn Product Machinary & EquipmentsSewn Product Machinary & Equipments
Sewn Product Machinary & EquipmentsLikan Patra
 
SMArt Contest- Smart Quiz Questions
SMArt Contest- Smart Quiz QuestionsSMArt Contest- Smart Quiz Questions
SMArt Contest- Smart Quiz QuestionsLikan Patra
 
RC Shri Jagannath Dham- Club Activity Report 2014-15
RC Shri Jagannath Dham- Club Activity Report 2014-15RC Shri Jagannath Dham- Club Activity Report 2014-15
RC Shri Jagannath Dham- Club Activity Report 2014-15Likan Patra
 
Quiz about Google and its Products
Quiz about Google and its ProductsQuiz about Google and its Products
Quiz about Google and its ProductsLikan Patra
 
e-ENERGY METERING BOX (Smart Meter by KPMP Electronics)
e-ENERGY METERING BOX (Smart Meter by KPMP Electronics)e-ENERGY METERING BOX (Smart Meter by KPMP Electronics)
e-ENERGY METERING BOX (Smart Meter by KPMP Electronics)Likan Patra
 
Everything you want to know about Liquid Lenses
Everything you want to know about Liquid LensesEverything you want to know about Liquid Lenses
Everything you want to know about Liquid LensesLikan Patra
 
Seminar on Cyber Crime
Seminar on Cyber CrimeSeminar on Cyber Crime
Seminar on Cyber CrimeLikan Patra
 
What is Optical fiber ?
What is Optical fiber ?What is Optical fiber ?
What is Optical fiber ?Likan Patra
 
Tech 101: Understanding Firewalls
Tech 101: Understanding FirewallsTech 101: Understanding Firewalls
Tech 101: Understanding FirewallsLikan Patra
 
Holographic Data Storage
Holographic Data StorageHolographic Data Storage
Holographic Data StorageLikan Patra
 
A Technical Seminar on OSI model
A Technical Seminar on OSI modelA Technical Seminar on OSI model
A Technical Seminar on OSI modelLikan Patra
 
Who are the INTERNET SERVICE PROVIDERS?
Who are the INTERNET SERVICE PROVIDERS?Who are the INTERNET SERVICE PROVIDERS?
Who are the INTERNET SERVICE PROVIDERS?Likan Patra
 
Computer Tomography (CT Scan)
Computer Tomography (CT Scan)Computer Tomography (CT Scan)
Computer Tomography (CT Scan)Likan Patra
 
Akshaya patra foundation - In Depth
Akshaya patra foundation - In DepthAkshaya patra foundation - In Depth
Akshaya patra foundation - In DepthLikan Patra
 
So, He got a JOB through LinkedIn
So, He got a JOB through LinkedInSo, He got a JOB through LinkedIn
So, He got a JOB through LinkedInLikan Patra
 
Qr code (quick response code)
Qr code (quick response code)Qr code (quick response code)
Qr code (quick response code)Likan Patra
 
Blue ray disc seminar representation
Blue ray disc seminar representationBlue ray disc seminar representation
Blue ray disc seminar representationLikan Patra
 
Brain finger printing
Brain finger printingBrain finger printing
Brain finger printingLikan Patra
 
Audio watermarking
Audio watermarkingAudio watermarking
Audio watermarkingLikan Patra
 

Mehr von Likan Patra (20)

Sewn Product Machinary & Equipments
Sewn Product Machinary & EquipmentsSewn Product Machinary & Equipments
Sewn Product Machinary & Equipments
 
SMArt Contest- Smart Quiz Questions
SMArt Contest- Smart Quiz QuestionsSMArt Contest- Smart Quiz Questions
SMArt Contest- Smart Quiz Questions
 
RC Shri Jagannath Dham- Club Activity Report 2014-15
RC Shri Jagannath Dham- Club Activity Report 2014-15RC Shri Jagannath Dham- Club Activity Report 2014-15
RC Shri Jagannath Dham- Club Activity Report 2014-15
 
Quiz about Google and its Products
Quiz about Google and its ProductsQuiz about Google and its Products
Quiz about Google and its Products
 
e-ENERGY METERING BOX (Smart Meter by KPMP Electronics)
e-ENERGY METERING BOX (Smart Meter by KPMP Electronics)e-ENERGY METERING BOX (Smart Meter by KPMP Electronics)
e-ENERGY METERING BOX (Smart Meter by KPMP Electronics)
 
Everything you want to know about Liquid Lenses
Everything you want to know about Liquid LensesEverything you want to know about Liquid Lenses
Everything you want to know about Liquid Lenses
 
Seminar on Cyber Crime
Seminar on Cyber CrimeSeminar on Cyber Crime
Seminar on Cyber Crime
 
What is Optical fiber ?
What is Optical fiber ?What is Optical fiber ?
What is Optical fiber ?
 
Tech 101: Understanding Firewalls
Tech 101: Understanding FirewallsTech 101: Understanding Firewalls
Tech 101: Understanding Firewalls
 
Holographic Data Storage
Holographic Data StorageHolographic Data Storage
Holographic Data Storage
 
A Technical Seminar on OSI model
A Technical Seminar on OSI modelA Technical Seminar on OSI model
A Technical Seminar on OSI model
 
Who are the INTERNET SERVICE PROVIDERS?
Who are the INTERNET SERVICE PROVIDERS?Who are the INTERNET SERVICE PROVIDERS?
Who are the INTERNET SERVICE PROVIDERS?
 
Computer Tomography (CT Scan)
Computer Tomography (CT Scan)Computer Tomography (CT Scan)
Computer Tomography (CT Scan)
 
Akshaya patra foundation - In Depth
Akshaya patra foundation - In DepthAkshaya patra foundation - In Depth
Akshaya patra foundation - In Depth
 
So, He got a JOB through LinkedIn
So, He got a JOB through LinkedInSo, He got a JOB through LinkedIn
So, He got a JOB through LinkedIn
 
4g technology
4g technology4g technology
4g technology
 
Qr code (quick response code)
Qr code (quick response code)Qr code (quick response code)
Qr code (quick response code)
 
Blue ray disc seminar representation
Blue ray disc seminar representationBlue ray disc seminar representation
Blue ray disc seminar representation
 
Brain finger printing
Brain finger printingBrain finger printing
Brain finger printing
 
Audio watermarking
Audio watermarkingAudio watermarking
Audio watermarking
 

Kürzlich hochgeladen

presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 

Kürzlich hochgeladen (20)

presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 

An adaptive algorithm for detection of duplicate records

  • 1. An Adaptive Algorithm for Detection of Duplicate Records Presented By: Rama kanta Behera IT200127207 Under the guidance of : Miss Ipsita Mishra
  • 2.
  • 3.
  • 4.
  • 5. OUTLINE OF THE PROPOSED SOLUTION The central idea behind the present algorithm is based on the fundamental property of primality of numbers I f(x) Record set Integer number space Fig: hashing I P Record set Integer number Prime number f(x) g(x) Fig: Extended hashing into prime space
  • 6. r1 r2 … rn I1 I2 … In P1 P2 … Pn PRODUCT( P prior) f(x) g(x) P1*p2 …*pn= P prior Fig: The complete algorithm
  • 7.
  • 8. STEPS OF THE ALGORITHM Step 1 : For each new record, hash is performed and unique hash value (Hnew) for each distinct record is obtained. Step 2 : Hnew is mapped to its corresponding unique prime (Pnew). Step 3 : Pprior is divided with Pnew. If Pnew exactly divides Pprior, then the corresponding record to Pnew is a duplicate and already exists in Pprior. Else, Pnew is a distinct record. Step 4 : If Pnew is a distinct record, Pprior is multiplied with Pnew and the result is stored back in Pprior. Thus updating Pprior renders the algorithm adaptive.
  • 10.
  • 11.