Propagation of data fusion

•

1 gefällt mir•326 views

Propagation of data fusion +91-9994232214,8144199666, ieeeprojectchennai@gmail.com, www.projectsieee.com, www.ieee-projects-chennai.com IEEE PROJECTS 2015-2016 ----------------------------------- Contact:+91-9994232214,+91-8144199666 Email:ieeeprojectchennai@gmail.com Support: ------------- Projects Code Documentation PPT Projects Video File Projects Explanation Teamviewer Support

Propagation of Data Fusion
Abstract:
In a relational database, tuples are called ‘duplicate’ if they describe the
same real-world entity. If such duplicate tuples are observed, it is
recommended to remove them and to replace them with one tuple that
represents the joint information of the duplicate tuples to a maximal extent.
This remove-and-replace operation is called a fusion operation. Within the
setting of a relational database management system, the removal of the
original duplicate tuples can breach referential integrity. In this paper, a
strategy is proposed to maintain referential integrity in a semantically
correct manner, thereby optimizing the quality of relationships in the
database. An algorithm is proposed that is able to propagate a fusion
operation through the entire database. The algorithm is based on a
framework of first and second order fusion functions on the one hand, and
conflict resolution strategies on the other hand. It is shown how classical
strategies for maintaining referential integrity, such as DELETE cascading,
are highly specialized cases of the proposed framework. Experimental
results are reported that (i) show the efficiency of the proposed algorithm
and (ii) show the differences in quality between several second order
fusion functions. It is shown that some strategies easily outperform
DELETE cascading.

Existing System:
Informally, the problem of duplicate data (also known as record linkage or
entity resolution) adheres to the fact that there can exist multiple
descriptions of the same real world entity within one (or more) database(s).
The presence of data imperfections such as typographical errors, lack of
standardization and missing data, constitutes the fact that identifying these
multiple descriptions is not a trivial problem.
In the context of relational databases, dealing with duplicates comes down
to
(a) Identifying which tuples are duplicate.
(b) Replacing those tuples by a single tuple.
Needless to say, in the context of a relational database, the deletion of the
original - duplicate - tuples should take into account integrity constraints,
in particular referential integrity, which is assumed to be satisfied in the
original database.
Proposed System:
This paper proposes an algorithm for maintaining referential integrity in a
relational database after the fusion of duplicate tuples. The algorithm relies

on a framework of fusion functions for multi-valued data (i.e., sets), in
order to cope with one-to-many relationships.
A comparative study shows the difference in accuracy between several
multi-valued fusers. Experimental results show the complexity in terms of
the number of clusters, the number of linked tuples and the recursive
depth.
Hardware Requirements:
• System : Pentium IV 2.4 GHz.
• Hard Disk : 40 GB.
• Floppy Drive : 1.44 Mb.
• Monitor : 15 VGA Colour.
• Mouse : Logitech.
• RAM : 256 Mb.
Software Requirements:
• Operating system : - Windows XP.
• Front End : - JSP
• Back End : - SQL Server
Software Requirements:
• Operating system : - Windows XP.

• Front End : - .Net
• Back End : - SQL Server

Weitere ähnliche Inhalte

Was ist angesagt?

K04302082087ijceronline

Gutell 107.ssdbm.2009.200Robin Gutell

Duplicate Detection in Hierarchical Data Using XPathiosrjce

Sequence Alignment In BioinformaticsNikesh Narayanan

A Survey on Bioinformatics Toolsidescitation

A survey of xml tree patternsJPINFOTECH JAYAPRAKASH

UPGMAShreya Feliz

1.2M .pdfbutest

Phylogenetics1Sébastien De Landtsheer

Dwh lecture-07-denormalizationSulman Ahmed

Bioinformaticsseyed mohammad motevalli

Bioinformatics t5-database searching-v2013_wim_vancriekingeProf. Wim Van Criekinge

Clustal W - Multiple Sequence alignment The Oxford College Engineering

Sequence homology search and multiple sequence alignment(1)AnkitTiwari354

Structure alignment methodsSamvartika Majumdar

Which is the best fingerprint for medicinal chemistry?NextMove Software

BLASTrishabhaks

Molecular phylogeneticsAjay Kumar Chandra

UpgmaMUSKANKr

blast and fastaNagendrasahu6

Was ist angesagt? (20)

K04302082087

Gutell 107.ssdbm.2009.200

Duplicate Detection in Hierarchical Data Using XPath

Sequence Alignment In Bioinformatics

A Survey on Bioinformatics Tools

A survey of xml tree patterns

UPGMA

1.2M .pdf

Phylogenetics1

Dwh lecture-07-denormalization

Bioinformatics

Bioinformatics t5-database searching-v2013_wim_vancriekinge

Clustal W - Multiple Sequence alignment

Sequence homology search and multiple sequence alignment(1)

Structure alignment methods

Which is the best fingerprint for medicinal chemistry?

BLAST

Molecular phylogenetics

Upgma

blast and fasta

Ähnlich wie Propagation of data fusion

Confiscation of Duplicate Tuples in The Relational DatabasesIJERA Editor

TRANSFORMATION RULES FOR BUILDING OWL ONTOLOGIES FROM RELATIONAL DATABASEScscpconf

Semantic Management of Deduplicate tuples in the Relational DatabasesIRJET Journal

Efficient Record De-Duplication Identifying Using Febrl FrameworkIOSR Journals

Bi4101343346IJERA Editor

TRANSFORMATION RULES FOR BUILDING OWL ONTOLOGIES FROM RELATIONAL DATABASEScsandit

ApresentDiogo Guimaraes

Asif nosqlAsif Ali

Spe165 tRajesh War

Semantic Conflicts and Solutions in Integration of Fuzzy Relational Databasesijsrd.com

IRJET-Efficient Data Linkage Technique using one Class Clustering Tree for Da...IRJET Journal

Textual Data Partitioning with Relationship and Discriminative AnalysisEditor IJMTER

J017616976IOSR Journals

Automatically converting tabular data toIJwest

Elimination of data redundancy before persisting into dbms using svm classifi...nalini manogaran

Improved Text Mining for Bulk Data Using Deep Learning Approach IJCSIS Research Publications

Data models and roDiana Diana

Iaetsd a survey on one class clusteringIaetsd Iaetsd

(2016)application of parallel glowworm swarm optimization algorithm for data ...Akram Pasha

A03202001005theijes

Ähnlich wie Propagation of data fusion (20)

Confiscation of Duplicate Tuples in The Relational Databases

TRANSFORMATION RULES FOR BUILDING OWL ONTOLOGIES FROM RELATIONAL DATABASES

Semantic Management of Deduplicate tuples in the Relational Databases

Efficient Record De-Duplication Identifying Using Febrl Framework

Bi4101343346

TRANSFORMATION RULES FOR BUILDING OWL ONTOLOGIES FROM RELATIONAL DATABASES

Apresent

Asif nosql

Spe165 t

Semantic Conflicts and Solutions in Integration of Fuzzy Relational Databases

IRJET-Efficient Data Linkage Technique using one Class Clustering Tree for Da...

Textual Data Partitioning with Relationship and Discriminative Analysis

J017616976

Automatically converting tabular data to

Elimination of data redundancy before persisting into dbms using svm classifi...

Improved Text Mining for Bulk Data Using Deep Learning Approach

Data models and ro

Iaetsd a survey on one class clustering

(2016)application of parallel glowworm swarm optimization algorithm for data ...

A03202001005

Mehr von ieeepondy

Demand aware network function placementieeepondy

Service description in the nfv revolution trends, challenges and a way forwardieeepondy

Secure optimization computation outsourcing in cloud computing a case study o...ieeepondy

Spatial related traffic sign inspection for inventory purposes using mobile l...ieeepondy

Standards for hybrid cloudsieeepondy

Rfhoc a random forest approach to auto-tuning hadoop's configurationieeepondy

Resource and instance hour minimization for deadline constrained dag applicat...ieeepondy

Reliable and confidential cloud storage with efficient data forwarding functi...ieeepondy

Rebuttal to “comments on ‘control cloud data access privilege and anonymity w...ieeepondy

Scalable cloud–sensor architecture for the internet of thingsieeepondy

Scalable algorithms for nearest neighbor joins on big trajectory dataieeepondy

Robust workload and energy management for sustainable data centersieeepondy

Privacy preserving deep computation model on cloud for big data feature learningieeepondy

Pricing the cloud ieee projects, ieee projects chennai, ieee projects 2016,ie...ieeepondy

Protection of big data privacyieeepondy

Power optimization with bler constraint for wireless fronthauls in c ranieeepondy

Performance aware cloud resource allocation via fitness-enabled auctionieeepondy

Performance limitations of a text search application running in cloud instancesieeepondy

Performance analysis and optimal cooperative cluster size for randomly distri...ieeepondy

Predictive control for energy aware consolidation in cloud datacentersieeepondy

Mehr von ieeepondy (20)

Demand aware network function placement

Service description in the nfv revolution trends, challenges and a way forward

Secure optimization computation outsourcing in cloud computing a case study o...

Spatial related traffic sign inspection for inventory purposes using mobile l...

Standards for hybrid clouds

Rfhoc a random forest approach to auto-tuning hadoop's configuration

Resource and instance hour minimization for deadline constrained dag applicat...

Reliable and confidential cloud storage with efficient data forwarding functi...

Rebuttal to “comments on ‘control cloud data access privilege and anonymity w...

Scalable cloud–sensor architecture for the internet of things

Scalable algorithms for nearest neighbor joins on big trajectory data

Robust workload and energy management for sustainable data centers

Privacy preserving deep computation model on cloud for big data feature learning

Pricing the cloud ieee projects, ieee projects chennai, ieee projects 2016,ie...

Protection of big data privacy

Power optimization with bler constraint for wireless fronthauls in c ran

Performance aware cloud resource allocation via fitness-enabled auction

Performance limitations of a text search application running in cloud instances

Performance analysis and optimal cooperative cluster size for randomly distri...

Predictive control for energy aware consolidation in cloud datacenters

Propagation of data fusion

1. Propagation of Data Fusion Abstract: In a relational database, tuples are called ‘duplicate’ if they describe the same real-world entity. If such duplicate tuples are observed, it is recommended to remove them and to replace them with one tuple that represents the joint information of the duplicate tuples to a maximal extent. This remove-and-replace operation is called a fusion operation. Within the setting of a relational database management system, the removal of the original duplicate tuples can breach referential integrity. In this paper, a strategy is proposed to maintain referential integrity in a semantically correct manner, thereby optimizing the quality of relationships in the database. An algorithm is proposed that is able to propagate a fusion operation through the entire database. The algorithm is based on a framework of first and second order fusion functions on the one hand, and conflict resolution strategies on the other hand. It is shown how classical strategies for maintaining referential integrity, such as DELETE cascading, are highly specialized cases of the proposed framework. Experimental results are reported that (i) show the efficiency of the proposed algorithm and (ii) show the differences in quality between several second order fusion functions. It is shown that some strategies easily outperform DELETE cascading.

2. Existing System: Informally, the problem of duplicate data (also known as record linkage or entity resolution) adheres to the fact that there can exist multiple descriptions of the same real world entity within one (or more) database(s). The presence of data imperfections such as typographical errors, lack of standardization and missing data, constitutes the fact that identifying these multiple descriptions is not a trivial problem. In the context of relational databases, dealing with duplicates comes down to (a) Identifying which tuples are duplicate. (b) Replacing those tuples by a single tuple. Needless to say, in the context of a relational database, the deletion of the original - duplicate - tuples should take into account integrity constraints, in particular referential integrity, which is assumed to be satisfied in the original database. Proposed System: This paper proposes an algorithm for maintaining referential integrity in a relational database after the fusion of duplicate tuples. The algorithm relies

3. on a framework of fusion functions for multi-valued data (i.e., sets), in order to cope with one-to-many relationships. A comparative study shows the difference in accuracy between several multi-valued fusers. Experimental results show the complexity in terms of the number of clusters, the number of linked tuples and the recursive depth. Hardware Requirements: • System : Pentium IV 2.4 GHz. • Hard Disk : 40 GB. • Floppy Drive : 1.44 Mb. • Monitor : 15 VGA Colour. • Mouse : Logitech. • RAM : 256 Mb. Software Requirements: • Operating system : - Windows XP. • Front End : - JSP • Back End : - SQL Server Software Requirements: • Operating system : - Windows XP.

4. • Front End : - .Net • Back End : - SQL Server

Propagation of data fusion

Empfohlen

Empfohlen

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

Ähnlich wie Propagation of data fusion

Ähnlich wie Propagation of data fusion (20)

Mehr von ieeepondy

Mehr von ieeepondy (20)

Propagation of data fusion