International Association of Scientific Innovation and Research (IASIR)
(An Association Unifying the Sciences, Engineering, and Applied Research)
International Journal of Emerging Technologies in Computational
and Applied Sciences (IJETCAS)
www.iasir.net
IJETCAS 14-314; © 2014, IJETCAS All Rights Reserved Page 43
ISSN (Print): 2279-0047
ISSN (Online): 2279-0055
A Review on Graph-based Image Classification
Mrs. Snehal N. Amrutkar¹, Prof. J. V. Shinde²
¹,²Department of Computer Engineering
University of Pune
Late G. N. Sapkal College of Engineering, Anjaneri, Nashik
INDIA
Abstract: Graphs are increasingly important for modelling complicated structures such as circuits, images, chemical compounds, protein structures, biological networks, social networks and the web. Graph-based data representations are an important research topic because this kind of data structure is well suited to modelling entities and the complex relations among them. We propose combining a powerful graph-based image representation with frequent approximate sub-graph (FAS) mining algorithms in order to classify images. Among the many approaches to image classification, the graph-based approach is gaining popularity primarily because of its ability to reflect global image properties. This paper critically reviews important existing frequent subgraph mining algorithms. The review reveals not only the advantages of each method and category but also their limitations.
Keywords: Frequent approximate sub-graphs, Graph-based image representation, Image classification
I. Introduction
The need to convert large volumes of data into useful information has increased. In many datasets, the objects may be represented as graphs. In Fig. 1, an example of graphs for representing images is shown. As a result of this need, several authors have developed techniques and methods to process these datasets [64,65], e.g. frequent pattern discovery [66,67,68].
In many research fields, graphs have been widely used to model data because of their expressiveness and their suitability for applications in which entities and their relationships must be encoded within a data structure.
The discovery of frequent patterns, mainly the detection of frequent sub-graphs in graph collections, is an important problem in graph mining [5,69,70]. In frequent sub-graph mining there are two approaches to evaluating the similarity of graphs, known as graph matching: exact matching and approximate matching.
Exact matching determines whether the labels and the structure of two graphs are identical. It has been used successfully in many applications [71,72,73]; however, there are problems for which exact matching does not give good results. Sometimes the interesting sub-graphs show slight differences throughout the data. Such differences appear, for example, in protein analysis. Other examples arise in image processing, where the differences may be due to noise and distortion, or may simply reflect slight spatial differences between instances of the same object. This means that a certain level of geometric distortion, slight semantic variation, or vertex and/or edge mismatch should be tolerated in the frequent sub-graph pattern search. For this reason it has become necessary to evaluate the similarity between graphs while allowing some structural differences, i.e. to use approximate matching techniques. Approximate matching consists of finding a correspondence between the vertices and/or edges of two graphs in order to determine their similarity, allowing structural and/or label differences. On this basis, several authors have argued for frequent sub-graph mining on graph collections using approximate graph matching [74,75,76,77]. These authors defend the idea that frequent and more interesting sub-graphs for applications and users can be found in this way.
Frequent subgraph mining is the process of extracting from a graph dataset all subgraphs whose occurrence count is greater than or equal to a specified threshold. Figure 1 gives an example of frequent subgraph mining, where Figures 1(a) and 1(b) show two graphs and Figure 1(c) shows a frequent subgraph that is present in both.
Figure 1(a) Theobromine, 1(b) Caffeine and 1(c) Frequent subgraph
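The definition above can be illustrated with a minimal Python sketch of support counting under exact matching. It is not taken from any of the surveyed algorithms: the graph encoding (a label dictionary plus a set of undirected edges), the helper names `make_graph`, `contains` and `support`, and the toy database are all assumptions made for illustration, and the brute-force search is only practical for very small graphs without self-loops.

```python
from itertools import permutations


def make_graph(vertex_labels, edges):
    """Build a graph as (vertex -> label dict, set of undirected edges)."""
    return dict(vertex_labels), {frozenset(e) for e in edges}


def contains(graph, pattern):
    """True if `pattern` occurs in `graph` under exact label matching.

    Brute force: try every injective mapping of pattern vertices to graph
    vertices. Exponential, so only suitable for tiny illustrative graphs.
    """
    g_labels, g_edges = graph
    p_labels, p_edges = pattern
    p_nodes = list(p_labels)
    for image in permutations(g_labels, len(p_nodes)):
        mapping = dict(zip(p_nodes, image))
        labels_ok = all(p_labels[v] == g_labels[mapping[v]] for v in p_nodes)
        edges_ok = all(
            frozenset(mapping[v] for v in edge) in g_edges for edge in p_edges
        )
        if labels_ok and edges_ok:
            return True
    return False


def support(database, pattern):
    """Number of database graphs that contain the pattern at least once."""
    return sum(contains(graph, pattern) for graph in database)


# Hypothetical toy database of three small labeled graphs.
database = [
    make_graph({1: "C", 2: "N", 3: "C"}, [(1, 2), (2, 3)]),
    make_graph({1: "C", 2: "N", 3: "O"}, [(1, 2), (2, 3)]),
    make_graph({1: "C", 2: "O"}, [(1, 2)]),
]
pattern = make_graph({"a": "C", "b": "N"}, [("a", "b")])

THRESHOLD = 2  # assumed minimum support
pattern_support = support(database, pattern)
is_frequent = pattern_support >= THRESHOLD
```

Here the single-edge pattern C-N occurs in the first two graphs but not in the third, so it is frequent for a threshold of 2.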
Snehal N. Amrutkar et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(1), March-May, 2014, pp. 43-51
These graph mining methods can be grouped as follows:
A. Apriori-Based Approach: Apriori-based frequent substructure mining algorithms share similar characteristics with Apriori-based frequent itemset mining algorithms. The search for frequent graphs starts with graphs of small “size” and proceeds in a bottom-up manner by generating candidates that have an extra vertex, edge or path. The definition of graph size depends on the algorithm used.
B. Pattern-Growth Approach: The Apriori-based approach has to use the breadth-first search (BFS) strategy because of its level-wise candidate generation. In contrast, the pattern-growth approach extends a frequent pattern directly and is therefore more flexible in its search strategy; it can use breadth-first as well as depth-first search.
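The level-wise (breadth-first) candidate generation of the Apriori-based approach can be sketched as follows. This is a deliberately simplified illustration, not one of the surveyed algorithms: each graph is reduced to a set of labeled edge items, so containment is plain set inclusion rather than sub-graph isomorphism, and connectivity is ignored. The function name `apriori_edge_sets` and the toy database are assumptions.

```python
from itertools import combinations


def apriori_edge_sets(database, min_support):
    """Level-wise mining of frequent edge sets.

    `database` is a list of sets of hashable edge items. Returns a dict that
    maps each frequent edge set (frozenset) to its support.
    """

    def support(candidate):
        return sum(candidate <= graph for graph in database)

    items = {edge for graph in database for edge in graph}
    level = {
        frozenset([edge])
        for edge in items
        if support(frozenset([edge])) >= min_support
    }
    frequent = {}
    size = 1
    while level:
        for candidate in level:
            frequent[candidate] = support(candidate)
        size += 1
        # Join step: merge two frequent sets that differ in one item.
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        # Prune step: every (size - 1)-subset must already be frequent.
        level = {
            c
            for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, size - 1))
            and support(c) >= min_support
        }
    return frequent


# Hypothetical database: each graph is given only by its labeled edges.
CN, NO, CC = ("C", "N"), ("N", "O"), ("C", "C")
edge_database = [{CN, NO, CC}, {CN, NO}, {CN, CC}]
frequent_sets = apriori_edge_sets(edge_database, min_support=2)
```

In this toy run the candidate containing all three edges is never counted, because its subset {N-O, C-C} is infrequent; this is the pruning that makes the level-wise search workable.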
II. Survey of algorithms
Many researchers have developed graph mining algorithms. Some algorithms give approximate results while others provide exact results. Some of them are reviewed in this section.
Xiao, Wang and Wu in 2008 presented the CSMiner [1] algorithm, which is based on node/edge disjoint sub-isomorphism. To align multiple PPI networks it must allow node skipping and mismatching, but this increases the computational cost of graph matching.
Figure 2(a): PPI Network Figure 2(b): Inexact Graph Matching for CSMiner[1] Algorithm
Holder, Cook and Bunke in 1992 presented the SUBDUE [2] algorithm, which is based on graph edit distance. Using a fuzzy graph match, SUBDUE can identify slightly different instances of a substructure, which are then replaced by a pointer.
Figure 3(a): Cortisone Figure 3(b): Substructures found in Cortisone compound for SUBDUE algorithm[2]
Blockeel and De Raedt [3] in 1998 introduced a first-order framework for top-down induction of logical decision trees. Top-down induction of decision trees is one of the best known and most successful machine learning techniques. It has been used to solve numerous practical problems. It employs a divide-and-conquer strategy, and in this it differs from its rule-based competitors, which are based on covering strategies. Chakrabarti, Dom and Indyk [4] in 1998 developed a new method for automatically classifying hypertext into a given topic hierarchy using an iterative relaxation algorithm. After bootstrapping off a text-based classifier, they used both the local text in a document and the distribution of the estimated classes of other documents in its neighborhood to refine the class distribution of the document being classified. They discussed three areas of research: text and hypertext information retrieval, machine learning in contexts other than text or hypertext, and computer vision and pattern recognition.
Gago-Alonso, Carrasco-Ochoa and Medina-Pagola in 2010 presented the gdFil [5] algorithm, which is based on pattern growth. gdFil removes all duplicate candidates before support calculation. However, support calculation still requires expensive sub-graph isomorphism tests.
In FSG [6], proposed by Kuramochi and Karypis in 2001, candidates are generated by adding one edge at a time. The size of a graph in FSG is measured by its number of edges. To identify each subgraph uniquely, FSG uses a canonical labeling technique in which each graph has a unique canonical label. The canonical labels of two graphs are the same only when both graphs have the same topological structure and the same edge and vertex labels. FSG also has some drawbacks: it generates multiple candidates (Fig. 4), and it relies on isomorphism testing, which is very costly.
Fig 4: Multiple Candidate Generation for FSG[6]
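The idea of a canonical label can be sketched in a few lines of Python. This is a brute-force illustration of the concept only, not the optimized labeling used in FSG [6]: the code takes the lexicographically smallest encoding over all vertex orderings, which costs factorial time. The encoding, the function name `canonical_label` and the example graphs are assumptions.

```python
from itertools import permutations


def canonical_label(vertex_labels, edge_labels):
    """Smallest (vertex labels + upper-triangle edge labels) code over all
    vertex orderings.

    `vertex_labels` maps vertex -> string label; `edge_labels` maps
    frozenset({u, v}) -> string label. A missing edge is encoded as "".
    """
    best = None
    for order in permutations(vertex_labels):
        n = len(order)
        code = tuple(vertex_labels[v] for v in order) + tuple(
            edge_labels.get(frozenset((order[i], order[j])), "")
            for i in range(n)
            for j in range(i + 1, n)
        )
        if best is None or code < best:
            best = code
    return best


# Two drawings of the same labeled graph (C -s- N -d- O) with different ids.
g1 = ({1: "C", 2: "N", 3: "O"},
      {frozenset((1, 2)): "s", frozenset((2, 3)): "d"})
g2 = ({"x": "O", "y": "C", "z": "N"},
      {frozenset(("y", "z")): "s", frozenset(("z", "x")): "d"})
# Same structure and vertex labels, but the two edge labels are swapped.
g3 = ({1: "C", 2: "N", 3: "O"},
      {frozenset((1, 2)): "d", frozenset((2, 3)): "s"})

same_graph = canonical_label(*g1) == canonical_label(*g2)
different_graph = canonical_label(*g1) != canonical_label(*g3)
```

Because the label is a minimum over every ordering of a complete encoding, two graphs receive the same label exactly when they are isomorphic with matching vertex and edge labels, so duplicate candidates can be detected by comparing labels.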
Calders and Wijsen [7] in 2001, in their work on monotone data mining languages, presented a simple data-mining logic (DML) that can express common data mining tasks such as “find Boolean association rules” or “find inclusion dependencies”. Kramer, De Raedt and Helma [8] in 2001 presented the application of feature mining techniques to the Developmental Therapeutics Program’s AIDS antiviral screen database. Kuramochi and Karypis [9] in 2001 presented a computationally efficient algorithm for finding all frequent subgraphs in large graph databases. They evaluated the performance of the algorithm by experiments with synthetic datasets as well as a chemical compound dataset. Pei, Han, Mortazavi-Asl and Pinto [9] in 2001 proposed a novel sequential pattern mining method called PrefixSpan, that is, prefix-projected sequential pattern mining.
Asai, Abe and Kawasoe [10] in 2002 studied efficient substructure discovery from large semi-structured data, modeling data and patterns by labeled ordered trees, and considered the problem of discovering all frequent tree-like patterns that have at least a minimum support in a given collection of semi-structured data. They presented an efficient pattern mining algorithm, FREQT, for discovering all frequent tree patterns from a large collection of labeled ordered trees. Borgelt and Berthold [11] in 2002 presented an algorithm to find fragments in a set of molecules that help to discriminate between different classes of, for instance, activity in a drug discovery context. Zaki [12] in 2002 presented the TREEMINER algorithm to discover all frequent subtrees in a forest, using a new data structure called the scope-list. Deshpande, Kuramochi and Karypis [13] in 2002 proposed techniques for classifying chemical compounds. Such techniques can be broadly categorized into two groups. The first group consists of techniques that rely mainly on various global properties of the chemical compounds, such as molecular weight, ionization potential and inter-atomic distance, for capturing the structural properties of the compounds. Since this information is not relational, existing classification techniques can easily be used on these datasets. However, the absence of actual structural information limits the accuracy of such classifiers. The second group of techniques directly analyzes the structure of the chemical compounds to identify patterns that can be used for classification [14; 16; 15; 8]. One of the earliest studies on discovering substructures was carried out by Dehaspe et al. [8] in 1998, in which Inductive Logic Programming (ILP) techniques were used. Although this approach is quite powerful, it is not designed to scale to large graph databases and hence may not be able to handle large chemical compound databases. A number of recent approaches have focused on analyzing the graph representation of the chemical compounds to identify frequently occurring patterns and on using these patterns to aid classification. Wang et al. [16] in 1997 developed an algorithm to find frequently occurring blocks in the geometric representation of protein molecules and showed that these blocks can be used for classification. Inokuchi et al. [15] developed an algorithm to find all frequently occurring induced subgraphs and presented some evidence that such subgraphs can be used as features for classification.
Cooper and Frieze [17] in 2003 described a general model of a random graph process whose proportional degree sequence obeys a power law. Such laws have recently been observed in graphs associated with the World Wide Web. Dzeroski [18] in 2003 introduced multi-relational data mining. Getoor [19] in 2003 studied link mining. Links among objects may demonstrate certain patterns, which can be helpful for many data mining tasks and are usually hard to capture with traditional statistical models. Link mining is a promising new area where relational learning meets statistical modeling. Huan, Wang and Prins [20] in 2003 proposed a novel subgraph mining algorithm, FFSM, which employs a vertical search scheme within an algebraic graph framework that they developed to reduce the number of redundant candidates proposed. Their study on synthetic and real datasets demonstrates that FFSM achieves a substantial performance gain over gSpan, the state-of-the-art subgraph mining algorithm at the time.
Washio and Motoda [21] in 2003 introduced the theoretical basis of graph-based data mining and surveyed the state of the art. Yan, Han and Afshar [22] in 2003 proposed an alternative but equally powerful solution based on pattern growth: instead of mining the complete set of frequent subsequences, they mined only the frequent closed subsequences, that is, those with no super-sequence having the same support. They also introduced an efficient algorithm called CloSpan (which stands for closed sequential pattern mining). It outperforms the previous work by one order of magnitude. Moreover, a deep understanding of efficient sequential pattern mining methods may also have strong implications for the development of efficient methods for mining frequent subtrees, lattices, subgraphs and other structured patterns in large databases. The sequential pattern mining algorithms developed so far perform well on databases consisting of short frequent sequences. Unlike Apriori-based algorithms, gSpan does not generate candidate patterns and test them for false-positive pruning, which saves time and space; however, it still has high computational complexity.
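The notion of a closed subsequence can be made concrete with a short Python sketch. It illustrates the definition only and is not CloSpan [22], which prunes during the search instead of filtering afterwards. Sequences are simplified to tuples of single items, and the function names and the toy database are assumptions.

```python
from itertools import combinations


def is_subsequence(short, long):
    """True if `short` can be obtained from `long` by deleting items."""
    remaining = iter(long)
    return all(item in remaining for item in short)


def frequent_subsequences(database, min_support):
    """Enumerate every subsequence of every sequence and keep frequent ones."""
    candidates = {
        tuple(seq[i] for i in idx)
        for seq in database
        for r in range(1, len(seq) + 1)
        for idx in combinations(range(len(seq)), r)
    }
    supports = {
        c: sum(is_subsequence(c, seq) for seq in database) for c in candidates
    }
    return {c: s for c, s in supports.items() if s >= min_support}


def closed_only(supports):
    """Keep patterns with no proper super-sequence of equal support."""
    return {
        p: s
        for p, s in supports.items()
        if not any(
            len(q) > len(p) and t == s and is_subsequence(p, q)
            for q, t in supports.items()
        )
    }


# Hypothetical toy database of three sequences.
sequence_db = [("a", "b", "c"), ("a", "b", "c"), ("a", "b")]
frequent = frequent_subsequences(sequence_db, min_support=2)
closed = closed_only(frequent)
```

In this toy database seven subsequences are frequent, but only two are closed: (a, b) with support 3 and (a, b, c) with support 2. The supports of the other five can be recovered from these two, which is why mining only closed patterns loses no information.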
Yan and Han [23] in 2003 proposed mining closed frequent graph patterns. A graph g is closed in a database if there exists no proper supergraph of g that has the same support as g. A closed graph pattern mining algorithm, CloseGraph, was developed by exploring several interesting pruning methods. Their performance study showed that CloseGraph not only dramatically reduces the number of unnecessary subgraphs generated but also substantially increases the efficiency of mining, especially in the presence of large graph patterns. Yin and Han [24] in 2003 developed a new classification approach called CPAR (Classification based on Predictive Association Rules). In their performance study, CPAR achieved high accuracy and efficiency. CPAR represents a new approach towards efficient and high-quality classification. It would be interesting to further enhance the efficiency and scalability of this approach and compare it with other well-established classification schemes. Moreover, the strength of the derived predictive rules motivates an in-depth study of alternative approaches to effective association rule mining.
Huan, Wang, Prins and Yang [25] in 2004 developed a new algorithm that mines only maximal frequent subgraphs, that is, subgraphs that are not part of any other frequent subgraph. Their algorithm can achieve a five-fold speedup over the then state-of-the-art subgraph mining algorithms. Their mining method is based on a novel graph mining framework in which they first mine all frequent tree patterns from a graph database and then construct maximal frequent subgraphs from the trees. Huan, Wang, Bandyopadhyay, Snoeyink and Prins [26] in 2004 applied a novel subgraph mining algorithm to three related graph representations of the sequence and proximity characteristics of a protein's amino acid residues. The subgraph mining algorithm is used to discover spatial motifs that can discriminate among proteins in different families found in the SCOP database. Koyuturk, Grama and Szpankowski [27] in 2004 presented a new algorithm for detecting frequently occurring patterns and modules in biological networks. They showed experimentally that their algorithm can extract such patterns from the KEGG database within seconds. The proposed model and algorithm are applicable to a variety of biological networks either directly or with minor modification. Meinl, Borgelt and Berthold [28] in 2004 showed that it is possible to mine meaningful, discriminative molecular fragments from large databases. Using an existing algorithm that employs a depth-first strategy, together with a sophisticated ordering scheme, avoids costly re-embeddings throughout the candidate growth process, which in turn makes it possible to find larger fragments as well. Yin, Han, Yang and Yu [29] in 2004 developed CrossMine, an efficient and scalable approach for multi-relational classification. Several novel methods are developed in CrossMine, including (1) tuple ID propagation, which performs a semantics-preserving virtual join to achieve high efficiency on databases with complex schemas, and (2) a selective sampling method, which makes it highly scalable with respect to the number of tuples in the databases. Both the theoretical background and the implementation techniques of CrossMine are introduced. Yan, Yu and Han [30] in 2004 investigated the issues of indexing graphs and proposed a novel solution by applying a graph mining technique. Unlike the existing path-based methods, their approach, called gIndex, makes use of frequent substructures as the basic indexing feature. Frequent substructures are ideal candidates since they explore the intrinsic characteristics of the data and are relatively stable under database updates. gIndex has a 10 times smaller index size but achieves 3-10 times better performance than a typical path-based method.
Yongling Song and Su-Shing Chen in 2005 presented the RNGV [31] algorithm, which is based on graph edit distance. RNGV uses a modified Apriori algorithm that treats edges as items and performs edit operations that include vertex and edge insertion, deletion and relabeling. A limitation of RNGV is that it explores the possible edit paths while keeping only one expected path as a candidate, so it does not claim completeness. Hu, Yan, Huang, Han and Zhou [32] in 2005 developed a novel algorithm, CODENSE, to efficiently mine frequent coherent dense subgraphs across a large number of massive graphs from biological networks for function discovery. Li and Tan [33] in 2005 proposed a novel graph mining algorithm to detect the dense neighborhoods in an interaction graph that may correspond to protein complexes. Their algorithm first locates local cliques for each graph vertex and then merges the detected local cliques according to their affinity to form maximal dense regions. Yan, Yu and Han [34] in 2005 investigated the issues of substructure similarity search using indexed features in graph databases. By transforming the edge relaxation ratio of a query graph into the maximum allowed number of missing features, their structural filtering algorithm, called Grafil, can filter out many graphs without performing pairwise similarity computations. Yin, Han and Yu [35] in 2005 proposed a new approach, called CROSSCLUS, which performs cross-relational clustering with user guidance.
Kuramochi and Karypis in 2006 presented the GREW [36] algorithm, which is based on sub-isomorphism and employs ideas of edge contraction and graph rewriting. It discovers frequent sub-graphs by repeatedly merging the embeddings of existing frequent sub-graphs. Chakrabarti and Faloutsos [37] in 2006 observed that the problems of detecting abnormalities in a given graph and of generating synthetic but realistic graphs have received considerable attention recently. Both are tightly coupled to the problem of finding the distinguishing characteristics of real-world graphs, i.e. the patterns that show up frequently in such graphs and can be considered marks of realism. Karunaratne and Bostrom [38] in 2006 noted that methods for learning from structured data are limited with respect to handling large isolated substructures and also impose constraints on search depth and induced structure length. They proposed an approach to learning from structured data using a graph-based canonical representation of structures called fingerprinting. Krasky, Rohwer, Schroeder and Selzer [39] in 2006 discussed a combined bioinformatics and chemoinformatics approach for the development of new antiparasitic drugs. Meinl, Worlein, Urzova, Fischer and Philippsen [40] in 2006 addressed parallel molecular mining with frequent subgraph miners, which are used in chemoinformatics to find common molecular fragments in a database of molecules represented as two-dimensional graphs. In the ParMol package they implemented four of the most popular frequent subgraph miners using a common infrastructure: MoFa, gSpan, FFSM and Gaston. They also added functionality to some of the algorithms, such as parallel search, mining directed graphs and mining in one big graph instead of a graph database. Meinl, Worlein, Fischer and Philippsen [41] in 2006 presented thread-based parallel versions of MoFa and gSpan that achieve speedups of up to 11 on a shared-memory SMP system using 12 processors. They discussed the design space of the parallelization, the results, and the obstacles caused by the irregular search space and by the current state of Java technology. Tsuda and Kudo [42] in 2006 proposed a graph clustering method that selects informative patterns at the same time. Their task is analogous to feature selection for vectors; the difference is that the features are not explicitly listed. The method is fully probabilistic, adopting a binomial mixture model defined on a very high-dimensional vector indicating the presence or absence of all possible patterns. Wegner, Frohlich, Mielenz and Zell [43] in 2006 presented a classification method based on a coordinate-free chemical space. Thus it does not depend on the descriptor values commonly used in coordinate-based chemical space methods.
Figure 5: Sequence of edge collapsing operations performed by GREW[36]
Chen, Yan, Zhu and Han in 2007 presented the gApprox [44] algorithm, which is based on substitution probabilities. It uses approximate matching to cope with noise and distortion, so that interesting patterns can still be captured. gApprox finds patterns in a single network, not in a set of graphs. Shijie Zhang and Jiong Yang in 2007 presented the Monkey [45] algorithm, which is based on β-edge sub-isomorphism. It is capable of finding frequent patterns that are missed by exact matching algorithms in the presence of edge distortions. It handles only missing edges and edge label mismatches. Merkwirth and Ogorzalek [46] in 2007 described a method for constructing specific types of neural networks composed of structures directly linked to the structure of the molecule under consideration. Each molecule can be represented by a unique neural connectivity problem (graph), which can be programmed onto a cellular neural network. A composite network can then successfully perform classification and regression on real-world chemical datasets. Dong, Gilbert, Guha, Heiland, Kim, Pierce, Fox and Wild [47] in 2007 developed an infrastructure of chemoinformatics web services that simplifies access to chemical information and to the computational techniques that can be applied to it. They described this infrastructure, gave some examples of its use, and discussed their plans to use it as a platform for chemoinformatics application development in the future. Rhodes, Boyer, Kreulen, Chen and Ordonez [48] in 2007 developed a system that allows users to explore the US patent corpus using molecular information. Their system combines three main technologies. A user may go to a web page, draw a molecule, search for related intellectual property (IP) and analyze the results.
Figure 6: Exploring the Pattern Space in [44]: (a) the original graph, (b) decomposition of the problem and stage 1, (c) stage 2, (d)
stage 3, (e) stage 4, where each stage is enclosed respectively within a loop. We darken marked vertices along the way, while edges
are correspondingly changed from dotted to thickened as new vertices are absorbed
Figure 7 β-edge isomorphism (β>=2) for Monkey algorithm[45]
Bogdanov [49] in 2008 studied graph searching, indexing, mining and modeling for bioinformatics, chemoinformatics and social networks. Fahim et al. [50] in 2008 proposed a method based on shifting the center of the large cluster toward the small cluster and recomputing the membership of the small cluster's points; the experimental results indicate that the proposed algorithm produces satisfactory results. Gedeck and Lewis [51] in 2008 stated that QSAR models can play a vital role in both the opening phase and the endgame of lead optimization. In the opening phase there is often a large quantity of data from high-throughput screening (HTS), and potential leads need to be selected from several distinct chemotypes. Guha and Schurer [52] in 2008 investigated various aspects of developing computational models to predict cell toxicity based on cell proliferation screening data generated in the MLSCN. By capturing feature-based information in that dataset, such predictive models would be useful in evaluating cell-based screening results in general and could be used as an aid to identify and eliminate potentially undesired compounds. Hubler, Kriegel, Borgwardt and Ghahramani [53] in 2008 presented Metropolis algorithms for sampling a representative small subgraph from an original large graph, where “representative” means that the sample preserves crucial graph properties of the original graph. Karunaratne and Bostrom [54] in 2008 presented a case study in chemoinformatics in which various types of background knowledge are encoded in graphs that are given as input to a graph learner. The paper showed that the type of background knowledge encoded does have an effect on predictive performance. Lam and Chan [55] in 2008 observed that graph data mining algorithms are increasingly applied to biological graph datasets. They proposed a graph mining algorithm, MIGDAC (Mining Graph Data for Classification), that applies graph theory and an interestingness measure to discover interesting subgraphs that both characterize a class and easily distinguish it from other classes. Maji and Mehta [56] in 2008 proposed a novel measure of similarity between labeled graphs, which has applications to structured data analysis, for example chemical informatics and web document clustering. Their metric on graphs exploits vertex context similarity and computes an overall matching score in time polynomial in the size of the graphs, using a network flow formulation of the problem. Tsuda and Kurihara [57] in 2008 proposed a nonparametric Bayesian method for clustering graphs and selecting salient patterns at the same time. Variational inference is adopted because sampling is not applicable due to the extremely high dimensionality. Schietgat, Costa, Ramon and De Raedt [58] in 2009 proposed a direct, efficient and simple approach for the generation of interesting graph patterns. They computed maximum common subgraphs from randomly selected pairs of examples and used them directly as features.
Zou, Li, Gao and Zhang in 2010 developed the MUSE [59] algorithm, which is based on sub-isomorphism on uncertain graphs. MUSE finds a set of FAS patterns by allowing an error tolerance on the expected support of the discovered sub-graph patterns.
Jiang, Coenen and Zito [60] in 2010 examined a number of edge weighting schemes and suggested three strategies for controlling candidate set generation. Spjuth, Willighagen, Guha, Eklund and Wikberg [59] in 2010 worked toward interoperable and reproducible QSAR analysis through the exchange of datasets. QSAR is a widely used method to relate chemical structures to responses or properties based on experimental observations. Much effort has been made to evaluate and validate the statistical modeling in QSAR, but these analyses treat the dataset as fixed. An overlooked but highly important issue is the validation of the setup of the dataset, which comprises the addition of chemical structures as well as the selection of descriptors and software implementations prior to calculations. This process is hampered by the lack of standards and exchange formats in the field, which makes it virtually impossible to reproduce and validate analyses and drastically constrains collaboration and re-use of data. Yang, Parthasarathy and Sadayappan [60] in 2010 presented a novel approach to data representation for computing the sparse matrix-vector multiplication kernel, particularly targeting sparse matrices representing power-law graphs. They showed that their representation scheme, coupled with a novel tiling algorithm, can yield significant benefits over the current state-of-the-art GPU and CPU efforts on a number of core data mining algorithms such as PageRank, HITS and Random Walk with Restart. A graph transaction is represented by adjacency matrices, and the frequent patterns appearing in the matrices are mined through the extended algorithm. Chemical compounds are modeled as attributed graphs in which each vertex represents an atom and each edge a bond between atoms. Each vertex carries an attribute that indicates the atom type.
Papapetrou and Ioannou in 2011 developed the UGRAP [61] algorithm, which is based on sub-isomorphism on uncertain graphs. This algorithm is intended for applications whose data are uncertain. UGRAP uses an index (containing graph edges and their probabilities) that reduces the number of comparisons required for computing the expected support of each candidate pattern. It further improves efficiency through optimizations for early termination and scheduling of graph comparisons.
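To make the notion of expected support on uncertain graphs concrete, the following Python sketch computes it in a heavily simplified setting. It is not the MUSE or UGRAP procedure: it assumes that edges exist independently, that vertices are uniquely identified so a pattern has at most one embedding per graph, and that expected support is the mean occurrence probability over the database. All names and probabilities are illustrative assumptions.

```python
from math import prod


def occurrence_probability(uncertain_graph, pattern_edges):
    """Probability that all pattern edges exist, assuming independent edges.

    `uncertain_graph` maps an edge (a sorted vertex pair) to its existence
    probability; an edge that is absent from the dict has probability 0.
    """
    return prod(uncertain_graph.get(edge, 0.0) for edge in pattern_edges)


def expected_support(database, pattern_edges):
    """Mean occurrence probability of the pattern over all uncertain graphs."""
    total = sum(occurrence_probability(g, pattern_edges) for g in database)
    return total / len(database)


# Hypothetical uncertain graphs with assumed edge probabilities.
uncertain_db = [
    {("A", "B"): 0.9, ("B", "C"): 0.5},
    {("A", "B"): 0.8},
]
pattern_edges = [("A", "B"), ("B", "C")]
exp_sup = expected_support(uncertain_db, pattern_edges)
```

With these assumed probabilities the pattern occurs in the first graph with probability 0.9 × 0.5 = 0.45 and cannot occur in the second, giving an expected support of about 0.225. A pattern would then be reported if this value reaches a chosen threshold, possibly relaxed by an error tolerance as in MUSE.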
ChenJia, Zhang and Huan in 2007 presented the APGM [62] algorithm, and Acosta-Mendoza, Gago-Alonso and Medina-Pagola in 2012 presented the VEAM [63] algorithm; both are based on substitution probabilities. They use approximate matching to cope with noise and distortion, so that interesting patterns can still be captured. APGM and VEAM specify which vertices, edges or labels can replace others when performing frequent sub-graph mining. These methods perform feature (sub-graph) selection taking semantic distortions into account.
Figure 8: An illustrative example for [61] showing (a) an uncertain graph, (b) its implied exact graphs and (c) a subgraph pattern.
The APGM algorithm allows semantic distortions only between vertices of graphs, while the VEAM algorithm allows distortions between vertices and between edges.
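Approximate matching with label substitutions can be sketched in Python as follows. This is a simplified illustration in the spirit of vertex-only distortions, not the actual APGM or VEAM procedure: a pattern vertex label may be replaced by a graph vertex label when an assumed substitution probability reaches a threshold `tau`, while the edge structure must still match exactly. The substitution table, the threshold values and all names are hypothetical.

```python
from itertools import permutations


def approx_contains(graph, pattern, substitution, tau):
    """True if `pattern` occurs in `graph` when vertex labels may be
    substituted with probability at least `tau`.

    Graphs are (vertex -> label dict, set of frozenset edges); `substitution`
    maps (pattern_label, graph_label) -> probability. Brute-force search.
    """
    g_labels, g_edges = graph
    p_labels, p_edges = pattern

    def compatible(p_label, g_label):
        return (
            p_label == g_label
            or substitution.get((p_label, g_label), 0.0) >= tau
        )

    p_nodes = list(p_labels)
    for image in permutations(g_labels, len(p_nodes)):
        mapping = dict(zip(p_nodes, image))
        if all(
            compatible(p_labels[v], g_labels[mapping[v]]) for v in p_nodes
        ) and all(
            frozenset(mapping[v] for v in edge) in g_edges for edge in p_edges
        ):
            return True
    return False


# Hypothetical example: the pattern C-N is sought in a graph holding only C-O.
target = ({1: "C", 2: "O"}, {frozenset((1, 2))})
query = ({"a": "C", "b": "N"}, {frozenset(("a", "b"))})
substitution = {("N", "O"): 0.7}  # assumed probability that N appears as O

loose_match = approx_contains(target, query, substitution, tau=0.5)
strict_match = approx_contains(target, query, substitution, tau=0.8)
exact_match = approx_contains(target, query, {}, tau=0.5)
```

With the looser threshold the pattern is found because N may be replaced by O, whereas the stricter threshold and the empty substitution table both fall back to exact matching and reject the occurrence. Allowing substitutions between edge labels as well would move the sketch towards the VEAM setting.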
Fig. 9(a): Example of three labeled graphs[63]
Figure 9(b) Example of two ways to map vertices and edges of the labeled graph T to those of the labeled graph G [63].
III. Conclusion
Recently, there has been increasing interest in using graph-based methods as a powerful tool for image classification. This review has discussed some of the major graph mining methods and highlighted their strengths as well as their limitations. Some difficulties with these methods have limited their use in practical applications. With exact matching, the interesting sub-graphs sometimes show slight differences throughout the data and are therefore missed. Current research in graph mining is oriented towards producing approximate (sub-optimal) solutions to such graph matching problems in order to find frequent and more interesting sub-graphs for applications. Many of the graph mining methods nevertheless have their own disadvantages. This survey indicates that most of the methods are not yet suited to many real-time applications, which demonstrates the need for further research to refine graph mining techniques for real-time scenarios.
IV. References
[1] Y. Xiao, W. Wu, W. Wang, Z. He, Efficient algorithms for node disjoint subgraph homeomorphism determination, in: Proceedings
of the 13th International Conference on Database Systems for Advanced Applications, New Delhi, India, 2008, pp. 452–460.
[2] L.B. Holder, D.J.Cook, H.Bunke, Fuzzy substructure discovery, in: Proceedings of the Ninth International Workshop on
Machine Learning, San Francisco, CA, USA, 1992, pp.218–223.
[3] H. Blockeel, L.D. Raedt, “Top-down induction of first-order logic decision trees”, Artificial Intelligence, 101, 1998, pp. 285-297.
[4] S. Chakrabarti, B. Dom, P. Indyk, “Enhanced hypertext categorization using hyperlinks” ACM, (SIGMOD’98), 1998, pp. 307-
318.
[5] A.Gago-Alonso, J.A.Carrasco-Ochoa, J.E.Medina-Pagola, J.F.Martínez-Trinidad, Full duplicate candidate pruning for frequent
connected subgraph mining, in: Integrated Computer-Aided Engineering, 17August, 2010,pp.211–225.
[6] M. Kuramochi and G. Karypis, “Frequent subgraph discovery”, In Intl. Conf. on Data Mining, (ICDM’01), pages 313-320, Nov.
2001
[7] T. Calders, J. Wijsen, “On Monotone mining Languages”, In proc. Of international workshop on database programming
Languages(DBPL), 2001, pp. 119-132.
[8] S. Kramer, L.D. Raedt, C. Helma, “Molecular feature mining in HIV data”, In Proc. of the 7th ACM SIGKDD
International conf. on knowledge discovery and data mining, 2001, pp. 136-143.
[9] M. Kuramochi, G. Karypis, “Frequent Subgraph Discovery”, In Proc. 2001 Int. conf. Data mining (ICDM’01).
[10] T. Asai, K. Abe, S. Kawasoe, H. Arimura, H. Sakamoto, and S. Arikawa, “Efficient substructure discovery from large semi-
structured data.” In proc. 2002 SIAM Int. conf. Data mining (SDM’02), 2002, pp. 158-174.
[11] C. Borgelt and M.R. Berthold, “Mining molecular fragments: Finding relevant substructures of molecules.” In Proc. 2002 int.
conf. Data Mining (ICDM’02), PP. 211-218.
[12] M.J. Zaki, “Efficiently Mining Frequent trees in a forest.” In Proc. 2002 ACM SIGKDD Int. conf. knowledge Discovery and
Datamining(KDD’02), 2002, pp. 71-80.
[13] M.Deshpande, M. Kuramochi, G. Karypis, “Automated approaches for classifying structures.” In Proc. 2002 workshop on Data
mining in Bioinformatics (BIOKDD’02), 2002, pp. 11-18.
[14] L. Dehaspe, H. Toivonen, R. D. King, “Finding frequent substructures in chemical compounds”. In 4th Int. conf. on knowledge
Discovery and Data mining, 1998.
[15] A. Inokuchi, T. Washio, H. Motoda, “An Apriori-based Algorithm for Mining Frequent substructures from Graph Data.” In proc.
2000 European Symp. Principle of Data mining and knowledge Discovery (PKDD’00), 2000, pp. 13-23.
[16] X. Wang, J.T.L. Wang, D. Shasha, B. Shapiro, S. Dikshitulu, I. Rigoutsos, K. Zhang, “Automated discovery of active motifs in
three dimensional molecules”, In Proc. of the 3rd int. conf. on knowledge discovery and data mining, 1997.
[17] C. Cooper and A. Frieze, “A general model of web graph.” 2003.
[18] S. Dzeroski, “Multi-Relational Data mining: An Introduction” SIGKDD Explor. Newsl., 5(1), 2003, pp.1-16.
[19] L. Getoor, “Link Mining: A new data mining challenge.” SIGKDD Explorations, (5), 2003, pp.84-89.
[20] J. Huan, W. Wang and J. Prins, “Efficient Mining of frequent Subgraph in the Presence of Isomorphism.” In Proc. 2003 int. conf.
Data mining (ICDM’03), 2003, pp. 549-552.
[21] T. Washio and H. Motoda, “State of the art of Graph- based data mining.” SIGKDD Explorations, (5), 2003, pp. 59-68.
[22] X. Yan, J. Han and R. Afshar, “CloSpan: Mining Closed Sequential patterns in Large Datasets.” In Proc. 2003 SIAM Int. conf.
Data mining (SDM’03), 2003, pp. 166-177.
[23] X. Yan and J. Han, “CloseGraph: Mining Closed Frequent Graph Patterns.” In proc. 2003 ACM SIGKDD Int. conf. knowledge
Discovery and Data Mining (KDD’03), 2003, pp. 286-295.
[24] X. Yin and J. Han, “ CPAR: Classification based on Predictive Association Rules. In Proc. 2003 SIAM Int. conf. Data Mining
(SDM’03), 2003, pp. 331-335.
[25] J. Huan, W. Wang, J. Prins and J. Yang,”Spin: mining maximal frequent subgraphs from graph Databases”, KDD04 Seattle,
Washington, USA, 2004.
[26] J. Huan, W. Wang, D. Bandyopadhyay, J. Snoeyink, J. Prins and A. Tropsha, “Mining Spatial Motifs from Protein Structure
Graphs.” In Proc. 8th int. conf. Research in computational Molecular Biology (RECOMB), 2004, pp. 308-315.
[27] M. Koyuturk, A. Grama, and W. Szpankowski. “An Efficient algorithm for detecting frequent subgraphs in biological networks.”
Bioinformatics, (20), 2004, pp. i200-i207.
[28] T. Meinl, C. Borgelt and M.R. Berthold, “ Discriminative closed fragment mining and perfect extensions in MoFa.” 2004.
[29] X. Yin, J. Han, J.Yang and P.S. Yu, “CrossMine: Efficient Classification Across Multiple Database Relations. In Proc. 2004 int.
conf. Data Engineering (ICDE’04), 2004, pp. 399-410.
[30] X. Yan, P.S. Yu, and J. Han, “Graph Indexing: A Frequent Structure-based approach.” In Proc. 2004 ACM SIGMOD Int. conf.
management of Data, 2004, pp.335-346.
[31] Y.Song, S.Chen, Itemsets based graph mining algorithm and application in genetic regulatory networks, in: Proceedings of the
IEEE International Conference on Granular Computing, Atlanta, GA, USA, 2006, pp.337–340.
[32] H. Hu, X. Yan, Y. Huang, J. Han and X. J. Zhou, “Mining coherent dense subgraphs across massive biological networks for
functional discovery.” In proc. 2005 Int. conf. Intelligent system for molecular Biology (ISMB’05), 2005, pp. 213-221.
[33] X.L. Li, S.H. Tan, “Interaction graph mining for protein complexes using local clique merging.” Genome Informatics,
16(2),2005, pp. 260-269.
[34] X. Yan, P.S. Yu, J. Han, “Substructure similarity search in graph databases.” In proc. 2005 ACM-SIGMOD Int. conf.
Management of Data (SIGMOD’05), 2005, pp. 766-777.
[35] X. Yin, J. Han, P.S. Yu, “Cross-Relational Clustering with User’s Guidance. In Proc. 2005 ACM SIGKDD Int. conf. Knowledge
Discovery and Data mining (KDD’05), 2005, 344-353.
[36] M. Kuramochi, G. Karypis, GREW - a scalable frequent subgraph discovery algorithm, in: Proceedings of the 4th IEEE
International Conference on Data Mining, 2004, pp. 439–442.
[37] D. Chakrabarti, C. Faloutsos, “Graph mining: Laws, Generators, and Algorithm.” ACM computing survey, 38(2), 2006, pp. 1-69.
[38] S. Maji, S. Mehta, “A Netflow distance between labeled graphs applications in chemoinformatics.” www.cs.berkeley.edu., 2008.
[39] A. Krasky, A. Rower, J. Schroeder, P.M. Selzen,“A combined bioinformatics and chemoinformatics approach for the
development of new antiparasitic drugs.” Elsevier, genomics, 2006, pp.1-8.
[40] T. Meinl, M. Worlein, O. Urzova, , I. Fischer, M. Philippsen, “The ParMol package for frequent subgraph mining. Electronics
communication of ESST 1, 2006.
[41] T. Meinl, M. Worlein, I. Fischer, M. Philippsen “Mining Molecular datasets on Symmetric Multiprocessor systems. 2006.
[42] K. Tsuda, T. Kudo, “Clustering Graphs by weighted substructure mining.” Proc. of 23rd Int. conf. on machine learning, ACM,
148:953-960, 2006.
[43] J. K. Wegner, H. Frohlich, H.M. Mielenz, A. Zell, “ Data and graph mining in chemical space for ADME and activity data sets.”
Wiley-VCH, 25(3),2006, 205-206.
[44] C. Chen, X.Yan, F.Zhu, J.Han, gApprox: mining frequent approximate patterns from a massive network, in: IEEE International
Conference on Data Mining, 2007,pp.445–450.
[45] S. Zhang, J. Yang, RAM: randomized approximate graph mining, in: Proceedings of the 20th International Conference on
Scientific and Statistical Database Management, Hong Kong, China, 2008, pp. 187–203.
[46] C. Merkwith, M. Ogorzalek, “Applying CNN to chemoinformatics.” IEEE Xplore, 2007, 2918-2921.
[47] X. Dong, K.E. Gilbert, R. Guha, R. Heiland, J. Kim, M.E. Pierce, G.C. Fox, D.J. Wild, “Web Service Infrastructure for
chemoinformatics.” J. Chem. Inf. Model, 47, 2007, pp. 1303-1307.
[48] J. Rhodes, S. Boyer, J. Kreulex, Y. Chen, P. Ordonez, “Mining patents using molecular similarity search.” Pacific symp. On
Biocomputing, 12, 2007, pp. 304-315.
[49] P. Bogdanov, “Graph searching, indexing, mining and modeling for Bioinformatics, cheminformatics and Social network.” 2008.
[50] A.M. Fahim, G. Saake, A.M. Salem, F. A. Torkey, M.A. Ramadan, “K-means for spherical clusters with large variance in sizes.”
World Academy of Science, Engineering & Tech., 45, 2008, pp. 177-182.
[51] P. Godeck, R.A. Lewis, “Exploiting QSAR models in lead optimization.” Curr. Opin. Drug Discov Devel, 11(4), 2008, pp. 569-
575.
[52] R. Guha, S.C. Schurer, “Utilizing high throughput screening data for predictive toxicology models: Protocols and Application to
MLSCN assays.” J. comput. Aided Mol Des 22(6-7), 2008, pp. 367-384.
[53] C. Hubler, H.P. Kriegel, K. Borgwardt, Z. Ghahramani, “Metropolis Algorithms for Representative Subgraph sampling.”
IEEEXplore, 2008, pp. 283-292.
[54] T. Karunaratne, H. Bostrom, “Using background knowledge for graph based learning: a case study in chemoinformatics.”
Springer, Artificial Intelligence, (6), 2008, pp. 151-153.
[55] W.W.M.Lam, K.C.C. Chan, “A Graph mining algorithm for classifying chemical compounds." IEEE Int. conf. on Bioinformatics
and Biomedicine, 2008.
[56] S. Maji, S. Mehta, “A Netflow distance between labeled graphs applications in chemoinformatics.” www.cs.berkeley.edu., 2008.
[57] K. Tsuda, K. Kurihara, “Graph Mining with variational Dirichlet process mixture models.” SIAM, 432-442, 2008.
[58] L. Schietgat, F. Costa, J. Ramon, L.D. Raedt, “Maximum common subgraph mining: A Fast and effective Approach towards
feature generation.” In Proc. at SRL-MLG-ILP, Leuven, Belgium, 2009.
[59] Z. Zou, J.Li, H.Gao, S.Zhang, Mining frequent subgraph patterns from uncertain graph data, IEEE Transactions on Knowledge
and Data Engineering 22 (9)(2010)1203–1218.
[60] C. Jiang, F. Coenen, M. Zito, “Frequent Sub-graph mining on Edge Weighted Graphs.” 2010.
[61] O. Papapetrou, E. Ioannou, D. Skoutas, Efficient discovery of frequent subgraph patterns in uncertain graph databases, in:
Proceedings of the 14th International Conference on Extending Database Technology, vol. 12, New York, USA, 2011, pp. 355–366.
[62] A.Gago-Alonso, J.A.Carrasco-Ochoa, J.E.Medina-Pagola, J.F.Martínez-Trinidad, Full duplicate candidate pruning for frequent
connected subgraph mining, in: Integrated Computer-Aided Engineering, 17August, 2010,pp.211–225.
[63] N. Acosta-Mendoza, A.Gago-Alonso, J.E.Medina-Pagola, Frequent approximate subgraphs as features for graph-based image
classification, Knowledge- Based Systems 27(March)(2012)381–392.
[64] R. Alves, D.S. Rodríguez-Baena, J.S. Aguilar-Ruiz, Gene association analysis: a survey of frequent pattern mining from gene
expression data, Briefings in Bioinformatics 11 (2010) 210–224.
[65] A. Jiménez, F. Berzal, J.C.C. Talavera, Frequent tree pattern mining: a survey, Intelligence Data Analysis 14 (2010) 603–622.
[66] J. Han, H. Cheng, D. Xin, X. Yan, Frequent pattern mining: current status and future directions, Data Mining and Knowledge
Discovery (10th Anniversary 15) (2007) 55–86.
[67] C.H. Weng, Mining fuzzy specific rare itemsets for education data, Knowledge-Based Systems (2011) 697–708.
[68] U. Yun, K.H. Ryu, Approximate weighted frequent pattern mining with/without noisy environments, Knowledge-Based Systems
24 (2011) 73–82.
[69] X. Yan, J. Han, gSpan: Graph-based substructure pattern mining, in: Proceedings International Conference on Data Mining,
Maebashi, Japan, 2002, pp. 721–724.
[70] S. Nijssen, J.N. Kok, A quickstart in frequent structure mining can make a difference, in: Proceedings of the Tenth ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004, pp. 647–652.
[71] F. Eichinger, K. Böhm, Software-bug localization with graph mining, in: C.C. Aggarwal, H. Wang (Eds.), Managing and Mining
Graph Data, vol. 40, Springer-Verlag, New York, 2010.
[72] A. Gago-Alonso, A. Puentes-Luberta, J.A. Carrasco-Ochoa, J.E. Medina-Pagola, J.F. Martínez-Trinidad, A new algorithm for
mining frequent connected subgraphs based on adjacency matrices, Intelligence Data Analysis 14 (2010) 385–403.
[73] C. Jiang, F. Coenen, R. Sanderson, M. Zito, Text classification using graph mining-based feature extraction, Knowledge-Based
Systems 23 (2010) 302–308.
[74] N. Ketkar, L. Holder, D. Cook, Mining in the proximity of subgraphs, in: Analysis and Group Detection KDD Workshop on Link
Analysis: Dynamics and Statics of Large Networks, 2006.
[75] C. Borgelt, M. Berthold, Mining molecular fragments: finding relevant substructures of molecules, in: Proceedings of 2002
International Conference on Data Mining (ICDM’02), Maebashi, Japan, 2002, pp. 211–218.
[76] M.S. Hossain, R.A. Angryk, Gdclust: a graph-based document clustering technique, in: Proceedings of the Seventh IEEE
International Conference on Data Mining Workshops, IEEE Computer Society, Washington, DC, USA, 2007, pp. 417–422.
[77] M. Koyutürk, A. Grama, W. Szpankowski, An efficient algorithm for detecting frequent subgraphs in biological networks,
Bioinformatics (2004) 200–207.

Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncssuser2ae721
 
US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionMebane Rash
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitterShivangiSharma879191
 
Indian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptIndian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptMadan Karki
 
lifi-technology with integration of IOT.pptx
lifi-technology with integration of IOT.pptxlifi-technology with integration of IOT.pptx
lifi-technology with integration of IOT.pptxsomshekarkn64
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
welding defects observed during the welding
welding defects observed during the weldingwelding defects observed during the welding
welding defects observed during the weldingMuhammadUzairLiaqat
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleAlluxio, Inc.
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxk795866
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catcherssdickerson1
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxKartikeyaDwivedi3
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfme23b1001
 

Kürzlich hochgeladen (20)

CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
 
Class 1 | NFPA 72 | Overview Fire Alarm System
Class 1 | NFPA 72 | Overview Fire Alarm SystemClass 1 | NFPA 72 | Overview Fire Alarm System
Class 1 | NFPA 72 | Overview Fire Alarm System
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
 
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfgUnit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of Action
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter
 
Indian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptIndian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.ppt
 
lifi-technology with integration of IOT.pptx
lifi-technology with integration of IOT.pptxlifi-technology with integration of IOT.pptx
lifi-technology with integration of IOT.pptx
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
welding defects observed during the welding
welding defects observed during the weldingwelding defects observed during the welding
welding defects observed during the welding
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at Scale
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptx
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdf
 

Ijetcas14 314

International Association of Scientific Innovation and Research (IASIR)
(An Association Unifying the Sciences, Engineering, and Applied Research)
International Journal of Emerging Technologies in Computational and Applied Sciences (IJETCAS)
www.iasir.net
IJETCAS 14-314; © 2014, IJETCAS All Rights Reserved Page 43
ISSN (Print): 2279-0047
ISSN (Online): 2279-0055
A Review on Graph-based Image Classification
Mrs. Snehal N. Amrutkar (1), Prof. J. V. Shinde (2)
(1,2) Department of Computer Engineering, University of Pune
Late G.N.Sapkal College of Engineering, Anjaneri, Nashik, INDIA
Abstract: Graphs are increasingly important in modelling complicated structures, such as circuits, images, chemical compounds, protein structures, biological networks, social networks and the web. Graph-based data representations are an important research topic because this kind of data structure is well suited to modelling entities and the complex relations among them. We propose to combine a powerful graph-based image representation with frequent approximate sub-graph (FAS) mining algorithms in order to classify images. Among the many approaches to image classification, the graph-based approach is gaining popularity, primarily because of its ability to reflect global image properties. This paper critically reviews important existing frequent subgraph mining algorithms. The review not only reveals the strengths of each method and category but also explores its limitations.
Keywords: Frequent approximate sub-graphs, Graph-based image representation, Image classification
I. Introduction
The need to convert large volumes of data into useful information has increased. In many datasets, the objects may be represented as graphs. Fig. 1 shows an example of graphs used to represent images. As a result of this need, several authors have developed techniques and methods to process these datasets [64,65], e.g. frequent pattern discovery [66,67,68].
In many research fields, graphs are widely used to model data because of their expressiveness and their suitability for applications where entities and their relationships must be encoded in a data structure. The discovery of frequent patterns, mainly the detection of frequent sub-graphs in graph collections, is an important problem in graph mining [5,69,70]. In frequent sub-graph mining, there are two approaches to evaluating the similarity of graphs, known as graph matching: exact matching and approximate matching. Exact matching determines whether the labels and the structure of two graphs are identical. It has been used successfully in many applications [71,72,73]; however, there are problems where exact matching cannot be applied with a positive outcome. Sometimes the interesting sub-graphs show slight differences throughout the data. Such differences can be seen, for example, in protein analysis. Other examples arise in image processing, where the differences may be due to noise and distortion, or may simply reflect slight spatial differences between instances of the same object. This means that a certain level of geometric distortion, slight semantic variation, or vertex and/or edge mismatch should be tolerated in the search for frequent sub-graph patterns. For this reason, it has become necessary to evaluate the similarity between graphs while allowing some structural differences, i.e. to use approximate matching techniques. Approximate matching consists in finding a correspondence between the vertices and/or edges of two graphs in order to determine their similarity, allowing differences in structure and/or labels. On this basis, the need to perform frequent sub-graph mining using approximate graph matching has been raised [74,75,76,77]. Several authors have expressed the necessity of using approximate graph matching for frequent sub-graph mining on graph collections.
These authors argue that, in this way, more frequent sub-graphs that are also more interesting for applications and users can be found. Frequent subgraph mining is the process of extracting, from a graph dataset, all subgraphs whose occurrence count is greater than or equal to a specified threshold. Figure 1 gives an example of frequent subgraph mining: Figures 1(a) and 1(b) show two graphs, and Figure 1(c) shows a frequent subgraph that is present in both.
Figure 1: (a) Theobromine, (b) Caffeine and (c) Frequent subgraph
These graph mining methods may be grouped as:
Snehal N. Amrutkar et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(1), March-May, 2014, pp. 43-51
A. Apriori-Based Approach: Apriori-based frequent substructure mining algorithms share similar characteristics with Apriori-based frequent itemset mining algorithms. The search for frequent graphs starts with graphs of small "size" and proceeds in a bottom-up manner by generating candidates that have an extra vertex, edge or path. The definition of graph size depends on the algorithm used.
B. Pattern-Growth Approach: The Apriori-based approach has to use the breadth-first search (BFS) strategy because of its level-wise candidate generation. A pattern-growth approach, in contrast, extends a frequent graph directly by adding a new edge and is more flexible in its search method: it can use depth-first search (DFS) as well as BFS.
II. Survey of algorithms
Many researchers have developed algorithms for graph mining. Some algorithms give approximate results while others provide exact results. Some of them are reviewed in this section.
Xiao, Wang and Wu in 1984 presented the CSMiner [1] algorithm, which is based on node/edge-disjoint sub-isomorphism. To align multiple protein-protein interaction (PPI) networks, it must allow node skipping and mismatching, which increases the computational cost of graph matching.
Figure 2(a): PPI Network
Figure 2(b): Inexact Graph Matching for CSMiner[1] Algorithm
Holder, Cook and Bunke in 1987 presented the SUBDUE [2] algorithm, which is based on graph edit distance. Using a fuzzy graph match, SUBDUE can identify slightly different instances of a substructure, which are then replaced by a pointer.
Figure 3(a): Cortisone
Figure 3(b): Substructures found in Cortisone compound for SUBDUE algorithm[2]
Blockeel and Raedt [3] in 1998 introduced a first-order framework for top-down induction of logical decision trees. Top-down induction of decision trees is one of the best known and most successful machine learning techniques. It has been used to solve numerous practical problems.
It employs a divide-and-conquer strategy, and in this it differs from its rule-based competitors, which are based on covering strategies.
Chakrabarti, Dom and Indyk [4] in 1998 developed a new method for automatically classifying hypertext into a given topic hierarchy, using an iterative relaxation algorithm. After bootstrapping off a text-based classifier, they used both the local text in a document and the distribution of the estimated classes of other documents in its neighborhood to refine the class distribution of the document being classified. They discussed three areas of research: text and hypertext information retrieval, machine learning in contexts other than text or hypertext, and computer vision and pattern recognition.
Gago-Alonso, Carrasco-Ochoa and Medina-Pagola in 2000 presented the gdFil [5] algorithm, which is based on pattern growth. gdFil allows all duplicate candidates to be removed before support calculation. However, support calculation requires expensive sub-graph isomorphism tests.
In FSG [6], proposed by Kuramochi in 2001, candidates are generated by adding one edge at a time. The size of a graph in FSG is given by its number of edges. To detect duplicate subgraphs, FSG uses a canonical labeling technique in which each graph has a unique canonical label. The canonical labels of two graphs are the same only when both graphs have the same topological structure and the same edge and vertex labels. FSG also has some drawbacks. It can generate the same candidate multiple times, and it relies on isomorphism testing, which is very costly.
Fig 4: Multiple Candidate Generation for FSG[6]
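The canonical labeling idea described for FSG can be illustrated with a minimal Python sketch. This is not the labeling scheme used by FSG itself, which the text does not detail; it is a brute-force stand-in that tries every vertex ordering and keeps the smallest code, so it is only usable on very small graphs. The function name `canonical_label`, the graph encoding and the example graphs are assumptions made for illustration.

```python
from itertools import permutations


def canonical_label(vertex_labels, edges):
    """Brute-force canonical label of a small undirected labeled graph.

    vertex_labels: list of string labels, index = vertex id (hypothetical encoding)
    edges: dict mapping frozenset({u, v}) -> string edge label
    Returns the smallest (vertex labels, upper-triangle edge labels) code
    over all vertex orderings; isomorphic graphs get the same code.
    """
    n = len(vertex_labels)
    best = None
    for perm in permutations(range(n)):
        # perm[i] is the original vertex placed at position i
        v_part = tuple(vertex_labels[p] for p in perm)
        e_part = tuple(
            edges.get(frozenset((perm[i], perm[j])), "")  # "" marks "no edge"
            for i in range(n)
            for j in range(i + 1, n)
        )
        code = (v_part, e_part)
        if best is None or code < best:
            best = code
    return best


# Path C-N=O, written with two different vertex numberings.
g1 = (["C", "N", "O"], {frozenset((0, 1)): "s", frozenset((1, 2)): "d"})
g2 = (["O", "C", "N"], {frozenset((1, 2)): "s", frozenset((2, 0)): "d"})
# Different topology: C is now the middle vertex (N-C=O).
g3 = (["C", "N", "O"], {frozenset((0, 1)): "s", frozenset((0, 2)): "d"})

same = canonical_label(*g1) == canonical_label(*g2)
different = canonical_label(*g1) != canonical_label(*g3)
```

Because the number of orderings grows factorially with the number of vertices, this sketch also hints at why the text calls isomorphism testing very costly.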
Calders and Wijsen [7] in 2001 presented, in their work on monotone data mining languages, a simple data mining logic (DML) that can express common data mining tasks such as "find Boolean association rules" or "find inclusion dependencies".
Kramer, Raedt and Helma [8] in 2001 presented the application of feature mining techniques to the Developmental Therapeutics Program's AIDS antiviral screen database.
Kuramochi and Karypis [9] in 2001 presented a computationally efficient algorithm for finding all frequent subgraphs in large graph databases. They evaluated the performance of the algorithm in experiments with synthetic datasets as well as a chemical compound dataset.
Pei, Han, Mortazavi-Asl and Pinto [9] in 2001 proposed a novel sequential pattern mining method called PrefixSpan, that is, prefix-projected sequential pattern mining.
Asai, Abe and Kawasoe [10] in 2002 addressed efficient substructure discovery from large semi-structured data. They modelled data and patterns as labeled ordered trees and studied the problem of discovering all frequent tree-like patterns that have at least a minimum support in a given collection of semi-structured data. They presented an efficient pattern mining algorithm, FREQT, for discovering all frequent tree patterns from a large collection of labeled ordered trees.
Borgelt and Berthold [11] in 2002 presented an algorithm to find fragments in a set of molecules that help to discriminate between different classes, for instance activity in a drug discovery context.
Zaki [12] in 2002 presented the TREEMINER algorithm to discover all frequent subtrees in a forest, using a new data structure called the scope-list.
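The text does not describe the scope-list further, so the following Python sketch is only a simplified illustration under stated assumptions. It assumes the commonly given description in which nodes are numbered in depth-first preorder and the scope of a node is the interval from its own position to the position of its rightmost descendant. It builds lists for single-label patterns only, not the full structure used to join larger subtrees, and the names `node_scopes` and `scope_lists` as well as the tuple encoding of trees are hypothetical.

```python
from collections import defaultdict


def node_scopes(tree):
    """Return [(label, (l, u)), ...] in preorder for one ordered tree.

    tree: nested (label, [children]) tuples (hypothetical encoding).
    l is the node's preorder position, u the position of its rightmost
    descendant (u == l for a leaf).
    """
    out = []

    def visit(node):
        label, children = node
        pos = len(out)
        out.append(None)  # reserve the preorder slot for this node
        for child in children:
            visit(child)
        out[pos] = (label, (pos, len(out) - 1))

    visit(tree)
    return out


def scope_lists(forest):
    """Map each label to the list of (tree id, scope) pairs where it occurs."""
    lists = defaultdict(list)
    for tid, tree in enumerate(forest):
        for label, scope in node_scopes(tree):
            lists[label].append((tid, scope))
    return dict(lists)


forest = [
    ("A", [("B", [("C", [])]), ("B", [])]),  # tree 0: A(B(C), B)
    ("B", [("A", []), ("C", [])]),           # tree 1: B(A, C)
]
lists = scope_lists(forest)
# Number of distinct trees containing label "C" (its support in the forest).
support_c = len({tid for tid, _ in lists["C"]})
```

With this encoding, one node is an ancestor of another in the same tree exactly when the second node's scope lies inside the first node's scope at a later preorder position, which is the kind of test such interval lists make cheap.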
Deshpande, Kuramochi and Karypis [13] in 2002 proposed techniques for classifying chemical compounds. These techniques can be broadly categorized into two groups. The first group consists of techniques that rely mainly on various global properties of the chemical compounds, such as molecular weight, ionization potential and inter-atomic distance, to capture the structural properties of the compounds. Since this information is not relational, existing classification techniques can easily be used on these datasets. However, the absence of actual structural information limits the accuracy of such classifiers. The second group of techniques directly analyzes the structure of the chemical compounds to identify patterns that can be used for classification [8,14,15,16]. One of the earliest studies on discovering substructures was carried out by Dehaspe et al. [8] in 1998, in which Inductive Logic Programming (ILP) techniques were used. Although this approach is quite powerful, it is not designed to scale to large graph databases and hence may not be able to handle large chemical compound databases. A number of recent approaches focus on analyzing the graph representation of the chemical compounds to identify frequently occurring patterns, and use these patterns to aid classification. Wang et al. [16] in 1997 developed an algorithm to find frequently occurring blocks in the geometric representation of protein molecules and showed that these blocks can be used for classification. Inokuchi et al. [15] developed an algorithm to find all frequently occurring induced subgraphs and presented some evidence that such subgraphs can be used as features for classification.
Cooper and Frieze [17] in 2003 described a general model of a random graph process whose proportional degree sequence obeys a power law. Such laws have recently been observed in graphs associated with the World Wide Web.
Dzeroski [18] in 2003 introduced multi-relational data mining.
Getoor [19] in 2003 studied link mining.
Links among objects may exhibit certain patterns that can be helpful for many data mining tasks and that are usually hard to capture with traditional statistical models. Link mining is a promising new area where relational learning meets statistical modeling.
Huan, Wang and Prins [20] in 2003 proposed a novel subgraph mining algorithm, FFSM, which employs a vertical search scheme within an algebraic graph framework that they developed to reduce the number of redundant candidates proposed. Their study on synthetic and real datasets demonstrates that FFSM achieves a substantial performance gain over the then state-of-the-art subgraph mining algorithm gSpan.
Washio and Motoda [21] in 2003 introduced the theoretical basis of graph-based data mining and surveyed the state of the art in this area.
Yan, Han and Afshar [22] in 2003 proposed an alternative but equally powerful solution based on pattern growth: instead of mining the complete set of frequent subsequences, they mined only frequent closed subsequences, that is, those with no super-sequence of the same support. They also introduced an efficient algorithm called CloSpan (Closed Sequential pattern mining), which outperforms the previous work by one order of magnitude. Moreover, a deep understanding of efficient sequential pattern mining methods may also have strong implications for the development of efficient methods for mining frequent subtrees, lattices, subgraphs and other structured patterns in large databases. The sequential pattern mining algorithms developed so far perform well in databases consisting of short frequent sequences. Unlike Apriori-based algorithms, gSpan avoids candidate generation and false-positive pruning tests, which saves time and space, but it still has high computational complexity.
Yan and Han [23] in 2003 proposed to mine closed frequent graph patterns.
A graph g is closed in a database if there exists no proper supergraph of g that has the same support as g. A closed graph pattern mining algorithm, CloseGraph, is developed by exploring several interesting pruning methods. Their performance study shows that CloseGraph not only dramatically reduces the number of unnecessary subgraphs generated but also substantially increases the efficiency of mining, especially in the presence of large graph patterns.
Yin and Han [24] in 2003 developed a new classification approach called CPAR (Classification based on Predictive Association Rules). According to their performance study, CPAR achieves high accuracy and efficiency and has many useful features. CPAR represents a new approach towards efficient and high quality
classification. It would be interesting to further enhance the efficiency and scalability of this approach and to compare it with other well-established classification schemes. Moreover, the strength of the derived predictive rules also motivates an in-depth study of alternative approaches towards effective association rule mining.
Huan, Wang, Prins and Yang [25] in 2004 developed a new algorithm that mines only maximal frequent subgraphs, that is, subgraphs that are not part of any other frequent subgraph. Their algorithm can achieve a five-fold speed-up over the then state-of-the-art subgraph mining algorithms. Their mining method is based on a novel graph mining framework in which they first mine all frequent tree patterns from a graph database and then construct maximal frequent subgraphs from the trees.
Huan, Wang, Bandyopadhyay, Snoeyink and Prins [26] in 2004 applied a novel subgraph mining algorithm to three related graph representations of the sequence and proximity characteristics of a protein's amino acid residues. The subgraph mining algorithm is used to discover spatial motifs that can be used to discriminate among proteins in different families found in the SCOP database.
Koyuturk, Grama and Szpankowski [27] in 2004 presented an innovative new algorithm for detecting frequently occurring patterns and modules in biological networks. They show experimentally that their algorithm can extract frequently occurring patterns from metabolic pathways in the KEGG database within seconds. The proposed model and algorithm are applicable to a variety of biological networks, either directly or with minor modification.
Meinl, Borgelt and Berthold [28] in 2004 showed that it is possible to mine meaningful, discriminative molecular fragments from large databases.
Using an existing algorithm that employs a depth-first strategy, together with a sophisticated ordering scheme, makes it possible to avoid costly re-embeddings throughout the candidate growth process, which in turn makes it possible to find larger fragments as well.
Yin, Han, Yang and Yu [29] in 2004 developed CrossMine, an efficient and scalable approach for multi-relational classification. Several novel methods are developed in CrossMine, including (1) tuple ID propagation, which performs a semantics-preserving virtual join to achieve high efficiency on databases with complex schemas, and (2) a selective sampling method, which makes it highly scalable with respect to the number of tuples in the databases. Both the theoretical background and the implementation techniques of CrossMine are introduced.
Yan, Yu and Han [30] in 2004 investigated the issues of indexing graphs and proposed a novel solution by applying a graph mining technique. Unlike the existing path-based methods, their approach, called gIndex, makes use of frequent substructures as the basic indexing feature. Frequent substructures are ideal candidates since they explore the intrinsic characteristics of the data and are relatively stable under database updates. gIndex has a 10 times smaller index size, but achieves 3-10 times better performance in comparison with a typical path-based method.
Yongling Song and Su-Shing Chen in 2005 presented the RNGV [31] algorithm, which is based on graph edit distance. RNGV uses a modified Apriori algorithm that treats edges as items. It performs edit operations that include vertex and edge insertion, deletion and relabeling. A limitation of RNGV is that, while exploring the possible edit paths, it keeps only one expected path as a candidate, and so it does not claim completeness.
Hu, Yan, Huang, Han and Zhou [32] in 2005 developed a novel algorithm, CODENSE, to efficiently mine frequent coherent dense subgraphs across a large number of massive graphs built from biological networks, for function discovery.
Li and Tan [33] in 2005 proposed a novel graph mining algorithm to detect the dense neighborhoods in an interaction graph that may correspond to protein complexes. Their algorithm first locates local cliques for each graph vertex and then merges the detected local cliques according to their affinity to form maximal dense regions.
Yan, Yu and Han [34] in 2005 investigated the issues of substructure similarity search using indexed features in graph databases. By transforming the edge relaxation ratio of a query graph into the maximum number of allowed missing features, their structural filtering algorithm, called Grafil, can filter out many graphs without performing pairwise similarity computations.
Yin, Han and Yu [35] in 2005 proposed a new approach, called CROSSCLUS, which performs cross-relational clustering with user guidance.
Kuramochi and Karypis in 2006 presented the GREW [36] algorithm, which is based on sub-isomorphism and employs ideas of edge contraction and graph rewriting. It discovers frequent sub-graphs by repeatedly merging embeddings of existing frequent sub-graphs.
Chakrabarti and Faloutsos [37] in 2006 observed that the problems of detecting abnormalities in a given graph and of generating synthetic but realistic graphs have received considerable attention recently. Both are tightly coupled to the problem of finding the distinguishing characteristics of real-world graphs, i.e. the patterns that show up frequently in such graphs and can be considered marks of realism.
Karunaratne and Bostrom [38] in 2006 noted that methods for learning from structured data are limited with respect to handling large isolated substructures and also impose constraints on search depth and induced structure length. They introduced an approach to learning from structured data that uses a graph-based canonical representation of structures, called finger printing.
Krasky, Rohwer, Schroeder and Selzer [39] in 2006 discussed a combined bioinformatics and chemoinformatics approach for the development of new antiparasitic drugs.
Meinl, Worlein, Urzova, Fischer and Philippsen [40] in 2006 addressed parallel molecular mining with frequent subgraph miners, which are used in chemoinformatics to find common molecular fragments in a database of molecules represented as two-dimensional graphs. In the ParMol package they implemented four of the most popular frequent subgraph miners using a common infrastructure: MoFa, gSpan, FFSM and Gaston. They also added functionality to some of the algorithms, such as parallel search, mining directed graphs and mining in one big graph instead of a graph database.
Meinl, Worlein, Fischer and Philippsen [41] in 2006 presented the
thread-based parallel versions of MoFa and gSpan that achieve speedups of up to 11 on a shared-memory SMP system using 12 processors. They discussed the design space of the parallelization, the results, and the obstacles caused by the irregular search space and by the current state of Java technology.
Tsuda and Kudo [42] in 2006 proposed a graph clustering method that selects informative patterns at the same time. Their task is analogous to feature selection for vectors; the difference is that the features are not explicitly listed. The method is fully probabilistic, adopting a binomial mixture model defined on a very high dimensional vector that indicates the presence or absence of all possible patterns.
Wegner, Frohlich, Mielenz and Zell [43] in 2006 presented a classification method based on a coordinate-free chemical space. Thus it does not depend on the descriptor values commonly used by coordinate-based chemical space methods.
Figure 5: Sequence of edge collapsing operations performed by GREW[36]
Chen, Yan, Zhu and Han in 2007 presented the gApprox [44] algorithm, which is based on substitution probabilities. Because of noise and distortion, it allows approximate matches so that interesting patterns can still be captured. gApprox finds patterns in a single network, not in a set of graphs.
Shijie Zhang and Jiong Yang in 2007 presented the Monkey [45] algorithm, which is based on β-edge sub-isomorphism. It is capable of finding frequent patterns that are missed by exact matching algorithms in the presence of edge distortions. It handles only missing edges and edge label mismatches.
Merkwirth and Ogorzalek [46] in 2007 described a method for constructing specific types of neural networks composed of structures directly linked to the structure of the molecule under consideration.
Each molecule can be represented by a unique neural connectivity problem (graph), which can be programmed onto a cellular neural network. A composite network can then perform classification and regression on real-world chemical data sets. Dong, Gilbert, Guha, Heiland, Kim, Pierce, Fox, and Wild [47] in 2007 developed a chemoinformatics web service infrastructure that simplifies access to this information and to the computational techniques that can be applied to it. They described the infrastructure, gave some examples of its use, and discussed their plans to use it as a platform for chemoinformatics application development. Rhodes, Boyer, Kreulex, Chen, and Ordonez [48] in 2007 developed a system that allows users to explore the US patent corpus using molecular information. Their system contains three main technologies. A user may go to a web page, draw a molecule, search for related intellectual property (IP), and analyze the results.

Figure 6: Exploring the pattern space in [44]: (a) the original graph, (b) decomposition of the problem and stage 1, (c) stage 2, (d) stage 3, (e) stage 4, where each stage is enclosed within a loop. Marked vertices are darkened along the way, and edges change from dotted to thickened as new vertices are absorbed.

Figure 7: β-edge isomorphism (β ≥ 2) for the Monkey algorithm [45]
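The idea of β-edge sub-isomorphism can be illustrated with a small brute-force sketch in Python. It assumes that β bounds the number of pattern edges that may be missing or carry a different label under an injective, label-preserving vertex mapping; the precise definition in [45] may differ, and the function names, graphs and labels below are hypothetical.

```python
from itertools import permutations

def undirected(edges):
    """Store each edge under an order-independent key."""
    return {frozenset(pair): label for pair, label in edges.items()}

def unmatched_edges(pattern_edges, graph_edges, mapping):
    """Count pattern edges that are missing or mislabeled in the graph."""
    count = 0
    for pair, label in pattern_edges.items():
        u, v = tuple(pair)
        image = frozenset((mapping[u], mapping[v]))
        if graph_edges.get(image) != label:  # missing edge or label mismatch
            count += 1
    return count

def beta_edge_match(p_vertices, p_edges, g_vertices, g_edges, beta):
    """True if some injective, vertex-label-preserving mapping leaves
    at most beta pattern edges unmatched (assumed reading of beta)."""
    p_ids = list(p_vertices)
    for image in permutations(g_vertices, len(p_ids)):
        mapping = dict(zip(p_ids, image))
        if any(p_vertices[p] != g_vertices[g] for p, g in mapping.items()):
            continue
        if unmatched_edges(p_edges, g_edges, mapping) <= beta:
            return True
    return False

# Hypothetical data: a labeled triangle pattern and a small distorted graph.
PATTERN_VERTICES = {"a": "A", "b": "B", "c": "C"}
PATTERN_EDGES = undirected({("a", "b"): "x", ("b", "c"): "x", ("a", "c"): "x"})
GRAPH_VERTICES = {1: "A", 2: "B", 3: "C", 4: "A"}
GRAPH_EDGES = undirected({(1, 2): "x", (2, 3): "y", (3, 4): "x"})

for beta in (0, 1, 2):
    found = beta_edge_match(PATTERN_VERTICES, PATTERN_EDGES,
                            GRAPH_VERTICES, GRAPH_EDGES, beta)
    print(beta, found)
```

Exact matching (β = 0) rejects the triangle here because one edge is missing and one is mislabeled under every admissible mapping; relaxing β to 2 accepts it. The exhaustive search over mappings is only practical for tiny graphs and is not how a real miner would enumerate candidates.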
Bogdanov [49] in 2008 studied graph searching, indexing, mining and modeling for bioinformatics, chemoinformatics and social networks. Fahim et al. [50] in 2008 proposed a method based on shifting the center of the large cluster toward the small cluster and recomputing the membership of the small cluster's points; their experimental results indicate that the proposed algorithm produces satisfactory results. Godeck and Lewis [51] in 2008 stated that QSAR models can play a vital role in both the opening phase and the endgame of lead optimization. In the opening phase there is often a large quantity of data from high-throughput screening (HTS), and potential leads need to be selected from several distinct chemotypes. Guha and Schurer [52] in 2008 investigated various aspects of developing computational models to predict cell toxicity based on cell proliferation screening data generated in the MLSCN. By capturing feature-based information in that data set, such predictive models would be useful in evaluating cell-based screening results in general and could be used as an aid to identify and eliminate potentially undesired compounds. Hubler, Kriegel, Borgwardt and Ghahramani [53] in 2008 presented Metropolis algorithms for sampling a representative small subgraph from a large original graph, where "representative" means that the sample preserves crucial graph properties of the original graph. Karunaratne and Bostrom [54] in 2008 presented a case study in chemoinformatics in which various types of background knowledge are encoded in graphs that are given as input to a graph learner. They showed that the type of background knowledge encoded does affect predictive performance.
Lam and Chan [55] in 2008 observed that graph data mining algorithms are increasingly applied to biological graph data sets. They proposed the graph mining algorithm MIGDAC (Mining Graph DAta for Classification), which applies graph theory and an interestingness measure to discover interesting subgraphs that both characterize a class and distinguish it easily from other classes. Maji and Mehta [56] in 2008 proposed a novel measure of similarity between labeled graphs, with applications to structured data analysis such as chemical informatics and web document clustering. Their metric on graphs exploits vertex context similarity and computes an overall matching score in time polynomial in the size of the graphs, using a network flow formulation of the problem. Tsuda and Kurihara [57] in 2008 proposed a nonparametric Bayesian method for clustering graphs and selecting salient patterns at the same time. Variational inference is adopted because sampling is not applicable owing to the extremely high dimensionality. Schietget, Costa, Ramon, and Raedt [58] in 2009 proposed a direct, efficient and simple approach for generating interesting graph patterns. They computed maximum common subgraphs from randomly selected pairs of examples and used them directly as features. Zou, Li, Gao, and Zhang in 2010 developed the MUSE [59] algorithm, which is based on sub-isomorphism on uncertain graphs. MUSE finds a set of FAS patterns by allowing an error tolerance on the expected support of the discovered subgraph patterns. Jiang, Coenen and Zito [60] in 2010 examined a number of edge weighting schemes and suggested three strategies for controlling candidate set generation. Spjuth, Willighagen, Guha, Eklund and Wikberg [59] in 2010 studied the exchange of data sets as a step toward interoperable and reproducible QSAR analysis. QSAR is a widely used method for relating chemical structures to responses or properties based on experimental observations.
Much effort has been made to evaluate and validate the statistical modeling in QSAR, but these analyses treat the dataset as fixed. An overlooked but highly important issue is the validation of the setup of the dataset, which comprises the addition of chemical structures as well as the selection of descriptors and software implementations prior to calculations. This process is hampered by the lack of standards and exchange formats in the field, which makes it virtually impossible to reproduce and validate analyses and drastically constrains collaboration and re-use of data. Yang, Parthasarthy and Sadayappan [60] in 2010 presented a novel approach to data representation for computing the sparse matrix-vector multiplication kernel, particularly targeting sparse matrices representing power-law graphs. They showed that their representation scheme, coupled with a novel tiling algorithm, can yield significant benefits over the current state-of-the-art GPU and CPU efforts on a number of core data mining algorithms such as PageRank, HITS and Random Walk with Restart.

A graph transaction is represented by adjacency matrices, and the frequent patterns appearing in the matrices are mined through the extended algorithm. Molecules are modeled as attributed graphs in which each vertex represents an atom and each edge a bond between atoms. Each vertex carries an attribute that indicates the atom type.

Papapetrou, Ioannou and Skoutas in 2011 developed the UGRAP [61] algorithm, which is based on sub-isomorphism on uncertain graphs. It targets applications whose data are uncertain. UGRAP uses an index (containing graph edges and their probabilities) that reduces the number of comparisons required to compute the expected support of each candidate pattern, and it improves efficiency further through optimizations for early termination and scheduling of graph comparisons. ChenJia, Zhang and Huan in 2007 presented the APGM [62] algorithm, and Acosta-Mendoza, Gago-Alonso and Medina-Pagola in 2012 presented the VEAM [63] algorithm; both are based on substitution probabilities.
Because of noise and distortion, they allow approximate matches, so that interesting patterns can still be captured. The APGM and VEAM algorithms specify which vertices, edges or labels can replace others when performing frequent subgraph mining. These methods perform feature (subgraph) selection taking semantic distortions into account.
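As a rough illustration of matching under substitution probabilities, the following Python sketch scores a candidate mapping between pattern labels and target labels and accepts it when the score reaches a threshold. The label names, the table values, the use of a plain product as the score, and the threshold `tau` are all illustrative assumptions; they are not taken from [62] or [63], whose scoring may be normalized differently.

```python
from math import prod

# Hypothetical substitution table: probability that the first label
# may be replaced by the second. Pairs not listed count as impossible.
SUBSTITUTION = {
    ("sky", "sky"): 1.0,
    ("sky", "water"): 0.6,
    ("grass", "grass"): 1.0,
    ("grass", "tree"): 0.5,
    ("sand", "sand"): 1.0,
}

def substitution_score(pattern_labels, target_labels, table):
    """Product of substitution probabilities over aligned label pairs."""
    if len(pattern_labels) != len(target_labels):
        raise ValueError("label sequences must have the same length")
    return prod(table.get(pair, 0.0)
                for pair in zip(pattern_labels, target_labels))

def approx_match(pattern_labels, target_labels, table, tau):
    """Accept the mapping if its score is at least the threshold tau."""
    return substitution_score(pattern_labels, target_labels, table) >= tau

pattern = ["sky", "grass", "sand"]
print(approx_match(pattern, ["water", "grass", "sand"], SUBSTITUTION, 0.5))
print(approx_match(pattern, ["water", "tree", "sand"], SUBSTITUTION, 0.5))
```

With these assumed values, replacing one label keeps the score at 0.6 and the mapping is accepted, while replacing two labels drops it to 0.3 and the mapping is rejected. The same scheme could cover edge labels by adding a second table, which corresponds to the distinction between vertex-only and vertex-and-edge distortions.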
Figure 8: An illustrative example for [61] showing (a) an uncertain graph, (b) its implied exact graphs and (c) a subgraph pattern.

The APGM algorithm allows these distortions only between vertices of graphs, while the VEAM algorithm allows distortions between vertices and between edges.

Figure 9(a): Example of three labeled graphs [63]

Figure 9(b): Example of two ways to map the vertices and edges of the labeled graph T to those of the labeled graph G [63].

III. Conclusion
Recently, there has been increasing interest in graph-based methods as a powerful tool for image classification. This review has discussed some of the major graph mining methods and highlighted their strengths as well as their limitations. Some difficulties with these methods have limited their use in practical applications. With exact matching, interesting subgraphs are sometimes missed because they show slight differences throughout the data. Current research in graph mining is therefore oriented towards approximate (sub-optimal) solutions to the graph matching problem, in order to find frequent and more interesting subgraphs for applications. Many of the graph mining methods nevertheless have their own disadvantages. This survey indicates that most of the methods fail to find use in many real-time applications, which shows the need for further research to refine graph mining techniques for real-time scenarios.

IV. References
[1] Y. Xiao, W. Wu, W. Wang, Z. He, Efficient algorithms for node disjoint subgraph homeomorphism determination, in: Proceedings of the 13th International Conference on Database Systems for Advanced Applications, New Delhi, India, 2008, pp. 452–460.
[2] L.B. Holder, D.J. Cook, H. Bunke, Fuzzy substructure discovery, in: Proceedings of the Ninth International Workshop on Machine Learning, San Francisco, CA, USA, 1992, pp. 218–223.
[3] H. Blockeel, L.D. Raedt, “Top-down induction of first-order logic decision trees”, Artificial Intelligence, 101, 1998, pp. 285-297.
[4] S. Chakrabarti, B. Dom, P. Indyk, “Enhanced hypertext categorization using hyperlinks”, ACM (SIGMOD’98), 1998, pp. 307-318.
[5] A. Gago-Alonso, J.A. Carrasco-Ochoa, J.E. Medina-Pagola, J.F. Martínez-Trinidad, Full duplicate candidate pruning for frequent connected subgraph mining, in: Integrated Computer-Aided Engineering, 17 (August), 2010, pp. 211–225.
[6] M. Kuramochi and G. Karypis, “Frequent subgraph discovery”, In Intl. Conf. on Data Mining (ICDM’01), pp. 313-320, Nov. 2001.
[7] T. Calders, J. Wijsen, “On Monotone mining Languages”, In Proc. of International Workshop on Database Programming Languages (DBPL), 2001, pp. 119-132.
[8] S. Kramer, L.D. Raedt, C. Helma, “Molecular feature mining in HIV data”, In Proc. of the 7th ACM SIGKDD International Conf. on Knowledge Discovery and Data Mining, 2001, pp. 136-143.
[9] M. Kuramochi, G. Karypis, “Frequent Subgraph Discovery”, In Proc. 2001 Int. Conf. Data Mining (ICDM’01).
[10] T. Asai, K. Abe, S. Kawasoe, H. Arimura, H. Satamota, and S. Arikawa, “Efficient substructure discovery from large semi-structured data.” In Proc. 2002 SIAM Int. Conf. Data Mining (SDM’02), 2002, pp. 158-174.
[11] C. Borgelt and M.R. Berthold, “Mining molecular fragments: Finding relevant substructures of molecules.” In Proc. 2002 Int. Conf. Data Mining (ICDM’02), pp. 211-218.
[12] M.J. Zaki, “Efficiently Mining Frequent Trees in a Forest.” In Proc. 2002 ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining (KDD’02), 2002, pp. 71-80.
[13] M. Deshpande, M. Kuramochi, G. Karypis, “Automated approaches for classifying structures.” In Proc. 2002 Workshop on Data Mining in Bioinformatics (BIOKDD’02), 2002, pp. 11-18.
[14] L. Dehaspe, H. Toivonen, R.D. King, “Finding frequent substructures in chemical compounds.” In 4th Int. Conf. on Knowledge Discovery and Data Mining, 1998.
[15] A. Inokuchi, T. Washio, H. Motoda, “An Apriori-based Algorithm for Mining Frequent Substructures from Graph Data.” In Proc. 2000 European Symp. Principle of Data Mining and Knowledge Discovery (PKDD’00), 2000, pp. 13-23.
[16] X. Wang, J.T.L. Wang, D. Shasha, B. Shapiro, S. Dikshitulu, I. Rigoutsos, K. Zhang, “Automated discovery of active motifs in three dimensional molecules”, In Proc. of the 3rd Int. Conf. on Knowledge Discovery and Data Mining, 1997.
[17] C. Cooper and A. Frieze, “A general model of web graph.” 2003.
[18] S. Dzeroski, “Multi-Relational Data Mining: An Introduction.” SIGKDD Explor. Newsl., 5(1), 2003, pp. 1-16.
[19] L. Getoor, “Link Mining: A new data mining challenge.” SIGKDD Explorations, (5), 2003, pp. 84-89.
[20] J. Huan, W. Wang and J. Prins, “Efficient Mining of Frequent Subgraph in the Presence of Isomorphism.” In Proc. 2003 Int. Conf. Data Mining (ICDM’03), 2003, pp. 549-552.
[21] T. Washio and H. Motoda, “State of the art of Graph-based data mining.” SIGKDD Explorations, (5), 2003, pp. 59-68.
[22] X. Yan, J. Han and R. Afshar, “CloSpan: Mining Closed Sequential Patterns in Large Datasets.” In Proc. 2003 SIAM Int. Conf. Data Mining (SDM’03), 2003, pp. 166-177.
[23] X. Yan and J. Han, “CloseGraph: Mining Closed Frequent Graph Patterns.” In Proc. 2003 ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining (KDD’03), 2003, pp. 286-295.
[24] X. Yin and J. Han, “CPAR: Classification based on Predictive Association Rules.” In Proc. 2003 SIAM Int. Conf. Data Mining (SDM’03), 2003, pp. 331-335.
[25] J. Huan, W. Wang, J. Prins and J. Yang, “Spin: mining maximal frequent subgraphs from graph databases”, KDD’04, Seattle, Washington, USA, 2004.
[26] J. Huan, W. Wang, D. Bandyopadhyay, J. Snoeyink, J. Prins and A. Tropsha, “Mining Spatial Motifs from Protein Structure Graphs.” In Proc. 8th Int. Conf. Research in Computational Molecular Biology (RECOMB), 2004, pp. 308-315.
[27] M. Koyuturk, A. Grama, and W. Szpankowski, “An efficient algorithm for detecting frequent subgraphs in biological networks.” Bioinformatics, (20), 2004, pp. i200-i207.
[28] T. Meinl, C. Borgelt and M.R. Berthold, “Discriminative closed fragment mining and perfect extensions in MoFa.” 2004.
[29] X. Yin, J. Han, J. Yang and P.S. Yu, “CrossMine: Efficient Classification Across Multiple Database Relations.” In Proc. 2004 Int. Conf. Data Engineering (ICDE’04), 2004, pp. 399-410.
[30] X. Yan, P.S. Yu, and J. Han, “Graph Indexing: A Frequent Structure-based Approach.” In Proc. 2004 ACM SIGMOD Int. Conf. Management of Data, 2004, pp. 335-346.
[31] Y. Song, S. Chen, Itemsets based graph mining algorithm and application in genetic regulatory networks, in: Proceedings of the IEEE International Conference on Granular Computing, Atlanta, GA, USA, 2006, pp. 337–340.
[32] H. Hu, X. Yan, Y. Huang, J. Han and X.J. Zhou, “Mining coherent dense subgraphs across massive biological networks for functional discovery.” In Proc. 2005 Int. Conf. Intelligent Systems for Molecular Biology (ISMB’05), 2005, pp. 213-221.
[33] X.L. Li, S.H. Tan, “Interaction graph mining for protein complexes using local clique merging.” Genome Informatics, 16(2), 2005, pp. 260-269.
[34] X. Yan, P.S. Yu, J. Han, “Substructure similarity search in graph databases.” In Proc. 2005 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD’05), 2005, pp. 766-777.
[35] X. Yin, J. Han, P.S. Yu, “Cross-Relational Clustering with User’s Guidance.” In Proc. 2005 ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining (KDD’05), 2005, pp. 344-353.
[36] M. Kuramochi, G. Karypis, GREW - a scalable frequent subgraph discovery algorithm, in: Proceedings of the 4th IEEE International Conference on Data Mining, 2004, pp. 439–442.
[37] D. Chakrabarti, C. Faloutsos, “Graph mining: Laws, Generators, and Algorithms.” ACM Computing Surveys, 38(2), 2006, pp. 1-69.
[38] S. Maji, S. Mehta, “A Netflow distance between labeled graphs: applications in chemoinformatics.” www.cs.berkeley.edu, 2008.
[39] A. Krasky, A. Rohwer, J. Schroeder, P.M. Selzen, “A combined bioinformatics and chemoinformatics approach for the development of new antiparasitic drugs.” Elsevier, Genomics, 2006, pp. 1-8.
[40] T. Meinl, M. Worlein, O. Urzova, I. Fischer, M. Philippsen, “The ParMol package for frequent subgraph mining.” Electronics communication of ESST 1, 2006.
[41] T. Meinl, M. Worlein, I. Fischer, M. Philippsen, “Mining Molecular Datasets on Symmetric Multiprocessor Systems.” 2006.
[42] K. Tsuda, T. Kudo, Clustering graphs by weighted substructure mining. Proc. of 23rd Int. Conf. on Machine Learning, ACM, 148:953-960, 2006.
[43] J.K. Wegner, H. Frohlich, H.M. Mielenz, A. Zell, “Data and graph mining in chemical space for ADME and activity data sets.” Wiley-VCH, 25(3), 2006, 205-206.
[44] C. Chen, X. Yan, F. Zhu, J. Han, gApprox: mining frequent approximate patterns from a massive network, in: IEEE International Conference on Data Mining, 2007, pp. 445–450.
[45] S. Zhang, J. Yang, RAM: randomized approximate graph mining, in: Proceedings of the 20th International Conference on Scientific and Statistical Database Management, Hong Kong, China, 2008, pp. 187–203.
[46] C. Merkwirth, M. Ogorzalek, “Applying CNN to chemoinformatics.” IEEE Xplore, 2007, 2918-2921.
[47] X. Dong, K.E. Gilbert, R. Guha, R. Heiland, J. Kim, M.E. Pierce, G.C. Fox, D.J. Wild, “Web Service Infrastructure for chemoinformatics.” J. Chem. Inf. Model, 47, 2007, pp. 1303-1307.
[48] J. Rhodes, S. Boyer, J. Kreulex, Y. Chen, P. Ordonez, “Mining patents using molecular similarity search.” Pacific Symp. on Biocomputing, 12, 2007, pp. 304-315.
[49] P. Bogdanov, “Graph searching, indexing, mining and modeling for Bioinformatics, cheminformatics and Social network.” 2008.
[50] A.M. Fahim, G. Saake, A.M. Salem, F.A. Torkey, M.A. Ramadan, “K-means for spherical clusters with large variance in sizes.” World Academy of Science, Engineering & Tech., 45, 2008, pp. 177-182.
[51] P. Godeck, R.A. Lewis, “Exploiting QSAR models in lead optimization.” Curr. Opin. Drug Discov. Devel., 11(4), 2008, pp. 569-575.
[52] R. Guha, S.C. Schurer, “Utilizing high throughput screening data for predictive toxicology models: Protocols and Application to MLSCN assays.” J. Comput. Aided Mol. Des., 22(6-7), 2008, pp. 367-384.
[53] C. Hubler, H.P. Kriegel, K. Borgwardt, Z. Ghahramani, “Metropolis Algorithms for Representative Subgraph Sampling.” IEEE Xplore, 2008, pp. 283-292.
[54] T. Karunaratne, H. Bostrom, “Using background knowledge for graph based learning: a case study in chemoinformatics.” Springer, Artificial Intelligence, (6), 2008, pp. 151-153.
[55] W.W.M. Lam, K.C.C. Chan, “A graph mining algorithm for classifying chemical compounds.” IEEE Int. Conf. on Bioinformatics and Biomedicine, 2008.
[56] S. Maji, S. Mehta, “A Netflow distance between labeled graphs: applications in chemoinformatics.” www.cs.berkeley.edu, 2008.
[57] K. Tsuda, K. Kurihara, “Graph Mining with variational Dirichlet process mixture models.” SIAM, 432-442, 2008.
[58] L. Schietget, F. Costa, J. Ramon, L.D. Raedt, “Maximum common subgraph mining: A fast and effective approach towards feature generation.” In Proc. of SRL-MLG-ILP, Leuven, Belgium, 2009.
[59] Z. Zou, J. Li, H. Gao, S. Zhang, Mining frequent subgraph patterns from uncertain graph data, IEEE Transactions on Knowledge and Data Engineering 22 (9) (2010) 1203–1218.
[60] C. Jiang, F. Coenen, M. Zito, “Frequent Sub-graph mining on Edge Weighted Graphs.” 2010.
[61] O. Papapetrou, E. Ioannou, D. Skoutas, Efficient discovery of frequent subgraph patterns in uncertain graph databases, in: Proceedings of the 14th International Conference on Extending Database Technology, vol. 12, New York, USA, 2011, pp. 355–366.
[62] A. Gago-Alonso, J.A. Carrasco-Ochoa, J.E. Medina-Pagola, J.F. Martínez-Trinidad, Full duplicate candidate pruning for frequent connected subgraph mining, in: Integrated Computer-Aided Engineering, 17 (August), 2010, pp. 211–225.
[63] N. Acosta-Mendoza, A. Gago-Alonso, J.E. Medina-Pagola, Frequent approximate subgraphs as features for graph-based image classification, Knowledge-Based Systems 27 (March) (2012) 381–392.
[64] R. Alves, D.S. Rodríguez-Baena, J.S. Aguilar-Ruiz, Gene association analysis: a survey of frequent pattern mining from gene expression data, Briefings in Bioinformatics 11 (2010) 210–224.
[65] A. Jiménez, F. Berzal, J.C.C. Talavera, Frequent tree pattern mining: a survey, Intelligence Data Analysis 14 (2010) 603–622.
[66] J. Han, H. Cheng, D. Xin, X. Yan, Frequent pattern mining: current status and future directions, Data Mining and Knowledge Discovery (10th Anniversary 15) (2007) 55–86.
[67] C.H. Weng, Mining fuzzy specific rare itemsets for education data, Knowledge-Based Systems (2011) 697–708.
[68] U. Yun, K.H. Ryu, Approximate weighted frequent pattern mining with/without noisy environments, Knowledge-Based Systems 24 (2011) 73–82.
[69] X. Yan, J. Han, gSpan: Graph-based substructure pattern mining, in: Proceedings International Conference on Data Mining, Maebashi, Japan, 2002, pp. 721–724.
[70] S. Nijssen, J.N. Kok, A quickstart in frequent structure mining can make a difference, in: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004, pp. 647–652.
[71] F. Eichinger, K. Böhm, Software-bug localization with graph mining, in: C.C. Aggarwal, H. Wang (Eds.), Managing and Mining Graph Data, vol. 40, Springer-Verlag, New York, 2010.
[72] A. Gago-Alonso, A. Puentes-Luberta, J.A. Carrasco-Ochoa, J.E. Medina-Pagola, J.F. Martínez-Trinidad, A new algorithm for mining frequent connected subgraphs based on adjacency matrices, Intelligence Data Analysis 14 (2010) 385–403.
[73] C. Jiang, F. Coenen, R. Sanderson, M. Zito, Text classification using graph mining-based feature extraction, Knowledge-Based Systems 23 (2010) 302–308.
[74] N. Ketkar, L. Holder, D. Cook, Mining in the proximity of subgraphs, in: Analysis and Group Detection KDD Workshop on Link Analysis: Dynamics and Statics of Large Networks, 2006.
[75] C. Borgelt, M. Berthold, Mining molecular fragments: finding relevant substructures of molecules, in: Proceedings of 2002 International Conference on Data Mining (ICDM’02), Maebashi, Japan, 2002, pp. 211–218.
[76] M.S. Hossain, R.A. Angryk, Gdclust: a graph-based document clustering technique, in: Proceedings of the Seventh IEEE International Conference on Data Mining Workshops, IEEE Computer Society, Washington, DC, USA, 2007, pp. 417–422.
[77] M. Koyutürk, A. Grama, W. Szpankowski, An efficient algorithm for detecting frequent subgraphs in biological networks, Bioinformatics (2004) 200–207.