Frequent subgraph discovery for a single large graph
Agenda
•   Motivation
•   Summary of existing approaches
•   Support computations
•   Comparison and Evaluation
Background
• Frequent subgraph mining
  – Graph-transaction setting (for graph datasets)
     • Many small graphs
  – Single-graph setting
     • One big graph


• New problem for single-graph setting
  – Definition of support
Challenge
• Defining support in a single large graph is
  difficult
  – The anti-monotone property is required for pruning
    the search space


• Anti-monotone
  – $A \subset B \Rightarrow \mathrm{sup}(A) \geq \mathrm{sup}(B)$
Subgraph Support
• The most intuitive definition
   – Count of embeddings in input graph
       • Not anti-monotone
   (Figure: example patterns with embedding counts 1, 2, 2)
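The failure of raw embedding counts can be reproduced on a tiny example. The sketch below is illustrative only: the data graph (a labeled path B - A - B), the function name `count_embeddings`, and the brute-force search are assumptions made for this example and are not taken from the slides. It counts each mapped subgraph once and shows that the one-node pattern A has fewer embeddings than its supergraph A - B.

```python
from itertools import permutations

def count_embeddings(p_labels, p_edges, g_labels, g_edges):
    """Count distinct occurrences of a labeled pattern in a labeled graph.

    Brute force: try every injective node mapping, keep those that preserve
    node labels and edges, and count each mapped subgraph once.
    """
    g_edge_set = {frozenset(e) for e in g_edges}
    p_nodes = list(p_labels)
    occurrences = set()
    for image in permutations(g_labels, len(p_nodes)):
        phi = dict(zip(p_nodes, image))
        # Node labels must match.
        if any(p_labels[v] != g_labels[phi[v]] for v in p_nodes):
            continue
        # Every pattern edge must map onto an edge of the data graph.
        mapped_edges = {frozenset((phi[u], phi[v])) for u, v in p_edges}
        if not mapped_edges <= g_edge_set:
            continue
        occurrences.add((frozenset(image), frozenset(mapped_edges)))
    return len(occurrences)

# Hypothetical data graph: a path B - A - B.
g_labels = {1: "B", 2: "A", 3: "B"}
g_edges = [(1, 2), (2, 3)]

count_a = count_embeddings({"x": "A"}, [], g_labels, g_edges)
count_ab = count_embeddings({"x": "A", "y": "B"}, [("x", "y")], g_labels, g_edges)
print(count_a, count_ab)  # prints "1 2"
```

Here the pattern A is a subgraph of A - B, yet its count (1) is smaller than the count of A - B (2), so the count is not anti-monotone.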
Motivation
• Propose a new definition of subgraph support
  such that
  – The resulting support is anti-monotone
  – The support can be computed efficiently


• Three support computation algorithms
  – Overlap-based (2 algorithms)
  – Minimum-image-based (1 algorithm)
Agenda
• Motivation
• Summary of existing approaches
• Support computations
  – Simple overlap (overlap-based method)
  – Harmful overlap (overlap-based method)
  – Minimum image
• Comparison and Evaluation
Overlap based support
• Support = size of a maximum independent set (MIS)
  – Find the overlaps between embeddings
  – Find the size of a maximum independent node set
Overlap
• Two embeddings overlap if they share at least one node

• $V_1 \cap V_2 \neq \emptyset$
    ($V_1, V_2$: node sets of the two embeddings)

  An embedding is an occurrence of the pattern
Overlap Graph
• $O = (V_O, E_O)$
   – $V_O$: the set of embeddings, used as the node set
   – $E_O = \{(f_1, f_2) \mid f_1, f_2 \in V_O \wedge f_1 \not\equiv f_2 \wedge V_1 \cap V_2 \neq \emptyset\}$,
     where $V_1$ and $V_2$ are the node sets of $f_1$ and $f_2$
   – If two embeddings share at least one node,
     the corresponding nodes of the overlap graph are connected
Maximum Independent Set Support
• Independent node set of a graph $G = (V, E)$
  – $I \subseteq V$ with $\forall u, v \in I: (u, v) \notin E$
  – A maximum independent node set need not be
    unique

   (Figure: two example graphs, one with a maximum independent node set of size 1, one of size 2)

• MIS-support = size of a maximum independent
  node set
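The two steps of the overlap-based support (find overlaps, then find a maximum independent node set) can be sketched as follows. This is a minimal illustration, not the implementation evaluated in the slides: the embeddings are hypothetical node sets, the helper names `overlap_graph` and `mis_size` are invented here, and the exhaustive search is exponential in the number of embeddings.

```python
from itertools import combinations

def overlap_graph(embeddings):
    """Return the edges (i, j), i < j, between embeddings that share a node.

    Each embedding is given as the set of data-graph nodes it covers.
    """
    return {(i, j)
            for i, j in combinations(range(len(embeddings)), 2)
            if embeddings[i] & embeddings[j]}

def mis_size(n, edges):
    """Size of a maximum independent node set, found by exhaustive search."""
    for k in range(n, 0, -1):
        for subset in combinations(range(n), k):
            # subset is sorted, so every pair (i, j) has i < j like the edges.
            if not any(pair in edges for pair in combinations(subset, 2)):
                return k
    return 0

# Hypothetical embeddings: the first and second share node 3,
# the second and third share node 5.
embeddings = [{1, 2, 3}, {3, 4, 5}, {5, 6, 7}]
edges = overlap_graph(embeddings)
mis_support = mis_size(len(embeddings), edges)
print(edges, mis_support)
```

In this example the first and third embeddings do not overlap, so together they form a maximum independent set and the MIS-support is 2.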
Harmful Overlap Support(1/3)
• MIS-support
  – Treats every overlap as harmful


• An overlap is not necessarily harmful
  – What is important is the anti-monotone property
Harmful Overlap Support(2/3)
• Harmful overlap graph $H = (V_H, E_H)$
  – $V_H$: the set of embeddings, used as the node set
  – $E_H = \{(f_1, f_2) \mid f_1, f_2 \in V_H \wedge f_1 \not\equiv f_2 \wedge (V_1 = V_2 \vee \text{ancestors of } V_1 = \text{ancestors of } V_2)\}$,
    where $V_1$ and $V_2$ are the node sets of $f_1$ and $f_2$

• HO-support
   (Figure: in this example, MIS-support = 1 and HO-support = 2)
Harmful Overlap(3/3)
• Completing the anti-monotone property

   (Figure: example with counts #A: 2, #B: 3, #AB: 2, #BAB: 2)
Note
• Harmful overlap is a weaker concept than
  simple overlap
  – HO-support is never lower than MIS-support




Experiment
• Support computation implemented as part of the
  MoSS (Molecular Substructure Miner) program
  – IC93 dataset [7]
     • 1283 molecules, each forming a connected component
  – Tic-Tac-Toe win dataset
     • Consists of 626 connected components
Result
• Vertical axis: number of frequent subgraphs
  whose support exceeds the threshold
• Horizontal axis: number of nodes (of the pattern)?
• For IC93
  – Up to 30% more
     • Due to heavy overlap
       of carbon atoms
• For Tic-Tac-Toe
  – Around 5% more
Minimum image based definition
• Minimum image based support of p in g
  – Number of unique nodes mapped

  (Figure: data graph with nodes 1 to 5)

  Pattern node   Embedding 1   Embedding 2   Embedding 3   Embedding 4   Unique
  1st            1             3             3             5             3
  2nd            2             2             4             4             2
  3rd            3             1             5             3             3
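The table can be turned into a short computation. The sketch below reads each row of the table as the images of one pattern node under four embeddings, which is an interpretation of the flattened slide; the node names `v1`, `v2`, `v3` and the function `min_image_support` are hypothetical.

```python
def min_image_support(embeddings):
    """Minimum, over pattern nodes, of the number of distinct images.

    Each embedding is a dict mapping pattern nodes to data-graph nodes.
    """
    pattern_nodes = embeddings[0].keys()
    return min(len({phi[v] for phi in embeddings}) for v in pattern_nodes)

# The four embeddings read column-wise from the table (node names assumed).
embeddings = [
    {"v1": 1, "v2": 2, "v3": 3},
    {"v1": 3, "v2": 2, "v3": 1},
    {"v1": 3, "v2": 4, "v3": 5},
    {"v1": 5, "v2": 4, "v3": 3},
]

# Distinct images per pattern node: v1 -> {1, 3, 5}, v2 -> {2, 4}, v3 -> {1, 3, 5}
unique_counts = {v: len({phi[v] for phi in embeddings}) for v in embeddings[0]}
support = min_image_support(embeddings)
print(unique_counts, support)  # {'v1': 3, 'v2': 2, 'v3': 3} 2
```

The per-node counts match the "Unique" column (3, 2, 3), and their minimum, 2, is the minimum image based support for this example. No overlap graph and no independent-set search is needed.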
Benefits
I. Instead of $O(N^2)$ overlaps, only an $O(N)$ data set
II. No NP-complete MIS problem
III. Not necessary to compute all occurrences,
     only the images of all nodes
Embedding of a Pattern
• Pattern $p = (V_p, E_p, \lambda_p)$
• Data graph $g = (V_g, E_g, \lambda_g)$
• Embedding function $\varphi: V_p \to V_g$
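Three support measures
• Simple overlap
   – A simple overlap of occurrences $\varphi$ and $\varphi'$ of pattern $p$ exists if
                 $\varphi(V_p) \cap \varphi'(V_p) \neq \emptyset$


• Harmful overlap
   – A harmful overlap of occurrences $\varphi$ and $\varphi'$ of pattern $p$ exists if
        $\exists v \in V_p: \varphi(v), \varphi'(v) \in \varphi(V_p) \cap \varphi'(V_p)$

• Minimum image based support of p in g
   – $\sigma_3(p, g) = \min_{v \in V_p} |\{\varphi_i(v) : \varphi_i \text{ is an embedding of } p \text{ in } g\}|$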
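The difference between the two overlap tests is easiest to see side by side. The following sketch implements the two conditions exactly as stated on this slide, with occurrences represented as dicts from pattern nodes to data-graph nodes. The function names and the three example occurrences of a path pattern a - b - c are assumptions made for illustration.

```python
def simple_overlap(phi1, phi2):
    """True if the two occurrences share at least one data-graph node."""
    return bool(set(phi1.values()) & set(phi2.values()))

def harmful_overlap(phi1, phi2):
    """True if some pattern node v has both images inside the shared node set."""
    shared = set(phi1.values()) & set(phi2.values())
    return any(phi1[v] in shared and phi2[v] in shared for v in phi1)

# Hypothetical occurrences of a path pattern a - b - c in a data graph.
phi_a = {"a": 1, "b": 2, "c": 3}
phi_b = {"a": 3, "b": 4, "c": 5}  # shares only node 3 with phi_a
phi_c = {"a": 3, "b": 2, "c": 1}  # same nodes as phi_a, mapped in reverse

print(simple_overlap(phi_a, phi_b), harmful_overlap(phi_a, phi_b))  # True False
print(simple_overlap(phi_a, phi_c), harmful_overlap(phi_a, phi_c))  # True True
```

`phi_a` and `phi_b` share node 3, but no single pattern node has both of its images in the shared set, so the overlap is simple without being harmful. Every harmful overlap is also a simple overlap, because the shared set must be non-empty for the second test to succeed.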
Comparison
$\sigma_1 = 1 < \sigma_2 = 2 < \sigma_3 = 3$
($\sigma_1$: simple overlap, $\sigma_2$: harmful overlap, $\sigma_3$: minimum image)
Experimental Setting
• Comparison of the image-based and overlap-based
  algorithms

• Dataset
  – WebKB dataset (4 large graphs describing the structure of web
    pages)
Experiment Result
Conclusion
• Overlap-based support measures that are anti-monotone
• Minimum image based algorithm that is more
  efficient than previous ones
