© 2019 IBM Corporation
Graph generation using a graph grammar
Hiroshi Kajino
IBM Research - Tokyo
IBIS 2019
I will talk about an application of formal languages to a graph generation problem
§Formal language
–Context-free grammar
–Hyperedge replacement grammar (HRG)
–HRG inference algorithm
§Application to molecular graph generation
–Molecular hypergraph grammar (a special case of HRG)
–MHG inference algorithm
–Combination with VAE
Contents
A molecular graph should satisfy some constraints to be valid
§Learning a generative model of a molecular graph
–Input: set of molecular graphs D = {G_1, G_2, …, G_N}
–Output: probability distribution p(G) such that G_n ∼ p
§Hard vs. soft constraints on the support of p(G)
–Hard constraint: valence condition
∃ a rule-based classifier that judges this constraint
–Soft constraint: stability
∄ such a rule-based classifier, in general
Formal language can help
Why should we care about a formal language?
A formal language defines a set of strings with certain properties;
an associated grammar tells us how to generate them
§Formal language
Typically defined as a set of strings
–Language point of view
= a set of strings with certain properties
(= a subset of all possible strings)
–Generative point of view
A grammar is often associated with a language
= how to generate strings in ℒ
All possible strings Σ^*
Language ℒ = { a^n b^n : n ≥ 1 } ⊂ {a, b}^* = Σ^*
Grammar G = ({S}, {a, b}, {S → ab, S → aSb}, S)
Why should we care about a formal language?
A formal language defines a set of graphs satisfying hard constraints;
an associated grammar tells us how to generate them
§Formal language
Typically defined as a set of graphs
–Language point of view
= a set of graphs satisfying the hard constraints
(= a subset of all possible graphs)
–Generative point of view
A grammar is often associated with a language
= how to generate graphs in ℒ
All possible graphs Σ^*
Language ℒ = {molecules satisfying the valence conditions} ⊂ {all possible graphs}
Grammar G = ???
Why should we care about a formal language?
A CFG generates a string by repeatedly applying a production rule to a
non-terminal, until there exists no non-terminal
§Context-free grammar G = (N, Σ, P, S)
–N: set of non-terminals
–Σ: set of terminals
–P: set of production rules
–S ∈ N: the start symbol
§Example
–N = {S}
–Σ = {a, b}
–P = {S → ab, S → aSb}
–Derivation: S ⇒ aSb ⇒ aaSbb ⇒ aaabbb
Context-free grammar
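As a concrete sketch of the derivation above (toy code, not from the slides), the leftmost non-terminal can be rewritten with a randomly chosen rule until no non-terminal remains; names and the stopping probability `p_stop` are illustrative:

```python
import random

# Toy CFG G = (N, Σ, P, S) for the language { a^n b^n : n >= 1 }.
# Production rules map each non-terminal to its possible right-hand sides.
RULES = {"S": [["a", "b"], ["a", "S", "b"]]}

def derive(start="S", p_stop=0.5, rng=None):
    """Rewrite the leftmost non-terminal until none remains."""
    rng = rng or random.Random(0)
    string = [start]
    while any(sym in RULES for sym in string):
        i = next(j for j, sym in enumerate(string) if sym in RULES)
        # Choose S -> ab with probability p_stop, otherwise S -> aSb.
        rhs = RULES[string[i]][0] if rng.random() < p_stop else RULES[string[i]][1]
        string[i:i + 1] = rhs
    return "".join(string)

s = derive()
n = len(s) // 2
print(s)
```

Every string produced this way has the form a^n b^n, i.e., it lies in ℒ by construction: the grammar itself enforces the constraint.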
A hypergraph is a generalization of a graph
§Hypergraph H = (V, E) consists of…
–Node v ∈ V
–Hyperedge e ∈ E ⊆ 2^V: connects an arbitrary number of nodes
(cf. an edge in a graph connects exactly two nodes)
[Figure: a hypergraph, with its nodes and one hyperedge labeled]
Hyperedge replacement grammar
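A minimal sketch of this data structure (illustrative code, not from the talk): a hyperedge is simply a set of nodes, so an ordinary edge is the 2-node special case.

```python
from dataclasses import dataclass, field

@dataclass
class Hypergraph:
    """Hypergraph H = (V, E): hyperedges are node sets of any size."""
    nodes: set = field(default_factory=set)
    hyperedges: list = field(default_factory=list)  # each a frozenset of nodes

    def add_hyperedge(self, members):
        e = frozenset(members)
        self.nodes |= e          # nodes are created implicitly
        self.hyperedges.append(e)

H = Hypergraph()
H.add_hyperedge({1, 2, 3})   # connects three nodes at once
H.add_hyperedge({3, 4})      # an ordinary edge: the 2-node special case
print(len(H.nodes), len(H.hyperedges))
```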
HRG generates a hypergraph by repeatedly replacing
non-terminal hyperedges with hypergraphs
§Hyperedge replacement grammar (HRG) G = (N, Σ, P, S) [Feder, 71], [Pavlidis, 72]
–N: set of non-terminals
–Σ: set of terminals
–S: start symbol
–P: set of production rules
–A rule replaces a non-terminal hyperedge with a hypergraph
[Figure: example production rules; hyperedges carry labels such as C, N, and S]
Hyperedge replacement grammar
HRG generation example: start from the start symbol and rewrite
until no non-terminal remains
1. Start from the start symbol S
2. The left rule is applicable
3. We obtain a hypergraph with three non-terminals
4. Apply the right rule to one of the non-terminals
5. Two non-terminals remain
6. Repeat the procedure until there is no non-terminal
7. Generation halts when there is no non-terminal
[Figures: the production rules P and the intermediate hypergraphs after each step]
Hyperedge replacement grammar
The HRG inference algorithm outputs an HRG that can reconstruct the input
§HRG inference algorithm [Aguiñaga+, 16]
–Input: set of hypergraphs ℋ
–Output: HRG such that ℋ ⊆ ℒ(HRG), the language of the grammar
(a minimum requirement on the grammar's expressiveness)
–Idea: infer the production rules necessary to obtain each hypergraph,
i.e., decompose each hypergraph into a set of production rules
HRG inference algorithm
Tree decomposition discovers a tree-like structure in a graph
§Tree decomposition
–All the nodes and edges of the graph must be included in some tree node (bag)
–For each graph node, the tree nodes (bags) that contain it must be connected
[Figure: a molecular graph and its tree decomposition into bags of atoms; digits represent the node correspondence]
HRG inference algorithm
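The two conditions above can be checked mechanically. The sketch below (a hand-made toy example, not the paper's code) verifies both for a small graph and a two-bag decomposition:

```python
graph_nodes = {1, 2, 3, 4}
graph_edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
bags = {"t1": {1, 2, 3}, "t2": {1, 3, 4}}
tree_edges = [("t1", "t2")]

# Condition 1: every graph node and edge is covered by some bag.
covered_nodes = set().union(*bags.values())
cond1 = graph_nodes <= covered_nodes and all(
    any({u, v} <= bag for bag in bags.values()) for u, v in graph_edges
)

# Condition 2: for each node, the bags containing it are connected in the tree.
def bags_connected(node):
    containing = {t for t, bag in bags.items() if node in bag}
    if len(containing) <= 1:
        return True
    frontier, seen = [next(iter(containing))], set()
    while frontier:  # BFS over tree edges restricted to `containing`
        t = frontier.pop()
        seen.add(t)
        for a, b in tree_edges:
            if t in (a, b):
                other = b if t == a else a
                if other in containing and other not in seen:
                    frontier.append(other)
    return seen == containing

cond2 = all(bags_connected(v) for v in graph_nodes)
print(cond1, cond2)
```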
Tree decomposition and (a syntax tree of) HRG are equivalent
§Relationship between tree decomposition and HRG
1. Connecting the hypergraphs in the tree recovers the original hypergraph
2. Connection ⇔ hyperedge replacement
–Attaching a child bag corresponds to applying a production rule,
so each tree node and its children yield one production rule
[Figures: a tree decomposition; attaching a child bag = replacing a non-terminal hyperedge with the child's hypergraph]
HRG inference algorithm
An HRG can be inferred from tree decompositions of the input hypergraphs
§HRG inference algorithm [Aguiñaga+, 16]
–Algorithm:
1. Compute tree decompositions of the input hypergraphs
2. Extract production rules
3. Compose the HRG by taking their union
–Expressiveness: ℋ ⊆ ℒ(HRG)
The resultant HRG can generate all input hypergraphs
(clear from the construction of the algorithm)
HRG inference algorithm
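Step 2 can be sketched as a walk over a rooted tree decomposition: each tree node yields one rule whose left-hand-side non-terminal is labeled by the separator shared with the parent bag, and whose right-hand side contains one non-terminal per child separator. The rule encoding below (arity of the LHS, bag content, child arities) is illustrative, not the paper's representation.

```python
# Toy rooted tree decomposition: bag contents and parent->children map.
bags = {"root": {1, 3}, "left": {1, 2, 3}, "right": {3, 4}}
children = {"root": ["left", "right"], "left": [], "right": []}

def extract_rules(node, parent_bag=frozenset()):
    """Emit one (lhs_arity, bag, child_arities) rule per tree node."""
    bag = frozenset(bags[node])
    lhs_arity = len(bag & parent_bag)  # size of the separator with the parent
    rhs_nonterminals = [len(bag & frozenset(bags[c])) for c in children[node]]
    rules = [(lhs_arity, sorted(bag), rhs_nonterminals)]
    for c in children[node]:
        rules += extract_rules(c, bag)
    return rules

rules = extract_rules("root")
print(rules)
```

The root's rule has LHS arity 0, i.e., it is the start rule; taking the union of these rule sets over all inputs gives the final grammar (step 3).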
We want a graph grammar that guarantees hard constraints
§Objective
Construct a graph grammar that never violates the valence condition
§Application: generative model of a molecule
–Grammar-based generation guarantees the valence condition
–A probabilistic model can learn the soft constraints
Molecular hypergraph grammar
A simple application to molecular graphs doesn't work
§A simple application to molecular graphs
–Input: molecular graphs
–Issue: valence conditions can be violated
[Figure: an input molecule (ethane), its tree decomposition, and the extracted rules; one extracted rule increases the degree of carbon, violating the valence condition]
Molecular hypergraph grammar
Our idea is to use a hypergraph representation of a molecule
§Conserved quantity
–HRG: # of nodes in a hyperedge
–Our grammar: # of bonds connected to each atom (valence)
∴ An atom should be modeled as a hyperedge
§Molecular hypergraph
–Atom = hyperedge
–Bond = node
[Figure: a molecular graph and its molecular hypergraph, where each atom becomes a hyperedge over its incident bonds]
Molecular hypergraph grammar
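The atom = hyperedge, bond = node conversion can be sketched in a few lines (illustrative code with a toy molecule, not the released implementation):

```python
# A toy molecular graph: atoms labeled by element, bonds as atom-index pairs.
atoms = {0: "C", 1: "C", 2: "H", 3: "H"}
bonds = [(0, 1), (0, 2), (1, 3)]

def to_molecular_hypergraph(atoms, bonds):
    """Each bond becomes a node; each atom becomes a hyperedge over its bonds."""
    incident = {a: [] for a in atoms}
    for bond_id, (a, b) in enumerate(bonds):
        incident[a].append(bond_id)  # bond = node of the hypergraph
        incident[b].append(bond_id)
    return {a: (atoms[a], frozenset(incident[a])) for a in atoms}

mhg = to_molecular_hypergraph(atoms, bonds)
print(mhg)
```

Because every bond joins exactly two atoms, every node of the resulting hypergraph has degree 2, and each atom's hyperedge has exactly as many nodes as the atom has bonds, which is what the grammar conserves.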
The language of molecular hypergraphs is characterized by two properties
§Molecular hypergraphs as a language
A set of hypergraphs with the following properties:
1. Each node has degree 2 (= 2-regular)
2. The label on a hyperedge determines the # of nodes it has (= valence)
[Figure: example molecular hypergraphs satisfying both properties]
Molecular hypergraph grammar
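Both membership conditions are easy to check programmatically. The sketch below uses a toy valence table (illustrative values, not a complete chemistry model):

```python
from collections import Counter

VALENCE = {"C": 4, "H": 1, "N": 3, "O": 2}  # toy table for illustration

def is_molecular_hypergraph(hyperedges):
    """hyperedges: list of (atom_label, set_of_bond_ids)."""
    degree = Counter(b for _, bonds in hyperedges for b in bonds)
    two_regular = all(d == 2 for d in degree.values())       # property 1
    valence_ok = all(len(bonds) == VALENCE[label]            # property 2
                     for label, bonds in hyperedges)
    return two_regular and valence_ok

# Methane: one C hyperedge over 4 bonds, each bond shared with one H.
methane = [("C", {0, 1, 2, 3})] + [("H", {i}) for i in range(4)]
print(is_molecular_hypergraph(methane))
```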
MHG, a grammar for this language, is defined as a subclass of HRG
§Molecular hypergraph grammar (MHG)
–Definition: an HRG that generates molecular hypergraphs only
–Counterexamples (hypergraphs an unconstrained HRG may generate):
• One violating the valence condition ✗
(this can be avoided by learning the HRG from data [Aguiñaga+, 16])
• One violating 2-regularity ✗
(avoided by using an irredundant tree decomposition, our contribution)
Molecular hypergraph grammar
A naive application of the existing algorithm doesn't work
§Naive application of the HRG inference algorithm [Aguiñaga+, 16]
–Input: set of hypergraphs
–Output: HRG w/ the following properties:
• All the input hypergraphs are in the language ✓
• Guarantees the valence conditions ✓
• No guarantee on 2-regularity ✗
[Figure: a non-2-regular hypergraph; this cannot be transformed into a molecular graph]
MHG inference algorithm
An irredundant tree decomposition is the key to guaranteeing 2-regularity
§Irredundant tree decomposition
–For each node, the connected subtree induced by the bags containing it must be a path
–Any tree decomposition can be made irredundant in polynomial time
[Figure: an irredundant vs. a redundant tree decomposition of the same graph]
MHG inference algorithm
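Since the bags containing a node are already connected in a valid tree decomposition, irredundancy reduces to a degree check: the induced subtree is a path iff no bag in it touches more than two others. A toy sketch (illustrative names; a star-shaped decomposition where node 4 sits in three branches fails the check):

```python
def occurrences_form_path(node, bags, tree_edges):
    """True iff the bags containing `node` induce a path in the tree."""
    containing = {t for t, bag in bags.items() if node in bag}
    deg = {t: 0 for t in containing}
    for a, b in tree_edges:  # count induced-subtree degrees
        if a in containing and b in containing:
            deg[a] += 1
            deg[b] += 1
    # A connected subtree is a path iff every node has degree <= 2.
    return all(d <= 2 for d in deg.values())

bags = {"c": {4}, "t1": {1, 4}, "t2": {2, 4}, "t3": {3, 4}}
tree_edges = [("c", "t1"), ("c", "t2"), ("c", "t3")]
print(occurrences_form_path(4, bags, tree_edges))
```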
The MHG inference algorithm differs from the existing one in two steps
§MHG inference algorithm [Kajino, 19]
–Input: set of molecular graphs
–Output: MHG w/ the following properties:
• All the input hypergraphs are in the language ✓
• Guarantees the valence conditions ✓ (thanks to HRG)
• Guarantees 2-regularity ✓ (our contribution)
–Algorithm:
1. Convert molecular graphs into molecular hypergraphs
2. Compute tree decompositions of the molecular hypergraphs
3. Convert each tree decomposition to be irredundant
4. Extract production rules
5. Compose the MHG by taking their union
MHG inference algorithm
We obtain (Enc, Dec) between a molecule and a latent vector
by combining MHG and an RNN-VAE
§MHG-VAE: (Enc, Dec) between molecule & latent vector
–MHG-VAE encoder: molecular graph → molecular hypergraph → parse tree
according to MHG → latent vector z ∈ ℝ^d
–The first two maps come from MHG; the last is the Enc of the RNN-VAE
Combination with VAE
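The interface between the grammar and the RNN-VAE can be sketched as follows (an illustrative encoding, not the released code): a parse tree under MHG is serialized into the sequence of rule ids used in a fixed-order derivation, which the RNN consumes, and the tree is recovered by replaying the rules.

```python
def encode(tree):
    """tree = (rule_id, [child_trees]) -> pre-order list of rule ids."""
    rule_id, children = tree
    seq = [rule_id]
    for c in children:
        seq += encode(c)
    return seq

def decode(seq, arity):
    """Rebuild the tree, given each rule's number of RHS non-terminals."""
    it = iter(seq)
    def build():
        rule_id = next(it)
        return (rule_id, [build() for _ in range(arity[rule_id])])
    return build()

arity = {0: 2, 1: 0, 2: 1}          # rule id -> # of non-terminals on its RHS
tree = (0, [(2, [(1, [])]), (1, [])])
seq = encode(tree)
print(seq, decode(seq, arity) == tree)
```

Because decoding replays grammar rules, any rule sequence the decoder emits corresponds to a derivation, so the hard constraints hold by construction.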
We learn (Enc, Dec) between a molecule and its vector representation
using MHG-VAE, then use BO on the latent space to optimize the target
§Global molecular optimization [Gómez-Bombarelli+, 16]
–Find: the molecule that maximizes the target
–Method: VAE + BO
1. Obtain MHG from the input molecules
2. Train an RNN-VAE on the syntax trees
3. Obtain vector representations {z_n ∈ ℝ^d}_{n=1}^N,
some of which have target values {y_n ∈ ℝ}
4. BO gives us candidates {z_k ∈ ℝ^d}_{k=1}^K that may maximize the target
5. Decode them to obtain molecules {G_k}_{k=1}^K
Image from [Gómez-Bombarelli+, 16]
Combination with VAE
We evaluate the benefit of our grammar-based representation,
compared with existing ones
§Empirical study
–Purpose: how much does our representation facilitate VAE training?
–Baselines:
• {C, G, SD}-VAE use SMILES (a text representation)
• JT-VAE assembles molecular components
(it requires NNs other than the VAE for scoring)
–Tasks:
• VAE reconstruction
• Valid prior ratio
• Global molecular optimization
Image from [Jin+, 18]
Combination with VAE
Our grammar-based representation achieves better scores,
which empirically supports the effectiveness of our approach.
§Result
[Table: optimization results. The target y(G) combines water solubility, the synthetic accessibility score, and a penalty for rings larger than six]
Combination with VAE
A graph grammar can be a building block for a graph generative model
§Classify constraints into hard ones and soft ones
ML for the soft ones, rules for the hard ones
§Define a language by encoding the hard constraints
E.g., valence conditions
§Design a grammar for the language
Sometimes with an inference algorithm
Code is now public on GitHub:
https://github.com/ibm-research-tokyo/graph_grammar
Takeaways
[Aguiñaga+, 16] Aguiñaga, S., Palacios, R., Chiang, D., and Weninger, T.: Growing graphs from hyperedge
replacement graph grammars. In Proceedings of the 25th ACM International on Conference on Information and
Knowledge Management, pp. 469–478, 2016.
[Feder, 71] Feder, J: Plex languages. Information Sciences, 3, pp. 225-241, 1971.
[Gómez-Bombarelli+, 16] Gómez-Bombarelli, R., Wei, J. N., Duvenaud, D., Hernández-Lobato, J. M., Sánchez-
Lengeling, B., Sheberla, D., Aguilera-Iparraguirre, J., Hirzel, T. D., Adams, R. P., and Aspuru-Guzik, A.: Automatic
chemical design using a data-driven continuous representation of molecules. ACS Central Science, 2018. (arXiv version
appeared in 2016)
[Jin+, 18] Jin, W., Barzilay, R., and Jaakkola, T.: Junction tree variational autoencoder for molecular graph generation.
In Proceedings of the Thirty-fifth International Conference on Machine Learning, 2018.
[Kajino, 19] Kajino, H.: Molecular hypergraph grammar with its application to molecular optimization. In Proceedings of
the Thirty-sixth International Conference on Machine Learning, 2019.
[Pavlidis, 72] Pavlidis, T.: Linear and context-free graph grammars. Journal of the ACM, 19(1), pp.11-23, 1972.
[You+, 18] You, J., Liu, B., Ying, Z., Pande, V., and Leskovec, J.: Graph convolutional policy network for goal-directed
molecular graph generation. In Advances in Neural Information Processing Systems 31, pp. 6412–6422, 2018.
References
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 

Graph generation using a graph grammar

  • 1. © 2019 IBM Corporation Graph generation using a graph grammar Hiroshi Kajino IBM Research - Tokyo IBIS 2019
  • 2. © 2019 IBM Corporation /52 I will talk about an application of formal language to a graph generation problem §Formal language –Context-free grammar –Hyperedge replacement grammar (HRG) –HRG inference algorithm §Application to molecular graph generation –Molecular hypergraph grammar (a special case of HRG) –MHG inference algorithm –Combination with VAE 2 Contents
  • 3. © 2019 IBM Corporation /52 A molecular graph should satisfy some constraints to be valid §Learning a generative model of a molecular graph –Input: set of molecular graphs D = {G_1, G_2, …, G_N} –Output: probability distribution p(G) such that G_i ∼ p §Hard vs. soft constraints on p(G)'s support –Hard constraint: valence condition ∃ rule-based classifier that judges this constraint –Soft constraint: stability ∄ rule-based classifier, in general 3 Why should we care about a formal language? Formal language can help
  • 4. © 2019 IBM Corporation /52 A formal language defines a set of strings with certain properties; an associated grammar tells us how to generate them §Formal language Typically defined as a set of strings –Language point of view = a set of strings with certain properties (= a subset of all possible strings) –Generative point of view A grammar is often associated with a language = how to generate strings in ℒ 4 Why should we care about a formal language? All possible strings Σ* Language ℒ Grammar G ℒ = { aⁿbⁿ : n ≥ 1 } ⊂ {a, b}* = Σ* G = ({S}, {a, b}, {S → ab, S → aSb}, S)
  • 5. © 2019 IBM Corporation /52 A formal language defines a set of graphs satisfying hard constraints; an associated grammar tells us how to generate them §Formal language Typically defined as a set of graphs –Language point of view = a set of graphs satisfying the hard constraints (= a subset of all possible graphs) –Generative point of view A grammar is often associated with a language = how to generate graphs in ℒ 5 Why should we care about a formal language? All possible graphs Language ℒ Grammar G ℒ = Molecules satisfying the valence conditions ⊂ {All possible graphs} ???
  • 6. © 2019 IBM Corporation /52 I will talk about an application of formal language to a graph generation problem §Formal language –Context-free grammar –Hyperedge replacement grammar (HRG) –HRG inference algorithm §Application to molecular graph generation –Molecular hypergraph grammar (a special case of HRG) –MHG inference algorithm –Combination with VAE 6 Contents
  • 7. © 2019 IBM Corporation /52 CFG generates a string by repeatedly applying a production rule to a non-terminal, until no non-terminals remain §Context-free grammar G = (N, Σ, P, S) – N: set of non-terminals – Σ: set of terminals – P: set of production rules – S ∈ N: the start symbol 7 Context-free grammar § Example – N = {S} – Σ = {a, b} – P = {S → ab, S → aSb} – Current string: S
  • 8. © 2019 IBM Corporation /52 CFG generates a string by repeatedly applying a production rule to a non-terminal, until no non-terminals remain §Context-free grammar G = (N, Σ, P, S) – N: set of non-terminals – Σ: set of terminals – P: set of production rules – S ∈ N: the start symbol 8 Context-free grammar § Example – N = {S} – Σ = {a, b} – P = {S → ab, S → aSb} – Current string: S
  • 9. © 2019 IBM Corporation /52 CFG generates a string by repeatedly applying a production rule to a non-terminal, until no non-terminals remain §Context-free grammar G = (N, Σ, P, S) – N: set of non-terminals – Σ: set of terminals – P: set of production rules – S ∈ N: the start symbol 9 Context-free grammar § Example – N = {S} – Σ = {a, b} – P = {S → ab, S → aSb} – Current string: aSb
  • 10. © 2019 IBM Corporation /52 CFG generates a string by repeatedly applying a production rule to a non-terminal, until no non-terminals remain §Context-free grammar G = (N, Σ, P, S) – N: set of non-terminals – Σ: set of terminals – P: set of production rules – S ∈ N: the start symbol 10 Context-free grammar § Example – N = {S} – Σ = {a, b} – P = {S → ab, S → aSb} – Current string: aSb
  • 11. © 2019 IBM Corporation /52 CFG generates a string by repeatedly applying a production rule to a non-terminal, until no non-terminals remain §Context-free grammar G = (N, Σ, P, S) – N: set of non-terminals – Σ: set of terminals – P: set of production rules – S ∈ N: the start symbol 11 Context-free grammar § Example – N = {S} – Σ = {a, b} – P = {S → ab, S → aSb} – Current string: aaSbb
  • 12. © 2019 IBM Corporation /52 CFG generates a string by repeatedly applying a production rule to a non-terminal, until no non-terminals remain §Context-free grammar G = (N, Σ, P, S) – N: set of non-terminals – Σ: set of terminals – P: set of production rules – S ∈ N: the start symbol 12 Context-free grammar § Example – N = {S} – Σ = {a, b} – P = {S → ab, S → aSb} – Current string: aaSbb
  • 13. © 2019 IBM Corporation /52 CFG generates a string by repeatedly applying a production rule to a non-terminal, until no non-terminals remain §Context-free grammar G = (N, Σ, P, S) – N: set of non-terminals – Σ: set of terminals – P: set of production rules – S ∈ N: the start symbol 13 Context-free grammar § Example – N = {S} – Σ = {a, b} – P = {S → ab, S → aSb} – Current string: aaabbb
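The derivation shown on slides 7–13 can be sketched in a few lines. The grammar below is the slides' G = ({S}, {a, b}, {S → ab, S → aSb}, S); the helper name `derive` and its interface are my own illustration, not code from the talk.

```python
import random

def derive(rules, start="S", max_steps=100, seed=0):
    """Rewrite the leftmost non-terminal with a randomly chosen
    production rule until no non-terminals remain."""
    rng = random.Random(seed)
    string = [start]
    for _ in range(max_steps):
        # the non-terminals are exactly the keys of `rules`
        idx = next((i for i, sym in enumerate(string) if sym in rules), None)
        if idx is None:  # terminals only: the derivation is finished
            return "".join(string)
        string[idx:idx + 1] = rng.choice(rules[string[idx]])
    raise RuntimeError("derivation did not terminate")

# P = {S -> ab, S -> aSb}; every derived string lies in {a^n b^n : n >= 1}
rules = {"S": [["a", "b"], ["a", "S", "b"]]}
word = derive(rules)
```

Because both rules put an `a` on the left and a `b` on the right, any terminating run produces a word of the language on slide 4, regardless of the random choices.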
  • 14. © 2019 IBM Corporation /52 I will talk about an application of formal language to a graph generation problem §Formal language –Context-free grammar –Hyperedge replacement grammar (HRG) –HRG inference algorithm §Application to molecular graph generation –Molecular hypergraph grammar (a special case of HRG) –MHG inference algorithm –Combination with VAE 14 Contents
  • 15. © 2019 IBM Corporation /52 Hypergraph is a generalization of a graph §Hypergraph H = (V, E) consists of… –Node v ∈ V –Hyperedge e ∈ E ⊆ 2^V: connects an arbitrary number of nodes cf. an edge in a graph connects exactly two nodes 15 Hyperedge replacement grammar [diagram: a hypergraph with a node and a hyperedge highlighted]
  • 16. © 2019 IBM Corporation /52 HRG generates a hypergraph by repeatedly replacing non-terminal hyperedges with hypergraphs §Hyperedge replacement grammar (HRG) G = (N, Σ, P, S) –N: set of non-terminals –Σ: set of terminals –S: start symbol –P: set of production rules – A rule replaces a non-terminal hyperedge with a hypergraph 16 Hyperedge replacement grammar [production-rule diagrams] Labels on hyperedges: C, N, S [Feder, 71], [Pavlidis+, 72]
  • 17. © 2019 IBM Corporation /52 Start from start symbol S 17 Hyperedge replacement grammar [diagram: production rules; current hypergraph: S]
  • 18. © 2019 IBM Corporation /52 The left rule is applicable 18 Hyperedge replacement grammar [diagram: production rules; current hypergraph: S]
  • 19. © 2019 IBM Corporation /52 We obtain a hypergraph with three non-terminals 19 Hyperedge replacement grammar [diagram: production rules; current hypergraph: ring of three N hyperedges]
  • 20. © 2019 IBM Corporation /52 Apply the right rule to one of the non-terminals 20 Hyperedge replacement grammar [diagram: production rules; current hypergraph: ring of three N hyperedges]
  • 21. © 2019 IBM Corporation /52 Two non-terminals remain 21 Hyperedge replacement grammar [diagram: production rules; current hypergraph: one C, two H, and two N hyperedges]
  • 22. © 2019 IBM Corporation /52 Repeat the procedure until there is no non-terminal 22 Hyperedge replacement grammar [diagram: production rules; current hypergraph: one C, two H, and two N hyperedges]
  • 23. © 2019 IBM Corporation /52 Repeat the procedure until there is no non-terminal 23 Hyperedge replacement grammar [diagram: production rules; current hypergraph: two C, four H, and one N hyperedge]
  • 24. © 2019 IBM Corporation /52 Repeat the procedure until there is no non-terminal 24 Hyperedge replacement grammar [diagram: production rules; current hypergraph: two C, four H, and one N hyperedge]
  • 25. © 2019 IBM Corporation /52 Graph generation halts when there is no non-terminal 25 Hyperedge replacement grammar [diagram: production rules; final hypergraph: three C and six H hyperedges]
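The replacement step driving slides 17–25 fits in a few lines. The encoding (a hypergraph as a list of labeled hyperedges, with the RHS's external nodes numbered 0..k−1) and the two rules (S → a ring of three N hyperedges, N → a C hyperedge with two H hyperedges) are my reading of the diagrams, not code from the talk.

```python
import itertools

_fresh = itertools.count()

def replace(hg, idx, rhs):
    """Replace the non-terminal hyperedge hg[idx] by the hypergraph rhs.
    An RHS node that is an int i is glued to the i-th attachment node of
    the replaced edge; other RHS nodes are renamed to fresh nodes."""
    label, attach = hg[idx]
    rename = dict(enumerate(attach))  # external nodes of the RHS
    out = hg[:idx] + hg[idx + 1:]
    for lab, nodes in rhs:
        out.append((lab, tuple(
            rename.setdefault(n, "v%d" % next(_fresh)) for n in nodes)))
    return out

# S (no attachment nodes) -> a ring of three non-terminal N hyperedges
rule_S = [("N", ("x", "y")), ("N", ("y", "z")), ("N", ("z", "x"))]
# N (two attachment nodes) -> a C hyperedge carrying two H hyperedges
rule_N = [("C", (0, 1, "h1", "h2")), ("H", ("h1",)), ("H", ("h2",))]

hg = [("S", ())]
hg = replace(hg, 0, rule_S)          # slide 19
while any(lab == "N" for lab, _ in hg):  # slides 20-25
    idx = next(i for i, (lab, _) in enumerate(hg) if lab == "N")
    hg = replace(hg, idx, rule_N)
```

The run ends with three C and six H hyperedges, and every node lies in exactly two hyperedges, matching the terminal hypergraph on slide 25.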
  • 26. © 2019 IBM Corporation /52 I will talk about an application of formal language to a graph generation problem §Formal language –Context-free grammar –Hyperedge replacement grammar (HRG) –HRG inference algorithm §Application to molecular graph generation –Molecular hypergraph grammar (a special case of HRG) –MHG inference algorithm –Combination with VAE 26 Contents
  • 27. © 2019 IBM Corporation /52 HRG inference algorithm outputs HRG that can reconstruct the input §HRG inference algorithm [Aguiñaga+, 16] –Input: Set of hypergraphs ℋ –Output: HRG such that ℋ ⊆ ℒ(HRG) Minimum requirement of the grammar’s expressiveness –Idea: Infer production rules necessary to obtain each hypergraph Decompose each hypergraph into a set of production rules 27 HRG inference algorithm Language of the grammar
  • 28. © 2019 IBM Corporation /52 Tree decomposition discovers a tree-like structure in a graph §Tree decomposition –All the nodes and edges must be included in the tree –For each node, the tree nodes that contain it must be connected 28 HRG inference algorithm [diagram: a molecular graph and its tree decomposition; digits represent the node correspondence]
  • 29. © 2019 IBM Corporation /52 Tree decomposition and (a syntax tree of) HRG are equivalent §Relationship between tree decomposition and HRG 1. Connecting hypergraphs in tree recovers the original hypergraph 2. Connection ⇔ Hyperedge replacement 29 HRG inference algorithm [diagram: tree decomposition]
  • 30. © 2019 IBM Corporation /52 Tree decomposition and (a syntax tree of) HRG are equivalent §Relationship between tree decomposition and HRG 1. Connecting hypergraphs in tree recovers the original hypergraph 2. Connection ⇔ Hyperedge replacement 31 HRG inference algorithm [diagram: tree decomposition with an extracted production rule] Production rule = attach
  • 31. © 2019 IBM Corporation /52 Tree decomposition and (a syntax tree of) HRG are equivalent §Relationship between tree decomposition and HRG 1. Connecting hypergraphs in tree recovers the original hypergraph 2. Connection ⇔ Hyperedge replacement 32 HRG inference algorithm [diagram: tree decomposition with an extracted production rule] Production rule = attach Children
  • 32. © 2019 IBM Corporation /52 Tree decomposition and (a syntax tree of) HRG are equivalent §Relationship between tree decomposition and HRG 1. Connecting hypergraphs in tree recovers the original hypergraph 2. Connection ⇔ Hyperedge replacement 33 HRG inference algorithm [diagram: tree decomposition with an extracted production rule] Production rule = attach
  • 33. © 2019 IBM Corporation /52 HRG can be inferred from tree decompositions of input hypergraphs. §HRG inference algorithm [Aguiñaga+, 16] –Algorithm: –Expressiveness: ℋ ⊆ ℒ(HRG) The resultant HRG can generate all input hypergraphs. (∵ clear from its algorithm) 34 HRG inference algorithm 1. Compute tree decompositions of input hypergraphs 2. Extract production rules 3. Compose HRG by taking their union
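The two conditions on slide 28 can be checked mechanically. The standalone checker below (function and variable names are mine, not from [Aguiñaga+, 16]) verifies a hand-built decomposition of a 4-cycle.

```python
def is_tree_decomposition(graph_edges, bags, tree_edges):
    """bags: dict mapping a bag name to a set of graph nodes;
    tree_edges: edges of the decomposition tree over bag names."""
    nodes = {v for e in graph_edges for v in e}
    # condition 1: every edge (hence every node) lies inside some bag
    if not all(any(set(e) <= bags[b] for b in bags) for e in graph_edges):
        return False
    # condition 2: the bags containing each node induce a connected subtree
    for v in nodes:
        holding = [b for b in bags if v in bags[b]]
        seen, stack = {holding[0]}, [holding[0]]
        while stack:  # flood-fill over tree edges restricted to `holding`
            b = stack.pop()
            for x, y in tree_edges:
                for nb in (y,) if x == b else (x,) if y == b else ():
                    if nb in holding and nb not in seen:
                        seen.add(nb)
                        stack.append(nb)
        if set(holding) != seen:
            return False
    return True

# a 4-cycle and a decomposition into two bags sharing nodes {1, 3}
cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
bags = {"A": {1, 2, 3}, "B": {1, 3, 4}}
ok = is_tree_decomposition(cycle, bags, [("A", "B")])
```

Dropping the tree edge makes the bags holding nodes 1 and 3 disconnected, so the same bags then fail condition 2.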
  • 34. © 2019 IBM Corporation /52 I will talk about an application of formal language to a graph generation problem §Formal language –Context-free grammar –Hyperedge replacement grammar (HRG) –HRG inference algorithm §Application to molecular graph generation –Molecular hypergraph grammar (a special case of HRG) –MHG inference algorithm –Combination with VAE 35 Contents
  • 35. © 2019 IBM Corporation /52 We want a graph grammar that guarantees hard constraints §Objective Construct a graph grammar that never violates the valence condition §Application: Generative model of a molecule –Grammar-based generation guarantees the valence condition –Probabilistic model could learn soft constraints 36 Molecular hypergraph grammar
  • 36. © 2019 IBM Corporation /52 A simple application to molecular graphs doesn't work §A simple application to molecular graphs –Input: Molecular graphs –Issue: Valence conditions can be violated 37 Molecular hypergraph grammar [diagram: input molecular graph → tree decomposition → extracted rules] This rule increases the degree of carbon
  • 37. © 2019 IBM Corporation /52 Our idea is to use a hypergraph representation of a molecule §Conserved quantity –HRG: # of nodes in a hyperedge –Our grammar: # of bonds connected to each atom (valence) ∴ Atom should be modeled as a hyperedge §Molecular hypergraph – Atom = hyperedge – Bond = node 38 Molecular hypergraph grammar [diagram: a molecular graph and its hypergraph representation]
  • 38. © 2019 IBM Corporation /52 A language for molecular hypergraphs consists of two properties §Molecular hypergraph as a language A set of hypergraphs with the following properties: 1. Each node has degree 2 (= 2-regular) 2. Label on a hyperedge determines # of nodes it has (= valence) 39 Molecular hypergraph grammar [diagram: molecular hypergraphs illustrating the two properties]
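The conversion on slide 37 (atom → hyperedge, bond → node) is easy to sketch. The encoding below, with ethane as the example, is my own illustration rather than the paper's implementation.

```python
def mol_to_hypergraph(atoms, bonds):
    """atoms: list of atom labels; bonds: list of (i, j) atom index pairs.
    Each bond becomes a node, each atom becomes a hyperedge over the
    bond-nodes incident to it; returns (label, node-tuple) pairs."""
    incident = [[] for _ in atoms]
    for b, (i, j) in enumerate(bonds):
        incident[i].append(b)
        incident[j].append(b)
    return [(lab, tuple(inc)) for lab, inc in zip(atoms, incident)]

# ethane (C2H6) with explicit hydrogens
atoms = ["C", "C", "H", "H", "H", "H", "H", "H"]
bonds = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5), (1, 6), (1, 7)]
hypergraph = mol_to_hypergraph(atoms, bonds)
```

Because every bond joins exactly two atoms, each bond-node ends up in exactly two hyperedges (property 1), and the number of nodes in an atom's hyperedge equals the valence it uses (property 2).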
  • 39. © 2019 IBM Corporation /52 MHG, a grammar for the language, is defined as a subclass of HRG §Molecular Hypergraph Grammar (MHG) –Definition: HRG that generates molecular hypergraphs only –Counterexamples: 40 Molecular hypergraph grammar MHG ⊂ HRG [counterexample diagrams] Valence ✗ 2-regularity ✗ This can be avoided by learning HRG from data [Aguiñaga+, 16] Use an irredundant tree decomposition (our contribution)
  • 40. © 2019 IBM Corporation /52 I will talk about an application of formal language to a graph generation problem §Formal language –Context-free grammar –Hyperedge replacement grammar (HRG) –HRG inference algorithm §Application to molecular graph generation –Molecular hypergraph grammar (a special case of HRG) –MHG inference algorithm –Combination with VAE 41 Contents
  • 41. © 2019 IBM Corporation /52 A naive application of the existing algorithm doesn't work §Naive application of the HRG inference algorithm [Aguiñaga+, 16] –Input: Set of hypergraphs –Output: HRG w/ the following properties: • All the input hypergraphs are in the language ✓ • Guarantee the valence conditions ✓ • No guarantee on 2-regularity ✗ 42 MHG inference algorithm [counterexample diagram] This cannot be transformed into a molecular graph
  • 42. © 2019 IBM Corporation /52 Irredundant tree decomposition is a key to guarantee 2-regularity §Irredundant tree decomposition –The connected subgraph induced by a node must be a path –Any tree decomposition can be made irredundant in poly-time 43 MHG inference algorithm [diagram: irredundant vs. redundant tree decomposition]
  • 43. © 2019 IBM Corporation /52 MHG inference algorithm is different from the existing one by two steps §MHG Inference algorithm [Kajino, 19] –Input: Set of molecular graphs –Output: MHG w/ the following properties: • All the input hypergraphs are in the language ✓ • Guarantee the valence conditions ✓ • Guarantee 2-regularity ✓ 44 MHG inference algorithm 1. Convert molecular graphs into molecular hypergraphs 2. Compute tree decompositions of molecular hypergraphs 3. Convert each tree decomposition to be irredundant 4. Extract production rules 5. Compose MHG by taking their union Thanks to HRG / Our contribution
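The irredundancy condition on slide 42 (the bags containing each node must induce a path) is also mechanically checkable: in a tree, k bags joined by k−1 induced edges with maximum degree ≤ 2 form exactly a path. The checker and the toy decompositions below are my own illustration, not the paper's code.

```python
def is_irredundant(bags, tree_edges):
    """bags: dict mapping a bag name to a set of graph nodes;
    tree_edges: edges of the decomposition tree over bag names.
    True iff, for every graph node, the bags holding it induce a path."""
    nodes = {v for bag in bags.values() for v in bag}
    for v in nodes:
        holding = {b for b in bags if v in bags[b]}
        induced = [(x, y) for x, y in tree_edges
                   if x in holding and y in holding]
        degree = {b: 0 for b in holding}
        for x, y in induced:
            degree[x] += 1
            degree[y] += 1
        # k-1 edges inside a tree => connected; max degree <= 2 => a path
        if len(induced) != len(holding) - 1 or any(d > 2 for d in degree.values()):
            return False
    return True

# node 1 induces the path B - A - C: irredundant
path_bags = {"A": {1, 2}, "B": {1, 3}, "C": {1, 4}}
path_ok = is_irredundant(path_bags, [("A", "B"), ("A", "C")])
# node 1 induces a star centred at A: redundant
star_bags = {"A": {1}, "B": {1}, "C": {1}, "D": {1}}
star_ok = is_irredundant(star_bags, [("A", "B"), ("A", "C"), ("A", "D")])
```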
  • 44. © 2019 IBM Corporation /52 I will talk about an application of formal language to a graph generation problem §Formal language –Context-free grammar –Hyperedge replacement grammar (HRG) –HRG inference algorithm §Application to molecular graph generation –Molecular hypergraph grammar (a special case of HRG) –MHG inference algorithm –Combination with VAE 45 Contents
  • 45. © 2019 IBM Corporation /52 We obtain (Enc, Dec) between molecule and latent vector by combining MHG and RNN-VAE §MHG-VAE: (Enc, Dec) between molecule & latent vector 46 Combination with VAE [pipeline diagram] Molecular graph → Molecular hypergraph → Parse tree according to MHG → latent vector z ∈ ℝ^d (MHG-VAE encoder = MHG + Enc of RNN-VAE)
  • 46. © 2019 IBM Corporation /52 First, we learn (Enc, Dec) between a molecule and its vector representation using MHG-VAE §Global molecular optimization [Gómez-Bombarelli+, 16] –Find: Molecule that maximizes the target –Method: VAE+BO 1. Obtain MHG from the input molecules 2. Train RNN-VAE on syntax trees 3. Obtain vector representations {z_i ∈ ℝ^d}_{i=1}^N, some of which have target values {y_i ∈ ℝ} 4. BO gives us candidates {z'_j ∈ ℝ^d}_{j=1}^M that may maximize the target 5. Decode them to obtain molecules {G'_j}_{j=1}^M 47 Combination with VAE Image from [Gómez-Bombarelli+, 16]
  • 47. © 2019 IBM Corporation /52 Given vector representations and their target values, we use BO to obtain a vector that optimizes the target §Global molecular optimization [Gómez-Bombarelli+, 16] –Find: Molecule that maximizes the target –Method: VAE+BO 1. Obtain MHG from the input molecules 2. Train RNN-VAE on syntax trees 3. Obtain vector representations {z_i ∈ ℝ^d}_{i=1}^N, some of which have target values {y_i ∈ ℝ} 4. BO gives us candidates {z'_j ∈ ℝ^d}_{j=1}^M that may maximize the target 5. Decode them to obtain molecules {G'_j}_{j=1}^M 48 Combination with VAE Image from [Gómez-Bombarelli+, 16]
  • 48. © 2019 IBM Corporation /52 We evaluate the benefit of our grammar-based representation, compared with existing ones §Empirical study –Purpose: How much does our representation facilitate VAE training? –Baselines: • {C,G,SD}VAE use SMILES (text repr.) • JT-VAE assembles molecular components – It requires NNs other than VAE for scoring –Tasks: • VAE reconstruction • Valid prior ratio • Global molecular optimization 49 Combination with VAE Image from [Jin+, 18]
  • 49. © 2019 IBM Corporation /52 Our grammar-based representation achieves better scores. This result empirically supports the effectiveness of our approach. §Result 50 Combination with VAE [results table; annotations: synthetic accessibility score, penalty to a ring larger than six, water solubility; task: maximizing the target]
  • 50. © 2019 IBM Corporation /52 A graph grammar can be a building block for a graph generative model §Classify constraints into hard ones and soft ones ML for the soft ones, rules for the hard ones §Define a language by encoding hard constraints E.g., valence conditions §Design a grammar for the language Sometimes with an inference algorithm Code is now public on GitHub https://github.com/ibm-research-tokyo/graph_grammar 51 Takeaways
  • 51. © 2019 IBM Corporation /52 [Aguiñaga+, 16] Aguiñaga, S., Palacios, R., Chiang, D., and Weninger, T.: Growing graphs from hyperedge replacement graph grammars. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, pp. 469–478, 2016. [Feder, 71] Feder, J.: Plex languages. Information Sciences, 3, pp. 225–241, 1971. [Gómez-Bombarelli+, 16] Gómez-Bombarelli, R., Wei, J. N., Duvenaud, D., Hernández-Lobato, J. M., Sánchez-Lengeling, B., Sheberla, D., Aguilera-Iparraguirre, J., Hirzel, T. D., Adams, R. P., and Aspuru-Guzik, A.: Automatic chemical design using a data-driven continuous representation of molecules. ACS Central Science, 2018. (arXiv version appeared in 2016) [Jin+, 18] Jin, W., Barzilay, R., and Jaakkola, T.: Junction tree variational autoencoder for molecular graph generation. In Proceedings of the Thirty-fifth International Conference on Machine Learning, 2018. [Kajino, 19] Kajino, H.: Molecular hypergraph grammar with its application to molecular optimization. In Proceedings of the Thirty-sixth International Conference on Machine Learning, 2019. [Pavlidis, 72] Pavlidis, T.: Linear and context-free graph grammars. Journal of the ACM, 19(1), pp. 11–23, 1972. [You+, 18] You, J., Liu, B., Ying, Z., Pande, V., and Leskovec, J.: Graph convolutional policy network for goal-directed molecular graph generation. In Advances in Neural Information Processing Systems 31, pp. 6412–6422, 2018. 52 References