Graph Neural Network
<Part 2-2> Recommendation - Heterogeneous
Heterogeneous Graph Neural Network Researches
These papers were studied at a time (around 2018) when GCN had achieved state-of-the-art results on several tasks and Graph Neural Networks were drawing attention. Using GNNs in practice requires interpreting combinations of different node and link types, but existing research was limited to graphs composed of a single node and link type; the studies below were conducted to overcome this limitation.
Heterogeneous Graph Neural Network ('19)
Heterogeneous Graph Attention Network ('19)
Heterogeneous Graph Transformer ('20)
GraphSage: Representation Learning on Large Graphs ('18)
Link
SEMI-SUPERVISED CLASSIFICATION WITH GRAPH CONVOLUTIONAL NETWORKS ('17)
Link
Link GitHub
Link
Link
Research that builds a graph network using CNN-style convolution (many similar works, e.g. Graph Attention Network)
Research that applies sampling to overcome GCN's limits on large graphs
Research that interprets compositions of different nodes and links, centered on meta-paths that predefine node paths
Research that interprets heterogeneous graphs without meta-paths (applies the Transformer)
https://arxiv.org/pdf/2003.01332.pdf
HGT(Heterogeneous Graph Transformer)
Recent years have witnessed the emerging success of graph neural networks (GNNs) for modeling
structured data. However, most GNNs are designed for homogeneous graphs, in which all nodes and edges
belong to the same types, making them infeasible to represent heterogeneous structures.
๊ธฐ์กด์˜ ์—ฐ๊ตฌ๋“ค์€ ๋…ธ๋“œ์™€ ๋งํฌ์˜ ํ˜•ํƒœ๊ฐ€ ํ†ต์ผ๋œ homogeneous ๊ทธ๋ž˜ํ”„์— ํ•œ์ •ํ•˜์—ฌ ์—ฐ๊ตฌ๊ฐ€ ์ง„ํ–‰๋˜์—ˆ์œผ๋‚˜,
์‹ค์ œ ํ•„๋“œ์—์„œ๋Š” ๋…ธ๋“œ์™€ ๋งํฌ์˜ ํƒ€์ž…๊ณผ ํ˜•ํƒœ๊ฐ€ ๋‹ค๋ฅธ heterogeneous ๊ฐ€ ์ผ๋ฐ˜์ ์œผ๋กœ ์ด๋Ÿฌํ•œ ๋ฌธ์ œ ํ•ด๊ฒฐ์„ ์œ„ํ•œ
์—ฐ๊ตฌ๋ฅผ ์ง„ํ–‰ ํ•จ.
[heterogeneous]
[homogeneous]
The types of nodes and links differ!
Problems of Past Research
(1) Because they depend on meta-paths, understanding of domain knowledge is essential and a specialized design is required for each domain.
https://www.youtube.com/watch?v=lPP6LRqejA4
[Metapath-guided Heterogeneous Graph Neural Network for Intent Recommendation]
์ด์ข…์˜ Node ์กฐํ•ฉ ์ˆœ์„œ ์‚ฌ์ „ ์ •์˜ ๋ฐ
์ด์— ๋”ฐ๋ฅธ ๋ณ„๋„์˜ ๋„คํŠธ์›Œํฌ ์„ค๊ณ„
Problems of Past Research
(2) Because each meta-path definition has its own weights and network, heterogeneous information is not learned sufficiently.
https://www.youtube.com/watch?v=lPP6LRqejA4
[Metapath-guided Heterogeneous Graph Neural Network for Intent Recommendation]
Considering all combinations of nodes and links, far more combinations exist; but if the combinations are diversified too much, sufficient training data may not be available for each combination, so the set of combinations is restricted => heterogeneous information is not learned sufficiently.
Problems of Past Research
(3) Heterogeneous graphs are highly dynamic (they can change over time), but structurally these methods have limits in reflecting such changes.
https://www.youtube.com/watch?v=lPP6LRqejA4
[Metapath-guided Heterogeneous Graph Neural Network for Intent Recommendation]
ํŒ๋งค ์‹œ๊ฐ„, ์ข…๋ฃŒ ์‹œ๊ฐ„ ๋“ฑ ๋™์ ์œผ๋กœ
๋ณ€ํ•  ์ˆ˜ ์žˆ๋Š” ์š”์†Œ ์ฒ˜๋ฆฌ๋ฅผ ์œ„ํ•ด์„œ Time
๋ณ„๋กœ ๋ณ„๋„์˜ ๋…ธ๋“œ๋ฅผ ๋งŒ๋“ค์–ด์„œ ํ•ด๊ฒฐ
=> ์—ฌ๋Ÿฌ๊ฐ€์ง€ ๋ฌธ์ œ ๋ฐœ์ƒ (H/W ๋ฆฌ์†Œ์Šค ๋“ฑ)
Heterogeneous Graph Transformer Approach
[๋™์ผํ•œ ๋…ธ๋“œ ๊ฐ„์—๋„ ๋‹ค์–‘ํ•œ ๊ด€๊ณ„ ํ‘œํ˜„]
(1) Instead of attending on node or edge type alone, we use the meta relation โŸจฯ„ (s),ฯ•(e), ฯ„ (t)โŸฉ to decompose the
interaction and transform matrices, enabling HGT to capture both the common and specific patterns of different
relationships using equal or even fewer parameters.
์ €์ž์™€ ๋…ผ๋ฌธ๊ฐ„์— 1์ €์ž, 2์ €์ž, 3์ €์ž ๋“ฑ
๋™์ผ ๋…ธ๋“œ ๊ฐ„์—๋„ ๋‹ค์–‘ํ•œ ๊ด€๊ณ„๊ฐ€ ์กด์žฌํ• 
์ˆ˜ ์žˆ๋„๋ก ํ‘œํ˜„
Heterogeneous Graph Transformer Approach
[์•„ํ‚คํƒ์ณ ์ž์ฒด๋กœ soft meta path ๊ตฌํ˜„]
(2) Different from most of the existing works that are based on customized meta paths, we rely on the nature
of the neural architecture to incorporate high-order heterogeneous neighbor information, which automatically
learns the importance of implicit meta paths.
Due to the nature of its architecture, HGT can incorporate information from high-order neighbors of different
types through message passing across layers, which can be regarded as โ€œsoftโ€ meta paths. That said, even if
HGT take only its one-hop edges as input without manually designing meta paths, the proposed attention
mechanism can automatically and implicitly learn and extract โ€œmeta pathsโ€ that are important for different
downstream tasks.
Instead of defining and using meta paths with prior knowledge, they built a neural structure that finds the important paths on its own, based on attention, and learns them automatically! => explained in detail later
Heterogeneous Graph Transformer Approach
[Applying the time gap using positional encoding]
(3) Most previous works donโ€™t take the dynamic nature of (heterogeneous) graphs into consideration, while we
propose the relative temporal encoding technique to incorporate temporal information by using limited
computational resources.
(4) None of the existing heterogeneous GNNs are designed for and experimented with Web-scale graphs, we
therefore propose the heterogeneous Mini-Batch graph sampling algorithm designed for Web-scale graph training,
enabling experiments on the billion-scale Open Academic Graph.
[A mini-batch technique that considers the balance of the heterogeneous graph]
Overall Architecture
Overall Architecture-Graph Structure Example
Target Node
(update target)
Source Node
(Type-1)
Source Node
(Type-2)
Overall Architecture-Message Passing Structure
Attention
Message
Aggregate
Overall Architecture-Applying the Transformer
[Transformer Multi-head Attention] Q, K, V
[HGT Overall Architecture] Attention(Q, E, K), Target Node (t-1)
Overall Architecture-Heterogeneous Mutual Attention Details (1)
Heterogeneous Mutual Attention
S : Source
T : Target
E : Edge
Multi Head Attention
Overall Architecture-Heterogeneous Mutual Attention Details (2)
We add a prior tensor µ ∈ R^{|A|×|R|×|A|} to denote the general significance of each meta relation triplet.
Therefore, unlike the vanilla Transformer that directly calculates the dot product between the Query and Key vectors, we keep a distinct edge-based matrix W^{ATT}_{ϕ(e)} ∈ R^{(d/h)×(d/h)} for each edge type ϕ(e). In doing so, the model can capture different semantic relations even between the same node type pairs.
Linear projection (or fully connected layer) is
perhaps one of the most common operations in
deep learning models. When doing linear
projection, we can project a vector x of
dimension n to a vector y of dimension size
m by multiplying a projection matrix W of
shape [n, m]
Heterogeneous Mutual Attention
Overall Architecture-Heterogeneous Mutual Attention Details (3)
Reflects the importance of the triplet (T - E - S).
Vectors of different sizes are projected to the same size through fully connected (linear) layer operations.
Attention in the original Transformer: Q · K
Modified attention: rather than the simple similarity between T and S, the similarity is computed over T · E · S (target · edge · source).
Heterogeneous Mutual Attention
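For reference, the mutual-attention head can be written out as follows (my reconstruction of the paper's notation, intended as a reading aid rather than a verbatim quote; h is the number of heads and d the hidden dimension):

$$\text{ATT-head}^i(s,e,t)=\Big(K^i(s)\,W^{ATT}_{\phi(e)}\,Q^i(t)^{\top}\Big)\cdot\frac{\mu_{\langle\tau(s),\phi(e),\tau(t)\rangle}}{\sqrt{d}}$$

$$\mathbf{Attention}_{HGT}(s,e,t)=\underset{\forall s\in N(t)}{\mathrm{Softmax}}\Big(\big\Vert_{i\in[1,h]}\ \text{ATT-head}^i(s,e,t)\Big)$$

with $K^i(s)=\text{K-Linear}^i_{\tau(s)}(H^{(l-1)}[s])$ and $Q^i(t)=\text{Q-Linear}^i_{\tau(t)}(H^{(l-1)}[t])$, i.e. the key/query projections are node-type-specific while $W^{ATT}_{\phi(e)}$ and the prior $\mu$ are indexed by the meta relation.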
Overall Architecture-Heterogeneous Message Passing Details (1)
Heterogeneous Message Passing
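The message side mirrors this structure (again my reconstruction of the paper's formulation): the source node is projected with a source-type-specific linear layer, transformed by an edge-type-specific matrix, and the heads are concatenated:

$$\text{MSG-head}^i(s,e,t)=\text{M-Linear}^i_{\tau(s)}\big(H^{(l-1)}[s]\big)\,W^{MSG}_{\phi(e)}$$

$$\mathbf{Message}_{HGT}(s,e,t)=\big\Vert_{i\in[1,h]}\ \text{MSG-head}^i(s,e,t)$$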
Overall Architecture-Target Specific Aggregation Details (1)
Target Specific Aggregation
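As I read the paper, the aggregation step weights each neighbor's message by its attention score, sums over the neighborhood, applies a target-type-specific linear layer, and adds a residual connection to the previous layer:

$$\tilde H^{(l)}[t]=\underset{\forall s\in N(t)}{\oplus}\Big(\mathbf{Attention}_{HGT}(s,e,t)\cdot\mathbf{Message}_{HGT}(s,e,t)\Big)$$

$$H^{(l)}[t]=\text{A-Linear}_{\tau(t)}\Big(\sigma\big(\tilde H^{(l)}[t]\big)\Big)+H^{(l-1)}[t]$$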
Overall Architecture-Relative Temporal Encoding
The traditional way to incorporate temporal information is to construct a separate graph for each time slot.
However, such a procedure may lose a large portion of structural dependencies across different time slots. We
propose the Relative Temporal Encoding (RTE) mechanism to model the dynamic dependencies in
heterogeneous graphs. RTE is inspired by Transformer's positional encoding method. Specifically, given a source node s and a target node t, along with their corresponding timestamps T(s) and T(t), we denote the relative time gap ∆T(t,s) = T(t) − T(s) as an index to get a relative temporal encoding RTE(∆T(t,s)).
The time gap between the target node and the source node is embedded using the positional-encoding technique, and this embedding is added (+) to the source representation to express the relative time difference.
Reference - Layer-Dependent Importance Sampling
Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks
https://arxiv.org/pdf/1911.07323.pdf
(a) The same node is sampled repeatedly.
(b) Nodes that are not adjacent are sampled.
Solves problems (a) and (b) and samples neighbor nodes.
- Black: target node
- Blue: neighbor node
- Red border: sampled node
Reference - Layer-Dependent Importance Sampling
Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks
https://arxiv.org/pdf/1911.07323.pdf
In practice the normalized Laplacian is used (the example shows D − A).
i-th
Overall Architecture - HGSampling
[Problems when applying existing sampling methods] directly using them for heterogeneous graphs is prone to get sub-graphs that are extremely imbalanced regarding different node types, due to that the degree distribution and the total number of nodes for each type can vary dramatically
=> The sampled sub-graph can become extremely imbalanced in terms of node types and data distribution.
[Solution]
1) keep a similar number of nodes and edges for each type
=> Operate a budget that is managed separately per type, so that nodes and edges of each type are sampled evenly.
2) keep the sampled sub-graph dense to minimize the information loss and reduce the sample variance
=> When choosing which of the many adjacent nodes to select, compute a sampling probability for each adjacent node and sample accordingly (see Layer-Dependent Importance Sampling).
(Algorithm 2 in the paper)
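A minimal sketch of the per-type budget sampling (my own illustration; the budget layout {node_type: {node_id: normalized degree}} and the squared-degree sampling probability are assumptions based on the paper's description of HGSampling):

```python
import numpy as np

def sample_from_budget(budget: dict, n: int, rng=None):
    """budget: {node_type: {node_id: accumulated normalized degree}}.
    For each type, sample n nodes with probability proportional to the squared
    normalized degree (importance-sampling style), then pop them out."""
    rng = rng or np.random.default_rng()
    sampled = {}
    for ntype, cand in budget.items():
        if not cand:                                # nothing left for this type
            sampled[ntype] = []
            continue
        ids = np.array(list(cand.keys()))
        deg = np.array(list(cand.values()), dtype=float)
        prob = deg ** 2 / np.sum(deg ** 2)          # prob[s] = deg[s]^2 / ||deg||_2^2
        pick = rng.choice(ids, size=min(n, len(ids)), replace=False, p=prob)
        sampled[ntype] = pick.tolist()
        for node in pick:                           # pop sampled nodes out of the budget
            cand.pop(node)
    return sampled
```

Sampling proportional to the squared normalized degree favors nodes reachable from many already-sampled nodes, which keeps the sub-graph dense and reduces sampling variance, in line with goal 2) above.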
Overall Architecture - Inductive Timestamp Assignment
Till now we have assumed that each node t is assigned with a timestamp T (t). However, in real-world
heterogeneous graphs, many nodes are not associated with a fixed time. Therefore, we need to assign
different timestamps to it. We denote these nodes as plain nodes.
Details of the Add-In-Budget method of Algorithm 1
(when a node is put into the budget and it has no timestamp, it inherits the target node's timestamp)
If the node does have a timestamp, D_t is added.
Overall Architecture - HGSampling with Inductive Timestamp Assignment
A budget is maintained per node type, in order to sample evenly.
Initialize with p1 as the starting node.
Add all nodes connected to p1 to the budget.
Select n nodes per type, add them to the output, and pop them out of the budget; here n = 1 and one source node of the circle type remains in the budget (importance sampling is used here).
From the n nodes per type selected above, find their adjacent nodes again and add them to the budget.
From the nodes placed in the budget, select n again via importance sampling, add them to the output, and pop them out.
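Putting the sketches above together, a hypothetical outline of the whole sampling loop; graph.neighbors and graph.node_types are assumed helpers, not the paper's API:

```python
def hg_sampling(graph, seed_nodes, depth: int, n_per_type: int):
    """Hypothetical outline of HGSampling: repeat 'add neighbors to the budget,
    then importance-sample n nodes per type' for `depth` iterations."""
    budget, times, output = {}, {}, {t: set() for t in graph.node_types}
    for node, ntype, ts in seed_nodes:               # e.g. [(p1, "paper", 2020)]
        output[ntype].add(node)
        times[node] = ts
        for nbr, nbr_type, nbr_time, deg in graph.neighbors(node):
            add_in_budget(budget, times, nbr, nbr_type, nbr_time, ts, deg)
    for _ in range(depth):
        sampled = sample_from_budget(budget, n_per_type)
        for ntype, nodes in sampled.items():
            output[ntype].update(nodes)
            for node in nodes:                       # expand the frontier
                for nbr, nbr_type, nbr_time, deg in graph.neighbors(node):
                    if nbr not in times:             # skip already-seen nodes
                        add_in_budget(budget, times, nbr, nbr_type, nbr_time,
                                      times[node], deg)
    return output
```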