Bootstrapping Entity Alignment with Knowledge Graph Embedding
Zequn Sun, Wei Hu, Qingheng Zhang and Yuzhong Qu
National Key Laboratory for Novel Software Technology
Nanjing University, China
{zqsun, qhzhang}.nju@gmail.com, {whu, yzqu}@nju.edu.cn
Background
• Entity Alignment
  ◦ Find entities in different KGs that refer to the same real-world object
  ◦ Play a vital role in automatically integrating multiple KGs
• Conventional approaches
  ◦ Compute entity similarities based on entity attributes
  ◦ Are not always effective because of the semantic heterogeneity
• Embedding-based approaches
  ◦ Encode KGs into vector spaces
  ◦ Measure entity similarities via entity embeddings
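As a rough illustration of measuring entity similarity via embeddings, the sketch below picks, for a KG1 entity, the KG2 entity with the most similar vector. Cosine similarity, the function names and the toy vectors are all assumptions made for this example; the slides do not state which similarity measure is used.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest_counterpart(entity, kg1_emb, kg2_emb):
    """Return the KG2 entity whose embedding is most similar to `entity`."""
    query = kg1_emb[entity]
    return max(kg2_emb, key=lambda cand: cosine(query, kg2_emb[cand]))

# Hand-written toy embeddings (hypothetical values, not learned by any model).
kg1_emb = {"en:Nanjing": [0.9, 0.1], "en:Paris": [0.1, 0.9]}
kg2_emb = {"zh:南京": [0.8, 0.2], "zh:巴黎": [0.2, 0.8]}

best = nearest_counterpart("en:Nanjing", kg1_emb, kg2_emb)
```

This only works if both KGs are embedded in one shared space, which is what the rest of the talk is about.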
Challenges
• Although embedding a single KG has been extensively studied
  in the past few years, alignment-oriented KG embedding
  remains largely unexplored.
• Embedding-based entity alignment usually relies on existing
  entity alignment (prior alignment) as training data. However,
  the accessible prior alignment usually accounts for only a small
  proportion of the alignment to be found.
Framework
• We model entity alignment as a classification problem of
  using KG2 entities to label KG1 entities.
• To address the two challenges above, we propose a
  bootstrapping framework:
[Figure: the bootstrapping framework. Components shown: KG1 triples, KG2 triples, prior alignment, parameter swapping, supervised triples, training of alignment-oriented KG embeddings, alignment predictor, likely alignment, alignment labeling, alignment editing.]
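Parameter Swapping
• We swap aligned entities in their triples to calibrate the
  embeddings of KG1 and KG2 in the unified vector space.
  For an aligned pair $(x, y)$, the supervised triples are
  $$T^{s}_{(x,y)} = \{(y, r, t) \mid (x, r, t) \in T_1^{+}\} \cup \{(h, r, y) \mid (h, r, x) \in T_1^{+}\} \cup \{(x, r, t) \mid (y, r, t) \in T_2^{+}\} \cup \{(h, r, x) \mid (h, r, y) \in T_2^{+}\}$$
  where the first two sets are built from KG1's triples ($T_1^{+}$) and the last two from KG2's triples ($T_2^{+}$).
• The supervised triples are fed to our KG embedding model as
  positives.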
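The swapping rule can be sketched in a few lines of Python. This is a minimal illustration, assuming triples are `(head, relation, tail)` tuples; the function name `swap_triples` and the toy entity names are hypothetical and not taken from the BootEA code.

```python
def swap_triples(aligned_pairs, kg1_triples, kg2_triples):
    """Build supervised triples by swapping aligned entities (x in KG1, y in KG2)."""
    supervised = set()
    for x, y in aligned_pairs:
        for h, r, t in kg1_triples:
            if h == x:                      # (x, r, t) in KG1 -> (y, r, t)
                supervised.add((y, r, t))
            if t == x:                      # (h, r, x) in KG1 -> (h, r, y)
                supervised.add((h, r, y))
        for h, r, t in kg2_triples:
            if h == y:                      # (y, r, t) in KG2 -> (x, r, t)
                supervised.add((x, r, t))
            if t == y:                      # (h, r, y) in KG2 -> (h, r, x)
                supervised.add((h, r, x))
    return supervised

# Toy data (hypothetical identifiers).
kg1 = {("kg1:Nanjing", "locatedIn", "kg1:China")}
kg2 = {("kg2:Nanking", "country", "kg2:PRC")}
supervised = swap_triples([("kg1:Nanjing", "kg2:Nanking")], kg1, kg2)
```

Each swapped triple mixes entities from both KGs, which is what ties the two embedding spaces together during training.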
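Alignment-Oriented Embedding
• Translational score function: $f(\tau) = \lVert \mathbf{h} + \mathbf{r} - \mathbf{t} \rVert_2^2$ for a triple $\tau = (h, r, t)$.
• Margin-based ranking loss, with $[x]_{+} = \max(0, x)$:
  $$O_{\mathrm{margin}} = \sum_{\tau \in T^{+}} \sum_{\tau' \in T^{-}_{\tau}} [\gamma + f(\tau) - f(\tau')]_{+}$$
  It only enforces $f(\tau') - f(\tau) > \gamma$; the absolute values of $f(\tau)$ and $f(\tau')$ are not controlled.
• Limited loss function:
  $$O_{\mathrm{limit}} = \sum_{\tau \in T^{+}} [f(\tau) - \gamma_1]_{+} + \sum_{\tau' \in T^{-}} [\gamma_2 - f(\tau')]_{+}$$
  It enforces $f(\tau) \le \gamma_1$ and $f(\tau') \ge \gamma_2$, and therefore $f(\tau') - f(\tau) \ge \gamma_2 - \gamma_1$.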
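To make the difference between the two losses concrete, here is a small pure-Python sketch. It assumes embeddings are plain lists of floats and that scores are computed beforehand; the function names and the numeric values are illustrative assumptions, not the authors' implementation.

```python
def score(h, r, t):
    """Translational score f = ||h + r - t||_2^2 (lower means more plausible)."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))

def hinge(x):
    """[x]_+ = max(0, x)."""
    return max(0.0, x)

def margin_loss(pos_scores, neg_scores_per_pos, gamma):
    """Sum of [gamma + f(pos) - f(neg)]_+ over each positive and its negatives."""
    return sum(
        hinge(gamma + fp - fn)
        for fp, negs in zip(pos_scores, neg_scores_per_pos)
        for fn in negs
    )

def limited_loss(pos_scores, neg_scores, gamma1, gamma2):
    """Sum of [f(pos) - gamma1]_+ plus sum of [gamma2 - f(neg)]_+."""
    return (sum(hinge(fp - gamma1) for fp in pos_scores)
            + sum(hinge(gamma2 - fn) for fn in neg_scores))

# A positive triple with a large score (3.0) and a negative with score 5.0.
m = margin_loss([3.0], [[5.0]], gamma=1.0)              # margin already satisfied
l = limited_loss([3.0], [5.0], gamma1=0.5, gamma2=2.0)  # positive still penalised
```

In this toy case the margin-based loss is zero although the positive triple scores poorly in absolute terms, whereas the limited loss keeps pushing the positive score below `gamma1`.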
ε-Truncated Negative Sampling
• Conventional uniform negative sampling
  (Washington DC, capitalOf, USA) → (Tim Berners-Lee, capitalOf, USA)
  ◦ The replacer is randomly sampled from all entities.
  ◦ It may be easily distinguished from its original.
• ε-Truncated negative sampling
  (Washington DC, capitalOf, USA) → (New York, capitalOf, USA)
  ◦ The sampling scope is limited to a group of candidates,
    i.e., the $s$-nearest neighbors of the replaced entity, where $s = \lceil (1 - \epsilon) N \rceil$ and $N$ is the number of entities.
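The truncated sampling scope can be sketched as follows. The example assumes squared Euclidean distance between embeddings, excludes the entity itself from its own neighbours, and only replaces the head as in the example above; these choices, the function names and the toy vectors are assumptions for illustration.

```python
import math
import random

def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def truncated_candidates(entity, embeddings, epsilon):
    """Return the s nearest neighbours of `entity`, s = ceil((1 - epsilon) * N)."""
    s = math.ceil((1 - epsilon) * len(embeddings))
    others = [e for e in embeddings if e != entity]
    others.sort(key=lambda e: squared_distance(embeddings[entity], embeddings[e]))
    return others[:s]

def corrupt_head(triple, embeddings, epsilon, rng):
    """Replace the head with an entity drawn uniformly from its truncated candidates."""
    h, r, t = triple
    return (rng.choice(truncated_candidates(h, embeddings, epsilon)), r, t)

# Toy embeddings (hypothetical values): cities lie close together, the person far away.
embeddings = {
    "Washington DC": [0.0, 0.0],
    "New York": [0.1, 0.0],
    "Boston": [0.2, 0.1],
    "Tim Berners-Lee": [5.0, 5.0],
}
candidates = truncated_candidates("Washington DC", embeddings, epsilon=0.5)
negative = corrupt_head(("Washington DC", "capitalOf", "USA"),
                        embeddings, 0.5, random.Random(0))
```

With `epsilon=0.5` and four entities, only the two nearest neighbours remain as candidates, so the easily distinguished replacer is never drawn.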
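Likely Alignment Labeling
• We label likely alignment at the $t$-th iteration by
  solving the following optimization problem:
  $$\max \sum_{x \in X} \sum_{y \in Y_x} \pi(y \mid x; \Theta^{(t)}) \cdot \psi^{(t)}(x, y)$$
  $$\text{s.t.} \quad \sum_{x' \in X} \psi^{(t)}(x', y) \le 1, \quad \sum_{y' \in Y_x} \psi^{(t)}(x, y') \le 1, \quad \forall x, y$$
  where $\pi(y \mid x; \Theta^{(t)})$ is the alignment likelihood under the current embeddings, $Y_x$ is the set of candidate labels for $x$, and $\psi^{(t)}(x, y) \in \{0, 1\}$ indicates whether $x$ is labeled as $y$.
• We transform it into a max-weighted matching problem on bipartite
  graphs.
[Figure: bipartite graph between X and Y illustrating one-to-one labeling.]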
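The one-to-one constraint is what separates this from picking the best label for each entity independently. The brute-force sketch below enumerates all one-to-one assignments for a tiny instance; it is only meant to show the objective and is not a practical matching algorithm. The function name, the optional `threshold` parameter and the likelihood values are assumptions for this example.

```python
from itertools import permutations

def label_likely_alignment(likelihood, threshold=0.0):
    """likelihood[x][y] is the alignment likelihood of labelling x as y.

    Returns the one-to-one labelling {x: y} with maximal total likelihood,
    using only pairs whose likelihood exceeds `threshold`.
    """
    xs = sorted(likelihood)
    ys = sorted({y for row in likelihood.values() for y in row})
    slots = ys + [None] * len(xs)      # None lets an entity stay unlabelled
    best_total, best = -1.0, {}
    for assignment in permutations(slots, len(xs)):
        total, valid = 0.0, True
        for x, y in zip(xs, assignment):
            if y is None:
                continue
            p = likelihood[x].get(y, 0.0)
            if p <= threshold:         # pair not admissible as a candidate
                valid = False
                break
            total += p
        if valid and total > best_total:
            best_total = total
            best = {x: y for x, y in zip(xs, assignment) if y is not None}
    return best

# Both x1 and x2 prefer y1, so independent labelling would conflict.
likelihood = {
    "x1": {"y1": 0.9, "y2": 0.8},
    "x2": {"y1": 0.85, "y2": 0.1},
}
labels = label_likely_alignment(likelihood, threshold=0.5)
```

Here the joint optimum gives `x1` its second choice so that `x2` can take `y1`. For realistic sizes a dedicated max-weight bipartite matching algorithm would replace the enumeration.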
Likely Alignment Editing
• Labeling conflicts arise when accumulating the newly labeled
  alignment of different iterations.
  ◦ $x$ is labeled as $y$ at the $t$-th iteration but as $y'$ at the $(t+1)$-th iteration
• We calculate the following likelihood difference:
  $$\Delta^{(t)}_{(x, y, y')} = \pi(y \mid x; \Theta^{(t)}) - \pi(y' \mid x; \Theta^{(t)})$$
  ◦ If $\Delta^{(t)}_{(x, y, y')} > 0$, labeling $x$ as $y$ gives more alignment
    likelihood, so we choose $y$ to label $x$; otherwise we choose $y'$.
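The editing rule amounts to a small merge step. In this sketch, `likelihood` is a hypothetical callable returning the alignment likelihood of a pair under the current embeddings, and the dictionaries and values are toy assumptions.

```python
def edit_alignment(labeled, new_labels, likelihood):
    """Merge newly labelled pairs into the accumulated alignment.

    When x already has a label y and is newly labelled y', keep y only if
    likelihood(x, y) - likelihood(x, y') > 0; otherwise switch to y'.
    """
    merged = dict(labeled)
    for x, y_new in new_labels.items():
        y_old = merged.get(x)
        if y_old is None or y_old == y_new:
            merged[x] = y_new
            continue
        delta = likelihood(x, y_old) - likelihood(x, y_new)
        if delta <= 0:
            merged[x] = y_new
    return merged

# Toy likelihoods (hypothetical values).
pi = {("x1", "y1"): 0.9, ("x1", "y2"): 0.6, ("x2", "y3"): 0.7}
merged = edit_alignment({"x1": "y1"}, {"x1": "y2", "x2": "y3"},
                        lambda x, y: pi[(x, y)])
```

Because earlier labels can be overturned in later iterations, a wrong label introduced early does not have to persist in the training data.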
Experiments
• Datasets
  ◦ DBP15K: three cross-lingual datasets built from the multilingual
    versions of DBpedia: DBP_ZH-EN (Chinese to English), DBP_JA-EN
    (Japanese to English) and DBP_FR-EN (French to English). Each
    dataset contains 15 thousand reference entity alignments.
  ◦ DWY100K: two large-scale datasets extracted from DBpedia,
    Wikidata and YAGO3, denoted by DBP-WD and DBP-YG. Each
    dataset has 100 thousand reference entity alignments.
• Comparative approaches
  ◦ MTransE [IJCAI 2017] learns a linear transformation between KGs.
  ◦ IPTransE [IJCAI 2017] is an iterative method for entity alignment.
  ◦ JAPE [ISWC 2017] combines relation and attribute embeddings for
    entity alignment.
• Metrics
  ◦ Hits@k: the percentage of correct alignment ranked in the top k
  ◦ MRR: the average of the reciprocal ranks of the correct results
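Both metrics can be computed from the rank of the correct counterpart for each test entity. A minimal sketch, assuming ranks are 1-based integers (the function names and the sample ranks are illustrative):

```python
def hits_at_k(ranks, k):
    """Percentage of test entities whose correct counterpart is ranked within the top k."""
    return 100.0 * sum(1 for r in ranks if r <= k) / len(ranks)

def mean_reciprocal_rank(ranks):
    """Average of 1 / rank over all test entities."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# Rank of the correct counterpart for four test entities (toy values).
ranks = [1, 2, 1, 4]
h1 = hits_at_k(ranks, 1)           # two of the four are ranked first
mrr = mean_reciprocal_rank(ranks)  # (1 + 0.5 + 1 + 0.25) / 4
```

Hits@1 only rewards exact top-ranked hits, while MRR also gives partial credit for near misses.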
• Main results on entity alignment (Hits@1 in %)
  Approach   DBP_ZH-EN      DBP_JA-EN      DBP_FR-EN      DBP-WD         DBP-YG
             Hits@1  MRR    Hits@1  MRR    Hits@1  MRR    Hits@1  MRR    Hits@1  MRR
  MTransE    30.83   0.364  27.86   0.349  24.41   0.335  28.12   0.363  25.15   0.334
  IPTransE   40.59   0.516  36.69   0.474  33.30   0.451  34.85   0.447  29.74   0.386
  JAPE       41.18   0.490  36.25   0.476  32.39   0.430  31.84   0.411  23.57   0.320
  AlignE     47.18   0.581  44.76   0.563  48.12   0.599  56.55   0.655  63.29   0.707
  BootEA     62.94   0.703  62.23   0.701  65.30   0.731  74.79   0.801  76.10   0.808
  ◦ AlignE outperformed the comparative approaches on all five datasets in both Hits@1 and MRR.
  ◦ BootEA considerably improved the performance of AlignE after employing
    bootstrapping, raising Hits@1 by roughly 13 to 18 points.
• F1-score w.r.t. distribution of relation triple numbers
  ◦ We divided entity links in the testing data into several intervals based on
    the number of their relation triples.
  ◦ The performance was assessed by F1-score within each interval.
  ◦ This analysis demonstrated that BootEA can achieve promising
    results on sparse data, indicating its practical use for real KGs.
[Figure: F1-score (0.0 to 1.0) of MTransE, IPTransE, JAPE and BootEA over the number of relation triples, in the intervals [1,6), [6,11), [11,16), [16,21) and [21,∞), together with the number of entity alignments within each interval.]
Conclusion
• In this paper, we studied embedding-based entity alignment.
  ◦ We introduced a KG embedding model to learn alignment-oriented
    embeddings across different KGs. It employs an ε-truncated uniform
    negative sampling method to improve alignment performance.
  ◦ We conducted entity alignment in a bootstrapping process. It labels
    likely alignment as training data and edits alignment during iterations.
  ◦ Our experimental results showed that the proposed approach
    significantly outperformed three state-of-the-art embedding-based
    ones on three cross-lingual datasets and two new large-scale
    datasets.
Thanks for your attention!
• This work is supported by the National Key R&D Program of China
  (No. 2018YFB1004300).
• Code and datasets of BootEA are available at
  https://github.com/nju-websoft/BootEA
• Welcome to my poster (#1425)
