7. Masked Language Model and Self-Supervised Learning
① Language Model (LM)
Model: large-scale DNN
Input: Language models determine
Output: word probability by analyzing text data

② Masked Language Model
Model: large-scale DNN
Input: Language models determine [mask] [mask] by [mask] text data
Output: Language models determine word probability by analyzing text data
Original: Language models determine word probability by analyzing text data
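In both cases, the original text is split into an input and a prediction target.
Because the model predicts the text from (a part of) itself, this is also called self-supervised learning.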
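
The split of the original sentence into an input and a prediction target can be sketched in a few lines of Python. This is only an illustrative sketch, not the lecture's implementation: the helper names `make_lm_example` and `make_mlm_example`, the whitespace tokenization, and the hand-picked split point and mask positions are all assumptions chosen to reproduce the two examples above. Real systems typically use subword tokenizers and choose mask positions at random.

```python
MASK = "[mask]"


def make_lm_example(tokens, split_at):
    """Language model: the prefix is the input, the continuation is the target.

    Hypothetical helper for illustration only.
    """
    return tokens[:split_at], tokens[split_at:]


def make_mlm_example(tokens, mask_positions):
    """Masked language model: hide the chosen positions in the input.

    Returns the masked token list and a dict {position: hidden word}.
    Hypothetical helper for illustration only.
    """
    masked = [MASK if i in mask_positions else tok for i, tok in enumerate(tokens)]
    targets = {i: tokens[i] for i in sorted(mask_positions)}
    return masked, targets


original = "Language models determine word probability by analyzing text data"
tokens = original.split()  # assumption: naive whitespace tokenization

# 1) Language model: predict the rest of the sentence from its beginning.
lm_input, lm_target = make_lm_example(tokens, split_at=3)

# 2) Masked language model: predict the hidden words from the surrounding ones.
mlm_input, mlm_target = make_mlm_example(tokens, mask_positions={3, 4, 6})

print("LM  input :", " ".join(lm_input))
print("LM  target:", " ".join(lm_target))
print("MLM input :", " ".join(mlm_input))
print("MLM target:", mlm_target)
```

In both functions the label comes from the text itself, with no human annotation, which is why this kind of training is called self-supervised.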