QUASI-RECURRENT NEURAL NETWORKS
James Bradbury∗, Stephen Merity∗, Caiming Xiong & Richard Socher
2017-05-12
Paper reading @ Matsuo Lab, M1 田村浩一郎
Agenda
1. Information
2. Introduction
3. Proposed Model
4. Experiment & result
5. Conclusion
1. Information
• Authors
- James Bradbury∗, Stephen Merity∗, Caiming Xiong & Richard Socher
- a group at Salesforce Research
• Submission date
- Submitted on 5 Nov 2016 (v1), last revised 21 Nov 2016 (this version, v2)
• Venue
- ICLR 2017
- https://arxiv.org/abs/1611.01576
• About
- a model that handles sequential data in a CNN-like way (*not a simple
combination of a CNN and an RNN)
2. Introduction
• Problems with RNNs
- RNNs are the standard deep learning model for handling sequential data
1. They cannot be parallelized, so tasks with very long sequences
are hard to process
• computing the output h(t) requires first computing h(t-1)
2. Semantic interpretation is difficult*
• the same weight matrix W is applied recurrently, so the meaning
of the learned features is hard to interpret
[Figure: the recurrent update maps hidden states h1 ... hn at step t-1 to z1 ... zn
at step t through the shared weight matrix W; the ordering of the vector elements
carries no meaning, so the features cannot be interpreted]
2. Introduction
• Problems when handling sequential data with CNNs
- some work applies CNNs to sequential data with good accuracy, e.g. Fully Character-Level
Neural Machine Translation without Explicit Segmentation (Lee et al., 2016)
1. Convolution assumes time invariance, so not all past information
is reflected in the output
[Figure from: Fully Character-Level Neural Machine Translation without Explicit Segmentation;
only nearby information is reflected, making data with long sequence lengths
hard to process]
2. Introduction
• QRNN
- switching to convolutions makes parallel computation possible
- by using element-wise products and applying no weight matrix inside
the hidden state, the elements are kept independent (interpretability)
- a pooling layer reflects past information in an LSTM-like way
2. Introduction
• Three experiments were conducted
1. document-level sentiment classification
2. language modeling
3. character-level machine translation
• In every experiment, the QRNN matched or exceeded the accuracy of the LSTM
• Computation time per epoch was roughly 25-50% of the LSTM's
• Visualizing the hidden-layer activations suggests the model may be interpretable
3. Proposed Model
• The QRNN is built from convolutional layers and pooling layers, as in a CNN
• The input is a length-T sequence of n-dimensional vectors, X ∈ R^{T×n}
• Convolving along the time dimension with m filters yields Z
- take care not to convolve over future information (masked convolution); a sketch follows below
- Z ∈ R^{T×m}
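The following is a minimal numpy sketch of such a masked (causal) convolution; the helper name masked_conv1d and all shapes here are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def masked_conv1d(X, W):
    """Causal 1-D convolution along the time axis.

    X: (T, n) input sequence; W: (k, n, m) bank of m filters of width k.
    The output at step t depends only on x_{t-k+1}, ..., x_t.
    """
    T, n = X.shape
    k, _, m = W.shape
    # Left-pad with k-1 zero timesteps so no future input is visible.
    Xp = np.concatenate([np.zeros((k - 1, n)), X], axis=0)
    out = np.empty((T, m))
    for t in range(T):
        window = Xp[t:t + k]                        # x_{t-k+1}, ..., x_t
        out[t] = np.einsum('kn,knm->m', window, W)  # one output per filter
    return out

X = np.random.randn(8, 4)                           # T=8 timesteps, n=4 dims
Z = np.tanh(masked_conv1d(X, np.random.randn(2, 4, 5)))  # Z ∈ R^{T×m}, m=5
```

Because each output step depends only on the input, all T steps can be computed in parallel, unlike an RNN's hidden-state recurrence.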
3. Proposed Model
• The convolution computes the following three quantities, in correspondence with the LSTM
1. Z = tanh(W_z ∗ X)
2. F = σ(W_f ∗ X)
3. O = σ(W_o ∗ X)
- ∗ denotes masked convolution along the time dimension
• Understood in LSTM terms, the equations above become the following
- with a filter size of 2,
1. z_t = tanh(W_z^1 x_{t-1} + W_z^2 x_t) → the LSTM input
2. f_t = σ(W_f^1 x_{t-1} + W_f^2 x_t) → the LSTM forget gate
3. o_t = σ(W_o^1 x_{t-1} + W_o^2 x_t) → the LSTM output gate
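Continuing the sketch above (same hypothetical masked_conv1d and input X), the three gate sequences are just three masked convolutions and can all be computed in parallel over the timesteps:

```python
# Filter width k=2 gate computation; Wz, Wf, Wo are illustrative filter banks.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

Wz, Wf, Wo = (np.random.randn(2, 4, 5) for _ in range(3))
Z = np.tanh(masked_conv1d(X, Wz))      # candidate vectors (LSTM input)
F = sigmoid(masked_conv1d(X, Wf))      # forget gates      (LSTM forget)
O = sigmoid(masked_conv1d(X, Wo))      # output gates      (LSTM output)
```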
3. Proposed Model
• Pooling
- handled in an LSTM-like way
- three pooling variants are proposed (sketched in code after this list)
1. f-pooling
h_t = f_t ⊙ h_{t-1} + (1 − f_t) ⊙ z_t
2. fo-pooling
c_t = f_t ⊙ c_{t-1} + (1 − f_t) ⊙ z_t
h_t = o_t ⊙ c_t
3. ifo-pooling
c_t = f_t ⊙ c_{t-1} + i_t ⊙ z_t
h_t = o_t ⊙ c_t
(⊙ denotes the element-wise product)
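A minimal numpy sketch of the pooling recurrence, consuming the Z, F, O computed above (ifo-pooling would additionally need an input-gate sequence I):

```python
def fo_pooling(Z, F, O):
    """fo-pooling: c_t = f_t*c_{t-1} + (1 - f_t)*z_t, then h_t = o_t*c_t.

    Only cheap element-wise products remain in the sequential loop; all
    matrix work was already done in parallel by the convolutions.
    """
    T, m = Z.shape
    c = np.zeros(m)
    H = np.empty((T, m))
    for t in range(T):
        c = F[t] * c + (1.0 - F[t]) * Z[t]  # gated cell-state update
        H[t] = O[t] * c                     # expose the cell through o_t
    return H

H = fo_pooling(Z, F, O)  # f-pooling drops c and o_t and updates h_t directly
```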
3. Proposed Model
• Regularization
- zoneout, as used for LSTMs, is adopted as the regularizer
- it suffices to set F = 1 − dropout(1 − σ(W_f ∗ X)); see the sketch after this list
• Densely-connected layers
- for sequence classification, it works better to add skip connections
between the QRNN layers (jump connections, e.g. from t to t+d)
• Encoder-Decoder Models
- so that the QRNN can also be used for tasks such as translation,
it can serve as both the encoder and the decoder
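A minimal sketch of zoneout applied to the forget gate (not the paper's exact implementation): with probability p each channel's forget gate is forced to 1, so that channel simply copies its previous state for that timestep:

```python
def zoneout_forget(F, p=0.1, training=True):
    """F: sigmoid(Wf * X) of shape (T, m); returns the zoned-out gates."""
    if not training:
        return F
    drop = np.random.rand(*F.shape) < p  # channels to zone out at each step
    # F = 1 - dropout(1 - F): dropped entries of (1 - F) become 0, i.e.
    # the forget gate there becomes exactly 1 and the state is copied.
    return np.where(drop, 1.0, F)

F = zoneout_forget(F, p=0.1)
```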
4. Experiment & result
• The accuracy and computation time of the QRNN are evaluated in the following experiments
1. document-level sentiment classification
2. language modeling
3. character-level machine translation
4. Experiment & result
1. document-level sentiment classification
• Dataset: IMDb
- Input: movie review texts
- Label: binary classification into positive (25,000 samples) / negative (25,000 samples)
• Hyper-parameters
- 4 densely-connected layers, 256 units
- word vector dimension: 300
- dropout = 0.3, L2 = 4 * 10^-6
- minibatch = 24, RMSprop, lr = 0.001, α = 0.9, ε = 10^-8
4. Experiment & result
1. document-level sentiment classification
• Results
- accuracy comparable to the LSTM, but computation time greatly improved
• Visualization of hidden-layer activations
- because the hidden-state vector is kept element-wise independent,
analyzing the hidden layer becomes meaningful
[Figure: colors show activation values; the activations fade around
timesteps 120-160, where the review apparently contained many negative words]
4. Experiment & result
2. language modeling
• Dataset: Penn Treebank
- a standard corpus
- train: 929,000 words, validation: 73,000 words, test: 82,000 words
- word-level prediction is performed
- evaluated by perplexity (smaller is better; a worked example follows after this slide)
- perplexity = exp( Σ_x p(x) log(1/p(x)) )
• Hyper-parameters
- 2 layers, 640 units
- SGD + momentum, lr = [1 if n_epoch <= 6, else lr_{t-1} * 0.95]
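As a worked example of the metric (the probabilities below are made up for illustration, not the paper's results), perplexity is the exponential of the average per-word negative log-likelihood:

```python
import numpy as np

p_of_test_words = np.array([0.10, 0.05, 0.20, 0.01])  # model's prob. of each test word
perplexity = np.exp(-np.mean(np.log(p_of_test_words)))
print(round(perplexity, 1))  # ≈ 17.8; a uniform model over V words scores V
```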
4. Experiment & result
2. language modeling
• Results
- better perplexity than the LSTM
- the computation time stemming from the recurrence is reduced
4. Experiment & result
3. character-level machine translation
• Dataset: IWSLT German–English spoken-domain translation
- 209,772 sentence pairs from TEDx talks
- a sequence-to-sequence QRNN is evaluated
• Hyper-parameters
- 4 layers, 320 units
- Adam, 10 epochs
- first convolutional layer: filter size = 6; all other layers: filter size = 2
4. Experiment & result
3. character-level machine translation
• Results
- better accuracy than the character-level LSTM, at roughly 25% of its computation time
- accuracy roughly on par with a word-level attention model
5. Conclusion
• The QRNN is a model that takes in the strengths of both RNNs and CNNs
- parallelizable like a CNN
- reflects the influence of the whole sequence like an RNN
• The QRNN achieves accuracy equal to or better than existing methods,
including the LSTM, with faster training
• The QRNN has greater potential for interpretability
~References~
• Fully Character-Level Neural Machine Translation without Explicit
Segmentation (Jason Lee, Kyunghyun Cho, Thomas Hofmann, 2016)
https://arxiv.org/abs/1610.03017
*Image sources
• LSTMを超える期待の新星、QRNN (@icoxfog417, Qiita)
http://qiita.com/icoxfog417/items/d77912e10a7c60ae680e
• [DL輪読会]QUASI-RECURRENT NEURAL NETWORKS (DeepLearningJP2016, SlideShare)
https://www.slideshare.net/DeepLearningJP2016/dlquasirecurrent-neural-networks