Learning to learn by gradient descent by gradient descent
citation: 9 -> 38 Katy, 2016/11/25@DataLab
NIPS 2016
Background
• learn:
1. a task
2. training experience
3. a performance measure
• A computer program is said to learn if its
performance at the task improves with experience
[Mitchell, 1993].
Background
• learning to learn:
1. a family of tasks
2. training experience for each of these tasks
3. a family of performance measures
• an algorithm is said to learn to learn if its
performance at each task improves with
experience and with the number of tasks.
Thrun, Sebastian, and Lorien Pratt, eds. Learning to learn. Springer Science & Business Media, 2012.
Background
• Frequently, tasks in machine learning can be
expressed as the problem of optimizing an
objective function defined over some domain
• The goal is to find the minimizer
• the standard approach for differentiable functions
is some form of gradient descent, resulting in a
sequence of updates
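Written out in standard notation, the minimizer and the gradient descent update mentioned above would read as follows. The symbols θ (parameters), Θ (domain), f (objective), and α_t (step size) are assumed here so that they agree with the later slides.

```latex
\theta^{*} = \arg\min_{\theta \in \Theta} f(\theta),
\qquad
\theta_{t+1} = \theta_t - \alpha_t \nabla f(\theta_t)
```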
Motivation
• Most modern work is based on designing update
rules for specific classes of problems; such rules
may perform poorly on other classes of problems
Motivation
• In this work we take a different tack and instead
propose to replace hand-designed update rules
with a learned update rule
Outline
• Related work
• Main idea
• Evaluation
• Conclusion
Related Work
• C. Daniel, J. Taylor, and S. Nowozin. Learning step
size controllers for robust neural network training. In
Association for the Advancement of Artificial
Intelligence, 2016.
Outline
• Related work or naïve methods
• Main idea
• Evaluation
• Conclusion
Learning to learn with RNN
• This work proposes to replace hand-designed
update rules with a learned update rule, called the
optimizer m (an LSTM), with its own parameters φ
• This results in updates to the optimizee f of the form
$\theta_{t+1} = \theta_t + g_t(\nabla f(\theta_t), \phi)$
• g_t is the output of the LSTM
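One way to read the update form is as an interface: any rule that maps a gradient and some internal state to an update g_t and a new state can be plugged into the same loop, with θ_{t+1} = θ_t + g_t. The sketch below illustrates this with two hand-designed rules (plain SGD and momentum); a learned LSTM would fill the same slot. The quadratic optimizee, the function names, and the step sizes are assumptions chosen for illustration and are not taken from the slides.

```python
import numpy as np

def f(theta):
    """Toy optimizee (assumed for illustration): a quadratic bowl."""
    return 0.5 * float(np.sum(theta ** 2))

def grad_f(theta):
    """Gradient of the quadratic above."""
    return theta

def sgd_rule(grad, state, lr=0.1):
    """Hand-designed rule without memory: g_t = -lr * grad."""
    return -lr * grad, state

def momentum_rule(grad, state, lr=0.1, beta=0.9):
    """Hand-designed rule with memory: the velocity acts like a hidden state."""
    velocity = beta * state - lr * grad
    return velocity, velocity

def run(update_rule, theta0, state0, steps):
    """Apply theta_{t+1} = theta_t + g_t for a fixed number of steps."""
    theta, state = theta0.copy(), state0
    for _ in range(steps):
        g, state = update_rule(grad_f(theta), state)
        theta = theta + g
    return theta

theta0 = np.array([1.0, -2.0])
theta_sgd = run(sgd_rule, theta0, None, steps=50)
theta_mom = run(momentum_rule, theta0, np.zeros_like(theta0), steps=50)
```

A learned optimizer keeps `run` unchanged and replaces `update_rule` with a parameterized function whose parameters φ are themselves trained.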
How to train the optimizer
• For training the optimizer, we have an objective that
depends on the trajectory for a time horizon T
• θ: the optimizee parameters
• φ: the optimizer parameters
• f: the function in question
• m: the LSTM
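Using the symbols listed above, a trajectory-dependent objective of this kind can be sketched as below. The weights w_t, the hidden state h_t, and the shorthand ∇_t = ∇_θ f(θ_t) are additional symbols assumed for this sketch.

```latex
\mathcal{L}(\phi) = \mathbb{E}_{f}\left[ \sum_{t=1}^{T} w_t \, f(\theta_t) \right],
\qquad
\theta_{t+1} = \theta_t + g_t,
\qquad
\begin{bmatrix} g_t \\ h_{t+1} \end{bmatrix} = m(\nabla_t, h_t, \phi)
```

With w_t = 1 for every t, the objective rewards optimizers that make progress at every step of the trajectory and not only at the final one.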
Intuition of Trajectory
[Figure: the old trajectory compared with the trajectory under a new φ]
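To make the dependence of the trajectory on φ concrete, here is a deliberately simplified sketch. It assumes the optimizer is a single learned log step size φ (not an LSTM), so that g_t = -exp(φ) ∇f(θ_t), and it estimates the meta-gradient with finite differences where the paper's setting would backpropagate through the unrolled steps. The quadratic optimizee, the horizon, and all constants are hypothetical.

```python
import numpy as np

# Toy optimizee (assumed): quadratic with a different curvature per coordinate.
CURV = np.array([1.0, 4.0])

def f(theta):
    return 0.5 * float(np.sum(CURV * theta ** 2))

def grad_f(theta):
    return CURV * theta

def trajectory(phi, theta0, T):
    """Unroll theta_{t+1} = theta_t + g_t with g_t = -exp(phi) * grad."""
    thetas = [theta0]
    for _ in range(T):
        g = -np.exp(phi) * grad_f(thetas[-1])
        thetas.append(thetas[-1] + g)
    return thetas

def meta_loss(phi, theta0, T):
    """Sum of f(theta_t) for t = 1..T, i.e. all weights w_t set to 1."""
    return sum(f(theta) for theta in trajectory(phi, theta0, T)[1:])

def meta_grad(phi, theta0, T, eps=1e-5):
    """Central finite difference, standing in for backprop through the unroll."""
    return (meta_loss(phi + eps, theta0, T) - meta_loss(phi - eps, theta0, T)) / (2 * eps)

theta0 = np.array([1.0, 1.0])
T = 20
phi_old = np.log(0.01)           # old optimizer: a very small step size
phi = phi_old
for _ in range(200):             # gradient descent on the optimizer parameter
    phi -= 0.05 * meta_grad(phi, theta0, T)
phi_new = phi

old_path = trajectory(phi_old, theta0, T)   # "old trajectory"
new_path = trajectory(phi_new, theta0, T)   # "trajectory with new phi"
```

Each meta-update changes φ, which changes every point on the unrolled path at once; this is the sense in which the objective depends on the whole trajectory.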
Challenge
• a single LSTM over all optimizee parameters would
need too many parameters
• solution?
Coordinatewise LSTM Optimizer
[Figure: coordinatewise LSTM optimizer producing the update g_t]
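The coordinatewise idea can be sketched as one small LSTM cell whose weights are shared by all coordinates, while every coordinate keeps its own hidden and cell state. In the NumPy sketch below the coordinates are simply the rows of a batch. The cell size, the random untrained weights, and the helper names are assumptions for illustration; the point is only that the parameter count does not grow with the number of optimizee parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(hidden, rng):
    """One shared parameter set (phi) for a tiny LSTM cell with scalar input."""
    return {
        "W": rng.normal(0.0, 0.1, size=(1 + hidden, 4 * hidden)),  # input+hidden -> 4 gates
        "b": np.zeros(4 * hidden),
        "w_out": rng.normal(0.0, 0.1, size=hidden),                 # hidden -> scalar update
    }

def lstm_step(params, grad, h, c):
    """One LSTM step applied to every coordinate (row) with the same weights."""
    x = np.concatenate([grad[:, None], h], axis=1)   # (n, 1 + hidden)
    gates = x @ params["W"] + params["b"]            # (n, 4 * hidden)
    i, f, o, g = np.split(gates, 4, axis=1)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new @ params["w_out"], h_new, c_new     # update g_t has shape (n,)

def first_update(params, grad, hidden):
    """Run one step from zero state; each coordinate has its own h and c."""
    zeros = np.zeros((grad.shape[0], hidden))
    update, _, _ = lstm_step(params, grad, zeros, zeros)
    return update

rng = np.random.default_rng(0)
HIDDEN = 4
params = init_params(HIDDEN, rng)
n_params = sum(p.size for p in params.values())   # independent of problem size

grad_small = rng.normal(size=3)       # optimizee with 3 parameters
grad_large = rng.normal(size=1000)    # optimizee with 1000 parameters
update_small = first_update(params, grad_small, HIDDEN)
update_large = first_update(params, grad_large, HIDDEN)
```

Because the same cell is applied to each coordinate separately, permuting the optimizee parameters simply permutes the updates.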
Information Sharing Between Coordinates
• Global averaging cells (GAC) designate a subset of
the cells in each LSTM layer for communication.
Their outgoing activations are averaged at each
step across all coordinates.
• This allows the different LSTMs to communicate with
each other
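The averaging step can be sketched in a few lines. The example assumes the per-coordinate activations are stored as an (n_coordinates, n_cells) array and that the first `num_shared` cells are the designated ones; both the layout and the function name are hypothetical.

```python
import numpy as np

def global_average(h, num_shared):
    """Replace the designated cells of every coordinate with their mean across coordinates."""
    h = h.copy()
    h[:, :num_shared] = h[:, :num_shared].mean(axis=0, keepdims=True)
    return h

# Three coordinates, three cells each; only the first cell is shared.
h = np.array([[1.0, 10.0, 0.5],
              [3.0, 20.0, 0.7],
              [5.0, 30.0, 0.9]])
h_shared = global_average(h, num_shared=1)
```

After the call, the first column holds the same value (the mean, 3.0) for every coordinate, while the remaining cells stay private to their coordinate.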
Outline
• Related work
• Main idea
• Evaluation
• Conclusion
Experiment 1
Experiment 2: change structure
Experiment 3: systematically changing NN architecture
LSTM trained on a one-hidden-layer, 20-unit NN
Experiment 4: convnet on CIFAR-10
LSTM-sub: trained only on the held-out dataset
Experiment 5: Neural Style
optimizer trained on only one style and 1800 content
images from ImageNet
Outline
• Related work or naïve methods
• Main idea
• Evaluation
• Conclusion
Conclusion
• So far, the learning process has been hand-crafted,
but this work shows how to train a NN with a NN
• generalizes well to different architectures but not to
different activation functions
• execution time?
• Sometimes, when you have been confused for a long
time, try emailing the authors (all of them). A typo can kill you.
