Reinforcement learning


GDG Korea

Presentation materials from GDG DevFest Seoul 2016

Technology

© 2016. SNU CSE Biointelligence Lab., http://bi.snu.ac.kr
Introduction to Reinforcement Learning
Dong-Hyun Kwak (곽동현)
Biointelligence Laboratory, Seoul National University

Background
• Approximating the Q function of classical reinforcement learning with a DNN or CNN has recently become an active research area, led by Google DeepMind.
• Recent work has achieved remarkable results, playing Atari 2600 games and Go better than humans, and is being extended to 3D games and robot control problems.
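To make "approximating the Q function" concrete, here is a minimal sketch, not the method from the talk. It replaces the Q table with a parametric function Q(s, a) = w · φ(s, a) and applies the semi-gradient Q-learning update. The chain environment, the one-hot feature map, and the hyperparameters are all assumptions made for illustration. With one-hot features the linear model is equivalent to a table; a DNN or CNN would take the place of `q_value` and of the weight update.

```python
import random

N_STATES = 5        # hypothetical chain of states 0..4; state 4 is terminal
ACTIONS = (0, 1)    # 0 = move left, 1 = move right
GAMMA = 0.9         # assumed discount factor
ALPHA = 0.1         # assumed learning rate

def features(state, action):
    """One-hot feature vector for a (state, action) pair (an assumed encoding)."""
    phi = [0.0] * (N_STATES * len(ACTIONS))
    phi[state * len(ACTIONS) + action] = 1.0
    return phi

def q_value(weights, state, action):
    """Approximate Q(s, a) as a dot product of weights and features."""
    return sum(w * x for w, x in zip(weights, features(state, action)))

def step(state, action):
    """Deterministic chain dynamics: reward 1 on reaching the right end."""
    next_state = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    weights = [0.0] * (N_STATES * len(ACTIONS))
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Q-learning is off-policy, so a uniformly random behavior policy is enough here.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q_value(weights, next_state, a) for a in ACTIONS)
            td_error = reward + GAMMA * best_next - q_value(weights, state, action)
            # Semi-gradient step: for a linear model the gradient of Q is the feature vector.
            phi = features(state, action)
            weights = [w + ALPHA * td_error * x for w, x in zip(weights, phi)]
            state = next_state
    return weights

weights = train()
greedy = [max(ACTIONS, key=lambda a: q_value(weights, s, a)) for s in range(N_STATES - 1)]
```

After training, `greedy` holds the action with the highest approximate Q value in each non-terminal state.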

What is AI? ML?
https://www.linkedin.com/pulse/deep-dive-venture-landscape-ai-ajit-nazre-rahul-garg-nazre

Various Fields Related to ML
https://www.linkedin.com/pulse/how-exceed-your-goals-2016-dr-travis-bradberry-1

Various Algorithms in ML

Function Approximation
http://arxiv.org/pdf/1411.4555.pdf
https://people.mpi-inf.mpg.de/~kkim/supres/supres.htm

What is Deep Learning?

Machine Learning
• Supervised Learning :
y = f(x)
• Unsupervised Learning :
x ~ p(x), x = f(x)
• Reinforcement Learning :
??

Agent-Environment Interaction
• Objective : Maximize the expected sum of future rewards
• Algorithms
1) Planning : Dynamic Programming Based
2) Reinforcement Learning : Machine Learning Based
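The interaction loop and the planning approach can be sketched as follows. This is an illustrative example under stated assumptions, not code from the talk. `LineWorld` is a hypothetical environment, the discount factor `gamma` is an assumption (the slide only says "sum of future rewards"), and `value_iteration` stands in for the dynamic-programming family, which needs the environment model to be known in advance.

```python
class LineWorld:
    """Hypothetical environment: positions 0..4, reward 1 on reaching position 4."""

    def __init__(self):
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); positions are clipped to 0..4
        self.state = min(max(self.state + action, 0), 4)
        done = self.state == 4
        return self.state, (1.0 if done else 0.0), done

def run_episode(env, policy, gamma=1.0, max_steps=100):
    """Agent-environment loop: observe a state, act, receive a reward.

    Returns the (optionally discounted) sum of rewards, which is the
    quantity the agent tries to maximize in expectation.
    """
    state = env.reset()
    total, discount = 0.0, 1.0
    for _ in range(max_steps):
        action = policy(state)
        state, reward, done = env.step(action)
        total += discount * reward
        discount *= gamma
        if done:
            break
    return total

def value_iteration(gamma=0.9, sweeps=50):
    """Planning: dynamic programming over the known LineWorld model."""
    values = [0.0] * 5          # values[4] stays 0 because state 4 is terminal
    for _ in range(sweeps):
        new_values = values[:]
        for s in range(4):
            candidates = []
            for a in (-1, 1):
                s2 = min(max(s + a, 0), 4)
                r = 1.0 if s2 == 4 else 0.0
                candidates.append(r + gamma * values[s2])
            new_values[s] = max(candidates)
        values = new_values
    return values

always_right = lambda state: 1
episode_return = run_episode(LineWorld(), always_right, gamma=0.9)
state_values = value_iteration(gamma=0.9)
```

In this toy setting, planning reads the transition rules directly, whereas a reinforcement learning agent would have to estimate the same values only from the `(state, reward, done)` tuples returned by `step`.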

Example of Supervised Learning

Polynomial Curve Fitting
Trendline in Microsoft Excel 2007
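A trendline of this kind can be reproduced with an ordinary least-squares polynomial fit. The sketch below is one possible implementation (normal equations solved by Gaussian elimination, standard library only), and the data points are made up for illustration. In the supervised-learning view, the fitted polynomial is the learned function f in y = f(x).

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns c with y ~ c[0] + c[1]*x + c[2]*x**2 + ..."""
    n = degree + 1
    # Normal equations (A^T A) c = A^T y, where A is the Vandermonde matrix of xs.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    aug = [row + [b] for row, b in zip(ata, aty)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (aug[i][n] - tail) / aug[i][i]
    return coeffs

def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))

# Made-up training data generated from y = 1 + 2x + 3x^2.
xs = [0, 1, 2, 3, 4]
ys = [1, 6, 17, 34, 57]
coeffs = polyfit(xs, ys, degree=2)
```

Calling `predict(coeffs, x)` on an unseen `x` then plays the role of the trendline's extrapolation.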

Example of Unsupervised Learning

Clustering
http://www.frankichamaki.com/data-driven-market-segmentation-more-effective-marketing-to-segments-using-ai/
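As one concrete clustering algorithm, k-means can be sketched in a few lines. The slide does not say which algorithm its figure uses, so choosing k-means here is an assumption, and the 2D points are made up. No labels are given; the groups are found from the data alone.

```python
import random

def squared_distance(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means (Lloyd's algorithm) on 2D points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)      # initialize centers from the data
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster.
        centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Made-up data: two well-separated groups of points.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
centers, clusters = kmeans(points, k=2)
```

On data like this the two centers settle near the middle of each group; on less clearly separated data the result can depend on the initial centers.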

Example of Reinforcement Learning

Videos
• A crawling robot: a Q-learning example
https://www.youtube.com/watch?v=2iNrJx6IDEo
• Deep Reinforcement Learning for Robotic Manipulation
https://youtu.be/ZhsEKTo7V04?t=1m27s

THANK YOU

Editor's notes

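In the classical taxonomy of AI, ML was originally one small part. ML was one of the ways to implement AI, and deep learning sits inside ML as a very small portion. The trend has since shifted, and a large share of the areas proposed in AI are now studied through ML. The current trend is therefore close to AI = ML, but many people still research only classical AI, so saying this outright can get you into trouble.
Machine learning is a discipline that grew out of this vast range of fields. That makes it feel disorganized and difficult when you first study it, so early on, learning from good textbooks and seminars is essential.
Each algorithm finds the decision boundary in a different way.
In this way, the function f that produces an output for given input data is found through learning, rather than through explicit implementation.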