This talk briefly covers deep reinforcement learning on Spark and the benefits of using large-scale commodity compute with GPUs, both for running simulations easily and for distributed training, in use cases that aren't games, such as network intrusion and risk analysis. It also briefly mentions RL4J and our work with OpenAI Gym.
3. Overview
● Why am I up here?
● Reinforcement Learning
● Use cases
● Demo!
● Deep Reinforcement Learning
● RL4J
● DL4J
● Spark/RL - why?
4. Why am I up here?
Wrote this -->
Book Giveaway!
5. Reinforcement Learning
● Learn a “policy” through repeated trial and error
● An agent explores a search space
● Learns from rewards and penalties each time it takes a step
● Think of win/lose scenarios
● Rewards/punishment set by an “environment”
Credit: http://ai.berkeley.edu/reinforcement.html
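The agent/environment loop on this slide can be sketched in a few lines of plain Java. This is a minimal illustration, not code from the talk or from RL4J: it uses tabular Q-learning (a simple, non-deep technique) on a made-up five-cell corridor, and the class name, reward values, and hyperparameters are all assumptions chosen for the example.

```java
import java.util.Random;

/** Hypothetical toy example: tabular Q-learning on a five-cell corridor. */
final class CorridorQLearning {
    static final int N_STATES = 5;        // cells 0..4; cell 4 is the goal ("win")
    static final int LEFT = 0, RIGHT = 1; // the two actions

    /** Environment dynamics: move one cell, staying inside the corridor. */
    static int step(int state, int action) {
        int next = action == RIGHT ? state + 1 : state - 1;
        return Math.max(0, Math.min(N_STATES - 1, next));
    }

    /** The environment sets the reward: +1 at the goal, a small penalty per step otherwise. */
    static double reward(int nextState) {
        return nextState == N_STATES - 1 ? 1.0 : -0.01;
    }

    /** Best action for a state under the current value estimates. */
    static int greedy(double[][] q, int s) {
        return q[s][RIGHT] >= q[s][LEFT] ? RIGHT : LEFT;
    }

    /** Repeated trial and error: returns the learned table of action values. */
    static double[][] train(int episodes, long seed) {
        Random rng = new Random(seed);
        double[][] q = new double[N_STATES][2];
        double alpha = 0.1, gamma = 0.9, epsilon = 0.2; // assumed hyperparameters
        for (int e = 0; e < episodes; e++) {
            int s = 0;
            for (int t = 0; t < 100 && s != N_STATES - 1; t++) {
                // Explore with probability epsilon, otherwise exploit the current estimate.
                int a = rng.nextDouble() < epsilon ? rng.nextInt(2) : greedy(q, s);
                int s2 = step(s, a);
                double r = reward(s2);
                boolean done = s2 == N_STATES - 1;
                double target = r + (done ? 0.0 : gamma * Math.max(q[s2][LEFT], q[s2][RIGHT]));
                q[s][a] += alpha * (target - q[s][a]); // learn from the reward or penalty
                s = s2;
            }
        }
        return q;
    }

    public static void main(String[] args) {
        double[][] q = train(2000, 42L);
        for (int s = 0; s < N_STATES - 1; s++) {
            System.out.println("state " + s + " -> " + (greedy(q, s) == RIGHT ? "RIGHT" : "LEFT"));
        }
    }
}
```

The "policy" here is simply the greedy action in each state; the deep variants discussed later replace the table with a neural net.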
6. Use cases (not games!)
● Risk analysis (loans)
● Network Intrusion
● Learning patterns from simulations (MCMC)
8. Deep Reinforcement Learning
● Teach a neural net from an environment
● Policy determines gradient descent steps
● Most work has been based on raw frames from games (pixel input)
● Various techniques (A3C, Policy Gradients, Deep Q, ...)
● Core idea: the neural net has a softmax output (a probability distribution) mapped to the actions it can take in an environment
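The "softmax mapped to actions" idea can be made concrete with a small sketch. It assumes a network has already produced one raw score (logit) per action; the class name, the logit values, and the LEFT/STAY/RIGHT action labels are hypothetical, and no neural net library is involved.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch: turn network outputs into action probabilities and sample an action. */
final class SoftmaxPolicySketch {
    /** Numerically stable softmax: subtract the max logit before exponentiating. */
    static double[] softmax(double[] logits) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : logits) max = Math.max(max, v);
        double[] p = new double[logits.length];
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            p[i] = Math.exp(logits[i] - max);
            sum += p[i];
        }
        for (int i = 0; i < p.length; i++) p[i] /= sum;
        return p;
    }

    /** Samples an action index in proportion to its probability. */
    static int sample(double[] probs, Random rng) {
        double u = rng.nextDouble();
        double cumulative = 0.0;
        for (int i = 0; i < probs.length; i++) {
            cumulative += probs[i];
            if (u < cumulative) return i;
        }
        return probs.length - 1; // guard against floating-point rounding
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.0, 0.1}; // assumed outputs for LEFT, STAY, RIGHT
        double[] probs = softmax(logits);
        int action = sample(probs, new Random(0));
        System.out.println(Arrays.toString(probs) + " -> action index " + action);
    }
}
```

Sampling from the distribution, rather than always taking the most probable action, is what lets the agent keep exploring while it learns.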
9. RL4J
● Deep Reinforcement Learning library for Java
● OpenAI Gym integration
● Deep Reinforcement Learning with DL4J
● Implementations of A3C, Deep Q, Policy Gradients
● OpenAI Gym Java bindings
11. DL4J
● Import Keras models
● Focus on running in production
● Integrate with existing big data ecosystem
● Transparent usage of CPUs and GPUs
● End-to-end ecosystem for building data products (not just algorithms!)
12. Spark/RL Why?
● Spark is distributed compute
● A lot of simulations and environments to run
● Distributed workers running experiments in parallel
● Data Parallelism with neural nets
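The pattern of parallel workers plus data parallelism can be sketched without a cluster. The example below is an assumption-heavy stand-in: it uses Java parallel streams in place of Spark workers, a two-weight linear model in place of a neural net, and parameter averaging as one common data-parallel scheme (the slides do not say which scheme is used). All names and constants are hypothetical.

```java
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

/** Hypothetical sketch: workers simulate data in parallel, the driver averages their weights. */
final class ParameterAveragingSketch {
    /** One worker: start from the shared weights and run local SGD on its own simulated data. */
    static double[] workerStep(double[] shared, long seed, int samples, double lr) {
        Random rng = new Random(seed);
        double[] w = shared.clone(); // w[0] = slope, w[1] = intercept
        for (int i = 0; i < samples; i++) {
            double x = rng.nextDouble() * 2 - 1;                  // simulated input in [-1, 1)
            double y = 2.0 * x + 1.0 + 0.01 * rng.nextGaussian(); // simulated noisy target
            double err = w[0] * x + w[1] - y;
            w[0] -= lr * err * x;
            w[1] -= lr * err;
        }
        return w;
    }

    /** Driver: fan work out to the workers, then average the returned weights each round. */
    static double[] train(int rounds, int workers) {
        double[] shared = new double[2];
        for (int r = 0; r < rounds; r++) {
            final double[] current = shared;
            final int round = r;
            // On Spark this map would run over partitions of a distributed dataset.
            List<double[]> results = IntStream.range(0, workers).parallel()
                .mapToObj(k -> workerStep(current, 1000L * round + k, 200, 0.05))
                .collect(Collectors.toList());
            double[] avg = new double[2];
            for (double[] w : results) {
                avg[0] += w[0] / workers;
                avg[1] += w[1] / workers;
            }
            shared = avg;
        }
        return shared;
    }

    public static void main(String[] args) {
        double[] w = train(20, 4);
        System.out.println("slope=" + w[0] + " intercept=" + w[1]);
    }
}
```

Each worker only needs the current weights and a seed, which is why simulation-heavy workloads spread well across many machines.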
13. Summary
● Spark for orchestrating simulations
● Spark for distributed training
● Integrated storage with HDFS
● Orchestrate GPU-based Spark jobs
● Easy to hook into production (Java/Scala)
● Great streaming ecosystem for incremental updates