Offline Reinforcement Learning Hands-On.
Louis Monier, Jakub Kmec, Alexandre Laterre, Thomas Pierrot, Valentin Courgeau, Olivier Sigaud, Karim Beguir
Published in: CoRR (2020)
Keyphrases
- reinforcement learning
- function approximation
- reinforcement learning algorithms
- real time
- real life
- state space
- optimal policy
- learning algorithm
- temporal difference
- machine learning
- stochastic approximation
- action selection
- model free
- optimal control
- Markov decision processes
- key concepts
- dynamic programming
- hidden Markov models
- evolutionary algorithm
- multi agent
- temporal difference learning
- decision making
- world class
- direct policy search