Analyzing the Sensitivity to Policy-Value Decoupling in Deep Reinforcement Learning Generalization.
Nasik Muhammad Nafi, Raja Farrukh Ali, William H. Hsu
Published in: AAMAS (2023)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- action selection
- function approximators
- state space
- Markov decision process
- partially observable environments
- Markov decision processes
- input output
- reinforcement learning problems
- approximate dynamic programming
- control policy
- supervised learning
- policy evaluation
- state and action spaces
- control policies
- average reward
- policy iteration
- sensitivity analysis
- reward function
- temporal difference
- continuous state spaces
- reinforcement learning algorithms
- partially observable
- partially observable domains
- action space
- function approximation
- state action
- multi-agent
- dynamic programming
- decision problems
- Markov decision problems
- model-free
- machine learning
- control problems
- agent learns
- transfer learning
- model-free reinforcement learning
- inverse reinforcement learning
- actor-critic
- long run
- reinforcement learning methods
- temporal difference learning
- deep learning
- policy makers