An Entropy Regularization Free Mechanism for Policy-based Reinforcement Learning.
Changnan Xiao, Haosen Shi, Jiajun Fan, Shihong Deng
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- action selection
- Markov decision process
- control policy
- action space
- partially observable
- Markov decision processes
- function approximation
- policy iteration
- control policies
- partially observable environments
- approximate dynamic programming
- actor critic
- reinforcement learning algorithms
- reinforcement learning problems
- state space
- reward function
- model free
- policy evaluation
- state and action spaces
- control problems
- function approximators
- machine learning
- dynamic programming
- policy gradient
- state dependent
- learning algorithm
- information theoretic
- decision problems
- optimal control
- inverse reinforcement learning
- mutual information
- similarity measure
- multi agent
- state action
- access control
- average cost
- learning mechanism