Utilizing Prior Solutions for Reward Shaping and Composition in Entropy-Regularized Reinforcement Learning.
Jacob Adamczyk, Argenis Arriojas, Stas Tiomkin, Rahul V. Kulkarni
Published in: CoRR (2022)
Keyphrases
- reward shaping
- reinforcement learning
- reinforcement learning algorithms
- complex domains
- function approximation
- prior knowledge
- state space
- least squares
- machine learning
- optimal policy
- markov decision process
- model free
- learning process
- multi agent
- markov decision processes
- transfer learning
- decision theoretic
- optimal solution
- objective function
- markov decision problems
- learning algorithm