Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning.
Rodrigo Toro Icarte, Toryn Q. Klassen, Richard Anthony Valenzano, Sheila A. McIlraith
Published in: CoRR (2020)
Keyphrases
- reward function
- reinforcement learning
- reinforcement learning algorithms
- markov decision processes
- state space
- optimal policy
- partially observable
- inverse reinforcement learning
- multiple agents
- policy search
- transition model
- markov decision process
- model free
- learning algorithm
- initially unknown
- hierarchical reinforcement learning
- state variables
- transition probabilities
- function approximation
- average reward
- markov decision problems
- evaluation function
- action space