Counting Reward Automata: Sample Efficient Reinforcement Learning Through the Exploitation of Reward Function Structure.
Tristan Bester, Benjamin Rosman, Steven James, Geraud Nangue Tasse
Published in: CoRR (2023)
Keyphrases
- reward function
- reinforcement learning
- reinforcement learning algorithms
- markov decision processes
- state space
- optimal policy
- inverse reinforcement learning
- markov decision process
- policy search
- partially observable
- multiple agents
- state variables
- transition model
- transition probabilities
- finite state
- hierarchical reinforcement learning
- markov chain
- state action
- model free
- initially unknown
- learning agent
- reward shaping
- markov decision problems
- control policies
- average reward
- action space
- complex systems
- prior knowledge
- multi agent