Open-Ended Reinforcement Learning with Neural Reward Functions.
Robert Meier, Asier Mujika
Published in: CoRR (2022)
Keyphrases
- open ended
- reward function
- reinforcement learning
- reinforcement learning algorithms
- markov decision processes
- fitted q iteration
- policy search
- state space
- learning outcomes
- partially observable
- inverse reinforcement learning
- optimal policy
- markov decision process
- multi agent
- machine learning
- state variables
- function approximation
- transition probabilities
- transition model
- state action
- multiple agents
- initially unknown
- dynamic programming
- inquiry learning
- learning algorithm
- markov decision problems
- average reward
- partially observable markov decision processes
- temporal difference
- action selection
- generative model
- learning environment