Evolution of an Internal Reward Function for Reinforcement Learning.
Weiyi Zuo, Joachim Winther Pedersen, Sebastian Risi
Published in: GECCO Companion (2023)
Keyphrases
- reward function
- reinforcement learning
- reinforcement learning algorithms
- markov decision processes
- state space
- optimal policy
- partially observable
- inverse reinforcement learning
- policy search
- hierarchical reinforcement learning
- markov decision process
- multiple agents
- transition model
- learning agent
- function approximation
- average reward
- transition probabilities
- initially unknown
- model free
- temporal difference
- state variables
- state action
- markov decision problems
- pairwise
- learning algorithm
- data mining
- decision problems
- search algorithm
- machine learning