Identifying Reward Functions using Anchor Actions.
Sinong Geng, Houssam Nassif, Carlos A. Manzanares, A. Max Reppen, Ronnie Sircar
Published in: CoRR (2020)
Keyphrases
- reward function
- reinforcement learning
- Markov decision processes
- multiple agents
- state space
- reinforcement learning algorithms
- partially observable
- inverse reinforcement learning
- optimal policy
- initially unknown
- simple examples
- state variables
- policy search
- Markov decision process
- state action
- transition probabilities
- multi-agent
- machine learning
- data mining
- function approximation
- average reward
- generative model
- Markov chain
- higher order