Policy Invariance under Reward Transformations for General-Sum Stochastic Games.
Xiaosong Lu, Howard M. Schwartz, Sidney Nascimento Givigi
Published in: J. Artif. Intell. Res. (2011)
Keyphrases
- partially observable environments
- inverse reinforcement learning
- average reward
- reward function
- image transformations
- optimal policy
- reinforcement learning
- total reward
- policy gradient
- invariant representations
- Markov decision processes
- expected reward
- long run
- machine learning
- discriminative power
- average cost
- control policy
- agent receives
- finite state
- object recognition
- preference elicitation
- state space
- policy makers
- invariant features