Approximate regret based elicitation in Markov decision process.
Pegah Alizadeh, Yann Chevaleyre, Jean-Daniel Zucker
Published in: RIVF (2015)
Keyphrases
- markov decision process
- reward function
- minimax regret
- state space
- markov decision processes
- reinforcement learning
- optimal policy
- finite horizon
- utility elicitation
- policy iteration
- infinite horizon
- markov games
- transition matrices
- initial state
- lower bound
- factored mdps
- transition probabilities
- multiple agents
- utility function
- average cost
- decision problems
- preference elicitation
- markov chain
- learning algorithm