Solving MDPs with Unknown Rewards Using Nondominated Vector-Valued Functions.
Pegah Alizadeh, Yann Chevaleyre, François Lévy
Published in: STAIRS (2016)
Keyphrases
- Markov decision processes
- vector valued functions
- reinforcement learning
- semi-Markov decision processes
- transition matrices
- optimal policy
- state space
- Markov decision problems
- factored MDPs
- finite state
- reward function
- sequential decision making problems
- policy iteration
- algebraic decision diagrams
- decision theoretic planning
- dynamic programming
- finite horizon
- partially observable
- stochastic shortest path
- average cost
- reinforcement learning algorithms
- factored Markov decision processes
- combinatorial optimization