Calcul d'une politique déterministe dans un MDP avec récompenses imprécises (Computing a deterministic policy in an MDP with imprecise rewards).
Pegah Alizadeh, Aomar Osmani, Emiliano Traversi
Published in: EGC (2019)
Keyphrases
- Markov decision processes
- Markov decision process
- optimal policy
- state space
- reinforcement learning
- planning under uncertainty
- linear programming
- utility function
- action sets
- dynamic programming algorithms
- policy iteration
- finite state
- neural network
- probabilistic planning
- initial state
- decision problems
- search algorithm
- case study
- transition probabilities
- decision makers
- artificial intelligence
- data sets
- Bayesian reinforcement learning