Reinforcement learning for DEC-MDPs with changing action sets and partially ordered dependencies.
Thomas Gabel, Martin A. Riedmiller
Published in: AAMAS (3) (2008)
Keyphrases
- action sets
- partially ordered
- reinforcement learning
- dec-mdps
- partial order
- markov decision processes
- state space
- totally ordered
- function approximation
- finite state
- event calculus
- learning algorithm
- machine learning
- total order
- optimal control
- temporal difference
- markov decision process
- incomplete information
- optimal policy
- dynamic programming
- policy iteration
- batch mode