Using Bisimulation for Policy Transfer in MDPs.
Pablo Samuel Castro, Doina Precup
Published in: AAAI (2010)
Keyphrases
- optimal policy
- Markov decision processes
- Markov decision process
- Markov decision problems
- finite horizon
- policy search
- policy iteration
- average cost
- average reward
- reinforcement learning
- reward function
- partially observable
- infinite horizon
- dynamic programming
- state space
- state and action spaces
- finite state
- long run
- policy evaluation
- reinforcement learning problems
- reinforcement learning algorithms
- transfer learning
- multistage
- decision problems
- factored MDPs
- control policies
- linear programming
- decision processes
- knowledge transfer
- continuous state spaces
- cross domain
- inverse reinforcement learning
- decision theoretic planning
- initial state
- action selection
- function approximation
- decision diagrams
- probabilistic planning
- state dependent
- sufficient conditions
- least squares
- supply chain
- expected reward
- approximate policy iteration
- stochastic shortest path
- factored Markov decision processes