Towards a combinatorial approach for undiscounted MDPs: student research abstract.
Vahid Hashemi
Published in: SAC (2016)
Keyphrases
- markov decision processes
- markov decision problems
- average reward
- policy iteration
- optimal policy
- reinforcement learning
- state space
- factored mdps
- learning environment
- finite state
- learning process
- knowledge level
- infinite horizon
- markov decision process
- student learning
- partially observable
- dynamic programming
- stochastic games
- learning styles
- high school students
- finite horizon
- average cost
- factored markov decision processes
- decision processes
- tutoring system
- reinforcement learning algorithms
- student model
- high level
- long run
- linear programming
- online course
- decision problems
- undergraduate students
- science education
- higher level
- dec pomdps
- stochastic shortest path