Discovering an Aid Policy to Minimize Student Evasion Using Offline Reinforcement Learning.
Leandro M. de Lima, Renato A. Krohling
Published in: IJCNN (2021)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- Markov decision process
- state space
- action selection
- learning process
- partially observable environments
- Markov decision processes
- actor critic
- policy iteration
- reinforcement learning problems
- approximate dynamic programming
- reinforcement learning algorithms
- control policy
- action space
- student learning
- partially observable
- function approximation
- control policies
- state and action spaces
- intelligent tutoring systems
- learning environment
- dynamic programming
- policy evaluation
- policy gradient
- continuous state spaces
- decision problems
- function approximators
- reward function
- Markov decision problems
- partially observable Markov decision processes
- average reward
- rl algorithms
- high school students
- real time
- optimal control
- state action
- student model
- state dependent
- infinite horizon
- model free
- learning styles
- machine learning
- continuous state
- e-learning
- policy makers
- countermeasures
- inverse reinforcement learning
- finite state
- tutoring system
- policy gradient methods