SampleViz: Concept based Sampling for Policy Refinement in Deep Reinforcement Learning.
Zhaohui Liang, Guan Li, Ruiqi Gu, Yang Wang, Guihua Shan
Published in: PacificVis (2024)
Keyphrases
- reinforcement learning
- optimal policy
- approximate policy iteration
- policy search
- action selection
- markov decision process
- markov decision problems
- policy iteration
- control policy
- partially observable
- reward function
- function approximation
- partially observable environments
- approximate dynamic programming
- markov decision processes
- state space
- action space
- model free
- function approximators
- state and action spaces
- dynamic programming
- policy gradient
- control policies
- rl algorithms
- machine learning
- policy evaluation
- random sampling
- reinforcement learning problems
- model free reinforcement learning
- reinforcement learning algorithms
- actor critic
- partially observable markov decision processes
- optimal control
- continuous state spaces
- learning algorithm
- sampling strategy
- text categorization
- sampling algorithm
- temporal difference
- average reward
- multi agent
- reinforcement learning methods
- control problems
- long run
- finite state
- transition model
- infinite horizon
- monte carlo
- transfer learning
- decision problems
- video retrieval
- continuous state
- text clustering
- state dependent
- state action