Playing 20 Question Game with Policy-Based Reinforcement Learning.
Huang Hu, Xianchao Wu, Bingfeng Luo, Chongyang Tao, Can Xu, Wei Wu, Zhan Chen
Published in: CoRR (2018)
Keyphrases
- reinforcement learning
- game playing
- optimal policy
- computer games
- temporal difference learning
- policy search
- agent learns
- card game
- markov decision process
- markov games
- board game
- reinforcement learning algorithms
- action selection
- imperfect information
- pac man
- online game
- multi player
- reward function
- markov decision processes
- game players
- markov decision problems
- reinforcement learning problems
- state and action spaces
- games played
- partially observable environments
- policy gradient
- control policies
- video games
- partially observable
- actor critic
- control policy
- policy iteration
- human players
- function approximators
- state space
- action space
- optimal control
- partially observable domains
- continuous state spaces
- game tree search
- transition model
- decision problems
- game design
- infinite horizon
- policy evaluation
- game theory
- nash equilibrium
- game based learning
- game play
- rl algorithms
- educational games
- state action
- game tree
- temporal difference
- learning algorithm
- multi agent
- dynamic programming
- average reward
- stochastic games
- function approximation
- reinforcement learning methods
- partially observable markov decision processes
- two player games
- minimax search
- game theoretic
- computer poker