PixL2R: Guiding Reinforcement Learning Using Natural Language by Mapping Pixels to Rewards.
Prasoon Goyal, Scott Niekum, Raymond J. Mooney
Published in: CoRL (2020)
Keyphrases
- reinforcement learning
- natural language
- Markov decision processes
- function approximation
- state space
- machine learning
- reinforcement learning algorithms
- learning algorithm
- optimal policy
- natural language processing
- input image
- knowledge representation
- image pixels
- natural language interface
- natural language generation
- partially observable
- hidden state
- reward shaping
- temporal difference
- semantic analysis
- language processing
- model free
- learning process
- multi agent
- pixel values
- question answering
- average reward
- learning classifier systems
- pixel wise
- action selection
- optimal control
- reward function
- transfer learning
- partially observable Markov decision processes
- action space
- control policy
- information extraction
- multi-armed bandit