Text2Reward: Reward Shaping with Language Models for Reinforcement Learning.
Tianbao Xie, Siheng Zhao, Chen Henry Wu, Yitao Liu, Qian Luo, Victor Zhong, Yanchao Yang, Tao Yu
Published in: ICLR (2024)
Keyphrases
- language model
- reward shaping
- reinforcement learning
- information retrieval
- language modeling
- reinforcement learning algorithms
- complex domains
- n-gram
- language modelling
- retrieval model
- document retrieval
- text retrieval
- probabilistic model
- query expansion
- state space
- markov decision problems
- function approximation
- reward function
- test collection
- statistical language models
- text mining
- machine learning
- vector space model
- model free
- dynamic programming
- policy search
- keywords
- relevance model
- web documents
- learning algorithm
- markov decision processes
- optimal control
- semantic information
- temporal difference
- mobile robot
- policy gradient
- domain knowledge
- learning process