TeaMs-RL: Teaching LLMs to Teach Themselves Better Instructions via Reinforcement Learning.
Shangding Gu, Alois Knoll, Ming Jin
Published in: CoRR (2024)
Keyphrases
- reinforcement learning
- computer programming
- learning process
- multi agent
- computer programs
- function approximation
- rl algorithms
- state space
- reinforcement learning algorithms
- temporal difference
- model free
- markov decision processes
- solve complex tasks
- cooperative
- team members
- learning algorithm
- control problems
- partially observable domains
- e learning
- learning environment
- machine learning
- virtual learning community
- elementary school students
- elementary school
- actor critic
- transfer learning
- problem based learning
- direct policy search
- robocup soccer
- computer networking
- autonomous learning
- distance learning
- temporal difference learning
- policy iteration
- markov decision process
- online learning
- partially observable
- multi agent reinforcement learning
- children learn
- action selection
- td learning
- high school
- computer technology
- continuous state and action spaces
- higher education
- reward function