Contractual Reinforcement Learning: Pulling Arms with Invisible Hands.
Jibang Wu, Siyu Chen, Mengdi Wang, Huazheng Wang, Haifeng Xu
Published in: CoRR (2024)
Keyphrases
- reinforcement learning
- function approximation
- service providers
- temporal difference
- neural network
- multi-agent
- data sets
- multi-armed bandits
- multi-agent reinforcement learning
- learning agents
- partially observable
- reinforcement learning algorithms
- action selection
- Markov decision processes
- supervised learning
- dynamic programming
- learning algorithm
- hidden Markov models
- computer vision
- partially observable Markov decision processes
- temporal difference learning
- machine learning
- robotic control
- real time