Incrementality Bidding via Reinforcement Learning under Mixed and Delayed Rewards.
Ashwinkumar Badanidiyuru, Zhe Feng, Tianxi Li, Haifeng Xu
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- Markov decision processes
- function approximation
- learning algorithm
- reinforcement learning algorithms
- state space
- multi agent
- optimal policy
- model free
- machine learning
- reward shaping
- online auctions
- dynamic programming
- temporal difference
- learning problems
- transfer learning
- multi issue
- control policy
- learning classifier systems
- reward function
- complex domains
- bidding strategies
- hidden state
- language generation
- optimal control
- learning process
- partially observable
- policy iteration
- action space
- electronic marketplaces
- combinatorial auctions
- policy search