Continuous Value Function Approximation for Sequential Bidding Policies.
Craig Boutilier
Moisés Goldszmidt
Bikash Sabata
Published in:
UAI (1999)
Keyphrases
state space
optimal policy
basis functions
special case
Markov decision process
neural network
Markov decision processes
approximation algorithms
online auctions
temporal difference
temporal difference learning
fitted q iteration