Tight Performance Guarantees of Imitator Policies with Continuous Actions.
Davide Maran, Alberto Maria Metelli, Marcello Restelli
Published in: CoRR (2022)
Keyphrases
- continuous action
- decision processes
- upper bound
- policy search
- initial state
- optimal policy
- temporally extended
- action space
- selective perception
- multiagent reinforcement learning
- lower bound
- reward function
- goal directed
- macro actions
- worst case
- reasoning process
- action selection
- state transitions
- management policies
- human activities
- action recognition
- sufficient conditions
- reinforcement learning
- fitted q iteration