Keyphrases
- reward function
- inverse reinforcement learning
- partially observable environments
- reinforcement learning
- average reward
- expected reward
- optimal policy
- policy gradient
- total reward
- infinite horizon
- control policy
- database
- website
- eligibility traces
- agent receives
- discounted reward
- control policies
- state action
- access control
- state space
- dynamic programming
- data sets