Braxlines: Fast and Interactive Toolkit for RL-driven Behavior Engineering beyond Reward Maximization.
Shixiang Shane Gu, Manfred Diaz, C. Daniel Freeman, Hiroki Furuta, Seyed Kamyar Seyed Ghasemipour, Anton Raichuk, Byron David, Erik Frey, Erwin Coumans, Olivier Bachem
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- data driven
- human behavior
- engineering design
- artificial intelligence
- state space
- optimal policy
- engineering problems
- function approximation
- average reward
- complex domains
- total reward
- learning agent
- policy iteration
- real robot
- reinforcement learning algorithms
- model free
- long run
- Markov decision processes
- mobile robot