Hardware as Policy: Mechanical and Computational Co-Optimization using Deep Reinforcement Learning.
Tianjian Chen, Zhanpeng He, Matei T. Ciocarlie
Published in: CoRR (2020)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- computational power
- state space
- low cost
- mathematical programming
- Markov decision process
- optimization algorithm
- Markov decision processes
- action selection
- hardware and software
- optimization problems
- control policies
- reward function
- function approximation
- policy iteration
- partially observable
- partially observable environments
- policy gradient methods
- computer systems
- learning process
- global optimization
- machine learning
- reinforcement learning algorithms
- image processing
- approximate dynamic programming
- policy gradient
- function approximators
- control policy
- neural network
- optimization process
- Markov decision problems
- state dependent
- decision problems
- optimization method
- combinatorial optimization
- hardware implementation