Efficient CUDA stream management for multi-DNN real-time inference on embedded GPUs.
Weiguang Pang, Xiantong Luo, Kailun Chen, Dong Ji, Lei Qiao, Wang Yi
Published in: J. Syst. Archit. (2023)
Keyphrases
- real time
- graphics hardware
- gpu implementation
- general purpose
- gpu accelerated
- information management
- management system
- low cost
- data streams
- decision support
- graphics processors
- information systems
- real time systems
- neural network
- probabilistic inference
- parallel architectures
- continuous stream
- limited memory
- quality of service
- sliding window
- knowledge management
- probabilistic model