Fast block distributed CUDA implementation of the Hungarian algorithm.
Paulo Alexandre Crisóstomo Lopes, Satyendra Singh Yadav, Aleksandar Ilic, Sarat Kumar Patra
Published in: J. Parallel Distributed Comput. (2019)
Keyphrases
- parallel implementation
- search space
- detection algorithm
- optimization algorithm
- theoretical analysis
- neural network
- objective function
- recognition algorithm
- dynamic programming
- experimental evaluation
- simulated annealing
- high accuracy
- times faster
- expectation maximization
- probabilistic model
- cost function
- general purpose
- worst case
- computational cost
- segmentation algorithm
- tree structure
- computational complexity
- real time
- k means
- preprocessing
- hardware implementation
- learning algorithm
- parallel computation
- gpu implementation
- gpu accelerated