Versatile Direct and Transpose Matrix Multiplication with Chained Operations: An Optimized Architecture Using Circulant Matrices.
Taras Iakymchuk, Alfredo Rosado Muñoz, Manuel Bataller-Mompeán, José Vicente Francés-Víllora, Emmanuel Ovie Osimiry
Published in: IEEE Trans. Computers (2016)