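So, there are actually three gemmK kernels (corresponding to different
$\beta$ values), which perform the operations:
$C \leftarrow A^T B$ ($\beta = 0$),
$C \leftarrow A^T B + C$ ($\beta = 1$),
$C \leftarrow A^T B + \beta C$ (arbitrary $\beta$). All input arrays ($A$, $B$, $C$) are
column-major (they are still used as performance kernels for row-major
BLAS as well, so don't worry). Additionally, $A$ and $B$ are in block-major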
format, such that
$lda = ldb = K$.
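To make the operation concrete, here is a minimal reference sketch of what a gemmK-style kernel computes. It is not ATLAS's actual kernel or its calling sequence: the name `ref_gemmK` and its argument list are hypothetical, and it assumes that the blocks of A and B are stored contiguously in column-major order with leading dimension K, while C keeps its own leading dimension `ldc`.

```c
#include <stddef.h>

/*
 * Hypothetical reference routine (not part of ATLAS): computes
 *     C <- A^T * B + beta * C
 * Assumptions made for this sketch:
 *   - A is stored as a K x M column-major block with leading dimension K,
 *     so column i of A is row i of A^T;
 *   - B is stored as a K x N column-major block with leading dimension K;
 *   - C is M x N column-major with leading dimension ldc.
 */
static void ref_gemmK(int M, int N, int K, double beta,
                      const double *A, const double *B,
                      double *C, int ldc)
{
    for (int j = 0; j < N; j++) {
        for (int i = 0; i < M; i++) {
            /* Dot product of column i of A with column j of B. */
            double dot = 0.0;
            for (int k = 0; k < K; k++)
                dot += A[k + (size_t)i * K] * B[k + (size_t)j * K];

            double *c = &C[i + (size_t)j * ldc];
            if (beta == 0.0)
                *c = dot;              /* beta = 0: C is overwritten, never read */
            else if (beta == 1.0)
                *c += dot;             /* beta = 1: plain accumulation */
            else
                *c = dot + beta * *c;  /* arbitrary beta: scale, then add */
        }
    }
}
```

The three branches mirror the three kernel variants. Handling the two special values of beta separately skips a multiply per element of C, and in the beta = 0 case avoids reading C at all, which is presumably why separate kernels are worthwhile.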
Clint Whaley
2012-07-10