gpucoder.stridedMatrixMultiply
Optimized GPU implementation of strided and batched matrix multiply operation
Since R2020a
Description
D = gpucoder.stridedMatrixMultiply(A,B) performs strided matrix-matrix multiplication of a batch of matrices. The input matrices A and B for each instance of the batch are located at fixed address offsets from their addresses in the previous instance. The gpucoder.stridedMatrixMultiply function performs matrix-matrix multiplication of the form:
D = αAB
where α is a scalar multiplication factor, and A, B, and D are matrices with dimensions m-by-k, k-by-n, and m-by-n, respectively. You can optionally transpose or Hermitian-conjugate A and B. By default, α is set to one and the matrices are not transposed. To specify a different scalar multiplication factor or to perform transpose operations on the input matrices, use the Name,Value pair arguments.
All the batches passed to the gpucoder.stridedMatrixMultiply function must be uniform. That is, all instances must have the same dimensions m, n, and k.
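The per-instance product described above can be sketched as a plain MATLAB reference loop. This is only an illustration of the math, not the GPU implementation. It assumes the batch instances are stacked along the third array dimension, and the sizes and variable names are illustrative.

```matlab
% Reference loop for the math only; this does not call GPU Coder.
% Assumption: the batch index is the third array dimension.
m = 5; k = 4; n = 3; batchSize = 10;   % illustrative sizes
A = rand(m, k, batchSize);
B = rand(k, n, batchSize);
alpha = 1;                             % default scalar factor

% Uniform batch: inner dimensions and batch counts must agree.
assert(size(A, 2) == size(B, 1), 'Inner dimensions must match.');
assert(size(A, 3) == size(B, 3), 'Batch sizes must match.');

D = zeros(m, n, batchSize);
for i = 1:batchSize
    % One m-by-k times k-by-n product per batch instance.
    D(:, :, i) = alpha * A(:, :, i) * B(:, :, i);
end

disp(size(D));
```

Because every instance shares the same m, n, and k, the output can be preallocated once as an m-by-n-by-batchSize array.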
___ = gpucoder.stridedMatrixMultiply(___,Name,Value) performs a strided batched matrix multiply operation by using the options specified by one or more Name,Value pair arguments.
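A minimal usage sketch of the Name,Value syntax follows. It requires GPU Coder. The option name 'alpha' is not listed in this section and is an assumption here, so check it against the Input Arguments. The array sizes, the third-dimension batch layout, and the entry-point name in the comment are likewise illustrative.

```matlab
% Assumption: batches are stacked along the third dimension.
A = rand(5, 4, 100);    % 100 instances of a 5-by-4 matrix
B = rand(4, 3, 100);    % 100 instances of a 4-by-3 matrix
alpha = 0.3;            % non-default scalar multiplication factor

% 'alpha' is assumed to be the name of the scalar-factor option.
D = gpucoder.stridedMatrixMultiply(A, B, 'alpha', alpha);

% To generate CUDA code, this call would sit inside an entry-point
% function (hypothetical name myStridedMatMul) compiled with, for example:
%   codegen -config coder.gpuConfig('mex') -args {A,B,alpha} myStridedMatMul
```

Each page D(:,:,i) should then equal alpha*A(:,:,i)*B(:,:,i) up to floating-point rounding.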
Examples
Input Arguments
Output Arguments
Version History
Introduced in R2020a
See Also
Apps
Functions
codegen | coder.gpu.kernel | coder.gpu.kernelfun | gpucoder.stridedMatrixMultiplyAdd | gpucoder.batchedMatrixMultiply | gpucoder.batchedMatrixMultiplyAdd