gpucoder.stridedMatrixMultiply
Optimized GPU implementation of strided and batched matrix multiply operation
Description
D = gpucoder.stridedMatrixMultiply(A,B) performs strided batched matrix-matrix multiplication of a batch of matrices A and B. The input matrices A and B for each instance of the batch are located at fixed address offsets from their addresses in the previous instance. The gpucoder.stridedMatrixMultiply function performs matrix-matrix multiplication of the form:

$$D = \alpha A B$$

where $\alpha$ is a scalar multiplication factor, and A, B, and D are matrices with dimensions m-by-k, k-by-n, and m-by-n, respectively. You can optionally transpose or Hermitian-conjugate A and B. By default, $\alpha$ is set to one and the matrices are not transposed. To specify a different scalar multiplication factor and perform transpose operations on the input matrices, use the Name,Value pair arguments.
All the batches passed to the gpucoder.stridedMatrixMultiply function must be uniform. That is, all instances must have the same dimensions m, n, and k.
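The per-instance computation described above can be sketched in plain MATLAB as a loop over the batch. This is a reference illustration only, not the GPU implementation. It assumes each batch is stored as a 3-D array whose third dimension indexes the instances, so that consecutive instances sit a fixed number of elements apart in column-major memory. The names batchCount, alpha, Dref, strideA, and strideB are hypothetical and chosen for the example.

```matlab
% Reference sketch of D = alpha*A*B applied to each batch instance.
% Assumption: the third dimension of A and B indexes the instances.
m = 3; k = 4; n = 2; batchCount = 5;
alpha = 1;                          % default scalar factor

A = rand(m, k, batchCount);
B = rand(k, n, batchCount);

% Uniformity: every instance shares the same m, n, and k, and both
% inputs must hold the same number of instances.
assert(size(A, 2) == size(B, 1), 'Inner dimensions must agree.');
assert(size(A, 3) == size(B, 3), 'Batch counts must agree.');

Dref = zeros(m, n, batchCount);
for i = 1:batchCount
    Dref(:, :, i) = alpha * (A(:, :, i) * B(:, :, i));
end

% Offset between consecutive instances, in elements (column-major).
strideA = m * k;
strideB = k * n;
```

Because every instance has the same dimensions, the offsets strideA and strideB are constant across the batch, which is the property the strided form relies on.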
___ = gpucoder.stridedMatrixMultiply(___,Name,Value) performs the strided batched matrix multiply operation by using the options specified by one or more Name,Value pair arguments.
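To illustrate what scaling and transposition mean for each instance, the following plain MATLAB sketch computes D = alpha*op(A)*op(B) with op(A) taken as the Hermitian conjugate and op(B) left unchanged. The option names and accepted values of the function are not listed in this section, so the sketch uses ordinary MATLAB operators rather than the Name,Value arguments. The 3-D batch layout and all variable names are assumptions made for the example.

```matlab
% Reference sketch: D(:,:,i) = alpha * op(A(:,:,i)) * op(B(:,:,i)).
% Here op(A) is the Hermitian conjugate and op(B) is the identity.
k = 4; m = 3; n = 2; batchCount = 2;
alpha = 0.5;                        % non-default scalar factor

% A is stored as k-by-m so that its conjugate transpose is m-by-k.
A = complex(rand(k, m, batchCount), rand(k, m, batchCount));
B = complex(rand(k, n, batchCount), rand(k, n, batchCount));

Dref = zeros(m, n, batchCount);
for i = 1:batchCount
    Ai = A(:, :, i);
    % ' is the conjugate transpose; use .' for a plain transpose.
    Dref(:, :, i) = alpha * (Ai' * B(:, :, i));
end
```

For real-valued inputs the two operators give the same result, so the distinction only matters for complex data.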
Examples
Input Arguments
Name-Value Arguments
Output Arguments
Version History
Introduced in R2020a
See Also
Apps
Functions
- codegen
- coder.gpu.kernel
- coder.gpu.kernelfun
- gpucoder.stridedMatrixMultiplyAdd
- gpucoder.batchedMatrixMultiply
- gpucoder.batchedMatrixMultiplyAdd