Efficient large matrix operations in C MEX S-functions

In custom C MEX S-functions I need to perform operations on large three-dimensional matrices (on the order of 300k to 10M data points). The S-functions also have inputs and outputs of that size. Inside the C MEX S-functions, the 3D data is handled as flat vectors, which therefore require explicit indexing.
Currently, I'm using three nested for-loops (one per dimension) to access the 3D matrix data and to write output data, one data point at a time. Is there a way, or are there recommended practices, to increase computational performance in this case? I'm thinking of matrix multiplication or similar operations. The nested for-loops seem to yield poor computational efficiency, even though C MEX S-functions are compiled before the Simulink model is executed.
  2 Comments
James Tursa 2023-10-17
Writing text or binary? What will be reading this output data? How is writing output data related to matrix multiplication or similar operations?
v k 2023-10-19
Edited: v k 2023-10-19
I'm not sure what you mean. I'm writing C code and using the MinGW64 compiler to build the MEX functions. The result is a block in Simulink. Its outputs are read by other C MEX S-functions and standard Simulink blocks.
My question consists of basically two similar aspects. The first is about handling large matrices/vectors within the C MEX S-functions; this is where I am looking for a more efficient way than running through nested for-loops.
The second aspect is about writing outputs faster, because this is also done by running nested for-loops.
Currently, something like this is implemented:
// 'my_3D_data1' and 'my_3D_data2' are 3D matrices of size A_MAX x B_MAX x C_MAX and type real_T.
// In the C function they are handled as vectors, which is why I need this kind of indexing.
// 'my_output' is a 3D output of the function of size A_MAX x B_MAX x C_MAX.
// In the C function it is also handled as a vector, which is why I need this kind of indexing.
// nested for loops for proper indexing
for (a = 0; a < A_MAX; a++) {          // loop all entries in dim a
    for (b = 0; b < B_MAX; b++) {      // loop all entries in dim b
        for (c = 0; c < C_MAX; c++) {  // loop all entries in dim c
            n = a + b*A_MAX + c*A_MAX*B_MAX;                     // determine the vector index n corresponding to the 3D matrix index
            my_3D_data2[n] = my_3D_data1[n] + some_scalar_data;  // perform math operations and write result into index n of the 3D matrix
            my_output[n] = my_3D_data2[n];                       // write data into index n of the 3D output
        }
    }
}
So in both cases, manipulating/writing data into a 3D matrix (which is handled as a vector) as well as writing data into the output (which is also handled as a vector), I use nested loops.
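For an element-wise operation like the one above, the index n visits every value from 0 to A_MAX*B_MAX*C_MAX - 1 exactly once, so a single loop over the flat vector should give the same result. The sketch below illustrates this under some assumptions: real_T is taken to be double, and the function names and parameters are hypothetical, not part of the original S-function. It also shows a 3D variant with dimension a in the innermost loop, because with n = a + b*A_MAX + c*A_MAX*B_MAX that is the index that moves through memory contiguously.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: one pass over the flat vector, no 3D index needed.
 * Assumes real_T is double and that all buffers hold `count` elements. */
void add_scalar_flat(const double *in, double *work, double *out,
                     size_t count, double scalar)
{
    size_t n;
    assert(in != NULL && work != NULL && out != NULL);
    for (n = 0; n < count; n++) {
        work[n] = in[n] + scalar;  /* same math as the nested version */
        out[n]  = work[n];         /* write the output in the same pass */
    }
}

/* Hypothetical helper: keeps explicit 3D indices, but makes dimension a
 * the innermost loop so consecutive iterations touch adjacent memory. */
void add_scalar_3d(const double *in, double *out,
                   size_t a_max, size_t b_max, size_t c_max, double scalar)
{
    size_t a, b, c;
    assert(in != NULL && out != NULL);
    for (c = 0; c < c_max; c++) {
        for (b = 0; b < b_max; b++) {
            /* offset of element (0, b, c) in the flat vector */
            size_t base = a_max * (b + b_max * c);
            for (a = 0; a < a_max; a++) {
                out[base + a] = in[base + a] + scalar;
            }
        }
    }
}
```

Whether this is measurably faster in a given model depends on the compiler flags and the hardware, so it is worth timing both variants rather than assuming a gain.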
I would prefer something like this, omitting the loops. But it does not seem to work at all (there are no compile errors, but it looks as if the operation is simply skipped).
my_3D_data2 = my_3D_data1 + some_scalar_data; // perform math operations and write result into 3D matrix
my_output = my_3D_data2; // write data into the 3D output
As I am not sure exactly how the C code is compiled, I cannot tell whether any optimization is applied. Nonetheless, I guess there is a way to improve the code and make it run faster.
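A possible explanation for the "skipped" behaviour, assuming the three names are pointers to the data (the usual situation for signals in C): an assignment such as my_output = my_3D_data2 only changes where the local pointer points and copies no elements, so the buffer that Simulink reads stays untouched. C has no whole-array arithmetic, so the addition still needs a loop, but the plain copy can be done with memcpy from the standard library. The sketch below uses hypothetical function names and takes real_T to be double.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies `count` doubles from src to dst.
 * The two buffers must not overlap (a requirement of memcpy). */
void copy_block(double *dst, const double *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, count * sizeof *dst);
}

/* Demonstrates that assigning one pointer to another copies no data.
 * Returns 1 if the destination buffer is still unchanged afterwards. */
int pointer_assignment_copies_nothing(void)
{
    double src[3] = {1.0, 2.0, 3.0};
    double dst[3] = {0.0, 0.0, 0.0};
    double *out = dst;

    out = src;  /* re-points the local pointer only; dst is not written */

    return out == src && dst[0] == 0.0 && dst[1] == 0.0 && dst[2] == 0.0;
}
```

In an S-function this would correspond to calling something like copy_block(my_output, my_3D_data2, A_MAX*B_MAX*C_MAX) once, instead of assigning the pointers.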


Answers (1)

Dheeraj 2023-10-26
Hi,
I understand you are trying to improve the performance of your C MEX S-functions when working with large three-dimensional matrices. There are a few things you can try.
  1. Avoid nested for loops where you can: Nested for loops can be inefficient for large matrices, especially when each iteration recomputes an index and processes a single element. Instead, try to use vectorized operations whenever possible. For example, instead of using nested for loops to multiply two matrices, you could call a routine from a library such as BLAS or LAPACK.
  2. Use efficient data structures: When working with large matrices, it is important to use data structures that are efficient for both memory usage and computation. For example, instead of storing your matrices as dense arrays, you could use a sparse matrix format. Sparse matrix formats are particularly efficient for matrices that consist mostly of zeros.
  3. Use multi-threading: Since you use nested loops for both reading and writing data, you could use multi-threading, provided there are no race conditions on the matrices.
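As a rough illustration of the multi-threading idea, here is a minimal sketch using OpenMP. It assumes the compiler is invoked with OpenMP enabled (for GCC and MinGW that is the -fopenmp flag); without it the pragma is skipped and the loop simply runs serially. The function name and parameters are hypothetical, real_T is taken to be double, and whether threading pays off inside a particular S-function has not been tested here. Each iteration writes only its own element, so there is no race condition between threads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: adds a scalar to every element of a flat vector.
 * The loop index is a signed integer because older OpenMP versions
 * require that for parallel loops. */
void add_scalar_parallel(const double *in, double *out,
                         long count, double scalar)
{
    long n;
    assert(in != NULL && out != NULL && count >= 0);

#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (n = 0; n < count; n++) {
        out[n] = in[n] + scalar;  /* iteration n touches only element n */
    }
}
```

For a simple addition like this the loop is likely limited by memory bandwidth, so the gain from extra threads may be small; timing it on the real data sizes is the safest way to find out.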
Hope this Helps!

Release

R2017b
