How can I efficiently save and access large arrays generated in nested loops?

I need to run nested for-loops over the variables J1 and J2. The range for J1 is 1 to 41, and the range for J2 is 1 to 9. Inside these loops, I evaluate 16 functions, each of which returns an array of complex numbers with a size of 500 by 502.
I used the method below to save the data, and it produced an 11 GB file, which seems very large. Is this normal? What is an efficient way to save this data at the end of the calculation?
What I want to do with this data afterward:
I will need to access the 16 arrays, A1 to A16, within the same J1 and J2 loop to perform other operations. Therefore, I want to store the data in a way that allows easy access to these 16 arrays within the loops.
My method to store data:
all_data = cell(41,9);
for J1 = 1:41
    for J2 = 1:9
        % evaluate 16 functions to get 16 arrays (A1 to A16) of size 500 x 502:
        all_data{J1,J2} = struct("A1", A1,...
            "A2", A2,...
            "A3", A3,...
            "A4", A4,...
            "A5", A5,...
            "A6", A6,...
            "A7", A7,...
            "A8", A8,...
            "A9", A9,...
            "A10", A10,...
            "A11", A11,...
            "A12", A12,...
            "A13", A13,...
            "A14", A14,...
            "A15", A15,...
            "A16", A16);
    end
end
save('Saved_Data.mat','-v7.3');

Accepted Answer

Matt J, 2024-8-22 (edited 2024-8-22)
Quoting the question: "... it produced an 11 GB file, which seems very large."
The memory consumption is about right if you are using double floats,
numGB=prod([500,502,16, 41,9])*8/2^30
numGB = 11.0410
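As a hedged side calculation: the question describes the arrays as complex, and a complex double in MATLAB occupies 16 bytes per element (8 for the real part, 8 for the imaginary part). Under that assumption the raw in-memory size would be about twice the figure above, while complex single would match it. A minimal sketch of the comparison (the variable names are illustrative):

```matlab
% Total number of elements: 500 x 502 per array, 16 arrays, 41 x 9 loop steps
numElements = prod([500, 502, 16, 41, 9]);

% Bytes per element for each storage type
bytesRealDouble    = 8;
bytesComplexDouble = 16;   % real part + imaginary part, 8 bytes each
bytesComplexSingle = 8;    % real part + imaginary part, 4 bytes each

gbRealDouble    = numElements * bytesRealDouble    / 2^30;
gbComplexDouble = numElements * bytesComplexDouble / 2^30;
gbComplexSingle = numElements * bytesComplexSingle / 2^30;

fprintf('real double:    %.2f GB\n', gbRealDouble);
fprintf('complex double: %.2f GB\n', gbComplexDouble);
fprintf('complex single: %.2f GB\n', gbComplexSingle);
```

These are in-memory sizes. A -v7.3 MAT-file is compressed by default, so the size on disk can differ from them.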
In terms of RAM access, it would probably be faster to organize it as a multidimensional array, as below, and as single floats if you don't need double precision.
all_data = zeros(500,502,16,41,9,"single");   % preallocate once, before the loops
for J2 = 1:9
    for J1 = 1:41
        for J3 = 1:16   % evaluate 16 functions func{J3}
            all_data(:,:,J3,J1,J2) = func{J3}(___);
        end
    end
end
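A minimal sketch of how an array laid out this way could be saved and then read back one (J1,J2) block at a time. It assumes the data is stored under the name all_data in a -v7.3 MAT-file and uses matfile for partial loading. The small dimensions, the temporary file name, and the random placeholder values are illustrative stand-ins, not the original computation.

```matlab
% Scaled-down stand-in sizes (the thread uses 500 x 502 x 16 x 41 x 9)
nRow = 5; nCol = 6; nFunc = 16; nJ1 = 4; nJ2 = 3;

% Preallocate as complex single, then fill with placeholder values
all_data = complex(zeros(nRow, nCol, nFunc, nJ1, nJ2, "single"));
for J2 = 1:nJ2
    for J1 = 1:nJ1
        for J3 = 1:nFunc
            % Placeholder for the real function evaluation
            all_data(:,:,J3,J1,J2) = complex(rand(nRow, nCol, "single"), ...
                                             rand(nRow, nCol, "single"));
        end
    end
end

% Save only this variable rather than the whole workspace
fileName = [tempname, '.mat'];   % hypothetical temporary file
save(fileName, 'all_data', '-v7.3');

% Later: read a single (J1,J2) block from disk without loading everything
m = matfile(fileName);
J1 = 2; J2 = 3;
block = m.all_data(:, :, :, J1, J2);   % nRow x nCol x nFunc block
A7 = block(:, :, 7);                   % result of the 7th function
```

Passing the variable name to save matters here: calling save with only a file name writes every variable in the workspace, which would also include temporaries such as A1 to A16.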
3 Comments
Luqman Saleem, 2024-8-22 (edited 2024-8-22)
Alright, I tried saving the data in 41*9=369 folders with 16 csv files each. The total size of all the files combined is again 12 GB.
Matt J, 2024-8-22 (edited 2024-8-22)
Yes, I don't think you can hope for much compression on disk, unless perhaps the data is sparse or consists of integers?


More Answers (1)

Walter Roberson, 2024-8-22
all_data{J1,J2} = struct("A1", A1,...
You are creating a separate struct for each {J1,J2}, complete with all of the struct overhead. It would be more efficient if you use
all_data(J1,J2) = struct("A1", A1,...
so as to create a struct array. Struct arrays have lower overhead than creating a separate struct for each case.
You will need to initialize all_data differently. I suggest,
clear all_data
for J1 = 41:-1:1
    for J2 = 9:-1:1
        % evaluate 16 functions to get 16 arrays (A1 to A16) of size 500 x 502:
        all_data(J1,J2) = struct("A1", A1,...
Counting backwards like this has the side effect of initializing the struct array to its largest size on the first assignment; the remaining iterations then fill in the pieces. This approach avoids growing the struct array dynamically.
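A small self-contained sketch of the struct-array idea above, assuming only two fields and tiny placeholder arrays in place of the sixteen 500 x 502 results. The sizes and the random values are made up for illustration.

```matlab
clear all_data
nJ1 = 4; nJ2 = 3;   % stand-ins for 41 and 9

for J1 = nJ1:-1:1
    for J2 = nJ2:-1:1
        % Placeholder results (the real code would evaluate the 16 functions)
        A1 = complex(rand(2, 3), rand(2, 3));
        A2 = complex(rand(2, 3), rand(2, 3));
        % The first assignment, at (nJ1,nJ2), sizes the struct array in one step
        all_data(J1, J2) = struct("A1", A1, "A2", A2);
    end
end

% Accessing the stored arrays inside a later loop over the same indices
for J1 = 1:nJ1
    for J2 = 1:nJ2
        B = all_data(J1, J2).A1 + all_data(J1, J2).A2;
    end
end
```

With this layout each field is reached as all_data(J1,J2).A1 rather than all_data{J1,J2}.A1, so later loops need only that small change in indexing.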


Release: R2024a
