Speed up loading struct from file.
Hi,
I am looking for a way to speed up saving & loading ~8GB of data. Currently, it is all contained within one structure. The structure has a format similar to the code below - there is also some metadata at each level of the struct not shown here.
for subNum = 1:10                       % 10 subjects
    for trialNum = 1:50                 % 50 trials per subject
        for dataStreamNum = 1:50        % 50 data streams per trial
            dataMatrix = rand(3, 3000); % Each data stream is 3x3000
            structName.Subject(subNum).Trial(trialNum).Data(dataStreamNum).Matrix = dataMatrix; % Data in matrix form
        end
    end
end
I looked into matfile to be able to load just part of the structure, but found that matfile doesn't allow for accessing specific fields. This post made me start thinking about splitting up each trial into its own separate .mat file (in this example there would be 500 .mat files, each of which is a smaller struct). So, I have two questions in total:
1. Is there an alternative to matfile that would allow me to load just one trial at a time, from an 8GB struct, such as:
structName.Subject(4).Trial(15);
2. If there is no such alternative, and I call load() on 500 .mat files one at a time (8GB of data in total), would that be noticeably slower or faster than calling load() on a single 8GB .mat file?
Thank you!
1 Comment
Walter Roberson
2021-8-28
To save a .mat file larger than 2 GB, you have to use the -v7.3 flag, which causes the file to be written in HDF5 format. HDF5 is not all that efficient for struct arrays; it more or less requires that each array element be stored as a sub-dataset, with the struct array itself stored internally as an array of references to those sub-datasets.
Because of this, you might want to experiment to see what you can do with NetCDF 3 -- 3.6 and later has large file support. But beware that NetCDF 4 is HDF5 underneath...
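Since one large struct array in a single -v7.3 file is the costly case, the per-trial split raised in the question is one thing to try. The following is only a minimal sketch of that idea, not a benchmark: the output folder, the file naming scheme, and the reduced sizes are all assumptions made for illustration.

```matlab
% Sketch: write each trial to its own small .mat file, then load one back.
% outDir and the file naming scheme are hypothetical choices.
outDir = fullfile(tempdir, 'trialFilesDemo');
if ~exist(outDir, 'dir')
    mkdir(outDir);
end

nSubjects = 2; nTrials = 3; nStreams = 4;   % reduced sizes so the sketch runs quickly

for subNum = 1:nSubjects
    for trialNum = 1:nTrials
        trial = struct();                   % start each trial from an empty struct
        for dataStreamNum = 1:nStreams
            trial.Data(dataStreamNum).Matrix = rand(3, 30);
        end
        fileName = fullfile(outDir, sprintf('sub%02d_trial%03d.mat', subNum, trialNum));
        save(fileName, 'trial');            % one file per trial
    end
end

% Load a single trial without reading any of the other files.
s = load(fullfile(outDir, 'sub02_trial003.mat'));
oneTrial = s.trial;
```

Whether 500 small loads end up faster than one large load depends on the disk and the file format, so it is worth timing both with tic/toc on the real data.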
Answers (1)
Chunru
2021-8-28
It seems that you have very regular data. Instead of using a struct, you can simply use an N-D numeric array, which is faster and more efficient. With matfile, you can then easily load just a small portion of the data.
% for subNum = 1:10                       % 10 subjects
%     for trialNum = 1:50                 % 50 trials per subject
%         for dataStreamNum = 1:50        % 50 data streams per trial
%             dataMatrix = rand(3, 3000); % Each data stream is 3x3000
%             structName.Subject(subNum).Trial(trialNum).Data(dataStreamNum).Matrix = dataMatrix; % Data in matrix form
%         end
%     end
% end
Data = zeros(3, 3000, 50, 50, 10); % dimensions: 3 x 3000 samples, data stream, trial, subject
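A minimal sketch of how the N-D array could be combined with matfile to read one trial at a time. The file name and the reduced array sizes are assumptions for illustration; the dimension order (rows, samples, data stream, trial, subject) follows the array above.

```matlab
% Sketch: store the data as one N-D array and read back a single trial.
% fileName is a hypothetical location; sizes are reduced so this runs quickly.
fileName = fullfile(tempdir, 'ndDataDemo.mat');
if exist(fileName, 'file')
    delete(fileName);
end

nSamples = 30; nStreams = 4; nTrials = 3; nSubjects = 2;
Data = rand(3, nSamples, nStreams, nTrials, nSubjects);

% -v7.3 is required for variables over 2 GB and lets matfile
% read part of a variable without loading all of it.
save(fileName, 'Data', '-v7.3');

% Read all data streams of subject 2, trial 3 only.
m = matfile(fileName);
trialData = m.Data(:, :, :, 3, 2);   % 3 x nSamples x nStreams
```

Note that matfile needs a subscript for every dimension, so the colons for the first three dimensions are written out explicitly. Metadata that does not fit the regular array can be saved as separate, smaller variables in the same file.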