inconsistent mat-file sizes

Christian 2020-2-21
Hi guys!
I'm facing a very weird issue and I have run out of ideas for how to identify the problem. So any help or hint is highly appreciated!
I have data from three different data acquisition systems on my test bench. In one of my scripts I merge all of them into one MAT-file and synchronize them based on a global time stamp. The input file sizes are pretty constant: around 40 MB from DAQ #1, around 10 MB from DAQ #2 and around 50 MB from DAQ #3. Furthermore, all test runs are performed with the same configuration (output quantities, sample rate, etc.).
Now, 9 times out of 10, merging, syncing and exporting results in a MAT-file of around 150 MB. But every once in a while I come across a test run that forces me to use the save(...,'-v7.3') command, because the file size would exceed 2 GB. Saving it with '-v7.3' then results in a file of around 5 GB! The strange thing is that when I look at the size of the struct fields with "whos", I always end up with around 2e+09 bytes, both for the structs in the 150 MB files and for the structs in the 5 GB files. Do you have any idea why the exported file size differs so much, even though all the DAQ files have almost the same size and the "whos" command reports similar values as well?
Cheers, Christian
  5 comments
Christian 2020-2-21
Edited: Christian 2020-2-21
I can totally do that, but I think I just came across the cause of the problem!
I analyzed 6 test runs and monitored the "internal" (in-memory) size of the struct before exporting. Here are the results:
#1: 1971 MB -> MAT-file size: ~160 MB
#2: 1976 MB -> MAT-file size: ~160 MB
#3: 1984 MB -> MAT-file size: ~160 MB
#4: 1914 MB -> MAT-file size: ~160 MB
#5: 1970 MB -> MAT-file size: ~160 MB
#6: 2073 MB -> MAT-file size: 5 GB
Since the "internal" size of test runs 1-5 is below 2 GB, they can be exported with the normal save command. Test run #6 exceeds the 2 GB threshold and has to be exported with save(...,'-v7.3'), which results in a significantly larger file, because v7.3 uses a completely different format.
I hadn't looked at the "internal" size before. Instead I thought the 2 GB threshold applied to the exported MAT-file, which is obviously not the case! But does that mean the normal "save" command somehow compresses the data?
Other than that, it is just unfortunate that all the test runs are close to the internal 2 GB threshold, so it's going to be hit or miss. Which sucks, because a 1 MB difference in memory can result in a 5 GB difference once the runs have been exported.
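The check described above can be sketched as follows. This is only a minimal illustration, not the original script: the variable `merged`, its fields and the output path are hypothetical stand-ins, and the limit is assumed to be 2^31 bytes per variable, as the MATLAB documentation states for the -v7 format.

```matlab
% Hypothetical stand-in for the merged and synchronized test-run struct.
merged.daq1 = rand(1000, 10);
merged.daq2 = rand(500, 4);

info = whos('merged');      % in-memory size of the variable, not the file size
limitBytes = 2^31;          % assumed per-variable limit of the -v7 format

% Pick the MAT-file version from the in-memory size.
if info.bytes < limitBytes
    fmt = '-v7';
else
    fmt = '-v7.3';
end

outFile = [tempname '.mat'];    % hypothetical output location
save(outFile, 'merged', fmt);

% Compare the in-memory size with the size of the file on disk.
onDisk = dir(outFile);
fprintf('in memory: %.2f MB, on disk: %.2f MB (%s)\n', ...
    info.bytes / 2^20, onDisk.bytes / 2^20, fmt);
```

Logging both numbers per test run makes it visible ahead of time which runs are about to cross the threshold.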
Walter Roberson 2020-2-21
Yes, -v5 and -v7 compress numeric data. -v7.3 stores data in a very different way, and compression can be wonky for it.
The -v7.3 format has significantly more overhead for every struct field of every struct array element, and for every cell of a cell array. If I understand correctly, the format basically requires that a fake variable be created for each of those, and that fake variable has to have its dimensions described and so on. The -v5 and -v7 formats do not require the creation of fake variables, and although they also have overhead for each structure entry and cell array entry, that overhead is smaller than for -v7.3.
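The per-entry overhead described above can be observed with a small experiment. This is a sketch under assumptions: the struct array `s` and its field names are made up for illustration, and the exact file sizes depend on the MATLAB release and the data.

```matlab
% Hypothetical data: a struct array with many elements and tiny fields,
% so that per-entry overhead dominates the payload.
n = 2000;
s = struct('t', num2cell(rand(1, n)), 'value', num2cell(rand(1, n)));

f7  = [tempname '.mat'];
f73 = [tempname '.mat'];
save(f7,  's', '-v7');      % classic format
save(f73, 's', '-v7.3');    % HDF5-based format

d7  = dir(f7);
d73 = dir(f73);
info = whos('s');
fprintf('in memory: %d bytes\n', info.bytes);
fprintf('-v7 file:   %d bytes\n', d7.bytes);
fprintf('-v7.3 file: %d bytes\n', d73.bytes);

delete(f7);
delete(f73);
```

If the -v7.3 file comes out much larger, as the explanation above suggests it should, storing the same samples as a few large numeric arrays instead of many small struct entries is one way to reduce the number of entries the format has to describe.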


Answers (0)
