Accessing a large MAT file

I am trying to access data stored in a large MAT file. The file is 72 GB of Simulink simulation data.
Obviously I cannot use the LOAD command on my laptop, which has 16 GB of RAM. I thought the reason MathWorks provides the MATFILE command was to allow access to large MAT files without loading them.
But that doesn't seem to be the case.
When I attempt to access the file using the MATFILE command, MATLAB behaves as if it were loading all of that data into memory. My memory utilization goes to 98%, I get an out-of-memory error, and then MATLAB silently crashes and exits.
So I went back to the big Linux machine that I used to run Simulink and create this file, and ran the MATFILE command there. Indeed, it looks like MATLAB is loading the whole file into RAM. I am hoping to divide the file up there into separate MAT files, but it is taking a really long time to load this data, and it is also using all available RAM.
Which leads to my questions: What is the MATFILE command doing? Is this expected behavior? Am I stuck rerunning my simulations and putting all results into separate MAT files? How are truly huge datasets stored and manipulated in MATLAB? Evidently it is not with MAT files...
Thanks.
  1 comment
Sean Little 2019-11-11
After about an hour and a half of waiting, the MATFILE command returned an object without any memory errors on my Linux machine. Now when I try to access individual elements in the MAT file, it appears that the whole file has to be loaded into memory again, even for small informational data fields in the file. I am going to need a different approach. Unless someone can suggest a workaround, I am going to have to abandon using the MATFILE command.


Accepted Answer

Sara Nadeau 2019-11-11
I believe you are having trouble with the matfile function because of the format of the logged data.
If you logged the data in Simulink using the Dataset format (the default format for several releases), you can create Simulink.SimulationData.DatasetRef objects that reference the data in the file without loading it into memory. To access and manipulate the data for individual signals, you can create matlab.io.datastore.SimulationDatastore objects.
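As a rough sketch of that workflow (not tested against your file): the file name 'simout.mat', the choice of the first Dataset variable, and the choice of element 1 are all placeholders I made up. It also assumes the data was logged in Dataset format to a version 7.3 MAT file. The type returned by getAsDatastore may differ between releases, so the code checks for both cases.

```matlab
% Placeholder file name; substitute the MAT file that holds your logged data.
fname = 'simout.mat';

% List the Dataset variables in the file without loading their contents.
varNames = Simulink.SimulationData.DatasetRef.getDatasetVariableNames(fname);

% Create a reference to one Dataset variable. Using the first name is an
% arbitrary choice for illustration.
dsRef = Simulink.SimulationData.DatasetRef(fname, varNames{1});

% Request element 1 as a datastore (the index is an assumption).
elem = getAsDatastore(dsRef, 1);

% Depending on the release, this may be the datastore itself or a Signal
% object whose Values property holds the datastore.
if isa(elem, 'matlab.io.datastore.SimulationDatastore')
    sds = elem;
else
    sds = elem.Values;
end

% Read the signal in chunks instead of all at once.
sds.ReadSize = 1000;            % samples per read; tune to available memory
while hasdata(sds)
    chunk = read(sds);          % one chunk of samples
    % ... process chunk here, e.g. accumulate statistics or write it out ...
end
```

Because only one chunk is in memory at a time, this pattern could also be used to split the signals out into smaller files on the machine that has enough disk space.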
These additional topics may be helpful for guiding you through creating and using DatasetRef and SimulationDatastore objects:
I hope this helps!
  1 comment
Sean Little 2019-11-11
I am glad there is a way to do this. I will take a look at those doc links and try this out. Thanks for the help!


More Answers (1)

Guillaume 2019-11-11
See the limitations section of the matfile documentation to see what it can and can't do. In particular, the granularity of matfile is typically at the variable level. That is, you can select which variables to load, but apart from numerical matrices, if you load a variable you load all of it.
It's unclear what's in your mat file but it sounds like it's objects, perhaps just one object, in which case you won't benefit much from matfile.
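To illustrate the case where matfile does help, here is a minimal sketch with a plain numeric matrix. The file name and the variable A are made up for the demo, and the file is saved with -v7.3 because that is the format in which matfile can read part of a variable efficiently.

```matlab
% Demo file in the temp folder; the name is hypothetical.
fname = fullfile(tempdir, 'matfile_demo.mat');
A = reshape(1:20000, 200, 100);      % 200-by-100 numeric matrix
save(fname, 'A', '-v7.3');
clear A

% Inspect the file contents without loading any variable data.
info = whos('-file', fname);         % struct array with name, size, class, ...

% Create a MatFile object; this does not read A into memory.
m = matfile(fname);
[nRows, nCols] = size(m, 'A');       % dimensions of A taken from the file

% Read only a 10-by-5 block of A from disk.
block = m.A(1:10, 1:5);

delete(fname);                       % clean up the demo file
```

The same indexing syntax on a variable that holds an object (rather than a numeric array) falls back to loading the whole variable, which matches what you are seeing.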
  2 comments
Sean Little 2019-11-11
When I read the limitations section, I was assuming that "user defined objects" did not include MathWorks defined Simulink objects. Obviously that was a bad assumption on my part.
It is really surprising to me that there is so much overhead required to access data in a large MAT file. I thought that is exactly what this command was supposed to avoid.
Guillaume 2019-11-11
Unfortunately, it's not designed for objects; it's designed for accessing large numerical matrices.
Since you have such a large MAT file, I assume you're using the 7.3 format. This format is based on HDF5, which you can read using various functions. I've no idea whether that would make reading the file any easier, and you'd have to figure out the data structure yourself, as MathWorks does not document the format.
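If you want to explore that route, a sketch along these lines shows the general idea using MATLAB's own HDF5 functions (h5info and h5read). It uses a small made-up demo file with one numeric variable x. How objects such as Simulink datasets are laid out inside the file is not something I can vouch for, so treat this purely as a way to look around.

```matlab
% Small v7.3 demo file; the file and variable names are hypothetical.
fname = fullfile(tempdir, 'h5_demo.mat');
x = (1:1000)';
save(fname, 'x', '-v7.3');
clear x

% Read the HDF5 metadata for the root group of the file.
info = h5info(fname);
topLevel = {info.Datasets.Name};       % names of the top-level datasets

% Look up the stored size of one dataset, then read only a small part of it.
dsInfo = h5info(fname, '/x');
sz     = dsInfo.Dataspace.Size;        % size as reported by the HDF5 layer
start  = ones(1, numel(sz));           % begin at the first element
count  = min(sz, 10);                  % at most 10 elements per dimension
part   = h5read(fname, '/x', start, count);

delete(fname);                         % clean up the demo file
```

For a file of your size, h5disp would print the full hierarchy, which may be a lot of output; starting with h5info and drilling down through the Groups field is probably more manageable.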
