MATLAB Answers

Long time to load 40 gig file

AA on 20 Oct 2017
Commented: Steven Lord on 20 Oct 2017
Hi, I have a gaming laptop that I recently bought for 4000€, with a powerful CPU and lots of RAM. However, when I load a 30 GB MATLAB file from my hard drive, which consists of variables with huge cell array tables, it still takes a lot of time. How can I make this process faster? Would a datastore be useful?

  4 Comments

AA on 20 Oct 2017
It is on an SSD.
Jan on 20 Oct 2017
Please mention the details: How long is "a lot of time"? What is the acceptable speed for you? Which MAT format is used?
Steven Lord on 20 Oct 2017
There's some additional information that would be useful in trying to determine what's consuming most of the time and whether there's a better approach you can use.
  • Is this a text file, a binary file, a MAT-file, etc.?
  • Do you need all the variables in the file at once or could you do what you need accessing each variable in turn, keeping only one at a time in memory?
  • Can you post a small sample (not all 30-40 GB, but 5-10 lines) of what the data looks like in the file, and say whether the file has a consistent format throughout or whether the format changes periodically?
  • Copy and paste the command that you're using to try to read in the file, or describe the interactive process you're using (if you're using the Import Tool, for example).
  • Can you clarify what "a lot of time" means: a couple minutes, half an hour, an hour, etc.?
  • Can you clarify what "lots of ram" means: how many GB?
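
As a rough illustration of the "one variable at a time" idea in the list above, here is a minimal sketch, assuming the data is a MAT-file. The file name demo_big.mat and the variables A and B are hypothetical stand-ins created only so the snippet runs on its own; they are not taken from the original question.

```matlab
% Hypothetical small demo file standing in for the large MAT-file.
demoFile = fullfile(tempdir, 'demo_big.mat');
A = rand(1000, 10);            % numeric matrix (assumed example data)
B = num2cell(rand(100, 3));    % cell array (assumed example data)
save(demoFile, 'A', 'B', '-v7.3');

% List the variables stored in the file without loading any of them.
info = whos('-file', demoFile);
for k = 1:numel(info)
    fprintf('%s: size %s, %d bytes\n', ...
        info(k).name, mat2str(info(k).size), info(k).bytes);
end

% Load only one variable instead of the whole file.
S = load(demoFile, 'A');

% Read part of a numeric variable through a matfile object.
m = matfile(demoFile);
firstRows = m.A(1:10, :);
```

Whether this helps depends on the answers to the questions above: reading part of a variable through matfile is generally only efficient for files saved in the v7.3 format, and it will not reduce the time if every variable is needed in memory at once.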


Answers (2)

Edric Ellis on 20 Oct 2017
You don't mention how you're loading the data right now. If you have Parallel Computing Toolbox, you can read the data in parallel using parfor together with datastore. Or, better still, use tall arrays, which can automatically take advantage of Parallel Computing Toolbox.
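
A minimal sketch of the datastore and tall array approach mentioned above, under the assumption that the data can be stored as delimited text. The question does not say what the file contains, so the CSV file demo_data.csv and its columns id and value are hypothetical and are created here only to make the snippet self-contained.

```matlab
% Hypothetical demo data written to a CSV file (assumption: the real
% data could be exported in a tabular text format).
demoCsv = fullfile(tempdir, 'demo_data.csv');
T = table((1:1000)', rand(1000, 1), 'VariableNames', {'id', 'value'});
writetable(T, demoCsv);

% A datastore reads the file in chunks instead of all at once.
ds = datastore(demoCsv);
nRows = 0;
while hasdata(ds)
    chunk = read(ds);              % one block of rows as a table
    nRows = nRows + height(chunk);
end

% A tall array defers the computation until gather is called.
reset(ds);
tt = tall(ds);
avg = gather(mean(tt.value));
```

With Parallel Computing Toolbox installed, tall array evaluation can use a parallel pool; without it, the same code runs in the local MATLAB session.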



Jeremy Hughes on 20 Oct 2017
"cell array tables"
A good bet is that the cell arrays are the issue. Each cell of the array takes 114 bytes of overhead. Without knowing anything about your data, except that you described them as tables, I suggest looking at the MATLAB data type table.
Data can be efficiently stored as tables for many common usages. I wish I could give a better answer, but I'd need to see what's in the file.
Cheers,
Jeremy
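
To see the per-cell overhead described in this answer on a given installation, one option is to compare the memory reported by whos for the same values stored as a numeric array, a cell array, and a table. This is only a sketch with made-up data; the variable names are hypothetical, and the exact byte counts depend on the MATLAB version and platform.

```matlab
n = 1e5;
x = rand(n, 1);          % plain numeric column
c = num2cell(x);         % same values, one per cell

wx = whos('x');
wc = whos('c');

% Extra bytes per element caused by storing each value in its own cell.
perCell = (wc.bytes - wx.bytes) / n;
fprintf('numeric: %d bytes, cell: %d bytes, overhead per cell: %.1f bytes\n', ...
    wx.bytes, wc.bytes, perCell);

% The same data as a table with one numeric variable.
t = cell2table(c, 'VariableNames', {'value'});
wt = whos('t');
fprintf('table: %d bytes\n', wt.bytes);
```

If the table comes out close to the size of the numeric array, converting the cell arrays before saving may shrink the file and shorten the load time, although that depends on what the cells actually contain.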

