Cell Array Size and Saving

Sven 2018-7-11
Commented: Sven 2018-7-12
Hi,
I wanted to save a fairly large, complex class object to a file and ended up with a much larger file than I expected, so I examined which parts were driving the size. I was surprised to find that simple cell arrays of strings consume a disproportionate amount of space. Is there an easy way to avoid this?
Here is my MWE:
Names50 = cell(50,1);
Names2 = cell(2,1);
for i=1:length(Names2)
    Names2{i} = 'a';
end % for i
for i=1:length(Names50)
    Names50{i} = 'b';
end % for i
When I check the size with a small routine I found, I get quite confusing results:
getSize(Names2) --> 228
getSize(Names50) --> 5700
getSize(Names2{1}) --> 2
A single element is just 2 bytes, yet a cell array of two such elements (2 × 2 bytes of content) takes 228 bytes, and one with 50 rows takes 5700. Is the overhead of cell arrays really so disproportionately large? Can it somehow be avoided when saving?
Thanks in advance
Best
Sven
P.S.: Code for getSize:
function bytes = getSize(variable)
% Size in bytes as reported by whos; for objects, the sum over all properties.
props = properties(variable);
if size(props, 1) < 1
    % No properties (e.g. a cell or char array): ask whos directly.
    bytes = whos(varname(variable));
    bytes = bytes.bytes;
else % code of Dmitry
    bytes = 0;
    for ii = 1:length(props)
        currentProperty = getfield(variable, char(props(ii)));
        s = whos(varname(currentProperty));
        fprintf('Property: %s : %d bytes\n', props{ii}, s.bytes)
        bytes = bytes + s.bytes;
    end
end
end

function name = varname(~)
% Name of the variable that the caller passed in.
name = inputname(1);
end

Accepted Answer

Guillaume 2018-7-11
Yes, cell arrays carry some necessary overhead. Note that whos (which your getSize uses) does not actually show all the memory used by variables.
By necessity, a cell array cannot store just the content of the data (the 2 bytes consumed by 'a'). It also needs to store:
  • where that content is actually stored in memory (since the content of a cell can be anything, it is not stored inside the cell array itself, just a pointer to it)
  • the matrix header for that content, which includes:
      • the type of content
      • how many dimensions that content has
      • the length of each dimension of that content
This results in an overhead of 112 bytes per non-empty cell (an empty cell needs only 8 bytes to store a null pointer).
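The per-cell figures can be checked with a short script. This is only a sketch: the variable names are made up for illustration, and the numbers whos reports may differ between MATLAB releases and platforms. With the sizes reported in the question (5700 bytes for 50 cells), it would print an overhead of 112 bytes per cell.

```matlab
% Compare whos-reported sizes for empty and filled cell arrays.
% All variable names here are illustrative, not from the original post.
n = 50;
emptyCells  = cell(n, 1);            % n empty cells, no content
filledCells = repmat({'b'}, n, 1);   % n cells, each holding a 1x1 char
oneChar     = 'b';                   % the bare content of one cell

sEmpty  = whos('emptyCells');
sFilled = whos('filledCells');
sChar   = whos('oneChar');

% Overhead per non-empty cell = (total bytes / n) minus the payload bytes.
overheadPerCell = sFilled.bytes / n - sChar.bytes;

fprintf('empty:  %d bytes (%g per cell)\n', sEmpty.bytes, sEmpty.bytes / n);
fprintf('filled: %d bytes (%g overhead per cell)\n', sFilled.bytes, overheadPerCell);
```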
On top of that come more bytes that whos doesn't show and that are required for every variable in MATLAB:
  • the type of the variable (i.e. it's a cell array)
  • how many dimensions that variable has
  • the length of each dimension
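Since the overhead is paid per cell, one option that may reduce it is to hold the strings in a single char matrix instead. The sketch below is an assumption about what could help, not something tested on the class from the question; the variable and file names are illustrative. Note that char pads all rows with spaces to the same length and cellstr strips trailing whitespace on the way back, so strings with meaningful trailing spaces will not round-trip. MAT-files may also be compressed on disk, so the file sizes need not mirror the in-memory figures.

```matlab
% Sketch: store the strings as one char matrix instead of one cell each.
names     = repmat({'b'}, 50, 1);   % cell array of char vectors, as in the question
namesChar = char(names);            % 50x1 char matrix (rows padded to equal length)
restored  = cellstr(namesChar);     % back to a cell array of char vectors

sCell = whos('names');
sChar = whos('namesChar');
fprintf('in memory: cell %d bytes, char %d bytes\n', sCell.bytes, sChar.bytes);

% Compare the actual file sizes on disk, using temporary files.
fileCell = [tempname '.mat'];
fileChar = [tempname '.mat'];
save(fileCell, 'names');
save(fileChar, 'namesChar');
dCell = dir(fileCell);
dChar = dir(fileChar);
fprintf('on disk:   cell %d bytes, char %d bytes\n', dCell.bytes, dChar.bytes);
delete(fileCell);
delete(fileChar);
```

A char matrix only pays the variable header once, which is why its whos size is just 2 bytes per character here. Whether the padding costs more than the per-cell overhead saves depends on how much the string lengths vary.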
1 Comment
Sven 2018-7-12
Thank you very much for this detailed answer. I feared there was an overhead, but did not expect it to be that large and for each cell. So I guess there is no workaround.

Release

R2016a
