Creating multiple equally sized matrices from a single numerical cell

I have a very large text file that is, in essence, one row of numbers. Once I have reorganized the file into a matrix of, for example, 500 x 10, I wish to create a new matrix every 10 rows and save each one under its own name. A major problem with my text file is that it is too big for MATLAB: an out-of-memory error appears. This is why I need to separate each matrix into its own set of data. I have already turned a row of 1049600 numbers into a 1025 x 1024 matrix, but the file now holds 50 of these sets (1049600 x 50 numbers), and I need to create 50 matrices of 1025 x 1024.
fid = fopen('test0001.asc');
Cell = textscan(fid, '%d', 'delimiter', ';'); % read all semicolon-separated integers
fclose(fid);
Data = cell2mat(Cell);
N = 1024;
Finish = reshape(Data, N, [])'; % 1024 values per row
The above is the code I had for the smaller files.
I considered organizing the data into 51250 rows of 1024 and then using a while ~feof loop, but this seems like it would require too much code and would thus be too slow. My thought was to have, say:
F1 = Data(1:1025, :);
F2 = Data(1026:2050, :);
.....
Any thoughts at all would be much appreciated.

Accepted Answer

Stephen23 2017-2-7
Edited: Stephen23 2017-2-10
Firstly, the idea of generating lots of variables is popular with beginners, but it really should be avoided.
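One way to avoid numbered variables such as F1, F2, ... is to keep the blocks in a single cell array. The following is only a minimal sketch, assuming the data already fits in memory as one tall matrix; the sizes are small stand-ins for 1025, 1024 and 50, and the names nBlocks and blocks are made up for illustration.

```matlab
R = 3;        % rows per block (stand-in for 1025)
C = 4;        % columns per block (stand-in for 1024)
nBlocks = 5;  % number of blocks (stand-in for 50)

% Fake data: (R*nBlocks) rows of C values, filled row by row
Data = reshape(1:R*C*nBlocks, C, []).';

% Split into nBlocks pieces of R rows each, stored in one cell array
blocks = mat2cell(Data, repmat(R, 1, nBlocks), C);

% blocks{2} plays the role of "F2": rows R+1 to 2*R of Data
F2 = blocks{2};
```

Each block is then reached by index (blocks{k}), so a loop can process or save all of them without generating variable names.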
Also note that the MATLAB documentation is really good. It is readable and has articles on many topics, including one that gives a good, robust method for reading a large file into MATLAB.
The core idea of that code is to call textscan in a loop, use textscan's N option to specify how much data to read, and save the data into a cell array. The N option simply defines how many times the format is applied when reading the file.
You should be able to work it out from the examples in the documentation.
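The effect of the N option can be seen on a short piece of text instead of a file. This is only an illustrative sketch: it passes a character vector to textscan rather than a file identifier, and it uses 'Delimiter' rather than the 'EndOfLine' setting from the code in this answer. The variable txt is made up for the example.

```matlab
txt = '1;2;3;4;5;6;';  % six semicolon-separated integers

% Apply the format '%d' at most 4 times, so only four values are read
Z = textscan(txt, '%d', 4, 'Delimiter', ';');

firstFour = Z{1};      % column vector of class int32
disp(firstFour.')
```

When reading from a file identifier, calling textscan again continues from where the previous call stopped, which is what makes the loop approach work.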
As an alternative, you might like to read about tall arrays, a special data type for working with very large data files that cannot be read into memory.
EDIT 2017-02-10: add code from comment:
%% Create Fake Datafile %%
% fid = fopen('temp2.txt','wt');
% for k = 1:50
%     fprintf(fid,'%d;',randi([0,255],1,1025*1024));
% end
% fclose(fid);
%% Read DataFile %%
R = 1025; % rows per output matrix
C = 1024; % columns per output matrix
opt = {'EndOfLine',';', 'CollectOutput',true};
fid = fopen('temp2.txt','rt');
k = 0;
while ~feof(fid)
    Z = textscan(fid,'%d', R*C, opt{:}); % read one matrix worth of values
    if ~isempty(Z{1})
        k = k+1;
        S = sprintf('temp2_%02d.txt',k);
        dlmwrite(S,reshape(Z{1},[],R).',';') % might need to transpose
    end
end
fclose(fid);
  12 Comments
Aaron Smith 2017-2-9
I have the code working fairly well. There is just one thing I'm not too sure about: what does the opt = {'EndOfLine', ';'}; line in your code do? What is its purpose? Thanks again, Stephen.
Stephen23 2017-2-9
Edited: Stephen23 2017-2-9
@Aaron Smith: take a look at these two lines:
opt = {'EndOfLine',';'};
...
Z = textscan(fid,'%d', R*C, opt{:});
One defines the cell array opt; the other passes the elements of opt as inputs to textscan. So it is simply a convenient way to supply the inputs without writing them all on one line, like this:
Z = textscan(fid,'%d', R*C, 'EndOfLine',';');
For just two arguments it does not make much difference, but sometimes there can be quite a few arguments, and I find the cell array keeps things tidy. It is just a personal choice to do it like that, there is no deeper meaning. You can write the inputs on one line, if you wish to.
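The equivalence of the two call styles can be checked directly. The sketch below is illustrative only: it reads from a short character vector instead of a file, and it uses the 'Delimiter' option rather than 'EndOfLine' to keep the example simple. The variable txt is made up for the example.

```matlab
txt = '10;20;30;';
opt = {'Delimiter', ';'};  % name-value pair stored in a cell array

% opt{:} expands to a comma-separated list: 'Delimiter', ';'
A = textscan(txt, '%d', opt{:});

% The same call with the arguments written out in full
B = textscan(txt, '%d', 'Delimiter', ';');

sameResult = isequal(A, B);
disp(sameResult)
```

Adding another option later then only means appending another name-value pair to opt, with no change to the textscan call itself.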


More Answers (1)

Guillaume 2017-2-8
Edited: Guillaume 2017-2-8
Since R2014b, MATLAB has had tools for reading, in chunks, files that are too big to fit in memory. Why not use these? See datastore and, in your particular case, TabularTextDatastore.
Since R2016b, that support has been made even easier with the introduction of tall arrays.
