This file poses two problems when using textscan:
- The number of header lines is not known beforehand.
- The number of columns is not known beforehand.
There are many alternative ways to read this file. (If speed is important, one might use fread.) One solution with textscan is shown below. The approach is:
- read the complete file and store each row as one string
- find the row with the keyword, "END", which indicates the end of the header and delete the header
- determine the number of columns and the number of rows
- convert the cell array of strings to a character array, which can be read by textscan. textscan reads by columns, and thus the array needs to be permuted. textscan swallows the newline characters when it reads the file, and thus we need to add new ones.
- generate a format string and read the string array
- convert the cell array, which textscan returns, to a numerical array.
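The permute-and-newline step is the least obvious one, so here is a minimal sketch on a made-up two-row cell array. The data and the names chr and row are illustrative assumptions, not taken from cssm.txt. Unlike the answer's code, the sketch flattens the padded character array explicitly with reshape, so that textscan receives a plain character row. MATLAB stores arrays column by column, which is why the text rows must first become columns.

```matlab
% Hypothetical data rows, as they would look after the header is removed
buf  = { '1 2 3'; '4 5 6' };
nrow = size( buf, 1 );

% char() pads the rows to equal length: 2-by-5 character array
chr  = char( buf );

% One text row per column, then a newline appended below each column
str  = cat( 1, permute( chr, [2,1] ), repmat( sprintf('\n'), 1, nrow ) );

% Column-major flattening restores the original row order
row  = reshape( str, 1, [] );

% One %f per column; textscan returns one cell per column
cac  = textscan( row, '%f%f%f' );
M    = [ cac{:} ];          % 2-by-3 numeric array
```

Without the permute, the flattening would interleave the characters of the two rows and textscan would read nonsense.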
I've added a variant based on fread. The basic approach is the same.
- read the complete file to a character column vector and permute it to a row
- find the position of the keyword END. To be sure of finding the right END, search for the pattern [new_line, any_space, END, any_space, new_line]
- extract the data part of the character row
- extract the first row of data and count the columns
- create a format string and read the character row with textscan
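As a small illustration of why the keyword search includes the surrounding newlines, the sketch below uses an invented header in which the word END also occurs inside a title line. The file content is a hypothetical stand-in for cssm.txt, and the 'once' option is added here only to get a scalar index.

```matlab
% Hypothetical file content; note the word END inside the first header line
str = sprintf( 'Title: END of run\r\n  END  \r\n1 2 3\r\n4 5 6\r\n' );

% Match END only when it stands alone on a line; 'end' returns the index
% of the last character of the match, i.e. the closing newline
ixe = regexp( str, '\n\s*END\s*\n', 'end', 'once' );

% Everything after that newline is data
buf = str( ixe+1 : end );
```

The END in the title line is skipped because it is neither preceded by a newline nor followed by one.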
The textscan variant takes care of the whitespace characters, but requires code to "unpack" cell arrays.
With the fread variant there is no need to "unpack" cell arrays, but code is needed to take care of the whitespace characters. In particular, char(13) (carriage return) must not be overlooked.
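To show how char(13) can upset the column count, here is a minimal sketch with an invented first data row that ends in a space and a carriage return, as a file with Windows line endings might. The row content and the names n_raw and n_trim are assumptions for illustration only.

```matlab
% Hypothetical first data row: a trailing space followed by char(13)
row = sprintf( '1 2 3 \r' );

% Splitting the raw row yields a spurious fourth token, char(13)
n_raw  = numel( regexp( row, ' +', 'split' ) );

% strtrim removes leading and trailing whitespace, including char(13)
n_trim = numel( regexp( strtrim( row ), ' +', 'split' ) );
```

Here n_raw is 4 and n_trim is 3, which is why the fread variant trims the first row before counting columns.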
The variant based on textscan:
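function M = Answer( )
    % Read the complete file; each row becomes one string in cac{1}
    fid = fopen( 'cssm.txt', 'r' );
    cac = textscan( fid, '%s', 'Delimiter', '\n' );
    sts = fclose( fid );
    % Locate the last row that starts with END and drop the header
    ixe = find( strncmp( 'END', cac{1}, 3 ), 1, 'last' );
    buf = cac{1}( ixe+1:end, : );
    % Count the columns in the first data row, and the data rows
    ncol = numel( regexp( buf{1}, ' +', 'split' ) );
    nrow = size( buf, 1 );
    % One text row per column, each terminated by a newline
    str = cat( 1, permute(char(buf),[2,1]), repmat(sprintf('\n'),1,nrow) );
    % One %f per column; textscan returns one cell per column
    frmt = repmat( '%f', 1, ncol );
    cac = textscan( str, frmt );
    M = [ cac{:} ];
end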
The variant based on fread:
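function M = Answer( )
    % Read the complete file into one character row
    fid = fopen( 'cssm.txt', 'r' );
    str = permute( fread( fid, '*char' ), [2,1] );
    sts = fclose( fid );
    % Index of the newline that ends the line holding only END
    ixe = regexp( str, '\n\s*END\s*\n', 'end' );
    buf = str( ixe+1 : end );
    % Last digit of the first data row; \s also matches char(13)
    ix2 = regexp( buf, '\d\s*\n', 'start', 'once' );
    ncol = numel( regexp( strtrim( buf(1:ix2) ), ' +', 'split' ) );
    % One %f per column; textscan returns one cell per column
    frmt = repmat( '%f', 1, ncol );
    cac = textscan( buf, frmt );
    M = [ cac{:} ];
end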