I have some questions about importing data. Here is an example of the data file to import:
!!!!!!
! text text
! stuff
0.1 2.53 2.5
0.2 2.59 2.43
0.3 2.5 2.54
0.4 2.48 2.53
0.5 2.52 2.48
1
ABC 0.123 123
DE
0.456 0.456 456
0.1 2.56 2.34 2.63
0.2 2.61 2.48 2.43
0.3 2.54 2.51 2.6
0.4 2.57 2.54 2.49
0.5 2.48 2.63 2.5
Here is the code I'm using to import this data:
Test=fopen('TestData.txt'); % open the file
for n=1
mystruct(n).Header1 = fgetl(Test); %line1 goes to header1
fgetl(Test); %skip line
mystruct(n).Header2 = fgetl(Test);
fgetl(Test);
mystruct(n).Header3 = fgetl(Test);
mystruct(n).meas = fscanf(Test, '%f', [3, 5])';
end
for n=2
for j=1:6 % skips to the 6th line
fgetl(Test);
end
mystruct(n).T = fscanf(Test, '%f', 1); % call out value for T
for j=1:2 % skips 2 empty lines
fgetl(Test);
end
mystruct(n).meas = fscanf(Test, '%f', [4, 5])';
end
fclose(Test); % Close the file
I want to preserve the headers at the top, and I don't necessarily care about the midfile headers, with the exception of my T-value. My question is: how can I import this so that it allows for a variable number of header lines at the top and in the middle of the file, without having to look through each data file? This would be helpful since I have multiple data files with varying contents (mainly the headers). I think I need something like a "skip until" that also skips empty spaces and still allows individual treatment of the matrices, as I have it now. Any help is much appreciated, thanks!
Accepted Answer
per isakson
2012-8-9
Edited: per isakson
2012-8-12
I have deleted a sketchy outline, which was not helpful.
--- Working code ---
Purpose:
- learn to use Matlab
Approach:
- the data file consists of consecutive blocks of headers and data
- a data block is a number of consecutive rows containing an equal number of "numerical strings"
- a header block is a number of consecutive rows, which do not belong to a data block
Implementation:
- cssm, main function
- getblocks, subfunction
Hopefully, the code works with more data files than the example above, cssm.txt
Example:
>> [ header_blocks, data_blocks ] = cssm()
header_blocks =
{6x1 cell}
{4x1 cell}
data_blocks =
[5x3 double]
[5x4 double]
.
Left as exercise:
- understand the code
- write comments
====
function [ header_blocks, data_blocks ] = cssm()
fid = fopen( 'cssm.txt' );
cac = textscan( fid, '%s', 'Whitespace','', 'Delimiter','\n' );
fclose( fid );
number_of_floats = cellfun( @(c) size(c,2) ...
, regexp( cac{:}, '[+|-]?\d*\.\d+', 'match' ) ...
, 'uni', true );
number_of_stuff = cellfun( @(c) size(c,2) ...
, regexp( cac{:}, '[^([+|-]?\d*\.\d+) ]', 'match' ) ...
, 'uni', true );
is_data = ( number_of_floats >= 1 & number_of_stuff == 0 );
number_of_data_columns = number_of_floats;
number_of_data_columns( not(is_data) ) = nan;
[ ~, ix1, ix2 ] = getblocks( number_of_data_columns, 2 );
data_blocks = cell(0);
for ii = 1 : numel( ix1 )
data_blocks = cat( 1, data_blocks ...
, {str2num( char(cac{1}{ix1(ii):ix2(ii)}))} );
end
ix3 = cat( 2, 1, ix2+1 );
ix4 = cat( 2, ix1-1, size( cac{1}, 1 ) );
header_blocks = cell(0);
for ii = 1 : numel( ix1 )
header_blocks = cat( 1, header_blocks ...
, {cac{:}(ix3(ii):ix4(ii))} );
end
end
====
function [ col, ix1, ix2 ] = getblocks( sequence, min_nrows )
% without comments
seq = cat( 2, nan, transpose( sequence(:) ), nan );
change = diff( double( diff( seq ) == 0 ) );
ix1 = strfind( change, +1 );
ix2 = strfind( change, -1 );
col = sequence( ix1 );
if min_nrows >= 2
isg = ix2-ix1+1 >= min_nrows;
col = col( isg );
ix1 = ix1( isg );
ix2 = ix2( isg );
else
ix_sngl = find( not( logical( cumsum( change ) ...
+ double( change==-1 ) ) ) );
ix1 = cat( 2, ix1, ix_sngl );
ix2 = cat( 2, ix2, ix_sngl );
col = cat( 2, col, sequence( ix_sngl ) );
[~,ix] = sort( ix1 );
ix1 = ix1( ix );
ix2 = ix2( ix );
col = col( ix );
end
end
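For readers working through the exercise, here is a more verbose, commented sketch of the same idea (classify each line, then group consecutive rows that hold an equal number of numeric tokens). It is not the code above: it swaps the regular expressions for str2double on whitespace-separated tokens, and the function name split_blocks_sketch and its two arguments are made up for illustration. Like cssm, it drops any header lines that follow the last data block.

```matlab
function [ header_blocks, data_blocks ] = split_blocks_sketch( filename, min_nrows )
% SPLIT_BLOCKS_SKETCH  Hypothetical helper, not part of the original answer.
%   A data block is a run of at least min_nrows consecutive lines that
%   hold the same number of numeric tokens; everything else is header.
    fid = fopen( filename );
    assert( fid ~= -1, 'Cannot open file %s', filename );
    cac = textscan( fid, '%s', 'Whitespace','', 'Delimiter','\n' );
    fclose( fid );
    all_lines = cac{1};
    n         = numel( all_lines );

    % Parse every line once. ncol(k) stays NaN for non-numeric lines.
    vals = cell( n, 1 );
    ncol = nan( n, 1 );
    for k = 1 : n
        txt = strtrim( all_lines{k} );
        if isempty( txt ), continue, end
        num = str2double( regexp( txt, '\s+', 'split' ) );
        if isreal( num ) && not( any( isnan( num ) ) )
            vals{k} = num;
            ncol(k) = numel( num );
        end
    end

    header_blocks = cell( 0, 1 );
    data_blocks   = cell( 0, 1 );
    header        = cell( 0, 1 );   % header lines collected so far
    k = 1;
    while k <= n
        % Extend the run while the next line has the same column count.
        % NaN == NaN is false, so non-numeric lines form runs of length 1.
        m = k;
        while m < n && ncol(m+1) == ncol(k)
            m = m + 1;
        end
        if not( isnan( ncol(k) ) ) && m-k+1 >= min_nrows
            header_blocks{end+1,1} = header;
            data_blocks{end+1,1}   = vertcat( vals{k:m} );
            header = cell( 0, 1 );
        else
            header = cat( 1, header, all_lines(k:m) );
        end
        k = m + 1;
    end
end
```

Called as [ h, d ] = split_blocks_sketch( 'cssm.txt', 2 ), it should reproduce the two header blocks and two data blocks shown in the example, provided textscan keeps the empty lines as it does in cssm. Note that str2double is more permissive than the regular expressions in cssm: it also accepts tokens such as 1e3 or Inf, so files containing those would be classified differently.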
15 Comments
I'm not sure that I follow this, which is probably due to my lack of experience. Using textscan imports the entire file as an nx1 cell array, as you say. I haven't been successful using STRCMP to find specific lines, and maybe I'm not calling it the right way. All I know is that TF = strcmp(cac,'ABC'); does not find the string and just equals 0, and I haven't been able to use STRFIND. The help file says it finds a string within a string.
Using
str = transpose( char( cell_array_of_data ) );
nl = sprintf('\n');
str = cat( 1, str, repmat( nl(:), [length(nl),size(str,2)] ) );
array = textscan( str, appropriate_format_specifier, 'Delimiter',...
' ', 'Whitespace', '' );
With,
cell_array_of_data = cac{1,1}{8,1};
appropriate_format_specifier = '%f';
only extracts the first value of that row, 0.1, when it should be 0.1 2.53 2.5
further explanation is greatly appreciated, thanks!
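A minimal sketch of what may be going wrong here, assuming cac comes from textscan with '%s' as in the answer. textscan returns a 1x1 cell whose single element holds the lines, so the comparison has to be made on cac{1} rather than on cac. STRCMP only matches whole strings, whereas STRNCMP and STRFIND can find a line that merely starts with or contains 'ABC'. The variable names below are illustrative, and the cell array stands in for cac{1}.

```matlab
% Stand-in for cac{1}: a column cell array with one string per line.
txtlines = { '0.1 2.53 2.5'; '1'; 'ABC 0.123 123'; ' DE' };

is_exact    = strcmp(  txtlines, 'ABC' );        % whole-line match only
has_prefix  = strncmp( txtlines, 'ABC', 3 );     % compares the first 3 characters
hits        = strfind( txtlines, 'ABC' );        % cell array of start indices
has_pattern = not( cellfun( @isempty, hits ) );  % true where 'ABC' occurs

% Reading all numbers from one data line: the '%f' format is reused
% until the input is exhausted, so no delimiter options are needed.
row = sscanf( txtlines{1}, '%f' ).';             % row vector with three values
```

Here is_exact is false for every line because no line is exactly 'ABC', while has_prefix and has_pattern are true only for the 'ABC 0.123 123' line.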
per isakson
2012-8-10
Edited: per isakson
2012-8-10
- Try to use the markup features. That makes the text so much easier to read.
- You need to carefully read the help on STRCMP, STRFIND, and the TEXTSCAN format specifier
- What rule should be used to decide where the "headers at the top" end?
- "preserve" does that mean store in a separate variable?
- How shall "midfile headers" be identified?
Is this appropriate to use as data file in a test?
!!!!!!
! text text
! stuff
0.1 2.53 2.5
0.2 2.59 2.43
0.3 2.5 2.54
0.4 2.48 2.53
0.5 2.52 2.48
1
ABC 0.123 123
DE
0.456 0.456 456
0.1 2.56 2.34 2.63
0.2 2.61 2.48 2.43
0.3 2.54 2.51 2.6
0.4 2.57 2.54 2.49
0.5 2.48 2.63 2.5
yes, only there are empty spaces...just figured out how to actually put this in here properly:
!!!!!!
! text text
! stuff
0.1 2.53 2.5
0.2 2.59 2.43
0.3 2.5 2.54
0.4 2.48 2.53
0.5 2.52 2.48
1
ABC 0.123 123
DE
0.456 0.456 456
0.1 2.56 2.34 2.63
0.2 2.61 2.48 2.43
0.3 2.54 2.51 2.6
0.4 2.57 2.54 2.49
0.5 2.48 2.63 2.5
per isakson
2012-8-10
Edited: per isakson
2012-8-10
I read
- the first block of data contains three columns
- the second block of data contains four columns
- the lines "1" and " 0.456 0.456 456" belong to the middle header(?)
yes, that's correct
per isakson
2012-8-10
Edited: per isakson
2012-8-10
What is the purpose of reading this file?
- learn to use Matlab
- present these specific data in a diagram. If so, clean-up the file interactively with an editor and read with IMPORTDATA
- develop code to read data files, of which this is an example. If so, i) one needs to specify better what kinds of lines may occur in the files, ii) the code must be easy to maintain, i.e. easy to modify when new files present unexpected features, iii) does the size and number of files make performance an issue?
The above example might be more helpful to me if you can provide the example data file it is importing. Then I could see similarities and modify it to import my data file. Thanks!
per isakson
2012-8-13
Edited: per isakson
2012-8-13
It imports the data you provided in your question and confirmed in a comment above! Did you give it a chance?
It doesn't work...I'm getting rather confused with this approach and will likely revert back to my original file at this point. It would be great to learn how to write this so I could skip/store headerlines and extract data for more generic files as my future data files will have varying headerlines and matrices within the file, but for now I will have to stick to doing this on an individual basis.
As you please!
I have tested my code and it works here. You say: "It doesn't work...". That isn't very helpful.
It returns a blank answer 0x0 cell...not sure what it returns for you. Seems odd that I can't see anything from it. I would think I should at least be able to see some part of cac.
actually, commenting out the function call for cssm allows me to see more of what's happening...it looks like the number_of_floats portion finds the two matrices. I didn't see how you actually extracted the two matrices from the data file. Maybe because data_blocks is giving a 0x0 cell???
I think I have something that works! Thank you! I had to comment out number_of_data_columns( not(is_data) ) = nan; and data_blocks returned the two matrices. Not totally sure where the discrepancy is, since it worked as is for you. Thanks again!
per isakson
2012-8-14
Edited: per isakson
2012-8-14
I cannot guess what problems you see. However, here is what I get when I run the code above:
with "% number_of_data_columns( not(is_data) ) = nan;" commented out
>> [ header_blocks, data_blocks ] = cssm()
header_blocks =
{0x1 cell}
{0x1 cell}
{4x1 cell}
data_blocks =
[]
[5x3 double]
[5x4 double]
>> header_blocks{:}
ans =
Empty cell array: 0-by-1
ans =
Empty cell array: 0-by-1
ans =
'1'
'ABC 0.123 123'
' DE'
' 0.456 0.456 456'
>> data_blocks{:}
ans =
[]
ans =
0.1000 2.5300 2.5000
0.2000 2.5900 2.4300
0.3000 2.5000 2.5400
0.4000 2.4800 2.5300
0.5000 2.5200 2.4800
ans =
0.1000 2.5600 2.3400 2.6300
0.2000 2.6100 2.4800 2.4300
0.3000 2.5400 2.5100 2.6000
0.4000 2.5700 2.5400 2.4900
0.5000 2.4800 2.6300 2.5000
>>
.
with "number_of_data_columns( not(is_data) ) = nan;" in place
>> [ header_blocks, data_blocks ] = cssm()
header_blocks =
{6x1 cell}
{4x1 cell}
data_blocks =
[5x3 double]
[5x4 double]
>> header_blocks{:}
ans =
'!!!!!!'
''
'! text text'
''
'! stuff'
''
ans =
'1'
'ABC 0.123 123'
' DE'
' 0.456 0.456 456'
>> data_blocks{:}
ans =
0.1000 2.5300 2.5000
0.2000 2.5900 2.4300
0.3000 2.5000 2.5400
0.4000 2.4800 2.5300
0.5000 2.5200 2.4800
ans =
0.1000 2.5600 2.3400 2.6300
0.2000 2.6100 2.4800 2.4300
0.3000 2.5400 2.5100 2.6000
0.4000 2.5700 2.5400 2.4900
0.5000 2.4800 2.6300 2.5000
>>
.
Comment
In the text file there should be an empty line between "ABC..." and " DE". Adding that blank line doesn't cause any problems. I get
...
ans =
'1'
'ABC 0.123 123'
''
' DE'
' 0.456 0.456 456'
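The question also asked for the T-value from the middle header. A minimal sketch, assuming T is the first line of the second header block that consists of a single number (as the line '1' does in this file); that rule is an assumption for illustration, not something stated in the thread.

```matlab
[ header_blocks, data_blocks ] = cssm();

mid_header = header_blocks{2};      % header lines between the two data blocks
vals = str2double( mid_header );    % NaN for every line that is not one number
T    = vals( find( not( isnan( vals ) ), 1 ) );  % first single-number line, empty if none
```

For the header block shown above, this picks the line '1'. Lines such as ' 0.456 0.456 456' are skipped because str2double returns NaN for a string holding more than one number.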
