How can I read a text file with fixed widths in MATLAB

Hello. It's been a long day, and I am trying to figure out how to import a text file that has fixed-width data. I tried to write the code below, but it does not work. What I need to do is:
  • Import a selected piece of the data and use if conditions to proceed further.
  • Check a particular item in the 2014 data against the 2013 data for any difference.
  • For example, I need to check a particular field in the data. It has a width of 3, so how do I do it? Something like NBIData(1:3)=='356'?
  • I am attaching the text file too; please refer to it.
fid = fopen('NM145.txt');
c=zeros(1,3000);
count =0;
%loops through the file and read each record in sequence
while ~feof(fid)
    NBIData = fgetl(fid);
    count= count+1;
    % selects only valid highway bridges
    if (NBIData(19)== '1')&&(NBIData(374) == 'Y')&&((NBIData(200) =='1')||(NBIData(200) == '5')||(NBIData(200) == '6')||(NBIData(200) == '7')||(NBIData(200) == '8'))
        if (a==NBIData(1:3)&& b==356)
            strcmp(a,b)
            c(count)= NBIData(4:18);
        end
    end

Answers (2)

dpb 2015-4-3
Why didn't you also rewrite the test for the number as I suggested in the previous thread? The logic would be much easier to read.
To read a given fixed-width field, either convert the substring to a number, as in
str2num(NBIData(1:3))==356
or use a format string with the correct field widths and types and read the fields directly, instead of parsing line by line. This is a little more difficult than it should be in MATLAB, because a string conversion such as %s stops at the first blank rather than counting characters, so a fixed-width text field that contains blanks is not read as a single field. Depending on the data, you may have to use the counted substrings after all.
The first option comes back to a point I was going to make in the original thread about how the original snippet compares the bridge numbers: because it looks at a single character, the values that can be selected are limited to 0 through 9 inclusive (unless they are encoded in a base higher than ten). That may be all the field ever contains, but it seems limiting as a general case.
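As a minimal sketch of the substring approach, the snippet below compares a 3-character field both as text and as a number. The record `rec` is invented for illustration and is not taken from NM145.txt; `str2double` is used here as an alternative to `str2num` because it does not evaluate its input.

```matlab
% Hypothetical fixed-width record; this layout is an assumption for the example.
rec = '356ABCDEFGHIJKLMNO1X';

code = rec(1:3);                         % field in columns 1-3, still a char vector

isMatchChar = strcmp(code, '356');       % compares the whole field, returns one logical
isMatchNum  = str2double(code) == 356;   % numeric test; a non-numeric field gives NaN, which never matches

% Note: == on two char vectors compares element by element.
perChar = (code == '356');               % 1-by-3 logical, one result per character
```

Because `==` returns one result per character, it cannot be used directly in an `if` with `&&`; `strcmp` or a numeric comparison returns the single logical value that `&&` needs.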
  4 Comments
adrooney 2015-4-3
I don't understand what "one swell foop" means. As you said about the field definition, do you mean creating a specformat(%f %s %s %s.....) kind of thing? Sorry, a correction: the file does not have equal widths, but it does have specified widths. If you can look at the attached NM145.txt file in the question, that would be great. You will get an idea. Please give me an approach and I will write the code. Thanks!
dpb 2015-4-3
Edited: dpb 2015-4-5
It's a spoonerism for "one fell swoop", meaning in a single pass. Sorry, I figured it would be apparent in context as a little sidebar attempt at some levity.
Again, it depends on the field width definitions and content, which you haven't given. It does look as though the file has the problem mentioned above: embedded blanks within the comment/descriptive fields. In that case C (and hence MATLAB, since its formatted I/O is built around the C standard library) will not parse those fields correctly, even if they are specified as having a fixed length.
As noted, the only way then is to use substring addressing to select a section of the line (or of a 2D array, if you load the whole file as a character array) and assign that substring to a character array or cellstr array. It's "a sorry way to run a railroad", to use another vernacular, but that's the way such strings have to be parsed.
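To illustrate the embedded-blank problem, here is a small sketch that contrasts whitespace-delimited parsing with substring addressing. The record and its layout (a 15-character name followed by a 5-digit number) are assumptions made up for this example.

```matlab
% Hypothetical record: columns 1-15 hold a name, columns 16-20 hold a number.
rec = 'Rio Grande     00042';

% Whitespace-delimited parsing splits the name at the embedded blank.
parts  = textscan(rec, '%s');
tokens = parts{1};                 % three tokens: 'Rio', 'Grande', '00042'

% Substring addressing keeps the field intact.
name = strtrim(rec(1:15));         % 'Rio Grande'
num  = str2double(rec(16:20));     % 42
```

With `%s` the name field comes back as two tokens, so the token count no longer matches the field count; indexing by column avoids that.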



dpb 2015-4-3
OK, since we have gone this far, rather than amplify further on the above, which is more comment than answer, try the following demonstration. I took a small subset of the file and pasted it into the editor to look at the columns. Some are possible to discern; others "not so much" without knowing "who's who in the zoo", that is, a definition of what the actual fields are. But, as an example, here's an outline of how I'd probably attack handling the file...
fid=fopen('adroon.txt'); % open the file
roon=fread(fid,inf,'*char'); % read the entire file as one column of characters
fid=fclose(fid); % close the file; fid now holds the status returned by fclose
roon=reshape(roon,435,[]).'; % one record per row; 435 is the line length, counting the \n char
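The read-and-reshape step can be tried on a throwaway file. The sketch below writes three made-up 5-character records, reads them back as characters, and derives the record length from the first newline instead of hard-coding it. It assumes every line has exactly the same length and ends in a single \n (no \r\n); the file contents are hypothetical.

```matlab
% Create a small temporary fixed-width file (contents invented for the example).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', '356A1', '357B5', '356C2');
fclose(fid);

% Read the whole file as a column of characters.
fid = fopen(fname, 'r');
raw = fread(fid, inf, '*char');
fclose(fid);
delete(fname);

% Record length including the newline, taken from the first line.
recLen = find(raw == char(10), 1);
assert(mod(numel(raw), recLen) == 0, 'Records are not all the same length');

recs = reshape(raw, recLen, []).';   % one record per row
recs = recs(:, 1:recLen-1);          % drop the newline column
```

After this, `recs` is a 3-by-5 character array, and any field is a column range such as `recs(:,1:3)`.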
OK, now to do the selection. This uses the identified column locations and the previous logic of setting up an array of the wanted numbers; in reality I'd probably define named values for the columns that indicate what each column means instead of using magic numbers, but this gives the idea of how to handle the selection.
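% values wanted in column 200, held as characters (assumed here from the condition in the question)
nos='15678';
% return a logical vector by row for those wanted...
ikeep=roon(:,19)=='1' & roon(:,374)=='Y' & ismember(roon(:,200),nos);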
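A sketch of the same selection with named column constants in place of magic numbers might look as follows. The records, column positions, and names (`COL_CODE`, `COL_KIND`, `wantedKinds`) are all hypothetical and chosen only to make the example self-contained.

```matlab
% Hypothetical records: columns 1-3 = code, column 4 = type, column 5 = kind.
recs = ['356A1'; '357B5'; '356C2'; '356B7'];

COL_CODE = 1:3;            % named column ranges instead of magic numbers
COL_KIND = 5;
wantedKinds = '15678';     % single-character values, kept as characters

% Row-wise test of a multi-character field, plus set membership on a one-character field.
codeOK = strcmp(cellstr(recs(:, COL_CODE)), '356');
kindOK = ismember(recs(:, COL_KIND), wantedKinds);

ikeep = codeOK & kindOK;   % logical column vector, one entry per record
kept  = recs(ikeep, :);    % the selected records
```

Keeping `wantedKinds` as characters matters: the column being tested holds characters, and `ismember` would compare character codes against numbers if the set were numeric.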
One of the name fields appears to run from column 38 up to the next field, which begins at column 63. Let's get that field for the wanted records:
names=cellstr(roon(ikeep,38:62)); % select the field NAMES for those records wanted
What does that look like for some of them?
>> names(1:10)
ans =
'Channel'
'Spillway'
'RioGrande'
'Spillway'
'Canal'
'Rio Grande'
'Canal'
'Brown Arroyo'
'Low Flow channel'
'Cannel'
>>
Note this is a cell array (the ' around each string value is the giveaway), and note also that we did get the full field, including the embedded blanks.
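A short sketch of what `cellstr` does to a padded character field: trailing blanks are removed from each row, while blanks inside the text are kept. The three names are padded by hand to 12 characters for this example.

```matlab
% Hypothetical 12-character name field for three records.
field = ['Rio Grande  '; ...
         'Canal       '; ...
         'Brown Arroyo'];

names = cellstr(field);    % one cell per row, trailing blanks stripped

nameLengths = cellfun(@numel, names);   % lengths after stripping
```

This is why the field can be compared with `strcmp(names, 'Rio Grande')` without worrying about the padding.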
I'd generalize the above ideas once I knew what the actual field definitions are. One might be able to use the new table data type; my release isn't recent enough for me to know how well it would be able to parse the in-memory array. Other alternatives include a structure array with named fields or a conventional database.
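For what it's worth, one way the in-memory character array could be turned into a table is to slice each field by column and convert it before calling `table`. This is only a sketch under assumed field definitions (3-character code, 12-character name, 1-character kind); it needs a release that has the table data type (R2013b or later), and the variable names are invented.

```matlab
% Hypothetical records: columns 1-3 code, 4-15 name, 16 kind.
recs = ['356Rio Grande  1'; ...
        '357Canal       5'];

code = str2double(cellstr(recs(:, 1:3)));   % numeric column
name = cellstr(recs(:, 4:15));              % text column, padding stripped
kind = recs(:, 16);                         % single-character column

T = table(code, name, kind, 'VariableNames', {'Code', 'Name', 'Kind'});

sel = T(T.Code == 356, :);                  % select rows by a named field
```

With named variables, a check such as `T.Code == 356` reads more clearly than a column index.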
This file pretty obviously was written to be used with one of those kinds of applications. Fortran FORMATted I/O would handle it with ease, as it "knows" what was intended when fixed-width fields were invented, which the creators of C forgot.
