fprintf and fscanf same format fail to read file. appreciate help

Question

Jörgen 2020-2-7

https://ww2.mathworks.cn/matlabcentral/answers/504170-fprintf-and-fscanf-same-format-fail-to-read-file-appreciate-help

Edited: dpb 2020-2-9

I wrote a text file on a Windows PC. (A is a large matrix of size [29909961, 4].)

fid=fopen(filename,'w')
fprintf(fid,'%5i%5i   %17.4E%17.4E\r\n', A);
fclose(fid)

But fscanf fails to read it back in. I have tried a few different options, without success:

fid=fopen(filename,'r')
K51=fscanf(fid,'%5i%5i   %17.4E%17.4E\r\n');
%K51=fscanf(fid,'%5i%5i%20.4E%17.4E%\r%*c%\n%*c');
fclose(fid)

I would appreciate any help!

Answer 1

dpb 2020-2-7

https://ww2.mathworks.cn/matlabcentral/answers/504170-fprintf-and-fscanf-same-format-fail-to-read-file-appreciate-help#answer_414388

Edited: dpb 2020-2-8

MATLAB is NOT Fortran.

As Stephen notes above and is documented, fscanf returns a single array of a given class; if there are any doubles in the input file, then the output returned has to be double.

What is also documented but you have to read every stinkin' line through to the end to find it is:

"Format specifiers for the reading functions sscanf and fscanf differ from the formats for the writing functions sprintf and fprintf. The reading functions do not support a precision field."

So, the thing that is breaking your input read is the ".4" in the e-format fields in your format string. It simply isn't supported by MATLAB. (I don't know if this is true of the underlying C fscanf or is only a limitation in the TMW-supplied vectorized versions in MATLAB.)

To read all numeric data you don't need the specific format and are better off without it...just use '%f'

>> x=[randi(100,20,2) rand(20,2)];   % dummy data set
>> fid=fopen('jorgen.dat','w');
>> fprintf(fid,'%5i %5i   %17.4E %17.4E\n', x.');
>> fid=fclose(fid);
>> fid=fopen('jorgen.dat','r');
>> fscanf(fid,'%f',[4,inf]).'
ans =
0000   17.0000    0.1066    0.8530
0000   80.0000    0.9619    0.6221
0000   32.0000    0.0046    0.3509
0000   53.0000    0.7749    0.5132
0000   17.0000    0.8173    0.4018
0000   61.0000    0.8687    0.0760
0000   27.0000    0.0844    0.2399
0000   66.0000    0.3998    0.1233
0000   69.0000    0.2599    0.1839
0000   75.0000    0.8001    0.2399
0000   46.0000    0.4314    0.4173
0000    9.0000    0.9106    0.0497
0000   23.0000    0.1819    0.9027
0000   92.0000    0.2638    0.9448
0000   16.0000    0.1455    0.4909
0000   83.0000    0.1361    0.4893
0000   54.0000    0.8693    0.3377
0000  100.0000    0.5797    0.9001
0000    8.0000    0.5499    0.3693
0000   45.0000    0.1449    0.1112
>> fid=fclose(fid);
>> type jorgen.dat  % to compare
 17          1.0665E-01       8.5303E-01
 80          9.6190E-01       6.2206E-01
 32          4.6342E-03       3.5095E-01
 53          7.7491E-01       5.1325E-01
 17          8.1730E-01       4.0181E-01
 61          8.6869E-01       7.5967E-02
 27          8.4436E-02       2.3992E-01
 66          3.9978E-01       1.2332E-01
 69          2.5987E-01       1.8391E-01
 75          8.0007E-01       2.3995E-01
 46          4.3141E-01       4.1727E-01
  9          9.1065E-01       4.9654E-02
 23          1.8185E-01       9.0272E-01
 92          2.6380E-01       9.4479E-01
 16          1.4554E-01       4.9086E-01
 83          1.3607E-01       4.8925E-01
 54          8.6929E-01       3.3772E-01
100          5.7970E-01       9.0005E-01
  8          5.4986E-01       3.6925E-01
 45          1.4495E-01       1.1120E-01
>> 
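The round trip above can also be written as a self-checking script. This is only a sketch under assumptions: the scratch file name from tempname and the 1e-4 tolerance are illustrative choices, not part of the original answer. The tolerance reflects that %17.4E keeps five significant digits.

```matlab
% Round-trip check: write with a precision field, read back without one.
x = [randi(100, 20, 2), rand(20, 2)];      % dummy data, 20 rows by 4 columns
fname = [tempname, '.dat'];                % hypothetical scratch file name

fid = fopen(fname, 'w');
fprintf(fid, '%5i %5i   %17.4E %17.4E\n', x.');   % transpose: one matrix row per line
fclose(fid);

fid = fopen(fname, 'r');
y = fscanf(fid, '%f', [4, Inf]).';         % fill a 4-by-N array column-wise, then transpose
fclose(fid);
delete(fname);

% Integer columns should come back exactly; the %17.4E columns only to
% about five significant digits (assumed tolerance 1e-4, relative).
relErr = abs(y(:, 3:4) - x(:, 3:4)) ./ abs(x(:, 3:4));
ok = isequal(size(y), size(x)) && isequal(y(:, 1:2), x(:, 1:2)) && all(relErr(:) < 1e-4);
```

The size argument [4, Inf] matters: fscanf fills its output column by column, so the four values of each text line land in one column, and the trailing transpose restores the original row layout.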

Even simpler is the now sadly deprecated textread --

>> textread('jorgen.dat')
ans =
0000   17.0000    0.1066    0.8530
0000   80.0000    0.9619    0.6221
0000   32.0000    0.0046    0.3509
0000   53.0000    0.7749    0.5132
0000   17.0000    0.8173    0.4018
0000   61.0000    0.8687    0.0760
0000   27.0000    0.0844    0.2399
0000   66.0000    0.3998    0.1233
0000   69.0000    0.2599    0.1839
0000   75.0000    0.8001    0.2399
0000   46.0000    0.4314    0.4173
0000    9.0000    0.9106    0.0497
0000   23.0000    0.1819    0.9027
0000   92.0000    0.2638    0.9448
0000   16.0000    0.1455    0.4909
0000   83.0000    0.1361    0.4893
0000   54.0000    0.8693    0.3377
0000  100.0000    0.5797    0.9001
0000    8.0000    0.5499    0.3693
0000   45.0000    0.1449    0.1112
>> 

It saves messing around with file handles and format strings entirely for all-numeric data arrays. textscan can do the same, except that you have to use a file handle and then convert the resulting cell array to a double array manually.
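For comparison, a textscan version might look like the sketch below. The small test matrix and the scratch file name are assumptions made so the example is self-contained; the 'CollectOutput' option is used here as one way to get a single numeric matrix out of the cell array.

```matlab
% Write a small all-numeric file so the sketch is self-contained.
x = [1 2 0.5 0.25; 3 4 0.125 0.0625];      % illustrative data, not from the thread
fname = [tempname, '.dat'];                % hypothetical scratch file name
fid = fopen(fname, 'w');
fprintf(fid, '%5i %5i   %17.4E %17.4E\n', x.');
fclose(fid);

% textscan needs a file handle and returns a cell array.
fid = fopen(fname, 'r');
C = textscan(fid, '%f %f %f %f', 'CollectOutput', true);
fclose(fid);
M = C{1};                                  % N-by-4 double matrix

delete(fname);
```

Without 'CollectOutput', textscan returns one cell per column, and [C{:}] would do the same conversion. On R2019a and later, readmatrix is another option worth trying for all-numeric text files.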

Jörgen 2020-2-7

Thank you very much. I see that I should have spent more time on RTFM. The problem is solved!

dpb 2020-2-7

Edited: dpb 2020-2-8

The limitation certainly ought to be highlighted in the section on input arguments for the format string at a minimum instead of stuck down at the bottom as a "Tip" :(

Seems like it ought to also generate an error or at least a warning....

Answer 2

dpb 2020-2-7

https://ww2.mathworks.cn/matlabcentral/answers/504170-fprintf-and-fscanf-same-format-fail-to-read-file-appreciate-help#answer_414342

fprintf(fid,'%5i%5i %17.4E%17.4E\r\n', A);

will write A in column-major order, not row-major as you're trying to read it in later. Use

fprintf(fid,'%5i%5i %17.4E%17.4E\n', A.')
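A tiny illustration of the ordering issue, using sprintf so that nothing is written to disk. The 2-by-2 matrix is an assumed example, not data from the question.

```matlab
A = [1 2; 3 4];

% fprintf/sprintf consume the elements of A in column-major order, A(:) = 1 3 2 4.
byColumn = sprintf('%d %d\n', A);      % lines "1 3" and "2 4"

% Transposing first makes each output line correspond to one row of A.
byRow = sprintf('%d %d\n', A.');       % lines "1 2" and "3 4"
```

The same rule applies when reading: fscanf with a size argument fills its output column by column, which is why a transpose appears on both the write and the read side.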

dpb 2020-2-8

Edited: dpb 2020-2-8

"Your fprintf format has a potential bug because you did not include any delimiters between the numbers. I strongly recommend adding spaces to ensure that the numbers do not end up merging together"

That's even more strongly fraught with danger--on write, extra white space is transmitted verbatim to the output line. If building a fixed-width format field for a Fortran application, as appears might be the objective here, that could be a fatal error.

Examine the following:

>> length(sprintf('%5i %5i',1,1))
ans =
    11
>> length(sprintf('%5i%5i',1,1))
ans =
    10
>> 

OTOMH, I don't know what the C Standard says re: embedded blanks in a formatting string on input nor what MATLAB does if anything unique in their implementation and haven't tried testing it.

C formatted i/o is a real bugger as implemented--would have been much nicer if TMW had stayed with the roots of MATLAB and FORTRAN and implemented it via emulating FORMAT instead--we would then have field repeat specifiers, complex variables and all kinds of other niceties lacking the way they chose (the easy way out from a programming/development standpoint, I'm sure, though).

Stephen23 2020-2-9

Edited: Stephen23 2020-2-9

"That's even more strongly fraught with danger--on write extra white space is transmitted verbatim to the output line. If building a fixed-width format field for Fortran application as appears might be the object here, that could be a fatal error."

I presumed that the author of the code would reduce the field width by one to compensate.

Are these two exactly equivalent? No, of course not, but given the differences between FORTRAN and C-style parsing there is no 100% equivalence. But with a bit of flexibility, it is possible to define file formats that write/read robustly using both.

dpb 2020-2-9

Edited: dpb 2020-2-9

The big problem though and the reason for making the observation is that often the Fortran is legacy code that can't be modified and that does indeed use fixed format READ statements.

Since input parsing with Fortran FORMAT field widths counts every character, the white space can cause silent errors in input interpretation.
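The silent error can be sketched in MATLAB by emulating a fixed-column read. The helper readI5x2 below is hypothetical and only mimics what a Fortran (2I5) edit descriptor would do, namely take characters 1-5 and 6-10 and nothing else. The values 12345 and 678 are likewise assumed for illustration.

```matlab
% Emulate a Fortran-style (2I5) read: characters 1-5 and 6-10, nothing else.
% readI5x2 is a hypothetical helper for illustration only.
readI5x2 = @(line) [sscanf(line(1:5), '%d'), sscanf(line(6:10), '%d')];

tight  = sprintf('%5i%5i',  12345, 678);   % '12345  678'  (10 characters)
spaced = sprintf('%5i %5i', 12345, 678);   % '12345   678' (11 characters)

vTight  = readI5x2(tight);    % second field is '  678'
vSpaced = readI5x2(spaced);   % second field is '   67': the last digit is cut off
```

With the extra blank in the format string the second value is read as 67 instead of 678, and nothing flags the problem, which is the kind of silent misinterpretation described above.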

If one is building both at the same time then yes, one can do much better in designing input formatting and also in coding; using '*' list-directed input instead of fixed formats is one of the best options for user hand-modified inputs. But for automated data transfer between the two, it's more reliable to just use an output format identical to the input format of the application; even if the two I5 fields "run together" visually, the Fortran application will have no problem reading them correctly, despite the difficulty that might cause when inspecting the input file by eye.

Frequently/Most(?) often, the input format will have been set up with the expectation the input for that field will never have as many significant digits as the field width so in practice it won't happen even if it could.

ADDENDUM:

Also, the comment was intended to ensure the OP was aware of the difference between the two if doing so as his FORMAT statement itself might be causing a problem he wasn't actually aware of in that the blanks in the ML/C-style formatting string are significant whereas embedded in a FORMAT statement blanks are insignificant.

Answer 3

Jörgen 2020-2-7

https://ww2.mathworks.cn/matlabcentral/answers/504170-fprintf-and-fscanf-same-format-fail-to-read-file-appreciate-help#answer_414373

The fprintf function does what it should do. We have read the resulting text file with a Fortran executable using the same format spec, and that worked fine. But I wanted to check some numbers and read the file with fscanf to speed up reading compared to MATLAB's import function. Unfortunately, I think I will have to spend time on this another time. Thank you for your answers though.
