Read string from files since R2020a
I have a large binary data file with an ASCII-formatted metadata header at the beginning. To read this header, I use string-oriented read functions such as fgetl(~), fscanf(~, '%s', ~), or fread(~, ~, '*char'). In MATLAB versions prior to R2020a (I have R2014b and R2019b) this worked just fine, but in R2020a something changed.
Now the very first attempt to read any string from the file, and only the first, uses an excessive amount of memory and freezes MATLAB. My guess is that MATLAB is trying to read the whole file into memory. In my case the file is larger than the available RAM, which probably causes the freeze.
Here is what I do:
% Here everything works just fine
fd = fopen('file.name', 'r');
arr1 = fread(fd, 1);
arr2 = fread(fd, 1);
fclose(fd);
% Here I have a problem
fd = fopen('file.name', 'r');
arr1 = fread(fd, 1); % fast and smooth
arr2 = fread(fd, 1, '*char'); % slow and uses an excessive amount of RAM
arr3 = fread(fd, 1, '*char'); % fast and smooth again
fclose(fd);
1) It does not matter which part of the file I read.
2) All read functions that return numeric types are always fast.
3) The first read function that returns a string is always slow, no matter which function I use (as long as it returns a string).
4) All subsequent string reads are as fast as numeric ones.
5) Once the read function returns the string, the memory is released.
6) The file position pointer is always at the expected position (it does not move to the end of the file).
7) It does not matter whether the file is opened in text or binary mode.
8) The issue is present on both Windows and Linux.
Any ideas?
Accepted Answer
Sindar
2020-4-22
From the release notes:
"As of R2020a, character-oriented file I/O functions such as fscanf, fgets, and fgetl trigger automatic character set detection when reading a file that was opened using fopen without a specified encoding."
My suspicion, then, is that the automatic character set detection may need to scan the entire file.
Try specifying the encoding in fopen, e.g.,
fd = fopen('file.name', 'r','n','UTF-8');
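To make the suggestion concrete, here is a minimal, self-contained sketch. It builds a small demo file (the temporary file name, the header text 'HEADER v1', and the three uint16 payload values are all made up for illustration) and then reads the header in two ways: first with an explicit encoding passed to fopen, and second by reading raw bytes as uint8 and converting them to char. The second variant is only a possible fallback, based on the observation in the question that numeric reads stay fast; whether it avoids the detection step in every case is an assumption, not something confirmed by the release notes.

```matlab
% Build a small demo file: one ASCII header line followed by binary data.
% The file name and contents are hypothetical, for illustration only.
fname = [tempname '.bin'];
fd = fopen(fname, 'w');
fprintf(fd, 'HEADER v1\n');               % ASCII metadata header
fwrite(fd, uint16([1 2 3]), 'uint16');    % binary payload
fclose(fd);

% Variant 1: state the encoding explicitly when opening the file, so
% MATLAB does not have to detect the character set on its own.
fd = fopen(fname, 'r', 'n', 'US-ASCII');
headerLine = fgetl(fd);                   % header text without the newline
payload = fread(fd, 3, '*uint16');        % numeric read continues after the header
fclose(fd);

% Variant 2 (assumed fallback): read the header as raw bytes with a
% numeric precision and convert to char afterwards. This is valid only
% for single-byte encodings such as ASCII.
fd = fopen(fname, 'r');
raw = fread(fd, 9, '*uint8');             % 9 = number of bytes in 'HEADER v1'
headerAlt = char(raw.');                  % column of bytes -> char row vector
fclose(fd);

delete(fname);                            % remove the demo file
```

If the header is pure ASCII, 'US-ASCII' and 'UTF-8' should decode it identically, so either name can be passed to fopen.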
2 Comments
Walter Roberson
2020-4-22
See also the discussion at https://www.mathworks.com/matlabcentral/answers/512803-why-do-i-get-out-of-memory-when-reading-only-16-chars