Errors while reading binary data files

I am trying to read binary files containing, for example, uint32 entries. The function reads the data correctly up to a certain element and then returns unexpected values (which I am sure do not exist in the original file). In some cases these values blow up: for example, I was reading a uint32 file with a maximum value of ~8000, and the maximum of the data read was ~4.127*10^9. The code I am using is shown below (note: the asterisk in the precision string makes no difference; I repeated this with different files and checked the data using other programming languages and software):
function [X,Y,Z,Volume] = GetBin(filename,volumeSize,nOfBytes)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Reads a binary formatted file into a 3D MATLAB matrix.
%
% INPUT:
% filename: string, name of binary file for reading
%
% OUTPUT:
% X,Y,Z: integer, size of matrix in cartesian coordinates
% Volume: integer 3D matrix, voxel values (labels)
% Ahmed Zankoor , April 2021
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
file = "data\" + filename;
fid = fopen(file, 'rt');
if fid == -1
error('Cannot open file for reading: %s', file);
end
X = volumeSize(1);Y = volumeSize(2);Z = volumeSize(3);
% Read the binary file. Depending on nOfBytes, fread interprets each entry as
% an 8-bit unsigned integer (uint8, 1 byte), a 16-bit unsigned integer (uint16, 2 bytes)
% or a 32-bit unsigned integer (uint32, 4 bytes).
if nOfBytes == 1
data = fread(fid,Inf,'*uint8');
data = uint8(data);
elseif nOfBytes == 2
data = fread(fid,Inf,'*uint16');
data = uint16(data);
elseif nOfBytes == 4
data = fread(fid,Inf,'*uint32');
data = uint32(data);
else
error('Unrecognized number of bytes per entry.')
end
fclose(fid);
if length(data)~= X*Y*Z
disp('Size of data does not match size of Volume.')
disp(['Size of data = ' num2str(length(data))])
disp(['Size of volume = ' num2str(X*Y*Z)])
end
Z = floor(length(data)/(X*Y));
data = data(1:X*Y*Z);
Volume = reshape(data,X,Y,Z);
end
For visualization, the image attached shows an example of the read data, where the top is correctly read data and then the mess below is because of the errors. I wonder if anyone knows why this may be happening?
Thank you.
2 Comments
Walter Roberson 2021-4-21
Does the source of the data happen to be a FORTRAN program?
Ahmed Zankoor 2021-4-21
Edited: Ahmed Zankoor 2021-4-21
No, I tried with files written by C++ and others exported from a commercial software which I think is also written in C++. Same problem.

Accepted Answer

Jan 2021-4-21
Edited: Jan 2021-4-21
The problem is hidden here:
fid = fopen(file, 'rt');
This opens the file in text mode on Windows. In text mode, for example, a CHAR(8) is treated as a backspace, which means that the preceding byte is deleted, ^Z is interpreted as end of file, and there are many further gimmicks. Therefore a file with arbitrary bytes can yield fewer characters after the import than the file has bytes on disk. Once bytes are dropped, the remaining bytes are no longer aligned to the 4-byte boundaries of the uint32 entries, so later values are assembled from the wrong bytes, which would explain the huge numbers.
The solution is easy and makes the code platform-independent: open the file in binary mode by omitting the 't':
fid = fopen(file, 'r');
I prefer this for text files too, because the old DOS control characters are a common source of unexpected behaviour. The interpretation in text mode also costs runtime.
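A quick way to check this on your own machine is to write known uint32 values and read them back in both modes. This is only a minimal sketch: the temporary file and the test values are made up for illustration, and the text-mode result is only expected to differ on Windows (on other platforms 'rt' and 'r' behave the same).

```matlab
% Hypothetical round-trip check (file name and values are illustrative only).
tmpFile  = [tempname '.bin'];
expected = uint32([1 26 8000 13 10 4096]);  % 26 (^Z), 13 (CR), 10 (LF) are risky in text mode

fid = fopen(tmpFile, 'w');                  % binary write
fwrite(fid, expected, 'uint32');
fclose(fid);

fid = fopen(tmpFile, 'r');                  % binary read: bytes come back unchanged
binData = fread(fid, Inf, '*uint32').';
fclose(fid);

fid = fopen(tmpFile, 'rt');                 % text read: may differ on Windows
txtData = fread(fid, Inf, '*uint32').';
fclose(fid);
delete(tmpFile);

fprintf('binary mode matches: %d\n', isequal(binData, expected));
fprintf('text mode matches:   %d\n', isequal(txtData, expected));
```

If the second line prints 0 while the first prints 1, the text-mode translation is what corrupts the data.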
A hint: fread(fid, inf, '*uint8') already returns a UINT8 array, so you can omit lines such as:
data = uint8(data);
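Putting both points together, the reading part of the function could be reduced to something like the following. This is a sketch only: readVolume and precision are names introduced here, not taken from the original code, and it assumes the file holds exactly prod(volumeSize) entries in the machine's native byte order. Unlike the original, it raises an error on a size mismatch instead of truncating the data.

```matlab
function Volume = readVolume(file, volumeSize, nOfBytes)
% Hypothetical helper: read a raw binary volume in binary mode.
switch nOfBytes
    case 1
        precision = '*uint8';
    case 2
        precision = '*uint16';
    case 4
        precision = '*uint32';
    otherwise
        error('Unrecognized number of bytes per entry.');
end
fid = fopen(file, 'r');                 % 'r' = binary mode, no 't'
if fid == -1
    error('Cannot open file for reading: %s', file);
end
cleanup = onCleanup(@() fclose(fid));   % close the file even if fread errors
data = fread(fid, Inf, precision);      % '*' keeps the integer class, no cast needed
if numel(data) ~= prod(volumeSize)
    error('File holds %d entries, expected %d.', numel(data), prod(volumeSize));
end
Volume = reshape(data, volumeSize(:).');
end
```

It would be called as, for example, Volume = readVolume(file, [X Y Z], 4); for a uint32 file.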
