Read binary with ascii header

David 2015-11-24
Commented: Guillaume 2015-11-24
Hi,
I have a .vtu file that contains numerical data (velocity values at certain x, y, z points) in binary format. The format and other information about the file are given in a short ASCII header at the beginning of the file:
<VTKFile type="UnstructuredGrid" version="0.1" byte_order="LittleEndian" compressor="vtkZLibDataCompressor">
<UnstructuredGrid>
<Piece NumberOfPoints=" 318105" NumberOfCells=" 1779159">
<Points>
<DataArray type="Float64" NumberOfComponents="3" format="binary">
NumberOfPoints specifies the number of data rows in the file, and there should be 7 columns of data. As specified, the data array is stored in Float64 format.
Anyway, the question is: how can I read the ASCII header, extract information from it, and then proceed to read the rest of the file as binary?
Any ideas?

Answers (2)

Guillaume 2015-11-24
You can still use text-reading operations on a binary file, so use fgetl (or fgets) to read the text (assuming the header really consists of lines ending with a newline), then use normal binary reading (with fread).
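fid = fopen('yourfile.vtu');
% read the 5 header lines as text; fgetl strips the line terminator
header = arrayfun(@(~) fgetl(fid), 1:5, 'UniformOutput', false);
% the number of points is on header line 3, after optional leading spaces
numpoints = str2double(regexp(header{3}, '(?<=NumberOfPoints=" +)\d+', 'match', 'once'));
% the data type is on header line 5, e.g. 'Float64'
datatype = regexp(header{5}, '(?<=type=")[^"]+(?=")', 'match', 'once');
% assumes lower(datatype) is a valid precision for fread; 'float64' is
points = fread(fid, [numpoints 7], lower(datatype));
fclose(fid);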
Note that I assumed that the format of your header is constant: always 5 lines, with the number of points on the 3rd and the data type on the 5th.
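If the header length is not fixed, one option is to keep reading lines until the DataArray line appears instead of counting to 5. The following is only a minimal sketch under stated assumptions: it builds a small stand-in file with made-up values rather than a real .vtu file, and it assumes that the raw values follow the DataArray line directly, uncompressed, with the components of each point stored together. None of that is confirmed for the file in the question.

```matlab
% Stand-in file (hypothetical data): ASCII header lines, then raw float64 values.
fname = [tempname '.bin'];
xyz = [0 1 2; 3 4 5]';                      % 3 components x 2 points
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', '<Piece NumberOfPoints="   2" NumberOfCells="   1">', ...
                     '<Points>', ...
                     '<DataArray type="Float64" NumberOfComponents="3" format="binary">');
fwrite(fid, xyz, 'float64');
fclose(fid);

% Read header lines as text until the DataArray line has been consumed.
fid = fopen(fname, 'r');
header = {};
tline = fgetl(fid);
while ischar(tline)                         % fgetl returns -1 at end of file
    header{end+1} = tline;                  %#ok<AGROW>
    if ~isempty(strfind(tline, '<DataArray'))
        break;
    end
    tline = fgetl(fid);
end

% Extract the values of interest from the collected header text.
tok = regexp(strjoin(header, ' '), 'NumberOfPoints="\s*(\d+)', 'tokens', 'once');
numpoints = str2double(tok{1});
tok = regexp(header{end}, 'type="([^"]+)"', 'tokens', 'once');
datatype = lower(tok{1});                   % 'float64' is a valid fread precision
tok = regexp(header{end}, 'NumberOfComponents="(\d+)"', 'tokens', 'once');
ncomp = str2double(tok{1});

% Binary read continues from where the text read stopped.
% Assumption: values are interleaved per point, hence [ncomp numpoints] and a transpose.
points = fread(fid, [ncomp numpoints], datatype)';
fclose(fid);
delete(fname);
```

With this layout, points has one row per point and one column per component. If the real file stores the data differently, the size argument to fread has to change accordingly.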
  3 Comments
Guillaume 2015-11-24
float64 (in lowercase) is a valid type for fread. It is equivalent to double.
There can be several reasons for the binary part not to read properly:
- End-of-line issues. As you probably know, there are different conventions for marking the end of a text line: Windows text files typically use '\r\n', Unix text files use '\n', and other exotic formats may use something different. When you read a file in text mode, MATLAB automatically detects the line ending and strips it off. In binary mode, I'm not sure what happens (the doc is not clear on that), so possibly part of the end of the last header line is left over and causes an offset error.
- Are you sure the data starts immediately after the text portion? If not, this would also cause an offset error.
- I assumed that 'float64' means IEEE 754 double precision float. It could mean something else.
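One way to check for an offset problem is to look at the raw bytes that follow the last header line before interpreting them as numbers. This is a rough sketch on a made-up stand-in file (the file name and its contents are hypothetical); on the real file you would only need the part between the two fopen/fclose calls at the end.

```matlab
% Stand-in file: one header line ending in '\n', then a single float64 value.
fname = [tempname '.bin'];
fid = fopen(fname, 'w');
fprintf(fid, '<DataArray type="Float64">\n');
fwrite(fid, 1.5, 'float64');
fclose(fid);

fid = fopen(fname, 'r');
hdr = fgetl(fid);                    % header line without its terminator
pos = ftell(fid);                    % byte offset where the next read starts
peek = fread(fid, 8, '*uint8')';     % raw bytes right after the header
fseek(fid, pos, 'bof');              % rewind so the real read starts at the same place
fclose(fid);
delete(fname);

fprintf('next read starts at byte %d\n', pos);
fprintf('%02X ', peek);              % show the bytes in hex
fprintf('\n');
```

If the first bytes turn out to be 0D, 0A or 20 (carriage return, line feed, space), some whitespace is left over before the data. If they are all printable characters, the payload may be text-encoded rather than raw binary.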
Guillaume 2015-11-24
Another two things:
- Endianness seems to be encoded in the file. It may be safer to pass it explicitly to fread, although if you are on Windows you were already reading in little-endian order:
endianness = regexp(header{1}, '(?<=byte_order=")[^"]+', 'match', 'once');
points = fread(fid, [numpoints 7], lower(datatype), 0, lower(endianness(1))); %assume endianness either starts with 'L' or 'B'
- More importantly, your header includes compressor="vtkZLibDataCompressor", which would indicate that your data is compressed with zlib. If that is the case, reading it is a lot more complicated. There are submissions on the File Exchange that decompress zlib data.
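For the zlib step alone, MATLAB's bundled Java classes can be used. The sketch below is a round trip on made-up values, not a .vtu reader: it requires the JVM, and it says nothing about how the compressed bytes are laid out inside the file (block headers, possible text encoding of the payload), which would have to be handled first.

```matlab
% Made-up payload: three doubles viewed as raw bytes.
raw = typecast([1.5 2.5 3.5], 'uint8');

% Compress with zlib (only to have something to decompress in this sketch).
buf = java.io.ByteArrayOutputStream();
deflater = java.util.zip.DeflaterOutputStream(buf);
deflater.write(raw, 0, numel(raw));
deflater.close();
compressed = typecast(buf.toByteArray(), 'uint8')';

% Decompress: write the compressed bytes through an InflaterOutputStream.
buf = java.io.ByteArrayOutputStream();
inflater = java.util.zip.InflaterOutputStream(buf);
inflater.write(compressed, 0, numel(compressed));
inflater.close();
restored = typecast(buf.toByteArray(), 'uint8')';   % Java returns int8; reinterpret as uint8

% Reinterpret the recovered bytes as doubles (native byte order).
values = typecast(restored, 'double');
```

typecast uses the machine's native byte order, so on a big-endian machine a little-endian payload would additionally need swapbytes.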



Thorsten 2015-11-24
Edited: Thorsten 2015-11-24
Use fgets to read the first ASCII lines and then use fread (with appropriate arguments) for the remaining binary stuff.
  2 Comments
David 2015-11-24
Hi,
Thanks for the reply. I can easily read the first header lines using fgets, but I'm not sure what arguments to give fread (or whichever function is best for the binary reading) so that it knows to skip the header lines.
Do you have any suggestion on that?
Image Analyst 2015-11-24
Here are two lines of code from one of my utilities that reads 2-D image slices from a raw 3-D image:
dataLengthString = '*uint16'; % You need the *, otherwise fread returns doubles.
oneFullSlice = fread(fileHandle, [rows, columns], dataLengthString);
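On the question of how fread knows to skip the header: it needs no extra argument, because every read on a file identifier advances the same file position, so fread simply continues where fgets stopped. A small sketch with a made-up stand-in file (the file contents are hypothetical) shows this, along with the alternative of jumping to a known offset with fseek:

```matlab
% Stand-in file: two text lines, then three uint16 values.
fname = [tempname '.bin'];
fid = fopen(fname, 'w');
fprintf(fid, 'line one\nline two\n');
fwrite(fid, uint16([100 200 300]), 'uint16');
fclose(fid);

% Read the text lines, then keep reading from the same position as binary.
fid = fopen(fname, 'r');
fgets(fid);
fgets(fid);
offset = ftell(fid);                    % number of header bytes consumed so far
vals = fread(fid, 3, '*uint16')';       % starts right after the header
fclose(fid);

% Alternative: if the header size in bytes is known, jump there directly.
fid = fopen(fname, 'r');
fseek(fid, offset, 'bof');
again = fread(fid, 3, '*uint16')';
fclose(fid);
delete(fname);
```

Both reads return the same three values. The fseek variant only helps when the header length in bytes is known in advance.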
