Distinguish between ASCII and Binary

What would be an elegant way to distinguish an ASCII file from a binary one? Specifically, I'm working with STL files, which can be either, and I need a way to separate the two.
Thanks,
Tero
2 comments
Stephen23 2020-11-5
Edited: Stephen23 2020-11-5
The elegant way is to read the file format description. Wikipedia gives an outline:
Apparently STL text files must start with the string "solid", whereas STL binary files must NOT start with that string. So to know the difference, you just need to read the first five characters. And testing those five characters is easy in "an elegant way", certainly much faster and more elegant than parsing the entire file.
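The five-character check described above could look something like the following minimal sketch. The function name `startsWithSolid` is hypothetical (it is not from the original post), and the sketch assumes the keyword sits at the very first byte of the file, with no leading whitespace.

```matlab
function tf = startsWithSolid(stlfilename)
% Hypothetical helper: true if the first five bytes of the file are 'solid'.
fid = fopen(stlfilename, 'r');            % binary mode, no text decoding
if fid < 0
    error('Cannot open STL file: %s', stlfilename);
end
cleanup = onCleanup(@() fclose(fid));     % close the file even if fread errors
head = fread(fid, 5, 'uint8=>char')';     % row vector of up to five characters
tf = strcmp(head, 'solid');               % false for shorter or empty files
end
```

Only five bytes are read, so the cost does not grow with the size of the file.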
Chris Hooper 2024-3-25
I read a claim that some binary .stl files can still begin with "solid". I'm not sure if it's true.
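If that claim holds, one possible cross-check is the file size. In the commonly documented binary STL layout, an 80-byte header is followed by a little-endian uint32 triangle count and then 50 bytes per triangle, so a binary file should be exactly 84 + 50*n bytes long. The sketch below tests that; the function name `looksLikeBinaryStl` is hypothetical, and this is a heuristic rather than a guarantee, since a text file could in principle match the size by coincidence.

```matlab
function tf = looksLikeBinaryStl(stlfilename)
% Hypothetical helper: true if the file size matches the binary STL layout
% (80-byte header + uint32 triangle count + 50 bytes per triangle).
info = dir(stlfilename);
if isempty(info)
    error('Cannot find STL file: %s', stlfilename);
end
nbytes = info(1).bytes;
tf = false;
if nbytes < 84
    return                                 % too short to hold header and count
end
fid = fopen(stlfilename, 'r', 'ieee-le');  % assumption: little-endian data
if fid < 0
    error('Cannot open STL file: %s', stlfilename);
end
cleanup = onCleanup(@() fclose(fid));      % close the file on any exit path
fseek(fid, 80, 'bof');                     % skip the 80-byte header
ntri = fread(fid, 1, 'uint32');            % triangle count, returned as double
tf = (84 + 50*ntri == nbytes);
end
```

This check does not look at the header text at all, so a header that happens to begin with "solid" does not affect the result.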


Accepted Answer

Bruno Luong 2020-11-5
I don't know if it's an elegant way, but I just test whether any character is > 255.
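fid = fopen(stlfilename,'rt');
if fid > 0
    try
        % Read the whole file as text, one cell per line
        c = textscan(fid,'%s','delimiter','\n');
        fclose(fid);
    catch ME
        fclose(fid); % release the file handle if reading fails
        message = ME.message;
        h = errordlg(message);
        waitfor(h);
        OK = -2;
        return
    end
else
    OK = -2;
    message = 'Cannot open STL file';
    h = errordlg(message);
    waitfor(h);
    return
end
c = c{1};
c(cellfun(@isempty,c)) = []; % drop empty lines so that max() returns a scalar
% Largest character code found on any line
if max(cellfun(@max,c)) > 255
    % Binary
    ...
else
    % Ascii
    ...
end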
3 comments
Ameer Hamza 2020-11-5
This test can produce false negatives. For example:
fid = fopen('file.bin', 'w');
fwrite(fid, [65 66 67 68], 'uint8')
fclose(fid)
Test
fid = fopen('file.bin','rt');
c = textscan(fid,'%s','delimiter','\n');
fclose(fid);
c = c{1};
c(cellfun(@isempty,c)) = [];
Result
>> max(cellfun(@max,c)) > 255
ans =
logical
0
Bruno Luong 2020-11-5
We are talking about STL files, which can be ASCII or binary, not arbitrary binary files.
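For comparison, here is a byte-level variant of the same idea. It is a different technique from the accepted answer, not a restatement of it: instead of decoding the file as text and looking for character codes above 255, it reads the raw bytes and flags any NUL byte or any byte above 127. The function name `hasNonAsciiBytes` is hypothetical. Because no text decoding is involved, the result should not depend on the encoding used to open the file.

```matlab
function tf = hasNonAsciiBytes(filename)
% Hypothetical helper: true if the file contains a NUL byte or a byte > 127.
fid = fopen(filename, 'r');               % binary mode, no text decoding
if fid < 0
    error('Cannot open file: %s', filename);
end
cleanup = onCleanup(@() fclose(fid));     % close the file even if fread errors
bytes = fread(fid, Inf, 'uint8=>uint8');  % whole file as a uint8 column
tf = any(bytes == 0 | bytes > 127);       % false for an empty file
end
```

On the four-byte example file from the comment above (bytes 65 to 68), this also reports no non-ASCII content, so it shares the limitation pointed out there for arbitrary binary files.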


More Answers (1)

DGM 2024-3-25
Edited: DGM 2025-7-12
See also stlGetFormat() from stltools on the FEX:
FWIW, I tested this and the accepted answer on a list of 1000 STL files of different encodings, from various sources. In all cases, they produced matching results, but stlGetFormat was faster.
  • for the accepted answer: 131.298121 seconds.
  • for stlGetFormat(): 0.204969 seconds.
That's a big difference, but nobody is trying to read a thousand STL files as fast as possible, so it doesn't really matter. stlGetFormat uses a concise three-point test. It won't be fooled by a bastard binary file whose header starts with 'solid', but I'm not an expert on all the other ways an STL encoder can create novel problems.
