How do I perform a formatted read on exponential data?
I want to read a few hundred thousand lines of data. Here are the first few lines:
TABLED1 1
+ 0.000+0-0.000+0 1.000-2-3.297+1 2.000-2-4.924+1 3.000-2-5.692+1+
+ 4.000-2-1.301+2 5.000-2-1.128+2 6.000-2-1.031+2 7.000-2-9.388+1+
+ 8.000-2-8.941+1 9.000-2-9.161+1 1.000-1-9.107+1 1.100-1-9.013+1+
I've searched, read answers, and tried textscan() with different options, as well as some even less successful methods of reading data, all to no avail. textscan() doesn't recognize '%e', and '%f' doesn't return the desired result. I've even tried reading one line at a time and writing my own parsing algorithm. What I wouldn't give for a good Fortran read() statement! lol
Thank you in advance for your time and consideration.
5 comments
Rik
2022-10-26
I don't see how these numbers are exponential numbers. If you describe the format in a more conventional way, we might be able to help you.
VBBV
2022-10-26
Use the readtable function.
I agree with Rik. The wanted output of this line is not clarified yet:
+ 0.000+0-0.000+0 1.000-2-3.297+1 2.000-2-4.924+1 3.000-2-5.692+1+
Additional information is required, e.g. whether all numbers have 3 digits after the decimal point. Without a clear definition, writing parsing code is only guesswork. Is there a fixed number of spaces at the beginning?
Is there only one header line, or does the file contain multiple blocks? Posting your "best" code would also be very useful.
This file format is really ugly.
Matthew Koebbe
2022-10-26
Matthew Koebbe
2022-10-26
Moved: Rik
2022-10-27
Answers (2)
Edit at the top: You can get the data in this format with my readfile function, or with data=cellstr(readlines(filename));.
There are perhaps more direct ways, but this is how you can do it with a regular expression.
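data={'+ 0.000+0-0.000+0 1.000-2-3.297+1 2.000-2-4.924+1 3.000-2-5.692+1+';
      '+ 4.000-2-1.301+2 5.000-2-1.128+2 6.000-2-1.031+2 7.000-2-9.388+1+';
      '+ 8.000-2-8.941+1 9.000-2-9.161+1 1.000-1-9.107+1 1.100-1-9.013+1+'};
% Capture the signed mantissa and the signed exponent of each field.
% The optional minus sign must sit inside the first group, otherwise
% negative values such as -3.297+1 lose their sign.
tokens=regexp(data,'(-?[0-9\.]+)([\+\-]\d+)','tokens');
tokens{1} % show the tokens found on the first line
output = NaN(numel(tokens),numel(tokens{1}));
for n=1:numel(tokens)
    for m=1:numel(tokens{n})
        tmp = str2double(tokens{n}{m}); % [mantissa exponent]
        output(n,m) = tmp(1) * 10^tmp(2);
    end
end
output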
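For a file with a few hundred thousand lines, the double loop may become slow, and the conversion can probably be vectorized. The following is only a sketch, not part of the answer above: it assumes every field is a mantissa with a decimal point followed by a signed integer exponent, and that every data line holds exactly eight fields. The variable names fields, numStr and vals are made up for illustration.

```matlab
data = {'+ 0.000+0-0.000+0 1.000-2-3.297+1 2.000-2-4.924+1 3.000-2-5.692+1+';
        '+ 4.000-2-1.301+2 5.000-2-1.128+2 6.000-2-1.031+2 7.000-2-9.388+1+';
        '+ 8.000-2-8.941+1 9.000-2-9.161+1 1.000-1-9.107+1 1.100-1-9.013+1+'};

% Pull out every field as text, e.g. '-3.297+1' (assumed field layout).
fields = regexp(data, '-?\d+\.\d+[+-]\d+', 'match');
numStr = [fields{:}];                  % flatten to a single 1-by-N cell row

% Insert the missing 'e' before the exponent sign: '-3.297+1' -> '-3.297e+1'.
% The lookbehind skips a leading minus sign, which has no digit before it.
numStr = regexprep(numStr, '(?<=\d)([+-])', 'e$1');

vals = str2double(numStr);             % one call converts all fields
output = reshape(vals, 8, []).';       % assumes 8 values per line
```

Once the 'e' is restored, each field is an ordinary floating-point literal, so no manual mantissa-times-power arithmetic is needed.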
Mathieu NOE
2022-10-26
Coming a bit late to the show...
filename = 'tabled1.txt';
header_lines = 1;
data = get_my_data(filename,header_lines);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function data = get_my_data(filename,header_lines)
    lines = readlines(filename);
    data = []; % concat
    for ci = 1+header_lines:numel(lines)
        line = char(lines(ci));
        if numel(line)>=65 % skip empty or truncated lines
            line = line(end-64:end-1); % keep only the 8 numbers, each coded on 8 characters
            nums_one_line = [];
            for cc = 1:8
                ind1 = 1+(cc-1)*8;
                ind2 = ind1+7;
                numb = line(ind1:ind2); % example : '-8.941+1'
                % the + or - sign of the exponent is in position 7
                mantissa = str2double(numb(1:6));
                expo = str2double(numb(7:8));
                num = mantissa*10^(expo);
                nums_one_line = [nums_one_line num]; % concat
            end
            data = [data; nums_one_line]; % concat
        end
    end
end
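Growing data row by row gets expensive for a few hundred thousand lines. The same fixed-width idea can be sketched without loops, under the assumption that all data lines have the same length and end with eight 8-character fields plus one continuation character. The sample lines below stand in for the file contents after the header has been dropped, and the names rawLines, F and result are illustrative only.

```matlab
% Sample lines standing in for the file contents (header already removed).
rawLines = {'+ 0.000+0-0.000+0 1.000-2-3.297+1 2.000-2-4.924+1 3.000-2-5.692+1+';
            '+ 4.000-2-1.301+2 5.000-2-1.128+2 6.000-2-1.031+2 7.000-2-9.388+1+';
            '+ 8.000-2-8.941+1 9.000-2-9.161+1 1.000-1-9.107+1 1.100-1-9.013+1+'};

C = char(rawLines);               % char matrix, one row per line (equal lengths assumed)
body = C(:, end-64:end-1);        % the 64 characters holding the 8 fields
F = reshape(body.', 8, []).';     % one 8-character field per row, in reading order

mantissa = str2double(cellstr(F(:, 1:6)));   % e.g. '-8.941'
expo     = str2double(cellstr(F(:, 7:8)));   % e.g. '+1'
vals     = mantissa .* 10.^expo;

result = reshape(vals, 8, []).';  % back to one row of 8 values per input line
```

With a real file, rawLines could be taken from readlines(filename) after skipping the header and any empty trailing line.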