Parsing .txt File with Unique Format
Hi,
I am struggling to parse a .txt document. A sample of the format is attached. Essentially, I need to extract the first column, L, as well as the third-to-last column, MCA, into separate tables in MATLAB, one table for each sample. There are several thousand lines of data, all in the aforementioned format. The attached sample text file has four samples (#S 1, #S 2, etc.).
I have a feeling my current approach is not efficient. I used the pound symbol as the delimiter and got everything loaded into a cell array with the following code. I was even able to find the lines on which the "headers" occur. There must be a more straightforward way.
filename = 'sample.txt';
fid = fopen(filename);
c = textscan(fid, '%s', 'delimiter', '#');
index = find(contains(c{1,1}, 'L X Scan H K L V Epoch Monitor Voltage Ion 2 Trans Ni Cu MCA Seconds Detector'));
fclose(fid);
I am a novice in MATLAB, and would really appreciate any help regarding this.

Accepted Answer
Star Strider
2022-5-12
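This will read it successfully:
fidi = fopen('sample.txt','rt');
D = {};                                  % one numeric block per sample
k1 = 1;
while ~feof(fidi)
    % Skip 31 header lines, then read 15 numeric columns; reading stops
    % at the next line that cannot be parsed as numbers
    C = textscan(fidi, repmat('%f',1,15), 'HeaderLines',31, 'CollectOutput',true);
    M = cell2mat(C);
    if isempty(M)                        % Empty Matrix Indicates End-Of-File
        break
    end
    D{k1,1} = M;
    fseek(fidi, 0, 0);                   % seek 0 bytes from the current position
    k1 = k1 + 1;
end
fclose(fidi);
Out = cell2mat(D);                       % stack all blocks vertically
OutTable = array2table(Out);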
For the posted file, it produces a 132-by-15 double array, which the array2table function converts to a table (as I have done here). From that you can create one or more derivative tables containing only the information you want to work with.
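Since the question asks for one table per sample with only the L and MCA columns, here is a minimal sketch of that last step. It assumes that the cell array D from the loop above holds one numeric block per sample, that L is column 1, and that MCA is column 13 (the third-to-last of the 15 columns, as described in the question). The names colL, colMCA and sampleTables are made up for illustration.

```matlab
% Sketch only: D is the cell array built by the loop above,
% assumed to hold one numeric matrix per sample.
colL   = 1;    % first column, L
colMCA = 13;   % third-to-last of 15 columns, MCA (assumption from the question)

sampleTables = cell(numel(D), 1);
for k = 1:numel(D)
    % Keep only the two columns of interest and give them readable names
    sampleTables{k} = array2table(D{k}(:, [colL colMCA]), ...
        'VariableNames', {'L', 'MCA'});
end
```

With this, sampleTables{2} would be the two-column table for the second sample, provided the blocks were read in file order.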
The first five rows of ‘OutTable’ are:
      Out1      Out2         Out3          Out4      Out5    Out6    Out7    Out8     Out9    Out10    Out11    Out12    Out13    Out14    Out15
    ______    ______    _________    __________    ______    ____    ____    ____    _____    _____    _____    _____    _____    _____    _____
    2.7609    1.1335    -0.030778    -0.0090013    -5e-05     190    3001       0    45528        0        3       10      249    0.267     5755
    2.7641    1.1335    -0.030778    -0.0090013    -5e-05     192    3001       0    45343        0        2        9      206    0.266     5732
    2.7673    1.1335    -0.030778    -0.0090013    -5e-05     194    3001       0    46540        0        2        6      228    0.273     5549
    2.7705    1.1335    -0.030778    -0.0090013    -5e-05     196    3001       0    45609        0        3        6      223    0.268     5948
    2.7737    1.1335    -0.030778    -0.0090013    -5e-05     197    3001       0    44275        0        1       10      224     0.26     5734
More Answers (1)
Walter Roberson
2022-5-12
Cases like this are often most easily processed by reading the entire file as text and then using regexp() to extract information.
You might, however, be able to use textscan in a loop, making use of the CommentStyle option to skip the headers, and probably using a format repeat count of 1.
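A rough sketch of the read-everything-then-regexp() idea follows. It rests on assumptions that are not confirmed in this thread: every header line starts with '#', each sample begins with a line starting with '#S', and every remaining non-empty line holds exactly 15 numbers. The variable names (parts, rows, samples, and so on) are made up for illustration.

```matlab
% Sketch only, under the assumptions stated above.
txt = fileread('sample.txt');

% Split the text at each line that begins with '#S'; the first piece is
% whatever precedes the first sample, so it is dropped.
parts = regexp(txt, '^#S\s', 'split', 'lineanchors');
parts = parts(2:end);

nCols  = 15;   % numeric columns per data row (from the accepted answer)
colL   = 1;    % first column, L
colMCA = 13;   % third-to-last column, MCA (assumption from the question)

samples = cell(numel(parts), 1);
for k = 1:numel(parts)
    rows = strtrim(regexp(parts{k}, '\r?\n', 'split'));
    rows = rows(2:end);                       % rest of the '#S' line itself
    keep = ~startsWith(rows, '#') & ~cellfun(@isempty, rows);

    % Parse the remaining lines as one stream of numbers. reshape will
    % error if the count is not a multiple of nCols, which would mean
    % the assumptions above do not hold for this file.
    vals = sscanf(strjoin(rows(keep), ' '), '%f');
    M = reshape(vals, nCols, []).';

    samples{k} = array2table(M(:, [colL colMCA]), ...
        'VariableNames', {'L', 'MCA'});
end
```

Splitting on the '#S' lines keeps the sample boundaries without hard-coding the number of header lines, at the cost of holding the whole file in memory as text.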