Parsing .txt File with Unique Format

Question

Nigel Caprotti 2022-5-12


https://ww2.mathworks.cn/matlabcentral/answers/1717215-parsing-txt-file-with-unique-format

Commented: Star Strider 2022-5-12

sample.txt

Hi,

I am struggling to parse a .txt document; a sample of the format is attached. Essentially, I need to extract the first column, L, and the third-to-last column, MCA, into separate tables in MATLAB, one table for each sample. There are several thousand lines of data, all in this format. The attached sample file has four samples (#S 1, #S 2, etc.).

I have a feeling my current approach is not efficient. I used the pound symbol as the delimiter and loaded everything into a cell array with the following code. I was even able to find the lines containing the "headers". There must be a more straightforward way.

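filename = 'sample.txt';
fid = fopen(filename);
% Read the file as text, splitting fields at every '#'
c = textscan(fid, '%s', 'delimiter', '#');
% Indices of the cells that hold the column-header line
index = find(contains(c{1,1}, 'L X Scan H K L V Epoch Monitor Voltage Ion 2 Trans Ni Cu MCA Seconds Detector'));
fclose(fid);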

I am a novice in MATLAB, and would really appreciate any help regarding this.


Answer 1

Star Strider 2022-5-12


https://ww2.mathworks.cn/matlabcentral/answers/1717215-parsing-txt-file-with-unique-format#answer_962300


This will read it successfully:

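fidi = fopen('sample.txt','rt');
D = {};                                   % one numeric matrix per sample
k1 = 1;
while ~feof(fidi)
    % Skip the 31 header lines of the next sample, then read its 15 numeric columns
    C = textscan(fidi, repmat('%f',1,15), 'HeaderLines',31, 'CollectOutput',true);
    M = cell2mat(C);
    if isempty(M)                         % Empty Matrix Indicates End-Of-File
        break
    end
    D{k1,1} = M;
    fseek(fidi, 0, 0);                    % zero offset from the current position
    k1 = k1 + 1;
end
fclose(fidi);
Out = cell2mat(D);                        % stack all samples vertically
OutTable = array2table(Out);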

For the posted file, it produces a 132-by-15 double array that you can then convert to a table with the array2table function (as I have done here). From that, you can create one or more derived tables containing only the information you want to work with.

The first five rows of ‘OutTable’ are:

     Out1      Out2       Out3          Out4        Out5     Out6    Out7    Out8    Out9     Out10    Out11    Out12    Out13    Out14    Out15
    ______    ______    _________    __________    ______    ____    ____    ____    _____    _____    _____    _____    _____    _____    _____
      7609    1.1335    -0.030778    -0.0090013    -5e-05    190     3001     0      45528      0        3       10       249     0.267    5755
      7641    1.1335    -0.030778    -0.0090013    -5e-05    192     3001     0      45343      0        2        9       206     0.266    5732
      7673    1.1335    -0.030778    -0.0090013    -5e-05    194     3001     0      46540      0        2        6       228     0.273    5549
      7705    1.1335    -0.030778    -0.0090013    -5e-05    196     3001     0      45609      0        3        6       223     0.268    5948
      7737    1.1335    -0.030778    -0.0090013    -5e-05    197     3001     0      44275      0        1       10       224      0.26    5734
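Since the question asks for one table per sample with only L and MCA, the cell array D from the loop (one matrix per sample) is probably the more convenient starting point than the stacked Out. A minimal sketch, assuming L is column 1 and MCA is column 13 (third-to-last of the 15 numeric columns, as described in the question). The D below is a made-up stand-in so the snippet runs on its own; the names T, colL and colMCA are only for illustration.

```matlab
% Stand-in for the cell array D built by the reading loop: one numeric
% matrix per sample, 15 columns each (these values are made up).
D = {magic(15); rand(4,15)};

colL   = 1;     % first column, called L in the question
colMCA = 13;    % third-to-last of 15 columns, called MCA in the question

% One two-column table per sample
T = cell(size(D));
for k = 1:numel(D)
    T{k} = array2table(D{k}(:, [colL colMCA]), 'VariableNames', {'L','MCA'});
end
```

T{1} is then the table for sample #S 1, T{2} for #S 2, and so on. If the column positions differ in the real file, only colL and colMCA need to change.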



Nigel Caprotti 2022-5-12

Thank you for this correspondence. It answered my question.

Star Strider 2022-5-12

As always, my pleasure!


Answer 2

Walter Roberson 2022-5-12


https://ww2.mathworks.cn/matlabcentral/answers/1717215-parsing-txt-file-with-unique-format#answer_962295

Cases like this are often most easily processed by reading the entire file as text and then using regexp() to extract information.

You might, however, be able to use textscan in a loop, making use of the CommentStyle option to skip the headers, and probably using a format repeat count of 1.
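A rough sketch of the read-everything-as-text idea, for illustration only. It swaps regexp() for the simpler startsWith and sscanf, and it rests on assumptions about sample.txt that are not confirmed in this thread: every header line starts with '#', each sample begins with a '#S' line, every data row holds exactly 15 numbers, and every sample has at least one data row. The stand-in file, its '#L' label text, and the names nCols, row, fname and T are all hypothetical.

```matlab
% Build a tiny stand-in file so the sketch runs on its own.
% The layout and the numbers are made up, not taken from sample.txt.
nCols = 15;
row   = @(v) sprintf('%g ', v + (0:nCols-1));   % one line of 15 numbers
fname = [tempname '.txt'];
fid   = fopen(fname, 'wt');
fprintf(fid, '#S 1\n#L hypothetical column labels\n%s\n%s\n', row(1), row(2));
fprintf(fid, '#S 2\n#L hypothetical column labels\n%s\n', row(3));
fclose(fid);

% Read the whole file as text and classify each line
lines    = strtrim(splitlines(string(fileread(fname))));
isHeader = startsWith(lines, "#");
sampleId = cumsum(startsWith(lines, "#S"));     % sample number for every line
isData   = ~isHeader & strlength(lines) > 0;    % non-empty, non-header lines

% Convert the data lines of each sample to numbers, keep columns 1 and 13
nSamples = max(sampleId);
T = cell(nSamples, 1);
for k = 1:nSamples
    rows = lines(isData & sampleId == k);
    M    = sscanf(char(join(rows, newline)), '%f', [nCols Inf]).';
    T{k} = array2table(M(:, [1 13]), 'VariableNames', {'L','MCA'});
end
delete(fname);                                  % remove the stand-in file
```

Because the header lines are recognised by their leading '#', this version does not depend on each sample having a fixed number of header lines.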


Nigel Caprotti 2022-5-12

Appreciate this.

