Extracting Time Data from Text File

In short, I've been trying to put together a script that scans a .txt file and pulls number values from it. The context is Rubik's Cube times; I have a program that produces .txt files from sessions, and the contents include some basic statistics at the top, a list of times, and the moves used to scramble the cube for each time. I've included an example of the beginning of one of these files below:
Average: 10.05
Best: 9.34
Worst: 14.26
Mean: 10.75
Standard Deviation: 1.79
1: 9.91 D2 L2 B2 L2 F2 D L2 B2 D' U L' U' L D' L F' R' B' U B2
2: 9.86 R2 U2 B2 F2 D L2 D' U' R2 U' B2 R U F' U' B' D B2 U R' D2 F2
3: (14.26) D2 L2 B' L2 B2 D2 L2 F2 D2 F R' D B R2 B' D' R2 B2 F L
4: (9.34) B F U2 B R2 B' U2 F' D2 R' D' B2 D' B' U2 R' B' D' B'
5: 10.39 B F' U2 B2 L2 U2 R2 B' R2 B U B2 R2 D2 R2 B' L B R D2
My goal is to pull the times from these files to run my own stats on them, but there are a lot of nuances that have made this difficult!
The biggest issues I've had:
1) As you can see, some of the times contain 3 sig figs, and some contain 4. I haven't found a clean way to pull both from the file at the same time.
2) I haven't found a way to ignore the statistics at the top (if this is too much for a script, it would be easy to just delete that section of the text file before I run the script).
What I have so far isn't pretty, but it has successfully created an array of the 4-digit times for me:
fileID = fopen("txt Files/text-4AB30AA43EF2-1.txt")
raw = fscanf(fileID,"%c")
fclose(fileID)
num = '[0123456789][0123456789].[0123456789][0123456789]';
out = regexp(raw,num,'match');
gen1 = str2double(out);
list = gen1'
Any help would be greatly appreciated, thank you!

Accepted Answer

Stephen23 2022-2-7
Edited: Stephen23 2022-2-7
Simpler and more efficient:
str = fileread('test_1.txt');
tkn = regexp(str,'^\s*(\d+):\D+(\d+\.?\d*)\W+([^\n\r]+)', 'tokens', 'lineanchors');
tkn = vertcat(tkn{:})
tkn = 5×3 cell array
    {'1'}    {'9.91' }    {'D2 L2 B2 L2 F2 D L2 B2 D' U L' U' L D' L F' R' B' U B2'      }
    {'2'}    {'9.86' }    {'R2 U2 B2 F2 D L2 D' U' R2 U' B2 R U F' U' B' D B2 U R' D2 F2'}
    {'3'}    {'14.26'}    {'D2 L2 B' L2 B2 D2 L2 F2 D2 F R' D B R2 B' D' R2 B2 F L'      }
    {'4'}    {'9.34' }    {'B F U2 B R2 B' U2 F' D2 R' D' B2 D' B' U2 R' B' D' B''       }
    {'5'}    {'10.39'}    {'B F' U2 B2 L2 U2 R2 B' R2 B U B2 R2 D2 R2 B' L B R D2'       }
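To see how this pattern deals with both issues from the question, here is a minimal sketch on a few lines copied from the sample file (scrambles shortened for brevity; the variable names `demoLines`, `demoStr`, `demoTkn`, and `demoTimes` are only illustrative). Each match must start a line with an index followed by a colon, so the statistics header is skipped. `\D+` and `\W+` absorb the space and any parentheses around the time, and `\d+\.?\d*` accepts any number of digits on either side of the decimal point.

```matlab
% Lines copied from the sample file; scrambles are shortened for brevity.
demoLines = {'Average: 10.05', 'Best: 9.34', ...
             '1: 9.91 D2 L2 B2', '3: (14.26) D2 L2 B'' L2'};
demoStr = strjoin(demoLines, newline);   % newline requires R2016b or later

% Same pattern as in the answer: index, time, rest of the line.
demoTkn = regexp(demoStr, '^\s*(\d+):\D+(\d+\.?\d*)\W+([^\n\r]+)', ...
                 'tokens', 'lineanchors');
demoTkn = vertcat(demoTkn{:});           % one row per matched line

% Only the two numbered lines match; the header lines are ignored.
demoTimes = str2double(demoTkn(:,2));
disp(demoTimes)
```

The header lines fail at the very first step because they begin with a letter rather than digits, which is why there is no need to delete that section of the file by hand.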
mat = str2double(tkn(:,1:2))
mat = 5×2
    1.0000    9.9100
    2.0000    9.8600
    3.0000   14.2600
    4.0000    9.3400
    5.0000   10.3900
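Since the goal was to run your own statistics, here is a rough sketch using the five times from the second column of `mat`. The variable names are only illustrative. The interpretation of the file header is an assumption inferred from these five sample lines, not documented behaviour of the timer program: "Average" appears to be the mean after dropping the single best and worst times, and "Standard Deviation" appears to be normalised by N rather than N-1, since those choices reproduce 10.05 and 1.79 after rounding.

```matlab
% Times from the second column of mat (i.e. mat(:,2)) in the sample above.
times = [9.91; 9.86; 14.26; 9.34; 10.39];

bestTime  = min(times);
worstTime = max(times);
meanAll   = mean(times);        % plain mean of all solves
sdPop     = std(times, 1);      % second argument 1 -> normalise by N

% Assumed definition of "Average": drop one best and one worst, then average.
sorted  = sort(times);
avgTrim = mean(sorted(2:end-1));

fprintf('Best %.2f, Worst %.2f, Mean %.2f, SD %.2f, Average %.2f\n', ...
        bestTime, worstTime, meanAll, sdPop, avgTrim);
```

If your sessions are scored differently (for example, dropping more than one time from each end), only the `sorted(2:end-1)` slice would need to change.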
Optional extras:
% tkn(:,1:2) = num2cell(mat); % optional, but it is much better to store numeric
% data in a numeric array, so probably best avoided.
spl = regexp(tkn(:,3),'\w+','match')
spl = 5×1 cell array
    {1×20 cell}
    {1×22 cell}
    {1×20 cell}
    {1×19 cell}
    {1×20 cell}
spl{:}
ans = 1×20 cell array
{'D2'} {'L2'} {'B2'} {'L2'} {'F2'} {'D'} {'L2'} {'B2'} {'D'} {'U'} {'L'} {'U'} {'L'} {'D'} {'L'} {'F'} {'R'} {'B'} {'U'} {'B2'}
ans = 1×22 cell array
{'R2'} {'U2'} {'B2'} {'F2'} {'D'} {'L2'} {'D'} {'U'} {'R2'} {'U'} {'B2'} {'R'} {'U'} {'F'} {'U'} {'B'} {'D'} {'B2'} {'U'} {'R'} {'D2'} {'F2'}
ans = 1×20 cell array
{'D2'} {'L2'} {'B'} {'L2'} {'B2'} {'D2'} {'L2'} {'F2'} {'D2'} {'F'} {'R'} {'D'} {'B'} {'R2'} {'B'} {'D'} {'R2'} {'B2'} {'F'} {'L'}
ans = 1×19 cell array
{'B'} {'F'} {'U2'} {'B'} {'R2'} {'B'} {'U2'} {'F'} {'D2'} {'R'} {'D'} {'B2'} {'D'} {'B'} {'U2'} {'R'} {'B'} {'D'} {'B'}
ans = 1×20 cell array
{'B'} {'F'} {'U2'} {'B2'} {'L2'} {'U2'} {'R2'} {'B'} {'R2'} {'B'} {'U'} {'B2'} {'R2'} {'D2'} {'R2'} {'B'} {'L'} {'B'} {'R'} {'D2'}
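Note that `\w+` does not match the apostrophe, so the prime marks are dropped in the lists above (`D'` becomes `D`). If the direction of each move matters for your analysis, one possible alternative is to split on whitespace instead. A small sketch on the first scramble, with illustrative variable names:

```matlab
% First scramble from the sample file; '' is an escaped apostrophe in a char vector.
scr = 'D2 L2 B2 L2 F2 D L2 B2 D'' U L'' U'' L D'' L F'' R'' B'' U B2';

% \S+ matches each run of non-whitespace characters, so D' stays intact.
moves = regexp(scr, '\S+', 'match');

disp(numel(moves))   % number of moves in the scramble
disp(moves{9})       % ninth move, with its prime mark preserved
```

The same pattern can be applied to the whole column at once with `regexp(tkn(:,3),'\S+','match')`, in the same way as the `\w+` version.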
5 Comments
Simon Montrose 2022-2-7
@Stephen Of course, I didn't know I could switch them, my apologies! I'm still getting acquainted with the forums, but I've implemented your suggestions and have the script completely up and running :) Cheers!

Release

R2021b
