Parsing Through HTLM Style .txt File

1 次查看(过去 30 天)
I'm trying to read in a txt file with a series of messages describing what is happening during a sporting event. The general format seems to be based on HTML but the file is given as .txt. I'm able to open the file and save it as a variable using fscanf (not sure if this is the most appropiate way for this application). I'm having trouble wrapping my head around how to search for blocks of code that I want to record for further analysis. For example one of things ill be doing is looking for a block like the one below.
<FLAG>
<ID>e9f40b4b-decd-4b47-ae83-e6f9c972c570</ID>
<DT>01.07.2023 16:30:00.000</DT>
<FL>Green</FL>
<LV>Track</LV>
<FI />
</FLAG>
So in plain English i want to look for every time there is <FLAG> and then record the time of day which is denoted by <DT>. I'm pretty rusty with my MATLAB never any reading from .txt files when i was at my best so I think once I get the hand of reading and searching through the file I'll be able to figure it all out from there.

回答(2 个)

Ganesh
Ganesh 2024-6-13
编辑:Ganesh 2024-6-13
You can consider using the "readstruct()" function in MATLAB. It allows you to input an "XML" file and parses it into a structure. Please find the code implementation of the same below:
a = readstruct('Data.txt',FileType="xml")
a = struct with fields:
FLAG: [1x2 struct]
% You can now access the fields like any normal MATLAB structure
a.FLAG(1)
ans = struct with fields:
ID: "e9f40b4b-decd-4b47-ae83-e6f9c972c570" DT: 01.07.2023 16:30:00.000 FL: "Green" LV: "Track" FI: ""
for i=1:length(a.FLAG)
disp(num2str(i)+ " -- "+a.FLAG(i).ID)
end
1 -- e9f40b4b-decd-4b47-ae83-e6f9c972c570 2 -- e9f40b4b-decd-4b47-ae83-e6f9c972c571
You can read up more on readstruct from the following documentation:
  3 个评论
Adam Holland
Adam Holland 2024-6-13
Thanks so much! I really like how doing it this way breaks it down into a structure variable. For some reason it says that my file isnt a valid xml format but I'll have to keep this method in mind for future projects.
Ganesh
Ganesh 2024-6-14
Do refer to the ".txt" file attached. You might be missing to enclose the whole data within a tag.

请先登录,再进行评论。


Hassaan
Hassaan 2024-6-13
% Open the file
fileID = fopen('your_file.txt', 'r');
if fileID == -1
error('Could not open file');
end
% Read the file
fileContent = fscanf(fileID, '%c');
fclose(fileID);
% Find all occurrences of <FLAG> blocks
flagStart = strfind(fileContent, '<FLAG>');
flagEnd = strfind(fileContent, '</FLAG>');
for i = 1:numel(flagStart)
% Extract the FLAG block
flagBlock = fileContent(flagStart(i):flagEnd(i)+6); % +6 to include </FLAG>
% Find and extract the time of day (DT)
dtStart = strfind(flagBlock, '<DT>') + 4;
dtEnd = strfind(flagBlock, '</DT>') - 1;
timeOfDay = flagBlock(dtStart:dtEnd);
% Display or save the time of day
disp(['Time of day: ' timeOfDay]);
end
------------------------------------------------------------------------------------------------------------------------------------------------
If you find the solution helpful and it resolves your issue, it would be greatly appreciated if you could accept the answer. Also, leaving an upvote and a comment are also wonderful ways to provide feedback.
Professional Interests
  • Technical Services and Consulting
  • Embedded Systems | Firmware Developement | Simulations
  • Electrical and Electronics Engineering
Feel free to contact me.
  2 个评论
Hassaan
Hassaan 2024-6-13
@Adam Holland You are welcome.
------------------------------------------------------------------------------------------------------------------------------------------------
If you find the solution helpful and it resolves your issue, it would be greatly appreciated if you could accept the answer. Also, leaving an upvote and a comment are also wonderful ways to provide feedback.
Professional Interests
  • Technical Services and Consulting
  • Embedded Systems | Firmware Developement | Simulations
  • Electrical and Electronics Engineering
Feel free to contact me.

请先登录,再进行评论。

类别

Help CenterFile Exchange 中查找有关 Adding custom doc 的更多信息

产品


版本

R2021a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by