Extracting numbers with decimal places from the body of a text file and assigning them to variables

Hi,
I have a text file that I have read into MATLAB as a character array. The file contains free text, but I am only after specific variables in it.
I want to extract specific values from the text and assign each one to its own variable.
For example, my text contains something like the following:
Header with text and comments
other text that I am not interested in, etc.
AAA = 18.457
BBB = 34.6
CCC = 4
I would like my results to be a series of variables:
AAA = 18.457
BBB = 34.6
CCC = 4
which I could then use in further calculations.
I tried using the following:
fid = fopen('file','r');
text = textscan(fid,'%s','Delimiter','','endofline','');
text = text{1}{1};
fclose(fid);
expression = 'AAA = (\d+)';
AAA = regexp(text,expression,'tokens');
However, this only returned "18" rather than my desired "18.457" (it stops at the decimal point). Is there a way to extract a number that may or may not have decimal places?
Ideally, the match would also not depend on the exact number of spaces after the variable name, i.e. it should only need "AAA" rather than "AAA ".
Is there a way to use Matlab to achieve what I want?

Accepted Answer

Stephen23 2021-1-1
Edited: Stephen23 2021-1-1
%str = fileread(..) % <- simpler way to import the file data.
str = sprintf('%s\n','Header with text and comments','other text that I am not interested in, etc.','AAA = 18.457','BBB = 34.6','CCC = 4')
str =
    'Header with text and comments
     other text that I am not interested in, etc.
     AAA = 18.457
     BBB = 34.6
     CCC = 4
     '
rgx = '^\s*(\w+)\s*=\s*(\d+\.?\d*)';
tkn = regexp(str,rgx,'tokens','lineanchors');
tkn = vertcat(tkn{:}).';
tkn(2,:) = num2cell(str2double(tkn(2,:)));
out = struct(tkn{:})
out = struct with fields:
    AAA: 18.4570
    BBB: 34.6000
    CCC: 4
out.AAA
ans = 18.4570
Personally I would use a different approach: open the file, read the header lines using fgetl, then import the data using textscan. It would probably be easier than messing about with matching number formats (i.e. don't reinvent the wheel).
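A minimal sketch of that alternative, assuming the file has exactly two header lines followed by "name = value" lines, as in the question's example. The temporary file and the names fnm, C, names and out are made up for illustration, and the '=' delimiter is an assumption about the file layout:

```matlab
% Write a small sample file so the sketch is self-contained (illustrative data only).
fnm = [tempname,'.txt'];
fid = fopen(fnm,'wt');
fprintf(fid,'%s\n','Header with text and comments','other text that I am not interested in, etc.','AAA = 18.457','BBB = 34.6','CCC = 4');
fclose(fid);

% Read it back: skip the two header lines, then parse the "name = value" pairs.
fid = fopen(fnm,'rt');
fgetl(fid); % header line 1 (assumed)
fgetl(fid); % header line 2 (assumed)
C = textscan(fid,'%s%f','Delimiter','=');
fclose(fid);
delete(fnm);

names = strtrim(C{1});                      % drop any spaces left around the names
out = cell2struct(num2cell(C{2}),names,1);  % one field per imported name
```

Here textscan does the number parsing itself, so signs and decimal places need no hand-written pattern.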
4 Comments
James Browne 2021-1-2
Thanks, I made that work with my code.
I added in "\-?" so the token is now "(\-?\d+\.?\d*)" because I also wanted to include negative numbers as possible outputs.
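The sign-aware pattern described here could be sketched as follows, reusing the approach from the accepted answer. The sample text is made up for illustration, and the hyphen is written without a backslash, which should match the same thing outside a character class:

```matlab
% Sample text with a negative value (illustrative data only).
str = sprintf('%s\n','AAA = -18.457','BBB = 34.6','CCC = 4');
% "-?" allows an optional leading minus sign before the digits.
rgx = '^\s*(\w+)\s*=\s*(-?\d+\.?\d*)';
tkn = regexp(str,rgx,'tokens','lineanchors');
tkn = vertcat(tkn{:}).';                    % row 1: names, row 2: number text
tkn(2,:) = num2cell(str2double(tkn(2,:)));  % convert the number text to doubles
out = struct(tkn{:});
```

Note that this pattern still assumes plain decimal notation: values such as 1e-3 or .5 would not be captured in full.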
Instead of pulling out individual variables from the structure array (i.e. with out.AAA), is it possible to make each variable in the structure array into a variable along with its name?
Stephen23 2021-1-2
Edited: Stephen23 2021-1-2
"is it possible to make each variable in the structure array into a variable along with its name?"
Possible yes, but only if you want to force yourself into writing slow, complex, obfuscated, buggy code that is difficult to debug.
There are so many reasons why that is a fragile, bad approach to writing your code. For example, consider what your code would do if the header name happens to be the same name as any existing variable: it would simply overwrite that variable without any warning. Such bad code design allows for all sorts of latent bugs that are difficult to track down because they depend on specific data... ugh.
If you know the headers/variables in advance then by all means allocate them explicitly:
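For instance, assuming the three names from the question are known in advance, explicit allocation might look like this minimal sketch. The struct literal is only a stand-in for the out structure produced by the answer's code:

```matlab
% Stand-in for the structure produced by the answer's code (illustrative values).
out = struct('AAA',18.457,'BBB',34.6,'CCC',4);
% Explicit assignment: every variable name is visible in the code itself.
AAA = out.AAA;
BBB = out.BBB;
CCC = out.CCC;
```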
If you do NOT know the headers in advance then magically creating variables from them would be a fragile, buggy, ugly approach: how would you even know what header had been imported? (trivially easy to do with the structure, quite tricky to do with randomly named variables in a workspace)
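As a rough sketch of how the structure keeps this easy, the imported names can be listed with fieldnames and each value read through a dynamic field name. The struct literal again stands in for the imported data, and the loop variables k and val are illustrative:

```matlab
% Stand-in for the structure produced by the answer's code (illustrative values).
out = struct('AAA',18.457,'BBB',34.6,'CCC',4);
names = fieldnames(out);            % which headers were imported
for k = 1:numel(names)
    val = out.(names{k});           % dynamic field name access
    fprintf('%s = %g\n',names{k},val);
end
```

Nothing in the workspace is overwritten, and the code works for whatever names the file happens to contain.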
