How to extract data from table variable names?

MIM Maestro puts volume data inside the table variable names:
Rectum (4)(Volume: 57.77) Bladder (5)(Volume: 139.40)
How do we extract this volume data?

Answers (2)

This method works, but I suspect there is a more elegant solution.
% get list of variable names
opts = detectImportOptions(filepath, 'NumHeaderLines', 1);
% find the column whose name starts with 'Rectum_'
% (the last such column, if there are several)
index = find(strncmp(opts.VariableNames, 'Rectum_', 7), 1, 'last');
% extract the volume from that name: take the text after 'Volume_',
% drop the trailing underscore, and restore the decimal point
volume = extractAfter(opts.VariableNames{index}, 'Volume_');
volume = str2double(strrep(volume(1:end-1), '_', '.'))
Result:
volume =
57.7700
>> C = {'Rectum (4)(Volume: 57.77)','Bladder (5)(Volume: 139.40)'};
>> str2double(regexp(C,'\d+(\.\d+)?(?=\)$)','once','match'))
ans =
57.770 139.400
Or to require the preceding 'Volume' substring:
str2double(regexp(C,'(?<=Volume: )\d+(\.\d+)?(?=\)$)','once','match'))
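For anyone puzzling over that pattern, here is the same expression split into commented pieces. This is only a sketch run on the two names from the question; the variable names pat and vol are introduced here for illustration.

```matlab
C = {'Rectum (4)(Volume: 57.77)','Bladder (5)(Volume: 139.40)'};
pat = ['(?<=Volume: )' ... lookbehind: the match must be preceded by 'Volume: '
       '\d+'           ... one or more digits (integer part)
       '(\.\d+)?'      ... optional decimal point followed by digits
       '(?=\)$)'];       % lookahead: the match must be followed by ')' at the very end
% 'once' keeps only the first match per name, 'match' returns the matched text
vol = str2double(regexp(C, pat, 'once', 'match'));
```

The lookbehind and lookahead only check their surroundings and are not part of the returned text, so str2double receives just the digits.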

6 Comments

Looking at the regexp documentation I have not yet been able to interpret your expression. Using that code results in NaN, unfortunately.
Namely, replacing
volume = extractAfter(opts.VariableNames{index},'Volume_');
volume = sscanf(strrep(volume,'_','.'),'%f');
with
volume = str2double(regexp(opts.VariableNames{index},'\d+(\.\d+)?(?=\)$)','once','match'));
results in a 40x1 double array of NaN, without any time saved.
(My initial thought upon seeing this answer: It seems once someone learns regular expressions one should be given an honorary degree in Computer Science ... perhaps a Bachelor's ...)
@Daniel Bridges: please upload opts in a .mat file.
@Daniel Bridges: sorry, my old MATLAB can't read that object type. Please upload this .mat file:
vn = opts.VariableNames;
save('varnames.mat','vn')
@Daniel Bridges: thank you for uploading that .mat file. The char vectors in that cell array have a different format from the one that you showed in your question, apparently with the parentheses and decimal point replaced by underscores. You can easily process them by first replacing the underscores with period characters:
>> S = load('varnames.mat');
>> C = strrep(S.vn,'_','.');
>> str2double(regexp(C,'\d+(\.\d+)?(?=\.$)','once','match'))
ans =
Columns 1 through 7
NaN 18494 27.49 4.9 57.77 139.4 479.8
Columns 8 through 14
1.08 29.15 95.8 97.26 72.3 93.45 80.69
Columns 15 through 16
83.06 32.39
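To illustrate why the first pattern gave NaN, here is a minimal sketch on a made-up name in the underscored style. The name below is hypothetical (an assumption about what the converted names look like); the real ones are in varnames.mat.

```matlab
% Hypothetical name in the underscored style (not taken from the real file)
vn = {'Rectum_4__Volume_57_77_'};
% Pattern from the answer: it needs a literal ')' at the end, so nothing matches
a = str2double(regexp(vn, '\d+(\.\d+)?(?=\)$)', 'once', 'match'));
% After swapping '_' for '.', look for the number just before the final '.'
C = strrep(vn, '_', '.');
b = str2double(regexp(C, '\d+(\.\d+)?(?=\.$)', 'once', 'match'));
```

Here a is NaN because regexp returns an empty match, while b picks up the volume.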
As an alternative you could skip using detectImportOptions (which I guess makes these character replacements) and read the header lines using fgetl. This line could then be trivially processed by a regular expression similar to the one I showed you.
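A rough sketch of that fgetl approach follows. The file layout is an assumption: one title line, then a comma-delimited line of column names. The sample file is written on the fly so the example runs by itself; the names fname, hdr, names and vol are illustrative only.

```matlab
% Write a small hypothetical file so the example is self-contained
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'Hypothetical title line\n');
fprintf(fid, 'Dose,Rectum (4)(Volume: 57.77),Bladder (5)(Volume: 139.40)\n');
fclose(fid);

% Read the raw header line, untouched by detectImportOptions
fid = fopen(fname, 'r');
fgetl(fid);            % skip the first header line (assumed: exactly one)
hdr = fgetl(fid);      % the line holding the original column names
fclose(fid);
delete(fname);

names = strsplit(hdr, ',');   % assumed: comma-delimited columns
vol = str2double(regexp(names, '(?<=Volume: )\d+(\.\d+)?(?=\)$)', 'once', 'match'));
% columns without a volume in their name (here 'Dose') come back as NaN
```

Because the names are read as plain text, the parentheses and decimal points survive, and the pattern from the answer applies unchanged.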
