How would I create a matrix from the following strings?

I am trying to write code with minimal preprocessing. I have many entries in a text file like this:
NODE{1 0 11 0 1.000000e+00 1.000000e-02 -1.000000e-02 1.500000e-03}
There are many rows of these in an Excel file. I want to read columns 1 and 6 to 8 inside the braces. What would be the best way to do this? I have tried fileread and textscan, but I haven't got anywhere because of the text NODE in front of the braces.
  3 Comments
John D'Errico 2019-12-26
Please learn to use comments instead of answers if you are just responding to a followup question or wish to make a comment.
I've moved the file, attaching it to this comment instead.

Accepted Answer

Stephen23 2019-12-26
Edited: Stephen23 2019-12-26
Note that specifying a suitable format string is much more efficient than importing as character/string and then converting afterwards (like the other answers):
opt = {'HeaderLines',2,'CollectOutput',true};
fmt = ['NODE{',repmat('%f',1,8),'}'];
[fid,msg] = fopen('ThinPlateNodes.txt','rt');
assert(fid>=3,msg)
C = textscan(fid,fmt,opt{:});
fclose(fid);
M = C{1}
Giving:
M =
1.00000 0.00000 11.00000 0.00000 1.00000 0.01000 -0.01000 0.00150
2.00000 0.00000 11.00000 0.00000 1.00000 0.01000 0.01000 0.00150
3.00000 0.00000 11.00000 0.00000 1.00000 0.01000 -0.01000 -0.00150
4.00000 0.00000 11.00000 0.00000 1.00000 0.01000 0.01000 -0.00150
5.00000 0.00000 11.00000 0.00000 1.00000 -0.01000 0.01000 0.00150
6.00000 0.00000 11.00000 0.00000 1.00000 -0.01000 0.01000 -0.00150
... lots of lines here
179495.00000 0.00000 15.00000 0.00000 1.00000 0.00952 -0.00960 -0.00824
179496.00000 0.00000 15.00000 0.00000 1.00000 0.00964 -0.00978 -0.00902
179497.00000 0.00000 15.00000 0.00000 1.00000 -0.00985 0.00144 -0.00838
179498.00000 0.00000 15.00000 0.00000 1.00000 0.00912 -0.00254 -0.00858
179499.00000 0.00000 15.00000 0.00000 1.00000 0.00979 0.00995 -0.00745
179500.00000 0.00000 15.00000 0.00000 1.00000 -0.00981 0.00984 -0.00805
And checking the size:
>> size(M)
ans =
78410 8
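The question asked only for columns 1 and 6 to 8. A minimal sketch of how the same format-string approach could import just those columns, using the `%*f` skip specifier of textscan. This is an illustration rather than part of the original answer: it parses a small sample character vector instead of the file (so no 'HeaderLines' option is needed), and the variable names `txt`, `label` and `xyz` are hypothetical.

```matlab
% Two sample records in the layout shown in the question.
txt = sprintf('%s\n', ...
    'NODE{1 0 11 0 1.000000e+00 1.000000e-02 -1.000000e-02 1.500000e-03}', ...
    'NODE{2 0 11 0 1.000000e+00 1.000000e-02 1.000000e-02 1.500000e-03}');

% Keep field 1, skip fields 2 to 5 with %*f, keep fields 6 to 8.
fmt = 'NODE{%f%*f%*f%*f%*f%f%f%f}';

% textscan also accepts a character vector in place of a file identifier.
C = textscan(txt, fmt, 'CollectOutput', true);
M = C{1};            % one row per record, four columns

label = M(:,1);      % original column 1
xyz   = M(:,2:4);    % original columns 6 to 8
```

For the real file, the same `fmt` should be usable with the `fid` and `opt` from the answer above, assuming every data row follows this layout.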
  6 Comments
JLV 2020-1-1
The method above worked fine this time!
I assume you put the safeguard in to warn me if the file can't be opened.
Stephen23 2020-1-1
Edited: Stephen23 2020-1-1
"The method above worked fine this time!"
I'm glad. It will be more efficient than the other methods shown on this thread.
"I assume you put the safeguard in to warn me if the file can't be opened."
Yes. I recommend putting that assert statement (or something equivalent) after every fopen: it prints much more useful information than you would get otherwise when a file cannot be opened.
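A minimal sketch of what that safeguard does when a file cannot be opened. fopen returns -1 on failure together with a system message in its second output, and identifiers 0 to 2 are reserved for standard input, output and error, hence the `fid>=3` test. The file name below is hypothetical and assumed not to exist; the try/catch is only there so the sketch can display the message instead of stopping.

```matlab
fname = 'hypothetical_missing_file.txt';   % assumed not to exist
[fid, msg] = fopen(fname, 'rt');

try
    % Fails with the system's explanation if the file could not be opened.
    assert(fid >= 3, 'Could not open "%s": %s', fname, msg)
    fclose(fid);
catch err
    disp(err.message)   % shows why fopen failed
end
```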

More Answers (2)

Bhaskar R 2019-12-26
Edited: Bhaskar R 2019-12-26
data = fileread('ThinPlateNodes.txt'); % read the whole file as text
ext_data = regexp(data, '[^{\]]+(?=})', 'match'); % get the text between { and }
ext_data(1) = []; % the first match is not required, so remove it
num_data = zeros(length(ext_data), 8); % preallocate the complete data
for ii = 1:length(ext_data)
    num_data(ii,:) = cellfun(@str2num, strsplit(cell2mat(ext_data(ii))));
end
% you can get any column from "num_data"
col_1 = num_data(:,1);
col_6_to_8 = num_data(:, 6:8);
  4 Comments
Bhaskar R 2019-12-27
Stephen Cobeldick provided a sophisticated answer!
I am just adapting his answer to your context:
opt = {'HeaderLines',4,'CollectOutput',true};
fmt = '"TET4{%f %f %f %f %f %s %f %f %f %f %f %f}"';
[fid,msg] = fopen('NodeNosatElementsUnedited.txt','rt');
assert(fid>=3,msg)
C = textscan(fid,fmt,opt{:}); % open C in the Variable Editor to inspect the extracted data
fclose(fid);
NodesatElements = [C{1}(:,1),C{3}(:, 3:6)]; % this is the final data
Stephen23 2019-12-27
Edited: Stephen23 2019-12-27
"What would be the best way to speed up the code."
  • By not importing numeric data as character/strings, and then awkwardly converting it to numeric afterwards.
  • By not using str2num (which hides slow eval inside).
  • By not expanding the output arrays nearly half-a-million times inside a loop.
  • By not using a cell array to store one numeric scalar per cell.
  • By not importing any data that you do not need.
For example, much like the efficient code I showed you earlier:
fmt = '"TET4{%f%*f%*f%*f%*f%*s%*f%*f%f%f%f%f}"'; % note the ignored fields!
opt = {'HeaderLines',4,'CollectOutput',true};
[fid,msg] = fopen('NodeNosatElementsUnedited.txt','rt');
assert(fid>=3,msg)
C = textscan(fid,fmt,opt{:});
fclose(fid);
M = C{1};
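When the text is already in memory, the first two points in the list can still be respected. A minimal sketch, not taken from the original answers, that converts one record's inner text with a single sscanf call instead of calling str2num on each token; the variable names `inner`, `v`, `label` and `xyz` are hypothetical.

```matlab
% Inner text of one record, copied from the question.
inner = '1 0 11 0 1.000000e+00 1.000000e-02 -1.000000e-02 1.500000e-03';

% sscanf parses every number in one call and returns a column vector.
v = sscanf(inner, '%f').';   % transpose to a 1x8 row

label = v(1);                % column 1
xyz   = v(6:8);              % columns 6 to 8
```

Reading the file directly with a textscan format string, as above, is still likely to be the faster route for large files because it avoids the intermediate text altogether.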

Andrei Bobrov 2019-12-26
T = readtable('Path\your\txt\file\ThinPlateNodes.txt');
T.Varend = str2double(regexp(T{:,end},'(\-)?\d+(\.\d+e\-\d+)?(?=\}$)','match','once'));
T.Var0 = str2double(regexp(T{:,1},'\d+(?=$)','match','once'));
T = T(:,[end,6:7,end-1]);
T.Properties.VariableNames = {'LABEL','x','y','z'};

Release

R2019b