Write specific data of the specific lines of a text file into a matrix

Hello, I have a huge text file with the following format:
*********************************
timestep 225645
A 8
B 43
C 4
D 1
*********************************
timestep 225650
A 10
D 12
C 1
*********************************
What I want is to write the number that follows each timestep into the first column of a matrix. Also, for each block I want to write the value that follows B into the second column of that matrix. If no B is reported in a block, the element should be 0. I hope you can help me. Thanks.

Answers (2)

Azzi Abdelmalek 2015-6-11
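fid=fopen('fic.txt');
r={}; % one cell per line of the file
l=fgetl(fid);
while ischar(l)
    r{end+1}=l;
    l=fgetl(fid);
end
fclose(fid);
idx=find(~cellfun(@isempty,regexp(r,'timestep\s+\d+'))); % lines that start a block
a=regexp(r(idx),'\d+','match','once'); % timestep number on each of those lines
b=str2double(a);
last=[idx(2:end)-1 numel(r)]; % index of the last line of each block
f=zeros(1,numel(b)); % stays 0 where a block has no B
for k=1:numel(b)
    s=r(idx(k)+1:last(k)); % lines belonging to block k
    c=regexp(s,'^\s*B\s+(\d+)','tokens','once'); % key must be exactly B, followed by its value
    c=c(~cellfun(@isempty,c)); % keep only the lines that matched
    if ~isempty(c)
        f(k)=str2double(c{1}{1});
    end
end
M=[b(:) f(:)]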
1 comment
Homayoon 2015-6-11
Edited: Homayoon 2015-6-11
Dear Azzi, I really appreciate your help! I was stuck on this issue until you provided the code. However, the code does not seem to work correctly, which may be due to some ambiguities in my question. At the moment the code generates the first column of the matrix perfectly, but the second column is always 0. To clear up the issue, I have attached a sample of my input text file. The second column I am actually interested in is the value that follows H2O. To distinguish H2O from H2O2, I have to put an extra space after H2O to avoid false matches. I would appreciate your help as before. Thanks for being so nice! PS: In line 16 of the code you gave me I changed B to HO2, but it did not work. The second column is always zero, no matter what B is!
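The H2O versus H2O2 problem raised in this comment comes down to matching a key exactly. Below is a minimal sketch of one way to do that. It is not the code from either answer. The sample text, the variable names (txt, key, blocks, M) and the short separator lines are assumptions made up for illustration. The pattern requires whitespace (or the start of the text) before the key and whitespace after it, so H2O2 is not mistaken for H2O.

```matlab
% Hypothetical sample text standing in for the real file contents
txt = sprintf(['*****\ntimestep 525305\nH2 24\nH2O 21\nH2O2 1\n', ...
               '*****\ntimestep 525310\nH2 24\nH2O2 1\n*****\n']);
key = 'H2O';   % species of interest (assumed name)

% One token pair per block: {timestep number, text up to the next timestep}
blocks = regexp(txt, 'timestep\s+(\d+)(.*?)(?=timestep|$)', 'tokens');

M = zeros(numel(blocks), 2);   % second column stays 0 when the key is absent
pat = ['(?<!\S)' regexptranslate('escape', key) '\s+(\d+)'];
for k = 1:numel(blocks)
    M(k,1) = str2double(blocks{k}{1});
    tok = regexp(blocks{k}{2}, pat, 'tokens', 'once');
    if ~isempty(tok)
        M(k,2) = str2double(tok{1});
    end
end
```

For a real file, txt could be obtained with fileread, assuming the file fits in memory.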



Stephen23 2015-6-11
Edited: Stephen23 2015-6-11
This code reads the whole file as one string, then performs some string replacements so that textscan can convert all of the values:
str = fileread('attached.txt');
str = regexprep(str,{'(\\par)?\s*\n','[*]{5,}'},{' ','\n'});
fmt = repmat('%s%f',1,9); % 9 == the timestep line plus eight 'key value' lines
C = textscan(str,['timestep',fmt(3:end)], 'HeaderLines',1, 'MultipleDelimsAsOne',true);
N = [C{3:2:end}]; % numeric values
S = [C{2:2:end}]; % string keys
T = C{1}; % numeric timesteps
Actually all of the data is now available in the variables N, S, and T. But if you want the columns of N to each contain just one variable, then the rows need to be sorted according to S, which can be done using this code:
X = cellfun('isempty',S);
U = unique(S(~X));
for k = 1:numel(T)
S(k,X(k,:)) = setdiff(U,S(k,:)); % insert missing keys
[S(k,:),Y] = sort(S(k,:)); % sort keys
N(k,:) = N(k,Y); % sort values
end
And we can view the output in the command window:
>> S
S =
'H' 'H2' 'H2O' 'H2O2' 'HO' 'HO2' 'No_Specs' 'O2'
'H' 'H2' 'H2O' 'H2O2' 'HO' 'HO2' 'No_Specs' 'O2'
'H' 'H2' 'H2O' 'H2O2' 'HO' 'HO2' 'No_Specs' 'O2'
'H' 'H2' 'H2O' 'H2O2' 'HO' 'HO2' 'No_Specs' 'O2'
>> N
N =
NaN 24 21 1 7 1 6 34
NaN 24 21 1 7 1 6 34
1 24 20 1 8 1 7 34
1 24 20 1 8 1 7 34
>> T
T =
525305
525310
525315
525320
Note that the columns are in alphabetical order (after the sort), and missing values are indicated with NaN.
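The original question asked for a two-column matrix with 0 for missing values rather than NaN. A minimal sketch of that last step follows, assuming T, S and N have the shapes shown above. The small arrays here are stand-ins copied from the first and third rows of the displayed output so that the snippet runs on its own, and M and col are illustrative names.

```matlab
% Stand-ins for T, S and N (first and third rows of the output shown above)
T = [525305; 525315];
S = repmat({'H','H2','H2O','H2O2','HO','HO2','No_Specs','O2'}, 2, 1);
N = [NaN 24 21 1 7 1 6 34;
       1 24 20 1 8 1 7 34];

N(isnan(N)) = 0;                % treat species missing from a block as zero
col = strcmp(S(1,:), 'H2O');    % exact comparison, so 'H2O2' is not selected
M = [T, N(:,col)];              % first column timestep, second column H2O
```

Because strcmp compares whole strings, no trailing space is needed to tell H2O apart from H2O2.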
