Concatenate data from processing of imported text files

I am reading text files from a folder on my computer and using a for loop to process each one. The processing is simple: I extract the first two columns of each file and store them in K9. K9 is typically 14000*2, but the row count varies slightly (for example 14002*2). How can I concatenate all the K9 values into a new matrix, say K10? My text files are named A1.txt, A2.txt, A3.txt, ... and A22.txt. I also need a new first row that stores the names of these text files, so that I can tell which file each block of data belongs to. I have searched the MATLAB forum for help but could not solve the problem. Here is my code:
directory = 'D:\PhD\Matlab code'; % path from which I am reading files
textfiles = fullfile(directory,'*.txt');
dinfo = dir(textfiles);
for K = 1:length(dinfo)
    incidentfile = fullfile(directory, dinfo(K).name);
    K0 = importdata(incidentfile);
    K1 = K0.data; % I keep only the matrix data useful to me, not the rest
    K9 = [K1(:,1) K1(:,2)]; % extract the first two columns, not the third
end
Please help. Thank you.
Regards

Accepted Answer

Max Murphy 2019-12-9
Not the most efficient, but you could do:
directory = 'D:\PhD\Matlab code'; % path from which I am reading files
textfiles = fullfile(directory,'*.txt');
dinfo = dir(textfiles);
K9 = [];   % stacked numeric data from all files
K10 = [];  % one file-name label per row of K9
for K = 1:length(dinfo)
    incidentfile = fullfile(directory, dinfo(K).name);
    K0 = importdata(incidentfile);
    K1 = K0.data; % I keep only the matrix data useful to me, not the rest
    % Same as vertcat() function:
    K9 = [K9; K1(:,1) K1(:,2)]; % append the first two columns, not the third
    % Cell array of labels:
    K10 = [K10; repmat({dinfo(K).name},size(K1,1),1)];
end
You might also look into the MATLAB table format, since it seems you want different data entries where each "row" is a labeled data point.
To make the labeling vector smaller, you may also look into MATLAB categorical variables, which can reduce the size if you get all the unique entries of dinfo.name first.
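The table and categorical suggestions above could look roughly like the following. This is only a sketch under stated assumptions: the cell arrays names and blocks are hypothetical stand-ins for dinfo(K).name and the per-file K9 matrices, and the column names File, X and Y are made up for illustration.

```matlab
% Hypothetical stand-ins for the file names and the per-file K9 matrices
names  = {'A1.txt', 'A2.txt'};
blocks = {[0.1 1; 0.2 2; 0.3 3], [0.4 4; 0.5 5]};

parts = cell(numel(names), 1);   % one small table per file
for K = 1:numel(names)
    K9 = blocks{K};
    n  = size(K9, 1);
    % Repeat the file name once per row and store it as a categorical label
    label = categorical(repmat(names(K), n, 1));
    parts{K} = table(label, K9(:,1), K9(:,2), ...
        'VariableNames', {'File', 'X', 'Y'});
end

% Stack all per-file tables into one labeled table
T = vertcat(parts{:});
```

Rows belonging to one file can then be selected with something like T(T.File == 'A2.txt', :).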
  6 Comments
Max Murphy 2019-12-10
I see that they are intermediate steps in a processing algorithm, so it might make sense to do it that way.
Your algorithm is currently something like:
% Iterate on each file in the dataset
for K = 1:length(dinfo)
    % Import the data
    % ...
    % Do processing on reduced subset of the data that meets criteria
    % ...
    % Write the result of processing to variable K7
    % --> K7 is overwritten each time the loop runs
end
It might be easier to write a separate processing function that is called once on each loop iteration:
% Data matrix we wish to concatenate
data = [];
for K = 1:length(dinfo)
    % Import the data
    incidentfile = fullfile(directory, dinfo(K).name);
    K0 = importdata(incidentfile);
    tmp = [K0.data(:,1), K0.data(:,2)];
    % Do processing on reduced subset of the data that meets criteria
    data = [data; doProcessing(tmp,T,P,Q)];
    % Note that this causes the output of doProcessing to be
    % vertically concatenated to the existing matrix, [data].
    % This can become inefficient for large datasets, in which case
    % it is better to pre-allocate your data matrix or store it in
    % some other way where only the relevant chunk is being accessed.
end
And the processing function is
function data_out = doProcessing(data_in,T,P,Q)
    K2 = data_in(data_in(:,1)<=0.61,:);   % Operation1
    K3 = K2(K2(:,1)>0,:);                 % Operation2
    K4 = [K3(:,1) K3(:,2)*T];             % Operation3
    K5 = [K4(:,1) K4(:,2)-K4(1,2)];       % Operation4
    K6 = [K5(:,1)*P K5(:,2)*-1];          % Operation5
    data_out = [K6(:,1) K6(:,2)*Q];       % Operation6
end
This can be saved as a .m file in the same working directory as your current script, or it can be a local or nested function within your current file, for example. If you save it as a separate file, the file should have the same name as the function (in this example, doProcessing.m).
I would also point out that doProcessing assumes T, P, and Q are scalars; if they are not, it may not work, depending on the dimensions of your dataset.
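The remark in the loop comments about growing data on every iteration can be addressed by collecting each file's result in a cell array and concatenating once at the end. A minimal sketch, assuming the per-file results are numeric matrices with two columns; the variable results is a hypothetical stand-in for the outputs of doProcessing:

```matlab
% Hypothetical per-file results with slightly different row counts
results = {ones(4,2), 2*ones(6,2), 3*ones(5,2)};

nFiles = numel(results);
chunks = cell(nFiles, 1);        % one slot per file, allocated once
for K = 1:nFiles
    % In the real loop this would be doProcessing(tmp,T,P,Q)
    chunks{K} = results{K};
end

% Single vertical concatenation after the loop
data = vertcat(chunks{:});
```

Because the row counts differ between files, a cell array is used here instead of a fixed-size numeric matrix; the total size only has to be known at the final vertcat.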
Venkatesh M Deshpande
Yes I initialized the matrix K9 and it works. For K10, it is giving me names of two files repeatedly in two columns. However, I am not concerned about that right now. Thanks for your time. I will also update you on the new code you have sent.


More Answers (1)

Jakob B. Nielsen 2019-12-9
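You can't join two arrays side by side if they have different numbers of rows (stacking them vertically works as long as the column counts match), but you can use a structure to store the data, which will give you almost the same thing.
For example: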
directory = 'D:\PhD\Matlab code'; % path from which I am reading files
textfiles = fullfile(directory,'*.txt');
dinfo = dir(textfiles);
for K = 1:length(dinfo)
    incidentfile = fullfile(directory, dinfo(K).name);
    K0 = importdata(incidentfile);
    K1 = K0.data; % I keep only the matrix data useful to me, not the rest
    K9 = [K1(:,1) K1(:,2)]; % extract the first two columns, not the third
    concstruct(K).data = K9;
    concstruct(K).name = dinfo(K).name;
end
If you absolutely must join the K9s in the same matrix, you will have to either cut off all rows beyond 14000 or pad the shorter matrices with trailing zeros up to the row count of the largest one. You can't have numbers and characters in the same numeric array, so consider calling array2table on your final array and using your file names as the table's variable names. But I would still just use the structure approach; it gives you essentially the same thing :)
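The zero-padding option combined with array2table might be sketched as follows. The data in blocks, the shortened names in names, and the _col1/_col2 suffixes are all assumptions for illustration; the file extension is dropped because table variable names in this release need to be valid MATLAB identifiers.

```matlab
% Hypothetical per-file K9 matrices and file names without the extension
blocks = {[1 10; 2 20; 3 30], [4 40; 5 50]};
names  = {'A1', 'A2'};

% Pad every block with trailing zeros up to the tallest one
maxRows = max(cellfun(@(b) size(b, 1), blocks));
wide = zeros(maxRows, 2*numel(blocks));
varNames = cell(1, 2*numel(blocks));
for K = 1:numel(blocks)
    n = size(blocks{K}, 1);
    wide(1:n, 2*K-1:2*K) = blocks{K};       % two columns per file
    varNames{2*K-1} = [names{K} '_col1'];
    varNames{2*K}   = [names{K} '_col2'];
end

% Attach the file-derived names as table variable names
Tw = array2table(wide, 'VariableNames', varNames);
```

Note that the padded zeros are indistinguishable from real zero measurements, which is one reason the structure approach may be preferable.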
  2 Comments
Stephen23 2019-12-9
There is no need to create a new structure; you can just use the structure returned by dir:
dinfo(K).data = K9;
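A small sketch of this idea, using a hand-built struct array in place of the real output of dir so that it runs without any files; the contents of blocks are hypothetical. Note that in both answers the struct is an ordinary variable indexed inside the loop, not a function.

```matlab
% Stand-in for the struct array returned by dir (only the name field is used)
dinfo  = struct('name', {'A1.txt', 'A2.txt'});
% Hypothetical per-file K9 matrices
blocks = {[0.1 1; 0.2 2; 0.3 3], [0.4 4; 0.5 5]};

for K = 1:numel(dinfo)
    K9 = blocks{K};
    dinfo(K).data = K9;   % store each file's data next to its name
end

% Each file keeps its own block, and everything can still be stacked at once
allData   = vertcat(dinfo.data);
firstName = dinfo(1).name;
```

After the loop, dinfo(K).name and dinfo(K).data stay paired for every K, so nothing is overwritten between iterations.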
Venkatesh M Deshpande
Edited: Venkatesh M Deshpande 2019-12-10
No, this code is not working. The last two lines that you added do not change my output; I am still getting the same result. Also, I could not find any function called concstruct. It does not show any error when I run it, but the results are the same, i.e. it stores data for the last file only. In my previous answer to your post, I shared my complete code and the text files I am processing. If you could help me with that, I would be very grateful. Thank you.


Release

R2017b
