Removing matrix rows in while-loop

Evan Watts 2018-4-27
Commented: dpb 2018-5-4
I have some data sets I am running through a code to try and find the min and max pH for certain days of some months. The data is imported as a matrix of roughly 44000x3, depending on the month. The columns are date, flow, and pH. I am trying to remove any row where the pH meter erroneously read 0, as well as rows where the flow value doesn't change. Here is my snippet of code:
n=1;
while(n<numel(Matrix(:,1)))
    if isequal(Matrix{n,3},0)
        Matrix(n,:)=[];
    end
    if isequal((Matrix{n,2}-Matrix{(n+1),2}),0)
        Matrix(n,:)=[];
    end
    n=n+1;
end
While this seems to function to some degree (my matrix goes down to about 20000x3), I know there are far fewer valid data points, and when I finally display the min and max pH values with their corresponding dates, it shows all 28-31 days of the respective month. I know there were certain months with only around 8 days of flow, so I am not sure what is happening. Any thoughts? Here is the remainder of my code that spits out the min/max values:
% Convert column 3 to numeric scalars
vec = cell2mat(Matrix(:,3));
% Find the min and max pH for each day
[idx,dates] = findgroups(Matrix(:,1));
max_pH = splitapply(@max,vec,idx);
min_pH = splitapply(@min,vec,idx);
% Convert the dates into serial numbers for sorting
Datesnum=datenum(dates);
% Input into new matrix
minmax=Datesnum;
minmax(:,2)=max_pH;
minmax(:,3)=min_pH;
% Sort the matrix based on the date values
[~,idx] = sort(minmax(:,1)); % sort just the first column
minmax2 = minmax(idx,:); % sort the whole matrix using the sort indices
% Return date serial numbers to date strings
MinMax{1,1} = datestr(minmax2(:,1)); %first cell array is the date strings
MinMax{1,2} = minmax2(:,2:end); %second cell is the X,Y data matrix

Answers (2)

dpb 2018-4-27
"The columns are date, flow, and pH. I am trying to remove any row where the pH meter erroneously read 0"
data(data(:,3)==0,:)=[]; % remove any row w/ zero for pH
"... as well as rows where the flow value doesn't change"
dF=[nan;diff(data(:,2))]; % flow difference between consecutive readings
data(dF==0,:)=[]; % remove those (MATLAB is case-sensitive: dF, not df)
Because the flow difference is computed from adjacent readings, you need to do that operation first, before cleaning out the zero pH measurements. That is probably the biggest problem in the code above: you eliminate a row for pH, and then the flow difference isn't zero, whereas it may have been in the original sequence.
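A minimal sketch of that ordering, assuming the three columns have already been pulled into a plain numeric matrix. The small `data` array and the names `flatFlow`, `zeroPH`, and `cleaned` are made up for illustration. Both masks are computed from the original sequence before anything is deleted, so removing a zero-pH row cannot change which flow differences are zero:

```matlab
% Hypothetical sample; columns are [date, flow, pH].
% Rows 2 and 4 repeat the previous flow value; row 3 has a pH of 0.
data = [1 10 7.1
        1 10 7.2
        1 12 0
        2 12 7.4
        2 15 7.0];

flatFlow = [false; diff(data(:,2)) == 0];  % first row has no predecessor
zeroPH   = data(:,3) == 0;                 % erroneous meter readings

% Apply both masks in a single indexing step against the original rows
cleaned = data(~(flatFlow | zeroPH), :);
disp(cleaned)
```

Using `false` instead of `nan` for the first element keeps the mask logical from the start; either choice leaves the first row alone.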
I'd strongly recommend reading these data into a table with readtable; you can then use the features associated with it. There is also a timetable, but I've yet to figure out how to make any really effective use of it...
  6 Comments
Evan Watts 2018-5-3
Here is the result of your recommendation.
>> m=Matrix{1:3,:};
>> whos m
  Name      Size            Bytes  Class    Attributes
  m         1x8                16  char
Matrix is a 40,000x3 cell, with column 1 being a cell, and columns 2 and 3 being doubles. Here is how I create it once the data is imported from Excel.
% create a Matrix
Matrix=Date;
% fill in Matrix
Matrix(:,2)=num2cell(Tot);
Matrix(:,3)=num2cell(pH);
Thanks for bearing with me!
dpb 2018-5-4
Ah! There's the rub: num2cell puts every element into a separate cell in the array instead of putting the whole array into one cell. The difference is that with the former, M{:,2} returns a comma-separated list, whereas if you use M{1,2}=Tot; instead, then M{:,2} returns the double array, which is what I had presumed in my initial answer.
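The difference can be seen in a small sketch. The values in `Tot` and the variable names `perElement`, `wholeArray`, `a`, `b`, and `isZeroDiff` are made up for illustration; only `num2cell`, `diff`, and the brace/bracket indexing are standard MATLAB:

```matlab
Tot = [5; 5; 8];                  % hypothetical flow values

perElement = cell(3,2);
perElement(:,2) = num2cell(Tot);  % one scalar per cell, as in the question

wholeArray = cell(1,2);
wholeArray{1,2} = Tot;            % the entire vector stored in a single cell

a = [perElement{:,2}];            % brackets gather the list into a 1x3 row
b = wholeArray{1,2};              % already the 3x1 double array

isZeroDiff = [false, diff(a) == 0];  % row mask built from the gathered values
```

Without the surrounding brackets, `perElement{:,2}` expands to three separate values, which is why a comparison such as `==0` cannot be applied to it directly.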
At least four ways to go --
  1. Keep the same arrangement for M -- then you need to write data([data{:,3}]==0,:)=[]; so the list is aggregated into a vector for the comparison, or
  2. Change M definition slightly to write M(1,2)={Tot}; etc., to have the array as a cell, then the aforementioned "curlies" will work on the content of the cell which is an array, or
  3. Do the decimation/cleanup before turning into a cell array...here you've already got arrays to work with so just save the indices and remove elements directly--
dF=[nan;diff(Tot)]; % flow difference between consecutive readings
ix=(dF==0); % the positions to remove
Tot(ix)=[]; pH(ix)=[]; Date(ix)=[]; % clean up all variables
ix=(pH==0); % now the pH measure after the dF is done
Tot(ix)=[]; pH(ix)=[]; Date(ix)=[]; % clean up all variables
  4. The last alternative would be to convert to the table type instead of a cell array; then referencing the different variable types "just works" transparently, without the need to figure out what level of dereferencing is needed for the specific form of cell array used. The "dot" reference or {} always returns the native type of the specific variable, whereas () returns another table.
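As a rough sketch of the table alternative, with made-up sample values standing in for the imported `Date`, `Tot`, and `pH` columns (the names `T`, `drop`, `sub`, and `vals` are hypothetical):

```matlab
% Hypothetical sample data standing in for the imported columns
Date = {'01-Jan-2018'; '01-Jan-2018'; '02-Jan-2018'; '02-Jan-2018'};
Tot  = [10; 10; 12; 15];
pH   = [7.1; 7.2; 0; 7.0];

T = table(Date, Tot, pH);

% Both masks are computed from the original rows, then applied once
drop = [false; diff(T.Tot) == 0] | (T.pH == 0);
T(drop, :) = [];

sub  = T(:, {'Tot','pH'});   % parentheses return another table
vals = T{:, {'Tot','pH'}};   % braces return a plain double matrix
```

Dot indexing such as `T.Tot` gives the numeric column directly, so no `cell2mat` step or bracket gathering is needed before the comparison.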



James Tursa 2018-5-3
In general, you should not remove rows in a loop. Instead, keep track of which rows you want to delete and then delete them all at once at the end of the loop. E.g., the construct looks like this:
n = size(Matrix,1);
d = false(n,1);       % logical mask of rows to delete
for k=1:n
    if( some condition for row k deletion is true )
        d(k) = true;
    end
end
Matrix(d,:) = [];     % delete all marked rows at once
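To illustrate why deleting inside the loop goes wrong, here is a small sketch on a made-up vector `v` (not the original data). After a deletion the next element slides into the current position, and the increment then steps over it:

```matlab
v = [0 0 1 0 0 2];      % hypothetical readings; zeros should be removed
n = 1;
while n <= numel(v)
    if v(n) == 0
        v(n) = [];      % the next element slides into position n ...
    end
    n = n + 1;          % ... and is skipped by this increment
end
disp(v)                 % two zeros survive: 0 1 0 2
```

Any run of consecutive rows that should be deleted loses only every other row this way, which is consistent with the matrix in the question shrinking but not far enough.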
  4 Comments
Evan Watts 2018-5-3
Ah, I think I see where I messed up here: when k=n, there is no k+1 term, and so it exceeds the dimensions. Any workaround?
Evan Watts 2018-5-3
Update, I did a goofy looking bandaid level patch that I think works.
k=1:(n-1)
some high level coding work being done on my end obviously.
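One possible variation on that patch, sketched on a made-up four-row cell array in the same {date, flow, pH} layout as the question: rather than shortening the range to 1:(n-1), which would also skip the pH check on the last row, the k+1 comparison is guarded with k < n. The sample values are hypothetical, and the flow test marks the earlier row of each unchanged pair, as the original loop did:

```matlab
% Hypothetical stand-in for the question's cell array: {date, flow, pH}
Matrix = {'d1', 10, 7.1
          'd1', 10, 7.2
          'd1', 12, 0
          'd2', 15, 7.0};

n = size(Matrix,1);
d = false(n,1);                 % logical mask of rows to delete
for k = 1:n
    if Matrix{k,3} == 0
        d(k) = true;            % pH meter read 0
    elseif k < n && Matrix{k,2} == Matrix{k+1,2}
        d(k) = true;            % flow unchanged going into the next row
    end
end
Matrix(d,:) = [];               % delete all marked rows at once
```

Because nothing is removed until after the loop, every comparison sees the original neighbours.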
