I know this is probably a novice question, but I am quite a MATLAB novice. The while loop in my script begins to run ridiculously slowly as the table "nonapattern" increases in size. Is it possible to increase the speed somehow? Thank you.
counter=1;
searchsize=254;
patternsize=92378;
j=1;
i=1;
newlist = zeros(100,2);
while counter<patternsize
    while i<searchsize
        i
        if isequal(pinellas{i,3},nonapattern{counter,1})
            newlist(j,1)=pinellas(i,1);
            newlist(j,2)=pinellas(i,2);
            j=j+1;
        end
        i=i+1;
    end
    counter=counter+1;
    i=1;
end
Pattern trajectory is the script which matches the patterns from "Data" with the list in "nonapattern". When "nonapattern" becomes large (e.g. around 90,000 x 2 element table) the script takes days to run. Thanks so much for any suggestions/help to make this run faster.
Accepted Answer
jonas
2018-7-29
Edited: jonas
2018-7-31
Looks like the size of the matrix is increasing with each entry. Read about preallocation and preallocation of matrices of unknown size.
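A minimal sketch of preallocating when the final size is not known in advance: allocate an upper bound once, count the rows actually filled, and trim afterwards. The bound nMax and the mod test are hypothetical stand-ins, not values from the original script.

```matlab
% Hypothetical upper bound on the number of matches (assumption)
nMax = 1000;
newlist = zeros(nMax, 2);    % allocate once, before the loop
j = 0;                       % number of rows filled so far
for k = 1:nMax
    if mod(k, 3) == 0        % stand-in for the real match test
        j = j + 1;
        newlist(j, :) = [k, 2*k];
    end
end
newlist = newlist(1:j, :);   % trim the unused rows
```

Removing the bare `i` statement from the inner loop of the original script should also help, since it prints the counter to the Command Window on every pass.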
Other than that, the original script loops through one cell array, nonapattern, and finds matching strings in a second cell array, data, including duplicates. Some data is then extracted from the matched rows of data. Faster code given below:
Load data
[~,nonapattern]=xlsread('nonapattern.xlsx');
[numdata,data]=xlsread('Data.xlsx');
Find the strings that are common to both cell arrays
[C,ia,ib] = intersect(nonapattern,data)
C =
3×1 cell array
{'SO5 SO6 SOA SOB SOC SOD SOE SOG SOO'}
{'SO5 SO6 SOA SOB SOD SOE SOG SOH SOO'}
{'SO5 SO6 SOD SOE SOF SOG SOK SOM SON'}
Next, find duplicates
index=cellfun(@(x)find(ismember(data,x)==1),C,'uniformoutput',false)
index =
1×3 cell array
{5×1 double} {4×1 double} {2×1 double}
Grab corresponding numerical data from numdata, columns 1 and 2
out=cellfun(@(x)numdata(x,1:2),index,'uniformoutput',false);
out =
3×1 cell array
{5×2 double}
{4×2 double}
{2×2 double}
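If a single flat list like newlist in the question is enough, one ismember call may do the whole job. This is only a sketch on made-up data: the three small arrays below stand in for the spreadsheet contents, and it assumes that row k of numdata belongs to row k of data. The rows come out in the order of data rather than grouped by pattern.

```matlab
% Made-up stand-ins for the spreadsheet contents (assumption)
nonapattern = {'A B C'; 'D E F'; 'G H I'};
data        = {'D E F'; 'X Y Z'; 'A B C'; 'D E F'};
numdata     = [1 10; 2 20; 3 30; 4 40];

% tf(k) is true where data{k} occurs anywhere in nonapattern;
% loc(k) is the matching row of nonapattern (0 if there is none)
[tf, loc] = ismember(data, nonapattern);

newlist    = numdata(tf, 1:2);   % columns 1 and 2 of every matching row
patternRow = loc(tf);            % which pattern each kept row matched
```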
11 Comments
jonas
2018-8-4
Edited: jonas
2018-8-4
I am a bit confused because I don't understand the structure of your new data. Now you are working with 2D cell arrays, which is fine, but what are the dimensions of numdata? Feel free to upload the new data if you want me to take a look.
Anyway, let's break the code down line by line, using my original notation.
[C,ia,ib] = intersect(nonapattern,data)
You said this works fine, but I suspect there is a problem with the input here. I would take a look at the content of C{1} to make sure it looks OK. The next line of code:
index=cellfun(@(x)find(ismember(data,x)==1),C,'uniformoutput',false)
goes over each unique cell in C, cell by cell, and finds matches in data. The function ismember outputs a logical matrix with the same size as data, containing ones where you have matches and zeros otherwise. The find function then takes this matrix and outputs the linear indices of the matches, i.e. the ones. It seems C{1} matches 6315992 times, which is not necessarily wrong, but makes me believe there is something sketchy going on with the content of that cell.
out=cellfun(@(x)numdata(x,1:2),index,'uniformoutput',false);
The problem is in this line of code, which only works if both C and data are single-column cell arrays. The reason is that the previous line of code outputs linear indices, as opposed to subscripts.
What are linear indices? Assume you have a matrix:
A =
0 0
0 0
1 1
find(A==1)
3
6
The linear indices basically describe the position in the 2D array if you stack the columns on top of one another to form one long 1D array.
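The same 3×2 example can be checked directly. This short sketch compares the linear indices with the row and column subscripts of the same two hits; the variable names are only for illustration.

```matlab
A = [0 0; 0 0; 1 1];

lin = find(A == 1);                 % linear indices, counted down the columns
[row, col] = find(A == 1);          % the same hits as row/column subscripts
[r2, c2] = ind2sub(size(A), lin);   % convert linear indices to subscripts
```

Here lin is [3; 6], while row is [3; 3] and col is [1; 2]: both ones sit in row 3, but the second one has linear index 6, which is not a valid row number for a matrix with 3 rows.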
The next line
out=cellfun(@(x)numdata(x,1:2),index,'uniformoutput',false);
breaks down because we are using linear indices to refer to rows.
This can easily be fixed. In fact, the find function returns row and column subscripts instead of linear indices if you request two outputs:
[row,col]=find()
However, I don't understand the structure of your new numdata so I cannot write the new code for you.
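For what it is worth, here is a rough sketch of how the row-based version might look, under assumptions that may not hold for the real files: data is a 2D cell array of strings, numdata has one row per row of data, and a row counts as a match if any of its cells equals the pattern. All values are made up. Because an anonymous function cannot request two outputs from find, the sketch uses any(...,2) to collapse the match matrix to rows first.

```matlab
% Made-up 2D example (assumption: row k of numdata belongs to row k of data)
data    = {'A B C', 'D E F'; 'D E F', 'X Y Z'; 'A B C', 'A B C'};
numdata = [1 10; 2 20; 3 30];
C       = {'A B C'; 'D E F'};

% Rows of data in which the pattern occurs in at least one column
index = cellfun(@(x) find(any(ismember(data, x), 2)), C, 'UniformOutput', false);

% Columns 1 and 2 of numdata for those rows
out = cellfun(@(r) numdata(r, 1:2), index, 'UniformOutput', false);
```

With this approach a row that matches in several columns is returned only once.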