Optimizing a strfind operation for speed
Hi Matlab Central,
I am an inexperienced programmer looking to speed up my code. I know enough to run the profiler and see what is taking a long time, and I think it is this part:
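UniqueTFArray = unique(CombinedArray);
% One row per entry of AAA, one column per unique string
TFtable = zeros(size(AAA,1), length(UniqueTFArray));
for i = 1:size(AAA,1)
    for j = 1:length(UniqueTFArray)
        % 1 if UniqueTFArray{1,j} occurs anywhere in AAA.Regulator{i,1}
        TFtable(i,j) = ~isempty(strfind(AAA.Regulator{i,1}, UniqueTFArray{1,j}));
    end
end
% Number of rows of AAA in which each unique string was found
TFSum = sum(TFtable);
figure; bar(TFSum);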
AAA has a few thousand rows and UniqueTFArray has a few hundred entries, so with the loops written this way I think the profiler is telling me that strfind gets called about 520,000 times, which is why it is slow.
Now, I have a few ideas that I think could be of use.
Most entries of AAA.Regulator are empty, so their length is 0. Should I put the strfind line inside an if statement and only call it when the length is greater than 0? I think that would save time...
Or is there a fundamentally better approach?
Thank you very much!
Accepted Answer
More Answers (2)
Walter Roberson
2014-1-17
There is no point in searching for a string that will never be found.
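One way this could look, as a minimal sketch rather than a tested fix for the real data: drop the empty entries first, then call strfind once per search string on the whole remaining cell array instead of once per (i,j) pair. The sample values below are made up, and Regulator stands in for AAA.Regulator, since no real entries were posted.

```matlab
% Hypothetical sample data; the real AAA.Regulator and UniqueTFArray were not posted.
Regulator = {'abc def'; ''; 'def'; ''};
UniqueTFArray = {'abc', 'def', 'xyz'};

% Only rows with a non-empty Regulator entry can contain a match.
nonEmpty = ~cellfun('isempty', Regulator);
candidates = Regulator(nonEmpty);

TFtable = false(numel(Regulator), numel(UniqueTFArray));
for j = 1:numel(UniqueTFArray)
    % strfind accepts a cell array of strings as its first argument and
    % returns one cell of match positions per entry.
    hits = strfind(candidates, UniqueTFArray{j});
    TFtable(nonEmpty, j) = ~cellfun('isempty', hits);
end

% Number of rows in which each string was found
TFSum = sum(TFtable, 1);
```

With a few hundred search strings this makes a few hundred strfind calls rather than several hundred thousand. Whether that is enough depends on the real data.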
Question: is AAA.Regulator only unique words, or are you ending up searching multiple times for some words?
Do I understand correctly that the point of the code is to count the number of times that each word of a corpus of words appears in each subset? And to check: are you looking for exact matches, with a whole word matching a whole word, or are you looking for the case where a word in AAA.Regulator{i,1} appears anywhere within any of the words? For one thing, if you are looking for whole-word matches, then you can "break" out of the "for j" loop as soon as a match occurs.
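If it does turn out to be the whole-word case, the early exit could be sketched as below. This assumes each Regulator entry is a single word to be compared exactly with the entries of UniqueTFArray; the thread does not confirm that, and the sample values are made up. The ismember/accumarray lines at the end are a loop-free alternative under the same assumption.

```matlab
% Hypothetical sample data, assuming each Regulator entry is one whole word.
Regulator = {'abc'; ''; 'def'; 'def'};
UniqueTFArray = {'abc', 'def', 'xyz'};

TFSum = zeros(1, numel(UniqueTFArray));
for i = 1:numel(Regulator)
    if isempty(Regulator{i})
        continue   % nothing to search for in an empty entry
    end
    for j = 1:numel(UniqueTFArray)
        if strcmp(Regulator{i}, UniqueTFArray{j})
            TFSum(j) = TFSum(j) + 1;
            break   % UniqueTFArray has no duplicates, so no later j can match
        end
    end
end

% Loop-free alternative: ismember returns the matching index for each entry,
% and accumarray counts how often each index occurs.
[found, loc] = ismember(Regulator, UniqueTFArray);
TFSumVec = accumarray(loc(found), 1, [numel(UniqueTFArray) 1]).';
```

Both versions count exact matches only, so they would not reproduce the substring behaviour of the original strfind loop.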
3 Comments
Sarutahiko
2014-1-17
Edited: Sarutahiko
2014-1-17
Walter Roberson
2014-1-17
A sample AAA.Regulator entry and a sample entry from UniqueTFArray would help.
Sarutahiko
2014-1-21