Improving Efficiency of Find Algorithm

Hello,
I am aware that, in some cases, logical indexing is much faster than using the find function. I'm wondering if there is a way to improve the following algorithm. I'm not quite sure how to use indexing (if it is even possible) in this situation.
What I have is a matrix of ascending values, though some of those values may be repeated (specifically, I have millions of ascending timestamps with many repeated). I am then seeking the start and end indices of a window that is between time X and Y.
Here is an example of the algorithm that I currently have implemented:
myDataTimestamps = [10 20 30 30 30 40 50 60 60 60 70 70 80 90];
window_start_time = 30;
window_end_time = 80;
start_index = find(myDataTimestamps >= window_start_time,1,'first');
end_index = find(myDataTimestamps <= window_end_time,1,'last');
Is there a way to improve the speed of this code and still return the same start_index of 3 and end_index of 13?
Much appreciated!

Answers (2)

Cris LaPierre 2021-8-19
This approach may only work for this simple case, but here's a way to do it using max/min.
myDataTimestamps = [10 20 30 30 30 40 50 60 60 60 70 70 80 90];
window_start_time = 30;
window_end_time = 60;
% find start/end index
ind = 1:length(myDataTimestamps);
wind = myDataTimestamps==window_start_time | myDataTimestamps==window_end_time;
start_index = min(ind(wind))
start_index = 3
end_index = max(ind(wind))
end_index = 10
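A note on the "simple case" caveat: the equality mask above only finds the window when both bounds occur exactly in the data. The sketch below is not part of the original answer; it uses two hypothetical bounds (25 and 65) that fall between timestamps, and the variable names eq_start, eq_end and inWindow are made up for illustration. It compares the equality mask with an inequality mask.

```matlab
myDataTimestamps = [10 20 30 30 30 40 50 60 60 60 70 70 80 90];
window_start_time = 25;   % hypothetical bound that is not in the data
window_end_time = 65;     % hypothetical bound that is not in the data
ind = 1:length(myDataTimestamps);

% Equality mask: no element equals either bound, so both results are empty.
wind = myDataTimestamps==window_start_time | myDataTimestamps==window_end_time;
eq_start = min(ind(wind));
eq_end = max(ind(wind));

% Inequality mask: selects every element inside the window.
inWindow = myDataTimestamps>=window_start_time & myDataTimestamps<=window_end_time;
start_index = min(ind(inWindow));
end_index = max(ind(inWindow));
```

With these bounds the equality version returns empty results, while the inequality version returns the indices of the first and last timestamps inside the window.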
1 Comment
Matt C 2021-8-19
Edited: Matt C 2021-8-19
I haven't been able to do a proper comparison, but that wasn't the massive improvement I was hoping for. I cancelled the script early without recording total runtimes, but it looked like the proposed algorithm was going to take just as long as the find function, if not longer. I implemented the recommendation as:
myDataTimestamps = [10 20 30 30 30 40 50 60 60 60 70 70 80 90];
window_start_time = 30;
window_end_time = 60;
% find start/end index
ind = 1:length(myDataTimestamps);
start_index = min(ind(myDataTimestamps>=window_start_time));
end_index = max(ind(myDataTimestamps<=window_end_time));
Have I gotten anything wrong in my implementation above? Note that my CPU load was ~50%, and only ~1 GB of my 24 GB of RAM was being used.
Edit: I can confirm that it took much longer using the min/max method. My code took ~45 minutes to fully execute using 'find', whereas it had only completed ~25% after about 2 hours using the min/max method.
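One possible direction, not proposed by anyone in this thread and not timed on the real data: both versions above evaluate a comparison over the entire vector before find or min/max sees the result. Because the timestamps are sorted, a binary search needs only about log2(n) comparisons per bound. The following is a minimal sketch under that assumption, written as plain loops so it runs as a script. Unlike find, it signals "no match" with an out-of-range index rather than an empty result.

```matlab
% Sketch: binary search for the window bounds in a sorted vector.
myDataTimestamps = [10 20 30 30 30 40 50 60 60 60 70 70 80 90];
window_start_time = 30;
window_end_time = 80;
n = numel(myDataTimestamps);

% First index whose timestamp is >= window_start_time.
lo = 1; hi = n + 1;            % search the half-open range [lo, hi)
while lo < hi
    mid = floor((lo + hi) / 2);
    if myDataTimestamps(mid) < window_start_time
        lo = mid + 1;          % mid and everything before it is too small
    else
        hi = mid;              % mid may be the answer, keep it in range
    end
end
start_index = lo;              % equals n + 1 if no element qualifies

% Last index whose timestamp is <= window_end_time.
lo = 1; hi = n + 1;
while lo < hi
    mid = floor((lo + hi) / 2);
    if myDataTimestamps(mid) <= window_end_time
        lo = mid + 1;          % mid qualifies, look further right
    else
        hi = mid;              % mid is too large
    end
end
end_index = lo - 1;            % equals 0 if no element qualifies
```

For the example data this gives the same start_index of 3 and end_index of 13 as the original find calls. Whether it actually shortens the 45-minute run depends on how many windows are looked up and how large the vector is.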



Cris LaPierre 2021-8-19
I wonder if this is a scenario where using tall data may help. See this page.

Release

R2015b
