Could someone please help me speed up my code?

I have code that will end up taking days to execute. The problem is that I have to pass a huge number of values to "mean" and a couple of other MATLAB functions. I have 12 .mat files to process, but I can't get it fast enough to get through one file in less than a few days. I really need help finding a way to speed everything up. To check the code, just name a file KCTDI001A.mat and fill it with a 14002450x2 random number matrix.
clear
clc
addpath('C:\filepath');
AllFiles = [];
filenames = dir('C:\filepath');
profile clear
profile on
for ii = 3:length(filenames); % Start at third file( i.e., don’t include “.” and “..”)
%Get the filename
filename_timestamp = filenames(ii).name;
Index = findstr('A',filename_timestamp);
n = str2num(filename_timestamp(6:Index(1)-1));
File_Name = sprintf('KCTDI0%dA.mat',n);
DataFile = load(File_Name);
ACC_Data = DataFile.FileData(:,:);
for k = 1:13978444;
x = (ACC_Data(1+(k-1):24007+(k-1),:));
RMS(k,:) = sqrt(mean(x(:,:).^2));
end
new_name = sprintf('KCTDI0%dA_RMS.mat',n);
save(new_name,'RMS');
end
profile off
profile viewer
  5 Comments
Guillaume 2016-6-27
Also, using addpath just so you can load files from a different directory is not a good idea. This would be much better:
root = 'C:\filepath';
filenames = dir(root);
for ...
...
DataFile = load(fullfile(root, File_Name));
Also, note that your code will never load a file named KCTDI001A.mat (as you suggest creating), since for n = 1 the name your sprintf generates is KCTDI01A.mat (one fewer 0).
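The two points above can be combined in a short sketch. It assumes that the raw files all match the pattern KCTDI*A.mat and that the number in the name is always three digits wide; both are assumptions, as is the choice to save the results next to the inputs.

```matlab
% Hypothetical data folder; substitute the real directory.
root = 'C:\filepath';

% The pattern matches the raw files (e.g. KCTDI001A.mat) but not the
% saved outputs (e.g. KCTDI001A_RMS.mat), and it skips "." and "..".
files = dir(fullfile(root, 'KCTDI*A.mat'));

for ii = 1:numel(files)
    inName   = files(ii).name;
    DataFile = load(fullfile(root, inName));   % no addpath needed
    ACC_Data = DataFile.FileData;
    % ... compute RMS from ACC_Data here ...
    [~, base] = fileparts(inName);             % name without ".mat"
    outName   = fullfile(root, [base '_RMS.mat']);
    % save(outName, 'RMS');                    % enable once RMS exists
end

% If a name must be rebuilt from a number, zero-padding keeps all digits:
name = sprintf('KCTDI%03dA.mat', 1);           % 'KCTDI001A.mat'
```

Reusing the name returned by dir avoids the parse-and-rebuild step entirely, so the mismatch between KCTDI001A.mat and KCTDI01A.mat cannot occur.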
Tony Pate 2016-6-27
The bottleneck is at the "mean" function, but I would like to find an alternative to this function or an alternative to the overall function that I am trying to create. I need the root mean square of the window of data that I am selecting, so that is why I have that line of code "RMS(k,:) = sqrt(mean(x.^2));". If there is a quicker way to find the RMS a ton of times, then that would also work. I just am having a hard time running this program quickly with the "mean" function slowing things down so much.


Accepted Answer

Guillaume 2016-6-27
See my comment to the question about the (:, :). The extra brackets in x = (ACC...) also do not help readability.
To speed up the loop you could certainly move the squaring and the square root out of it:
ACC_Data = DataFile.FileData.^2; %do the squaring only once
RMS = zeros(13978444, 2); %would be better if sizes were not hardcoded
for k = 1:13978444 %semicolon not needed
RMS(k,:) = mean(ACC_Data(1+(k-1):24007+(k-1),:));
end
RMS = sqrt(RMS);
But this is not going to help much with speed because you're still sliding over lots of rows.
If you have MATLAB R2016a or newer, then you can use movmean to calculate the moving average without a loop:
RMS = sqrt(movmean(DataFile.FileData .^ 2, 24007, 1, 'EndPoints', 'discard'));
If not, you can simply do a convolution with a constant vector of the right length and value:
RMS = sqrt(conv2(DataFile.FileData .^ 2, ones(24007, 1) / 24007, 'valid'));
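One way to check these one-liners is to compare them with the original loop on a small random matrix. The sketch below assumes a 1000x2 input and a window of 25 samples purely for illustration; the variable names are made up for the example, and movmean requires R2016a or newer.

```matlab
rng(0);                          % reproducible test data
X = randn(1000, 2);              % small stand-in for DataFile.FileData
N = 25;                          % stand-in for the 24007-sample window
K = size(X, 1) - N + 1;          % number of full windows

% Reference: the original loop, with preallocation.
RMS_loop = zeros(K, size(X, 2));
for k = 1:K
    RMS_loop(k, :) = sqrt(mean(X(k:k+N-1, :).^2, 1));
end

% Loop-free versions from the answer above.
RMS_conv = sqrt(conv2(X.^2, ones(N, 1) / N, 'valid'));
RMS_mov  = sqrt(movmean(X.^2, N, 1, 'Endpoints', 'discard'));

% Largest absolute deviation from the loop result.
maxDiffConv = max(abs(RMS_loop(:) - RMS_conv(:)));
maxDiffMov  = max(abs(RMS_loop(:) - RMS_mov(:)));
```

Both differences should be at round-off level. Deriving K from the data size also removes the hardcoded 13978444.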
  2 Comments
Tony Pate 2016-6-27
Thank you. I will try this instead and see if it speeds up my program enough. I am not familiar with "conv2" or "movmean", but I will research and find out if they do the calculations I need correctly.
Guillaume 2016-6-27
Well movmean is just a moving average and is exactly what you are doing, so yes it does the calculation correctly.
A convolution with a constant function is also a moving average. Due to the way it's implemented, it may result in negligible differences (in the last few decimals only).


More Answers (3)

Roger Stafford 2016-6-27
It is the repeated ‘mean’ of 24007 elements at a time, taken 13978444 times, that is the time-consuming aspect of your computation. You would greatly increase your speed if you computed the column-wise cumulative sum of the squares of the x elements and used the difference of two of its entries, divided by the window length, as the equivalent of the mean. There is of course a loss of accuracy over such a large number of cumulative sums, but perhaps that would be acceptable to you. If not, you could break the data into overlapping blocks that are each only a relatively small multiple of 24007 rows long and take the cumulative sum within each block. You would still gain a lot of speed that way. Having to add almost the same set of numbers repeatedly in forming your means is bound to be an inefficient kind of procedure.
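A minimal sketch of the cumulative-sum idea, assuming a small random input and a window of 25 samples for illustration (the names S, RMS_cs and RMS_ref are made up for the example). A row of zeros is prepended so that every window sum is the difference of two entries of the cumulative sum.

```matlab
rng(0);                          % reproducible test data
X = randn(1000, 2);              % small stand-in for the real data
N = 25;                          % stand-in for the real window length

% S(j+1, :) holds the sum of squares of rows 1..j; S(1, :) is zero.
S = [zeros(1, size(X, 2)); cumsum(X.^2, 1)];

% Sum over rows k..k+N-1 equals S(k+N, :) - S(k, :).
winSum = S(N+1:end, :) - S(1:end-N, :);

% Round-off could push a tiny window sum below zero, so clamp it.
RMS_cs = sqrt(max(winSum, 0) / N);

% Direct moving average for comparison.
RMS_ref = sqrt(conv2(X.^2, ones(N, 1) / N, 'valid'));
maxDiff = max(abs(RMS_cs(:) - RMS_ref(:)));
```

On the full 14002450-row data the cumulative sum grows very large, so the difference of two entries loses some precision. That is the accuracy trade-off mentioned above, and it is worth measuring on real data before relying on this version.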

Jan Orwat 2016-6-27
  1. If you have to use a loop, preallocate the variable RMS, because it seems to be changing size on every iteration. With 14M iterations that may take "ages".
  2. It looks like you are computing a moving average. Vectorize the code. Try movmean if you have MATLAB R2016a or newer. You can also do it via convolution, using conv/conv2, filter/filter2, fft, etc.
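As an illustration of the filter route in point 2, here is a minimal sketch on a small random input with an assumed window of 25 samples (the names Y and RMS_filt are made up for the example). filter works down each column and returns one output per input row, so the first N-1 rows, which cover incomplete windows, are dropped.

```matlab
rng(0);                          % reproducible test data
X = randn(1000, 2);              % small stand-in for the real data
N = 25;                          % stand-in for the real window length

% FIR filter with N equal taps: Y(n, :) is the mean of the squares of
% rows n-N+1..n (rows before the start of the data count as zero).
Y = filter(ones(N, 1) / N, 1, X.^2);

% Keep only outputs whose window lies entirely inside the data.
RMS_filt = sqrt(Y(N:end, :));

% Direct moving average for comparison.
RMS_ref = sqrt(conv2(X.^2, ones(N, 1) / N, 'valid'));
maxDiff = max(abs(RMS_filt(:) - RMS_ref(:)));
```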

Thorsten 2016-6-27
Edited: Thorsten 2016-6-27
I found this to run much faster (about 23s on my machine): preallocate RMS_new, move the squaring and the division by N (to compute the mean) out of the loop, and then in each iteration subtract a single element from the previous mean and add a single element to it; finally take the square root.
K = 13978444;
N = 24007;
ACC_DataN = (ACC_Data.^2)/N;
RMS_new = nan(K, size(ACC_DataN, 2));
RMS_new(1, :) = sum(ACC_DataN(1:1+N-1, :));
for i = 2:K
RMS_new(i,:) = RMS_new(i-1,:) - ACC_DataN(i-1,:) + ACC_DataN(i+N-1,:);
end
RMS_new = sqrt(RMS_new);
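Because each row of the running update is built from the previous one, round-off errors carry forward through the loop. A small check against a direct computation shows how large the drift is; the sketch below assumes a 5000x2 random input and a window of 50 samples for illustration, and derives K from the data size instead of hardcoding it.

```matlab
rng(1);                          % reproducible test data
X = randn(5000, 2);              % small stand-in for ACC_Data
N = 50;                          % stand-in for the real window length
K = size(X, 1) - N + 1;          % number of full windows

% Running update, as in the code above.
XN = (X.^2) / N;
RMS_run = nan(K, size(XN, 2));
RMS_run(1, :) = sum(XN(1:N, :), 1);
for i = 2:K
    RMS_run(i, :) = RMS_run(i-1, :) - XN(i-1, :) + XN(i+N-1, :);
end
RMS_run = sqrt(max(RMS_run, 0)); % clamp guards against tiny negatives

% Direct moving average for comparison.
RMS_ref = sqrt(conv2(X.^2, ones(N, 1) / N, 'valid'));
maxDiff = max(abs(RMS_run(:) - RMS_ref(:)));
```

If the drift on the real data turns out to be too large, one option is to recompute the window sum directly every so many rows and continue the update from there.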
  1 Comment
Tony Pate 2016-6-27
This method sped things up a lot. Thank you for your help. I am going to test run all of the ideas that everyone gave me and find the best possible scenario.
