can someone suggest to make this code faster
2 次查看(过去 30 天)
显示 更早的评论
Can someone suggest me a trick to make this code faster.
"all_DOY" = 1000*15000*23 ; %(i,j,k) such that k dimension is the day of the year whic is random, while (i,j) are my pixel location starting from (1,1) to (1000,15000). The values of day of the year (DOY) in the kth dimension rranges range from -1 to 366
"all_variable" = 1000*15000*23; This is my variable value for (i,j,k) for the (i,j) position and k dimension i.e., day of the year in the "all_DOY" matrix. The values in the kth dimension range from -3000 to 15000
all_variable_new = 1000*15000*366
I'm trying to update the "all_variable_new" such that, my k dimension is from 1:366 (days of the year). for each day of the year obtained from "all_DOY" matrix, the variable value is updated in the "all_variable_new" for that respective (i,J,k) from "all_variable"
all_variable_new=zeros(1000,15000,366,'int16');
all_variable = load('all_variable.mat');
all_DOY = load('all_DOY.mat');
tic
for ii =1: size(all_DOY,1) % for each ii
for jj = 1:size(all_DOY,2) % for each jj
idx = squeeze(all_DOY(ii,jj,:)); % sequeez k'th dimension. idk is the DOY which is also the position value in all_variable_new
variable = squeeze(all_variable(ii,jj,:)); % obtain the variable values
variable(idx==-1)=[]; % if -1, remove
idx(idx==-1)=[]; % if -1 then remove the values
Ind=idx;
if isempty(Ind)
all_variable_new(ii,jj,:) = NaN; % if there are no idx values then replace with NAN
else
for mm = Ind
all_variable_new(ii,jj,mm) = variable; % for the day of the year values obtained, replace those positions with variable values.
end
end
clear idx Ind mm
end
end
toc
4 个评论
Guillaume
2020-3-10
编辑:Guillaume
2020-3-10
Writing good comments is hard. Unfortunately your comments are not that useful, they mostly paraphrase the code but don't explain the purpose of the code, so we still have no idea what your code is doing. It's not helped by the poor choice of variable names. A good variable name describes the purpose of the variable. Calling a variable variable tells you nothing about its purpose.
We need to know:
- What do the inputs represent?
- It looks like you have 3D variables, what does each dimension represent? and if within one dimension, the rows/columns/pages represent different things, what are these?
- Why is all_variable of class int16?
- What is the purpose of the code? What does it try to achieve?
Saying that there are clearly some things that can be improved in your code. It's doubtful they'll have much of an impact though:
- The clear idx Ind mm is completely pointless. The variables will be replaced anyway on the next step of the loop.
- The copy of idx into Ind is completely pointless. You may as well continue using idx and never bother creating Ind.
- The for mm = Ind is completely pointless. Ind is guaranteed to be a column vector, so the loop will only have one iteration with mm exactly equal to Ind, so the mm loop can be altogether replaced by:
all_variable_new(ii,jj,Idx) = variable;
采纳的回答
Guillaume
2020-3-10
编辑:Guillaume
2020-3-10
If I understood correctly:
pixels_dayofyear = load('all_doy.mat');
pixels_temperature = load('all_variable.mat');
assert(isequal(size(pixels_dayofyear), size(pixels_temperature)), 'Size of matrices doesn''t match!');
%sort each pixel (row, column) temperature (pages) according to the corresponding day of year.
%temperature for invalid days of year (-1) are all replaced by 0 at the end of the page.
pixels_dayofyear(pixels_dayofyear == -1) = NaN; %replace by a value that is sorted last. Could be Inf instead of NaN
[~, order] = sort(pixels_dayofyear, 3); %sort the day of year of each pixel. Get the new order
[rows, cols, ~] = ndgrid(1:size(pixels_dayofyear, 1), 1:size(pixels_dayofyear, 2), 1:size(pixels_dayofyear, 3)); %create row and column indices replicated along the correct dimensions for sub2ind
newpixels_temperature = pixels_temperature(sub2ind(size(pixels_temperature), rows, cols, order)); %reorder temperature according to the sorting of the days
newpixels_temperature(isnan(pixels_dayofyear)) = 0; %replace invalid values by 0
26 个评论
Guillaume
2020-3-13
编辑:Guillaume
2020-3-13
"While interpolating the data, it is required to have all the data even if it is same DOY, instead of mean." I'm not saying it doesn't exist, but I'm not aware of an interpolation method that can cope with two different values at the same point.
"dividing it into 10 rows, makes it hard to keep count of data blocks" I don't see why it would be hard. It's certainly an easy way to get rid of memory issue. At the same time, it also makes it possible to parallelise the processing (but beware of memory usage!) since the processing of the blocks is completely independent of each other. On a supercomputer, I assume that would be a better use of the resources. So, yes, Instead of splitting by year, I'd split by satellite coordinates and process blocks of size MxNx(10*23) -> MxNx366 where M and N are such that the data fits comfortably in memory.
Addendum: Assuming than M and N are reasonable so that the below fits in memory, then this is how I'd implement the transformation from the MxNx(10*23) to MxNx366 matrix. This directly interpolates the MxNx(10*23) data without ever creating the matrix of 0s:
%inputs:
%pixels_dayofyear, a M x N x day matrix indicating the day of measurement of the respective location in pixels_temperature
%pixels_temperature, a M x N x day matrix indicating the temperature on the respective day in pixels_dayofyear
[M, N, days] = size(pixels_dayofyear);
[inrows, incols, ~] = ndgrid(1:M, 1:N, 1:days); %create matrices of locations for each temperature
isvalid = pixels_dayofyear > -1
temperature_interpolant = scatteredinterpolant(M(isvalid), N(isvalid), pixels_dayofyear(isvalid), pixels_temperature(isvalid)); %create interpolant using locations and days as coordinates
%the line below creates 3 double matrices of size M x N x 366. They will use a lot of memory for large M and N!
[qrows, qcols, qday] = ndgrid(1:M, 1:N, 1:366); %matrices of all query locations and days
interpolated_temperature = reshape(temperature_interpolant(qrows, qcols, qday), M, N, 366); %interpolate at all query points and reshape into 3D matrix
Beware that qrows, qcols and qday will require M*N*366*8 bytes of memory. ScatteredInterpolant only accepts doubles so there's no way to save memory there by using a smaller type. However, these 3 arrays can be used unchanged for all the blocks (assuming they're all the same size).
更多回答(0 个)
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Argument Definitions 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!