Vectorizing a simple accumulation?
2 次查看(过去 30 天)
显示 更早的评论
I have something like a sparse array, whose first member is always nonzero, and I want to replace each zero element with the nearest non-zero element right before it. For example:
matrixData = [1.3; 0; 0; 0; 4.2; 0; 0; 1.5; 0; 0; 0; 0];
should become
matrixData = [1.3; 1.3; 1.3; 1.3; 4.2; 4.2; 4.2; 1.5; 1.5; 1.5; 1.5; 1.5];
I am currently using a loop:
emptyRows = (matrixData ==0);
for i = 2:length(matrixData)
if emptyRows(i)
matrixData(i) = matrixData(i-1);
end
end
This is the performance bottleneck on my function, and it becomes very slow as I deal with extremely long arrays, and I can't think of a way to speed it up. (Can't parallelize it because the elements are non-independent.) Is there a way to vectorize this using accumarray or anything similar?
Thanks!
0 个评论
采纳的回答
Sean de Wolski
2012-9-20
编辑:Sean de Wolski
2012-9-20
matrixData = [1.3; 0; 0; 0; 4.2; 0; 0; 1.5; 0; 0; 0; 0];
idxk = find(matrixData);
idxr = cumsum(logical(matrixData));
matrixData = matrixData(idxk(idxr));
One of many ways...
更多回答(1 个)
Jan
2012-9-20
编辑:Jan
2012-9-20
If your vectors are really large, try a Mex function:
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
mwSize n, i;
double *X, q, *Y;
n = mxGetNumberOfElements(prhs[0]);
plhs[0] = mxCreateDoubleMatrix(n, 1, mxREAL);
X = mxGetPr(prhs[0]);
Y = mxGetPr(plhs[0]);
q = mxGetNaN();
for (i = 0; i < n; i++) {
if (X[i] != 0.0) {
q = X[i];
}
Y[i] = q;
}
return;
}
The M-version needs some large temporary arrays:
1. t1 = find(matrixData)
2. t2 = logical(matrixData))
3. t3 = cumsum(t2)
4. t4 = idxk(idxr)
Therefore the C-method should have a great advantage.
This function can be parallelized: Use two additional inputs as inital and final index. Skip the inital phase until the 2st non-zero is found instead of inserting NaNs. Proceed after the final index until the next non-zero element as long a the vector length is not exceeded. This should scale very well with the number of cores.
Depending on the processor, this could be faster than the IF method:
int m;
for (i = 0; i < n; i++) {
m = (X[i] == 0);
q = X[i] * m + q * (m - 1);
Y[i] = q;
}
[EDITED] No, avoiding the IF is some percent slower. Some percent faster:
for (i = 0; i < n; i++) {
if (X[i] == 0) {
Y[i] = Y[i - 1];
} else {
Y[i] = X[i];
}
}
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Matrix Indexing 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!