Removing empty cells with non-zero dimensions

My code needs to deal with a cell array X, each cell of which is itself a cell array, containing a double array. For example, X could look as follows:
X = cell(N,1);
for i=1:N
X{i}=cell(1,10);
for j=1:10
X{i}{j} = randi(10, 5,2); %each cell contains a double array of size (5,2)
end
end
As my code manipulates X, some rows of these double arrays might get removed. For example:
for i=1:N
for j=1:10
X{i}{j}(X{i}{j}(:,1) < 3,:) = [];
end
end
In some cases, all rows of some double arrays get removed, resulting in a 0×2 empty double matrix. This nonzero size is causing problems elsewhere in my code. How do I efficiently replace these with 0×0 empty arrays?
My current approach is to call the following for-loop after each set of manipulations that might result in empty arrays with nonzero size.
for i=1:N
for j=1:10
if isempty(X{i}{j})
X{i}{j} = [];
end
end
end
However, I'm fairly certain that there is a better way of doing this. Any suggestions?
Edit: I want to emphasize that I do not want to remove the empty cells. What I do want is to replace any 0x2 empty double matrices with 0x0 matrices.
The 10 cells inside each X{i} represent "physical" lattice sites in my simulation. An empty cell does have a meaning, and should not be removed.

3 Comments

That would just change the empty cell from a 0x2 to a 0x0. Is your goal to remove the empty cells completely? Note that the 2nd layer of cells may no longer all be the same length.
No, I explicitly want to keep the empty cells, I just don't want them to have a non-zero size if they are empty.
The 10 cells inside each X{i} represent "physical" lattice sites in my simulation. An empty cell does have a meaning, and should not be removed.
I see. I'll update my answer.
Note that the isempty function will return the same results whether the cell is 0xn, nx0 or 0x0 but if you're using the cell size for any reason, then it matters what the empty dimensions are.
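To illustrate the point above, here is a minimal sketch (the variable names are made up for illustration) showing that isempty cannot tell the three cases apart while size can:

```matlab
% Three empty arrays with different dimensions (names are illustrative only)
a = zeros(0,2);   % 0x2 empty double
b = zeros(2,0);   % 2x0 empty double
c = [];           % 0x0 empty double

% isempty treats all three alike
allEmpty = isempty(a) && isempty(b) && isempty(c);   % true

% size still reports the original shape
sizeA = size(a);   % [0 2]
sizeB = size(b);   % [2 0]
sizeC = size(c);   % [0 0]
```

So any code that branches on size(...) will see a difference between a 0x2 and a 0x0 array, even though both are empty.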


Accepted Answer

How to remove empty cells
To remove all empty cells in the 2nd layer of a nested cell array named X,
for i = 1:numel(X)
X{i}(cellfun(@isempty,X{i})) = [];
end
Or, in 1 line,
X = cellfun(@(C){C(~cellfun(@isempty,C))},X);
That may eliminate all of the 2nd-layer nested cells, in which case some of the first-layer cells may become empty. If you'd like to eliminate them as well (i.e., all cells where all nested cells were removed),
X(cellfun(@isempty, X)) = [];
How to replace 0xn or nx0 empty cells with 0x0
To replace all 0xn or nx0 cells in the 2nd layer of a nested cell array named X,
for i = 1:numel(X)
X{i}(cellfun(@isempty,X{i})) = {[]};
end
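As a quick sanity check, the replacement loop can be tried on a small hand-made example. This is only a sketch; the test data below is hypothetical and not taken from the question:

```matlab
% Small nested cell array with one 0x2 entry (hypothetical test data)
X = {{ones(5,2), zeros(0,2)}; {ones(3,2), ones(1,2)}};

for i = 1:numel(X)
    X{i}(cellfun(@isempty, X{i})) = {[]};   % 0xn or nx0 -> 0x0
end

szEmptied = size(X{1}{2});   % [0 0]
nCells    = numel(X{1});     % still 2, the empty cell itself is kept
szKept    = size(X{2}{1});   % [3 2], non-empty cells are untouched
```

Note the curly braces on the right-hand side: assigning {[]} replaces the contents of the selected cells, whereas assigning [] would delete the cells.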

1 Comment

I'm guessing that your workflow uses size(), which is why it's a problem when a cell is 0x2. If that's the case, you could avoid this entire process by using isempty() within your workflow instead of size(). If the sizes of the arrays are already stored somewhere as sz, you could use something like if any(sz==0).
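A minimal sketch of what such a size-agnostic check might look like (A and sz are placeholder names; your actual workflow will differ):

```matlab
% Hypothetical downstream check that gives the same answer for 0x0, 0x2 and 2x0
A  = zeros(0,2);         % what a fully emptied lattice site looks like
sz = size(A);            % [0 2]

viaIsempty = isempty(A);       % true
viaSize    = any(sz == 0);     % true, without relying on the array being 0x0
viaNumel   = numel(A) == 0;    % true
```

All three tests agree regardless of which dimension is zero, so the 0x2 arrays would not need to be converted first.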
Also, if the second block of code in your question resembles what you're actually doing, you could shave off some time by fixing the problem within that section rather than adding another set of loops to convert 0x2 to 0x0. This is the fastest method yet, I believe (not that it matters at this point).
% Replace the 2nd block of code in your question with this
for i=1:N
Xi = X{i};
for j=1:10
rmIdx = Xi{j}(:,1) < 3;
if all(rmIdx)
Xi{j} = [];
else
Xi{j}(rmIdx,:) = [];
end
end
X{i} = Xi;
end


More Answers (1)

I like your for-loop; you might speed it up a little bit:
for i=1:N
Xi = X{i};
Xi(cellfun('isempty',Xi)) = {[]}; % switch to string from Rik's remark
X{i} = Xi;
end

13 Comments

You can replace the outer for-loop with cellfun
X = cellfun(@ReplaceEmpty, X, 'unif', 0)
function Xi = ReplaceEmpty(Xi)
Xi(cellfun('isempty',Xi)) = {[]}; % switch to string from Rik's remark
end
The OP's original nested loops are actually 1.99x faster than the one in your answer and 1.84x faster than the one in my answer, on average, mainly because of the cellfun overhead.
Each timed 1000 times, comparing the median values.
Your loop isn't really different from mine. It unpacks and repacks the cell array, which adds a tiny bit more time.
Wait, are you saying my original method is the fastest approach? I expected something using cellfun to be faster, I just didn't get it to work properly without some help.
edit: some testing suggests that it is indeed quite a lot faster. I assumed that arrayfun and cellfun would speed things up, but that turns out not to be true.
Yeah, that's why I first state that I like OP's for-loop.
I'm still out there looking for an example where CELLFUN/ARRAYFUN beats a FOR-LOOP.
"I expected something using cellfun to be faster"
I don't understand where a lot of people get this expectation from. CELLFUN/ARRAYFUN is a scam. It does provide compact code, that's all.
"CELLFUN/ARRAYFUN is a scam" 😄
Generally vectorization is faster than loops which initially gave for-loops a bad rep. But speed has generally increased, especially with Matlab's JIT compilation. cellfun, arrayfun, etc all have internal loops anyway. Their main attraction is the reduction of lines of code and, sometimes, improved readability (certainly not always; sometimes they are very difficult to interpret). For simple operations, loops, even nested loops, are often faster.
Though in this case the main slowdown is due to your use of the handle style, instead of the char input to cellfun:
N=100;
X = cell(N,1);for i=1:N,X{i}=cell(1,10);for j=1:10,X{i}{j}=randi(10,5,2);end,end
for i=1:N,for j=1:10,X{i}{j}(X{i}{j}(:,1)<3,:)=[];end,end
[timeit(@()cellfun_handle(X)) %42 microseconds
timeit(@()cellfun_str(X)) % 2.1 microseconds
timeit(@()for_fun(X))] % 1.5 microseconds
function out=cellfun_handle(X)
out=cellfun(@isempty, X);
end
function out=cellfun_str(X)
out=cellfun('isempty', X);
end
function out=for_fun(X)
out=false(size(X));
for n=1:numel(X)
out(n)=isempty(X{n});
end
end
This is the fastest according to my benchmark
for i=1:N
Xi = X{i};
for j=1:10
if isempty(Xi{j})
Xi{j} = [];
end
end
X{i} = Xi;
end
If you look at the numbers I posted: I agree. Using a for loop is faster. The thing I pointed out there is that it isn't much faster than cellfun('isempty',X), while cellfun(@isempty,X) is a lot slower.
Great point, Rik!
I suppose that extra time is saved by not sorting through overloaded versions of the function. Thanks for that reminder!
@Bruno Luong, good idea adding the condition to check for empties.
@Rik, historically CELLFUN has had a special speedy implementation for a small number of functions, and they are invoked through the string 'xx' and not @xx. 'isempty' is among them.
At some point TMW recommended not using the string form. I would have thought they had moved the special implementation to the @xx syntax; obviously not. So thanks for reminding us, and TMW must get to work and implement what they have left over.
@Bruno Luong, would you mind explaining why defining and then using Xi = X{i}; inside the first loop speeds things up? It's more than twice as fast on my machine.
Well, the explanation is very simple:
With X{i}{j} you tell MATLAB to index twice, first with the variable i and then with j.
With Xi{j} there is only one indexing operation, with j, since Xi is already a variable. Inside the for-loop it makes a difference.
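For anyone who wants to check this on their own machine, here is a rough timing sketch. The function names fixDoubleIndex and fixHoisted and the test data are made up for illustration, and it assumes a release that allows local functions at the end of a script (R2016b or later). Timings will vary by machine and MATLAB release, so no numbers are claimed here.

```matlab
% Rough timing sketch; results vary by machine and MATLAB release
N = 100;
X = cell(N,1);
for i = 1:N
    % Hypothetical data: alternate 0x2 empties and 5x2 arrays, 10 sites per row
    X{i} = repmat({zeros(0,2), ones(5,2)}, 1, 5);
end

tDouble  = timeit(@() fixDoubleIndex(X));
tHoisted = timeit(@() fixHoisted(X));
fprintf('X{i}{j}: %.3g s, Xi{j}: %.3g s\n', tDouble, tHoisted);

% Both variants should produce identical output
sameResult = isequal(fixDoubleIndex(X), fixHoisted(X));

function X = fixDoubleIndex(X)
% Indexes into X twice on every inner iteration
for i = 1:numel(X)
    for j = 1:numel(X{i})
        if isempty(X{i}{j})
            X{i}{j} = [];
        end
    end
end
end

function X = fixHoisted(X)
% Pulls X{i} into a local variable so the inner loop indexes only once
for i = 1:numel(X)
    Xi = X{i};
    for j = 1:numel(Xi)
        if isempty(Xi{j})
            Xi{j} = [];
        end
    end
    X{i} = Xi;
end
end
```

The isequal check at the end is there to confirm that the two variants differ only in how they index, not in what they return.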


Release

R2018a
