optimal selection of rows and columns

Question

Benjamin Bechtel 2011-7-11

0
链接

此问题的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/11352-optimal-selection-of-rows-and-columns

I have a optimatimisation problem which sounds easy but yet no trivial solution has come into my mind. I have a logical array (with data acquisition points) – lets call it X. I want to select rows and cols in such a way, that X(sel_rows, sel_cols) are all 1. Thereby I’d like to have a optimal choice in respect to a weighting function of the number of choosen rows and cols. In the easies case, this would just be the number of chosen elements: sum(sel_rows == 1) * sum(sel_cols == 1).

Do you now a MATLAB function to solve the problem or an easy algorithm that is converging to the optimum? Brute force methods take to long for the given Matrix-size.

Thanks in advance, Benjamin

2 个评论
显示无隐藏无

Walter Roberson 2011-7-15

0110110000

0110010000

0010001111

If you select all rows, then the only column that is complete is #3, for a total of 3 selected elements

If you select row #1 by itself, you can select columns 2, 3, 5, 6, for a total of 4 elements.

If you select row #2 by itself, you get columns #2, 3, 6, for a total of 3 elements.

If you select row #3 by itself, you get columns #3, 6, 7, 8, and 9, for a total of 5 elements. This is the most complete row.

If you select rows #1 and 2, then columns #2, 3, and 6 can be selected, for a total of 6 elements (2 x 3)

If you select rows #1 and 3, then only column #3 would be complete, for a total of 2 elements (2 x 1)

If you select rows #2 and 3, then only column #3 would be complete, for a total of 2 elements (2 x 1)

If one were to try to proceed by selecting the most complete row, that would be #3, but in fact row #3 does not happen to get selected at all in the maximum solution for this configuration.

The solution here is X([1 2], [2 3 6]) . Second best is X([3], [3 6 7 8 9]).

See how it goes? Whatever list I and J you select, every member of X([I],[J]) must be a 1, and you want (in the simplest version of this problem) to maximize numel(X([I],[J])).

The question is whether there is an algorithm that is better than trying each member of the cross product of the powersets of 1:size(X,1) by 1:size(X,2)

Benjamin Bechtel 2011-7-15

sorry, i haven't seen this comment before posting below. yes, that is exactly the problem!

请先登录，再进行评论。

请先登录，再回答此问题。

Answer 1

bym 2011-7-12

0
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/11352-optimal-selection-of-rows-and-columns#answer_15531

在 MATLAB Online 中打开

I am sure I don't understand your question, but how about:

[r,c] = find(X==1); %...?

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

Answer 2

Image Analyst 2011-7-12

0
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/11352-optimal-selection-of-rows-and-columns#answer_15533

在 MATLAB Online 中打开

I don't understand either. I guess you have somehow selected certain rows, which are specified in sel_rows, and certain columns, which are specified in sel_cols. Then you want to make sure all the elements in those rows or columns are equal to 1. So why don't you just do

    X(sel_rows, :) = 1;
    X(:, sel_cols) = 1;

I have no idea what the weighting thing is all about. And how huge is your matrix? Are we talking tens or hundreds of millions of elements? Don't say it's a few tens of thousand because that is in no way huge.

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

Answer 3

Walter Roberson 2011-7-12

0
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/11352-optimal-selection-of-rows-and-columns#answer_15552

To cross-check: you would be looking for the maximum for the number of selected elements, right?

If so then I believe I understand what you are asking, but I will need to think more to see if I can come up with an algorithm.

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

Answer 4

Benjamin Bechtel 2011-7-12

0
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/11352-optimal-selection-of-rows-and-columns#answer_15560

Obviously my problem was ambiguous – sorry therefore. So first, the matrix X is given and may not be changed. It contains measurements (cols correspond to pixels/location, rows to times/scenes). I need a subset ob pixels and scenes that is complete – means it may not contain gaps (I use it as a input for PCA). The question ist, how to select the optimal subset. Optimal means the maximum of ones or as an advancement – a function of the number of pixels and the number of scenes (more scenes are better than more pixels). But lets just concentrate on the maximum.

@proecsm: Of course finding the one elements is simple, but I need a set of rows and a set of cols that only contains ones (still straightforward) and is the best possible subset (rather complicated).

@image analyst: The size is only about 100 * 200 elements, but this corresponds to 2^200 possible of colum subsets.

Thanks!

2 个评论
显示无隐藏无

bym 2011-7-14

maybe @Image Analyst can chime in but what about bwarea()?

Image Analyst 2011-7-15

Don't look at me - I'm still confused. I though he should just sum over columns a map of matrix elements identifying whether X is complete there (sumOfCols = sum(XisCompleteMap, 2)) and then find the row with the max number of sum to find the "most complete" row, but I'm not sure if that's right. I think a small example of the X matrix showing "more complete" rows and "less complete" rows would help illustrate this. Explain why the data shows that some rows are more complete than the other rows.

请先登录，再进行评论。

Answer 5

Walter Roberson 2011-7-15

0
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/11352-optimal-selection-of-rows-and-columns#answer_15925

At worst you could loop over K in the powerset of the row indices, at each point selecting the columns find(all(X(K,:))) .

This algorithm can potentially be made a bit more efficient in that if you have found a solution involving N selected elements, you can skip the test for any set of rows if the number of rows in the set times the total number of columns in X, is less than N: even if every column was selected in each of those rows, you would not be beating your score of N.

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

Answer 6

Benjamin Bechtel 2011-7-15

0
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/11352-optimal-selection-of-rows-and-columns#answer_15933

在 MATLAB Online 中打开

Hello Mister Robertson,

this is closer to the solution but still doesn’t cover the full problem yet. The best set will definitely not contain all columns! You can see, if you generate a small test set by:

rand(10) > 0.3

Even the maximum of 1 in one row

find(x(k,:) = max(sum(x(k,:),2))

is not necessarily the best solution, as you can see in this simple example:

  i  ii  iii  iv  v
A  1  1  0  1  0
B  1  1  0  1  0
C  1  1  1  1  0

Your proposal would select nothing. The maximum would select x([C],[i:iv]) {= 4 elements}, but the optimum was x([A,B,C], [i, ii, iv]) {=9 elements};

Also this approach is very time consuming – 100 rows make 2^100 possible row selections ….

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

Walter Roberson 2011-7-15

I suspect you missed the fact that I said "powerset"; see http://en.wikipedia.org/wiki/Power_set . Looking over the powerset would indeed find the combination you indicated.

But yes, the powerset does involve 2^N possibilities where N is size(X,1) . That is still much cheaper than 2^(N+M) where N is the number of rows and M is the number of columns, the pure brute-force method.

请先登录，再进行评论。

optimal selection of rows and columns

2 个评论
显示无隐藏无

回答（6 个）

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

2 个评论
显示无隐藏无

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

另请参阅

类别

标签

产品

Community Treasure Hunt

optimal selection of rows and columns

2 个评论 显示 无隐藏 无

回答（6 个）

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

2 个评论 显示 无隐藏 无

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

1 个评论 显示 -1更早的评论隐藏 -1更早的评论

另请参阅

类别

标签

产品

Community Treasure Hunt

2 个评论
显示无隐藏无

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

2 个评论
显示无隐藏无

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

1 个评论
显示 -1更早的评论隐藏 -1更早的评论