Complex (I think) combination problem?

Hello everyone,
I'm wondering how I could go about solving this combination problem.
I have 5 groups, that I need to select from to create another group.
For example:
A = [1,2,3,4,5]
B = [3,4,5,6,7,8]
C = [7,8,9,10,11,12]
D = [13,14,15,16,17]
E = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]
From each group, I need select a certain number from each group, and I don't want any duplicates.
For example:
A = [1,2,3] % need 3 from group A
B = [4,5,6] % need 3 from group B
C = [7,8,9] % need 3 from group C
D = [13,14,15,16] % need 4 from group D
E = [10,11] % need 2 from group E
How I initially solved this problem was to calculate the combinations from each group using for example:
combinations_A = nchoosek(A,3) %combinations of group A, choosing 3 from A
Then I simply looped through every single combination of every group in a nested for loop, i.e.
for ii = 1:size(combinations_A,1)
for jj = 1:size(combinations_B,1)
for kk = 1:size(combinations_C,1)
for ij = 1:size(combinations_D,1)
for jk = 1:size(combinations_E,1)
% do some analysis on this combination
end
end
end
end
end
This however doesn't take into the fact that I can't have the same number in a given selected group. This also balloons the problem. For the total number in my problem, it calculated that there would be over 250,000,000 combintations, and thus I could not even initialize my matrix it was so large.
Anyone have any idea how I could simplify this problem?

8 个评论

What does the analyses involve? Maybe there is a way to get the answer without explicitly generating all of the combinations. Also it seems you have restrictions that the same number can't be repeated from different groups. With such a large number of possibilities, you may be faced with doing some type of statistical sampling. But we need to know more about what your analyses is before we can advise.
Thanks for your answer James.
Each "number" from this group corresponds to a set of values, and we are trying to maximize the total sum of these values against another "test" group (including the same total number of groups). For instance, simplifying the problem to just a single combination of 3.
%Combination
"1" = {5,6,5} %Each number has corresponding values
"2" = {2,4,7}
"3" = {9,3,0}
%Test group
"1" = {1,2,6}
"2" = {1,2,5}
"3" = {1,2,5}
%Analysis
%Summing each column, and finding the difference between the combination and the test group
value = ((5+2+9)-(1+1+1)) + ((6+4+3) - (2+2+2)) + ((5+7+0)-(6+5+5))
Which then, I find the maximum "value" and select this combination.
I agree with you that there is probably a better way than brute forcing this. With less overall combinations, I used to be able to brute force it (took time, but was possible). Now the combionations have increased and the problem generates way too many calculations.
I'm wondering though, as I am essentially choosing 15 (3 each from Group A, B and C, 4 from Group D, and 2 from Group E), from a total of 18 numbers.
This is 18C15 = 816.
I would think the restrictions on selecting from the groups would actually decrease the total value of possible combinations, as we can't have duplicate numbers in the final 15 combination.
On my previous comment, I think will be much faster to simply calculate all the combinations, and then test to see if any are not valid (i.e. Group A contains 5, 6, 7) and eliminate those.
Although this still seems inefficient to me...
How many numbers are in each group in your actual problem?
The numbers in each group are dynamic.
However the total number needed from each group remains constant:
A - 3
B - 3
C - 3
D - 5
E - 2
And the total number available to choose from remains constant at 22. So just the distribution of the 22 numbers amongst the groups does not remain constant.
Well, how large can the groups be? The viability of any approach will depend on the maximum group sizes.
The groups could range from the min required from each group to the total number available.
So the min would be the total number showed above, with the max being 22.
Keeping in mind that every group could have all the same values (i.e. 1, 2, 3... 21, 22)

请先登录,再进行评论。

回答(1 个)

Not really simplifying, but more automating:
You could adapt my odometer functions from https://www.mathworks.com/matlabcentral/answers/623358-get-a-combination-of-unique-paths-for-given-pair-of-numbers#comment_1082638 to generate all of the possibilities systematically. The current functions do not accomedate "N unique out of this group", but changing that should not require a major redesign.
Those functions do not have any ability to reduce the number of combinations, other than the fact that a minor modification could remove duplicates as soon as they show up, so you do not (for example) end up trying to generate several tens of millions of possibilities each with a duplicate in the first group, only to eliminate those in post processing. So no ability to reduce the number of valid combinations, and no ability to avoid generating temporary combinations with one invalid situation -- but the technique can avoid generating temporary combinations with two simultaneous invalid situations, which prunes out a lot.
What the functions are good for is generating the possibilities systematically, incrementally, using only a small amount of state memory, fetching just the next possibility each time. Non-vectorized brute force.... and thus good for cases where the full set of values would take too much memory.

6 个评论

I looked at this, and while I think it could be useful, the problem is that the so called, brute forcing of every single combination (whether we eliminate it during create, or in post processing) generates way too many combinations.
In the example above, generating the combinations with no repeats will produce < 816 combinations. While if considering repeats from different "pipes", pushes this number astronomically high (some test data I had generated over 250,000,000 combinations).
If I can't figure out a better method, I will look to implement your method, as computation time isn't necessarily super important (I can run this overnight), while with my current method, too much memory is required and thus I can't run it at all.
Thank you very much.
Yes, this is very much a method for cases where memory would potentially be an issue but you need to try all the possibilities anyhow.
And, of course, a method for cases where the number of positions is dynamic, so nested for loops are not appropriate.
Yes, which actually would be useful for me. The part where the number of positions could change (this would be a more elegant solution, if the number of groups ever changed).
I'm wondering if it would be possible to change the pipes dynamically. i.e. if the first number is 1, it would remove the number 1 from all other pipes and conduct the analysis as such?
It would be possible, yes, but you may need to put the numbers back later, and you would need to know where to put them back into. Probably easier to keep a current expanded state, and test each proposed value to see if it already exists at some point earlier in the current state.
I'm thinking I could just create temporary pipes as I go down each pipeline.
This is a very elegant solution, which I may look to implement in the future (if the number of groups ever change).
For now, I will use your idea, and attempt implement this odometer method with for loops, where I remove the values from the further nested for loops, to reduce the calculations... much less elegant, but just need a working for solution...
Will report back
Again the issue with any sort of method where all combinations are tested is the shear number of combinations generated.
The true value should be less than 816 where as if every single combination is tested, over 250,000,000 are generated...

请先登录,再进行评论。

类别

帮助中心File Exchange 中查找有关 Loops and Conditional Statements 的更多信息

标签

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by