How can I make a combination/permutation of all possible values with a given subset of data?

Hello, I'm having trouble putting this into words so I'll give an example and hopefully someone can help.
To make it simple, let's say I have a 200 second time series (200x1 array) from 3 regions (A, B, C). Each region has different types, so for region A there's A1, A2, A3, etc. This also applies to B and C. However, the number of types differs for each region. So if A has A1 - A5, B would have B1 - B9, etc.
I want to build an array that combines one type from each region: [A1 B1 C1], [A2 B1 C1], [A3 B1 C1], etc. So with 3 regions, I want every possible 200 x 3 array that uses one type from each region.
Currently, I have all the types and regions in one array (200 x 164), ordered as A1:A5 B1:B11 C1:C20 D1:D5 etc. In total, I have 54 regions, so I would want to make all possible combinations of a 200 x 54 array.
Is there a way to do this with how my data is currently organized? Thanks for any suggestions.
2 Comments
Stephen23 2024-7-29
Edited: Stephen23 2024-7-29
I doubt that your computer would be able to store all of those combinations in memory at once. Would it be sufficient to generate them one at a time?
The problem may be intractable anyway, due to the total number of combinations required.
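To make the memory concern concrete, here is a rough sketch of the arithmetic, assuming double-precision storage (8 bytes per value). It uses only the four region sizes mentioned in the question (5, 11, 20, 5); the sizes of the remaining 50 regions are not given, so this is an illustration rather than the full calculation. The variable names are hypothetical.

```matlab
% Illustrative only: sizes of the four regions named in the question
groupSizes = [5 11 20 5];          % number of types in A, B, C, D
nTime      = 200;                  % samples per time series

nComb        = prod(groupSizes);               % one type per region
bytesPerComb = nTime * numel(groupSizes) * 8;  % 200 x 4 doubles
totalBytes   = nComb * bytesPerComb;

fprintf('%d combinations, %.1f MB to store them all\n', nComb, totalBytes / 1e6);
```

Four regions are still manageable, but the count is a product, so every additional region multiplies it by that region's number of types. With 54 regions the total quickly outgrows any realistic amount of memory.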
Umar 2024-7-29

Hi @Andrew You,

To generate all possible combinations of a 200 x 54 array from your current 200 x 164 array, you can extract the regions you need and concatenate them to form the desired array. Here's a sample code snippet to achieve this:

% Sample data (replace this with your actual data)
data = rand(200, 164); % assumes your data is stored in a variable named 'data'

% Extract regions A1:A5, B1:B11, C1:C20, D1:D5 (adjust the indices accordingly)
regions_A = data(:, 1:5);
regions_B = data(:, 6:16);
regions_C = data(:, 17:36);
regions_D = data(:, 37:41);

% Concatenate the extracted regions (200 x 41 for these four regions; add more regions as needed)
combined_array = [regions_A, regions_B, regions_C, regions_D];

% Display the size of the combined array
size(combined_array)

So, by extracting the regions of interest and concatenating them, you can create the desired 200 x 54 array. Make sure to adjust the indices and add more regions as necessary to cover all 54 regions in your data. Please see attached results of code snippet.

Please let me know if you have any further questions.
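Note that concatenating whole regions does not by itself pick one type per region. A minimal sketch of that selection step follows, assuming the columns are ordered region by region as described in the question. The `groupSizes`, `offsets`, and `pick` names are hypothetical, and only the four regions named above are used.

```matlab
% Dummy data for the four regions named in the question (5 + 11 + 20 + 5 columns)
data = rand(200, 41);

groupSizes = [5 11 20 5];                        % types per region A, B, C, D
offsets    = cumsum([0 groupSizes(1:end-1)]);    % number of columns before each region

pick = [2 7 13 1];                               % hypothetical choice: A2, B7, C13, D1
assert(all(pick >= 1 & pick <= groupSizes));     % each pick must exist in its region

cols   = offsets + pick;                         % column indices into data
subset = data(:, cols);                          % 200 x 4 array, one type per region
```

Changing `pick` selects a different combination, and the offsets remove the need to hard-code column ranges for each region.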


Answers (1)

Tony 2024-7-29
Below is example code that runs through all combinations of a simpler problem with 4 regions and 9 types in total (A1:A3, B1:B2, C1, D1:D3). You can update the parameter settings for your full problem. dataCombinations stores all the combinations in a single variable, with the third index iterating over the combinations. But as Stephen23 remarked, storing all the combinations may require too much memory, so it would be more efficient to process each combination as it's generated.
% using smaller values for testing and demonstration
nTime = 1; % 200 in full problem
nRegionClass = 4; % 54 in full problem
nRegionClassSize = [3 2 1 3]; % to be updated for full problem
nRegionTotal = sum(nRegionClassSize);
data = rand(nTime, nRegionTotal); % dummy values for testing
nCombinations = prod(nRegionClassSize);
iRegionStart = cumsum([0 nRegionClassSize(1:end-1)]); % index of region just before each class
dataCombinations = zeros(nTime, nRegionClass, nCombinations);
combCounters = ones(1, nRegionClass); % current type within each class
for i = 1:nCombinations
    regionSubset = combCounters + iRegionStart;
    disp("Combination #" + num2str(i) + ": " + num2str(regionSubset));
    dataCombinations(:, :, i) = data(:, regionSubset); % extracts data for region combinations
    % advance the counters like an odometer: first class varies fastest
    for j = 1:nRegionClass
        if combCounters(j) < nRegionClassSize(j)
            combCounters(j) = combCounters(j) + 1;
            break;
        else
            combCounters(j) = 1;
        end
    end
end
Combination #1: 1 4 6 7
Combination #2: 2 4 6 7
Combination #3: 3 4 6 7
Combination #4: 1 5 6 7
Combination #5: 2 5 6 7
Combination #6: 3 5 6 7
Combination #7: 1 4 6 8
Combination #8: 2 4 6 8
Combination #9: 3 4 6 8
Combination #10: 1 5 6 8
Combination #11: 2 5 6 8
Combination #12: 3 5 6 8
Combination #13: 1 4 6 9
Combination #14: 2 4 6 9
Combination #15: 3 4 6 9
Combination #16: 1 5 6 9
Combination #17: 2 5 6 9
Combination #18: 3 5 6 9
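If combinations are to be processed one at a time, one possible variation (not part of the answer above) is to compute the k-th combination directly from its index with ind2sub, so that no counter state has to be carried between iterations. This sketch assumes the same small test sizes as above; the `groupSizes`, `offsets`, and `subs` names are hypothetical.

```matlab
groupSizes = [3 2 1 3];                          % same test sizes as in the answer
offsets    = cumsum([0 groupSizes(1:end-1)]);    % columns before each class
nComb      = prod(groupSizes);

k = 4;                                           % which combination to build (1..nComb)
subs = cell(1, numel(groupSizes));
[subs{:}] = ind2sub(groupSizes, k);              % one subscript per class, first varies fastest
cols = [subs{:}] + offsets;                      % column indices for combination k

disp("Combination #" + k + ": " + num2str(cols));
```

For k = 4 this gives columns 1 5 6 7, matching Combination #4 in the listing above. Because each combination depends only on k, a long run can be split into index ranges or resumed from a known index.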
3 Comments
Stephen23 2024-7-31
So you have 4.2797e+15 combinations... let's assume that your code can process them at a rate of one million combinations per second; then you will only need to wait:
4.2797e+15 / (1e6 * 60*60*24*365)
ans = 135.7084
one hundred and thirty-six years for the results.
You might need to think about your approach a bit more, e.g. perhaps use dynamic programming.
Steven Lord 2024-7-31
FYI you can perform this computation without the "magic numbers" 60, 24, and 365 using some duration functions.
numCombinations = 4.2797e15;
Y = years(seconds(numCombinations/1e6))
Y = 135.6183
This matches the computations with "magic numbers" if you use 365.2425 instead of 365.
4.2797e+15 / (1e6 * 60*60*24*365.2425)
ans = 135.6183
It doesn't make a lot of difference in this case, shaving off a mere 0.1 year, but IMO the intent of the years and seconds calls is a little clearer.
I agree with your last statement; brute-forcing this problem is probably not the best approach. Without knowing the problem the original poster wants to solve, offering specific suggestions for a different approach doesn't seem possible.

