Generating 60 random samples that sum to 1, each subject to a unique lower and upper limit

I am looking for a method to generate uniform(ish) random samples for 60 variables that sum to 1, with each variable subject to its own lower and upper limit.
On the MATLAB File Exchange, I found an algorithm capable of this, called randFixedLinearCombination. However, it is limited to ~20-25 dimensions before the hypercube array size becomes too large.
I should note that the generated samples do not have to be perfectly uniform, as they will be used to generate training data for a machine learning model. If necessary, one possible compromise would be to group the variables into, e.g., 6 groups, with each group of 10 variables sharing the same lower and upper limits. I do not know whether this approach would simplify the problem; it is just a comment.
  2 Comments
Jan 2023-1-19
"I should note that the samples generated does not have to be perfectly uniform" - of course, otherwise they could not have a fixed sum.
If you mention a FileExchange submission, it is useful to insert a link.
Bruno Luong 2023-1-19
I think 60 dimensions is quite hard. randFixedLinearCombination is based on decomposing the convex set into non-overlapping simplices; randfixedsum must raise some quantities to the power of the number of dimensions to compute the conditional probabilities. Both will have difficulty handling dimensions up to 60.


Accepted Answer

RaFa 2023-9-8
Edited: RaFa 2023-9-8
The problem was "solved" by using randFixedLinearCombination for as many variables as the algorithm could handle, choosing the ones with the largest average values (I will call these the major variables). The remaining variables (the minor variables) were significantly smaller in scale, below 1e-5. By generating the minor variables independently (within their respective bounds) and then scaling the major variables so that all variables sum to 1, I could obtain combinations that are random enough for my application. To minimize the amount of scaling applied to the major variables, they were generated to sum to
1 - sum(minor variable upper bounds) / 2
instead of 1.
This means that, before scaling and given enough generated combinations, the generated sums should average 1.
Note that this strategy would not work well if all variables are of equal or near-equal scale, because the scaling would have to be quite drastic, as opposed to around 0.1% in my case.
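The strategy described in this answer could be sketched roughly as follows. This is only an illustration under stated assumptions: the bounds and the split into 3 major and 5 minor variables are made up, the minor lower bounds are taken as zero (so that half the sum of the upper bounds is the expected minor sum), and the inner rejection loop is a simple stand-in for randFixedLinearCombination, whose actual interface is not shown in this thread.

```matlab
% Illustrative bounds (assumed values, not taken from the thread).
lbMajor = [0.2 0.1 0.3];   ubMajor = [0.6 0.4 0.7];
lbMinor = zeros(1, 5);     ubMinor = 1e-5 * ones(1, 5);
nMajor = numel(lbMajor);
nSamples = 1000;

% Major variables are generated to sum to this target instead of 1.
target = 1 - sum(ubMinor) / 2;

X = zeros(nSamples, nMajor + numel(lbMinor));
for k = 1:nSamples
    % Stand-in for randFixedLinearCombination: draw all but the last major
    % variable uniformly, set the last one to the remainder, and retry
    % until that remainder lies within its own bounds.
    accepted = false;
    while ~accepted
        major = lbMajor + (ubMajor - lbMajor) .* rand(1, nMajor);
        major(end) = target - sum(major(1:end-1));
        accepted = major(end) >= lbMajor(end) && major(end) <= ubMajor(end);
    end

    % Minor variables: independent uniform draws within their own bounds.
    minor = lbMinor + (ubMinor - lbMinor) .* rand(size(lbMinor));

    % Rescale only the major variables so that the full row sums to 1.
    major = major * (1 - sum(minor)) / sum(major);

    X(k, :) = [major minor];
end
```

Each row of X is one sample. Because the rescaling factor is not exactly 1, a major variable that was drawn very close to one of its limits can end up marginally outside it, so a final bounds check may be worthwhile if the limits are hard constraints.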
  1 Comment
Bruno Luong 2023-9-8
But this way of generating samples does not give a uniform distribution and does not work for the general case.
It works only for the very specific configuration you describe.
So it meets your need, but it does not really answer the question you asked in January.


More Answers (1)

Jan 2023-1-19
% [x,v] = randfixedsum(n,m,s,a,b)
%
% This generates an n by m array x, each of whose m columns
% contains n random values lying in the interval [a,b], but
% subject to the condition that their sum be equal to s.
  3 Comments
RaFa 2023-1-19
Edited: RaFa 2023-1-19
Perhaps I don't know how to use it for my purpose, but isn't that algorithm limited to one interval for all variables? Let me clarify my problem with an example:
0 < x1 < 1
0.1 < x2 < 0.3
0.001 < x3 < 0.002
... etc to x60
And sum([x1 x2 x3 ... x60]) = 1
randFixedLinearCombination does this, but is limited to 20-25 variables.
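As a small aside, the constraints in this example can be written down and checked for feasibility before any sampling is attempted. The sketch below uses only the three bounds spelled out above (the other 57 are not given in the thread) and the variable names are illustrative. It does not generate random samples; it only confirms that a valid point exists and constructs one.

```matlab
% Bounds for the three variables written out in the example; the remaining
% rows would be appended in the same way.
lb = [0  0.1  0.001];
ub = [1  0.3  0.002];
total = 1;

% A point with lb <= x <= ub and sum(x) == total exists exactly when
% sum(lb) <= total <= sum(ub).
isFeasible = all(lb <= ub) && sum(lb) <= total && total <= sum(ub);

% One feasible point: move every variable the same fraction of the way
% from its lower bound towards its upper bound.
frac = (total - sum(lb)) / sum(ub - lb);
x0 = lb + frac * (ub - lb);
```

If isFeasible is false, no sampler can succeed for the given limits, so this check is a cheap first step whatever generation method is used afterwards.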
Jan 2023-1-19
@Bruno Luong, @RaFa: Thanks. I misunderstood "with each variable being subject to a unique lower and upper limit." If all variables have different limits, the tool is not a match. Sorry.


Release

R2021a
