Generating random points between two n-dimensional boxes
7 次查看(过去 30 天)
显示 更早的评论
Hello,
Consider two n-dimensional boxes. I would like to generate random points in the region between the inner and the outer boxes especially when the inner box is very close to the outer box making it rather difficult to do it in a less expensive way. I know a way to do this which is simply based on generating points inside the outer box and check whether the points lie outside the inner box or not. Imagine the inner box is defined by the vector of lower and upper bounds lb1 and ub1 and the outer box is defined by lb2 and ub2.
N=10^3; %Number of points inside the outer box
dim=10; % dimension of points
lb1=5.*ones(N,dim);ub1=10.*ones(N,dim);
lb2=2.*ones(N,dim);ub2=13.*ones(N,dim);
RS=lb2+(ub2-lb2)*rand(N,dim); %These are N random points inside the outer box
d1=RS-lb1;
d2=ub1-RS;
idx=all(d1>=0 & d2>=0,2);
SAMPLE=RS(idx,:);
Unfortunately, this is really not an efficient approach. A big drawback of this approach is this: I have no control on the number of random points I wish to generate. So, I have to generate a lot of random points inside the bigger box and wish that it lies outside the smaller box. This is particularly problermatic when the inner box is very close to the outer box.
I am wondering if you know a magical way to do this!
I look forward to hearing from you!
Thank you in advance!
Babak
0 个评论
采纳的回答
John D'Errico
2023-4-8
编辑:John D'Errico
2023-4-8
In 2-dimensions, I posted a solution on the FEX that would work. But 2-d is trivial. You triangulate the domain, then generate random points in the various triangles. This is done by first computing the area of each triangle, then generate points in the triangles proportionally to their area. And generating a point inside a trangle is trivial. You can find code for this in randpoly.
X = [0 .1 .9 1];Y = [0 .1 .9 1];
boxes = [X(1),Y(1);X(4),Y(1);X(4),Y(4);X(1),Y(4);X(1),Y(1);X(2),Y(2);X(2),Y(3);X(3),Y(3);X(3),Y(2);X(2),Y(2)]
interpoly = polyshape(boxes);
plot(interpoly)
hold on
XY = randpoly(1000,interpoly);
plot(XY(:,1),XY(:,2),'r.')
Again, pretty easy in 2-d, partly because all the work is already done for you.
The same general idea would even work well enough in 3-d. Again, tesselate the domain that lies between the two boxes. Then it is trivial to perform the work. In fact, as long as the dimension is low enough to dissect the domain between the boxes into a simplicial complex, then you want to follow the approach I suggest.
However, in higher dimensions it gets way more complicated. As you should understand, rejection is not always an option, since the curse of dimensionality will hurt you. But for the same reason, the cost of performing a dissection of the domain of interest gets very expensive. Its been a while since I , but I am pretty sure the count of number of simplexes to tile a domain is something like O(factorial(N)), where N is the number of dimensions. You don't really want to go there.
The good news is the curse of dimensionality actually helps you in this case, in terms of rejection. What matters is the ratio of n-d volumes of the two boxes. For example, in 10 dimensions, if the smaller box has side length 99% of the larger box, then the rejection fraction is only
(99/100)^10
which says you need to over-sample by a factor of approximately 10 to 1, since 90.44% of the samples will be rejected. And that is not at all a bad rejection fraction.
更多回答(1 个)
Walter Roberson
2023-4-8
One way that does not require any rejection:
- for each location to be generated, first generate a random integer 1 to 3^n-1 using a weighted distribution. The random integer describes the location of the generated point relative to the inner box. In 1D that would be like "left of the inner box", (exclude "inside the inner box"), "right of the inner box". In 2D it would be like "left top", "center top", "right top", "left center", (exclude inner box), "right center", "left bottom", "center bottom", "right bottom".
- Use the integer to look up the edges of the region, getting out upper and lower bound for each dimension
- now generate a random number inside those bounds
There would be no rejection because each section has no holes in it.
It would be important to chose each of the integers proportional to the area of the section compared to the total area.
3 个评论
Walter Roberson
2023-4-8
There are 3^N sub-domains but you only choose from 3^N-1 of them because you do not choose anything inside the inner box.
John D'Errico
2023-4-8
Its a subtle point. I guess it depends on how you implement the algorithm to do the sampling. I would compute the sub-volumes from ndgrid. Once that is done, I would re-assign the computed sub-volume corresponding to that middle cube as zero, which makes the probability essentially zero that any point gets assigned to it. And that would make the code fairly simple in my eyes to write, rather than a more complicated scheme to map the integers 1:3^N-1 into each of the n-boxes.
Other implementations are possible though. As I said, a minor point.
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Random Number Generation 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!