Surprising behavior in randsample
1. Generating sequences with replacement -- not surprising
When I generate sequences with replacement (after setting the same seed), the first N values generated are the same, regardless of how many values I generate:
seed = 13;
N = 12;
for ni = 1:N
rng(seed)
fprintf("randsample (with replace), %2d value(s): ",ni); fprintf('%g ', randsample(N,ni,true)'); fprintf("\n");
end
2. Generating sequences without replacement -- surprising
When I generate sequences without replacement (after setting the same seed), I expected the same behavior. And that is the behavior -- but only if the sequence is long enough. For shorter sequences, the values are not in the same order.
seed = 13;
N = 12;
for ni = 1:N
rng(seed)
fprintf("randsample (without replace), %2d value(s): ",ni); fprintf('%g ', randsample(N,ni,false)'); fprintf("\n");
end
Notice how the first three rows don't follow the pattern. This seems odd to me, and perhaps buggy. (The behavior is consistent, and doesn't depend on the particular seed.)
I'm not sure I have a question, other than ... "Does this seem strange to anyone else?"
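To check the pattern without reading the printed rows by eye, here is a minimal sketch that compares each shorter sample against the start of the full-length one. The names ref and isPrefix are my own, and it assumes randsample (Statistics and Machine Learning Toolbox) is on the path.

```matlab
% Sketch: flag the sample sizes whose output is not a prefix of the longest sample.
seed = 13;
N = 12;
rng(seed)
ref = randsample(N,N,false);       % reference: the full-length sample
ref = ref(:);                      % force a column so the comparison ignores orientation
isPrefix = false(1,N);
for ni = 1:N
    rng(seed)                      % same seed before every call
    s = randsample(N,ni,false);
    isPrefix(ni) = isequal(s(:), ref(1:ni));
end
fprintf('Sizes that break the prefix pattern: %s\n', mat2str(find(~isPrefix)));
```

The last size always matches, since it repeats the reference call with the same seed; the interesting entries are the small ones.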
Accepted Answer
Paul
2023-2-26
Moved: the cyclist
2023-2-26
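I don't see anything in the doc about the ordering of the output. randsample is an .m file, so the source can be read directly. The algorithm for without replacement changes when 4*k > n, consistent with the results shown for randsample: with N = 12 the condition first holds at k = 4, so only the first three rows take the other branch.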
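One way to probe the 4*k > n switch is to compare each sample with the start of a single randperm(N) drawn from the same seed. This is only a sketch under an assumption: that the large-k branch amounts to truncating one full permutation. The documentation does not promise this, so check your own release with "type randsample".

```matlab
% Assumption (not documented): when 4*k > n, the sample is the first k
% entries of one randperm(n). This loop only reports whether that holds.
seed = 13;
N = 12;
rng(seed)
p = randperm(N);                         % one full permutation from this seed
for k = 1:N
    rng(seed)
    s = randsample(N,k,false);
    matchesPrefix = isequal(s(:), reshape(p(1:k),[],1));
    fprintf('k = %2d   4*k > n: %d   matches randperm prefix: %d\n', ...
        k, 4*k > N, matchesPrefix);
end
```

If the two flag columns agree row by row, the assumption holds on that release; if not, the source file is the place to look.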
datasample behaves the same as randsample with replacement, but very differently without replacement.
With replacement, the output matches randsample:
seed = 13;
N = 12;
for ni = 1:N
rng(seed)
fprintf("datasample (with replace), %2d value(s): ",ni); fprintf('%g ', datasample(1:N,ni,'Replace',true)'); fprintf("\n");
end
But datasample without replacement is ...
seed = 13;
N = 12;
for ni = 1:N
rng(seed)
fprintf("datasample (without replace), %2d value(s): ",ni); fprintf('%g ', datasample(1:N,ni,'Replace',false)'); fprintf("\n");
end
2 Comments
Walter Roberson
2023-2-26
Moved: the cyclist
2023-2-26
"The algorithm for without replacement changes when 4*k > n"
IIRC that is the point at which the Fisher-Yates shuffle stops being used.
That is, although FY is pretty efficient, when you ask for a large share or all of the available locations, at some point it becomes more efficient to use the sort(rand()) algorithm, I gather.
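To make the two strategies concrete, here is a rough sketch of each. partialShuffle and sortRand are hypothetical helpers written for illustration; they are not the code inside randsample.m, which may differ in detail.

```matlab
% Illustrative only: two ways to draw k of 1..n without replacement.
n = 12;  k = 3;
rng(13)
a = partialShuffle(n,k);
rng(13)
b = sortRand(n,k);
disp(a)
disp(b)

function y = partialShuffle(n,k)
% Partial Fisher-Yates: k swaps, so only k random draws are needed.
p = 1:n;
for i = 1:k
    j = i + randi(n-i+1) - 1;    % uniform index in i..n
    p([i j]) = p([j i]);         % swap the chosen element into position i
end
y = p(1:k);
end

function y = sortRand(n,k)
% Full shuffle: sort n uniform keys, keep the first k positions.
[~,idx] = sort(rand(1,n));
y = idx(1:k);
end
```

Each helper on its own gives prefix-consistent output for a fixed seed as k grows. The two consume the random stream differently, though, so switching from one to the other at some k changes the values, which is the kind of break seen in the first three rows.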