Generate random numbers given distribution/histogram

36 次查看(过去 30 天)
MATLAB provides built-in functions to generate random numbers with an uniform or Gaussian (normal) distribution. My question is: if I have a discrete distribution or histogram, how can I can generate random numbers that have such a distribution (if the population (numbers I generate) is large enough)?
Please post here if anyone knows of a good method of doing this.
Thanks, David

采纳的回答

Jonathan Epperl
Jonathan Epperl 2012-10-27
Since nobody has any suggestions, here's one. If you have a discrete distribution, say it is a Nx2 matrix PD, first column the discrete values, second the probabilities of the corresponding value -- so sum(PD(:,2))==1.
Then map the probablities to the unit interval and use rand. What mean by that:
% Those are your values and the corr. probabilities:
PD =[
1.0000 0.1000
2.0000 0.3000
3.0000 0.4000
4.0000 0.2000];
% Then make it into a cumulative distribution
D = cumsum(PD(:,2));
% D = [0.1000 0.4000 0.8000 1.0000]'
Now for every r generated by rand, if it is between D(i) and D(i+1), then it corresponds to an outcome PD(1,i+1), with the obvious extension at i==0. Here's a way you could do that, even though I'm sure there are better ones:
R = rand(100,1); % Your trials
p = @(r) find(r<pd,1,'first'); % find the 1st index s.t. r<D(i);
% Now this are your results of the random trials
rR = arrayfun(p,R);
% Check whether the distribution looks right:
hist(rR,1:4)
% It does, roughly 10% are 1, 30% are 2 and so on
If you want more help you should post a minimal example of the form in which you have the discrete distribution.
  5 个评论
Aasheesh Dixit
Aasheesh Dixit 2020-6-8
one change is required:
p = @(r) find(r<d,1,'first'); % find the 1st index s.t. r<D(i);
Himanshu Tanwar
Himanshu Tanwar 2022-5-25
Overall code may be written (with some slight modifications) as:
dx = 0.001;
x = -100 : dx : 100; % limits depending on the random variable definition.
% For example, (dx : dx : 100) for Rayleigh (0 not recommended).
f = PDF(x);
F = cumsum(f) * dx; % CDF
F = F / F(end); % recommended when max(F) is close to 1. Otherwise increase x - points to achieve F(end) - max value - close to 1.
N = 10000;
U = rand(1, N);
pdf2rand = @(u) x(find(u <= F, 1, 'first'));
X = arrayfun(pdf2rand, U);
X % desired random variable points
function f = PDF(x)
% Probability Distribution Function
...
end

请先登录,再进行评论。

更多回答(2 个)

Image Analyst
Image Analyst 2012-10-28
I didn't notice your question or I would have answered, especially since it's asked so often. Try RANDRAW ( http://www.mathworks.com/matlabcentral/fileexchange/7309-randraw) for a list of common distributions. Or to refresh yourself on the theory of why using the CDF works, see Wikipedia: http://en.wikipedia.org/wiki/Inverse_transform_sampling

Theron FARRELL
Theron FARRELL 2019-4-30
编辑:Theron FARRELL 2019-4-30
Hi there,
I use this naive function to generate artificial outliers applied in machine learning. Hope that it will be a bit help in your case.
function [Out_Data, Out_PDF, CHist] = Complement_PDF(Hist, Data_Num, p)
% Generate a 1D vector of data with a PDF specified as the complementary PDF of input historgram. Note that the larger
% Data_Num is, the more Out_PDF will resemble to CHist
% Input
% Hist: PDF/Histogram of data
% Data_Num: Desired number of data to be generated
% p: Precision given by number of digits after 0
% Output
% Out_Data: Generated data as per the complementary PDF
% Out_PDF: The complementary PDF as per Out_Data
% CHist: The complementary PDF as per Hist
% Example
% Hist = [1, 6, 7, 100, 0, 0, 0, 2, 3, 5];
% Data_Number = 100000;
% p = 3
Hist = Hist/sum(Hist);
CHist = 1- Hist;
CHist = CHist/sum(CHist);
CDF_CHist = cumsum(CHist);
CDF_CHist = double(int32(CDF_CHist*10^p))/10^p;
Out_Data = zeros(1, Data_Num);
Out_PDF = zeros(1, length(CDF_CHist));
for i = 1:Data_Num
% Generate a uniformly distributed variable
x = double(int32(rand*10^p))/10^p;
% Inversely index CDF
Out_Data(i) = Inverse_CDF(x, CDF_CHist);
temp = floor(Out_Data(i) * length(CDF_CHist));
Out_PDF(temp) = Out_PDF(temp) + 1;
end
figure;
subplot 221, bar(Hist);
subplot 222, bar(CHist);
subplot 223, plot(CDF_CHist);
subplot 224, bar(Out_PDF);
end
function [y] = Inverse_CDF(x, CDF_CHist)
CDF_CHist_Ext = [0, CDF_CHist];
y = 1;
for ind = 1:length(CDF_CHist)
if (x >= CDF_CHist_Ext(ind)) && (x < CDF_CHist_Ext(ind+1))
y = ind/length(CDF_CHist);
break;
end
end
end

类别

Help CenterFile Exchange 中查找有关 Random Number Generation 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by