Zipf Based Number generation

Tamoor Hassan 2015-3-31
I am trying to generate numbers based on the Zipf distribution.
There is a function named RANDRAW that generates numbers for a Zipf exponent greater than 1. However, I need to generate Zipf numbers which can be less than 1.
Can anybody help me generate random numbers based on the Zipf distribution for any value of the Zipf exponent?

Answers (4)

John D'Errico 2015-3-31
Edited: John D'Errico 2015-3-31
Whenever someone says something like "for ANY exponent", I can bet that they are not going to throw an easy problem at whatever code is offered.
So, looking at the zipf distribution, it turns out to be an easy one to deal with, sort of. The zipf pdf is simply
p_zipf(x,p) = x^-(p+1)/zeta(p+1)
This is a discrete distribution in x, over the positive integers.
If we look at the pdf, we recognize that zeta(p+1) is the limit of the sum of powers of x, x^-(p+1), where x ranges from 1 to inf. So zeta(p+1) merely normalizes that pdf so that its values sum to one.
Here the zipf exponent that was referred to would be p, I assume, but there are unresolved questions hidden in the question. I'll quote the line you wrote:
"However, I need to generate Zipf numbers which can be less than 1."
So, IF your question is to generate zipf samples that are non-integer, this makes no sense, since the zipf distribution is a discrete one, that samples over the positive integers.
If you are calling the zipf exponent (p+1), and you wish to sample such that p+1 is less than 1, so p is less than zero, then you will have a great deal of trouble.
If your question is to sample for non-integer values of p that are less than one, but greater than zero, then this is possible, but a serious bit of an effort for many values of p less than 1.
By the way, since the zipf pdf involves zeta(p+1), evaluating it would call for the zeta function in the Symbolic Math Toolbox, because for small p the series that defines zeta(p+1) converges pretty slowly.
Sadly, I cannot proceed further without some resolution of the above questions.
Edit: Though I've not yet gotten a response to my questions, I took a quick look at the randraw code. Randraw defines the zipf pdf such that
p_zipf(x,a) = x^(-a)/zeta(a)
and then requires only that a>1.
It would appear that you want to generate zipf numbers for a<=1, as randraw defines that distribution. If this is so, then you cannot do so. The sum...
sum(x^-a)
where x is the set of positive integers, is inf for a<=1.
This is why randraw requires that a>1: otherwise the normalizing sum zeta(a) is not finite. Even for a near 1, I would predict that randraw becomes slow.
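The divergence described above can be checked numerically. The following is only a sketch: the truncation points in Ns and the three exponents in as are illustrative choices, not values from the thread. It prints the partial sums of x^-a, which keep growing for a = 0.5 and a = 1 but level off near zeta(1.5), roughly 2.612, for a = 1.5.

```matlab
% Sketch: partial sums of x^(-a) for exponents below, at, and above 1.
% Ns and as are illustrative assumptions, not taken from the thread.
Ns = 10.^(2:6);                  % truncation points 1e2 ... 1e6
as = [0.5 1 1.5];                % exponents to compare
S  = zeros(numel(as), numel(Ns));
for i = 1:numel(as)
    % Row i holds sum_{x=1}^{N} x^(-a) for each N in Ns
    S(i,:) = arrayfun(@(N) sum((1:N).^(-as(i))), Ns);
end
disp(S)                          % rows follow as, columns follow Ns
```

The first two rows grow without bound as N increases, so no normalizing constant exists for them unless the support is cut off at some finite N.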

Roger Stafford
Roger Stafford 2015-3-31
If you want exponent values less than or equal to one, it is necessary to confine yourself to a specific finite number of terms, since the infinite sum of terms in the zipf distribution diverges. See
http://en.wikipedia.org/wiki/Zipf's_law
In the following, N is the number of possible integer values of the distribution, s is the exponent, and the desired size of the generated random matrix is m-by-n.
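V = cumsum([0,1./(1:N).^s]');          % cumulative unnormalized weights, V(1) = 0
[~,r] = histc(rand(m*n,1),V/V(end));   % r = index of the CDF bin each uniform draw falls in
r = reshape(r,m,n);                    % arrange the samples as an m-by-n matrix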
The variable 'r' assumes random integer values from 1 to N in accordance with the zipf distribution.
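One way to sanity-check the sampler above is to compare observed frequencies against the exact truncated probabilities. This is a sketch only: the values of N, s, m and n are illustrative assumptions, and the helper variables w, pmf and freq do not appear in the original answer.

```matlab
% Sketch: empirical check of the truncated Zipf sampler.
% N, s, m, n are illustrative values, not taken from the thread.
N = 10; s = 0.8; m = 1000; n = 200;

w   = 1./(1:N).^s;               % unnormalized weights k^(-s), k = 1..N
pmf = w(:)/sum(w);               % exact probabilities on 1..N

V = cumsum([0, w]');             % cumulative weights, V(1) = 0
[~, r] = histc(rand(m*n,1), V/V(end));
r = reshape(r, m, n);

freq = accumarray(r(:), 1, [N 1])/numel(r);   % observed frequencies
disp([pmf, freq])                % the two columns should be close
```

With 200,000 samples the two columns would be expected to differ only in the third decimal place or so. Note that s = 0.8 is below 1, which works here only because the support is truncated at N.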
  10 Comments
Tamoor Hassan 2015-4-4
In each row, no values along columns should be repeated.
Roger Stafford 2015-4-4
Your statement "all the columns contain unique values" made me think you wanted no repetitions to occur along any column. If this is not the case, then things are a little easier.
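V = cumsum([0,1./(1:N).^s]'); V = V/V(end);   % normalized CDF edges (needs n <= N)
t = true;
while t                                        % redraw until no column repeats a value
    [c,r] = histc(rand(n,m),V);                % c: per-column bin counts, r: n-by-m samples
    t = any(c(:)>1);                           % true if some value occurs twice in a column
end
r = r';                                        % m-by-n result, no repeated value within a row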
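The loop above redraws the entire matrix whenever any single row fails, which may take many iterations when m is large. Because the rows are drawn independently, redrawing only the failing rows should give the same distribution with less wasted work. The following is a sketch under that assumption, not Roger Stafford's code; the values of N, s, m and n are illustrative, and n must not exceed N.

```matlab
% Sketch: redraw only the rows that contain a repeated value.
% N, s, m, n are illustrative values; the method requires n <= N.
N = 50; s = 0.8; m = 20; n = 5;
V = cumsum([0, 1./(1:N).^s]');  V = V/V(end);   % normalized CDF edges

r   = zeros(m, n);
bad = true(m, 1);                        % rows still needing a valid draw
while any(bad)
    k = nnz(bad);
    [~, idx] = histc(rand(k*n, 1), V);   % fresh draws for the bad rows only
    r(bad, :) = reshape(idx, k, n);
    srt = sort(r, 2);                    % sort each row
    bad = any(diff(srt, 1, 2) == 0, 2);  % a zero difference marks a repeat
end
```

Rows that already passed are never touched again, so the number of random draws scales with the number of failing rows rather than with the whole matrix.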



Tuyen Tran 2015-10-17

Wasseem Suhabuth 2016-8-31
Can I have MATLAB code to generate a Zipf distribution, please?
