Function 'pdf' doesn't return pdf values

Question

0 votes

I have a problem with the function pdf. I have this code:

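% Fit a kernel density estimate to the data
estim_KDE = fitdist(data, 'kernel');
% obs equally spaced evaluation points from low to high
x = low:(abs(low-high)/(obs-1)):high;
% Evaluate the estimated density at those points
y = pdf(estim_KDE,x);
plot(x,y,'r'), xlabel('xxx'), ylabel('yyy'),...
    title('title'), legend('xyz');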

but the function pdf returns values that make no sense to me. They are neither between 0 and 1, nor numbers between 0 and 1 multiplied by the length of x (one of these two options is what I expected from the function pdf). For example, it gives me numbers like 20.something or 5.something with length(x) = 1000 or more. This happens for every distribution I tried (always fitted with fitdist). I discovered the problem only because I plotted a histogram of the frequencies against the kernel density estimate.

Can someone help me, please?

Answer 1

John D'Errico 2015-2-6

Edited: John D'Errico 2015-2-6

1 vote

I think you are under a common misperception about the PDF of a random variable. My guess is that the letter P in PDF is what confuses people, and yes, it is called a Probability Density Function.

The thing is, it does not actually return a probability. Consider a PDF with a very narrow spread, for example a Gaussian with mean 0 and standard deviation 0.001.

normpdf(0,0,.001)
ans =
     398.94

See that the PDF at 0 is 398.94, vastly larger than 1.

What matters is that the PDF integrates to 1: the integral of that function over its whole domain is 1.
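
A quick numerical check of that claim, as a minimal sketch. It assumes the Statistics and Machine Learning Toolbox (for normpdf), and it integrates over plus or minus 10 standard deviations rather than the whole real line, so that the quadrature does not miss the very narrow peak. The area should come out at 1 to within the integration tolerance, even though the peak is far above 1.

```matlab
mu = 0;
sigma = 0.001;

% Peak height of the density: far above 1
peak = normpdf(mu, mu, sigma);

% Area under the density over +/- 10 standard deviations
area = integral(@(x) normpdf(x, mu, sigma), mu - 10*sigma, mu + 10*sigma);

fprintf('peak = %.2f, area = %.6f\n', peak, area);
```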

It is the CDF that actually returns something you can interpret as a probability. Alternatively, you can integrate the PDF over an interval to compute a probability, but that is exactly what the CDF gives you.
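
To illustrate, here is a small sketch for the same narrow Gaussian. The interval [a, b] of one standard deviation either side of the mean is my own choice for the example. Both routes should give the same probability, roughly 0.6827.

```matlab
mu = 0;
sigma = 0.001;
a = -0.001;   % lower end of the interval (one sigma below the mean)
b =  0.001;   % upper end of the interval (one sigma above the mean)

% Probability of a < X < b from the CDF
pFromCdf = normcdf(b, mu, sigma) - normcdf(a, mu, sigma);

% The same probability by integrating the PDF over [a, b]
pFromPdf = integral(@(x) normpdf(x, mu, sigma), a, b);

fprintf('CDF difference: %.4f, integral of PDF: %.4f\n', pFromCdf, pFromPdf);
```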

simo borto 2015-2-7

Edited: simo borto 2015-2-8

Well, I am probably looking at this from the wrong point of view. What I need to see is a graph of the relative frequency. If you know the software R (I'm not promoting R, it's just to give an idea), I need something like its function 'density'.

Imagine a histogram of the relative frequencies: I need a curve that smooths the bars of that histogram.

John D'Errico 2015-2-10

A plot of the PDF IS a graph of the relative frequency, to the extent that this makes any sense. Why do you care about the y-axis scaling? If that is what bothers you, then just turn off the y-axis labels.

The fact is, you CAN create a histogram of the frequency in each "bin". You would do this either by integrating the PDF over that sub-interval, or by subtracting successive values of the CDF, to get the relative fraction that would fall in that bin.
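
The CDF-difference route might look like the sketch below. The data here are synthetic stand-ins (normal, standard deviation 0.02, which gives PDF values near 20), and the 20 bins and the padding of 0.05 beyond the data range are arbitrary choices of mine. Each bin value is a genuine probability between 0 and 1, and the bins together should sum to about 1.

```matlab
rng(0);                                  % reproducible synthetic data (assumption)
data = 0.02*randn(1000, 1);              % stand-in for the asker's data
estim_KDE = fitdist(data, 'kernel');

% 20 bins covering the data range plus some padding for the kernel tails
edges = linspace(min(data) - 0.05, max(data) + 0.05, 21);

% Probability mass in each bin: difference of successive CDF values
binProb = diff(cdf(estim_KDE, edges));

fprintf('largest bin probability: %.3f\n', max(binProb));
fprintf('sum over all bins:       %.3f\n', sum(binProb));
```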

If you used a tiny enough bin interval, then the curve would look very nice and smooth. But the probability of a point falling in any single such tiny bin would be vanishingly small. So the y-axis scaling would be all tiny numbers. This reflects the fact that any single number has probability ZERO of arising.
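
A small sketch of that effect, using the narrow Gaussian from my answer and a set of bin widths I picked for illustration. As the bin centered on the mean shrinks, its probability heads to zero, while the probability divided by the bin width settles toward the PDF value at the center.

```matlab
mu = 0;
sigma = 0.001;
widths = [1e-3, 1e-4, 1e-5, 1e-6];       % bin widths, all centered on the mean

% Probability of landing in a bin of each width, from the CDF
binProb = normcdf(mu + widths/2, mu, sigma) - normcdf(mu - widths/2, mu, sigma);

% Dividing by the width approaches the density at the bin center
ratio = binProb ./ widths;

disp(table(widths', binProb', ratio', ...
    'VariableNames', {'width', 'probability', 'prob_per_width'}));
```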

So, just plot the PDF, and don't worry about the y-axis, or turn it off completely.
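
If the goal is a curve that sits on top of the histogram bars, one possible approach is to put the histogram on the PDF scale instead of rescaling the curve. This is only a sketch: it uses synthetic stand-in data and assumes a MATLAB release that has the histogram function with its 'Normalization' option.

```matlab
rng(0);                                  % reproducible synthetic data (assumption)
data = 0.02*randn(1000, 1);              % stand-in for the asker's data
estim_KDE = fitdist(data, 'kernel');

x = linspace(min(data), max(data), 200);
y = pdf(estim_KDE, x);

% Histogram scaled so that the bar areas sum to 1, matching the PDF scale
h = histogram(data, 'Normalization', 'pdf');
hold on
plot(x, y, 'r', 'LineWidth', 1.5);
hold off
legend('histogram (pdf scale)', 'kernel density estimate');

% To show relative frequencies instead, scale the curve by the bin width:
% histogram(data, 'Normalization', 'probability'); plot(x, y*h.BinWidth)
```

With 'Normalization' set to 'pdf', the bar heights are relative frequencies divided by the bin width, which is why they land on the same scale as the output of pdf.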

Answer 2

Rob Keeton 2019-9-3

0 votes

Multiply by the bandwidth of the pdf.

y = pdf(estim_KDE,x)*estim_KDE.BandWidth;
