Kernel density estimation with a chosen bandwidth, then normalize the density function (cdf) so that its integral from min to max equals 1; then take the first and second derivatives of the cdf
I've tried using kde(data,n,MIN,MAX) and [f,xi] = ksdensity(x) over my data points.
I haven't figured out how to retrieve the cdf (density function).
I've tried a linear fit on the density data points that I got from [density,cdf] = kde(y,1000,min(y),max(y)),
but I wonder if there is another way to find the kernel density cdf, assuming a normal distribution with a chosen bandwidth (standard deviation) of 0.5.
Thanks!
Answers (1)
Tom Lane
2017-12-14
You seem to want to do a number of things including integrating and specifying a bandwidth. Maybe this will get you started.
Here's an example looking at a kernel density estimate from a gamma random variable and comparing it with the distribution used to generate the data.
>> x = gamrnd(2,3,1000,1);
>> X = linspace(0,40,1000);
>> f = ksdensity(x,X);
>> plot(X,gampdf(X,2,3),'r:', X,f,'b-')
Usually "cdf" is used to describe the cumulative distribution function rather than the density (pdf). Here's how to get that.
>> F = ksdensity(x,X,'Function','cdf');
>> plot(X,gamcdf(X,2,3),'r:', X,F,'b-')
5 Comments
Tam Ho
2017-12-20
Tom Lane
2017-12-22
The density from ksdensity or gampdf is defined so that it integrates to 1 over the real line. You could, I suppose:
- Integrate it from fmin to fmax; say the result is R<1
- Set it to zero outside this range
- Divide by R to get a total integral of 1
To set the bandwidth to 0.5, type "help ksdensity" and look at the argument that defines that. You don't do anything with linspace for that.
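The three steps above could be sketched roughly as follows. This is only a minimal sketch under some assumptions: fmin and fmax are taken to be the sample minimum and maximum, the gamma sample from the earlier example stands in for the real data, the seed is fixed only for reproducibility, and the names bw, R, fTrunc and FTrunc are made up for illustration. The 'Bandwidth' name-value argument of ksdensity sets the bandwidth; since no Support is given here, the estimate is unbounded and 0.5 is on the original data scale.

```matlab
% Sketch: truncate a kernel density estimate to [min(x), max(x)] and
% rescale it so that it integrates to 1 over that range.
rng(0);                                % fixed seed, only for reproducibility
x  = gamrnd(2,3,1000,1);               % stand-in data, as in the example above
bw = 0.5;                              % chosen bandwidth (std. dev. of the normal kernel)

X = linspace(min(x), max(x), 1000);    % grid on [fmin, fmax]; density is taken as 0 outside
f = ksdensity(x, X, 'Bandwidth', bw);  % unbounded estimate evaluated on the grid
R = trapz(X, f);                       % mass inside the range, slightly below 1
fTrunc = f / R;                        % truncated density, integrates to 1 on the grid
FTrunc = cumtrapz(X, fTrunc);          % matching cdf, rising from 0 to 1
```

Dividing by the trapz result, rather than by an exact integral, makes the rescaled density integrate to 1 under the same quadrature rule that is used afterwards, so FTrunc ends at 1 up to rounding.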
Brendan Hamm
2017-12-29
To follow up on Tom's post:
The ksdensity function includes a Support input argument. You cannot use the exact min and max of the data as the Support, because the data must lie strictly inside it, but extending the range slightly works.
x = gamrnd(2,3,1000,1);
n = 1e5; % Number of evaluation points
delta = 0.01; % Margin for expanding the Support
Support = [min(x)-delta,max(x)+delta];
X = linspace(Support(1),Support(2),n); % Evaluation grid spanning the Support
F = ksdensity(x,X,'Function','cdf','Support',Support);
You can perform a numerical integration with the trapz function:
f = ksdensity(x,X,'Function','pdf','Support',Support);
I = trapz(X,f) % As n-> Inf, I -> 1
Bandwidth is also an option, but when you provide a bounded support (as done above), a log transformation is applied to the data and the bandwidth applies on that transformed scale. So you may need to check whether your bandwidth requirement is on the original scale (which I assume it is).
F = ksdensity(x,X,'Function','cdf','Support',Support,'Bandwidth',0.5);
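If the bandwidth has to stay on the original scale while the support is bounded, one possible route is the 'BoundaryCorrection' option of ksdensity with the value 'reflection', which replaces the default log transformation. This option was not available when this thread was written, so treat the following as a sketch that assumes a newer release; the names fRefl and IRefl are made up for illustration.

```matlab
% Sketch: bounded support with reflection instead of the log transformation,
% so that the bandwidth of 0.5 refers to the original data scale.
rng(0);                                   % fixed seed, only for reproducibility
x = gamrnd(2,3,1000,1);
delta = 0.01;                             % margin so the data lie strictly inside the Support
Support = [min(x)-delta, max(x)+delta];
X = linspace(Support(1), Support(2), 1e4);
fRefl = ksdensity(x, X, 'Support', Support, ...
    'BoundaryCorrection', 'reflection', 'Bandwidth', 0.5);
IRefl = trapz(X, fRefl)                   % should be close to 1
```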
The points in X from linspace are simply the points at which the pdf/cdf is evaluated; they do not change the fit, which uses only the underlying data x.
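For the derivatives asked about in the question: the first derivative of the cdf is the pdf, so ksdensity returns it directly, and the second derivative can be approximated by differencing the pdf on a uniform grid. The following is a minimal sketch, assuming an unbounded estimate with bandwidth 0.5 and a grid padded by three bandwidths on each side; the names bw, h, dF, d2F and maxErr are made up for illustration.

```matlab
% Sketch: first and second derivatives of the kernel cdf estimate.
rng(0);                                   % fixed seed, only for reproducibility
x  = gamrnd(2,3,1000,1);
bw = 0.5;                                 % chosen bandwidth
X  = linspace(min(x)-3*bw, max(x)+3*bw, 2000);   % uniform evaluation grid
h  = X(2) - X(1);                         % grid spacing

F = ksdensity(x, X, 'Function', 'cdf', 'Bandwidth', bw);
f = ksdensity(x, X, 'Function', 'pdf', 'Bandwidth', bw);  % first derivative of F

dF  = gradient(F, h);                     % finite-difference check, should track f
d2F = gradient(f, h);                     % second derivative of F (slope of the pdf)
maxErr = max(abs(dF - f));                % discrepancy between dF and f
```

Differencing f once is generally less noisy than differencing F twice, which is why d2F is computed from the pdf here.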
