Kernel: mean integrated squared error - bandwidth selection
Hello all,
I have a set of data and estimated its density with a kernel estimator. The bandwidth has to be chosen properly to get a correct density from the data. As a starting point I set it to 0.2, so that I could experiment before looking into a proper selection method, but the kernel estimate did not work with width = 0.2, although it did work for another data set. I understand there is a more professional way to pick the best bandwidth for given data, based on the mean integrated squared error. Is there a built-in function for this in MATLAB? I could not find one, and I am not sure whether it is in a toolbox that is not available to me. I would also like to know why the width 0.2 does not work in my code.
Thank you all,
sample1 = [6.52689332414481E7
6.52693837402845E7
6.5270203713004336E7
6.527122138667133E7
6.52717237415096E7
6.527173346449997E7
6.527211590239384E7
6.5272540473269284E7
6.527282568117965E7
6.527314005807114E7
];
x = sample1.';
[f,xi] = ksdensity(x,'width',0.2); % f: density estimates, xi: evaluation points
plot(xi,f);
line(repmat(x,2,1),repmat([0;0.1*max(f)],1,length(x)),'color','g'); % mark each data point
Accepted Answer
Ilya
2011-8-29
The "right" width depends on your assumptions about the fitted distribution. MATLAB does not choose the bandwidth "randomly". It computes the optimal bandwidth for the normal distribution:
help ksdensity
[snip]
[F,XI,U]=ksdensity(...) also returns the bandwidth of the kernel smoothing window.
[snip]
'width' The bandwidth of the kernel smoothing window. The default is optimal for estimating normal densities, but you may want to choose a smaller value to reveal features such as multiple modes.
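For reference, the "optimal for normal densities" default that the help text mentions can be sketched in a few lines of base MATLAB. This is only an illustration of the textbook normal-reference rule, h = sigma*(4/(3n))^(1/5), with sigma estimated robustly from the median absolute deviation. Whether ksdensity uses exactly this estimate internally is an assumption, and the variable names are made up for the example.

```matlab
% Sample from the question, written as a row vector.
x = [6.52689332414481E7 6.52693837402845E7 6.5270203713004336E7 ...
     6.527122138667133E7 6.52717237415096E7 6.527173346449997E7 ...
     6.527211590239384E7 6.5272540473269284E7 6.527282568117965E7 ...
     6.527314005807114E7];
n = numel(x);

% Robust scale estimate: median absolute deviation, rescaled so that it
% matches the standard deviation when the data are normally distributed.
sigmaRobust = median(abs(x - median(x))) / 0.6745;

% Normal-reference (rule-of-thumb) bandwidth for a Gaussian kernel.
hNormal = sigmaRobust * (4 / (3 * n))^(1/5);

fprintf('rule-of-thumb bandwidth: %g\n', hNormal);
```

For this sample the result is several orders of magnitude larger than 0.2, because the rule scales with the spread of the data.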
If you look at that Wikipedia article, note this paragraph:
Neither the AMISE nor the h_AMISE formulas are able to be used directly since they involve the unknown density function ƒ or its second derivative ƒ'', so a variety of automatic, data-based methods have been developed for selecting the bandwidth. Many review studies have been carried out to compare their efficacities,[6][7][8][9][10] with the general consensus that the plug-in selectors[11] and cross validation selectors[12][13][14] are the most useful over a wide range of data sets.
I suggest that you choose the optimal bandwidth by cross-validation using ksdensity and crossval functions. Often the approximation based on the normal distribution (which you get by default from ksdensity) is good enough. -Ilya
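A minimal sketch of the cross-validation idea, assuming leave-one-out likelihood cross-validation as the selection criterion. It uses a plain loop around ksdensity rather than the crossval function, and the candidate grid (a range of multiples of std(x)) is an arbitrary choice for illustration, not a recommendation.

```matlab
% Sample from the question, written as a row vector.
x = [6.52689332414481E7 6.52693837402845E7 6.5270203713004336E7 ...
     6.527122138667133E7 6.52717237415096E7 6.527173346449997E7 ...
     6.527211590239384E7 6.5272540473269284E7 6.527282568117965E7 ...
     6.527314005807114E7];
n = numel(x);

% Candidate bandwidths on the scale of the data spread (assumed grid).
hCandidates = std(x) * logspace(-1, 1, 25);

% Leave-one-out log-likelihood for each candidate bandwidth.
score = zeros(size(hCandidates));
for k = 1:numel(hCandidates)
    for i = 1:n
        train = x([1:i-1, i+1:n]);   % all points except x(i)
        % Density at the held-out point, estimated from the other points.
        fi = ksdensity(train, x(i), 'width', hCandidates(k));
        % log(0) = -Inf, which rules out bandwidths that are too narrow.
        score(k) = score(k) + log(fi);
    end
end

% Keep the bandwidth with the highest held-out log-likelihood.
[~, best] = max(score);
hBest = hCandidates(best);

[f, xi] = ksdensity(x, 'width', hBest);
plot(xi, f);
```

With only ten points the score curve is noisy, so the selected value is best treated as a rough guide and compared against the default bandwidth.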
More Answers (1)
the cyclist
2011-8-28
In your case, your data are order-of-magnitude 1e7, but you are choosing a width of 0.2, so it is much, much too tiny. I suspect you do not have a very good understanding of what kernel density estimation is doing, so you might want to read some basic articles to understand the technique better. This is not a bad place to start:
The easiest thing to do is to not include the 'width' parameter at all, and let MATLAB choose it for you:
[f,xi] = ksdensity(x); % f: density estimates, xi: evaluation points
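To see the scale mismatch directly, one option is to ask ksdensity for the bandwidth it selected (its third output) and compare that with 0.2 and with the spread of the data. This is a small sketch using the sample from the question; the printed labels are illustrative.

```matlab
% Sample from the question, written as a row vector.
x = [6.52689332414481E7 6.52693837402845E7 6.5270203713004336E7 ...
     6.527122138667133E7 6.52717237415096E7 6.527173346449997E7 ...
     6.527211590239384E7 6.5272540473269284E7 6.527282568117965E7 ...
     6.527314005807114E7];

% No 'width' given: u is the bandwidth that ksdensity selected.
[f, xi, u] = ksdensity(x);

fprintf('data spread (std): %g\n', std(x));
fprintf('default bandwidth: %g\n', u);
fprintf('default / 0.2:     %g\n', u / 0.2);

plot(xi, f);
```

A width of 0.2 is tiny compared with the gaps between neighbouring data points, so each kernel only covers its own observation and the estimate collapses into isolated spikes.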