How to fit a nonparametric distribution to a sample of known percentile values

Hello everyone
I have a sample of percentile values that describe the distribution of possible earthquake acceleration levels that lead to the failure of a building component. I would like to fit a nonparametric model to these data. I know that, for a random sample of these earthquake acceleration levels, I could fit a nonparametric density using the ksdensity function, but is there a way to do a similar fit to the cumulative distribution function instead?
Many thanks
Example data:
percentiles = [3 11 27 33 52 66 75 87 92];
acc = [0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025];
  5 comments
Torsten 2024-8-14
And doesn't the method allow approximating only between acc(1) and acc(9), where 89% of the mass is accumulated?
Jeff Miller 2024-8-14
@Torsten Not completely. The smoothing would spill over at the edges, so for example the pdf at prctile 91 would depend a bit on what you assumed about the top 8%.


Answers (2)

Star Strider 2024-8-13
The empirical cumulative distribution function ecdf would likely be appropriate here. (There is also ecdf, however it seems less applicable to me.) There are a number of associated functions as well, linked to in that documentation page.
percentiles = [3 11 27 33 52 66 75 87 92];
acc = [0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025];
figure
ecdf(acc, 'Frequency',percentiles)
grid
axis('padded')
[f,x,flo,fup] = ecdf(acc, 'Frequency',percentiles)
f = 10x1
0 0.0067 0.0314 0.0919 0.1659 0.2825 0.4305 0.5987 0.7937 1.0000
x = 10x1
0.3339 0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025
flo = 10x1
NaN 0 0.0152 0.0651 0.1314 0.2407 0.3845 0.5532 0.7562 NaN
fup = 10x1
NaN 0.0143 0.0476 0.1187 0.2004 0.3243 0.4764 0.6441 0.8313 NaN
  8 comments
Torsten 2024-8-14
Edited: Torsten 2024-8-14
So you want a smooth version of
percentiles = [3 11 27 33 52 66 75 87 92];
acc = [0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025];
plot(acc,percentiles/100)
to get an approximate CDF? Maybe fit a sigmoid function?
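A minimal sketch of the sigmoid idea, assuming a two-parameter logistic curve as the sigmoid (the thread does not say which form was meant) and a plain least-squares fit with base MATLAB's fminsearch. The names logisticCdf, sse, theta0 and thetaHat are illustrative only. Note that this is a parametric fit, so it trades the nonparametric requirement for a guaranteed-valid CDF.

```matlab
percentiles = [3 11 27 33 52 66 75 87 92];
acc = [0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025];
p = percentiles/100;                      % cumulative probabilities

% Assumed model: logistic CDF. theta(1) is the location and
% exp(theta(2)) is the scale, which keeps the scale positive.
logisticCdf = @(theta, x) 1 ./ (1 + exp(-(x - theta(1)) ./ exp(theta(2))));

% Sum of squared errors between the model and the known percentile points
sse = @(theta) sum((logisticCdf(theta, acc) - p).^2);

theta0 = [median(acc), log(std(acc))];    % rough starting guess
thetaHat = fminsearch(sse, theta0);       % derivative-free minimisation

xFit = linspace(0.25, 0.70, 200);
FFit = logisticCdf(thetaHat, xFit);

figure
plot(acc, p, 'o', xFit, FFit, '-')
grid
xlabel('acc')
ylabel('F(acc)')
legend('Known percentiles', 'Fitted logistic CDF', 'Location','best')
```

Because the logistic curve is strictly increasing and bounded by 0 and 1, the fitted function is a valid CDF by construction, but it will not pass exactly through the nine points.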
Star Strider 2024-8-14
The pdf plots might look something like this:
percentiles = [3 11 27 33 52 66 75 87 92];
acc = [0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025];
[f,x,flo,fup] = ecdf(acc, 'Frequency',percentiles);
dfdx = gradient(f, x);
dpda = gradient(percentiles/100, acc);
figure
stairs(x, dfdx, 'DisplayName','From ‘ecdf’ Results')
hold on
stairs(acc, dpda, 'DisplayName','From Posted Vectors')
hold off
grid
xlabel('$x$', 'Interpreter','LaTeX')
ylabel('$\frac{dF(x)}{dx}$', 'Interpreter','LaTeX', 'FontSize',14)
legend('Location','best')



Image Analyst 2024-8-14
You could fit a spline through them. The spline doesn't take any parameters; it just fits a cubic polynomial between each pair of points. See attached demo.
  1 comment
Xavier 2024-8-14
Thanks for this idea, but fitting a spline does not ensure that the fitted function will comply with the necessary conditions for being a CDF.
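One possible way around this objection, offered as a sketch rather than as anything proposed in the thread, is to swap the ordinary cubic spline for the shape-preserving interpolant pchip from base MATLAB. For monotone data it produces a monotone curve, so the result stays non-decreasing and within the range of the known probabilities. The variable names below are illustrative, and the curve is only defined between acc(1) and acc(end); the lowest 3% and top 8% of the mass would still need an assumption about the tails.

```matlab
percentiles = [3 11 27 33 52 66 75 87 92];
acc = [0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025];
p = percentiles/100;                      % cumulative probabilities

% Shape-preserving piecewise cubic through the known (acc, p) points
pp = pchip(acc, p);

xq = linspace(acc(1), acc(end), 500);
Fq = ppval(pp, xq);                       % interpolated CDF values

% Differentiate each cubic piece analytically to get a density estimate:
% c1*s^3 + c2*s^2 + c3*s + c4  ->  3*c1*s^2 + 2*c2*s + c3
[breaks, coefs] = unmkpp(pp);
ppd = mkpp(breaks, coefs(:,1:3) .* [3 2 1]);
fq = ppval(ppd, xq);                      % interpolated pdf values

figure
subplot(2,1,1)
plot(acc, p, 'o', xq, Fq, '-')
grid
ylabel('F(acc)')
subplot(2,1,2)
plot(xq, fq)
grid
xlabel('acc')
ylabel('f(acc)')
```

The resulting density is continuous but not smooth at the data points, since pchip only guarantees a continuous first derivative of the CDF.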

