Use convolution (conv) for probability distribution... ?

Dear user community,
I have an x-vector and two (or more) y-vectors representing probability densities for a variation in a geometric dimension.
The probability density function of the sum of the two dimensions is, as I understand it, the convolution of the two y-vectors.
When I use "conv", however, the resulting distribution seems to depend on the resolution of the x-vector: a higher resolution gives higher values for the resulting convolved function. The shape of the distribution looks OK, but its integral is not equal to 1 (which it should be for a probability distribution).
What am I doing wrong?
Grateful for any support here...
Best regards Mats Lindqvist
Code:
clear
%Input data, tolerance limits for three linear dimensions [mm]
Dy_h_max=0.07; Dy_h_min=0;
Di_h_max=0.097; Di_h_min=0;
Da_max=0.063; Da_min=0;
% End of input data
f=@(x) conv(conv(unifpdf(x,Dy_h_min,Dy_h_max),unifpdf(x,Di_h_min,Di_h_max)),unifpdf(x,Da_min,Da_max));
% f(x) is the convolution of three probability distribution functions, in this case
% with uniform distribution, "unifpdf", but the method should work for any shape of the probability distribution
x=[0:0.01:0.4]; % x-vector for the distributions, unit: mm
g=f(x);
%Generate the x-vector for the convolution of the two functions
X=[.5*(x(2)-x(1)):(x(2)-x(1)):(length(g)-.5)*(x(2)-x(1))];
X=X/max(X);
% It seems I need to divide the distribution with the integral,
% so that the cumulative probability becomes 1:
Int=trapz(X,g);
for i=1:length(X)
PD(i)=trapz(X(1:i),1/Int*g(1:i));
end
% plot cumulative probability distribution of the sum of the three dimensions
plot(X,PD)

Answers (2)

Hugo 2013-6-6
Dear Mats Lindqvist,
I've found several possible explanations, so please check whether any of them applies to your case:
1) As it is, g is not a probability distribution, because the convolution using conv simply multiplies elements from the vectors without taking into consideration the size of the bin. In other words, you have to multiply/divide by the bin size (in your example, 0.01) each time you convolve (in your example, twice), depending on how you define the uniform distributions (see item 3).
2) To get the integral of g equal to unity, you might need to consider a bin size smaller than the precision of the limits of the uniform distributions. In your example, the precision is 0.001 (take a look at Di_h_max, for example). Otherwise, you will encounter some border effects depending on how you define the uniform distribution (i.e. the integral will slightly differ from 1).
3) You might need to be careful about how you define the uniform distributions in general (as samples from a continuous distribution or as a discrete distribution) and at the edges of the intervals where the distribution is different from zero, especially in your particular discretization.
In your example, take x=0.0005:.001:.4, so that each bin is defined at its centre and the bin size matches the precision of the limits of the uniform distributions. Define the uniform distributions as if they were continuous (i.e., if a<b are the edges, then unifpdf(x,a,b) is 1/(b-a) for each x between them), and multiply each convolution by the bin size (the differential in the convolution integral). You then obtain a g that, when integrated (i.e. sum all the terms and multiply by the bin size), gives unity, and is thus a proper piece-wise constant continuous probability distribution.
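The recipe above can be sketched in a few lines. This is only an illustration under stated assumptions: the helper updf is a hypothetical stand-in for unifpdf (so the snippet does not need the Statistics Toolbox), and the names p1, p2, p3, area and P are not from the original code. The factor dx appears once per conv call because conv only sums products and leaves out the differential of the convolution integral.

```matlab
% Tolerance limits from the question [mm]
Dy_h_max = 0.07;  Dy_h_min = 0;
Di_h_max = 0.097; Di_h_min = 0;
Da_max   = 0.063; Da_min   = 0;

dx = 0.001;              % bin size, matching the precision of the limits
x  = dx/2 : dx : 0.4;    % bin centres

% Hypothetical helper standing in for unifpdf:
% the continuous uniform density 1/(b-a), sampled at x
updf = @(x, a, b) double(x >= a & x <= b) / (b - a);

p1 = updf(x, Dy_h_min, Dy_h_max);
p2 = updf(x, Di_h_min, Di_h_max);
p3 = updf(x, Da_min,   Da_max);

% conv is a plain sum of products, so multiply by dx once per call
g = conv(conv(p1, p2) * dx, p3) * dx;

% x starts at dx/2, so output sample k sits at (k-1)*dx + 3*(dx/2)
X = (0:numel(g)-1) * dx + 3 * x(1);

area = sum(g) * dx;      % integral of g: sum of terms times bin size
P    = cumsum(g) * dx;   % cumulative probability of the summed dimension
```

With this scaling, plot(X,P) shows the cumulative distribution directly in mm, without rescaling X or dividing g by its integral.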
Does this solve your problem? Or did I misunderstand your question?
Best regards, Hugo

Mats Lindqvist 2013-6-11
Thank you very much, Hugo, for a thorough answer. I will have to read it carefully and try your suggestions before giving any feedback. As I mention above, the convolved distribution's integral is not equal to one, and it depends on the resolution of x. So far I have "solved" the problem by simply dividing the function by its integral. My justification for this is that the distribution looks good, just as I expect. For example, the convolution of two uniform distributions becomes a triangular pulse, as expected. Convolving three distributions gives something more like the usual "bell shaped" curve. If I understand the "central limit theorem" correctly, a combination of several uniform distributions should eventually approach the "normal distribution" as the number of distributions increases.
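A small sketch for checking the two points above numerically. It assumes the bin-centred grid suggested in the answer, and the helper updf is a hypothetical stand-in for unifpdf; the other variable names are illustrative as well. It compares dividing by the integral with scaling by the bin size, and compares the mean and variance of the result with the sums of the individual means and variances, which is what a normal approximation would use.

```matlab
dx   = 0.001;
x    = dx/2 : dx : 0.4;           % bin centres
lims = [0.07, 0.097, 0.063];      % upper tolerance limits from the question [mm]

% Hypothetical stand-in for unifpdf(x, 0, b)
updf = @(x, b) double(x >= 0 & x <= b) / b;

% Unscaled convolution, as in the question
g_raw = conv(conv(updf(x, lims(1)), updf(x, lims(2))), updf(x, lims(3)));
X = (0:numel(g_raw)-1) * dx + 3 * x(1);

g_div = g_raw / (sum(g_raw) * dx);   % workaround: divide by the integral
g_dx  = g_raw * dx^2;                % scale by dx once per conv call
max_diff = max(abs(g_div - g_dx));   % how far the two approaches differ

% Mean and variance of the summed dimension
mu = sum(X .* g_div) * dx;
v  = sum((X - mu).^2 .* g_div) * dx;
mu_expected = sum(lims) / 2;         % sum of the uniform means
v_expected  = sum(lims.^2) / 12;     % sum of the uniform variances

% Normal density with the same mean and variance, for a visual comparison
g_normal = exp(-(X - mu).^2 / (2 * v)) / sqrt(2 * pi * v);
```

On this grid each input density already sums to 1/dx, so the two normalisations should coincide up to rounding. On a coarser grid that does not line up with the limits they can differ, which is the border effect mentioned in the answer. Plotting g_div and g_normal against X shows how close three uniform distributions already come to the bell shape.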
In this example I have started out with the uniform distribution because it is simple, and relevant in my case, but I hope to be able to analyze other distributions as well. The application is tolerancing in the manufacturing industry. We try to utilize the fact that several manufactured parts are very unlikely to all be at the maximum or minimum of their tolerance specification at the same time. I'm neither a MATLAB nor a statistics specialist, however, so my progress is unfortunately slow...
And thanks again for valuable support...
Best regards Mats Lindqvist
