Several questions regarding distributions

Hi everyone,
I have agglomerated some powder. I measured the particle sizes and want to test which distribution fits them best. My measurement is based on a number distribution.
1) To test for a normal distribution, I ran lillietest and jbtest. Both return H1, i.e. my data are not normally distributed (p << 0.001 for both tests).
2) Then I used fitdist to fit these distributions to my data: Normal, Weibull, Lognormal, Exponential. Afterwards, I determined the negative log-likelihood (NLogL, see table). It says Weibull and Lognormal fit best.
3) Then I ran the chi-square goodness-of-fit test against all these distributions. It says that my data fit a Normal distribution but not the other three. BUT I get no p-value (NaN, see table, Chi2Irrtum) for the normal distribution. The other three distributions have very low p-values, close to 0.
4) I visualized my data (histogram, normal probability plot, and plots of the different distributions).
a) In the histogram you can clearly see that some of the powder didn't agglomerate at all; that's why I have a big bar on the left side. I also have a few very big agglomerates (I guess that's why NLogL tells me I have a Lognormal distribution?)
b) In the normal probability plot I can also see that I have some very big agglomerates.
c) Judging only from the plots of the distributions, they don't look that bad, I think. But I can't have negative values, because there is no such thing as a negative particle size. Yet most of these distributions extend partly into the negative region of the graph. Should I change something so that they are only positive?
5) When I split the data into fractions of 0-100, 100-200, 200-300 ... µm and plot them manually on (log-normal) probability paper, I get a straight line, which would mean I have a log-normal distribution. Is that because of the central limit theorem, since I take the arithmetic mean for each of my fractions?
    NameVerteilung       NLogL       Chi2Hypothese    Chi2Irrtum
    _______________    __________    _____________    ___________
    {'Normal'     }    1.0033e+05          0                  NaN
    {'Weibull'    }         98333          1          2.0319e-244
    {'Lognormal'  }         97516          1           3.7622e-11
    {'Exponential'}     1.024e+05          1          1.2428e-182
Is my approach OK? Did I make any mistakes? Should I just accept that none of these distributions fit?
Thanks and best regards,
Marcel

Answers (2)

Jeff Miller 2020-11-13
I don't know that there are definitive answers to any of your questions, but here are some thoughts:
> most of these distributions are partly in a negative area of the graph.
That's not true. Of the four distributions you are looking at, only the normal can take on negative values.
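A minimal sketch of how this can be checked numerically, assuming the Statistics and Machine Learning Toolbox is available. The original measurements are not part of this thread, so `x` is a synthetic lognormal sample used purely as a stand-in. For each fitted distribution, `cdf(pd, 0)` gives the probability the model assigns to sizes at or below zero.

```matlab
% Synthetic stand-in for the measured particle sizes in µm (assumption:
% the real data are not available here, so a lognormal sample is used).
rng(1);                                % reproducible random numbers
x = lognrnd(5, 0.6, 5000, 1);          % strictly positive sizes

names = {'Normal', 'Weibull', 'Lognormal', 'Exponential'};
pNeg  = zeros(1, numel(names));
for k = 1:numel(names)
    pd      = fitdist(x, names{k});    % maximum-likelihood fit
    pNeg(k) = cdf(pd, 0);              % fitted probability of a size <= 0
    fprintf('%-12s P(X <= 0) = %.4g\n', names{k}, pNeg(k));
end
```

Only the Normal fit should report a nonzero value here; the other three are defined on non-negative values only, whatever a plot drawn over a wider x-range may suggest.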
> Is that because of the Central limit theorem
No, if I understand your description, this has nothing to do with the CLT. It probably results from the precision loss that comes from grouping observations into bins--especially in that top bin.
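To see how much information the grouping discards, here is a rough sketch of the manual probability-paper construction, again on synthetic data (a two-component lognormal mixture chosen as an assumption to mimic unagglomerated fines plus agglomerates). The data are cut into 100 µm fractions, and the cumulative fraction is converted to a normal quantile and plotted against the log of the upper bin edge.

```matlab
rng(4);
% Synthetic mixture (assumption): fines plus larger agglomerates, in µm.
x = [lognrnd(3.5, 0.3, 3000, 1); lognrnd(5.5, 0.5, 7000, 1)];

edges   = 0:100:ceil(max(x)/100)*100;  % 0-100, 100-200, ... µm fractions
counts  = histcounts(x, edges);
cumFrac = cumsum(counts) / numel(x);   % cumulative number fraction
upper   = edges(2:end);                % upper edge of each fraction

keep     = cumFrac > 0 & cumFrac < 1;  % norminv is infinite at 0 and 1
z        = norminv(cumFrac(keep));     % probability-paper y-axis
logUpper = log(upper(keep));           % lognormal paper uses log(size)

figure; plot(logUpper, z, 'o-');
xlabel('log(upper bin edge / µm)'); ylabel('normal quantile');

% Full-resolution comparison using every observation:
figure; probplot('lognormal', x);
```

Everything below 100 µm collapses into a single point in the binned version, so structure in that range cannot show up on the binned plot, while `probplot` on the raw data still reveals it.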
> Is my approach ok?
It sounds ok from your description. The only thing that would be better would be to start with some physical theory describing the process of powder agglomeration and use it to derive or simulate predicted distributions. But you may not have such a theory.
> Should I just accept that none of these distributions fit?
Yes. In fact, the histogram does not look like any standard distribution that I know of. In addition to the ones you have tried, I would suggest also looking at the gamma and extreme value distributions.
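A sketch of how the comparison could be extended to the gamma and extreme value distributions, assuming the Statistics and Machine Learning Toolbox and synthetic data in place of the real measurements. The AIC column and the `'NParams'` option are additions for illustration, not something the original post used. The `df` column is included because, as far as I know, `chi2gof` returns a NaN p-value when no degrees of freedom remain after sparse bins are pooled, which may explain the NaN reported for the Normal fit.

```matlab
rng(2);
% Synthetic mixture (assumption): fines plus larger agglomerates, in µm.
x = [lognrnd(3.5, 0.3, 3000, 1); lognrnd(5.5, 0.5, 7000, 1)];

names = {'Normal'; 'Weibull'; 'Lognormal'; 'Exponential'; ...
         'Gamma'; 'ExtremeValue'};
n = numel(names);
NLogL = zeros(n, 1); AIC = zeros(n, 1);
h = zeros(n, 1); p = zeros(n, 1); df = zeros(n, 1);

for k = 1:n
    pd       = fitdist(x, names{k});           % maximum-likelihood fit
    NLogL(k) = negloglik(pd);                  % negative log-likelihood
    AIC(k)   = 2*pd.NumParameters + 2*NLogL(k);
    % Chi-square test against the fitted CDF; NParams reduces the degrees
    % of freedom by the number of estimated parameters.
    [h(k), p(k), st] = chi2gof(x, 'CDF', pd, 'NParams', pd.NumParameters);
    df(k) = st.df;
end

result = sortrows(table(names, NLogL, AIC, h, p, df), 'AIC');
disp(result)
```

With samples as large as the one in the question, a chi-square test tends to reject every simple parametric model, so the ranking by NLogL or AIC together with the probability plots is usually more informative than the p-values alone.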
I hope these impressions are somewhat helpful to you.

Marcel Langner 2020-11-16
Thanks a lot for all the good hints and advice!
> most of these distributions are partly in a negative area of the graph.
>> That's not true. Of the four distributions you are looking at, only the normal can take on negative values.
True, only the normal distribution has a negative part on my diagrams.
> Is that because of the Central limit theorem
>> No, if I understand your description, this has nothing to do with the CLT. It probably results from the precision loss that comes from grouping observations into bins--especially in that top bin.
OK, but just another thought: if I have many batches (e.g. over 100), calculate the mean particle size of each, and then find that these means can be approximated by a normal distribution, does that have something to do with the CLT? If not, could you give me an example of the CLT for my case?
> Should I just accept that none of these distributions fit?
>> Yes. In fact, the histogram does not look like any standard distribution that I know of. In addition to the ones you have tried, I would suggest also looking at the gamma and extreme value distributions.
I will do that. In fact, I'm not that interested in these data themselves, but I read that you should first test which distribution approximates your data, because many tests assume a normal distribution. What I want to do is compare these particle size data with results from other particle size measurement techniques, which is not straightforward and sometimes almost impossible because of the different measurement principles. I also want to see which process parameters affect the particle size.
1 comment
Jeff Miller 2020-11-16
> Ok, but just another thought....
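Yes. Each batch mean averages many individual particle sizes, so by the CLT the batch means will look approximately normal, even though the individual sizes do not.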
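A small simulation can illustrate this, under stated assumptions: 200 hypothetical batches of 500 particles each, with individual sizes drawn from a skewed lognormal distribution (the parameters are made up). It compares the skewness of the individual sizes with that of the batch means and runs `lillietest` on both.

```matlab
rng(3);
nBatches  = 200;                        % hypothetical number of batches
nPerBatch = 500;                        % hypothetical particles per batch

% One column per batch; individual sizes are strongly right-skewed.
sizes      = lognrnd(5, 0.6, nPerBatch, nBatches);
batchMeans = mean(sizes, 1)';           % one mean size per batch

skewRaw  = skewness(sizes(:));
skewMean = skewness(batchMeans);
fprintf('skewness: individual sizes %.2f, batch means %.2f\n', ...
        skewRaw, skewMean);

% Lilliefors test for normality on both samples (h = 1 rejects normality).
[hRaw,  pRaw]  = lillietest(sizes(:));
[hMean, pMean] = lillietest(batchMeans);
fprintf('lillietest: individual h = %d, batch means h = %d\n', hRaw, hMean);
```

The batch means should come out far less skewed than the individual sizes. The approximation relies on the batches being independent and produced under the same conditions; if process parameters vary between batches, the means reflect that variation as well.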
