How to estimate probabilities of an arbitrary range, based on the probability distribution of a given data set of numbers?

Given a series of values x, I want to estimate the probabilities of a range of numbers U, using the probability distribution of the given series x. My code works for a single value, but I need the probabilities for a whole range. Can somebody give me some feedback, please?
Thank you in advance.
This is the code:
%% Generate some data/series
x = randi([-2 50], 25, 1);
% Values/ranges of interest
% Define the histogram and probability distribution of x
h = histogram(x);
h.Normalization = 'probability'; % convert counts to probabilities
h.Values(U); % finding probabilities of range U


Bruno Luong 2018-10-22
Edited: Bruno Luong 2018-10-22
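% Count samples per bin; the -Inf and Inf edges add a left and a right tail bin
N = histcounts(x, [-Inf, U, Inf]);
% Drop the left tail and normalize by the total number of samples
P = N(2:end) / sum(N)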
  4 comments
Clarisha Nijman 2018-10-23
x = randi([-3 3], 10, 1);
U = -5:5;
N = histcounts(x, [-Inf, U, Inf])
prob = N(2:end) / sum(N)
% alternative code
f = hist(x, U);
prob = f / sum(f);
Now I fully understand your answer. With this small example it is clear. Adding the tails gives two extra intervals. A value in U, let's say 2, is associated with the interval [2,3), because histcounts bins include their left edge. The left tail does not belong to U, so it is excluded, which leaves eleven intervals; that is why the code uses (2:end). Thanks a lot!
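For the probability of a whole range rather than of each value separately, the per-value estimates from the accepted answer can be summed, or the fraction of samples inside the range can be counted directly. The following is a minimal sketch under stated assumptions: the data are integer-valued, and the fixed sample, the range bounds a and b, and the names pRange and pDirect are illustrative and not taken from the thread.

```matlab
% Illustrative fixed data (assumption, chosen so the result is reproducible)
x = [-3 -1 0 0 1 2 2 2 3 3];
U = -5:5;                          % values of interest

% Per-value estimates: bins are [U(k), U(k+1)), plus a left tail
% (dropped below) and a final bin [U(end), Inf] that holds the right tail
N = histcounts(x, [-Inf, U, Inf]);
P = N(2:end) / sum(N);

% Probability of the range a <= x <= b (a and b are assumed here)
a = 0;
b = 2;
pRange = sum(P(U >= a & U <= b));  % sum of the per-value probabilities

% Same estimate computed directly as a sample fraction
pDirect = mean(x >= a & x <= b);
```

Both numbers agree for integer data as long as b is smaller than U(end); the last entry of P also contains everything to the right of U(end), so a range that includes U(end) would pick up the right tail as well.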


More Answers (2)

Torsten 2018-10-22
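%% Generate some data/series
X = randi([-2 50], 25, 1);
% Values/ranges of interest
X = sort(X);
% Histogram counts and bin centers, then the empirical CDF at the bin centers
[countsX, binsX] = hist(X);
cdfX = cumsum(countsX) / sum(countsX);
% Value used outside the range of the bin centers: 1 above it, 0 below it
extrap_left = double(min(U) > max(binsX));
extrap_right = double(max(U) > max(binsX));
% Interpolate the CDF at both ends of U; the difference estimates the probability of the range
p_U_left = interp1(binsX, cdfX, min(U), 'linear', extrap_left);
p_U_right = interp1(binsX, cdfX, max(U), 'linear', extrap_right);
p_U = p_U_right - p_U_left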
  4 comments
Clarisha Nijman 2018-10-22
If you want to use the data, you cannot do that; it would exclude situations that might still occur. That is why the frequency polygon is a smooth line: to estimate the values in between.
Torsten 2018-10-22
Edited: Torsten 2018-10-22
If you get discrete values from a random variable, say [1 2 4 5 6], how could you determine p({3})? (Hint: it is impossible.)
In my opinion, the most reasonable estimate would be p=0, since 3 does not appear in the list.
If you know the distribution the values stem from, you can get a Maximum Likelihood Estimate (MLE) of the parameters describing the distribution. Having calculated these parameters, you can give estimates of probabilities for elements of your choice.
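As a rough illustration of the maximum likelihood route, the sketch below fits a normal distribution to the example list [1 2 4 5 6] and then reads off a probability for an unobserved value. The normal model is purely an assumption made for this example (nothing in the thread says the data are normal), and the names Phi and p3 are hypothetical. Only base MATLAB functions are used, with the normal CDF written via erfc.

```matlab
% Example list from the comment above
x = [1 2 4 5 6];

% Maximum likelihood estimates for an ASSUMED normal distribution:
% the sample mean, and the standard deviation normalized by N (not N-1)
mu = mean(x);
sigma = std(x, 1);

% Normal CDF with the fitted parameters, expressed through erfc
Phi = @(t) 0.5 * erfc(-(t - mu) ./ (sigma * sqrt(2)));

% Estimated probability that a value falls in (2.5, 3.5],
% i.e. that it rounds to the unobserved value 3
p3 = Phi(3.5) - Phi(2.5);
```

Unlike the empirical estimate, this gives a nonzero probability for 3, but the number is only as good as the assumed distribution.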


Bruno Luong 2018-10-22
Edited: Bruno Luong 2018-10-22
Not sure; is this what you want?
x=randi([-2 50],10000,1);
h = histogram(x, U);
  1 comment
Clarisha Nijman 2018-10-22
Let's say x is the profit of a shop, observed 20 times, and the values are: 2, 5, 7, 2, 20, 25, 35, 15, 6, -2, 15, 27, 2, 20, 15, 5, 7, 2, 20, 25
This can be associated with a probability distribution, and you can plot it.
Now the task is to estimate the probability of the values in between, and also in the tails: U = [-5 -4 -3 -2 -1 0 1 2 ... 40]
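One way to apply the histcounts idea to the profit values is sketched below. It assumes that U is meant to be the integers from -5 to 40 (the list is elided in the comment), and the names pLeftTail, pU, p2, and p3 are illustrative.

```matlab
% The 20 observed profit values quoted above
x = [2 5 7 2 20 25 35 15 6 -2 15 27 2 20 15 5 7 2 20 25];

% ASSUMPTION: U is taken to be every integer from -5 to 40
U = -5:40;

% Counts per bin, with an extra bin for each tail
N = histcounts(x, [-Inf, U, Inf]);

pLeftTail = N(1) / numel(x);     % estimate of P(x < U(1))
pU = N(2:end) / numel(x);        % pU(k) estimates P(U(k) <= x < U(k+1));
                                 % the last entry estimates P(x >= U(end))

p2 = pU(U == 2);                 % value that occurs in the data
p3 = pU(U == 3);                 % value that never occurs in the data
```

This purely empirical estimate assigns probability zero to any value that was never observed; getting nonzero estimates for the values in between requires some smoothing or a fitted distribution.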

