How to exclude data when fitting an exponential distribution
I am trying to fit an exponential distribution to a set of lifetimes. I want to exclude all lifetimes <= 1 because those are unreliable data points. However, the fitter then treats the absence of data points in that region as information, rather than ignoring the region. This becomes clear when I simulate an essentially perfect exponential distribution:
rng('default')
x = round(exprnd(4,1e6,1)); % exponential distribution with mean 4
pd = fitdist(x,'exponential');
disp(pd.mu) % the fitted mean
x1 = x(x>1); % remove all values <= 1
pd1 = fitdist(x1,'exponential');
disp(pd1.mu) % the new fitted mean
Output:
3.9834
5.5111
The two fits are clearly different even though they are fitting the same data. How can I make the fitter ignore that range of values?
3 Comments
J. Alex Lee
2020-10-22
I don't think this is surprising at all. You aren't fitting a distribution to a histogram and ignoring probability densities; you are fitting to the actual data. So if you alter the data, of course you will alter the fit. It's like being surprised at the difference between
x = randn(1000,1);
mean(x)
x1 = x(x>1); % keep only values > 1 (semicolon suppresses printing the vector)
mean(x1)
Accepted Answer
Jeff Miller
2020-10-22
You can make use of the memoryless property of the exponential here: the mean remaining time is independent of how much time has already passed, so just reset the clock to 0 after excluding times less than or equal to 1. Rounding messes that up, though; I'm not sure why.
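For reference, the memoryless property used here can be written out as below. This is a sketch that assumes X is a continuous (unrounded) exponential lifetime with mean \mu and that c is the cutoff (c = 1 in this thread); the symbols X, \mu, c and t are notation introduced for illustration only.

```latex
P(X > c + t \mid X > c)
  = \frac{e^{-(c+t)/\mu}}{e^{-c/\mu}}
  = e^{-t/\mu}
  = P(X > t),
\qquad\text{so}\qquad
E[X - c \mid X > c] = \mu,
\quad
E[X \mid X > c] = \mu + c .
```

In other words, fitting the truncated data without shifting should overestimate the mean by roughly the cutoff, and subtracting the cutoff first should recover it.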
rng('default')
% x = round(exprnd(4,1e6,1)); % exponential distribution with mean 4
x = exprnd(4,1e6,1); % exponential distribution with mean 4
pd = fitdist(x,'exponential');
disp(pd.mu) % the fitted mean
x1 = x(x>1); % remove all values <= 1
x1 = x1 - 1; % ADJUST THE SCORES TO "RESTART THE CLOCK" AT TIME 1
pd1 = fitdist(x1,'exponential');
disp(pd1.mu) % the new fitted mean
% output:
% 3.994
% 3.9924
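A possible explanation for the rounding effect, offered as a sketch rather than a definitive diagnosis: round() maps every value in [0.5, 1.5) to 1, so keeping rounded values > 1 actually keeps the original values >= 1.5. Under that reading the clock should restart at 1.5 rather than 1. The code below compares the two shifts. It uses mean() directly, which assumes the exponential fit returns the sample mean (the maximum-likelihood estimate for this distribution); the variable names muShift1, muShift15 and muGeo are made up for this example.

```matlab
rng('default')
mu = 4;                          % true mean, as in the example above
x  = exprnd(mu, 1e6, 1);         % continuous lifetimes
xr = round(x);                   % rounded lifetimes, as in the question

% round() sends every x in [0.5, 1.5) to 1, so keeping xr > 1 really
% keeps x >= 1.5. The effective cutoff is 1.5, not 1.
xr1 = xr(xr > 1);

muShift1  = mean(xr1 - 1);       % restart at 1: too large by about 0.5
muShift15 = mean(xr1 - 1.5);     % restart at 1.5: close to mu

% Optional refinement: for the kept points, xr1 - 2 equals floor(x - 1.5),
% which follows a geometric distribution, so its mean m can be inverted
% for mu via m = 1/(exp(1/mu) - 1).
m     = mean(xr1 - 2);
muGeo = 1 / log(1 + 1/m);

fprintf('%.4f  %.4f  %.4f\n', muShift1, muShift15, muGeo);
```

The shift by 1.5 still leaves a small upward bias (about 1/(12*mu) for integer rounding), because rounding discretizes the remaining lifetimes; the geometric inversion removes that bias if the rounding-to-integers assumption holds for the real data.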