Unexpected interquartile range (IQR) result

Sim 2023-12-9
Commented: Sim 2023-12-11
For a number of distributions, I would like to compare and show the interquartile range (IQR) and the standard deviation (STD).
For the normal distribution I got more or less what I expected: about 68% of the data lie within 1 STD of the mean, and about 50% lie within the IQR (i.e. the central half of the distribution). Here is my test:
clear all; clc;
samplesize = 100000;
% generate distribution
mu = 0;
sigma = 1;
data = normrnd(mu,repmat(sigma,samplesize,1));
% statistics
m = mean(data);
s = std(data);
data1sigma = data((data < (m+s)) & (data > (m-s)));
percentage_data_1sigma = length(data1sigma)/length(data)*100
percentage_data_1sigma = 68.1040
q = quantile(data,[0.25 0.5 0.75]);
dataIQR = data((data < (q(2)+q(1))) | (data > (q(2)-q(1))));
percentage_data_IQR = length(dataIQR)/length(data)*100
percentage_data_IQR = 50.1370
% plot
figure
hold on
h = histogram(data);
xline([m-s m m+s],'-k',{'-1 Standard Dev.','Mean','+1 Standard Dev.'},'linewidth',1)
xline([q(2)-q(1) q(2) q(2)+q(1)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
set(h,'FaceAlpha',0.2)
hold off
However, if I try the same with another distribution, such as a gamma distribution, the IQR no longer contains 50% of the data. What did I do wrong?
clear all; clc;
samplesize = 100000;
% generate distribution
a = 1;
b = 5;
data = gamrnd(a,repmat(b,samplesize,1));
% statistics
m = mean(data);
s = std(data);
data1sigma = data((data < (m+s)) & (data > (m-s)));
percentage_data_1sigma = length(data1sigma)/length(data)*100
percentage_data_1sigma = 86.5350
q = quantile(data,[0.25 0.5 0.75]);
dataIQR = data((data < (q(2)+q(1))) | (data > (q(2)-q(1))));
percentage_data_IQR = length(dataIQR)/length(data)*100
percentage_data_IQR = 100
% plot
figure
hold on
h = histogram(data);
xline([m-s m m+s],'-k',{'-1 Standard Dev.','Mean','+1 Standard Dev.'},'linewidth',1)
xline([q(2)-q(1) q(2) q(2)+q(1)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
set(h,'FaceAlpha',0.2)
hold off

Answers (1)

Sim 2023-12-9
Edited: Sim 2023-12-9
My bad. The quartiles q(1) and q(3) are already the lower and upper bounds of the IQR, so the selection should be:
dataIQR = data( data > q(1) & data < q(3) );
The vertical lines marking the quartiles also need to be drawn at the quartiles themselves, with this command:
xline([q(1) q(2) q(3)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
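To see why the original mask happened to work for the normal case but not for the gamma case, here is a minimal sketch. It is only an illustration, not the code from the question: the fixed seed, the variable names, and the use of `randn` and `-5*log(rand(...))` as stand-ins for `normrnd` and `gamrnd(1,5,...)` are my assumptions, chosen so the snippet needs nothing beyond `quantile`.

```matlab
% Sketch: compare the original mask with the corrected one (illustrative only)
rng(0);                          % fixed seed, assumed here for reproducibility
n = 100000;
x_norm = randn(n,1);             % standard normal, median close to 0
x_exp  = -5*log(rand(n,1));      % exponential with mean 5 (gamma with a = 1, b = 5)

for c = {x_norm, x_exp}
    x = c{1};
    q = quantile(x,[0.25 0.5 0.75]);
    % original mask: q(2)+q(1) and q(2)-q(1) are not quartiles, and the
    % two conditions are joined with OR instead of AND
    wrong = (x < (q(2)+q(1))) | (x > (q(2)-q(1)));
    % corrected mask: keep the data strictly between Q1 and Q3
    right = (x > q(1)) & (x < q(3));
    fprintf('original mask: %.2f%%   corrected mask: %.2f%%\n', ...
        100*mean(wrong), 100*mean(right));
end
```

For the normal sample the median is close to 0 and Q1 is negative, so the original mask actually keeps the two tails outside the quartiles, which also hold about 50% of the data. For strictly positive data Q1 is positive, so q(2)-q(1) is smaller than q(2)+q(1) and every point satisfies at least one of the two conditions, which gives 100%.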
Here is a corrected example:
% generate distribution
samplesize = 100000;
a = 1;
b = 8;
data = gamrnd(a,repmat(b,samplesize,1));
% statistics
m = mean(data);
s = std(data);
data1sigma = data((data < (m+s)) & (data > (m-s)));
percentage_data_1sigma = length(data1sigma)/length(data)*100
percentage_data_1sigma = 86.3970
q = quantile(data,[0.25 0.5 0.75]);
dataIQR = data( data > q(1) & data < q(3) );
percentage_data_IQR = length(dataIQR)/length(data)*100
percentage_data_IQR = 50
% plot
hold on
h = histogram(data);
xline([m-s m m+s],'-k',{'-1 Standard Dev.','Mean','+1 Standard Dev.'},'linewidth',1)
xline([q(1) q(2) q(3)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
set(h,'FaceAlpha',0.2)
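As a sanity check on the numbers above, the a = 1 case can also be worked out in closed form, since a gamma distribution with shape 1 and scale b is an exponential distribution with mean b and standard deviation b. The sketch below assumes that parametrization (the one `gamrnd(a,b)` uses); the names `F`, `p_1sigma` and `p_iqr` are mine.

```matlab
% Closed-form check for the a = 1 case (exponential distribution)
b = 8;                                % scale used in the example above
F = @(x) 1 - exp(-max(x,0)/b);        % CDF of a gamma with shape 1 and scale b
mu = b;                               % theoretical mean
sd = b;                               % theoretical standard deviation
p_1sigma = F(mu+sd) - F(mu-sd)        % equals 1 - exp(-2), about 0.8647
q1 = -b*log(1-0.25);                  % theoretical first quartile
q3 = -b*log(1-0.75);                  % theoretical third quartile
p_iqr = F(q3) - F(q1)                 % 0.5 by construction
```

So roughly 86.5% of the data within one standard deviation of the mean is what this distribution should give, independent of b, while the IQR still contains 50% of the data.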
2 Comments
Steven Lord 2023-12-9
You could check your results using the iqr function and/or the prctile function, each moved from Statistics and Machine Learning Toolbox to MATLAB in release R2022a.
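A minimal sketch of such a cross-check, assuming a release where `iqr` and `prctile` are available (R2022a or later, or with the toolbox installed). The sample is a stand-in I generate here with a fixed seed, not the data from the thread, and I am assuming `iqr` uses the same interpolation as `quantile`, so a small tolerance is used rather than an exact comparison.

```matlab
% Cross-check the quartiles and the IQR with prctile and iqr (illustrative only)
rng(1);                                   % fixed seed, an assumption
data = -8*log(rand(100000,1));            % stand-in sample: exponential with mean 8

q = quantile(data,[0.25 0.5 0.75]);       % quartiles as computed in the answer
p = prctile(data,[25 50 75]);             % same quartiles, given as percentages
r = iqr(data);                            % width of the interquartile range

quartiles_agree = max(abs(q(:) - p(:))) < 1e-12
iqr_agrees = abs(r - (q(3) - q(1))) < 1e-9
```

Note that `iqr` returns the width Q3 - Q1, not the interval itself, so it checks the quartile values rather than the percentage of data between them.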
Sim 2023-12-11
Thanks a lot @Steven Lord for your nice comment and suggestion! :-) :-)
