How to create a boxplot from a PDF?
Hello!
I have a somewhat embarrassing question, but my colleagues and I have not been able to figure it out for several days. We seem to have a mental block ^^ So I would appreciate help!
I have a PDF of my data called pdfxcor (598x1), which resembles a normal distribution when I plot it against an x-axis representing the molecular weight of my data (called pixelweight (598x1)).
plot(pixelweight,pdfxcor)
boxplot(pdfxcor)
I want to display the distribution as a boxplot according to the correct molecular weight.
Thanks for your patience! :)
Jette
Accepted Answer
Teja Muppirala
2013-4-23
How about something like this? Generate the CDF from your data as Tom suggested, invert it, use the inverted CDF to generate a large number of samples that follow your distribution exactly, and send those to BOXPLOT:
%% Just making some data that resembles yours
x = linspace(1000,12000,598);
P = normpdf(x,5800,1800);
figure, plot(x,P), title('PDF');
%% Generate the CDF
C = cumsum(P);
C = C/C(end);
figure, plot(x,C), title('CDF');
%% Sample linearly along the inverse CDF to get a bunch of points
% that have the same distribution as your data
BigNumber = 100000;
p = interp1(C,x,linspace(C(1),C(end),BigNumber));
figure, hist(p,100); % Confirm p indeed has your distribution
figure, h = boxplot(p);
delete(findobj(h,'tag','Outliers')) % Hide the outliers
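If the goal is to line the box up with the molecular-weight axis of the PDF plot, one option is to draw the boxplot horizontally underneath the curve and give both axes the same x-limits. The following is only a sketch under some assumptions: the stand-in data replaces pixelweight and pdfxcor, the Statistics Toolbox boxplot is available, and the 'orientation' and 'symbol' options are used in place of deleting the outlier objects afterwards.

```matlab
% Stand-in data (assumption): substitute pixelweight for x and pdfxcor for P.
x = linspace(1000, 12000, 598);
P = exp(-0.5*((x - 5800)/1800).^2);   % bell-shaped curve, normalization not needed

% Same inverse-CDF sampling as in the answer above.
C = cumsum(P);
C = C / C(end);
p = interp1(C, x, linspace(C(1), C(end), 100000));

% PDF on top, horizontal boxplot below, both on the same x-range.
figure;
subplot(2,1,1);
plot(x, P);
xlim([x(1) x(end)]);
title('PDF');

subplot(2,1,2);
boxplot(p, 'orientation', 'horizontal', 'symbol', '');  % empty symbol hides outliers
xlim([x(1) x(end)]);
xlabel('molecular weight');
```

Note that interp1 needs the CDF values to be distinct, so a PDF containing exact zeros would have to be cleaned up first (for example with unique).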
4 Comments
Tom Lane
2013-4-23
It looks like your distribution is not symmetric. The normal distribution is symmetric, so it would not resemble the histogram in that respect.
More Answers (1)
Tom Lane
2013-4-22
The boxplot shows the median, lower quartile, and upper quartile. You may be able to calculate these for your pdf. For example, if you have the pdf as a numeric vector, you might compute cumsum on the vector, then divide by the last value to impose the correct probability normalization, then interpolate.
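A minimal sketch of that calculation might look like the following. The data here is a stand-in (an assumption), so replace x and P with pixelweight and pdfxcor. It also swaps cumsum for cumtrapz, which does the same job but accounts for the grid spacing in case the x values are not evenly spaced.

```matlab
% Stand-in data (assumption): substitute pixelweight for x and pdfxcor for P.
x = linspace(1000, 12000, 598)';
P = exp(-0.5*((x - 5800)/1800).^2);   % unnormalized bell curve, no toolbox needed

% Cumulative distribution by trapezoidal integration, normalized to end at 1.
C = cumtrapz(x, P);
C = C / C(end);

% interp1 needs distinct sample points, so drop repeated CDF values
% (these occur wherever the PDF is exactly zero).
[Cu, iu] = unique(C);
xu = x(iu);

% Invert the CDF at the probabilities of interest.
probs = [0.01 0.25 0.50 0.75 0.99];
q = interp1(Cu, xu, probs);

fprintf('1%%: %.0f  Q1: %.0f  median: %.0f  Q3: %.0f  99%%: %.0f\n', q);
```

Here q(2), q(3) and q(4) are the lower quartile, median and upper quartile in molecular-weight units, and q(1) and q(5) are the 1% and 99% points.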
The boxplot also shows a notion of the range of the data, and sometimes outliers. These are harder to extend to a pdf. You could decide that you want to compute the 1% and 99% points as in the previous paragraph, and use those to represent the end points of the range. You could decide not to show outliers.
Plotting these as lines or points will be relatively simple. It would be more of a challenge to plot them in exactly the way that the boxplot function does.
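As a rough illustration of the "lines or points" approach, the sketch below draws a simple box under the PDF curve in molecular-weight units, using the quartiles for the box and the 1% and 99% points for the whiskers as described above. The stand-in data, the vertical position of the box and its height are all arbitrary assumptions, and the result only imitates the look of boxplot.

```matlab
% Stand-in data (assumption): substitute pixelweight for x and pdfxcor for P.
x = linspace(1000, 12000, 598)';
P = exp(-0.5*((x - 5800)/1800).^2);

% Quantiles from the normalized cumulative distribution.
C = cumtrapz(x, P);
C = C / C(end);
[Cu, iu] = unique(C);
q = interp1(Cu, x(iu), [0.01 0.25 0.50 0.75 0.99]);

% Plot the PDF, then draw the box in data units on the same x-axis.
figure;
plot(x, P);
hold on;
yc = -0.15 * max(P);   % vertical centre of the box (arbitrary choice)
hh =  0.05 * max(P);   % half-height of the box (arbitrary choice)
rectangle('Position', [q(2), yc - hh, q(4) - q(2), 2*hh]);  % Q1 to Q3
plot([q(3) q(3)], [yc - hh, yc + hh], 'r');                 % median line
plot([q(1) q(2)], [yc yc], 'k');                            % lower whisker (1% point)
plot([q(4) q(5)], [yc yc], 'k');                            % upper whisker (99% point)
ylim([yc - 2*hh, 1.05 * max(P)]);
xlabel('molecular weight');
ylabel('PDF');
hold off;
```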