Compare two CDF distributions

MEC 2023-3-16
Commented: Jeff 2023-3-16
I am trying to compare two CDF distributions that are generated from two datasets of elevation. One dataset is observed elevations from a DEM (HeightDis.txt), the other is predicted elevations from a model (ModelHeight.txt). I want to generate a goodness of fit for how well the model is matching observed elevations.
I tried to use kstest2, but it requires vector inputs. My two distributions are two-column matrices: the first column is the elevation value, and the second column is the probability. The two distributions have different values in both columns. So my question is: how do I convert these two distributions into a format that can be used in kstest2 without compromising the data? I feel that this is an obvious problem, but I have not found a solution.
  3 Comments
MEC 2023-3-16
Apologies. Digital Elevation Model.
Jeff 2023-3-16
Does the model have any free parameters that you are estimating from these observed data?


Answers (1)

Star Strider 2023-3-16
I am not sure that either of those tests would be appropriate for these data.
% Read the two-column tables: column 1 = elevation, column 2 = cumulative probability
A1 = readmatrix('https://www.mathworks.com/matlabcentral/answers/uploaded_files/1326675/ModelHeight.txt');
A2 = readmatrix('https://www.mathworks.com/matlabcentral/answers/uploaded_files/1326680/HeightDis.txt');
% Plot both CDFs on the same axes
figure
plot(A1(:,1), A1(:,2), '.', 'DisplayName','Model')
hold on
plot(A2(:,1), A2(:,2), '.', 'DisplayName','Observed')
hold off
grid
legend('Location','best')
% Approximate each PDF as the numerical derivative of its CDF with respect to elevation
pdf1 = gradient(A1(:,2)) ./ gradient(A1(:,1));
pdf2 = gradient(A2(:,2)) ./ gradient(A2(:,1));
% Plot both PDF estimates on the same axes
figure
plot(A1(:,1), pdf1, '.-', 'DisplayName','Model')
hold on
plot(A2(:,1), pdf2, '.-', 'DisplayName','Observed')
hold off
grid
legend('Location','best')
They do not appear to be normally distributed in any event. Assuming that they have the same underlying distribution (whatever it is), the ranksum test might be appropriate if these can be considered unpaired data. However, it would need to be run on the original data, not on the probability distributions.
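For illustration, here is a minimal sketch of what running ranksum (and, for comparison, kstest2) on the original data might look like. The raw elevation samples are not posted in this thread, so obsElev and modelElev below are hypothetical synthetic vectors standing in for them. Both functions require the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical raw samples: the thread only provides the CDF tables, so
% these synthetic vectors stand in for the real elevation values.
rng(0);                                 % reproducible illustration
obsElev   = 100 + 15*randn(500, 1);     % placeholder for DEM elevations
modelElev = 103 + 15*randn(400, 1);     % placeholder for model elevations

% Wilcoxon rank-sum test on two unpaired samples (null: equal medians).
[pRank, hRank] = ranksum(obsElev, modelElev);

% Two-sample Kolmogorov-Smirnov test (null: same continuous distribution).
[hKS, pKS, ksStat] = kstest2(obsElev, modelElev);

fprintf('ranksum: p = %.4g, reject = %d\n', pRank, hRank);
fprintf('kstest2: p = %.4g, D = %.4f, reject = %d\n', pKS, ksStat, hKS);
```

Both tests accept vectors of different lengths, so the observed and modelled samples do not need to be matched point for point.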
  2 Comments
MEC 2023-3-16
Thank you for the comment. I wanted to use a KS test for easy comparison with another model output, which is also a KS test. But your point is a good one. I also wanted to avoid using the "data" that comes out of the model because it would require some more time-intensive coding that I wished to avoid. Plus it seemed this should, in theory, have been an easy thing to do, which clearly it is proving not to be.
Star Strider 2023-3-16
My pleasure!
Based on the PDF plots, the data appear to not be normally distributed, so I doubt that it would be worthwhile to test for that, although if that is part of your analysis, then it could be appropriate to consider. If you want to compare the model to the data to see if the model explains the data, a completely different approach would be required. That the independent variables are not the same definitely complicates any analysis.
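One possible way around the mismatched independent variables, offered only as a sketch, is to interpolate both CDFs onto a common elevation grid and take the largest vertical gap between them, which is the quantity the KS statistic measures. The tables below are hypothetical stand-ins for the two-column matrices (A1 and A2 in the code above). The sketch assumes that each table has strictly increasing elevations and that each CDF is 0 below its first tabulated elevation and 1 above its last.

```matlab
% Hypothetical stand-ins for the [elevation, cumulative probability] tables;
% with the real files these would be A1(:,1), A1(:,2), A2(:,1), A2(:,2).
x1 = linspace(90, 140, 60).';   F1 = (x1 - 90) ./ 50;        % model CDF (made up)
x2 = linspace(80, 150, 45).';   F2 = ((x2 - 80) ./ 70).^2;   % observed CDF (made up)

% Common elevation grid spanning both tables.
xq = linspace(min([x1; x2]), max([x1; x2]), 1000).';

% Linear interpolation returns NaN outside each table's range.
F1q = interp1(x1, F1, xq, 'linear');
F2q = interp1(x2, F2, xq, 'linear');

% Assumption: CDF is 0 below the first tabulated elevation and 1 above the last.
F1q(xq < x1(1)) = 0;   F1q(xq > x1(end)) = 1;
F2q(xq < x2(1)) = 0;   F2q(xq > x2(end)) = 1;

% KS-style distance: largest absolute gap between the two CDFs.
D = max(abs(F1q - F2q));
fprintf('Maximum CDF difference D = %.4f\n', D);
```

This gives a distance only, not a test. The p-value that kstest2 reports depends on the two sample sizes, and the CDF tables do not record those, so D on its own is useful for ranking models against each other rather than for a formal accept/reject decision. If the real tables contain repeated elevation values, interp1 will raise an error and the duplicates would need to be removed first.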

Release

R2021a
