Correlation between two differently formatted datasets
Hello,
I want to calculate the R^2 correlation between two different datasets.
The first one, A, is 192x288 (lat,lon), and I can visualize the values on a 2D colormap.
The second one, B, is 555x2 (lat,lon). This data came from an Excel file, in column format. The points are randomly spread across the globe and do not lie on the grid cells of A. The data is far too sparse to interpolate.
I am having trouble figuring out how I can possibly find a correlation between these two different data formats. Is there a way to convert B into a map that I can visualize with a colormap like A? Also, how would the resolution affect this calculation?
Any help would be highly appreciated
Thank you,
Melissa
Accepted Answer
Chad Greene
2015-3-9
Melissa,
Without knowing anything about your project, my gut feeling is that it is not prudent to grid your B dataset, because you'll end up interpolating over long, long distances between data points. I suppose you could use TriScatteredInterp or gridfit, but you'd probably then want to mask out any grid boxes that are far from the B data points.
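If you did want to try gridding B and masking the distant cells, a minimal sketch might look like the following. It uses scatteredInterpolant, the function MathWorks recommends in place of TriScatteredInterp in newer releases. The point data, the 2x1 degree grid, and the 10-degree cutoff (maxDist) are all made-up assumptions, and the distance is a crude Euclidean distance in degrees rather than a true great-circle distance.

```matlab
% Hypothetical point data standing in for the real lat/lon/value columns
rng(1)
latB = 180*(rand(30,1)-.5);
lonB = 360*(rand(30,1)-.5);
B    = .1*latB + rand(size(latB));

% Target grid (assumed resolution)
[lonG,latG] = meshgrid(-180:2:180, -90:1:90);

% Linear interpolation inside the convex hull of the points, NaN outside it
F     = scatteredInterpolant(lonB, latB, B, 'linear', 'none');
Bgrid = F(lonG, latG);

% Distance (in degrees) from every grid node to its nearest data point
dmin = inf(size(lonG));
for k = 1:numel(B)
    dmin = min(dmin, hypot(lonG - lonB(k), latG - latB(k)));
end

% Mask out grid nodes farther than an arbitrary cutoff from any data point
maxDist = 10;                      % degrees, assumed threshold
Bgrid(dmin > maxDist) = NaN;
```

Bgrid can then be drawn with pcolor(lonG,latG,Bgrid) just like A; the NaN cells show up as gaps. With only a few hundred points most of the globe will probably be masked, which is the reason for the caution above.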
You can, however, get a correlation between these data sets. I'm going to make up a gridded dataset A and a point dataset B:
% Some gridded dataset A:
[lonA,latA] = meshgrid(-180:2:180,90:-1:-90);
A = peaks(181)+.1*latA;
% Some measurements B at specific points:
latB = 180*(rand(30,1)-.5);
lonB = 360*(rand(30,1)-.5);
B = .1*latB+rand(size(latB));
% Plot the points atop the gridded dataset:
pcolor(lonA,latA,A)
hold on
plot(lonB,latB,'rp','markersize',15)
shading interp
xlabel('longitude')
ylabel('latitude')
Then get A values at points B by interpolating the A dataset:
A_interp = interp2(lonA,latA,A,lonB,latB);
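For a real 192x288 grid the same call works with coordinate vectors instead of meshgrid matrices. The sketch below is only an illustration: the lat and lon vectors, the 0 to 360 longitude convention, and the three point locations are assumptions, not values from the question.

```matlab
% Hypothetical coordinates for a 192x288 (lat x lon) grid
lat = linspace(-90, 90, 192);       % one latitude per row of A
lon = linspace(0, 358.75, 288);     % one longitude per column of A, 0..360 convention
A   = rand(192, 288);               % placeholder values

% Hypothetical point locations in the -180..180 convention
latB = [10; -45; 60];
lonB = [-120; 30; 179];

% Put the point longitudes in the same convention as the grid before interpolating
lonB360 = mod(lonB, 360);

% Linear interpolation (the interp2 default) of A at the point locations
A_at_B = interp2(lon, lat, A, lonB360, latB);
```

Points that fall outside the grid's coordinate range come back as NaN, so it is worth checking for NaNs before computing a correlation.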
You can then use corrcoef to get a correlation coefficient, which for this fake data is 0.89:
R = corrcoef([A_interp B])
R =
1.0000 0.8936
0.8936 1.0000
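Since the question asks for R^2: for a simple linear relationship that is just the square of the off-diagonal element of R, so 0.8936 corresponds to an R^2 of about 0.80. A small sketch, using made-up paired values and dropping any pair that contains a NaN (for example, a point that fell outside the grid):

```matlab
% Placeholder paired samples (hypothetical values, one NaN on purpose)
A_interp = [1.2; 0.4; NaN; 3.1; 2.2];
B        = [1.0; 0.1; 0.7; 2.9; 2.5];

% Keep only the pairs where both values are present
ok = ~isnan(A_interp) & ~isnan(B);

R  = corrcoef(A_interp(ok), B(ok));  % 2x2 correlation matrix
r  = R(1,2);                         % correlation coefficient between the two sets
R2 = r^2;                            % coefficient of determination for a linear fit
```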
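Note, though, that the correlation coefficient is unchanged by shifting or linearly rescaling either dataset (as long as the scale factor is positive), so it tells you how tight the linear relationship is but nothing about its slope or offset. Below I'm going to use polyplot (a File Exchange function, not part of base MATLAB) to plot the linear regression:
figure % new figure, so the scatter plot is not drawn on top of the map
plot(A_interp,B,'b*')
hold on
polyplot(A_interp,B,'k-')
axis tight; box off
xlabel('dataset A')
ylabel('dataset B')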
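If polyplot is not installed, roughly the same plot can be made with the built-in polyfit and polyval. This is a sketch with synthetic x and y standing in for A_interp and B; the slope, offset, and noise level are arbitrary.

```matlab
% Synthetic stand-ins for A_interp and B (assumed values)
rng(0)
x = linspace(-5, 5, 30)';
y = 0.8*x + 0.5 + 0.3*randn(size(x));

% First-order least-squares fit: p(1) is the slope, p(2) the intercept
p = polyfit(x, y, 1);

% Evaluate the fitted line at the ends of the data range
xfit = [min(x); max(x)];
yfit = polyval(p, xfit);

figure
plot(x, y, 'b*')
hold on
plot(xfit, yfit, 'k-')
axis tight; box off
xlabel('dataset A')
ylabel('dataset B')
```

The slope and intercept in p carry the scaling and offset information that the correlation coefficient alone does not report.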