Expanding Sample Covariance Matrix

Lemar DeSalis 2011-8-21
Hello!
I need to calculate the mean vector and the covariance matrix for sampled data. For example, I have a matrix with NumFeatures columns and NumSamples rows, so I can simply use "mean(MyMatrix)" and "cov(MyMatrix)".
However, what should I do if I want to extend the covariance matrix I got through the method described above?
So I have a covariance matrix calculated from the old samples, how can I add the influence of the new samples?
Is there an easy MATLAB way to do that?
Thanks in advance!
  1 comment
Oleg Komarov 2011-8-21
The terminology you're using is not clear. Could you give an example?
For reference: http://www.mathworks.com/matlabcentral/answers/6200-tutorial-how-to-ask-a-question-on-answers-and-get-a-fast-answer


Answers (2)

Lemar DeSalis 2011-8-22
% MyMatrix is a Matrix containing samples, in this case random data:
MyMatrix = rand( [NumSamples NumFeatures] );
% I need the mean vector and the covariance matrix:
MyMean = mean(MyMatrix);
MyCov = cov(MyMatrix);
% Now I got some new data:
MyLargerMatrix = vertcat(MyMatrix, SomeNewData);
% Calculate new values:
MyMean_New1 = mean(MyLargerMatrix)
MyCov_New1 = cov(MyLargerMatrix);
%%%%HERE IS MY QUESTION:
% But what to do, when the old data is not available anymore?
clear MyLargerMatrix MyMatrix;
MyCov_New2 = ... ?
% How to update the covariance matrix if you only have the old
% mean vector "MyMean", the old covariance matrix "MyCov", the number
% of old samples "NumSamples" and the new samples "SomeNewData"?
%
% MyCov_New2 should be identical to MyCov_New1, but MyCov_New2
% should be computed WITHOUT access to the old data.
% For the mean vector, this is easily possible, but how to do so for the covariance matrix?
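A minimal sketch of the mean update mentioned above, assuming rows are samples. The sizes and the names NumNew and MeanErr are illustrative and not part of the original post:

```matlab
% Illustrative sizes and data (assumptions, not from the original post)
NumSamples  = 50;
NumFeatures = 4;
MyMatrix    = rand(NumSamples, NumFeatures);
SomeNewData = rand(10, NumFeatures);

% Quantities kept from the old data
MyMean = mean(MyMatrix, 1);

% Update the mean using only MyMean, NumSamples and the new samples
NumNew      = size(SomeNewData, 1);
MyMean_New2 = (NumSamples*MyMean + sum(SomeNewData, 1)) / (NumSamples + NumNew);

% Check against the mean of the stacked data
MeanErr = max(abs(MyMean_New2 - mean([MyMatrix; SomeNewData], 1)));
```

MeanErr should be at the level of floating-point round-off.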

Oleg Komarov 2011-8-22
% Example inputs
A = rand(100,2);
B = randn(20,2);
C = [A;B];
% Sample covariances (normalized by N-1)
c1 = cov(A);
c2 = cov(B);
c3 = cov(C);
% Means
m1 = mean(A);
m2 = mean(B);
m3 = mean(C);
% Number of samples
nA = size(A,1);
nB = size(B,1);
nC = nA + nB;
% The question is: how to get c3 having only c1, c2, m1, m2, nA and nB?
% Keep in mind that:
  • cov(x,y) = E(xy) - E(x)E(y)
  • m3 = (m1*nA + m2*nB)/nC
  • same with E(xy)
  • cov is the sample covariance, thus we have to adjust for N-1
  • the following formula is valid only for the covariance (the off-diagonal entries), not for the variances on the diagonal
ExEy12 = prod((m1*nA + m2*nB)/nC);
adj = nC/(nC-1);
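% Combined covariance from the summary statistics; only the off-diagonal
% entries are meaningful here, because prod(m) is E(x)E(y)
(c1*(nA-1) + c2*(nB-1) + prod(m1)*nA + prod(m2)*nB)/nC*adj - ExEy12 * adj
% Reference value computed from the stacked data, for comparison
c3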
How to derive the variance is up to you. But you really just need paper and pencil.
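For the full matrix, diagonal included and for any number of features, one common route is to combine scatter matrices (sums of outer products of deviations from the mean) instead of raw moments. The sketch below is not taken from the answer above. The names mA, cA, sC and so on are illustrative, rows are assumed to be samples, and the new batch is assumed to contain at least two rows so that cov(B) is well defined:

```matlab
% Illustrative data (assumption): 3 features, rows are samples
A = rand(100, 3);                % "old" samples
B = randn(20, 3);                % "new" samples

% Summary statistics kept from the old data
nA = size(A, 1);
mA = mean(A, 1);
cA = cov(A);

% Summary statistics of the new batch alone
nB = size(B, 1);
mB = mean(B, 1);
cB = cov(B);

% Combined sample count and mean
nC = nA + nB;
mC = (nA*mA + nB*mB) / nC;

% Combined scatter matrix: the two within-group scatters plus a
% between-group term built from the difference of the means
d  = mA - mB;                    % 1-by-p row vector
sC = (nA-1)*cA + (nB-1)*cB + (nA*nB/nC) * (d' * d);

% Back to a sample covariance (normalized by N-1)
cC = sC / (nC-1);

% Check against cov of the stacked data
maxErr = max(max(abs(cC - cov([A; B]))));
```

maxErr should be at round-off level. After the update only nC, mC and cC need to be stored, so the same step can be repeated for each further batch.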
