Generate correlated samples with copulas: Problems/Errors by using "copulafit"

Question

exampleData.mat

Hello everybody,

I need to generate samples from real measured data to train an unsupervised machine learning algorithm. Inspired by the examples in the documentation, I would like to do this using copulafit.

In the following code, the variable thisData is an m-by-n matrix (m = 231 samples; n = 16 indicators, each with its own distribution). From these 231 measurement sets I would like to generate 2500 samples (variable "nSample") that preserve the dependence among the 16 distributions.

Error Message:

Running the code produces the following error in the Command Window:

Error in copulafit (line 125)
[lowerBnd,upperBnd] = bracket1D(profileFun,lowerBnd,5); % 'upper', search ascending from 5

Error in SimulatingDependentRandomVariablesUsingCopulas3 (line 13)
[Rho,nu] = copulafit('t',u,'Method','ApproximateML')

My Code:

%% Show distributions of the data set
plotmatrix(thisData)
%% Transform the data to the copula scale (unit hypercube) using a kernel estimator of the cumulative distribution function
u = zeros(size(thisData));   % preallocate the transformed data
for i = 1:size(thisData,2)
    u(:,i) = ksdensity(thisData(:,i),thisData(:,i),'function','cdf');
end
% plotmatrix(u)
%% Fit a t copula
[Rho,nu] = copulafit('t',u,'Method','ApproximateML')
%% Generate a random sample from the t copula
r = copularnd('t',Rho,nu,1000);
u1 = r(:,1);
v1 = r(:,2);
scatterhist(u1,v1,'Direction','out')
xlabel('u')
ylabel('v')
set(get(gca,'children'),'marker','.')
%% Transform the random sample back to the original scale of the data
% x and y are the first two indicators, matching columns 1 and 2 of r
x = thisData(:,1);
y = thisData(:,2);
x1 = ksdensity(x,u1,'function','icdf');
y1 = ksdensity(y,v1,'function','icdf');
scatterhist(x1,y1,'Direction','out')
set(get(gca,'children'),'marker','.')

I appreciate your help and support - thank you very much.

Jonas

2 Comments

Tom Lane 2015-7-24

You give the location of the error but not the text of the error. Could you add that? I got your code to run after correcting the variable names x and y.

rowJoe 2015-7-27

Edited: rowJoe 2015-7-27

Hi Tom,

thank you very much for your comment. This is the error message:

Error using copulafit/approxProfileNLL_t (line 290)
The estimate of Rho has become rank-deficient. You may have too few data, or strong dependencies among variables.
Error in copulafit>bracket1D (line 489)
oldnll = nllFun(bound);
Error in copulafit (line 125)
[lowerBnd,upperBnd] = bracket1D(profileFun,lowerBnd,5); % 'upper', search ascending from 5
Error in SimulatingDependentRandomVariablesUsingCopulas3 (line 14)
[Rho,nu] = copulafit('t',u,'Method','ApproximateML')

Moreover, I attached the file "exampleData.mat" which contains an example of matrix "thisData".

Answer 1

Shruti Sapre 2015-7-29

Edited: Shruti Sapre 2015-7-29

Hi Jonas,

I understand that you are receiving an error while using the “copulafit” function with your data.

This error is due to collinearities in the input data; in this case, due to the presence of duplicate columns. The rank of the matrix (14) is less than the number of columns in the matrix (16).

These singularities can be observed by computing the eigenvalues of the correlation matrix of "thisData" using the following commands:

>> [V,D] = eig(corr(thisData))
>> Eigenvalues = diag(D)
>> Eigenvalues(abs(Eigenvalues) < 10^(-16))

You can observe that a couple of eigenvalues are smaller in magnitude than 10^-16, which is effectively machine zero for a sample covariance/correlation matrix. In other words, to within machine precision, some columns of "thisData" are linear combinations of other columns.
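As a small illustration of the eigenvalue check, here is a sketch on synthetic data rather than on "thisData". The matrix X, the duplicated column, and the threshold tol = 1e-10 are all assumptions made for the example; corrcoef is used so that the snippet runs without any toolbox.

```matlab
% Synthetic stand-in for the measured data (assumption: 231 samples, 4 indicators).
rng(0);
X = randn(231, 4);
X(:, 5) = X(:, 2);                 % add an exact duplicate of column 2

% Correlation matrix, symmetrized so that eig returns real eigenvalues.
C = corrcoef(X);
C = (C + C') / 2;
ev = eig(C);

% Assumed threshold: anything below tol is treated as numerically zero.
tol = 1e-10;
nearZero = ev(abs(ev) < tol);      % eigenvalues caused by the duplicate column
numRank  = sum(abs(ev) >= tol);    % numerical rank of the correlation matrix

fprintf('%d near-zero eigenvalue(s), numerical rank %d of %d\n', ...
        numel(nearZero), numRank, size(C, 1));
```

A fixed cutoff is a judgment call: computed eigenvalues of a singular matrix are often around 1e-16 but can land slightly above it, so a somewhat looser threshold tends to be more robust than comparing against 10^-16 exactly.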

Modifying the matrix so that its rank is equal to the number of columns/indicators/variables can help resolve your issue.
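One possible way to get a full-rank matrix is to keep only a linearly independent subset of columns, for example via QR factorization with column pivoting. The sketch below uses synthetic data with two duplicated columns to mimic the 16-column, rank-14 situation; the variable names (X, keep, dropped, Xreduced) and the tolerance rule are assumptions for illustration, not something taken from the attached file.

```matlab
% Synthetic stand-in (assumption): 14 independent indicators plus 2 duplicates.
rng(0);
X = randn(231, 14);
X(:, 15) = X(:, 3);                % duplicate of column 3
X(:, 16) = X(:, 7);                % duplicate of column 7

% Center the columns so the rank matches that of the correlation matrix.
Xc = bsxfun(@minus, X, mean(X, 1));

% Economy-size QR with column pivoting; p is the column permutation vector.
[~, R, p] = qr(Xc, 0);
d = abs(diag(R));

% Assumed tolerance, in the spirit of the default used by rank().
tol = max(size(Xc)) * eps(d(1));
r = sum(d > tol);                  % numerical rank

keep     = sort(p(1:r));           % indices of independent columns
dropped  = setdiff(1:size(X, 2), keep);
Xreduced = X(:, keep);             % full-column-rank matrix

fprintf('Keeping %d of %d columns; dropped: %s\n', ...
        numel(keep), size(X, 2), mat2str(dropped));
```

With real data, the reduced matrix (here Xreduced, in the question thisData(:,keep)) could then go through the same ksdensity and copulafit steps as in the question. Columns that were dropped as exact duplicates can be restored afterwards by copying the generated samples of the column they duplicate.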

Hope this helps!

-Shruti
