Generate correlated samples with copulas: errors when using "copulafit"
Hello everybody,
I need to generate samples from real measured data to train an unsupervised machine learning algorithm. Inspired by the examples in the documentation, I would like to do this using copulafit.
In the following code, the variable thisData is an m-by-n matrix (m = 231 samples; n = 16 indicators, each with its own distribution). From these 231 measurement sets I would like to generate 2500 samples (variable "nSample") that preserve the dependence among the 16 distributions.
Error Message:
Executing the code produces the following error in the Command Window:
"Error in copulafit (line 125)
[lowerBnd,upperBnd] = bracket1D(profileFun,lowerBnd,5); % 'upper', search ascending from 5
Error in SimulatingDependentRandomVariablesUsingCopulas3 (line 13)
[Rho,nu] = copulafit('t',u,'Method','ApproximateML')"
My Code:
%% Show distributions of dataset
plotmatrix(thisData)

%% Transform the data to the copula scale (unit square) using a kernel estimator of the cumulative distribution function.
for i = 1:size(thisData,2)
    u(:,i) = ksdensity(thisData(:,i),thisData(:,i),'function','cdf');
end
%plotmatrix(u,'Direction','out')

%% Fit a 't' copula.
[Rho,nu] = copulafit('t',u,'Method','ApproximateML')

%% Generate a random sample from the t copula.
r = copularnd('t',Rho,nu,1000);
u1 = r(:,1);
v1 = r(:,2);
scatterhist(u1,v1,'Direction','out')
xlabel('u')
ylabel('v')
set(get(gca,'children'),'marker','.')

%% Transform the random sample back to the original scale of the data.
x1 = ksdensity(x,u1,'function','icdf');
y1 = ksdensity(y,v1,'function','icdf');
scatterhist(x1,y1,'Direction','out')
set(get(gca,'children'),'marker','.')
I appreciate your help and support - thank you very much.
Jonas
2 Comments
Tom Lane
2015-7-24
You give the location of the error but not the text of the error. Could you add that? I got your code to run after correcting the variable names x and y.
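The variable-name correction mentioned here can be sketched as follows. The posted code calls ksdensity on x and y, which are never defined. The sketch assumes they were meant to be the first two columns of thisData, and it uses synthetic stand-in data (not the poster's measurements) so that it runs on its own:

```matlab
% Stand-in data (assumption): 231 samples of two correlated indicators.
rng(0);
thisData = mvnrnd([0 0],[1 0.6; 0.6 1],231);

% Assumption: x and y refer to the first two columns of thisData.
x = thisData(:,1);
y = thisData(:,2);

% Transform to the copula scale with a kernel estimate of each CDF.
u = [ksdensity(x,x,'function','cdf'), ksdensity(y,y,'function','cdf')];

% Fit a t copula and draw 1000 samples from it.
[Rho,nu] = copulafit('t',u,'Method','ApproximateML');
r = copularnd('t',Rho,nu,1000);

% Transform the samples back to the original scale of the data.
x1 = ksdensity(x,r(:,1),'function','icdf');
y1 = ksdensity(y,r(:,2),'function','icdf');
scatterhist(x1,y1,'Direction','out')
```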
Answers (1)
Shruti Sapre
2015-7-29
Edited: Shruti Sapre
2015-7-29
Hi Jonas,
I understand that you are receiving an error while using the “copulafit” function with your data.
This error is due to collinearities in the input data; in this case, due to the presence of duplicate columns. The rank of the matrix (14) is less than the number of columns in the matrix (16).
These singularities can be observed by computing the eigenvalues of the correlation matrix of "thisData" using the following commands:
>> [V,D] = eig(corr(thisData))
>> Eigenvalues = diag(D)
>> Eigenvalues(abs(Eigenvalues) < 10^(-16))
You can observe that a couple of eigenvalues are smaller than 10^-16 in magnitude, which is effectively zero at machine precision for a sample covariance/correlation matrix. In other words, numerically, some columns of "thisData" are linear combinations of others.
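As a rough way to locate the offending columns, the sketch below compares the numerical rank of the correlation matrix with the number of columns and lists pairs of indicators that are almost perfectly correlated. The data are a synthetic stand-in with one deliberately duplicated column, and the 1e-10 tolerance is an arbitrary assumption. A pairwise check like this only finds duplicate (or affinely related) pairs, not more general linear combinations.

```matlab
% Stand-in data (assumption): 231 samples, 4 indicators,
% where column 4 is an exact copy of column 1.
rng(0);
A = randn(231,3);
thisData = [A, A(:,1)];

C = corr(thisData);
nCols = size(thisData,2);
r = rank(C);   % numerical rank of the correlation matrix
fprintf('rank = %d of %d columns\n', r, nCols);

% Pairs (i,j), i < j, whose correlation is within 1e-10 of +/-1.
[iRow,iCol] = find(triu(abs(C) > 1 - 1e-10, 1));
dupPairs = [iRow iCol]
```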
Modifying the matrix so that its rank is equal to the number of columns/indicators/variables can help resolve your issue.
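One possible way to do this is sketched below; it is not necessarily the only or best approach. It picks a linearly independent subset of columns with a column-pivoted QR factorization and then runs the copula workflow on that subset for all remaining indicators. The data are again a synthetic stand-in, and the names keep, X and sim are introduced here purely for illustration.

```matlab
% Stand-in data (assumption): 231 samples, 4 indicators,
% where column 4 is an exact copy of column 1.
rng(0);
A = randn(231,3);
thisData = [A, A(:,1)];
nSample = 2500;

% Column-pivoted QR on the standardized data: the first k pivots index
% k linearly independent columns, where k is the numerical rank.
Z = zscore(thisData);
[~,~,p] = qr(Z,0);
k = rank(Z);
keep = sort(p(1:k));
X = thisData(:,keep);

% Transform each retained indicator to the copula scale.
u = zeros(size(X));
for i = 1:size(X,2)
    u(:,i) = ksdensity(X(:,i),X(:,i),'function','cdf');
end

% Fit the t copula and draw nSample dependent samples.
[Rho,nu] = copulafit('t',u,'Method','ApproximateML');
r = copularnd('t',Rho,nu,nSample);

% Transform every column back to the original scale of the data.
sim = zeros(nSample,size(X,2));
for i = 1:size(X,2)
    sim(:,i) = ksdensity(X(:,i),r(:,i),'function','icdf');
end
```

If the dropped columns are exact duplicates of retained ones, they can be restored afterwards by copying the corresponding columns of sim.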
Hope this helps!
-Shruti