Need help with large-scale portfolio optimisation

Hi, I am currently working on a portfolio optimisation project where the investible universe contains roughly 6000 to 10000 assets (depending on the date). The data set is large, with roughly 1000 data points per asset. The objective function is x'Vx, where x is the weight vector and V is the covariance matrix. Since there are more assets than data points, V is singular, not invertible and not positive definite. I tried the fmincon and quadprog solvers in MATLAB, but the results are unsatisfactory: the solvers stop essentially because the maximum number of iterations is reached. Raising this limit (to 30000) gives more accurate, but still not correct, results in both solvers, and it also easily causes out-of-memory problems.
I am wondering if there are any alternative solvers I can use to get around this. Any suggestions will be greatly appreciated.
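For concreteness, the setup I am using looks roughly like the sketch below (variable names are illustrative; `V` is assumed to be the n-by-n covariance matrix already in the workspace, and quadprog minimizes 0.5*x'*H*x + f'*x, so H = 2*V corresponds to x'Vx):

```matlab
% Minimum-variance problem (sketch): min x'Vx  s.t.  sum(x) = 1, x >= 0
n   = size(V, 1);
H   = 2*V;               % quadprog minimizes 0.5*x'*H*x + f'*x
f   = zeros(n, 1);
Aeq = ones(1, n);        % weights sum to one
beq = 1;
lb  = zeros(n, 1);       % long-only
x0  = repmat(1/n, n, 1); % equal-weight starting point
[x, fval, exitflag] = quadprog(H, f, [], [], Aeq, beq, lb, [], x0);
```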

1 Answer

For a quadratic objective function you would want to use quadprog, so you are right in that sense; fmincon will just take longer, but it is necessary if you have non-linear constraints. In what sense is the answer "not correct"?

4 Comments

Hi Brendan, thanks for your reply.
By "not correct", I mean that fmincon stops because the maximum number of iterations is reached, not because a (local) minimum was found. I also set the options to display each iteration and found that the objective function values are not converging, resulting in large fluctuations in simple returns.
The constraints are simply that the weights sum to 1 and the portfolio is long-only.
I have also used another much smaller dataset (10 stocks and 2000 data points for each stock) to test this optimisation framework, and the solvers are able to find a minimum.
This may be due to a singular covariance matrix, or an ill-conditioned covariance matrix.
You can check these via:
rank(V)
cond(V)
This would not be a surprise given the large number of variables to samples. There are some things you can do to mitigate this problem.
1. Add a small regularization parameter to the main diagonal (I use a covariance matrix computed from 252 days of stock prices in 2006 for 450 assets on the S&P).
rank(c)
ans =
249
rank(c+diag(repmat(10*eps,length(c),1)))
ans =
450
10*eps
ans =
2.2204e-15
Notice how small this regularization parameter is. It will have negligible effect on the solution.
The condition number has also changed drastically:
cond(c)
ans =
9.9006e+19
cond(c+diag(repmat(10*eps,length(c),1)))
ans =
1.3600e+13
2. If you have missing data points, you must be careful how you calculate the covariance matrix. With pairwise estimation, each entry is computed from a different number of observations, and there is no guarantee of a symmetric positive definite matrix. You are better off omitting any observations which contain NaNs.
3. You could always use some sort of shrinkage estimator with a regularized condition number. For instance, see: "Condition Number Regularized Covariance Estimation".
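The complete-case estimation in point 2 can be sketched as follows (assuming a T-by-n matrix `returns` of observations, with NaNs marking missing values):

```matlab
% Keep only complete observations, then estimate the covariance.
% The resulting sample covariance is symmetric positive semidefinite.
complete = ~any(isnan(returns), 2);   % rows with no missing values
V = cov(returns(complete, :));
```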
I hope this helps!
Thanks for your suggestions, Brendan. Here are some results based on the three pieces of advice.
Firstly, I added a small number to the diagonal of the covariance matrix, and it indeed became full rank. However, fmincon still stops because the maximum number of iterations is reached.
Secondly, I was indeed using MATLAB's nancov function with the 'pairwise' option. I estimated the covariance matrix again without it, which means rows containing NaNs are deleted from the calculation. Comparing the two estimates, the one without the 'pairwise' option has larger variances and covariances. I am using weekly data and the dataset is not long, so deleting data is the last thing I want to do.
Thirdly, I have not yet tried the method in the paper you mentioned, but I did try the shrinkage estimator proposed by Ledoit and Wolf (2004) to estimate the covariance matrix, and ran the optimisation with quadratic programming. It is able to find minima, but the portfolio underperforms over the whole time period, which makes me think something is still going wrong.
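For reference, a generic linear shrinkage toward the diagonal looks like the sketch below. Note this is illustrative only: the actual Ledoit-Wolf estimator chooses the shrinkage intensity from the data, whereas here `delta` is a hypothetical fixed value:

```matlab
% Linear shrinkage toward a diagonal target (illustrative only)
delta    = 0.2;              % shrinkage intensity (hypothetical; Ledoit-Wolf estimates this)
F        = diag(diag(V));    % target: variances kept, covariances shrunk toward zero
V_shrunk = (1 - delta)*V + delta*F;   % full rank whenever all variances are positive
```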
It may be that your constraints are not set appropriately, so I would double-check those first.
With the number of assets in question, it is likely a good idea to increase the maximum number of iterations (MaxIter) and perhaps also loosen the termination tolerances (TolX and TolFun). Additionally, it will help to get diagnostic information from the solver to ensure that convergence is occurring. With that in mind, consider passing in options such as:
opt = optimoptions('quadprog','TolFun',...
1e-6,'TolX',1e-6,'Display','iter-detailed')
Depending on what your constraints are, you may be able to switch the algorithm to the trust-region-reflective method and provide Hessian information. There is a good example of this here:
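As a hedged sketch: the trust-region-reflective algorithm in quadprog accepts either bound constraints alone or linear equality constraints alone (not both), so if the long-only bounds were dropped, the budget-constrained problem could be run as something like:

```matlab
% Equality-only minimum-variance problem (no long-only bounds) with the
% trust-region-reflective algorithm; H = 2*V supplies the Hessian directly.
n   = size(V, 1);
opt = optimoptions('quadprog', 'Algorithm', 'trust-region-reflective', ...
                   'Display', 'iter-detailed');
x0  = repmat(1/n, n, 1);   % equal-weight starting point (feasible: sums to 1)
x   = quadprog(2*V, zeros(n,1), [], [], ones(1,n), 1, [], [], x0, opt);
```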



Asked: 2015-8-12
