Improving the consistency of the NNMF function
3 次查看(过去 30 天)
显示 更早的评论
I'm attempting to use non-negative matrix factorization on a matrix containing spectral information (A). Whenever I run the nnmf function, the output matrices W and H are usually different from any other iterations. I have found that this is stated in the help documentation for the nnmf function:
"Because the root-mean-squared residual D may have local minima, repeated factorizations may yield different W and H."
However, as a result of this, I find it difficult to make use this method to say anything scientifically meaningful, as it introduces considerable bias on my behalf (I can effectively run the function repeatedly until I come to a result that fits with my narrative).
My question: how can I get the nnmf function to return W and H matrices with higher reproducibility thereby improving my confidence in the method? I've tried tweaking the input options by decreasing the tolerances, increasing the number of replicates in the initial run, and increasing the number of iterations, all with little effect.
My code is currently very similar to what is written in the help documentation and looks like this:
numcom = 2; % The rank. My datasets typically can be described by very low-rank approximations
opt = statset('MaxIter', 10, 'Display', 'final');
[W0,H0] = nnmf(A, numcom, 'replicates', 10, 'options', opt, 'algorithm', 'mult'); %Get starting values
opt = statset('Maxiter', 1000, 'Display', 'final');
[W,H] = nnmf(A, numcom, 'w0', W0, 'h0', H0, 'options', opt,' algorithm', 'als');
Of course, I can set the random number generator to default before running the function every time:
rng('default')
But that kind of defeats the purpose ;)
1 个评论
回答(1 个)
Jakub
2019-8-19
According to my experiences I only use 'als' algorithm and with many replicates which usually gives me better estimate. So something like this:
opt = statset('Maxiter',100,'Display','final','useparallel',true);
[coeff,score] = nnmf(A, numcom,'replicates',1e6,'options',opt);
1 个评论
Guy Reading
2019-11-12
Agreed, with enough replicates hopefully the space will be adequately explored and the global max will be found each time & repeatably returned. How many is enough? Depends on how large your input space (m) is...
另请参阅
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!