Having a billion cores will not help if you have a problem that is inherently not parallel.
Think of code dominated by small computations, branches, if statements, etc. Do one thing, then do the next. You don't always know exactly what you will do next, because it may depend on what just happened. This is true in many codes, including fmincon. The individual computations are probably just too small to be worth spreading over multiple cores.
MATLAB automatically multithreads many computations. For example, multiply two large matrices. If they are sufficiently large, then MATLAB will farm the work out to multiple cores on the fly. Too small, and multithreading would actually reduce throughput, because of the additional overhead.
For example, suppose I create a matrix of size 5000x5000 and multiply it by itself. I bet my machine will automatically start up 4 cores. The fan might even kick on if I make the matrix sufficiently larger than that.
A = randn(5000);
B = A*A;
So I tried that with the 5K matrix. 4 cores kicked into operation, but only for a few seconds, and the fan never came on. Next, I tried this:
A = randn(10);
for i = 1:1000000
    B = A*A;
end
Only one core. Surprisingly, I did get 4 cores running in this loop though:
A = randn(100);
for i = 1:100000
    B = A*A;
end
But it takes a problem that MATLAB can effectively (and intelligently) farm out to multiple processors.
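If you would rather measure than watch a CPU monitor, here is a rough sketch that times A*A at a few sizes and converts the time to an achieved rate. The sizes are arbitrary choices of mine, and the 2*n^3 operation count is just the usual estimate for a dense multiply. A jump in the rate as n grows hints that more cores are in play, though cache effects can also move the numbers, so treat it as suggestive only.

```matlab
% Sketch: time A*A at a few sizes and report the achieved rate.
% The sizes are arbitrary; where (or whether) multithreading kicks in
% depends on the machine and the MATLAB release.
sizes = [10 100 1000 2000];
t = zeros(size(sizes));
for k = 1:numel(sizes)
    A = randn(sizes(k));
    t(k) = timeit(@() A*A);      % typical time for one multiply, in seconds
end
gflops = 2*sizes.^3 ./ t / 1e9;  % a dense n-by-n multiply costs about 2*n^3 flops
disp([sizes(:), t(:), gflops(:)])  % columns: n, seconds, Gflop/s
```

For the tiny 10x10 case, timeit may warn that the function runs too fast to time accurately. That is expected, and is rather the point.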
In some cases, if you have the Parallel Computing Toolbox, you can make those decisions yourself, but not all problems will be amenable to that.
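As a minimal sketch of making that decision yourself, parfor can spread a loop over workers when the iterations are completely independent of each other. The task below (the spectral radius of a small random matrix) and the count nProblems are made up purely for illustration. As far as I know, without the toolbox parfor simply runs as an ordinary serial loop.

```matlab
% Sketch: many small, independent problems, one per loop iteration.
% nProblems and the task itself are hypothetical, chosen only to illustrate.
nProblems = 200;
result = zeros(nProblems, 1);
parfor k = 1:nProblems
    M = randn(50);                 % each iteration builds its own small matrix
    result(k) = max(abs(eig(M)));  % spectral radius; needs nothing from other iterations
end
```

This only works because no iteration needs a result from another one. A loop where each step depends on the previous step, as inside an optimizer, cannot be split up this way.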
One thing that you can do is set maxNumCompThreads. On my machine, it is set at:
maxNumCompThreads
ans =
4
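You can also change that limit. Called with a number, maxNumCompThreads sets the limit and returns the previous value, which makes it easy to restore afterwards. Here is a sketch comparing one thread against the default for a single large multiply. The size 2000 is an arbitrary choice, and the ratio you see will depend on your machine.

```matlab
% Sketch: compare A*A using the default thread limit and a single thread.
A = randn(2000);                  % arbitrary size, large enough to be multithreaded on most machines
tAll = timeit(@() A*A);           % time with the current (default) limit
prevN = maxNumCompThreads(1);     % restrict to one thread, remember the old limit
tOne = timeit(@() A*A);           % time with a single computational thread
maxNumCompThreads(prevN);         % restore the previous limit
fprintf('%d threads: %.3f s, 1 thread: %.3f s\n', prevN, tAll, tOne)
```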
