Improving Performance with Parallel Computing

Factors That Affect Speed

Several factors can affect the execution speed of parallel processing:

Parallel overhead. Calling parfor instead of for incurs overhead. If function evaluations are fast, this overhead can become appreciable. In that case, solving a problem in parallel can be slower than solving it serially.
No nested parfor loops. This is described in Nested Parallel Functions. parfor does not work in parallel when called from within another parfor loop. If you have programmed your objective or constraint functions to take advantage of parallel processing, the limitation of no nested parfor loops may cause a solver to run more slowly than you expect. In particular, the parallel computation of finite differences takes precedence, since that is an outer loop. This causes any parallel code within the objective or constraint functions to execute serially.
When executing serially, parfor loops run slower than for loops. Therefore, for best performance, ensure that only your outermost parallel loop calls parfor. For example, suppose your code calls fmincon within a parfor loop. For best performance in this case, set the fmincon UseParallel option to false.
Passing parameters. Parameters are automatically passed to worker machines during the execution of parallel computations. If there are many parameters, or if they occupy a large amount of memory, passing them can slow the execution of your computation.
Contention for resources: network and computing. If the network of worker machines has low bandwidth or high latency, computation could be slowed.
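
One way to check whether parallel overhead outweighs the benefit is to time the same problem with UseParallel set to false and then to true. The following is a minimal sketch, not a benchmark: the objective fun and the start point x0 are hypothetical, and the sketch assumes a Parallel Computing Toolbox license. The pool is opened before timing so that pool startup is not counted as solver time.

```matlab
% Hypothetical cheap objective; for a function this fast,
% parallel overhead is likely to dominate the run time.
fun = @(x) sum((x - 1).^2);
x0 = zeros(20,1);

% Open a pool before timing, so that startup cost is excluded.
if isempty(gcp('nocreate'))
    parpool;
end

optsSerial   = optimoptions('fmincon','Display','off','UseParallel',false);
optsParallel = optimoptions('fmincon','Display','off','UseParallel',true);

tic
xSerial = fmincon(fun,x0,[],[],[],[],[],[],[],optsSerial);
tSerial = toc;

tic
xParallel = fmincon(fun,x0,[],[],[],[],[],[],[],optsParallel);
tParallel = toc;

fprintf('Serial: %.3f s, parallel: %.3f s\n',tSerial,tParallel);
```

The timings depend on the machine, the pool size, and the cost of the objective, so treat a single run as an indication only.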

Factors That Affect Results

Some factors may affect numerical results when using parallel processing. There are more caveats related to parfor listed in Parallel for-Loops (parfor) (Parallel Computing Toolbox).

Persistent or global variables. If your objective or constraint functions use persistent or global variables, these variables may take different values on different worker processors. Furthermore, they may not be cleared properly on the worker processors. As a result, solvers can throw errors such as size mismatches.
Accessing external files. The order of computations is not guaranteed during parallel processing, so external files may be accessed in an unpredictable order, leading to unpredictable results.
Accessing external files. If two or more processors try to read an external file simultaneously, the file may become locked, leading to a read error, and halting the execution of the optimization.
If your objective function calls Simulink®, results may be unreliable with parallel gradient estimation.
Noncomputational functions, such as input, plot, and keyboard, might behave unexpectedly when used in objective or constraint functions. When called in a parfor loop, these functions are executed on worker machines. A function that waits for input can therefore cause a worker to become nonresponsive.
parfor does not allow break or return statements.
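
One common way to avoid the persistent and global variable issue above is to pass extra data to the objective through an anonymous function rather than through shared state. The following is a minimal sketch under stated assumptions: the parameters a and b and the quadratic objective are hypothetical, chosen only to illustrate the pattern.

```matlab
% Hypothetical problem data, captured by value in the anonymous function
% instead of being stored in global or persistent variables.
a = 2;
b = 5;
fun = @(x) a*(x(1) - 1)^2 + b*(x(2) + 2)^2;

opts = optimoptions('fmincon','Display','off','UseParallel',true);

% The captured values of a and b travel with the function handle,
% so each worker evaluates the same objective.
x = fmincon(fun,[0;0],[],[],[],[],[],[],[],opts);
```

Because the values are fixed when the handle is created, every worker sees the same a and b, regardless of the order in which evaluations run.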

Searching for Global Optima

To search for global optima, one approach is to run a solver from a variety of initial points. If you distribute those runs over a number of processors using the parfor function, you disable parallel gradient estimation, since parfor loops cannot be nested. Your optimization usually runs more quickly if you distribute the runs over all the processors, rather than running them serially with parallel gradient estimation, so disabling parallel gradient estimation probably will not slow your computation. If you have more processors than initial points, though, it is not clear whether it is better to distribute initial points or to enable parallel gradient estimation.
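
The approach described above can be sketched as follows. This is an illustrative example, not a recommended configuration: the objective, bounds, and number of start points are hypothetical. The fmincon UseParallel option is set to false because the outer parfor loop already occupies the workers.

```matlab
% Hypothetical objective with more than one local minimum in x(1).
fun = @(x) x(1)^4 - 3*x(1)^2 + x(1) + x(2)^2;
lb = [-3 -3];
ub = [3 3];

rng(0)                                   % reproducible start points
numStarts = 8;
startPoints = -2 + 4*rand(numStarts,2);  % one start point per row

% The outer loop is parallel, so keep the solver itself serial.
opts = optimoptions('fmincon','Display','off','UseParallel',false);

xs = zeros(numStarts,2);
fvals = zeros(numStarts,1);
parfor k = 1:numStarts
    [xk,fk] = fmincon(fun,startPoints(k,:),[],[],[],[],lb,ub,[],opts);
    xs(k,:) = xk;      % sliced output variables
    fvals(k) = fk;
end

% Keep the best local solution found.
[bestFval,idx] = min(fvals);
bestX = xs(idx,:);
```

Each iteration writes only to its own row of xs and its own element of fvals, which lets parfor treat them as sliced variables.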

If you have a Global Optimization Toolbox license, you can use the MultiStart (Global Optimization Toolbox) solver to examine multiple start points in parallel. See Parallel Computing (Global Optimization Toolbox) and Parallel MultiStart (Global Optimization Toolbox).
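
As a minimal sketch of the MultiStart approach, assuming both Global Optimization Toolbox and Parallel Computing Toolbox licenses, the following runs fmincon from 20 start points. The objective, bounds, and number of start points are hypothetical. Depending on your parallel preferences, you might need to open a pool with parpool before calling run.

```matlab
% Hypothetical objective and bounds.
fun = @(x) x(1)^4 - 3*x(1)^2 + x(1) + x(2)^2;

problem = createOptimProblem('fmincon', ...
    'objective',fun, ...
    'x0',[0 0], ...
    'lb',[-3 -3], ...
    'ub',[3 3], ...
    'options',optimoptions('fmincon','Display','off'));

% MultiStart distributes the start points over the workers.
ms = MultiStart('UseParallel',true,'Display','off');

% Run the local solver from 20 start points and return the best result.
[x,fval] = run(ms,problem,20);
```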