This is a question for Wikipedia (see "Simulated annealing"), because it has no direct relation to Matlab. But such a relation is easy to construct, so let me try in spite of the off-topic nature:
The goal is to find a global maximum. Imagine the function to be optimized as a smooth surface with many local hills. A local optimization method, e.g. a steepest ascent (gradient) method, would climb to the nearest local maximum, but very likely not to the global maximum.
Therefore simulated annealing adds large random noise at first, which lets the current point move almost freely over the complete domain of definition. This can be interpreted as a "high temperature", such that the changes are dominated by a kind of "Brownian motion". Then the "temperature" is reduced following a specific method (the cooling schedule), and the local optimization rules the motion more and more. Finally there is a high chance that a global maximum has been found when the temperature approaches zero.
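To make the idea concrete, here is a minimal sketch in Python (not Matlab, and not taken from any toolbox). It assumes the common Metropolis acceptance rule and a geometric cooling schedule, neither of which is spelled out in the description above. The names simulated_annealing and bumpy, the test function, and all parameter values are made up for illustration only.

```python
import math
import random


def simulated_annealing(f, x0, lo, hi, t_start=2.0, t_end=1e-3,
                        cooling=0.995, step=0.5, seed=0):
    """Search for a maximum of f on [lo, hi], starting from x0.

    Assumptions of this sketch: Gaussian random steps, Metropolis
    acceptance, geometric cooling t <- cooling * t.
    """
    rng = random.Random(seed)          # fixed seed: reproducible runs
    x, fx = x0, f(x0)
    best_x, best_fx = x, fx
    t = t_start
    while t > t_end:
        # Random move, clipped so the point stays inside the domain
        cand = min(hi, max(lo, x + rng.gauss(0.0, step)))
        fc = f(cand)
        # Uphill moves are always accepted. Downhill moves are accepted
        # with probability exp((fc - fx) / t): often at high temperature,
        # almost never once the temperature is low.
        if fc >= fx or rng.random() < math.exp((fc - fx) / t):
            x, fx = cand, fc
            if fx > best_fx:
                best_x, best_fx = x, fx
        t *= cooling                   # reduce the "temperature"
    return best_x, best_fx


def bumpy(x):
    # Hypothetical test function: many local hills, global maximum at x = 0
    return math.cos(3.0 * x) * math.exp(-0.1 * x * x)


best_x, best_fx = simulated_annealing(bumpy, x0=4.0, lo=-5.0, hi=5.0)
print(best_x, best_fx)
```

At high t nearly every move is accepted, which is the "Brownian motion" phase. As t shrinks, downhill moves are rejected more and more often, so the search behaves like a local hill climber. A slower cooling (cooling closer to 1) usually raises the chance of ending on the global maximum, at the price of more function evaluations.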