I'm not sure what causes the sharp change between iterations, but I can reproduce roughly similar timings when I call svd with 3 outputs and without the "econ" option.
For inputs of this size, most of the time will be spent computing the third output matrix, so the run time should be expected to grow quadratically with the number of iterations. I do not believe different algorithms are used here, but different block sizes for splitting up the problem might be part of the reason, as might hardware-dependent memory or cache effects.
If it's possible to do your computation using the economy mode output of SVD, this would likely improve performance significantly.
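
As a rough way to check this on your own machine, here is a minimal sketch that times both calls on the same matrix. The matrix sizes m and n are placeholders I made up, not the sizes from your question, so substitute your own; the timings will also depend on your hardware and MATLAB release.

```matlab
% Hypothetical tall matrix; m and n are assumptions for illustration only.
m = 5000;
n = 50;
A = rand(m, n);

% Full decomposition: U is m-by-m, S is m-by-n, V is n-by-n.
tic;
[U, S, V] = svd(A);
tFull = toc;

% Economy-size decomposition: Ue is m-by-n, Se is n-by-n, Ve is n-by-n.
tic;
[Ue, Se, Ve] = svd(A, "econ");
tEcon = toc;

fprintf("full: %.3f s, econ: %.3f s\n", tFull, tEcon);

% Both factorizations should reconstruct A up to rounding error.
disp(norm(A - U*S*V', "fro"));
disp(norm(A - Ue*Se*Ve', "fro"));
```

If the rest of your computation only needs the leading columns of the factors, as the reconstruction above does, the economy-size outputs can be used as a drop-in replacement.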

