There are a couple of reasons why your parfor loop is slower than the equivalent for loop. Firstly, there's the data transfer overhead: you're transferring a fairly large amount of data from the workers back to the client. That data has to be serialized on the worker (basically like calling save on the data, but without using a file), sent to the client, and then deserialized (the equivalent of load).
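If you want to see how much data is actually moving, ticBytes and tocBytes (Parallel Computing Toolbox) can measure it. The following is only a rough sketch: the loop body, the matrix size n, and the iteration count numIter are hypothetical stand-ins, since I'm guessing at the shape of your real loop.

```matlab
% Hypothetical stand-in for the loop in the question: every iteration
% sends an n-by-n result back to the client.
n = 500;                 % assumed matrix size
numIter = 20;            % assumed number of iterations
out = cell(1, numIter);

pool = gcp();            % start (or reuse) a parallel pool
ticBytes(pool);          % begin counting bytes sent to/from the workers
parfor k = 1:numIter
    A = rand(n);
    out{k} = A * A;      % serialized on the worker, deserialized on the client
end
transferred = tocBytes(pool);  % one row per worker: [sent to worker, received from worker]

fprintf('Bytes returned to the client: %.0f\n', sum(transferred(:, 2)));
```

If the number reported is large relative to the amount of computation per iteration, the transfer cost can easily dominate.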
Secondly, and probably most importantly in this case, if you're using only the local cluster type, then unfortunately this particular loop is pretty much guaranteed to be slower with parfor than with for. That's because the for loop version is already efficiently multi-threaded through mtimes; essentially, it's already taking full advantage of all the cores on your computer. Each worker in a parfor loop runs in single-threaded mode by default (so that several workers running at once don't overload your computer), which means each individual call to mtimes is slower on a worker than on the client.
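You can get a feel for the single-threaded penalty without a pool at all, by temporarily limiting the client to one computational thread with maxNumCompThreads. This is just an illustrative sketch: the matrix size n is an assumption, and a single-threaded client only approximates what a worker does.

```matlab
n = 2000;                          % assumed matrix size
A = rand(n);
B = rand(n);

% Default client behaviour: mtimes uses multiple threads.
tMulti = timeit(@() A * B);

% Restrict to one thread, roughly mimicking a default parfor worker.
prevThreads = maxNumCompThreads(1);    % returns the previous setting
tSingle = timeit(@() A * B);
maxNumCompThreads(prevThreads);        % restore the original thread count

fprintf('multi-threaded:  %.4f s\n', tMulti);
fprintf('single-threaded: %.4f s\n', tSingle);
```

On a multi-core machine I'd expect the single-threaded timing to be noticeably worse, which is the gap parfor has to make up before it can come out ahead.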