MATLAB allocates GPU memory through the CUDA driver API when it needs to, but it maintains a memory pool and avoids raw allocations where possible, which considerably improves performance. So, to answer your question: most of the time, when you call "mxGPUCreateGPUArray" or "mxGPUCreateFromMxArray", no raw allocation happens at all, because the memory comes from the pool.
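To illustrate the pooling idea in general terms, here is a minimal C++ sketch of a caching allocator. It is not MATLAB's actual implementation, which isn't public. The class name CachingPool, its methods, and the helper count_raw_allocations are all made up for illustration, and std::malloc stands in for a raw device allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <vector>

// Hypothetical caching pool (illustration only, not MATLAB's code).
// Released blocks are kept and reused by exact size, so the expensive
// raw allocator runs only on a cache miss.
class CachingPool {
public:
    ~CachingPool() {
        // Return every cached block to the raw allocator.
        // Blocks the caller never released are not freed here.
        for (auto& entry : free_blocks_)
            for (void* p : entry.second) std::free(p);
    }

    void* acquire(std::size_t bytes) {
        auto it = free_blocks_.find(bytes);
        if (it != free_blocks_.end() && !it->second.empty()) {
            void* p = it->second.back();  // cache hit: no raw allocation
            it->second.pop_back();
            in_use_[p] = bytes;
            return p;
        }
        void* p = std::malloc(bytes);     // stand-in for a raw device allocation
        if (p == nullptr) return nullptr;
        ++raw_allocations_;
        in_use_[p] = bytes;
        return p;
    }

    void release(void* p) {
        auto it = in_use_.find(p);
        if (it == in_use_.end()) return;  // not owned by this pool: ignore
        free_blocks_[it->second].push_back(p);  // keep the block for reuse
        in_use_.erase(it);
    }

    int raw_allocations() const { return raw_allocations_; }

private:
    std::map<std::size_t, std::vector<void*>> free_blocks_;  // size -> cached blocks
    std::map<void*, std::size_t> in_use_;                    // block -> size
    int raw_allocations_ = 0;
};

// Acquire and release a same-sized block `rounds` times and report how
// many raw allocations that took.
inline int count_raw_allocations(int rounds) {
    CachingPool pool;
    for (int i = 0; i < rounds; ++i) {
        void* p = pool.acquire(1 << 20);
        assert(p != nullptr);
        pool.release(p);
    }
    return pool.raw_allocations();
}
```

With this sketch, repeatedly creating and destroying a same-sized buffer hits the raw allocator only on the first request. That is the kind of saving described above, although a real pool would also have to deal with differing sizes, fragmentation, and handing memory back under pressure.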
Unified Memory doesn't eliminate data transfers; it just places the transfer of data between CPU and GPU in the hands of the CUDA runtime. MATLAB's model is to place data transfer in the hands of the user.
Future versions of MATLAB may seek to leverage Unified Memory to extend the memory capacity of the GPU, although this will probably have to wait until the minimum supported architecture is Pascal, since that is the first with efficient paging.
Of course, the GPU MEX API is intended to help the user pass GPU data to and from the MATLAB workspace in the form of gpuArray variables. If you don't intend to use them, you can allocate memory however you like in a MEX function.
