The following link describes the data type accepted by the 'MultiScaleTrainingSizes' name-value pair argument.
In essence, it needs to be an M-by-2 matrix, where each row has the form [height width].
Do keep in mind the note for this parameter: every size you specify must be greater than or equal to the input size of the network's image input layer.
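
As a rough illustration of both requirements, here is a minimal sketch that checks a candidate matrix before you pass it in. The variable names and the example sizes (a 128-by-128-by-3 input layer) are assumptions made up for this example, not values from your network, and the final commented line assumes you are calling trainYOLOv2ObjectDetector, which the question does not state.

```matlab
% Assumed input size of the image input layer: [height width channels].
inputSize = [128 128 3];

% Candidate training sizes, one [height width] pair per row (M-by-2).
trainingSizes = [128 128; 160 160; 192 192];

% Check the shape: any number of rows, exactly two columns.
assert(ismatrix(trainingSizes) && size(trainingSizes, 2) == 2, ...
    'Sizes must be an M-by-2 matrix of [height width] rows.');

% Check that no row is smaller than the input layer's [height width].
% (The row-wise comparison relies on implicit expansion, R2016b or later.)
tooSmall = any(trainingSizes < inputSize(1:2), 2);
assert(~any(tooSmall), 'Some sizes are smaller than the input layer size.');

% Hypothetical call, assuming trainingData, lgraph and options already exist:
% detector = trainYOLOv2ObjectDetector(trainingData, lgraph, options, ...
%     'MultiScaleTrainingSizes', trainingSizes);
```

If either assertion fails, adjust the offending rows before training.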
HTH