One option is to store the result of each iteration in a cell array and then concatenate the cells into a single matrix at the end. See my answer to this question here: https://www.mathworks.com/matlabcentral/answers/695740-output-of-fun-is-not-a-matrix-the-same-size-as-the-block-in-block-proc#answer_577455
More specifically, something like this:
a = load('x.mat');
b = load('y.mat');
c = load('z.mat');
C = cell(1,128); % one cell per column; each cell will hold a 3x1024 block
for i = 1:128
% l = (i-1)*3+1;
% m = i*3;
first = a.ss(:,i); % a.ss is 1024x128, so first is 1024x1
second = b.kk(:,i); % b.kk is 1024x128, so second is 1024x1
third = c.zz(:,i); % c.zz is 1024x128, so third is 1024x1
first = transpose(first); % now size of first = 1x1024
second = transpose(second); % now size of second = 1x1024
third = transpose(third); % now size of third = 1x1024
% all(l:m) = [first;second;third]; % size of all = 3x1024, want size of all to be 384x1024
C{i} = [first;second;third];
end
all = vertcat(C{:}); % size of all = 384x1024; note that this name shadows MATLAB's built-in all function
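
If you would rather keep the l and m row indices from the commented-out lines, a preallocated matrix should also work. The catch is that the assignment has to index the columns as well: all(l:m) is linear indexing and addresses only 3 elements, while all(l:m,:) addresses 3 full rows. Below is a minimal sketch of that variant. The random matrices are stand-ins I made up so the snippet runs without x.mat, y.mat and z.mat, and the output name out is just a placeholder chosen to avoid shadowing the built-in all.

```matlab
% Stand-in data (assumption): random 1024x128 matrices in place of
% the variables loaded from x.mat, y.mat and z.mat.
a.ss = rand(1024,128);
b.kk = rand(1024,128);
c.zz = rand(1024,128);

n = size(a.ss,2);                 % number of columns, 128 here
out = zeros(3*n, size(a.ss,1));   % preallocate the 384x1024 result

for i = 1:n
    l = (i-1)*3 + 1;              % first row of the i-th 3-row block
    m = i*3;                      % last row of the i-th 3-row block
    % Index rows l:m AND all columns, so a 3x1024 block fits.
    out(l:m,:) = [a.ss(:,i).'; b.kk(:,i).'; c.zz(:,i).'];
end
```

Both versions should give the same 384x1024 matrix. The cell array is convenient when the block sizes are not known in advance; preallocating avoids the extra concatenation step when they are.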