How to maximize MATLAB's GPU utility?
I've benchmarked my GPU against itself and against the CPU for varying matrix sizes, and found the opposite of what most GPU literature suggests: the GPU's computing advantage diminishes with array size. Code, results, and specs are shown below. Noteworthy observations:
(1) GPU utilization stays below 10%, according to Task Manager.
(2) RAM and CPU usage are about 50% and 20%, respectively, for large (K > 9000) arrays.
(3) The speed ratio drops considerably for around K > 8000.
(4) For K > 8000 (K = 9000), splitting the Xga matrix into four doubles the vectorized speed.
(5) My GPU ranks far higher among GPUs than my CPU does among CPUs (#24 vs. #174); it thus seems that a CPU of comparable rank would outperform the GPU for larger arrays.
(6) The GPU vs. CPU benchmark in the last picture supports (5): the GPU isn't as vastly superior as expected.
What's the culprit: is my code, MATLAB, or the hardware configuration under-utilizing the GPU? How can I find out and resolve it? m-files: testrun.zip (testrun compares performance for a single K; testrun0 for multiple values of K)
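One quick check that may help narrow this down is to estimate how large the intermediate array in the vectorized expression gets. The sketch below is only back-of-the-envelope arithmetic using the sizes stated in this post (m = 16000, 3 columns, single precision, K = 9000). It assumes the expression materializes at least one temporary of size [m, 3, K]; how many temporaries MATLAB actually creates is not something the post establishes. The GPU query is skipped if no supported device is found.

```matlab
% Rough size of one m-by-d-by-K temporary, as produced by X - permute(c,[3 2 1]).
m = 16000;            % rows of X (from the post)
d = 3;                % columns of X and c (from the post)
K = 9000;             % number of centroids where the slowdown was observed
bytesPerElem = 4;     % single precision

tempBytes = m * d * K * bytesPerElem;   % bytes in one [m, d, K] single array
tempGB = tempBytes / 1e9;
fprintf('One [m, d, K] temporary: %.3f GB\n', tempGB);

% Compare with free GPU memory, if Parallel Computing Toolbox and a GPU exist.
if exist('gpuDeviceCount', 'file') && gpuDeviceCount > 0
    g = gpuDevice;
    fprintf('GPU memory available: %.3f GB\n', g.AvailableMemory / 1e9);
end
```

If the temporary is a large fraction of the available GPU memory, memory pressure would be one candidate explanation for observations (3) and (4), though this estimate alone does not confirm it.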
%% CODE: centroid indexing in K-means algorithm
% size(X) = [16000, 3]
% size(c) = [K, 3]
% Xsg = single(X); csg = single(c);
% Xga = gpuArray(Xsg); cga = gpuArray(csg);
% Speed ratio = t2/t1, if t2 > t1 - else, t1/t2
%% TIMING
% timeit and gputimeit expect function handles, so wrap each call in @()
f1 = @() fasterFunction(...); % e.g. @() vectorized(Xga, cga, K, m)
f2 = @() slowerFunction(...); % e.g. @() forVectorized(X, c, m)
t1 = gputimeit(f1) % OR timeit(f1) for non-GPU arrays
t2 = timeit(f2) % OR gputimeit(f2) for GPU arrays
%% FUNCTIONS
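function out = vectorized(X, c, K, m)
    % Squared distances via implicit expansion: [m,3] - [1,3,K] -> [m,3,K],
    % summed over dim 2 and reshaped to [m,K]; out(i) is the nearest centroid.
    [~, out] = min(reshape(sum((X - permute(c,[3 2 1])).^2, 2), m, K), [], 2);
end
function out = forVectorized(X, c, m)
    out = zeros(m,1);
    for j = 1:m
        % Compare point j against all K centroids at once ([3,1] - [3,K])
        [~, out(j)] = min(sum((X(j,:)' - c').^2));
    end
end
function out = forFor(X, c, K, m)
    out = zeros(m,1); idxtemp = zeros(K,1);
    for i = 1:m
        for j = 1:K
            % Squared distance between point i and centroid j
            idxtemp(j) = sum((X(i,:) - c(j,:)).^2, 2);
        end
        [~, out(i)] = min(idxtemp);
    end
end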
%% PLOTS
% GPU vectorized = vectorized(Xga, cga, K, m) for varying K, timed w/ gputimeit
% CPU vectorized = vectorized(Xsg, csg, K, m) for varying K, timed w/ timeit
% for-loop = forFor(Xsg, csg, K, m) for varying K, timed w/ timeit
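A possible variation on the vectorized function, offered as a sketch and not as a confirmed fix: since ||x - c||^2 = ||x||^2 - 2*x*c' + ||c||^2 and ||x||^2 is the same for every centroid, the nearest centroid can be found from ||c||^2 - 2*x*c' alone. This needs only an [m, K] matrix, not an [m, 3, K] array. The snippet runs on small CPU arrays so it can be checked anywhere; the variable names (idxRef, idxAlt, idxChunked, cNorm, nChunks) are illustrative and not from testrun.zip, and whether this is faster on the GPU in question is untested here.

```matlab
rng(0);
m = 200; K = 7;                      % small sizes, for a quick check only
X = rand(m, 3, 'single');
c = rand(K, 3, 'single');

% Reference: broadcast form from the post, builds an [m, 3, K] temporary
[~, idxRef] = min(reshape(sum((X - permute(c, [3 2 1])).^2, 2), m, K), [], 2);

% Alternative: drop the per-row constant ||x||^2 and use one matrix product
cNorm = sum(c.^2, 2).';              % [1, K] squared centroid norms
[~, idxAlt] = min(cNorm - 2 * (X * c.'), [], 2);

% Same idea in row chunks, mirroring observation (4) about splitting X
nChunks = 4;
edges = round(linspace(0, m, nChunks + 1));
idxChunked = zeros(m, 1);
for k = 1:nChunks
    rows = edges(k) + 1 : edges(k + 1);
    [~, idxChunked(rows)] = min(cNorm - 2 * (X(rows, :) * c.'), [], 2);
end

% Results should agree, apart from possible rounding differences on near-ties
fprintf('Mismatches vs. reference: %d (alt), %d (chunked)\n', ...
    nnz(idxAlt ~= idxRef), nnz(idxChunked ~= idxRef));
```

To time it on the GPU, the same expressions could be wrapped in a function handle and passed to gputimeit with gpuArray inputs, as in the TIMING section above.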





