GPU utilization is not 100%.

41 views (last 30 days)
DONGHYUN KIM 2019-5-22
[Attached screenshot: gpu.PNG, GPU usage graph]
GPU usage stays at only about 40% while running my deep learning network. It sometimes goes up to 80% for a while, but usually stays at 40%. I want to know why.
1 comment
Walter Roberson 2019-5-22
The GPU can only run at full speed if the entire problem fits into memory. That is seldom the case for deep learning: those networks are updated incrementally, so transferring images in from disk and memory uses a fair bit of time.
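You can get a feel for the read cost with a quick timing loop over your datastore (a rough sketch; the folder here is just a placeholder):
imds = imageDatastore('C:\data\trainingImages');
tic
for k = 1:100
    img = read(imds);    % each read goes through the file system
end
fprintf('Average read time: %.1f ms per image\n', 1000*toc/100);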


Answers (4)

Joss Knight 2019-5-31
Your question is very hard to answer in its current form. You want to know why GPU utilisation is not 100%? The answer is: because the GPU isn't running kernels 100% of the time. Why? I don't know, because you haven't provided any information about what you're doing. Maybe, as Walter says, a lot of time is being spent doing file I/O, perhaps because you have a very slow disk or slow network file access. Maybe you have a transformed datastore, or an imageDatastore with a custom ReadFcn, and the data processing is very complex and takes place on the CPU, blocking GPU execution while it is carried out. Maybe you have a very small network, or a low-resolution network, or too small a mini-batch size, so you are not successfully occupying all the cores on the GPU. Maybe your network is so small that the time spent running the MATLAB interpreter to generate the GPU kernels outweighs the time it takes to run those kernels.
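For example, if mini-batch size is the limiter, options along these lines (the values are only illustrative, and DispatchInBackground needs Parallel Computing Toolbox) raise GPU occupancy and prefetch batches on background workers:
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 256, ...          % larger batches keep more GPU cores busy
    'DispatchInBackground', true, ...  % read and preprocess batches on parallel workers
    'ExecutionEnvironment', 'gpu');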
If you want to know more, run the MATLAB profiler and find out where time is being spent during training.
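Something like this, with your own training call in the middle:
profile on
net = trainNetwork(imds, layers, options);   % substitute your actual training call
profile off
profile viewer   % see whether file reads, transforms, or GPU kernels dominate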
2 comments
Ali Al-Saegh 2020-12-5
Dear Joss,
I kindly invite you to help me by giving some advice on my question at
https://www.mathworks.com/matlabcentral/answers/680293-gpu-vs-cpu-in-training-time



Abolfazl Nejatian 2019-12-15
Dear Joss,
Thank you for the information you provided.
The strange thing is that when I was testing my code on Linux and building my network with Python, GPU utilization grew to around 100 percent, but on Windows with MATLAB it stays around 45 percent.
1 comment
Joss Knight 2019-12-15
It's not strange. Windows is a different operating system, different file system, and completely different (and considerably slower at allocating memory) GPU driver. Do you have a different card in your Windows machine too? All could be a problem.
Plus, if you start with a model defined in a Python framework and optimized for that, and then adapt it, we've no idea how good a job you did. If you took a MATLAB example and then converted it to Python you might have the same problem with Python. Maybe you're not successfully prefetching your data from the file system. Maybe you're not using MEX acceleration when you should be. Maybe your GPU could be put in TCC mode. That's why it's so difficult to answer your question when you're not telling us what you're doing.
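On the TCC point: you can query the current Windows driver model from MATLAB, assuming nvidia-smi is on the system path and your driver exposes that query field (TCC is only supported on GPUs that aren't driving a display):
% Query the driver model (WDDM vs TCC); switching with "nvidia-smi -dm 1"
% needs administrator rights and a reboot.
[~, out] = system('nvidia-smi --query-gpu=driver_model.current --format=csv,noheader');
disp(out)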



Abolfazl Nejatian 2019-12-16
Well, I know these are different operating systems, but the puzzling point is: with the same resources (both use a Tesla V100; in fact I installed both OSes on one machine), why can't they use the GPU to a similar percentage?
Yes, I did use MEX code in MATLAB.
Then I trained a ResNet with Python (all of the initial values were the same: input size, network layers, etc.).
There was no code conversion between MATLAB and Python; I used MATLAB functions and a pretrained net for this work, and for Python I used Keras and PyCharm.
But in the Windows environment with MATLAB my GPU utilization stays around 45%, while with Python on Linux it was around 90%!
So the question is: do you recommend I reinstall MATLAB on Linux so I can get more out of my hardware?
3 comments
Joss Knight 2019-12-16
Hi Abolfazl. I can't really recommend anything until I've seen your code. It may be as simple as changing the way you access your data; it may be that you should move to Linux; or it may be that there's nothing you can do. Maybe your Python code is grotesquely inefficient with GPU resources or spins up a lot of worthless kernels during spare cycles! It's just impossible to say. Give us your code, and run the MATLAB profiler and show us the profile report.
Markus Walser 2024-9-26 (edited)
Hi,
I'm having the same problem with low GPU usage on a Windows Server 2019 machine with current MATLAB R2024b while training a YOLOX network. The load on the GPU looks like this:
[Screenshot: GPU load on the A40]
The top profile entries of the training call trainYOLOXObjectDetector are:
[Screenshot: top profiler entries]
And the code is like this:
% Load images and box labels
oldDataPath = "L:\DataStore";
basePath = "C:\DataStore";
pathes = {
    fullfile(basePath, '10_ImageFolder', 'gTruthA.mat');
    fullfile(basePath, '10_ImageFolder', 'gtruthB.mat');
    fullfile(basePath, '20_ImageFolder', 'gtruthC.mat');
    };
rng(0);
for idx = 1:numel(pathes)
    load(pathes{idx}, 'gTruth');
    % Remap stored image paths from the old location to the local copy
    alternativePaths = {[oldDataPath basePath]};
    changeFilePaths(gTruth, alternativePaths);
    [gTruthTrain, gTruthVal] = partitionGroundTruth(gTruth, 0.8);  % custom 80/20 split helper
    if ~exist('gTruthTemp', 'var')
        gTruthTemp = gTruth;
        gTruthTrainTemp = gTruthTrain;
        gTruthValTemp = gTruthVal;
    else
        gTruthTemp = merge(gTruthTemp, gTruth);
        gTruthTrainTemp = merge(gTruthTrainTemp, gTruthTrain);
        if ~isempty(gTruthVal)
            gTruthValTemp = merge(gTruthValTemp, gTruthVal);
        end
    end
end
gTruth = gTruthTemp;
gTruthTrain = gTruthTrainTemp;
gTruthVal = gTruthValTemp;
clear gTruthTemp gTruthTrainTemp gTruthValTemp;

% Generate and combine datastores
classNames = gTruth.LabelDefinitions.Name;
imbxtrainds = combine(imageDatastore(gTruthTrain.DataSource.Source), boxLabelDatastore(gTruthTrain.LabelData));
if ~isempty(gTruthVal)
    imbxvalds = combine(imageDatastore(gTruthVal.DataSource.Source), boxLabelDatastore(gTruthVal.LabelData));
else
    imbxvalds = [];
end

% Image processing (custom preprocessing and augmentation functions)
imgSize = [96 576 3];
imbxtrainds = imbxtrainds.transform(@(x) imbxdsPreprocess(x, imgSize));
imbxtrainaugds = imbxtrainds.transform(@imbxdsAugmenter);
if ~isempty(imbxvalds)
    imbxvalds = imbxvalds.transform(@(x) imbxdsPreprocess(x, imgSize));
    imbxvalaugds = imbxvalds.transform(@imbxdsAugmenter);
end

% Create new YOLOX detector
net = yoloxObjectDetector('nano-coco', classNames, 'InputSize', imgSize);

% Train and transfer learning
imbxtrainaugds.reset();
options = trainingOptions("sgdm", ...
    InitialLearnRate=1e-3, ...
    MiniBatchSize=64, ...
    MaxEpochs=500, ...
    BatchNormalizationStatistics="moving", ...
    ResetInputNormalization=false, ...
    VerboseFrequency=10, ...
    Plots="training-progress", ...
    Shuffle="every-epoch", ...
    ValidationData=imbxvalds, ...
    ExecutionEnvironment="auto", ...
    PreprocessingEnvironment="parallel");
% Note: this call trains on imbxtrainds, the non-augmented datastore
[net, info] = trainYOLOXObjectDetector(imbxtrainds, net, options);
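In case it helps to narrow this down, this is how I would time the preprocessing pipeline on its own, without the GPU involved (a rough sketch using the datastores defined above):
reset(imbxtrainaugds);
tic
for k = 1:50
    data = read(imbxtrainaugds);   % runs preprocessing + augmentation on the CPU
end
fprintf('Average time per element: %.1f ms\n', 1000*toc/50);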
Do you have any idea how to increase the GPU usage and speed up the training process?



Lamya Mohammad 2020-2-29
Did you solve the problem? My utilization is 29% and I wish to increase it.
