CUDA out of memory
3 views (last 30 days)
I use MATLAB R2017a to train a deep CNN, but I get a CUDA out-of-memory error when I try to train deep networks such as a VGG net. I use a GTX 1070 GPU with 8 GB of memory, which I think should be enough to train a VGG net. I even tried training on a Titan X GPU, but the same error occurs! Can anyone help with this problem?
2 comments
Joss Knight
2017-6-16
I think you're going to have to show us the code you used to create the network, and what your trainingOptions are. The default training options use a MiniBatchSize of 128 which is probably too big to train a VGG network.
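As the comment above suggests, the usual first step is to lower `MiniBatchSize` from its default of 128. A minimal sketch (the concrete values and the `augimds`/`layers` variables are illustrative assumptions, not tuned settings):

```matlab
% Sketch: train with a much smaller mini-batch to reduce peak GPU memory.
% 'augimds' and 'layers' stand in for your own datastore and layer array.
opts = trainingOptions('sgdm', ...
    'MiniBatchSize',16, ...       % far below the default of 128
    'MaxEpochs',10, ...
    'InitialLearnRate',1e-4);
net = trainNetwork(augimds,layers,opts);
```

Smaller batches trade training speed (and slightly noisier gradients) for a roughly proportional drop in activation memory.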
silver tena
2021-8-24
I have an out-of-memory problem with VGG16. My GPU is an NVIDIA T1650, 16 GB. I need all of your help.
net = vgg16;
inputSize = net.Layers(1).InputSize; % original code
layersTransfer = net.Layers(1:end-3);
numClasses = numel(categories(imdsTrain.Labels)); % original code
layers = [
    layersTransfer
    fullyConnectedLayer(numClasses,'WeightLearnRateFactor',20,'BiasLearnRateFactor',20)
    softmaxLayer
    classificationLayer];
pixelRange = [-30 30];
%scaleRange = [0.9 1.1];
imageAugmenter = imageDataAugmenter( ...
    'RandXReflection',true, ...
    'RandXTranslation',pixelRange, ...
    'RandYTranslation',pixelRange);
%   'RandXScale',scaleRange, ...
%   'RandYScale',scaleRange);
augimdsTrain = augmentedImageDatastore(inputSize(1:2),imdsTrain, ...
    'DataAugmentation',imageAugmenter);
augimdsValidation = augmentedImageDatastore(inputSize(1:2),imdsValidation);
% Read the parameters (input from the GUI)
max_epoch = str2double(get(handles.edit1,'String'));
miniBatchSize = str2double(get(handles.edit2,'String'));
learn_rate = str2double(get(handles.edit3,'String'));
momentum = str2double(get(handles.edit4,'String'));
options = trainingOptions('sgdm', ...
    'MiniBatchSize',miniBatchSize, ...
    'MaxEpochs',max_epoch, ...
    'InitialLearnRate',learn_rate, ...
    'Momentum',momentum, ...
    'Shuffle','every-epoch', ...
    'ValidationData',augimdsValidation, ...
    'ValidationFrequency',3, ...
    'Verbose',false, ...
    'Plots','training-progress');
% Training
rng('default')
netTransfer = trainNetwork(augimdsTrain,layers,options);
% Feature extraction
layer = 'fc7';
featuresTrain = activations(netTransfer,augimdsTrain,layer,'OutputAs','rows');
featuresVal = activations(netTransfer,augimdsValidation,layer,'OutputAs','rows');
My problem is an "out of memory" error with VGG16. My dataset has 1200 images. I set the MiniBatchSize to 4 or 5, but the result is still the "out of memory" error message. I need help from all of you. May God bless you.
Answers (2)
dekwe
2017-6-23
I also have the same problem; my GTX 1080 Ti has 11.8 GB of memory. The MiniBatchSize is set to 16. What is the memory required for imagenet-vgg-verydeep-16?
2 comments
Joss Knight
2017-6-26
Edited: Joss Knight
2017-6-26
I was able to train VGG16 on my GTX 1080 with MiniBatchSize up to 80 or so, and that has only 8.5GB of memory. Beyond that I started to get issues with kernel timeouts on my Windows machine, but I could see looking at nvidia-smi output that this was using nearly all the memory.
You may have to share your code for creating and training the network for us to diagnose this. Also, what version of MATLAB are you using? R2016b and before are missing some important memory optimisations.
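Instead of watching nvidia-smi externally, the same information can be read from inside MATLAB with `gpuDevice` (this assumes Parallel Computing Toolbox is installed; the formatting below is just an illustrative sketch):

```matlab
% Sketch: report free vs. total memory on the currently selected GPU.
g = gpuDevice;   % query the active GPU device
fprintf('%s: %.1f of %.1f GB free\n', ...
    g.Name, g.AvailableMemory/1e9, g.TotalMemory/1e9);
% reset(g) clears stale allocations left over from earlier runs,
% which is often worth trying before concluding the GPU is too small.
```

Checking `AvailableMemory` immediately before training makes it easy to see whether other processes, or a previous MATLAB session's leftovers, are eating into the budget.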
Deng Cao
2017-9-13
I also have the same problem. I am using a Quadro M1000M, which has compute capability 5.0 and 4 GB of memory. I tried the smallest MiniBatchSize of 4 and still get an out-of-memory error. I am training on 1080p images using Faster R-CNN for object detection. The GPU I am using might not be the best, but people can already train VGG on a mobile device with OpenCV and TensorFlow. This issue must be solvable.
dekwe
2017-7-4
I am using R2017, and I get the error even with a MiniBatchSize of 4. My images were 6000x4000; I tried 1000x1000 and got similar problems. So it is surely a simple problem, but I don't know where to look!
2 comments
dekwe
2017-7-4
Error using gpuArray/max An unexpected error occurred trying to launch a kernel. The CUDA error was: out of memory
Joss Knight
2017-7-18
Well, those images are enormous and far too big to be input to VGG16, which takes 224-by-224 images. So have you modified the network to support large images? If so, the size of the data needed to propagate activations through the network also goes up, and thus the effective capacity of your GPU drops dramatically.
Alternatively, perhaps you are resizing the images before you pass them to the network. Maybe you are resizing them on the GPU and thus filling up GPU memory with large batches of oversized images that don't need to be there.
So as previously stated, not much can be diagnosed without seeing your code.
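One way to act on the advice above is to let a datastore resize the images on the CPU before they ever reach the GPU. A minimal sketch (the `imds` variable stands in for your own `imageDatastore`):

```matlab
% Sketch: downscale oversized images to VGG16's input size on the CPU,
% so full-resolution 6000x4000 images never occupy GPU memory.
inputSize = [224 224];                              % VGG16 input height/width
augimds = augmentedImageDatastore(inputSize, imds); % resizes per mini-batch
```

`augmentedImageDatastore` performs the resize lazily per mini-batch on the host, so only the small resized batch is transferred to the GPU.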