Train a deep learning model (ResNet-50 network) on a remote HPC cluster

I am trying to run code that uses a pre-trained ResNet-50 network on a remote HPC cluster by submitting batch GPU jobs. I get the following error at this line:
net = resnet50
Error using resnet50
resnet50 requires the Deep Learning Toolbox Model for ResNet-50 Network support
package for the pretrained weights. To install this support package, use the <a
href="matlab:
matlab.addons.supportpackage.internal.explorer.showSupportPackages('RESNET50',
'tripwire')">Add-On Explorer</a>. To obtain the untrained layers, use
resnet50('Weights','none'), which does not require the support package.
It seems the Deep Learning Toolbox Model for ResNet-50 Network add-on is not installed on the cluster. How can I install this add-on on it?
Thanks

Accepted Answer

David Willingham 2022-10-14
Just to confirm, you're sending batch jobs to an HPC cluster that has MATLAB Parallel Server installed?
If so, one option to try would be:
  1. Save resnet50 as a MAT file.
  2. Attach the MAT file when submitting the job.
  3. Have a load MAT file command in the function you're submitting.
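As a rough sketch of the three steps above, the client-side script might look like the following. The cluster profile name `myHPCProfile` and the function `trainOnCluster` are placeholders (assumptions, not names from this thread). `trainOnCluster` stands for your own function file, which would call `load('basenet.mat')` instead of `resnet50`. The sketch also assumes the ResNet-50 support package is installed on the local machine and that Deep Learning Toolbox itself is available on the cluster.

```matlab
% --- On the local machine, where the ResNet-50 support package is installed ---
basenet = resnet50;                     % pretrained network, loaded locally
save('basenet.mat', 'basenet');         % step 1: store the network in a MAT file

% --- Submit the batch job ---
% 'myHPCProfile' is a placeholder for your own cluster profile name.
c = parcluster('myHPCProfile');

% trainOnCluster is a hypothetical function (one output, no inputs) that
% loads basenet.mat on the worker instead of calling resnet50 (step 3).
job = batch(c, @trainOnCluster, 1, {}, ...
    'AttachedFiles', {'basenet.mat'});  % step 2: send the MAT file with the job

wait(job);                              % block until the job finishes
out = fetchOutputs(job);                % cell array with the function's outputs
```

Attached files are copied to the workers and placed on their path, so a plain `load('basenet.mat')` inside the submitted function should find the file without a full path.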
  1 Comment
EK_47 2022-10-14
Brilliant! Thank you for your answer. It solved my problem.
Yes, the HPC cluster has MATLAB Parallel Server installed.
In point 1 you said "save resnet50 as a MAT file". I was not sure what "save resnet50" meant, so I simply called it in MATLAB on my local machine:
basenet = resnet50;
then saved it as
save('basenet.mat','basenet');
and then transferred this MAT file to the remote cluster and loaded it there.
Thanks
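The loading step described in this comment could be wrapped in a small helper along these lines. The function name `loadBaseNetwork` is hypothetical; the variable name `basenet` matches the `save` call above. This is only a sketch and assumes the MAT file is either in the current folder or on the MATLAB path of the worker.

```matlab
function net = loadBaseNetwork(matFile)
% loadBaseNetwork  Load a pretrained network from a MAT file instead of
% calling resnet50 (hypothetical helper, not part of any toolbox).
%   matFile - name or path of the transferred file, e.g. 'basenet.mat'

% exist(...,'file') also searches the MATLAB path, which matters when the
% file was attached to the job rather than copied into the current folder.
if exist(matFile, 'file') ~= 2
    error('loadBaseNetwork:missingFile', 'MAT file not found: %s', matFile);
end

s = load(matFile);                 % returns a struct of the saved variables
if ~isfield(s, 'basenet')
    error('loadBaseNetwork:missingVariable', ...
        'Variable "basenet" not found in %s', matFile);
end
net = s.basenet;                   % same object that was saved locally
end
```

In the job function, `net = loadBaseNetwork('basenet.mat');` would then take the place of `net = resnet50;`.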

Release

R2022a
