Optimizing Hyperparameters for the trainnet Function

Hi there, I have built my own TCN model in MATLAB to predict a continuous output, and I am trying to figure out the best way to optimize the hyperparameters: filter size, number of filters, number of blocks, and dropout factor. I am attempting to use the bayesopt function, but I am not sure what to use as my function handle, or whether this is the best method for this kind of network. Should I be using the Experiment Manager to do this instead (https://www.mathworks.com/help/deeplearning/ug/tune-experiment-hyperparameters-using-bayesian-optimization.html)? Does anyone have any suggestions for my code, or is there another way to perform hyperparameter optimization for this type of network architecture? Thanks so much.
%Network
numFilters = 64;
filterSize = 5;
dropoutFactor = 0.005;
numBlocks = 5;

net = dlnetwork;
layer = sequenceInputLayer(numFeatures,Normalization="rescale-symmetric",Name="input");
net = addLayers(net,layer);
outputName = layer.Name;

for i = 1:numBlocks
    dilationFactor = 2^(i-1);
    layers = [
        convolution1dLayer(filterSize,numFilters,DilationFactor=dilationFactor,Padding="causal",Name="conv1_"+i)
        layerNormalizationLayer
        spatialDropoutLayer(dropoutFactor,Name="spat_drop_"+i)
        convolution1dLayer(filterSize,numFilters,DilationFactor=dilationFactor,Padding="causal")
        layerNormalizationLayer
        reluLayer
        spatialDropoutLayer(dropoutFactor,Name="spat_drop2_"+i)
        additionLayer(2,Name="add_"+i)];

    % Add and connect layers.
    net = addLayers(net,layers);
    net = connectLayers(net,outputName,"conv1_"+i);

    % Skip connection; a 1-by-1 convolution on the first block matches channel counts.
    if i == 1
        net = addLayers(net,convolution1dLayer(1,numFilters,Name="convSkip"));
        net = connectLayers(net,outputName,"convSkip");
        net = connectLayers(net,"convSkip","add_"+i+"/in2");
    else
        net = connectLayers(net,outputName,"add_"+i+"/in2");
    end

    % Update layer output name.
    outputName = "add_"+i;
end

net = addLayers(net,fullyConnectedLayer(numResponses,Name="fc"));
net = connectLayers(net,outputName,"fc");
%Training Options
options = trainingOptions("adam", ...
    'MaxEpochs', 60, ...
    'MiniBatchSize', 1, ...
    'InputDataFormats', "CTB", ...
    'Metrics', "rmse", ...
    'Verbose', 0);
filtsize = optimizableVariable('filterSize',[1,10],'Type','integer');
numfilt = optimizableVariable('numFilters',[20,60],'Type','integer');
numblock = optimizableVariable('numBlocks',[1,10],'Type','integer');
dropout = optimizableVariable('dropoutfactor',[0.001,0.01],'Type','real');

net = trainnet(trainingdataX,trainingdataY,net,"mse",options);
fun = @(x) % Not sure what to put here!
results = bayesopt(fun,[filtsize, numfilt, numblock, dropout])

1 Answer

Shantanu Dixit 2024-9-11
Edited: 2024-9-12
Hi Isabelle,
The 'bayesopt' function requires an objective function as its first argument, which it minimizes over the specified optimization variables. A custom objective function takes the values of these optimization variables as input, builds the network architecture and training options from them, and then trains and validates the network.
Here's a brief outline of how the objective function can be designed:
function ObjFcn = makeObjFcn(X_train, Y_train, X_val, Y_val)
ObjFcn = @valErrorFun;
    function [valLoss, cons, fileName] = valErrorFun(optVars)
        % Import the hyperparameters from optVars
        filterSize = optVars.filterSize;
        numFilters = optVars.numFilters;
        numBlocks = optVars.numBlocks;
        dropoutFactor = optVars.dropoutfactor;

        % defineTCN builds the network as defined above
        net = defineTCN(filterSize, numFilters, numBlocks, dropoutFactor);

        % Set up the training options
        options = trainingOptions('adam', ...
            'MaxEpochs', 60, ...
            'MiniBatchSize', 1, ...
            'ValidationData', {X_val, Y_val}, ...
            'Shuffle', 'every-epoch', ...
            'ValidationFrequency', 50, ...
            'Verbose', false);

        % trainnet requires an explicit loss function; "mse" suits regression
        [net, trainInfo] = trainnet(X_train, Y_train, net, "mse", options);

        % Assuming the task involves predicting continuous values;
        % other loss formulations are possible
        valPredictions = predict(net, X_val);
        valLoss = sqrt(mean((valPredictions - Y_val).^2));

        % Save the trained model to a file named after its validation loss
        fileName = num2str(valLoss) + ".mat";
        save(fileName,'net','valLoss','options')
        cons = [];
    end
end
% Set parameters for optimization
optimVars = [
    optimizableVariable('filterSize', [1, 10], 'Type', 'integer')
    optimizableVariable('numFilters', [20, 100], 'Type', 'integer')
    optimizableVariable('numBlocks', [1, 10], 'Type', 'integer')
    optimizableVariable('dropoutfactor', [0.001, 0.01], 'Type', 'real')];

ObjFcn = makeObjFcn(trainingDataX, trainingDataY, validationDataX, validationDataY);
results = bayesopt(ObjFcn, optimVars);
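The defineTCN helper referenced above is not shown; a minimal sketch, assuming the block structure from the question. The numFeatures and numResponses values inside are placeholders for your own data dimensions, not part of the original code:
```matlab
function net = defineTCN(filterSize, numFilters, numBlocks, dropoutFactor)
% Rebuild the TCN from the question for each bayesopt evaluation.
numFeatures = 3;   % placeholder: number of input channels
numResponses = 1;  % placeholder: number of regression targets

net = dlnetwork;
net = addLayers(net,sequenceInputLayer(numFeatures, ...
    Normalization="rescale-symmetric",Name="input"));
outputName = "input";

for i = 1:numBlocks
    dilationFactor = 2^(i-1);
    layers = [
        convolution1dLayer(filterSize,numFilters, ...
            DilationFactor=dilationFactor,Padding="causal",Name="conv1_"+i)
        layerNormalizationLayer
        spatialDropoutLayer(dropoutFactor)
        convolution1dLayer(filterSize,numFilters, ...
            DilationFactor=dilationFactor,Padding="causal")
        layerNormalizationLayer
        reluLayer
        spatialDropoutLayer(dropoutFactor)
        additionLayer(2,Name="add_"+i)];
    net = addLayers(net,layers);
    net = connectLayers(net,outputName,"conv1_"+i);

    % Residual connection; a 1-by-1 convolution matches channel
    % counts on the first block.
    if i == 1
        net = addLayers(net,convolution1dLayer(1,numFilters,Name="convSkip"));
        net = connectLayers(net,outputName,"convSkip");
        net = connectLayers(net,"convSkip","add_"+i+"/in2");
    else
        net = connectLayers(net,outputName,"add_"+i+"/in2");
    end
    outputName = "add_"+i;
end

net = addLayers(net,fullyConnectedLayer(numResponses,Name="fc"));
net = connectLayers(net,outputName,"fc");
end
```
Because bayesopt varies numFilters, numBlocks, and filterSize between evaluations, the network must be rebuilt from scratch inside the objective function on every call, which is what this helper does.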
For a detailed example of using a custom objective function for optimization, refer to the following link:
Alternatively, you can also refer to the examples on using Experiment Manager with Bayesian optimization:
2 Comments
Isabelle Museck 2024-9-11
Thank you so much for your response; it is very helpful. I am wondering, however, if I am to include the TCN architecture that I built within this function, and if so, where do I input it? Is this correct?
%Design optimization function
function ObjFcn = makeObjFcn(trainingdataX, trainingdataY, validationdataX, validationdataY)
ObjFcn = @valErrorFun;
    function [valLoss, cons, fileName] = valErrorFun(optVars)
        % Import the hyperparameters from optVars
        filterSize = optVars.filterSize;
        numFilters = optVars.numFilters;
        numBlocks = optVars.numBlocks;
        dropoutFactor = optVars.dropoutfactor;

        % defineTCN builds the network as defined above
        net = defineTCN(filterSize, numFilters, numBlocks, dropoutFactor);

        %% I'm not sure if this is where I need to input the TCN network?? %%
        %TCN Network
        net = dlnetwork;
        layer = sequenceInputLayer(numFeatures,Normalization="rescale-symmetric",Name="input");
        net = addLayers(net,layer);
        for i = 1:numBlocks
            dilationFactor = 2^(i-1);
            layers = [
                convolution1dLayer(filterSize,numFilters,DilationFactor=dilationFactor,Padding="causal",Name="conv1_"+i)
                layerNormalizationLayer
                spatialDropoutLayer(Name="spat_drop_"+i,Probability=dropoutFactor)
                convolution1dLayer(filterSize,numFilters,DilationFactor=dilationFactor,Padding="causal")
                layerNormalizationLayer
                reluLayer
                spatialDropoutLayer(Name="spat_drop2_"+i,Probability=dropoutFactor)
                additionLayer(2,Name="add_"+i)];
            % Add and connect layers.
            net = addLayers(net,layers);
            net = connectLayers(net,outputName,"conv1_"+i);
        end

        %Training options
        options = trainingOptions("adam", ...
            'MaxEpochs', 60, ...
            'MiniBatchSize', 1, ...
            'InputDataFormats', "CTB", ...
            'Metrics', "rmse", ...
            'Verbose', 0);

        [net, trainInfo] = trainNetwork(trainingdataX, trainingdataY, net, options);

        % Predicting continuous values; RMSE loss formulation
        valPredictions = predict(net, validationdataX);
        valLoss = sqrt(mean((valPredictions - validationdataY).^2));

        % Save model file
        fileName = num2str(valLoss) + ".mat";
        save(fileName,'net','valLoss','options')
        cons = [];
    end
end

% Set parameters for optimization
optimVars = [
    optimizableVariable('filterSize', [1, 10], 'Type', 'integer')
    optimizableVariable('numFilters', [20, 100], 'Type', 'integer')
    optimizableVariable('numBlocks', [1, 10], 'Type', 'integer')
    optimizableVariable('dropoutfactor', [0.001, 0.01], 'Type', 'real')];

ObjFcn = makeObjFcn(trainingDataX, trainingDataY, validationDataX, validationDataY);
results = bayesopt(ObjFcn, optimVars);
Shantanu Dixit 2024-9-12
Edited: 2024-9-12
Yes, the network can be defined inside the objective function as well.
'trainNetwork' is no longer recommended by MathWorks; you can use 'trainnet' for training instead.
Since the network above is defined as a 'dlnetwork', 'trainnet' is the appropriate choice: 'trainNetwork' expects the network as a 'Layer' array or a 'LayerGraph' object, not a 'dlnetwork'.
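Concretely, the trainNetwork call in the comment above can be swapped for a trainnet call; a minimal sketch, assuming the variable names from that code and the continuous-output regression task:
```matlab
% trainnet takes the dlnetwork directly and requires an explicit loss
% function; "mse" is a reasonable choice for a continuous target.
[net, trainInfo] = trainnet(trainingdataX, trainingdataY, net, "mse", options);
```
Note that the loss function argument is mandatory for trainnet, whereas trainNetwork inferred the loss from the network's output layer.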

