It is said that a ResNet model requires less training time, but when I used the resnetLayers function in MATLAB to create a residual network, why does it take more time?
It is said that a ResNet model requires less training time because it eliminates the vanishing gradient problem. However, when I used the resnetLayers function in MATLAB to create a residual network and trained it, training took longer than with a CNN-LSTM model. Why is that?
Answers (1)
Hari
2023-9-15
Hi Debojit,
I understand that your ResNet model takes longer to train than your CNN-LSTM model, contrary to the expectation that ResNet should train faster because it addresses the vanishing gradient problem.
ResNet is known for mitigating the vanishing gradient problem, which can occur when training deep neural networks. Its skip connections make very deep networks trainable; they do not reduce the computation each training iteration requires.
However, the actual training time depends on several factors, including the specific architecture, the dataset, the hyperparameters, and implementation details. The ResNet architecture itself does not guarantee faster training than other models such as CNN-LSTM in every scenario.
Here are a few reasons why you might observe longer training time with ResNet than with CNN-LSTM in your specific case:
- Model complexity: ResNet models can have more parameters than CNN-LSTM models, especially deeper variants such as ResNet-50 or ResNet-101. This added complexity can require more computational resources and training time.
- Dataset characteristics: The size, complexity, and class imbalance of your dataset can affect training time. A particularly large dataset, or one with complex patterns, may take longer to train on regardless of the model architecture.
- Hyperparameters: The learning rate, batch size, and regularization techniques all affect training time. Suboptimal settings can slow convergence or require more iterations to reach good performance.
- Implementation details: The software framework and hardware you use affect training time. Frameworks and hardware configurations differ in how well they are optimized, which influences overall training speed.
Refer to the documentation of “Sequence Classification Using CNN-LSTM Network” for more information.
Refer to the documentation of “resnetLayers” for more information.
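To make the model complexity point concrete, here is a rough sketch of how the cost of a single layer can be estimated. It uses only basic arithmetic, not the Deep Learning Toolbox, and every layer size in it (3-by-3 filters, 64 channels, a 56-by-56 feature map, 128 hidden units) is a hypothetical value chosen for illustration, not taken from your networks.

```matlab
% All sizes below are hypothetical and chosen only for illustration.

% Learnable parameters of a 2-D convolution with k-by-k filters:
% k*k*cIn weights per filter plus one bias, for cOut filters.
convParams = @(k, cIn, cOut) (k*k*cIn + 1) * cOut;

% Learnable parameters of an LSTM layer with d inputs and h hidden units:
% four gates, each with h-by-d input weights, h-by-h recurrent weights,
% and an h-by-1 bias.
lstmParams = @(d, h) 4 * h * (d + h + 1);

% One 3-by-3 convolution with 64 input and 64 output channels.
nConv = convParams(3, 64, 64);

% One LSTM layer with 64 input features and 128 hidden units.
nLstm = lstmParams(64, 128);

% Multiply-accumulate operations for that convolution on a 56-by-56
% feature map (stride 1, "same" padding, so the output is also 56-by-56).
% The same weights are applied at every output position.
convMacs = 3*3*64*64 * 56*56;

fprintf('Conv parameters:     %d\n', nConv);
fprintf('LSTM parameters:     %d\n', nLstm);
fprintf('Conv MACs per image: %d\n', convMacs);
```

In this sketch the convolution has fewer parameters than the LSTM layer, yet it performs far more arithmetic per image because its weights are reused across the whole feature map. A residual network stacks many such convolutions, so parameter count alone can understate its cost per training iteration.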
I hope this helps.
Thanks,
Hari.