Optimize ESN Hyperparameters with Grid Search MATLAB

Question

Jonathan Frutschy 2024-4-24


Commented: Jonathan Frutschy 2024-4-25

I have the following echo state network (ESN) hyperparameters: m_GS, k_GS, c_GS, gamma_GS. I would like to use my ESN to find the optimal values of these four hyperparameters with a grid search in MATLAB. The quantity I want to minimize is the discrete-time vector L2_loss(m), m = 1, 2, ..., M, where M = 101 is the number of simulation timesteps. Hence, L2_loss is a 1 by 101 double vector in my case. My attempt is shown below:

% Grid Search
ndatapoints = 20;
m_GS = linspace(0,1,ndatapoints); % [kg]
k_GS = linspace(0,100,ndatapoints); % [N/m]
c_GS = linspace(0,100,ndatapoints); % [kg/s]
gamma_GS = linspace(100,200,ndatapoints); % [N/m^3]
[m_G,k_G,c_G,gamma_G] = ndgrid(m_GS,k_GS,c_GS,gamma_GS);
fitresult = L2_loss;
[minval, minidx] = min(fitresult);
m_GS_optimal = m_G(minidx);
k_GS_optimal = k_G(minidx);
c_GS_optimal = c_G(minidx);
gamma_GS_optimal = gamma_G(minidx);

I'm not sure this is correct, since c_GS_optimal and k_GS_optimal both come out as zero. Do I need to increase ndatapoints, or set it equal to M?
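A minimal sketch of why the attempt above can return zeros, assuming a random placeholder for L2_loss (the real vector comes from the ESN run and is not shown in the question). The index returned by min(L2_loss) is a timestep index between 1 and M, but the attempt uses it as a linear index into the 20 by 20 by 20 by 20 grid. Since M = 101 is smaller than 20*20 = 400, that index can only ever reach the first c_GS and gamma_GS values.

```matlab
% Same grid as in the question
ndatapoints = 20;
m_GS     = linspace(0,1,ndatapoints);     % [kg]
k_GS     = linspace(0,100,ndatapoints);   % [N/m]
c_GS     = linspace(0,100,ndatapoints);   % [kg/s]
gamma_GS = linspace(100,200,ndatapoints); % [N/m^3]
[m_G,k_G,c_G,gamma_G] = ndgrid(m_GS,k_GS,c_GS,gamma_GS);

M = 101;
rng(0);                  % reproducible placeholder data
L2_loss = rand(1,M);     % ASSUMPTION: stand-in for the real 1-by-M loss vector

[~, minidx] = min(L2_loss);   % minidx is a timestep index in 1..M

% Interpret that timestep index as a linear index into the grid
[im,ik,ic,ig] = ind2sub(size(m_G), minidx);
fprintf('numel(grid) = %d, numel(L2_loss) = %d\n', numel(m_G), numel(L2_loss));
fprintf('grid subscripts: m=%d, k=%d, c=%d, gamma=%d\n', im, ik, ic, ig);

% Because minidx <= 101 < 400, ic and ig are always 1
c_at_idx     = c_G(minidx);      % always c_GS(1) = 0
gamma_at_idx = gamma_G(minidx);  % always gamma_GS(1) = 100
```

Under this reading, c_GS_optimal is always 0, and k_GS_optimal is 0 whenever the smallest loss falls within the first 20 timesteps. Increasing ndatapoints would not change that; the loss needs one value per grid point instead.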

3 Comments

Jonathan Frutschy 2024-4-25

Edited: Jonathan Frutschy 2024-4-25

@Torsten Based on the example I grabbed this from (see below), L2_loss, or my fitting_function, should be a 20 by 20 by 20 by 20 matrix, not a vector. My confusion with defining fitting_function as L2_loss is that I don't have an explicit function relating L2_loss to my hyperparameters, as the example does. In other words, I don't know the exact form of L2_loss(m) = fitting_function(m_GS, k_GS, c_GS, gamma_GS).

Furthermore, my code can only produce an L2_loss(m) vector for one value of each hyperparameter at a time. That is, I specify M, choose one value from the linspace distribution for each hyperparameter, then run my code with the given M and the 4 hyperparameter values. This returns one 1 by M L2_loss(m) vector per set of 4 hyperparameter values.

I guess I would then repeat this for every hyperparameter set until I have covered all possible combinations, of which there are 20^4? That would yield 160,000 1 by M L2_loss(m) vectors, which I would then have to arrange as a 20 by 20 by 20 by 20 double matrix and assign as my fitting_function? Or should my fitting function be a 1 by 160,000 double vector, as you suggested? In that case, what happens to the M dimension? Do I simply take the time average of each L2_loss(m) vector from each run to reduce it from a 1 by M vector to a 1 by 1 scalar?
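The bookkeeping asked about here can be sketched as follows. This is only an illustration under stated assumptions: run_stub is a hypothetical stand-in for one ESN run (the real code is not shown in the thread), and the grid uses 3 points per hyperparameter instead of 20 to keep it small. Each run fills one row of a matrix, the time average collapses the M dimension, and reshape shows that the resulting vector and the 4-D array hold the same values in the same linear order.

```matlab
n = 3;      % points per hyperparameter (20 in the question)
M = 101;    % number of timesteps
m_GS     = linspace(0,1,n);
k_GS     = linspace(0,100,n);
c_GS     = linspace(0,100,n);
gamma_GS = linspace(100,200,n);
[m_G,k_G,c_G,gamma_G] = ndgrid(m_GS,k_GS,c_GS,gamma_GS);

% ASSUMPTION: hypothetical placeholder for one ESN run returning a 1-by-M loss
run_stub = @(m,k,c,g) (m + k/100 + c/100 + g/200) * ones(1,M);

nComb   = numel(m_G);        % n^4 combinations
lossAll = zeros(nComb, M);   % one row per combination
for idx = 1:nComb
    % Linear indexing visits every grid point exactly once
    lossAll(idx,:) = run_stub(m_G(idx), k_G(idx), c_G(idx), gamma_G(idx));
end

lossVec   = mean(lossAll, 2);              % nComb-by-1, time-averaged loss
fitresult = reshape(lossVec, size(m_G));   % same data as an n-by-n-by-n-by-n array

% Vector and 4-D array agree element by element under linear indexing
sameData = isequal(lossVec(:), fitresult(:));
```

For the full 20^4 grid, storing every 1 by M vector takes roughly 130 MB of doubles, so it may be preferable to keep only the time average of each run.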

Example: https://www.mathworks.com/matlabcentral/answers/225681-how-to-do-grid-search-to-optimize-sigma-using-matlab

firstparam = [1, 2, 3.3, 3.7, 8, 21];  %list of places to search for first parameter
secondparam = linspace(0,1,20);        %list of places to search for second parameter
[F,S] = ndgrid(firstparam, secondparam);
fitting_function = @(p1, p2) p1^2 + p2^2;
fitresult = arrayfun(fitting_function, F, S) %run a fit on every pair: fitting_function(F(J,K), S(J,K))
fitresult = 6x20
    1.0000    1.0028    1.0111    1.0249    1.0443    1.0693    1.0997    1.1357    1.1773    1.2244    1.2770    1.3352    1.3989    1.4681    1.5429    1.6233    1.7091    1.8006    1.8975    2.0000
    4.0000    4.0028    4.0111    4.0249    4.0443    4.0693    4.0997    4.1357    4.1773    4.2244    4.2770    4.3352    4.3989    4.4681    4.5429    4.6233    4.7091    4.8006    4.8975    5.0000
   10.8900   10.8928   10.9011   10.9149   10.9343   10.9593   10.9897   11.0257   11.0673   11.1144   11.1670   11.2252   11.2889   11.3581   11.4329   11.5133   11.5991   11.6906   11.7875   11.8900
   13.6900   13.6928   13.7011   13.7149   13.7343   13.7593   13.7897   13.8257   13.8673   13.9144   13.9670   14.0252   14.0889   14.1581   14.2329   14.3133   14.3991   14.4906   14.5875   14.6900
   64.0000   64.0028   64.0111   64.0249   64.0443   64.0693   64.0997   64.1357   64.1773   64.2244   64.2770   64.3352   64.3989   64.4681   64.5429   64.6233   64.7091   64.8006   64.8975   65.0000
  441.0000  441.0028  441.0111  441.0249  441.0443  441.0693  441.0997  441.1357  441.1773  441.2244  441.2770  441.3352  441.3989  441.4681  441.5429  441.6233  441.7091  441.8006  441.8975  442.0000
[minval, minidx] = min(fitresult); 
bestFirst = F(minidx)
bestFirst = 1x20
     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1
bestSecond = S(minidx)
bestSecond = 1x20
     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
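One detail in the quoted example is worth noting: on a matrix, min works column by column, which is why bestFirst and bestSecond come back as 1x20 vectors. The values happen to be right here only because the overall minimum sits at the first element. A hedged variant, assuming a single overall minimum is wanted, flattens the array first so that minidx is one linear index:

```matlab
firstparam  = [1, 2, 3.3, 3.7, 8, 21];   % search values for first parameter
secondparam = linspace(0,1,20);          % search values for second parameter
[F,S] = ndgrid(firstparam, secondparam);
fitting_function = @(p1, p2) p1^2 + p2^2;
fitresult = arrayfun(fitting_function, F, S);

% Minimum over all 120 entries; minidx is a single linear index
[minval, minidx] = min(fitresult(:));
bestFirst  = F(minidx);   % 1
bestSecond = S(minidx);   % 0
```

The same min(fitresult(:)) pattern carries over unchanged to a 4-D fitresult built with ndgrid.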

Jonathan Frutschy 2024-4-25

@Torsten You are right. I simply needed to reduce the M dimension by taking the time average of L2_loss for each combination, which gives a vector of 20^4 values. I then take the minimum of that vector and use its index to extract the optimal hyperparameters.
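A compact end-to-end sketch of the resolution described above, mirroring the arrayfun pattern from the linked example. The function esn_loss_stub is a hypothetical placeholder with a known minimum, not the actual ESN simulation, and the grid uses 5 points per hyperparameter so that it runs quickly.

```matlab
n = 5;      % points per hyperparameter (20 in the question)
M = 101;    % number of timesteps
m_GS     = linspace(0,1,n);       % [kg]
k_GS     = linspace(0,100,n);     % [N/m]
c_GS     = linspace(0,100,n);     % [kg/s]
gamma_GS = linspace(100,200,n);   % [N/m^3]
[m_G,k_G,c_G,gamma_G] = ndgrid(m_GS,k_GS,c_GS,gamma_GS);

% ASSUMPTION: hypothetical stand-in for one ESN run; returns a 1-by-M loss
% whose time average is smallest at m=0.5, k=50, c=50, gamma=150
esn_loss_stub = @(m,k,c,g) ((m-0.5)^2 + ((k-50)/100)^2 + ...
                            ((c-50)/100)^2 + ((g-150)/100)^2) + 0.01*(1:M);

% Reduce the M dimension: one scalar (time-averaged loss) per combination
scalar_loss = @(m,k,c,g) mean(esn_loss_stub(m,k,c,g));
fitresult = arrayfun(scalar_loss, m_G, k_G, c_G, gamma_G);  % n-by-n-by-n-by-n

% Single minimum over the whole grid, then look up the hyperparameters
[minval, minidx] = min(fitresult(:));
m_GS_optimal     = m_G(minidx);
k_GS_optimal     = k_G(minidx);
c_GS_optimal     = c_G(minidx);
gamma_GS_optimal = gamma_G(minidx);
```

The time average is one possible reduction; other scalar summaries of L2_loss (for example the final-timestep value or the maximum) would slot into scalar_loss the same way.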
