The indices for validation and test sets are not being assigned correctly by 'divideind'

Question

Fatemeh Choubineh 2024-5-30

https://ww2.mathworks.cn/matlabcentral/answers/2124161-the-indices-for-validation-and-test-sets-are-not-being-assigned-correctly-by-divideind

I am new to MATLAB and I am trying to use 'divideind' to split my data in order, but it gives me this:

trainInd: [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 … ] (1×1418 double)

valInd: [1×0 double]

testInd: [1×0 double]

I would appreciate it if you could tell me where the problem is.

This is my code:

X = data.inputs;
% Define the dependent variable
y = data.output;
Q = length(y);
% Define the split points
train_end = 1418; 
val_end = 1773;    
test_end = Q;      
% Assign the indices
train_indices = 1:train_end;
val_indices = (train_end + 1):val_end; 
test_indices = (val_end + 1):test_end; 
% Use divideind to ensure proper division
[train_indices,val_indices,test_indices] = divideind(Q,train_indices,val_indices,test_indices);
% Divide the data into training and testing sets
x_train = log(X(train_indices, :));
x_valid = log(X(val_indices, :));
x_test = log(X(test_indices, :));
y_train = log(y(train_indices));
y_valid = log(y(val_indices));
y_test = log(y(test_indices));
% Transpose the data to fit the input format of the neural network
inputs_train = x_train';
outputs_train = y_train';
inputs_validation = x_valid' ;
outputs_valid = y_valid';
inputs_test = x_test';
outputs_test = y_test';
% Initialize arrays to store RMSE values
train_rmse = zeros(1, 10);
Valid_rmse = zeros(1, 10);
test_rmse = zeros(1, 10);
for i = 1:10
    hiddenLayerSize = [i, i];
    net = fitnet(hiddenLayerSize, 'trainlm');
    net.trainParam.epochs = 200; 
    net.layers{1}.transferFcn = 'poslin'; 
    net.layers{2}.transferFcn = 'logsig'; 
    
    net.divideFcn = 'divideind';
    net.divideMode = 'sample';
    net.divideParam.trainInd = train_indices;
    net.divideParam.valInd = val_indices;
    net.divideParam.testInd = test_indices;
    
    [net,tr] = train(net, inputs_train, outputs_train);
    
    % Predict on training data
    train_predictions = exp(net(inputs_train(:,tr.trainInd)));
    yTrainTrue = exp(outputs_train(:,tr.trainInd));
    train_rmse(i) = sqrt(mean((train_predictions - yTrainTrue).^2));
    % Predict on Validation data
    Valid_predictions = exp(net(inputs_validation(:,tr.valInd)));
    yValTrue = exp(outputs_valid(:,tr.valInd));
    Valid_rmse(i) = sqrt(mean((Valid_predictions - yValTrue).^2));
    % Predict on test data
    test_predictions = exp(net(inputs_test(:,tr.testInd)));
    yTestTrue = exp(outputs_test(:,tr.testInd));
    test_rmse(i) = sqrt(mean((test_predictions - yTestTrue).^2));
    
end

Answer 1

Steven Lord 2024-5-30

https://ww2.mathworks.cn/matlabcentral/answers/2124161-the-indices-for-validation-and-test-sets-are-not-being-assigned-correctly-by-divideind#answer_1465601

Can you confirm you're using the divideind function included as part of Deep Learning Toolbox?

which -all divideind
/MATLAB/toolbox/nnet/nnet/nndivision/divideind.m

When I run this code with a sample Q it gives me the results I'd expect:

callDivideInd(2000)
train_indices = 1x1418
     1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19    20    21    22    23    24    25    26    27    28    29    30
val_indices = 1x355
        1419        1420        1421        1422        1423        1424        1425        1426        1427        1428        1429        1430        1431        1432        1433        1434        1435        1436        1437        1438        1439        1440        1441        1442        1443        1444        1445        1446        1447        1448
test_indices = 1x227
        1774        1775        1776        1777        1778        1779        1780        1781        1782        1783        1784        1785        1786        1787        1788        1789        1790        1791        1792        1793        1794        1795        1796        1797        1798        1799        1800        1801        1802        1803

Or perhaps you're calling it with a smaller Q, say Q equal to train_end?

callDivideInd(1418)
train_indices = 1x1418
     1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19    20    21    22    23    24    25    26    27    28    29    30
val_indices =

  1x0 empty double row vector

test_indices =

  1x0 empty double row vector

function callDivideInd(Q)
train_end = 1418; 
val_end = 1773;    
test_end = Q;      
% Assign the indices
train_indices = 1:train_end;
val_indices = (train_end + 1):val_end; 
test_indices = (val_end + 1):test_end; 
% Use divideind to ensure proper division
[train_indices,val_indices,test_indices] = divideind(Q,train_indices,val_indices,test_indices)
end
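
The two runs above suggest that divideind silently discards any index larger than Q. One possible explanation for the output in the question, which is not confirmed in this thread, is that train takes Q from the number of columns of the data passed to it; in the question that is inputs_train, which has only 1418 columns. The following is a minimal sketch of the discarding behavior, assuming Deep Learning Toolbox is installed and using the sample count of 2086 reported in the comments. The variable names trA, vaA, teA, trB, vaB and teB are illustrative only.

```matlab
% Index ranges taken from the question; Q = 2086 is the length of y
% reported later in the thread.
Q = 2086;
trainInd = 1:1418;
valInd   = 1419:1773;
testInd  = 1774:Q;

% Case A: Q covers the whole dataset, so every index is in range.
[trA, vaA, teA] = divideind(Q, trainInd, valInd, testInd);

% Case B: Q equals the size of the training subset only, so every
% validation and test index is out of range and is dropped.
[trB, vaB, teB] = divideind(1418, trainInd, valInd, testInd);

fprintf('Q = %d:   %d train, %d val, %d test\n', Q, numel(trA), numel(vaA), numel(teA));
fprintf('Q = 1418: %d train, %d val, %d test\n', numel(trB), numel(vaB), numel(teB));
```

If case B reproduces the empty valInd and testInd from the question, the sample count seen by divideind is a likely place to look.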

Fatemeh Choubineh 2024-5-30

Thank you so much for your answer. Yes, I confirm that. The length of y is 2086, and I tried different values of Q, but it still does not work. I am stuck and I do not know what the problem is.

Steven Lord 2024-5-30

Can you run that which command and show us exactly what it returns? And can you show us exactly what the variable named data looks like? We probably don't need to see the values, just the sizes of its fields should be sufficient.
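
While those sizes are pending, here is a hedged sketch of one arrangement that keeps all three index sets in range: pass the complete dataset to train and let divideind select the subsets, then reuse tr.trainInd, tr.valInd and tr.testInd against the same full matrices. This is an assumption about the cause, not a diagnosis confirmed in the thread. The data below is synthetic (three hypothetical positive input columns and a made-up target), and the layer sizes and epoch count are arbitrary.

```matlab
% Synthetic stand-in for data.inputs and data.output (hypothetical values).
rng(0);
Q = 2086;
X = rand(Q, 3) + 1;      % strictly positive, so log() is defined
y = sum(X, 2);           % made-up target

% Log-transform and transpose the WHOLE dataset: features x samples.
inputs  = log(X)';
targets = log(y)';

net = fitnet([3 3], 'trainlm');
net.trainParam.epochs = 20;
net.trainParam.showWindow = false;   % suppress the training GUI

% Index-based split over all Q samples.
net.divideFcn = 'divideind';
net.divideParam.trainInd = 1:1418;
net.divideParam.valInd   = 1419:1773;
net.divideParam.testInd  = 1774:Q;

% train sees all Q columns, so none of the indices are out of range.
[net, tr] = train(net, inputs, targets);

% RMSE on the original scale, indexing the full matrices with tr.*Ind.
rmse = @(idx) sqrt(mean((exp(net(inputs(:, idx))) - exp(targets(:, idx))).^2));
fprintf('train %.4f, val %.4f, test %.4f\n', ...
    rmse(tr.trainInd), rmse(tr.valInd), rmse(tr.testInd));
```

The point of the design is that the indices in tr always refer to columns of the matrices given to train, so the same matrices are used for evaluation rather than pre-sliced copies.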
