Formatting input data for linear regression model in leave-out-one validation testing
9 comments
Hi Isabelle,
Sounds like an interesting project. In your code, you are passing cell arrays as predictors, which is causing the error. To resolve this, convert the cell arrays to numeric arrays before fitting the linear model. I also updated the code to use a leave-one-out validation approach. Here is an updated example snippet:
% Define and populate sample data for 'data' and 'responseData'
data = {1, 2, 3, 4, 5}; % Sample predictor data
responseData = {10, 20, 30, 40, 50}; % Sample response data
% Define and populate the 'Predictors' variable with sample data
Predictors = cell(1, length(data));
for i = 1:length(data)
    Predictors{i} = data{i};
end
% Define and populate the 'Response' variable with sample data
Response = cell(1, length(responseData));
for i = 1:length(responseData)
    Response{i} = responseData{i};
end
% Train the linear regression model with leave-one-out cross-validation
for i = 1:length(Predictors)
    % Extract validation data for the current iteration
    validationdataX = cell2mat(Predictors(i));
    validationdataY = cell2mat(Response(i));
    % Exclude the current index (i) for training
    trainingIndices = setdiff(1:length(Predictors), i);
    % Convert to numeric column vectors: fitlm expects one observation per row
    trainingdataX = cell2mat(Predictors(trainingIndices))';
    trainingdataY = cell2mat(Response(trainingIndices))';
    % Train the linear regression model
    mdl = fitlm(trainingdataX, trainingdataY);
    % Make predictions on the validation data
    ypred = predict(mdl, validationdataX);
    % Calculate RMSE for the current iteration
    % (with a single held-out point this equals the absolute error)
    RMSE = sqrt(mean((ypred - validationdataY).^2));
    % Display RMSE for each iteration
    disp(['RMSE for iteration ', num2str(i), ': ', num2str(RMSE)]);
end
I hope this is what you are looking for. Please see the attached results.

Please let me know if you have any further questions.
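Because each fold above holds out a single point, the per-iteration RMSE is just the absolute error for that point. One way to get a single summary number is to collect the held-out errors and pool them after the loop. The following is a minimal sketch of that idea, not the original poster's method; the sample data and the names `foldError` and `looRMSE` are made up for illustration.

```matlab
% Hypothetical sample data (not from the original question)
x = (1:6)';                                   % predictor, one observation per row
y = [10.5; 19.8; 30.4; 39.9; 50.6; 59.7];     % response, same number of rows
n = numel(x);
foldError = zeros(n, 1);                      % held-out prediction error per fold

for i = 1:n
    trainIdx = setdiff(1:n, i);               % all observations except the i-th
    mdl = fitlm(x(trainIdx), y(trainIdx));    % fit on the remaining points
    foldError(i) = predict(mdl, x(i)) - y(i); % error on the held-out point
end

% Pool the held-out errors into one leave-one-out RMSE
looRMSE = sqrt(mean(foldError.^2));
fprintf('Leave-one-out RMSE: %.4f\n', looRMSE);
```

Storing the errors in a preallocated vector also avoids overwriting the result on every iteration, so the individual fold errors remain available for inspection.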
Hi @Isabelle Museck,
To pass the predictor and response data to a linear model without dimension mismatch errors, make sure the dimensions of the two arrays align: fitlm expects one observation per row, so the predictor matrix and the response vector must have the same number of rows. In the provided code snippet, you can modify the data handling part as follows:
% Leave-one-out loop; assumes Predictors is features-by-timesteps
% and Response is 1-by-timesteps
numSteps = size(Predictors, 2); % number of time steps (columns)
for i = 1:numSteps % iterate over all data points
    validationdataX = Predictors(:, i); % Use all features for the current timestep
    validationdataY = Response(:, i); % Use the response variable for the current timestep
    % Exclude the current index (i) for training
    trainingIndices = setdiff(1:numSteps, i);
    trainingdataX = Predictors(:, trainingIndices); % Use all features for training data
    trainingdataY = Response(:, trainingIndices); % Use response variable for training data
    net = fitlm(trainingdataX', trainingdataY'); % Fit linear model (transpose so rows are observations)
    ypred = predict(net, validationdataX'); % Predict using the model
    TrueValue = validationdataY';
    PredictedValue = ypred';
    RMSE = rmse(PredictedValue, TrueValue); % Calculate RMSE for this held-out point
end
Please bear in mind that this is only an example snippet, and you will need to adapt it to your own data. Please let me know if you have any further questions.
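The orientation requirement described in this comment can be checked before any model is fitted. The sketch below assumes a hypothetical layout of 3 features by 8 time steps filled with random numbers (not the original poster's data), transposes both arrays so that each row is one observation, and confirms that the row counts match.

```matlab
% Hypothetical layout: 3 features (rows) by 8 time steps (columns)
Predictors = rand(3, 8);
Response = rand(1, 8);

% Transpose so that each row is one observation (time step)
X = Predictors';   % 8-by-3 predictor matrix
y = Response';     % 8-by-1 response vector

% The predictor matrix and the response must have the same number of rows
assert(size(X, 1) == size(y, 1), 'X and y must have the same number of rows');
fprintf('X is %d-by-%d, y is %d-by-%d\n', size(X, 1), size(X, 2), size(y, 1), size(y, 2));
```

If the assertion fails, the data is probably stored in the other orientation, and the transposes in the loop should be removed or adjusted accordingly.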


