How do I split a dataset into 2/3 for training and 1/3 for testing, and plot the result?
This is my code, but I get an error and cannot get the correct answer.
Can you help me, please?
clear all, close all, clc
load hald; % Load Portland Cement dataset
A = ingredients;
b = heat;
N=13; % number of rows
idx=1:13;
PD=2/3;
%split data for training and testing
Ptrain=idx(1:round(PD*N));Ttrain=idx(1:round(PD*N));
Ptest=idx(round(PD*N)+1:end,:);Ttest=idx(round(PD*N)+1:end,:);
dataPTrain=hald(Ptrain);
dataPTest=hald(Ptest);
[U,S,V] = svd(A,'econ');
x = V*inv(S)*U'*b; % Solve Ax=b using the SVD
plot(dataPTrain,'k','LineWidth',2); hold on % Plot data
plot(dataPTest,'r-o','LineWidth',1.,'MarkerSize',2); % Plot regression
l1 = legend('Heat data','Regression')
%% Alternative 1 (regress)
x = regress(b,A);
%% Alternative 2 (pinv)
x = pinv(A)*b;
Answers (2)
Sulaymon Eshkabilov
2023-1-15
You should use a random partition of your whole data set, e.g.:
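rng("default"); % For reproducibility
n = size(X,1); % Number of observations (rows of X)
C = cvpartition(n, "HoldOut", 0.35); % Hold out 35% for testing; the remaining 65% is used for training
INDEXtrain = training(C); % Logical index of the training rows
INDEXtest = ~INDEXtrain; % Logical index of the test rows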
X_test = X(INDEXtest,:);
Y_test = Y(INDEXtest,:);
X_train = X(INDEXtrain,:);
Y_train = Y(INDEXtrain,:);
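Applied to the data in the question, the random hold-out idea might look like the sketch below. It assumes X = ingredients and Y = heat from hald (the answer above does not say what X and Y are), holds out 1/3 of the rows as the question asks, and keeps the question's model without an intercept term. The names idxTrain, idxTest, coef and Y_pred are made up for this illustration. cvpartition and the hald data both come with the Statistics and Machine Learning Toolbox.

```matlab
% Sketch: random 2/3 - 1/3 hold-out split of the hald data, followed by a
% least-squares fit that uses the training rows only.
load hald                           % provides ingredients (13x4) and heat (13x1)
X = ingredients;                    % assumption: predictors from the question
Y = heat;                           % assumption: response from the question

rng("default");                     % for reproducibility
n = size(X, 1);                     % number of observations (rows)
C = cvpartition(n, "HoldOut", 1/3); % hold out about 1/3 of the rows for testing

idxTrain = training(C);             % logical index of the training rows
idxTest  = test(C);                 % logical index of the test rows

coef   = pinv(X(idxTrain, :)) * Y(idxTrain);  % fit on the training rows only
Y_pred = X(idxTest, :) * coef;                % predict the held-out rows

figure
plot(Y(idxTest), 'k', 'LineWidth', 2); hold on        % measured heat (test rows)
plot(Y_pred, 'r-o', 'LineWidth', 1, 'MarkerSize', 4); % predicted heat
legend('Heat data (test)', 'Regression prediction')
```

Because the rows are picked at random, the test rows change if the seed changes. With only 13 observations the test error can vary quite a lot from one split to another.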
Voss
2023-1-15
This:
Ptest=idx(round(PD*N)+1:end,:);Ttest=idx(round(PD*N)+1:end,:)
should be this:
Ptest=idx(round(PD*N)+1:end);Ttest=idx(round(PD*N)+1:end)
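because idx is a row vector. In idx(round(PD*N)+1:end,:) the end refers to the number of rows of idx, which is 1, so the range round(PD*N)+1:end becomes 10:1. That range is empty, and Ptest and Ttest end up empty instead of holding the last four indices.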
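For completeness, here is a rough sketch of the whole script with that fix applied. It goes a bit beyond the fix itself: it assumes the goal is to fit on the first 2/3 of the rows and compare the prediction with the measured heat on the last 1/3, so the split indices are used on the rows of ingredients and heat (the original script indexed hald with them). The names nTrain, trainIdx, testIdx, Atrain, btrain, Atest, btest and bpred are made up for this illustration.

```matlab
% Sketch: sequential 2/3 - 1/3 split of the hald data, SVD least-squares fit
% on the training rows, and a plot of measured vs. predicted heat on the rest.
load hald                       % provides ingredients (13x4) and heat (13x1)
A = ingredients;
b = heat;

N = size(A, 1);                 % number of rows (13)
nTrain = round(2/3 * N);        % 9 rows for training
trainIdx = 1:nTrain;            % rows 1..9
testIdx  = nTrain+1:N;          % rows 10..13 (no extra ",:" on a row vector)

Atrain = A(trainIdx, :);  btrain = b(trainIdx);
Atest  = A(testIdx, :);   btest  = b(testIdx);

[U, S, V] = svd(Atrain, 'econ');
x = V * (S \ (U' * btrain));    % solve Atrain*x = btrain in the least-squares sense

bpred = Atest * x;              % prediction for the test rows

figure
plot(btest, 'k', 'LineWidth', 2); hold on              % measured heat (test rows)
plot(bpred, 'r-o', 'LineWidth', 1, 'MarkerSize', 4);   % regression prediction
legend('Heat data (test)', 'Regression')
```

S \ (U'*btrain) gives the same result as inv(S)*U'*btrain here but avoids forming the inverse explicitly. As long as Atrain has full column rank, pinv(Atrain)*btrain returns the same x.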