What could be the reason why my model does not give accurate results as I planned?

Hi everyone. First of all, thank you for your time. This is my first question on the MATLAB platform, so please excuse any mistakes. The necessary files are in the attached zip folder. To view the images in PGM format, you can use the GIMP application.
I am planning to design an MLP image-processing model without using any toolbox.
I plan to train the model by reading, one by one, the 32x30 images of the CMU face images dataset that I obtained from the internet, and then continue with testing.
(I use the imread function provided by MATLAB.)
INPUT is a cell vector whose elements each hold one image matrix, so each element represents one image. While processing the samples one by one, I reshape each image into a column vector.
Here is the code for file operations and image reading:
clc;clear;close;
%*****************Reading Images**************
myFolder = ''; %% Images folder path
if ~isfolder(myFolder) %% Checking if the folder doesn't exist
    errorMessage = sprintf('Error: The following folder does not exist:\n%s\nPlease specify a new folder.', myFolder);
    uiwait(warndlg(errorMessage));
    myFolder = uigetdir(); % Ask for a new one.
    if myFolder == 0
        % User clicked Cancel
        return;
    end
end
filePattern = fullfile(myFolder, '*.pgm');
theFiles = dir(filePattern);
% Define the number of image files in the folder
numImages = length(theFiles);
% Initialize a cell array to store the images
INPUT = cell(numImages, 1);
for k = 1 : length(theFiles)
    baseFileName = theFiles(k).name;
    fullFileName = fullfile(theFiles(k).folder, baseFileName);
    INPUT{k} = imread(fullFileName);
    imshow(INPUT{k}); % Display image.
    drawnow; % Force display to update immediately.
end
%***********************************************
The model is interactive. First, the user enters:
Number of hidden layers
Number of neurons in each hidden layer (the same for every hidden layer)
Maximum number of iterations
Other definitions are shown in the code below. I define NumberOfInput as 960, which comes from 32x30, because the weight matrix between the input layer and the first hidden layer has to be sized accordingly.
Weight matrix values are assigned randomly.
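As a rough sketch of the shapes involved: the random stand-in image, its 30-by-32 orientation, and the hidden width of 20 are assumptions for illustration, not values taken from the dataset. The element count is 960 whichever way the image is oriented.

```matlab
% Stand-in for one PGM image from the dataset (assumed 30 rows x 32 columns).
img = uint8(randi([0 255], 30, 32));

% Flatten the image into a column vector, as done before the forward pass.
X = double(reshape(img, [], 1));        % 960-by-1

% Derive the input size from the data instead of hard-coding 960.
NumberOfInput = numel(X);
Nofneurons = 20;                        % hypothetical hidden-layer width

% Same shape as W{1} in the post: one column of weights per hidden neuron.
W1 = rand(NumberOfInput, Nofneurons);   % 960-by-20

% Net input of the first hidden layer: one value per hidden neuron.
netH = W1' * X;                         % 20-by-1

fprintf('X: %d-by-%d, W1: %d-by-%d, netH: %d-by-%d\n', size(X), size(W1), size(netH));
```

Deriving NumberOfInput with numel keeps the weight matrix consistent if the image resolution ever changes.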
My model should return 0 if the person is not wearing sunglasses and 1 if the person is, with high accuracy. So the output is a scalar, and there is a single output neuron.
I studied MLP models and found that choosing good hyperparameters is a hard problem in machine learning that depends on the application and on testing. So I tried various values for ETA: 0.01, 0.02, 0.05, 0.2, 0.5, ...
In this type of model, the inputs to the neurons are called netH and the outputs of these neurons are called H, except for the last connections, where they become netO and O.
The sigma cell array is also defined here so that it can be used once the forward pass ends and the backward pass starts.
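For reference, one common way to write the forward and backward passes with these names is sketched below. It is a minimal illustration, not the code from the post. The cell array A, the layer sizes, the small randn weights, and the scaled random input are all assumptions. The point it shows is that each sigma{lay} holds one value per neuron of that layer, stored as a column vector.

```matlab
rng(0);                                   % fixed seed so the sketch is repeatable
sigmoid = @(x) 1 ./ (1 + exp(-x));

% Hypothetical sizes, for illustration only.
NumberOfInput = 960; Nofneurons = 20; NofLayers = 3; NumberOfOutput = 1;
layerSizes = [NumberOfInput, repmat(Nofneurons, 1, NofLayers), NumberOfOutput];

% Small random weights (an assumption; the post uses rand).
W = cell(NofLayers + 1, 1);
for lay = 1:NofLayers + 1
    W{lay} = 0.05 * randn(layerSizes(lay), layerSizes(lay + 1));
end

X = rand(NumberOfInput, 1);               % stand-in for one image scaled to [0, 1]
desired = 1;

% Forward pass: A{1} is the input, A{lay+1} is the output of layer lay.
A = cell(NofLayers + 2, 1);
A{1} = X;
for lay = 1:NofLayers + 1
    net = W{lay}' * A{lay};               % netH (netO for the last layer)
    A{lay + 1} = sigmoid(net);            % H (O for the last layer)
end
O = A{end};

% Backward pass: start at the output and move toward the input.
sigma = cell(NofLayers + 1, 1);
sigma{end} = (desired - O) .* O .* (1 - O);
for lay = NofLayers:-1:1
    Hlay = A{lay + 1};                    % output of hidden layer lay
    sigma{lay} = (W{lay + 1} * sigma{lay + 1}) .* Hlay .* (1 - Hlay);
end

% Report the size of each sigma vector.
for lay = 1:NofLayers + 1
    fprintf('sigma{%d} is %d-by-%d\n', lay, size(sigma{lay}, 1), size(sigma{lay}, 2));
end
```

With this layout, the update for every layer can be written as an outer product, W{lay} = W{lay} + eta * A{lay} * sigma{lay}', where eta is the learning rate.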
My model is an example of supervised learning, so it needs desired outputs for the training images, stored in DESIRED below. The images inside Model_Training are ordered open-->sunglasses-->open-->sunglasses..., so I defined DESIRED in the same order, as shown below.
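A short sketch of two ways to build such a label vector follows. The file names are hypothetical and only follow the open/sunglasses pattern described above. The second option reads the label from the name itself, so it does not depend on the order in which dir returns the files.

```matlab
% Hypothetical file names following the pattern described in the post.
names = {'A_open.pgm'; 'A_sunglasses.pgm'; 'B_open.pgm'; 'B_sunglasses.pgm'};

% Option 1: alternate 0/1 by position (odd index -> 0, even index -> 1).
desiredByOrder = double(mod((1:numel(names))', 2) == 0);

% Option 2: derive each label from its file name, independent of ordering.
desiredByName = double(contains(names, 'sunglasses'));

% The two agree only while the files really alternate open/sunglasses.
disp(isequal(desiredByOrder, desiredByName));
```

With the theFiles struct returned by dir, the names could be collected with {theFiles.name}' before applying the second option.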
Here is the code:
%*******************VARIABLES*******************
NumberOfPatterns=numImages;
NumberOfInput=960;
NumberOfOutput=1;
LearningRate_ETA=0.5;
while true
    NofLayers=input("Layer number: "); % Hidden layer number
    Nofneurons=input("Neuron number: "); % Neuron number of each hidden layer
    Max_iteration=input("Max iteration number: "); % Max iteration
    if(Nofneurons<=0 || NofLayers<=0 || Max_iteration<=0)
        fprintf("These values can't be accepted !");
        fprintf("\nPlease enter again");
    else
        break;
    end
end
W = cell(NofLayers+1,1);
H=cell(NofLayers,1);
sigma=cell(NofLayers+1,1);
%***********************************************
% Random values are assigned to Weights
for i=1:NofLayers+1
    if i==1
        W{i}=rand(NumberOfInput,Nofneurons);
    elseif i==NofLayers+1
        W{i}=rand(Nofneurons,NumberOfOutput);
    else
        W{i}=rand(Nofneurons,Nofneurons);
    end
end
%***********************************************
DESIRED=zeros(NumberOfPatterns,1);
%****************Adjusting Desired Results******
%Training images are located in order. Ex:
%A_open.pgm
%A_sunglasses.pgm
for i=1:NumberOfPatterns
    if(mod(i,2)==1)
        DESIRED(i)=0;
    else
        DESIRED(i)=1;
    end
end
%************************************************
Now the processing starts. Note that I use the sigmoid function as the activation function.
To keep this post short, I will share the code directly.
%***********************Processing***************
for a=1:Max_iteration
    totalerr=0;
    for i = 1:NumberOfPatterns
        ImageVector = reshape(INPUT{i}, [], 1);
        X = double(ImageVector);
        for lay=1:NofLayers+1
            if(lay==1) %First connections
                netH=W{lay}'*X;
                H{lay}=sigmoid(netH);%%%
            elseif (lay==NofLayers+1) %Last connections
                netO=W{lay}'*H{lay-1};
                O=sigmoid(netO);
            else % between connections layers
                netH=W{lay}'*H{lay-1}; %
                H{lay}=sigmoid(netH);%
            end
        end
        err=DESIRED(i)-O;
        for j=1:NumberOfOutput
            sigma{NofLayers+1}=err*O(j)*(1-O(j)); %Last sigma value
        end
        for l=1:NofLayers
            for k=1:Nofneurons
                [rowsigma colsigma]=size(sigma{NofLayers-l+2});
                [rowW colsW]=size(W{NofLayers-l+2}(k,:));
                %These conditions ensure proper matrix multiplication
                if(colsigma==rowW)
                    sigma{NofLayers-l+1}=sigma{NofLayers-l+2}*W{NofLayers-l+2}(k,:) *H{NofLayers+1-l}(k)*(1-H{NofLayers+1-l}(k));
                else
                    sigma{NofLayers-l+1}=sigma{NofLayers-l+2}*W{NofLayers-l+2}(k,:)'*H{NofLayers+1-l}(k)*(1-H{NofLayers+1-l}(k));
                end
            end
        end
        for z=1:NofLayers+1
            %Weights are updated at this part
            if((NofLayers+2-z)==1)
                W{NofLayers+2-z}=W{NofLayers+2-z}+LearningRate_ETA*X*sigma{NofLayers+2-z};
            else
                W{NofLayers+2-z}=W{NofLayers+2-z}+LearningRate_ETA*H{NofLayers+1-z}*sigma{NofLayers+2-z};
            end
        end
        totalerr=totalerr+0.5*err^2;
    end
    cost(a)=totalerr;
end
plot(cost);
%%*****************Test Case********************
%Getting test image address from user
fileFilter = '*.pgm';
[filename, pathname] = uigetfile(fileFilter, 'Select a PGM file', '');
if isequal(filename, 0)
    disp('Program has stopped');
    return; % No file was selected, so stop before fullFilePath is used
else
    fullFilePath = fullfile(pathname, filename);
end
%**************Test Sample Operations*******
testSample=imread(fullFilePath); %
testSample=reshape(testSample,[],1);
X=double(testSample);
for lay=1:NofLayers+1
    if(lay==1) %First connections
        netH=W{lay}'*X;
        H{lay}=sigmoid(netH);
    elseif (lay==NofLayers+1) %Last connections
        netO=W{lay}'*H{lay-1};
        Out=round(sigmoid(netO));
    else % between connections layers
        netH=W{lay}'*H{lay-1};
        H{lay}=sigmoid(netH);
    end
end
fprintf('Result is: %d\n', Out);
%**********************Helper Functions*********
%Sigmoid Activation Function
function y = sigmoid(x)
    y = 1 ./ (1 + exp(-x));
end
You can run and test it with the files provided in the zip file. As far as I know, this kind of model needs a large number of layers and neurons. I tried layer-neuron settings of 4-20, 5-30, 5-35, ... It generally returns 1, and this is the problem I am struggling with.
I would appreciate any comments or feedback. Thank you again for your time.

Answers (1)

Shivansh 2024-6-29
Hi Omar!
It seems that your model predicts the label "1" most of the time and might be overfitting to it.
The implementation of the MLP looks fine and should be able to provide better results for this problem.
A few changes to your code might improve the performance of your model.
The first issue could be the class distribution in the training and test datasets. If the classes are unbalanced, try oversampling or undersampling.
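A minimal sketch of checking the class balance and oversampling the smaller class is shown below. The label vector is a made-up imbalanced example, not the DESIRED vector from the question, which alternates 0 and 1 and is therefore already balanced.

```matlab
rng(2);                                   % fixed seed so the sketch is repeatable

% Hypothetical imbalanced labels: 6 samples of class 0, 14 of class 1.
labels = [zeros(6, 1); ones(14, 1)];

% Count the samples in each class.
idx0 = find(labels == 0);
idx1 = find(labels == 1);
fprintf('class 0: %d samples, class 1: %d samples\n', numel(idx0), numel(idx1));

% Oversample the minority class (class 0 here) by drawing with replacement.
nExtra = numel(idx1) - numel(idx0);
extra = idx0(randi(numel(idx0), nExtra, 1));

% Shuffle the balanced index list before using it to pick training samples.
balancedIdx = [idx0; idx1; extra];
balancedIdx = balancedIdx(randperm(numel(balancedIdx)));

fprintf('after oversampling: %d vs %d\n', ...
    sum(labels(balancedIdx) == 0), sum(labels(balancedIdx) == 1));
```

The training loop would then iterate over balancedIdx instead of 1:NumberOfPatterns.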
You can initialize the weights from a normal distribution with smaller values to prevent possible saturation of the sigmoid function. For example, change the weight initialization to use the "randn" function and multiply the result by a constant of about 0.05.
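The saturation effect can be seen in a small sketch like the following. The sizes come from the question (960 inputs) plus a hypothetical 20 hidden neurons, and the image is a random stand-in. Scaling the pixels to [0, 1] is an extra assumption that is not part of the suggestion above. It is included because raw 0-255 pixel values summed over 960 inputs can still produce large net inputs even with small weights.

```matlab
rng(0);                                    % fixed seed so the sketch is repeatable
numInputs = 960;
numHidden = 20;                            % hypothetical hidden-layer width
X = 255 * rand(numInputs, 1);              % stand-in for one raw 8-bit image vector

sigmoid = @(x) 1 ./ (1 + exp(-x));

% Original scheme: uniform weights in [0, 1] applied to raw pixel values.
W_uniform = rand(numInputs, numHidden);
H_uniform = sigmoid(W_uniform' * X);       % net inputs are in the tens of thousands

% Suggested scheme: small zero-mean Gaussian weights, inputs scaled to [0, 1].
W_small = 0.05 * randn(numInputs, numHidden);
H_small = sigmoid(W_small' * (X / 255));

fprintf('uniform init: smallest activation = %.6f\n', min(H_uniform));
fprintf('small init:   activations lie in [%.3f, %.3f]\n', min(H_small), max(H_small));
```

When every hidden activation equals 1, the factor H*(1-H) in the backward pass is 0, so the hidden-layer weights receive no update.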
The current learning rate of 0.5 might be a little high for your problem. Try reducing it and analyze the impact on the model.
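One simple way to compare learning rates is to train from the same starting point with each value and look at the cost of the last epoch. The sketch below does this for a single sigmoid neuron on synthetic data. The data, the neuron, and the epoch count are all assumptions chosen to keep the example small, so the numbers say nothing about the face dataset itself.

```matlab
rng(3);                                    % fixed seed so the sketch is repeatable
sigmoid = @(x) 1 ./ (1 + exp(-x));

% Synthetic data: 5 features in [0, 1]; the label depends on the first feature.
numSamples = 40; numFeatures = 5;
Xall = rand(numFeatures, numSamples);
labels = double(Xall(1, :) > 0.5)';

etas = [0.01 0.05 0.2 0.5];                % learning rates to compare
numEpochs = 50;
finalCost = zeros(size(etas));

for e = 1:numel(etas)
    w = zeros(numFeatures, 1);             % same starting point for every rate
    b = 0;
    for epoch = 1:numEpochs
        cost = 0;
        for i = 1:numSamples
            x = Xall(:, i);
            o = sigmoid(w' * x + b);
            err = labels(i) - o;
            delta = err * o * (1 - o);     % gradient term for a sigmoid output
            w = w + etas(e) * delta * x;   % online (per-sample) update
            b = b + etas(e) * delta;
            cost = cost + 0.5 * err^2;
        end
    end
    finalCost(e) = cost;                   % cost accumulated over the last epoch
    fprintf('eta = %.2f -> last-epoch cost = %.4f\n', etas(e), cost);
end
```

Plotting the cost per epoch for each rate, as the question already does with plot(cost), would show whether a rate converges smoothly or oscillates.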
I also didn't see any bias terms in the provided code. Including bias terms can affect the model significantly.
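A small sketch of what a bias term changes, using a single sigmoid neuron: without a bias, an all-zero input always gives an output of 0.5 regardless of the weights. The weights, the bias value of -2, and the learning rate are hypothetical.

```matlab
rng(1);                                    % fixed seed so the sketch is repeatable
sigmoid = @(x) 1 ./ (1 + exp(-x));

X = zeros(960, 1);                         % an all-black image as an extreme case
w = 0.05 * randn(960, 1);                  % hypothetical weights of one neuron

% Without a bias the net input is 0, so the output is 0.5 for any weights.
outNoBias = sigmoid(w' * X);

% With a bias the neuron can shift its output away from 0.5.
b = -2;                                    % hypothetical bias value
outWithBias = sigmoid(w' * X + b);

% The bias is updated like a weight whose input is always 1.
desired = 0; eta = 0.05;
delta = (desired - outWithBias) * outWithBias * (1 - outWithBias);
b = b + eta * delta;

fprintf('no bias: %.3f, with bias: %.3f\n', outNoBias, outWithBias);
```

In a multi-layer network, each layer would get its own bias column vector, added to W{lay}'*input before the activation is applied.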
The "sigmoid" function might be fine for the model, but you might want to experiment with other activation functions such as "relu" or "tanh".
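Each activation needs a matching derivative in the backward pass. A minimal sketch of the three options as anonymous functions follows. The names dSigmoid, tanhAct, dTanh, relu, and dRelu are made up for this example.

```matlab
% Sigmoid: derivative written in terms of the output y = sigmoid(x).
sigmoid  = @(x) 1 ./ (1 + exp(-x));
dSigmoid = @(y) y .* (1 - y);

% Tanh: derivative written in terms of the output y = tanh(x).
tanhAct  = @(x) tanh(x);
dTanh    = @(y) 1 - y.^2;

% ReLU: derivative written in terms of the net input x.
relu     = @(x) max(x, 0);
dRelu    = @(x) double(x > 0);

% Compare the three activations on a few net-input values.
net = [-2; 0; 3];
disp([sigmoid(net), tanhAct(net), relu(net)]);
```

Because the targets are 0 and 1, it may be simplest to keep the sigmoid on the output neuron and try the alternatives in the hidden layers only. tanh outputs lie in [-1, 1] and ReLU outputs are unbounded above.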
You can try these changes and analyze the impact on the model to find the issue.
You can refer to the following documentation for more information on "randn":
I hope this helps resolve the issue.

Release

R2023a
