What transfer functions does patternnet use for the hidden and output layers?

2 views (last 30 days)
Hi, I am trying to learn how a neural network works. I created a network with MATLAB's patternnet to classify XOR. However, when I compute the output manually, I get a different result than net(input). According to this article, if you use the GUI, the sigmoid transfer function is used in both the hidden layer and the output layer (bullet 7), and if you use the command line, the tan-sigmoid transfer function is used in both the hidden and output layers (bullet 2). I tried both versions and still get a different result. Here is my code:
input = [0 0; 0 1; 1 0; 1 1]';
xor = [0 1; 1 0; 1 0; 0 1]';
% Create a larger sample size
input10 = repmat(input,1,10);
xor10 = repmat(xor,1,10);
% MatLab NN
net = patternnet(2);
net = train(net, input10, xor10);
% Get the weights
IW = net.IW;
b = net.b;
LW = net.LW;
IW = [IW{1}'; b{1}'];
LW = [LW{2}'; b{2}'];
%% Using tan-sigmoid
% Input to hidden layer
hid = zeros(2,1);
hidsig = zeros(2,1);
in = input(:,1);
for i = 1:2
    hid(i) = dot([in;1],IW(:,i));
    hidsig(i) = tansig(hid(i));
end
% Hidden to output layer without normalization
out = zeros(2,1);
outsig = zeros(2,1);
for i = 1:2
    out(i) = dot([hidsig;1],LW(:,i));
    outsig(i) = tansig(out(i));   % apply the transfer function to this layer's net input
end
outsoftmax = softmax(out);
outsoftmaxsig = softmax(outsig);
% Hidden to output layer with normalization
normout = zeros(2,1);
normoutsig = zeros(2,1);
normhidsig = hidsig./norm(hidsig);
for i = 1:2
    normout(i) = dot([normhidsig;1],LW(:,i));
    normoutsig(i) = tansig(normout(i));
end
normoutsoftmax = softmax(normout);
normoutsoftmaxsig = softmax(normoutsig);
result = net(in);
disp(result);
disp('tan-sigmoid');
disp(outsig);
disp(outsoftmax);
disp(outsoftmaxsig);
disp(normoutsig);
disp(normoutsoftmax);
disp(normoutsoftmaxsig);
%% Using sigmoid
% Input to hidden layer
hid = zeros(2,1);
hidsig = zeros(2,1);
in = input(:,1);
for i = 1:2
    hid(i) = dot([in;1],IW(:,i));
    hidsig(i) = sigmf(hid(i),[1,0]);
end
% Hidden to output layer without normalization
out = zeros(2,1);
outsig = zeros(2,1);
for i = 1:2
    out(i) = dot([hidsig;1],LW(:,i));
    outsig(i) = sigmf(out(i),[1,0]);   % sigmf(x,[1 0]) is the logistic sigmoid, equivalent to logsig(x)
end
outsoftmax = softmax(out);
outsoftmaxsig = softmax(outsig);
% Hidden to output layer with normalization
normout = zeros(2,1);
normoutsig = zeros(2,1);
normhidsig = hidsig./norm(hidsig);
for i = 1:2
    normout(i) = dot([normhidsig;1],LW(:,i));
    normoutsig(i) = sigmf(normout(i),[1,0]);
end
normoutsoftmax = softmax(normout);
normoutsoftmaxsig = softmax(normoutsig);
result = net(in);
disp('sigmoid');
disp(outsig);
disp(outsoftmax);
disp(outsoftmaxsig);
disp(normoutsig);
disp(normoutsoftmax);
disp(normoutsoftmaxsig);

Accepted Answer

Greg Heath 2015-1-18
1. Do not use xor as the name of a variable; it is the name of a function:
help xor
doc xor
2.
input = [0 0; 0 1; 1 0; 1 1]';
target = xor(input(1,:),input(2,:)) % NO SEMICOLON!
3. There is no good reason to add exact duplicates to this training set. In general, however, adding noisy duplicates can help if the net is to be used in an environment of noise, interference and measurement errors.
4. If you want to know what transfer functions are being used, all you have to do is ask:
net = patternnet(2) % NO SEMICOLON!
5. Also note that there are default normalizations (input and output processing functions).
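For example, a quick way to see both the transfer functions and the default processing is to query the network object (a sketch; the exact defaults depend on your toolbox release):
net = patternnet(2);           % untrained network with default settings
net.layers{1}.transferFcn      % hidden-layer transfer function
net.layers{2}.transferFcn      % output-layer transfer function
net.inputs{1}.processFcns      % default input processing (typically includes mapminmax)
net.outputs{2}.processFcns     % default output processing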
Hope this helps.
*Thank you for formally accepting my answer*
Greg
2 Comments
Timmy 2015-1-19
1. In general programming, using the name of a function as the name of a variable is a big no-no. However, in this example the name emphasizes that the variable holds the XOR targets. Furthermore, I will not be using the function. I am also using other targets in the program, such as AND and OR, so I don't want to use the name target.
2. Two things. First, the function xor takes two parameters and gives a logical output, which is not what I want; I want a system with two output nodes, so I defined the targets that way. Second, the semicolon suppresses the display generated by the assignment, which I don't need.
3. The reason for adding duplicates is to generate enough samples for the network to train on. By default, the network trains on 70% of the data, validates on 15% and tests on 15%. If I have just four inputs, two will be used for training, one for validation and one for testing, which would be inconclusive.
4. I am not trying to find out what the transfer functions are, but rather how they are used to classify the input.
5. I know there are normalizations, but I don't know where they are applied. As you can see in my code, I try with and without normalization.
Greg Heath 2015-1-20
1. "In general programming, using the name of a function as the name of a variable is a big no-no. However, in this example the name emphasizes that the variable holds the XOR targets. Furthermore, I will not be using the function. I am also using other targets in the program, such as AND and OR, so I don't want to use the name target."
Ridiculous answer. Write code so that others can follow it easily.
2. "Two things. First, the function xor takes two parameters and gives a logical output, which is not what I want; I want a system with two output nodes, so I defined the targets that way. Second, the semicolon suppresses the display generated by the assignment, which I don't need."
The purpose of NO SEMICOLON is to verify, DURING DESIGN, that the output is what you think it should be.
In this case it will also answer your question as to which transfer functions are used.
The finished design will have the semicolons.
3. "The reason for adding duplicates is to generate enough samples for the network to train on. By default, the network trains on 70% of the data, validates on 15% and tests on 15%. If I have just four inputs, two will be used for training, one for validation and one for testing, which would be inconclusive."
Exact duplicates add nothing to the design except reducing the number of epochs. The purpose of validation and test data is to make sure the design works on nontraining data. However, your nontraining data contains exactly the same vectors, so it is a waste of time.
Typically, what is done is to add noise to the training and validation duplicates, then test on the noise-free original data.
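For example, a minimal sketch of that approach, assuming input and target are the 2-by-4 input and 0/1 target matrices from the question and using zero-mean Gaussian noise with standard deviation 0.1:
input10 = repmat(input,1,10);                      % replicate the four cases
target10 = repmat(target,1,10);
noisyInput = input10 + 0.1*randn(size(input10));   % noisy duplicates for training/validation
net = patternnet(2);
net = train(net, noisyInput, target10);
testOutput = net(input)                            % test on the noise-free originals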
4. "I am not trying to find out what the transfer functions are, but rather how they are used to classify the input."
Targets are 0 or 1 and can be obtained from class indices via the function ind2vec.
With mutually exclusive classes, the suggested output transfer function is softmax, because the outputs are supposed to be consistent unit-sum estimates of the input-conditional posterior class probabilities in [0,1].
The maximum output indicates the class and is determined using the function vec2ind.
With non-mutually-exclusive classes (e.g., tall, dark, handsome), the unit-sum constraint does not apply and logsig is the appropriate output transfer function.
The class assignments are then determined by thresholds chosen using the validation data.
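A small sketch of that index/target conversion, reusing input and net from the question (the class indices here are illustrative):
classIndices = [1 2 2 1];                % class labels for [0 0], [0 1], [1 0], [1 1]
targets = full(ind2vec(classIndices))    % 2-by-4 matrix of 0/1 targets (ind2vec returns a sparse matrix)
outputs = net(input);                    % columns sum to ~1 if the output layer is softmax
predicted = vec2ind(outputs)             % index of the largest output in each column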
5. "I know there are normalizations, but I don't know where they are applied. As you can see in my code, I try with and without normalization."
I'll check. I cannot use your code.
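For reference, here is a sketch of a manual forward pass that includes the default input and output processing; it uses feval so that whatever transfer and processing functions the trained object reports are the ones applied (assumes a single hidden layer):
x = [0; 0];                                  % one input vector
% apply the input processing functions (e.g. removeconstantrows, mapminmax)
xp = x;
for k = 1:numel(net.inputs{1}.processFcns)
    xp = feval(net.inputs{1}.processFcns{k}, 'apply', xp, net.inputs{1}.processSettings{k});
end
a1 = feval(net.layers{1}.transferFcn, net.IW{1,1}*xp + net.b{1});   % hidden layer
a2 = feval(net.layers{2}.transferFcn, net.LW{2,1}*a1 + net.b{2});   % output layer
% reverse the output processing to map back to target units
y = a2;
for k = numel(net.outputs{2}.processFcns):-1:1
    y = feval(net.outputs{2}.processFcns{k}, 'reverse', y, net.outputs{2}.processSettings{k});
end
disp([y net(x)])                             % the two columns should agree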
Greg


More Answers (0)
