Building a Transformer for sorting numbers
Hi,
I am trying to build a transformer to sort some input numbers, but it gives an error. First it asked me to have an output layer, which is already there as a fully connected (FC) layer. Here is the code; can somebody help me? I have seen many examples use an FC layer as the output head.
% a complete transformer for sorting numbers
% Clear workspace
clear; clc;
% Check for GPU availability
if canUseGPU
    disp('GPU is available. Training on GPU.');
    executionEnvironment = 'gpu';
else
    disp('GPU is not available. Training on CPU.');
    executionEnvironment = 'cpu';
end
% Hyperparameters
numHeads = 1; % Number of attention heads
numLayers = 1; % Number of encoder and decoder layers
embeddingSize = 64; % Embedding size
hiddenSize = 128; % Hidden layer size
maxSequenceLength = 10; % Maximum sequence length
batchSize = 64; % Batch size
numEpochs = 10; % Number of epochs
learningRate = 0.001; % Learning rate
% Generate synthetic dataset
numSamples = 10000;
inputData = rand(numSamples, maxSequenceLength); % Random numbers between 0 and 1
outputData = sort(inputData, 2); % Sorted version of input data
% Split into training and validation sets
splitRatio = 0.8;
numTrain = floor(splitRatio * numSamples);
trainInput = inputData(1:numTrain, :);
trainOutput = outputData(1:numTrain, :);
valInput = inputData(numTrain+1:end, :);
valOutput = outputData(numTrain+1:end, :);
%% Convert data to cell arrays of 1-by-T sequences
trainInput_cell  = cell(size(trainInput,1), 1);
trainOutput_cell = cell(size(trainInput,1), 1);
for i = 1:size(trainInput,1)
    trainInput_cell{i,1}  = trainInput(i,:);
    trainOutput_cell{i,1} = trainOutput(i,:);
end
valInput_cell  = cell(size(valInput,1), 1);
valOutput_cell = cell(size(valInput,1), 1);
for i = 1:size(valInput,1)
    valInput_cell{i,1}  = valInput(i,:);
    valOutput_cell{i,1} = valOutput(i,:);
end
%% Defining networks
% Define the full model
inputLayer = sequenceInputLayer(1, 'Name', 'input'); % Input is a sequence of scalars
embeddingLayer = fullyConnectedLayer(embeddingSize, 'Name', 'embedding');
positionalEncoding = positionalEncodingLayer(maxSequenceLength, embeddingSize, 'positionalEncoding');
%% ****************************************************************
% Define custom transformer encoder layer
encoderLayers = [];
for i = 1:numLayers
    encoderLayers = [
        encoderLayers
        multiHeadAttentionLayer(numHeads, embeddingSize, ['encoderAttention' num2str(i)])
        additionLayer(2, 'Name', ['encoderAdd' num2str(i)])
        layerNormalizationLayer('Name', ['encoderNorm1' num2str(i)])
        fullyConnectedLayer(hiddenSize, 'Name', ['encoderFC1' num2str(i)])
        reluLayer('Name', ['encoderRelu' num2str(i)])
        fullyConnectedLayer(embeddingSize, 'Name', ['encoderFC2' num2str(i)])
        additionLayer(2, 'Name', ['encoderAdd2' num2str(i)])
        layerNormalizationLayer('Name', ['encoderNorm2' num2str(i)])
    ];
end
% Define custom transformer decoder layer
decoderLayers = [];
for i = 1:numLayers
    decoderLayers = [
        decoderLayers
        multiHeadAttentionLayer(numHeads, embeddingSize, ['decoderAttention1' num2str(i)])
        additionLayer(2, 'Name', ['decoderAdd1' num2str(i)])
        layerNormalizationLayer('Name', ['decoderNorm1' num2str(i)])
        multiHeadAttentionLayer(numHeads, embeddingSize, ['decoderAttention2' num2str(i)])
        additionLayer(2, 'Name', ['decoderAdd2' num2str(i)])
        layerNormalizationLayer('Name', ['decoderNorm2' num2str(i)])
        fullyConnectedLayer(hiddenSize, 'Name', ['decoderFC1' num2str(i)])
        reluLayer('Name', ['decoderRelu' num2str(i)])
        fullyConnectedLayer(embeddingSize, 'Name', ['decoderFC2' num2str(i)])
        additionLayer(2, 'Name', ['decoderAdd3' num2str(i)])
        layerNormalizationLayer('Name', ['decoderNorm3' num2str(i)])
    ];
end
%% ****************************************************************
% Assemble the encoder
encoder = [
    inputLayer
    embeddingLayer
    % positionalEncoding
    encoderLayers
];
% Assemble the decoder
decoder = [
    % inputLayer
    % embeddingLayer
    % positionalEncoding
    decoderLayers
];
% Output layer
outputLayer = fullyConnectedLayer(1, 'Name', 'output'); % Predicts a scalar at each time step
% Assemble the full model
layers = [
    encoder
    decoder
    outputLayer
    % regressionLayer('Name', 'regression') % Use regression for continuous output
];
% Convert to a layerGraph and wire up the skip connections
% net = dlnetwork;
% net = addLayers(net,layers);
net = layerGraph(layers);
% Each additionLayer needs a second input: route the residual
% (skip) connection from the block input to its 'in2' port
net = connectLayers(net, "embedding", "encoderAdd1/in2");
net = connectLayers(net, "encoderNorm11", "encoderAdd21/in2");
net = connectLayers(net, "encoderNorm21", "decoderAdd11/in2");
net = connectLayers(net, "decoderNorm11", "decoderAdd21/in2");
net = connectLayers(net, "decoderNorm21", "decoderAdd31/in2");
% analyzeNetwork(net)
% plot(net)
% Training options
options = trainingOptions('adam', ...
    'MaxEpochs', numEpochs, ...
    'MiniBatchSize', batchSize, ...
    'InitialLearnRate', learningRate, ...
    'Shuffle', 'every-epoch', ...
    'ValidationData', {valInput_cell, valOutput_cell}, ...
    'ValidationFrequency', 30, ...
    'ExecutionEnvironment', executionEnvironment, ...
    'Plots', 'training-progress', ...
    'Verbose', false);
% Train the model on the cell-array sequences
trained_net = trainNetwork(trainInput_cell, trainOutput_cell, net, options);
1 Comment
Walter Roberson
2025-2-25
PositionalEncodingLayer appears to be part of https://www.mathworks.com/matlabcentral/fileexchange/120873-eeg_blink
Answers (1)
Gayathri
2025-4-15
The error is occurring because "positionalEncodingLayer" is not a predefined function in MATLAB. You will need to create this custom layer yourself or use the definition provided with the example where you found the code.
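For reference, a minimal sketch of such a custom layer could look like the following. It matches the constructor call used in your code; the sinusoidal formula is the standard transformer recipe, and the C-by-N-by-S input layout is what custom layers receive for sequence data, so treat this as a starting point rather than a verified drop-in:
% Minimal sinusoidal positional encoding layer (sketch).
% Save as positionalEncodingLayer.m on the MATLAB path.
classdef positionalEncodingLayer < nnet.layer.Layer
    properties
        MaxSequenceLength
        EmbeddingSize
    end
    methods
        function layer = positionalEncodingLayer(maxSeqLen, embSize, name)
            layer.Name = name;
            layer.MaxSequenceLength = maxSeqLen;
            layer.EmbeddingSize = embSize;
        end
        function Z = predict(layer, X)
            % For sequence data, custom layers receive X as a
            % C-by-N-by-S array (channels x observations x time steps).
            [C, ~, S] = size(X);
            pos = reshape(0:S-1, 1, 1, S);               % time positions
            chan = (0:C-1)';                             % channel indices
            angles = pos ./ 10000.^(2*floor(chan/2)/C);  % C-by-1-by-S
            pe = zeros(C, 1, S);
            pe(1:2:end,:,:) = sin(angles(1:2:end,:,:));  % even channels
            pe(2:2:end,:,:) = cos(angles(2:2:end,:,:));  % odd channels
            Z = X + pe;                                  % broadcasts over batch
        end
    end
end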
Alternatively, you can use the built-in "sinusoidalPositionEncodingLayer" or "positionEmbeddingLayer" (available in Deep Learning Toolbox since R2023b) to encode position information. Please refer to the documentation of these functions for details.
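For example, here is a sketch of the embedding stage using "positionEmbeddingLayer"; the layer names are placeholders, and the pattern of summing the value embedding and the position embedding through an additionLayer follows the documented usage:
% Embedding stage with a built-in position embedding (sketch);
% sinusoidalPositionEncodingLayer(embeddingSize) can be used the same way.
emb = [
    sequenceInputLayer(1, 'Name', 'input')
    fullyConnectedLayer(embeddingSize, 'Name', 'embedding')
    positionEmbeddingLayer(embeddingSize, maxSequenceLength, 'Name', 'posEmb')
    additionLayer(2, 'Name', 'embAdd')
];
lgraph = layerGraph(emb);
% Sum the value embedding and the position embedding
lgraph = connectLayers(lgraph, 'embedding', 'embAdd/in2');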