Reducing 3D to 2D in Neural Network Training

Question

I MING 2024-10-30，9:20

Edited: Umang Pandey 2024-11-5，4:19

I want to reduce 3-dimensional data to 2-dimensional data during neural network training, not in preprocessing, using either a custom layer inside the network or a built-in layer. Here is my custom layer. It doesn't achieve the result I want: I am trying to reduce the dimension from 128x1xN to 128xN.

    
classdef DownDimension < nnet.layer.Layer
    % Custom dimension-reduction layer: reduce (S*S*C) to (S*C)
    
    methods
        function layer = DownDimension(name)
            % Layer constructor
            layer.Name = name;
            layer.Description = "Squeeze layer from (S*S*C) to (S*C)";
        end
        
        function Z = predict(layer, X)
            % Forward pass
            % Assume X has size (128, 1)
            
            % Display the original size of X
            disp('Original size of X:');
            disp(size(X));
            
            % Reshape to (128, 1, N) if needed
            Z = reshape(X, [size(X,1), 1, size(X,2)]);
            
            % Display the new size of Z
            disp('New size of Z after reshaping:');
            disp(size(Z));
        end
    end
end
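
Note that the predict method above reshapes a 128xN input into 128x1xN, which is the opposite direction from the stated goal. As a rough sketch of the intended direction, and only under the assumption that the incoming data is a formatted dlarray labeled "SSCB" with size 128x1xCxN, the singleton spatial dimension could be dropped with a built-in functionLayer. The helper name squeezeSpatial and the layer name are hypothetical, and whether the "SCB" output suits the downstream convolution1dLayer depends on the rest of the network.

```matlab
% Hypothetical helper: drop the singleton second spatial dimension.
% Assumes X is a formatted dlarray labeled "SSCB" with size S x 1 x C x B.
squeezeSpatial = @(X) dlarray( ...
    reshape(stripdims(X), size(X,1), size(X,3), size(X,4)), "SCB");

% Wrap the function in a built-in functionLayer so it can be placed
% in a layer array; Formattable=true lets it change the data format.
layer = functionLayer(squeezeSpatial, Formattable=true, ...
    Name="encoder_block1_col_squeeze");

% Quick check on synthetic data: 128 x 1 x 1 x 4, labeled "SSCB".
X = dlarray(rand(128, 1, 1, 4, 'single'), "SSCB");
Z = squeezeSpatial(X);
disp(size(Z))
disp(dims(Z))
```

The values are copied unchanged; only the singleton dimension and its label are removed, so nothing is learned or averaged in this step.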


I MING 2024-10-30，9:38

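I want to perform the two-dimensional computation first, then reduce the dimension and pass the result through 1-D convolution layers. Below is my code, which implements the training of the transformer-encoding layer described in a journal paper.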

numHeads = 8;
numKeyChannels = 16;
modelSize=1;
kernalsize=17;
filter_num=1;
d_model=128;
frame_mo=128;
flattenedSize = [128 * 128, 1];
%% encoder_block1================
TF_block1_colmean=[
    imageInputLayer([d_model frame_mo], 'Name', "input") % changed to a spectrogram input layer
    selfattention2dLayer(numHeads, numKeyChannels, modelSize, "encoder_block1_self2d")
    %colmean
    ColumnMeanLayer("encoder_block1_colmean")
    DownDimension("encoder_block1_col_downdim1")
    convolution1dLayer(kernalsize, filter_num, 'Name', 'encoder_block1_colmean_conv1', 'Padding', 'same') % kernel size kernalsize (17), filter_num (1) filter
    reluLayer('Name', 'encoder_block1_colmean_relu')
    convolution1dLayer(kernalsize, filter_num, 'Name', 'encoder_block1_colmean_conv2', 'Padding', 'same') % kernel size kernalsize (17), filter_num (1) filter
    sigmoidLayer('Name', 'encoder_block1_colmean_sigmoid')
    ];
TF_block1_rowmean=[
    %rowmean
    RowMeanLayer('encoder_block1_rowmean')
    DownDimension("encoder_block1_row_downdim1")
    convolution1dLayer(kernalsize, filter_num, 'Name', 'encoder_block1_rowmean_conv1', 'Padding', 'same') % kernel size kernalsize (17), filter_num (1) filter
    reluLayer('Name', 'encoder_block1_rowmean_relu')
    convolution1dLayer(kernalsize, filter_num, 'Name', 'encoder_block1_rowmean_conv2', 'Padding', 'same') % kernel size kernalsize (17), filter_num (1) filter
    sigmoidLayer('Name', 'encoder_block1_rowmean_sigmoid')
    VectorMultiplicationLayer("encoder_block1_mult")
    %add&mean
    additionLayer(2, 'Name', "encoder_block1_add1")
    layerNormalizationLayer('Name', "encoder_block1_norm1")
    ];
feedforward_layers=[
    %feed-forward
    fullyConnectedLayer(64, 'Name', "encoder_block1_fc1")
    reluLayer('Name', "encoder_block1_relu")
    fullyConnectedLayer(d_model*frame_mo*1, 'Name', "encoder_block1_fc2")
    reshapeLayerfc('encoder_block1_reshape', [d_model,frame_mo])
    additionLayer(2, 'Name', "encoder_block1_add2")
    layerNormalizationLayer('Name', "encoder_block1_norm2")
    ];
%encoder_block1_connect
lgraph1 = layerGraph(TF_block1_colmean);
%lgraph1 = addLayers(lgraph1,TF_block1_colmean);
lgraph1 = addLayers(lgraph1,TF_block1_rowmean);
lgraph1 = addLayers(lgraph1,feedforward_layers);
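% connect the network input to the first addition layer
lgraph1 = connectLayers(lgraph1, 'input', 'encoder_block1_add1/in2');
% connect the self-attention output to the row-mean branch
lgraph1 = connectLayers(lgraph1, 'encoder_block1_self2d', 'encoder_block1_rowmean');
% connect the column-mean branch to the vector multiplication layer
lgraph1 = connectLayers(lgraph1, 'encoder_block1_colmean_sigmoid', 'encoder_block1_mult/in2');
% second input of the second addition layer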
lgraph1 = connectLayers(lgraph1, 'encoder_block1_norm1', 'encoder_block1_add2/in2');
lgraph1 = connectLayers(lgraph1, 'encoder_block1_norm1', 'encoder_block1_fc1');
%plot(lgraph1);
analyzeNetwork(lgraph1);

My self-attention is a custom layer that implements the two-dimensional version described in the paper. Below is my 2-D self-attention layer:

classdef selfattention2dLayer < nnet.layer.Layer
    properties
        % Layer properties.
        NumHeads
        NumKeyChannels
        ModelSize
    end
    properties (Learnable)
        % Layer learnable parameters.
        QueryWeights
        KeyWeights
        ValueWeights
        OutputWeights
    end
    methods
        function layer = selfattention2dLayer(numHeads, numKeyChannels, modelSize, name)
            % Create an instance of the layer.
            layer.Name = name;
            layer.NumHeads = numHeads;
            layer.NumKeyChannels = numKeyChannels;
            layer.ModelSize = modelSize;
            % Initialize learnable parameters.
            layer.QueryWeights = randn([1 1 modelSize numKeyChannels], 'single');
            layer.KeyWeights = randn([1 1 modelSize numKeyChannels], 'single');
            layer.ValueWeights = randn([1 1 modelSize numKeyChannels], 'single');
            layer.OutputWeights = randn([1 1 numKeyChannels*numHeads modelSize], 'single');
        end
        function Z = predict(layer, X)
            % Implement the forward pass of the layer.
            % X is the input feature map (H x W x C x N).
            % Specify data format
            dataFormat = 'SSCB'; % 'SSCB' indicates spatial-spatial-channel-batch
            % Create zero biases
            zeroBiasQuery = zeros(1, 1, size(layer.QueryWeights, 4), 'single');
            zeroBiasKey = zeros(1, 1, size(layer.KeyWeights, 4), 'single');
            zeroBiasValue = zeros(1, 1, size(layer.ValueWeights, 4), 'single');
            zeroBiasOutput = zeros(1, 1, size(layer.OutputWeights, 4), 'single');
            % Linear projections.
            Q = dlconv(X, layer.QueryWeights, zeroBiasQuery, 'DataFormat', dataFormat);
            K = dlconv(X, layer.KeyWeights, zeroBiasKey, 'DataFormat', dataFormat);
            V = dlconv(X, layer.ValueWeights, zeroBiasValue, 'DataFormat', dataFormat);
            % Get the dimensions.
            [H, W, ~, N] = size(X);
            % Check dimensions before reshaping
            % sizeQ = size(Q);
            % sizeK = size(K);
            % sizeV = size(V);
            % disp(['Size of Q: ', num2str(sizeQ)]);
            % disp(['Size of K: ', num2str(sizeK)]);
            % disp(['Size of V: ', num2str(sizeV)]);
            % Reshape Q, K, V to [H*W, numKeyChannels, NumHeads, N]
            Q = reshape(Q, [], layer.NumKeyChannels, layer.NumHeads, N);
            K = reshape(K, [], layer.NumKeyChannels, layer.NumHeads, N);
            V = reshape(V, [], layer.NumKeyChannels, layer.NumHeads, N);
            % Permute Q, K, V to [H*W, NumHeads, numKeyChannels, N] for batch-wise computation
            Q = permute(Q, [1, 3, 2, 4]);
            K = permute(K, [1, 3, 2, 4]);
            V = permute(V, [1, 3, 2, 4]);
            % Scaled dot-product attention.
            dk = size(K, 3);
            scores = pagemtimes(Q, 'none', K, 'transpose') / sqrt(dk);
            attentionWeights = softmax(scores, 'DataFormat', 'SSCB');
            A = pagemtimes(attentionWeights, V);
            % Permute A back to [H*W, numKeyChannels*NumHeads, N]
            A = permute(A, [1, 3, 2, 4]);
            % Reshape A to [H, W, numKeyChannels*NumHeads, N]
            A = reshape(A, H, W, [], N);
            % Update OutputWeights dimensions to match A's channel dimension
            layer.OutputWeights = randn([1 1 size(A, 3) size(layer.OutputWeights, 4)], 'single');
            % Concatenate heads and project.
            Z = dlconv(A, layer.OutputWeights, zeroBiasOutput, 'DataFormat', dataFormat);
            % disp(['Converted self-Z type: ', class(Z)]);
        end
    end
end

I MING 2024-10-30，9:38

Yes, I tried to reduce the dimension from 128x1xN to 128xN


Answer 1

Umang Pandey 2024-11-5，4:19

Edited: Umang Pandey 2024-11-5，4:19

Hi,

You can use the "squeeze" function to convert an array of size 128x1xN to 128xN. The function removes dimensions of length 1 (singleton dimensions) from the array. Refer to the following MATLAB documentation for implementation details and examples:

https://www.mathworks.com/help/matlab/ref/squeeze.html
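
As a minimal sketch of the suggestion above, the following applies "squeeze" to a synthetic 128x1xN numeric array (N = 5 is an arbitrary choice for illustration) and compares it with an explicit "reshape". This runs on plain arrays outside of any network; using it inside a layer would additionally depend on how the layer's data is formatted.

```matlab
% Synthetic 128 x 1 x N array; N = 5 is chosen only for illustration.
N = 5;
X = rand(128, 1, N);

% squeeze removes the singleton second dimension.
Z = squeeze(X);
disp(size(Z))   % 128 5

% An explicit reshape gives the same result and states the target size.
Zr = reshape(X, size(X, 1), size(X, 3));
assert(isequal(Z, Zr));
```

The explicit "reshape" can be the safer choice when other dimensions might also have length 1, because "squeeze" would remove those as well.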

Note that "squeeze" only removes dimensions of length 1. If you instead want to perform dimensionality reduction, that is, reduce the number of dimensions while preserving as much of the information in the data as possible, you can use popular dimensionality reduction (DR) techniques such as PCA (Principal Component Analysis) or SOM (Self-Organizing Maps). Refer to the following MATLAB documentation for more information:

Reduce dimensionality using PCA : https://www.mathworks.com/help/stats/reducedimensionalitytask.html
SOM : https://www.mathworks.com/help/deeplearning/ref/selforgmap.html
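
To illustrate how PCA differs from simply removing a singleton dimension, here is a small sketch using the "pca" function from Statistics and Machine Learning Toolbox on synthetic data. The data, the number of observations (200), and the number of retained components (k = 2) are assumptions made purely for the example.

```matlab
% Synthetic data: 200 observations (rows) with 128 features (columns).
rng(0);
X = randn(200, 128);

% Principal component analysis; score holds the data expressed in
% principal-component coordinates, explained the variance percentages.
[coeff, score, ~, ~, explained] = pca(X);

% Keep the first k components (k = 2 is an arbitrary illustrative choice).
k = 2;
Xreduced = score(:, 1:k);
disp(size(Xreduced))   % 200 2
```

Unlike "squeeze", this changes the values themselves: each retained column is a linear combination of the original 128 features.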

Best,

Umang
