Problem with categories in LSTM network

Hi everyone,
I am using an LSTM network to break ciphers. The input is a password encoded as a 94x6 matrix. I took the 94 valid characters from the ASCII table. The password is always 6 characters long, and there are 94 categories. I generated training data and it worked quite well. Now I would like to add validation data. The validation passwords should not be the same as the training ones, which I can handle. But when I generate them, the network will not train because it thinks the validation data does not have the same categories. I checked, and they look exactly the same. Any help would be appreciated.
k = 5; % character shift for the Caesar cipher
dlzka_Hesla = 6; % length of the randomly generated passwords
N = 4000; % number of generated passwords (training set size)
max_Epochs = 100; % maximum number of training epochs
sifra = 1; % cipher selector: 1 = Caesar, 2 = Vigenere
% create a vector of the 94 characters allowed in passwords
uniqueCharacters = char(33:126);
NumValid = 1000;
NumTest = 1000;
[textData,ValidationData,testData] = GenPasswords(uniqueCharacters,dlzka_Hesla,N,NumValid,NumTest);
% for each network input (one password) create categorical variables
[XTrain,YTrain,vec_cipher2] = GenTrainData(uniqueCharacters,sifra,N,textData,k); % convert strings to one-hot vectors
[XValidation,YValidation,vec_cipher3] = GenTrainData(uniqueCharacters,sifra,length(ValidationData),ValidationData,k);
Cat_Train = sort(categories([YTrain{:}]));
Cat_Val = sort(categories([YValidation{:}]));
CAT = [ Cat_Train Cat_Val];
if (sum(strcmp(Cat_Train,Cat_Val)) == 94)
    disp('Categories are identical')
end
function [XTrain,YTrain,vec_cipher2] = GenTrainData(uniqueCharacters,sifra,N,textData,k)
    % store the number of unique characters
    numUniqueCharacters = numel(uniqueCharacters);
    c = 1;
    for i = 1:N
        % pick one password
        characters = textData{i};
        % get the number of characters
        sequenceLength = numel(characters);
        % find the index of each character in the list of unique characters
        [~,idx] = ismember(characters,uniqueCharacters);
        % create the matrix that marks which character occurs at each position,
        % i.e. the one-hot encoding of the input
        X = zeros(numUniqueCharacters,sequenceLength);
        % for every character that occurs in the text, set its flag to 1
        for j = 1:sequenceLength
            X(idx(j),j) = 1;
        end
        % create the expected output mapping using the selected cipher
        if (sifra == 1)
            cipher = caesar_cipher(textData{i}, k);
        end
        if (sifra == 2)
            cipher = vignerie_cipher(textData{i},[25, 14, 17, 10]);
        end
        vec_cipher2(i,1:6) = cipher;
        % if the ciphertext contains characters that are not in the dictionary, ignore it
        b = false;
        for s = 1:length(cipher)
            if (sum(ismember(uniqueCharacters, cipher(s))) <= 0)
                b = true;
                break;
            end
        end
        % do not add this example to the training data
        if (b)
            continue;
        end
        % create the encrypted version of the input text produced by the cipher
        charactersOutput = cellstr(cipher')';
        % convert the output vector to a categorical variable (array of characters)
        Y = categorical(charactersOutput);
        % add the created input-output pair to the dataset
        XTrain{c} = X;
        YTrain{c} = Y;
        c = c + 1;
    end
end
function [textData,validationData,testData] = GenPasswords(uniqueCharacters,dlzka_Hesla,N,NumValid,NumTest)
    % generate random indices for building the passwords
    r = randi([1 length(uniqueCharacters)],dlzka_Hesla,N);
    % generate the passwords as text from the indices
    textData = cell(N,1);
    for e = 1:N
        textData{e} = uniqueCharacters(r(:,e));
    end
    r = randi([1 length(uniqueCharacters)],dlzka_Hesla,10000);
    k = 1;
    validationData = cell(NumValid,1);
    % generate the validation set
    for e = 1:N
        validationData{k,1} = uniqueCharacters(r(:,e));
        if (sum(ismember(validationData{k,1},textData{e,1})) ~= 6)
            k = k+1;
            if k == 1001
                break;
            end
        end
    end
    testData = cell(NumTest,1);
    for i = 1:NumTest
        % generate the password as text from the indices
        testData{i} = uniqueCharacters(r(:,i));
    end
end
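
One possible cause, offered as an assumption rather than a confirmed diagnosis: in GenTrainData, `categorical(charactersOutput)` is called once per password, so each Y only knows the categories that occur in that one password. Concatenating with `[YTrain{:}]` then appears to build the category list in order of first appearance, so the training and validation responses could hold the same 94 categories in a different order, and the `sort` in the check above would hide that difference. A minimal sketch that passes an explicit value set, so every response carries the same categories in the same order (the two ciphertexts are hypothetical examples, not data from the question):

```matlab
% Sketch only: fix the category set and its order up front.
uniqueCharacters = char(33:126);          % same 94 characters as in the question
valueSet = cellstr(uniqueCharacters');    % 94x1 cell array, one character per cell

cipherA = '&Xr9!~';                       % hypothetical example ciphertexts
cipherB = 'zz#A0b';

% Passing valueSet gives every response all 94 categories in ASCII order,
% regardless of which characters actually occur in the ciphertext.
YA = categorical(cellstr(cipherA')', valueSet);
YB = categorical(cellstr(cipherB')', valueSet);

% Order-sensitive comparison (no sort), stricter than comparing sorted lists.
sameCategories = isequal(categories(YA), categories(YB));
disp(sameCategories)
```

In GenTrainData this would correspond to replacing `Y = categorical(charactersOutput);` with `Y = categorical(charactersOutput, cellstr(uniqueCharacters'));`. Comparing `categories([YTrain{:}])` and `categories([YValidation{:}])` with `isequal` and without `sort` should show whether the order is what differs.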

Release

R2019b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by