
Generate Text Using Deep Learning

This example shows how to train a deep learning long short-term memory (LSTM) network to generate text.

To train a deep learning network for text generation, train a sequence-to-sequence LSTM network to predict the next character in a sequence of characters. To train the network to predict the next character, specify the input sequences shifted by one time step as the responses.
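To make the shift concrete, here is a minimal sketch with toy data (the sequence 'abc' and the '$' end marker are illustrative stand-ins, not from the example): the target at each time step is the character at the next time step.

sequence = 'abc';
predictors = sequence              % input characters: 'abc'
responses = [sequence(2:end) '$']  % next characters: 'bc$', where '$' stands in for an end-of-text character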

To input a sequence of characters into an LSTM network, convert each training observation to a sequence of characters represented by the vectors x ∈ ℝᴰ, where D is the number of unique characters in the vocabulary. For each vector, xᵢ = 1 if x corresponds to the character with index i in a given vocabulary, and xⱼ = 0 for j ≠ i.
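For example, here is a minimal one-hot encoding sketch with a toy vocabulary (illustrative only; the example builds these vectors in a loop later): with the vocabulary 'abc', D = 3, and the character 'b' maps to the vector with a 1 in position 2.

vocab = 'abc';              % toy vocabulary, D = 3
c = 'b';                    % character to encode
x = zeros(numel(vocab),1);
x(vocab == c) = 1;          % one-hot vector: [0;1;0]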

Load Training Data

Extract the text data from the text file sonnets.txt.

filename = "sonnets.txt";
textData = fileread(filename);

The sonnets are indented by two whitespace characters and are separated by two newline characters. Remove the indentations using replace and split the text into separate sonnets using split. Remove the main title (the first three elements) and the sonnet titles, which appear before each sonnet.

textData = replace(textData,"  ","");
textData = split(textData,[newline newline]);
textData = textData(5:2:end);

View the first few documents.

textData(1:10)
ans = 10×1 cell
    {'From fairest creatures we desire increase,↵That thereby beauty's rose might never die,↵But as the riper should by time decease,↵His tender heir might bear his memory:↵But thou, contracted to thine own bright eyes,↵Feed'st thy light's flame with self-substantial fuel,↵Making a famine where abundance lies,↵Thy self thy foe, to thy sweet self too cruel:↵Thou that art now the world's fresh ornament,↵And only herald to the gaudy spring,↵Within thine own bud buriest thy content,↵And tender churl mak'st waste in niggarding:↵Pity the world, or else this glutton be,↵To eat the world's due, by the grave and thee.'                                 }
    {'When forty winters shall besiege thy brow,↵And dig deep trenches in thy beauty's field,↵Thy youth's proud livery so gazed on now,↵Will be a tatter'd weed of small worth held:↵Then being asked, where all thy beauty lies,↵Where all the treasure of thy lusty days;↵To say, within thine own deep sunken eyes,↵Were an all-eating shame, and thriftless praise.↵How much more praise deserv'd thy beauty's use,↵If thou couldst answer 'This fair child of mine↵Shall sum my count, and make my old excuse,'↵Proving his beauty by succession thine!↵This were to be new made when thou art old,↵And see thy blood warm when thou feel'st it cold.'               }
    {'Look in thy glass and tell the face thou viewest↵Now is the time that face should form another;↵Whose fresh repair if now thou not renewest,↵Thou dost beguile the world, unbless some mother.↵For where is she so fair whose unear'd womb↵Disdains the tillage of thy husbandry?↵Or who is he so fond will be the tomb,↵Of his self-love to stop posterity?↵Thou art thy mother's glass and she in thee↵Calls back the lovely April of her prime;↵So thou through windows of thine age shalt see,↵Despite of wrinkles this thy golden time.↵But if thou live, remember'd not to be,↵Die single and thine image dies with thee.'                                    }
    {'Unthrifty loveliness, why dost thou spend↵Upon thy self thy beauty's legacy?↵Nature's bequest gives nothing, but doth lend,↵And being frank she lends to those are free:↵Then, beauteous niggard, why dost thou abuse↵The bounteous largess given thee to give?↵Profitless usurer, why dost thou use↵So great a sum of sums, yet canst not live?↵For having traffic with thy self alone,↵Thou of thy self thy sweet self dost deceive:↵Then how when nature calls thee to be gone,↵What acceptable audit canst thou leave?↵Thy unused beauty must be tombed with thee,↵Which, used, lives th' executor to be.'                                                      }
    {'Those hours, that with gentle work did frame↵The lovely gaze where every eye doth dwell,↵Will play the tyrants to the very same↵And that unfair which fairly doth excel;↵For never-resting time leads summer on↵To hideous winter, and confounds him there;↵Sap checked with frost, and lusty leaves quite gone,↵Beauty o'er-snowed and bareness every where:↵Then were not summer's distillation left,↵A liquid prisoner pent in walls of glass,↵Beauty's effect with beauty were bereft,↵Nor it, nor no remembrance what it was:↵But flowers distill'd, though they with winter meet,↵Leese but their show; their substance still lives sweet.'                   }
    {'Then let not winter's ragged hand deface,↵In thee thy summer, ere thou be distill'd:↵Make sweet some vial; treasure thou some place↵With beauty's treasure ere it be self-kill'd.↵That use is not forbidden usury,↵Which happies those that pay the willing loan;↵That's for thy self to breed another thee,↵Or ten times happier, be it ten for one;↵Ten times thy self were happier than thou art,↵If ten of thine ten times refigur'd thee:↵Then what could death do if thou shouldst depart,↵Leaving thee living in posterity?↵Be not self-will'd, for thou art much too fair↵To be death's conquest and make worms thine heir.'                                }
    {'Lo! in the orient when the gracious light↵Lifts up his burning head, each under eye↵Doth homage to his new-appearing sight,↵Serving with looks his sacred majesty;↵And having climb'd the steep-up heavenly hill,↵Resembling strong youth in his middle age,↵Yet mortal looks adore his beauty still,↵Attending on his golden pilgrimage:↵But when from highmost pitch, with weary car,↵Like feeble age, he reeleth from the day,↵The eyes, 'fore duteous, now converted are↵From his low tract, and look another way:↵So thou, thyself outgoing in thy noon:↵Unlook'd, on diest unless thou get a son.'                                                            }
    {'Music to hear, why hear'st thou music sadly?↵Sweets with sweets war not, joy delights in joy:↵Why lov'st thou that which thou receiv'st not gladly,↵Or else receiv'st with pleasure thine annoy?↵If the true concord of well-tuned sounds,↵By unions married, do offend thine ear,↵They do but sweetly chide thee, who confounds↵In singleness the parts that thou shouldst bear.↵Mark how one string, sweet husband to another,↵Strikes each in each by mutual ordering;↵Resembling sire and child and happy mother,↵Who, all in one, one pleasing note do sing:↵Whose speechless song being many, seeming one,↵Sings this to thee: 'Thou single wilt prove none.''}
    {'Is it for fear to wet a widow's eye,↵That thou consum'st thy self in single life?↵Ah! if thou issueless shalt hap to die,↵The world will wail thee like a makeless wife;↵The world will be thy widow and still weep↵That thou no form of thee hast left behind,↵When every private widow well may keep↵By children's eyes, her husband's shape in mind:↵Look! what an unthrift in the world doth spend↵Shifts but his place, for still the world enjoys it;↵But beauty's waste hath in the world an end,↵And kept unused the user so destroys it.↵No love toward others in that bosom sits↵That on himself such murd'rous shame commits.'                           }
    {'For shame! deny that thou bear'st love to any,↵Who for thy self art so unprovident.↵Grant, if thou wilt, thou art belov'd of many,↵But that thou none lov'st is most evident:↵For thou art so possess'd with murderous hate,↵That 'gainst thy self thou stick'st not to conspire,↵Seeking that beauteous roof to ruinate↵Which to repair should be thy chief desire.↵O! change thy thought, that I may change my mind:↵Shall hate be fairer lodg'd than gentle love?↵Be, as thy presence is, gracious and kind,↵Or to thyself at least kind-hearted prove:↵Make thee another self for love of me,↵That beauty still may live in thine or thee.'                     }

Convert Text Data to Sequences

Convert the text data to sequences of vectors for the predictors and categorical sequences for the responses.

Create special characters to denote "start of text", "whitespace", "end of text", and "newline". Use the special characters "\x0002" (start of text), "\x00B7" ("·", middle dot), "\x2403" ("␃", end of text), and "\x00B6" ("¶", pilcrow), respectively. To prevent ambiguity, choose special characters that do not appear in the text. Because these characters do not appear in the training data, they are safe to use for this purpose.

startOfTextCharacter = compose("\x0002");
whitespaceCharacter = compose("\x00B7");
endOfTextCharacter = compose("\x2403");
newlineCharacter = compose("\x00B6");

For each observation, insert the start of text character at the beginning and replace the whitespace and newlines with the corresponding characters.

textData = startOfTextCharacter + textData;
textData = replace(textData,[" " newline],[whitespaceCharacter newlineCharacter]);

Create a vocabulary of the unique characters in the text.

uniqueCharacters = unique([textData{:}]);
numUniqueCharacters = numel(uniqueCharacters);

Loop over the text data and create a sequence of vectors representing the characters of each observation and a categorical sequence of characters for the responses. To denote the end of each observation, include the end of text character.

numDocuments = numel(textData);
XTrain = cell(1,numDocuments);
YTrain = cell(1,numDocuments);
for i = 1:numDocuments
    characters = textData{i};
    sequenceLength = numel(characters);
    
    % Get indices of characters.
    [~,idx] = ismember(characters,uniqueCharacters);
    
    % Convert characters to vectors.
    X = zeros(numUniqueCharacters,sequenceLength);
    for j = 1:sequenceLength
        X(idx(j),j) = 1;
    end
    
    % Create vector of categorical responses with end of text character.
    charactersShifted = [cellstr(characters(2:end)')' endOfTextCharacter];
    Y = categorical(charactersShifted);
    XTrain{i} = X;
    YTrain{i} = Y;
end

View the first observation and the size of the corresponding sequence. The sequence is a D-by-S matrix, where D is the number of features (the number of unique characters) and S is the sequence length (the number of characters in the text).

textData{1}
ans = 
'□From·fairest·creatures·we·desire·increase,¶That·thereby·beauty's·rose·might·never·die,¶But·as·the·riper·should·by·time·decease,¶His·tender·heir·might·bear·his·memory:¶But·thou,·contracted·to·thine·own·bright·eyes,¶Feed'st·thy·light's·flame·with·self-substantial·fuel,¶Making·a·famine·where·abundance·lies,¶Thy·self·thy·foe,·to·thy·sweet·self·too·cruel:¶Thou·that·art·now·the·world's·fresh·ornament,¶And·only·herald·to·the·gaudy·spring,¶Within·thine·own·bud·buriest·thy·content,¶And·tender·churl·mak'st·waste·in·niggarding:¶Pity·the·world,·or·else·this·glutton·be,¶To·eat·the·world's·due,·by·the·grave·and·thee.'
size(XTrain{1})
ans = 1×2

    62   611

View the corresponding response sequence. The sequence is a 1-by-S categorical vector of responses.

YTrain{1}
ans = 1×611 categorical
     F      r      o      m      ·      f      a      i      r      e      s      t      ·      c      r      e      a      t      u      r      e      s      ·      w      e      ·      d      e      s      i      r      e      ·      i      n      c      r      e      a      s      e      ,      ¶      T      h      a      t      ·      t      h      e      r      e      b      y      ·      b      e      a      u      t      y      '      s      ·      r      o      s      e      ·      m      i      g      h      t      ·      n      e      v      e      r      ·      d      i      e      ,      ¶      B      u      t      ·      a      s      ·      t      h      e      ·      r      i      p      e      r      ·      s      h      o      u      l      d      ·      b      y      ·      t      i      m      e      ·      d      e      c      e      a      s      e      ,      ¶      H      i      s      ·      t      e      n      d      e      r      ·      h      e      i      r      ·      m      i      g      h      t      ·      b      e      a      r      ·      h      i      s      ·      m      e      m      o      r      y      :      ¶      B      u      t      ·      t      h      o      u      ,      ·      c      o      n      t      r      a      c      t      e      d      ·      t      o      ·      t      h      i      n      e      ·      o      w      n      ·      b      r      i      g      h      t      ·      e      y      e      s      ,      ¶      F      e      e      d      '      s      t      ·      t      h      y      ·      l      i      g      h      t      '      s      ·      f      l      a      m      e      ·      w      i      t      h      ·      s      e      l      f      -      s      u      b      s      t      a      n      t      i      a      l      ·      f      u      e      l      ,      ¶      M      a      k      i      n      g      ·      a      ·      f      a      m      i      n      e      ·      w      h      e      r      e      ·      a      b      u      n      d      a      n      c      e      ·      l      i      e      s      ,      ¶      T      h      y      ·      s      e      l      f      ·      t      h      y      ·      f      o      e      ,      ·      t      o      ·      t      h      y      ·      s      w      e      e      t      ·      s      e      l      f      ·      t      o      o      ·      c      r      u      e      l      :      ¶      T      h      o      u      ·      t      h      a      t      ·      a      r      t      ·      n      o      w      ·      t      h      e      ·      w      o      r      l      d      '      s      ·      f      r      e      s      h      ·      o      r      n      a      m      e      n      t      ,      ¶      A      n      d      ·      o      n      l      y      ·      h      e      r      a      l      d      ·      t      o      ·      t      h      e      ·      g      a      u      d      y      ·      s      p      r      i      n      g      ,      ¶      W      i      t      h      i      n      ·      t      h      i      n      e      ·      o      w      n      ·      b      u      d      ·      b      u      r      i      e      s      t      ·      t      h      y      ·      c      o      n      t      e      n      t      ,      ¶      A      n      d      ·      t      e      n      d      e      r      ·      c      h      u      r      l      ·      m      a      k      '      s      t      ·      w      a      s      t      e      ·
 i      n      ·      n      i      g      g      a      r      d      i      n      g      :      ¶      P      i      t      y      ·      t      h      e      ·      w      o      r      l      d      ,      ·      o      r      ·      e      l      s      e      ·      t      h      i      s      ·      g      l      u      t      t      o      n      ·      b      e      ,      ¶      T      o      ·      e      a      t      ·      t      h      e      ·      w      o      r      l      d      '      s      ·      d      u      e      ,      ·      b      y      ·      t      h      e      ·      g      r      a      v      e      ·      a      n      d      ·      t      h      e      e      .      ␃

Create and Train LSTM Network

Define the LSTM architecture. Specify a sequence-to-sequence LSTM classification network with 200 hidden units. Set the feature dimension of the training data (the number of unique characters) as the input size, and the number of categories in the responses as the output size of the fully connected layer.

inputSize = size(XTrain{1},1);
numHiddenUnits = 200;
numClasses = numel(categories([YTrain{:}]));

layers = [
    sequenceInputLayer(inputSize)
    lstmLayer(numHiddenUnits,'OutputMode','sequence')
    fullyConnectedLayer(numClasses)
    softmaxLayer];

Specify the training options. Choosing among the options requires empirical analysis. To explore different training option configurations by running experiments, you can use the Experiment Manager app.

  • Train using the Adam optimizer.

  • Specify the number of training epochs as 1000 and the initial learn rate as 0.01.

  • Set the learn rate schedule to 'piecewise', the learn rate drop factor to 0.05, and the learn rate drop period to 50, so that the software multiplies the learning rate by 0.05 every 50 epochs. Without the piecewise schedule, the drop factor and drop period have no effect.

  • Set the gradient threshold to 10.

  • The mini-batch size option specifies the number of observations to process in a single iteration. Specify a mini-batch size that evenly divides the data to ensure that the function uses all observations for training. Otherwise, the function ignores observations that do not complete a mini-batch. Set the mini-batch size to 77, which evenly divides the 154 training sequences into two mini-batches per epoch.

  • Specify to shuffle the data every epoch by setting the 'Shuffle' option to 'every-epoch'.

  • To monitor the training progress, set the 'Plots' option to 'training-progress'.

  • To suppress verbose output, set 'Verbose' to false.

  • Because the training data has sequences with rows and columns corresponding to channels and time steps, respectively, specify the input data format 'CTB' (channel, time, batch).

  • To monitor the classification accuracy during training, set the 'Metrics' option to 'accuracy'.

options = trainingOptions('adam', ...
    'MaxEpochs',1000, ...
    'InitialLearnRate',0.01, ...
    'LearnRateSchedule','piecewise', ...
    'LearnRateDropFactor',0.05, ...
    'LearnRateDropPeriod',50, ...
    'GradientThreshold',10, ...
    'MiniBatchSize',77, ...
    'Shuffle','every-epoch', ...
    'Plots','training-progress', ...
    'Verbose',false, ...
    'InputDataFormats','CTB', ...
    'Metrics','accuracy');
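
Optionally, before training, you can confirm that the mini-batch size evenly divides the number of training sequences (a quick sanity check, not part of the original workflow):

mbSize = 77;
assert(mod(numel(XTrain),mbSize) == 0, ...
    "Mini-batch size does not evenly divide the number of observations.")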

Train the neural network using the trainnet function. For classification, use cross-entropy loss. The model is a sequence-to-sequence classification network. For each time step of the input data, the target label is the next character of the sequence. That is, the model classifies each time step with the next character for that time step.

By default, the trainnet function uses a GPU if one is available. Training on a GPU requires a Parallel Computing Toolbox™ license and a supported GPU device. For information on supported devices, see GPU Computing Requirements. Otherwise, the trainnet function uses the CPU. To specify the execution environment, use the ExecutionEnvironment training option.
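For example, to force training on the CPU, you can add the 'ExecutionEnvironment' name-value argument to the trainingOptions call above (a minimal sketch showing only this option):

optionsCPU = trainingOptions('adam', ...
    'ExecutionEnvironment','cpu');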

[net, info] = trainnet(XTrain,YTrain,layers,"crossentropy",options);

Generate New Text

Use the generateText function, listed at the end of the example, to generate text using the trained network.

The generateText function generates text character by character, starting with the start of text character and reconstructs the text using the special characters. The function samples each character using the output prediction scores. The function stops predicting when the network predicts the end-of-text character or when the generated text is 500 characters long.

Generate text using the trained network.

vocabulary = string(categories([YTrain{:}]));
generatedText = generateText(net,vocabulary,uniqueCharacters,startOfTextCharacter,newlineCharacter,whitespaceCharacter,endOfTextCharacter)
generatedText = 
    "When and though det thee, kellit, thy frams awtor
     Tom that recoud is my lifer may pride.
     Which I betterpity nour so mons?
     Shol'd aving sowrast what thou doys did be;
     Tove,.F, thou for the roomemy graven's I sto then rake,
     Times it to I fell why of your love's greet;
     I con thee amoul loss thie powernatirs,
     Which fallers high, and I give and all,
     Unthorgast beath prowqerthites coptoni;
     Or, on theesp her beauty's rose hervers'd in,
     Now living black'd as yet do thit shor though:
     Olf whose daintity n"

Text Generation Function

The generateText function generates text character by character, starting with the start of text character and reconstructs the text using the special characters. The function samples each character using the output prediction scores. The function stops predicting when the network predicts the end-of-text character or when the generated text is 500 characters long.

function generatedText = generateText(net,vocabulary,uniqueCharacters,startOfTextCharacter,newlineCharacter,whitespaceCharacter,endOfTextCharacter)

Create the one-hot vector representing the start of text character by finding its index in the character vocabulary.

numUniqueCharacters = numel(uniqueCharacters);
X = zeros(numUniqueCharacters,1);
idx = strfind(uniqueCharacters,startOfTextCharacter);
X(idx) = 1;

Generate the text character by character with the trained LSTM network, using the predict and datasample functions. Stop predicting when the network predicts the end-of-text character or when the generated text is 500 characters long. The datasample function requires Statistics and Machine Learning Toolbox™.

For large collections of data, long sequences, or large networks, predictions on the GPU are usually faster to compute than predictions on the CPU. Otherwise, predictions on the CPU are usually faster to compute. For single time step predictions, use the CPU. To use the CPU for prediction, set the 'ExecutionEnvironment' option of predict to 'cpu'.

generatedText = "";
maxLength = 500;
while strlength(generatedText) < maxLength
    % Predict the next character scores.
    [characterScores,state] = predict(net,X,InputDataFormats='CTB');
    net.State = state;
    % Sample the next character.
    newCharacter = datasample(vocabulary,1,'Weights',characterScores);
    
    % Stop predicting at the end of text.
    if newCharacter == endOfTextCharacter
        break
    end
    
    % Add the character to the generated text.
    generatedText = generatedText + newCharacter;
    
    % Create a new vector for the next input.
    X(:) = 0;
    idx = strfind(uniqueCharacters,newCharacter);
    X(idx) = 1;
end

Reconstruct the generated text by replacing the special characters with their corresponding whitespace and newline characters.

generatedText = replace(generatedText,[newlineCharacter whitespaceCharacter],[newline " "]);

end
