How do I properly format text for use in deep learning text generation?
I'm currently following the 'Generate Text Using Deep Learning' example, but using a different piece of text.
I don't understand where this part of the code comes from:
I understand what it does, as I can see it in the text, but where does \x2403 come from? The reason I ask is that in my text, everywhere there is an apostrophe, whether in a word like can't or around a quotation, this symbol shows up: Ô ...
Later on, when I try to train, I get this error:
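One possible explanation for the stray Ô, offered as an assumption and not something the post confirms: the file contains curly quotes (‘ ’) that were saved in one encoding and read back in another. For instance, byte 212 is ‘ in Mac OS Roman but Ô in ISO-8859-1 and Windows-1252. A minimal sketch for inspecting the raw bytes and decoding them with an explicit encoding follows; the file name `mytext.txt` and the choice of `'UTF-8'` are placeholders, not values taken from the original script.

```matlab
filename = 'mytext.txt';   % hypothetical file name
enc = 'UTF-8';             % assumption: replace with the encoding the file was saved in

% Read the raw bytes so that no character decoding happens yet.
fid = fopen(filename,'r');
assert(fid ~= -1,'Could not open %s',filename);
bytes = fread(fid,[1 Inf],'*uint8');
fclose(fid);

% Bytes above 127 are where encodings disagree; list the distinct values.
nonAsciiBytes = unique(bytes(bytes > 127))

% Decode the bytes with an explicit encoding instead of the system default.
text = native2unicode(bytes,enc);

% Optionally map curly quotes (U+2018, U+2019, U+201C, U+201D) to plain ASCII quotes.
text = replace(text,{char(8216),char(8217)},'''');
text = replace(text,{char(8220),char(8221)},'"');
```

If `nonAsciiBytes` comes back empty, the file is plain ASCII and the symbol is coming from somewhere else in the pipeline.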
Error using trainNetwork (line 165)
Invalid training data. Labels must not contain undefined values.
Error in txtgen (line 73)
net = trainNetwork(XTrain,YTrain,layers,options);
I'm not sure if this is related, but either way I don't think the Ô should be there.
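`trainNetwork` reports this error when a categorical label array contains `<undefined>` entries. One way such entries can arise, stated here as an assumption about the script: `categorical` converts empty text to `<undefined>`, and `cellstr` strips trailing whitespace, so a lone space passed through `cellstr` becomes empty and then undefined. The sketch below builds a tiny stand-in for `YTrain` to show the effect and how to locate it; in the real script `YTrain` would come from the preprocessing loop, and the variable names are illustrative.

```matlab
% Stand-in data: one sequence whose middle character is a space.
chars = 'a b';
YTrain = {categorical(cellstr(chars')')};   % the space turns into <undefined>

% Find the sequences that contain at least one undefined label.
hasUndefined = cellfun(@(y) any(isundefined(y)),YTrain);
badSequences = find(hasUndefined)

% One possible remedy: swap whitespace for a visible character
% (here U+00B7, middle dot) before building the labels.
charsFixed = replace(chars,' ',char(183));
YFixed = categorical(cellstr(charsFixed')');
anyUndefinedAfterFix = any(isundefined(YFixed))
```

Running the first check on the real `YTrain` would show whether the error is tied to whitespace or to the unexpected characters.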
Answers (1)
Harshit Jain
2019-3-29
Values of the form \x0002 are Unicode code points, written in hexadecimal, for the corresponding characters. You can read more about Unicode characters here
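To make this concrete: `compose` interprets `\x` followed by hexadecimal digits as a Unicode code point, so `"\x2403"` is the single character U+2403 (SYMBOL FOR END OF TEXT, ␃). A short sketch, with illustrative variable names:

```matlab
% Build the character from its escape sequence.
endOfText = compose("\x2403");          % 1-by-1 string containing U+2403

% Recover the decimal code point; hexadecimal 2403 is decimal 9219.
codePoint = double(char(endOfText))

% The same character built directly from the hexadecimal value.
sameChar = string(char(hex2dec('2403')));
isSame = isequal(endOfText,sameChar)

% The same trick identifies an unexpected character in your own text.
unexpectedCode = double('Ô')
```

Looking up the code point of an unexpected symbol this way is a quick first step toward working out which encoding produced it.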