How to make this long if-else chain compact?
I am trying to build a second-order Markov chain text generator. As a first step, I wrote a zeroth-order Markov chain, but it has a long if-else chain. How can I make this compact? I don't know how to manage this once I move on to first- and second-order Markov chains. The code is given below:
%Zeroth order markov chain generator from a training data of a text file
close all;
clear ;
clc; % close all figure windows, clear variables, clear screen
%read text file and split it into characters in a cell
text = fileread('pp');
[~, text] = strsplit(text,'.','DelimiterType', ...
'RegularExpression','CollapseDelimiters',false);%split the string
%one big cell is returned, unwrap it and convert to lowercase
text = lower(text);
%delete white spaces (and keep a single space where continuous
%double spaces come)
ws = cellfun(@(x)any(x),isstrprop(text,'wspace'));
spIndex = strfind(ws, ones(1,2));
text(spIndex)=[];
%remove punctuation and any non-alpha characters
punc = cellfun(@(x)any(x),isstrprop(text,'punct'));
pIndex = strfind(punc, ones(1,1));
text(pIndex)=[];
%replace char(10) with space
text=strrep(text, char(10), ' ');
% for easy processing I am converting the alphabets to
% numbers (cell to number array) like a-z as 1-26 and space as 27.
seq=cell2mat(text);
seq=double(seq)-(double('a')-1);%alphabets a-z has ids 1-26
seq(seq == -64)=27;%space symbol has id 27
%find distribution of the sequence
xRange = 1:27; %# Range of integers to compute a probability for
N = hist(seq,xRange); %# Bin the data
dist=N./numel(seq);
cdist=cumsum(dist);
%generate sequence according to distribution
fileID = fopen('chain','w');
for k=1:numel(seq)
p=rand;
if ((p >= 0) && (p <= cdist(1)))
fprintf(fileID,'%s\n','1');
elseif ((p > cdist(1)) && (p <=cdist(2)))
fprintf(fileID,'%s\n','2');
elseif ((p > cdist(2)) && (p <=cdist(3)))
fprintf(fileID,'%s\n','3');
elseif ((p > cdist(3)) && (p <=cdist(4)))
fprintf(fileID,'%s\n','4');
elseif ((p > cdist(4)) && (p <=cdist(5)))
fprintf(fileID,'%s\n','5');
elseif ((p > cdist(5)) && (p <=cdist(6)))
fprintf(fileID,'%s\n','6');
elseif ((p > cdist(6)) && (p <=cdist(7)))
fprintf(fileID,'%s\n','7');
elseif ((p > cdist(7)) && (p <=cdist(8)))
fprintf(fileID,'%s\n','8');
elseif ((p > cdist(8)) && (p <=cdist(9)))
fprintf(fileID,'%s\n','9');
elseif ((p > cdist(9)) && (p <=cdist(10)))
fprintf(fileID,'%s\n','10');
elseif ((p > cdist(10)) && (p <=cdist(11)))
fprintf(fileID,'%s\n','11');
elseif ((p > cdist(11)) && (p <=cdist(12)))
fprintf(fileID,'%s\n','12');
elseif ((p > cdist(12)) && (p <=cdist(13)))
fprintf(fileID,'%s\n','13');
elseif ((p > cdist(13)) && (p <=cdist(14)))
fprintf(fileID,'%s\n','14');
elseif ((p > cdist(14)) && (p <=cdist(15)))
fprintf(fileID,'%s\n','15');
elseif ((p > cdist(15)) && (p <=cdist(16)))
fprintf(fileID,'%s\n','16');
elseif ((p > cdist(16)) && (p <=cdist(17)))
fprintf(fileID,'%s\n','17');
elseif ((p > cdist(17)) && (p <=cdist(18)))
fprintf(fileID,'%s\n','18');
elseif ((p > cdist(18)) && (p <=cdist(19)))
fprintf(fileID,'%s\n','19');
elseif ((p > cdist(19)) && (p <=cdist(20)))
fprintf(fileID,'%s\n','20');
elseif ((p > cdist(20)) && (p <=cdist(21)))
fprintf(fileID,'%s\n','21');
elseif ((p > cdist(21)) && (p <=cdist(22)))
fprintf(fileID,'%s\n','22');
elseif ((p > cdist(22)) && (p <=cdist(23)))
fprintf(fileID,'%s\n','23');
elseif ((p > cdist(23)) && (p <=cdist(24)))
fprintf(fileID,'%s\n','24');
elseif ((p > cdist(24)) && (p <=cdist(25)))
fprintf(fileID,'%s\n','25');
elseif ((p > cdist(25)) && (p <=cdist(26)))
fprintf(fileID,'%s\n','26');
elseif ((p > cdist(26)) && (p <=cdist(27)))
fprintf(fileID,'%s\n','27');
end
end
fclose(fileID);
fileID = fopen('chain','r');
gen_text = fscanf(fileID,'%f');%zeroth order markov generated text
fclose(fileID);
gen_text=gen_text+96;
gen_text(gen_text==(27+96))=32;
gen_text=gen_text';
gen_text=char(gen_text);
fileID = fopen('pp_zero_mc','w');
fprintf(fileID, '%s',gen_text);
fclose(fileID);
Otherwise I will have to look for a different method of generating the sequence. Please help.
Answers (1)
dpb
2014-11-8
There are several alternatives; probably the best is a table lookup:
state=floor(interp1(cdist,[1:nStates],p));
The optional second output of histc also gets you there:
doc histc % optional second output
Either of these can be vectorized and extended to higher dimensions as well.
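As a rough sketch of both suggestions, the snippet below draws symbol ids from a cumulative distribution in one vectorized step. The three-symbol `dist`, `nDraws`, and the variable names `edges`, `stateInterp`, and `stateHistc` are assumptions made for illustration, not part of the original post. Note that it deviates slightly from the one-liner above: a 0 is prepended to `cdist` and `ceil` is used instead of `floor`, because `interp1` returns NaN for `p` below `cdist(1)` and `floor` on the unpadded table would give an id one lower than the if-else chain does.

```matlab
% Sketch: draw symbol ids 1..nStates from a cumulative distribution
% without an if-else chain. dist is a made-up example, not the
% distribution from the question.
dist    = [0.5 0.2 0.3];     % hypothetical symbol probabilities
cdist   = cumsum(dist);
nStates = numel(dist);
nDraws  = 1000;              % assumed number of symbols to generate

rng(0);                      % fixed seed so the run is repeatable
p = rand(1, nDraws);         % all uniform draws at once

% Prepend 0 so that edges(k) < p <= edges(k+1) maps to id k.
edges = [0 cdist];
edges(end) = 1;              % guard against round-off in cumsum

% Table lookup. interp1 needs unique sample points, so this assumes
% every symbol has a nonzero probability.
stateInterp = ceil(interp1(edges, 0:nStates, p));

% Bin lookup: the second output of histc is the bin index of each p.
[~, stateHistc] = histc(p, edges);
```

With the 27-symbol alphabet from the question, something like `symbols = ['a':'z' ' ']; gen_text = symbols(stateHistc);` would map the ids straight back to characters, which avoids the round trip through the 'chain' file. In newer MATLAB releases, `discretize` or `histcounts` play the same role as `histc`.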
3 Comments
dpb
2014-11-10
As I noted, either can be extended to higher dimensions. Under
help interp1
one finds in the "See also" section
See also interp1q, interpft, ... interp2, interp3, interpn, ...
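For the first-order case, one possible way to carry the same idea over is sketched below. It does not use `interp2`; instead it swaps in a plainer technique: build a transition count matrix, take the cumulative sum along each row, and look up each draw in the row of the previous symbol. The short `seq`, `nGen`, and the names `T`, `P`, `C`, and `gen` are hypothetical and chosen only for this example.

```matlab
% Sketch: first-order chain sampled with a per-row cumulative lookup.
% seq is a made-up id sequence standing in for the question's seq.
seq     = [1 2 1 3 1 2 2 1 3 3 1 2 1 1 3 2];   % hypothetical training ids
nStates = 3;
nGen    = 20;                                   % assumed output length

% Count transitions: T(i,j) = number of times id j follows id i.
T = accumarray([seq(1:end-1).' seq(2:end).'], 1, [nStates nStates]);

% Normalize each row to probabilities, then accumulate along the row.
% Assumes every id is followed by something at least once in seq.
P = bsxfun(@rdivide, T, sum(T, 2));
C = cumsum(P, 2);
C(:, end) = 1;               % guard against round-off

rng(0);                      % fixed seed so the run is repeatable
gen = zeros(1, nGen);
gen(1) = seq(1);             % start from the first training symbol
for k = 2:nGen
    % first column of the current row whose cumulative value reaches p
    gen(k) = find(rand <= C(gen(k-1), :), 1, 'first');
end
```

A second-order chain could presumably reuse this by treating each pair of preceding ids as one row, for example with a row index like `(a-1)*nStates + b`, so that `C` has `nStates^2` rows.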