How to make this long if-else chain compact?

I am trying to build a second-order Markov chain text generator. As a first step, I wrote a zeroth-order Markov chain, but it has a long if-else chain. How can I make it more compact? I don't know how to manage this once I move on to first- and second-order Markov chains. The code is given below:
%Zeroth order markov chain generator from a training data of a text file
close all;
clear ;
clc; % close all figure windows, clear variables, clear screen
%read text file and split it into characters in a cell
text = fileread('pp');
[~, text] = strsplit(text,'.','DelimiterType', ...
'RegularExpression','CollapseDelimiters',false);%split the string
%one big cell is returned, unwrap it and convert to lowercase
text = lower(text);
%delete white spaces (and keep a single space where continuous
%double spaces come)
ws = cellfun(@(x)any(x),isstrprop(text,'wspace'));
spIndex = strfind(ws, ones(1,2));
text(spIndex)=[];
%remove punctuation and any non-alpha characters
punc = cellfun(@(x)any(x),isstrprop(text,'punct'));
pIndex = strfind(punc, ones(1,1));
text(pIndex)=[];
%replace char(10) with space
text=strrep(text, char(10), ' ');
% for easy processing I am converting the alphabets to
% numbers (cell to number array) like a-z as 1-26 and space as 27.
seq=cell2mat(text);
seq=double(seq)-(double('a')-1);%alphabets a-z has ids 1-26
seq(seq == -64)=27;%space symbol has id 27
%find distribution of the sequence
xRange = 1:27; %# Range of integers to compute a probability for
N = hist(seq,xRange); %# Bin the data
dist=N./numel(seq);
cdist=cumsum(dist);
%generate sequence according to distribution
fileID = fopen('chain','w');
for k=1:numel(seq)
p=rand;
if ((p >= 0) && (p <= cdist(1)))
fprintf(fileID,'%s\n','1');
elseif ((p > cdist(1)) && (p <=cdist(2)))
fprintf(fileID,'%s\n','2');
elseif ((p > cdist(2)) && (p <=cdist(3)))
fprintf(fileID,'%s\n','3');
elseif ((p > cdist(3)) && (p <=cdist(4)))
fprintf(fileID,'%s\n','4');
elseif ((p > cdist(4)) && (p <=cdist(5)))
fprintf(fileID,'%s\n','5');
elseif ((p > cdist(5)) && (p <=cdist(6)))
fprintf(fileID,'%s\n','6');
elseif ((p > cdist(6)) && (p <=cdist(7)))
fprintf(fileID,'%s\n','7');
elseif ((p > cdist(7)) && (p <=cdist(8)))
fprintf(fileID,'%s\n','8');
elseif ((p > cdist(8)) && (p <=cdist(9)))
fprintf(fileID,'%s\n','9');
elseif ((p > cdist(9)) && (p <=cdist(10)))
fprintf(fileID,'%s\n','10');
elseif ((p > cdist(10)) && (p <=cdist(11)))
fprintf(fileID,'%s\n','11');
elseif ((p > cdist(11)) && (p <=cdist(12)))
fprintf(fileID,'%s\n','12');
elseif ((p > cdist(12)) && (p <=cdist(13)))
fprintf(fileID,'%s\n','13');
elseif ((p > cdist(13)) && (p <=cdist(14)))
fprintf(fileID,'%s\n','14');
elseif ((p > cdist(14)) && (p <=cdist(15)))
fprintf(fileID,'%s\n','15');
elseif ((p > cdist(15)) && (p <=cdist(16)))
fprintf(fileID,'%s\n','16');
elseif ((p > cdist(16)) && (p <=cdist(17)))
fprintf(fileID,'%s\n','17');
elseif ((p > cdist(17)) && (p <=cdist(18)))
fprintf(fileID,'%s\n','18');
elseif ((p > cdist(18)) && (p <=cdist(19)))
fprintf(fileID,'%s\n','19');
elseif ((p > cdist(19)) && (p <=cdist(20)))
fprintf(fileID,'%s\n','20');
elseif ((p > cdist(20)) && (p <=cdist(21)))
fprintf(fileID,'%s\n','21');
elseif ((p > cdist(21)) && (p <=cdist(22)))
fprintf(fileID,'%s\n','22');
elseif ((p > cdist(22)) && (p <=cdist(23)))
fprintf(fileID,'%s\n','23');
elseif ((p > cdist(23)) && (p <=cdist(24)))
fprintf(fileID,'%s\n','24');
elseif ((p > cdist(24)) && (p <=cdist(25)))
fprintf(fileID,'%s\n','25');
elseif ((p > cdist(25)) && (p <=cdist(26)))
fprintf(fileID,'%s\n','26');
elseif ((p > cdist(26)) && (p <=cdist(27)))
fprintf(fileID,'%s\n','27');
end
end
fclose(fileID);
fileID = fopen('chain','r');
gen_text = fscanf(fileID,'%f');%zeroth order markov generated text
fclose(fileID);
gen_text=gen_text+96;
gen_text(gen_text==(27+96))=32;
gen_text=gen_text';
gen_text=char(gen_text);
fileID = fopen('pp_zero_mc','w');
fprintf(fileID, '%s',gen_text);
fclose(fileID);
Otherwise I will have to look for a different method of generating the sequence. Please help.

Answers (1)

dpb 2014-11-8
There are several alternatives; probably the best is a table lookup:
state=floor(interp1(cdist,[1:nStates],p));
You can also get there with
doc histc % optional second output
Either of these can be vectorized and extended to higher dimensions as well.
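To make the lookup idea concrete, here is a minimal sketch of both suggestions applied to a toy three-state distribution. The values of dist and p, and the names edges, state2 and symbols, are assumptions made for illustration and do not come from the original post. Note that a lookup on cdist alone returns NaN from interp1 for any draw below cdist(1), so the sketch pads the edges with a leading 0 and pins the last edge to exactly 1.

```matlab
% Toy distribution standing in for dist/cdist from the question (assumed values).
dist    = [0.5 0.2 0.3];
cdist   = cumsum(dist);
nStates = numel(cdist);

% Bin edges: pad with 0 on the left and pin the last edge to exactly 1,
% so every draw in (0,1) lands in a bin despite round-off in cumsum.
edges = [0 cdist(1:end-1) 1];

% Fixed probe values instead of rand so the result can be checked by hand.
p = [0.1 0.49 0.6 0.69 0.95];

% The second output of histc is the bin index of each element of p.
[~, state] = histc(p, edges);

% Same lookup via interp1: the edges map to 1..nStates+1 and floor picks the bin.
% interp1 needs distinct sample points, so this variant fails if a state has
% zero probability (repeated values in edges).
state2 = floor(interp1(edges, 1:nStates+1, p));

% Map state ids straight back to characters, with no intermediate file.
symbols  = ['a':'z' ' '];
gen_text = symbols(state);
```

In the question's script, p would become rand(1, numel(seq)) and the whole for loop plus the write-then-read of the 'chain' file collapses into the histc call and the symbols lookup. One small difference from the if-else chain: a draw exactly equal to an edge goes to the upper bin here, which has probability zero for continuous draws.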
3 comments
dpb 2014-11-10
As I noted, either can be extended to higher dimensions. Under
help interp1
one finds in the "See also" section
See also interp1q, interpft, ... interp2, interp3, interpn, ...
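As one possible way to go to first order, the sketch below does not use the interpn family mentioned above. Instead it swaps in a plain table approach: count transitions into a matrix, take a cumulative sum along each row, and let the current state select the row to search. The toy seq, and the names T, P, cP, nGen and gen, are assumptions for illustration. It also assumes every symbol is followed by at least one other symbol in the training data, since an all-zero row would produce NaN after normalization.

```matlab
% Toy symbol sequence standing in for seq from the question (assumed values).
seq     = [1 2 1 2 3 1 2 1 3 1];
nStates = 3;

% Count transitions: T(i,j) = number of times symbol j follows symbol i.
T = accumarray([seq(1:end-1)' seq(2:end)'], 1, [nStates nStates]);

% Normalize each row to probabilities, then accumulate along the rows.
P  = bsxfun(@rdivide, T, sum(T, 2));
cP = cumsum(P, 2);

% Generate: the current state selects which row of cP to search.
nGen   = 20;
gen    = zeros(1, nGen);
gen(1) = seq(1);
for k = 2:nGen
    % Count how many cumulative edges the draw exceeds; clamp against round-off.
    gen(k) = min(sum(rand > cP(gen(k-1), :)) + 1, nStates);
end
```

A second-order chain could follow the same pattern with a three-dimensional count array indexed by the previous two symbols, with the cumulative sum taken along the third dimension.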
