Errors in looping/error checking

I have made a few changes since last time, but nothing seemed to work. The code I wrote does not process the text file when the file is selected. To see whether anything is processed, I used an fprintf statement as an indicator that shows the total number of words loaded, and the result was disappointing. Could someone point out which part does not work?
[FILENAME, pathname] = uigetfile('*.wsb','Read Matlab Code File');
if isequal(FILENAME,0) || isequal(pathname,0)
    fprintf('User pressed cancel\n');
else
    fprintf('User selected: %s \n', FILENAME);
end
fid = fopen(FILENAME,'r+');
if fid<0
    %error could not find the file
    return,
end
total_no_words=0;
lineNUM=1;
while ~feof(fid)
    tline = fgetl(fid);
    if ~isempty(tline)
        %line is empty, skip it
        total_no_words=total_no_words+1;
        if sum(isletter(tline))==length(tline)
            %line does not contain character besides letters
            %we finally have a string
            tline=strtrim(tline);
            if sum(isspace(tline))==0
                %tline contain no spaces and only contain letters
                if length(tline) > 3 && length(tline)<26
                    if strcmp(tline,lower(tline))==1 || strcmp(tline,upper(tline))==1
                        wordbank=struct;
                        letters= 'a':'z'; % a, b, c, ..., z
                        for ichar = 1:length(letters)
                            wordbank.(letters(ichar))=cell.empty;
                            wordbank.(tline(1)){end+1,1} = tline
                        end
                    end
                end
            end
        end
    end
    lineNUM=lineNUM+1;
end
fprintf('LOAD WORD BANK \n');
fprintf('Loading word bank: none....started\n');
fprintf('Loading word bank: %s\b\b\b\n',FILENAME);
fprintf('Successfully loaded %d words from the word bank file\n',total_no_words)
fprintf('Removing invalid words...%d words were successfully removed...\n') %not complete
fprintf('Removing duplicate words and sorting...done\n')
fprintf('Removed %d duplicate words\n')
fprintf('Searching for and removing any plural forms of words ending in S:%%\n')
fprintf('Removed %d plural word\n')
fprintf('Building word indices and calculating beginning letter counts...done\n')
fprintf('Calculating word length counts...done\n')
fprintf('Final word count: %d\n')

Accepted Answer

Your code here is counting the number of lines (fgetl), not the number of words.
tline = fgetl(fid);
if ~isempty(tline)
%line is empty, skip it
total_no_words=total_no_words+1;
To get the number of words, one option is to parse the line based on a whitespace delimiter.
word_array = get_tokens(tline,' ');
%returns a cell array with words separated by spaces
num_words = num_words + length(word_array);
%counts the number of words
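
As a rough illustration of the splitting idea above: get_tokens may be a user-defined helper, so this sketch swaps in the built-in regexp with the 'match' option instead. The sample line is made up; in the real loop tline would come from fgetl(fid).

```matlab
% Hypothetical sample line; in the real loop this would come from fgetl(fid).
tline = '  alpha beta   gamma ';

% regexp with '\S+' and 'match' returns a cell array with one entry per
% run of non-whitespace characters, so repeated spaces are not counted.
word_array = regexp(tline, '\S+', 'match');

num_words = 0;                                % running total across lines
num_words = num_words + numel(word_array);    % add this line's word count

fprintf('Words on this line: %d\n', numel(word_array));
```

Matching on runs of non-whitespace, rather than splitting on a single space, avoids counting empty tokens when a line has leading, trailing, or doubled spaces.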

5 Comments

Yesterday I originally wrote it like this, but that did not work either.
%%lines of codes
if strcmp(tline,lower(tline))==1 || strcmp(tline,upper(tline))==1
total_no_words=total_no_words+1;
Then I changed it to
tline = fgetl(fid);
if ~isempty(tline)
%line is empty, skip it
total_no_words=total_no_words+1;
just to see whether it actually counted the lines, which would mean the loop runs, but it does not even count the lines.
I'm confused. tline is a char array containing possibly multiple words and white spaces.
if strcmp(tline,lower(tline))==1 || strcmp(tline,upper(tline))==1
total_no_words=total_no_words+1;
This still only counts the number of lines. Plus, I don't understand why you would compare the string to the lowercase version of itself to see if it matches.
Additionally, ~isempty(tline) is not enough by itself, I think. According to the documentation (http://www.mathworks.com/help/techdoc/ref/fgetl.html), fgetl returns -1 when it reaches the end of the file, while a blank line comes back as an empty character array. So both cases need to be checked, and the better option would be something like -- if ischar(tline) && ~isempty(tline) --
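
For reference, a minimal sketch of how fgetl's return values can be told apart. As far as I know, fgetl returns the number -1 only at end of file, and a blank line comes back as an empty character array. The temporary file and its contents are made up for the demonstration.

```matlab
% Write a small temporary file (hypothetical contents) with a blank line.
tmpfile = [tempname '.txt'];
fid = fopen(tmpfile, 'w');
fprintf(fid, 'apple\n\nbanana\n');
fclose(fid);

fid = fopen(tmpfile, 'r');
num_lines = 0;      % non-blank lines seen
num_blank = 0;      % blank lines seen
tline = fgetl(fid);
while ischar(tline)                 % fgetl returns numeric -1 at end of file
    if isempty(strtrim(tline))      % blank line: empty char array, not -1
        num_blank = num_blank + 1;
    else
        num_lines = num_lines + 1;
    end
    tline = fgetl(fid);
end
fclose(fid);
delete(tmpfile);
```

Looping on ischar(tline) rather than ~feof(fid) means the end-of-file value never reaches the string tests inside the loop.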
Initially it was meant to be error checking, but I changed a few things to make it simpler, and unfortunately that did not help.
As for
This still only counts the number of lines. Plus, I don't understand why you would compare the string to the lowercase version of itself to see if it matches.
I know that it is counting the number of lines; I intended to use that to check what was wrong with my loop. The variable name I gave it was probably confusing, sorry. I am comparing the string to its lowercase version because of a requirement in the program.
I still don't know exactly what you are trying to do. Can you please be more clear on the goal of this project?
Also, I don't understand what isn't working correctly.


More Answers (1)

At the very least, change
for ichar = 1:length(letters)
wordbank.(letters(ichar))=cell.empty;
wordbank.(tline(1)){end+1,1} = tline
end
to
for ichar = 1:length(letters)
wordbank.(letters(ichar))=cell.empty;
end
wordbank.(tline(1)){end+1,1} = tline
Your code still won't be right, but that change might perhaps get you out of the mental groove you are stuck in.
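
One possible next step, sketched under assumptions: create wordbank and its 26 empty fields once, before any line is read, so the struct is not reset for every word. The sample words are made up, {} is used in place of cell.empty, and lowercasing the first letter is an added assumption so that all-uppercase words land in an existing field.

```matlab
% Create the 26 empty fields once, before any lines are read.
wordbank = struct;
letters = 'a':'z';
for ichar = 1:length(letters)
    wordbank.(letters(ichar)) = {};
end

% Hypothetical sample words standing in for lines read with fgetl.
sample_words = {'apple', 'avocado', 'BANANA'};
for k = 1:numel(sample_words)
    tline = sample_words{k};
    key = lower(tline(1));              % field names are lowercase letters
    wordbank.(key){end+1,1} = tline;    % append without resetting the struct
end
```

With the initialization outside the reading loop, each field keeps accumulating words instead of holding only the most recent one.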

2 Comments

Tried that before, but nothing seemed to work.
Instead of "nothing seemed to work", a description of the errors or problems that occur would be more helpful.
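
One way to get such a description, as a sketch only: ask fopen for its second output, which holds a system message when the open fails, and print it instead of returning silently. The path and file name below are hypothetical placeholders; with uigetfile they would be its two outputs, joined with fullfile so the file is found even when it is not in the current folder.

```matlab
% Hypothetical path and name; with uigetfile these would be its two outputs.
pathname = tempdir;
FILENAME = 'no_such_file_for_demo.wsb';

% Build the full path; a bare file name is resolved relative to the
% current folder (and, in MATLAB, the search path).
fullname = fullfile(pathname, FILENAME);

[fid, msg] = fopen(fullname, 'r');
if fid < 0
    % Report why the open failed instead of returning silently.
    fprintf('Could not open %s: %s\n', fullname, msg);
else
    fclose(fid);
end
```

If a message like this appears, the reading loop was never reached, which would explain a word count that stays at zero.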
