Removing double empty lines from a text file

If a file contains more than one consecutive empty line, they are replaced by a single empty line.
% reading file
fid = fopen(outFile,'rt');
Data = textscan(fid,'%s','Delimiter','\n');
Data = Data{1}; % get rid of nesting
k = 1; emptylines_occured = 0;
for j = 1:numel(Data)
    if ~strcmp(Data(j),'') % not empty line
        if emptylines_occured
            newData{k} = ''; k = k+1;
            emptylines_occured = 0;
        end
        newData(k) = Data(j); k = k+1;
    else % empty line
        emptylines_occured = 1;
    end
end
fclose(fid);
% writing file
fid = fopen(outFile,'wt');
for j = 1:numel(newData)
    fprintf(fid,'%s\n',newData{j});
end
fclose(fid);
Is there a more concise way?

Accepted Answer

Stephen23 2018-2-8
Edited: Stephen23 2018-2-9
You can easily write the new file at the same time as you read the old one, which is faster and uses much less memory. Here is a simple version that creates the new file with at most one empty line between any two non-empty lines:
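% open the input and output files, failing with the system message if either cannot be opened
[f1d,msg] = fopen('test_old.txt','rt');
assert(f1d>=3,msg)
[f2d,msg] = fopen('test_new.txt','wt');
assert(f2d>=3,msg)
prv = 'X'; % non-empty, so a leading empty line is kept
while ~feof(f1d)
    new = fgetl(f1d);
    % write the line unless both it and the previous line are empty
    if numel(new) || numel(prv)
        fprintf(f2d,'%s\n',new);
    end
    prv = new;
end
fclose(f1d);
fclose(f2d);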
The test files are attached. To drop leading empty lines as well, define prv as an empty char ('') instead of 'X'.
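The variant with an empty prv can be sketched as follows. This is a minimal illustration, not the code from the attached files: the file names in tempdir and the sample content are hypothetical, and the loop is restructured to test ischar on the result of fgetl, which returns -1 at the end of the file.

```matlab
% Hypothetical demo files; the input is created here so the sketch is self-contained.
inName  = fullfile(tempdir, 'demo_old.txt');
outName = fullfile(tempdir, 'demo_new.txt');
fid = fopen(inName, 'wt');
fprintf(fid, '\n\nfirst\n\n\n\nsecond\nthird\n');
fclose(fid);

[fin, msg] = fopen(inName, 'rt');
assert(fin >= 3, '%s', msg)
[fout, msg] = fopen(outName, 'wt');
assert(fout >= 3, '%s', msg)

prv = '';                 % empty char: leading empty lines are skipped too
new = fgetl(fin);
while ischar(new)         % fgetl returns -1 (numeric) at end of file
    % write the line unless both it and the previous line are empty
    if ~isempty(new) || ~isempty(prv)
        fprintf(fout, '%s\n', new);
    end
    prv = new;
    new = fgetl(fin);
end
fclose(fin);
fclose(fout);

result = fileread(outName);   % read back for inspection
```

With this sample input the two leading empty lines are dropped and the run of three empty lines between "first" and "second" shrinks to one.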
3 Comments
Stephen23 2018-2-8
@Walter Roberson: the original question uses the 't' option (text mode) for both reading and writing, so presumably this is not a problem.
bbb_bbb 2018-2-9
Edited: Stephen23 2018-2-9
This works excellently. Thanks.


More Answers (1)

Walter Roberson 2018-2-8
Edited: Walter Roberson 2018-2-8
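% read the file _and_ do the work of deleting extra empty lines
% note: this pattern replaces each run of newlines with a single newline, so it
% removes empty lines entirely; the comment below gives a corrected pattern
new_text = regexprep( fileread(outFile), '(\r?\n)(\r?\n)+', '$1');
% write the result to a new file
fid = fopen('text_new.txt', 'w');
fwrite(fid, new_text);
fclose(fid);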
3 Comments
Walter Roberson 2018-2-8
new_text = regexprep( fileread(outFile), '(\r?\n\r?\n)(\r?\n)+', '$1');
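The difference between the two patterns can be checked on a short string. This is only an illustrative sketch; the sample text and the variable names collapsed and squeezed are made up for the comparison.

```matlab
% Sample text: one run of three empty lines and one single empty line.
txt = sprintf('a\n\n\n\nb\n\nc\n');

% Pattern from the answer: every run of two or more newlines collapses to
% one newline, so empty lines disappear entirely.
collapsed = regexprep(txt, '(\r?\n)(\r?\n)+', '$1');

% Pattern from this comment: the first two newlines of a run are kept, so
% each run of empty lines shrinks to a single empty line.
squeezed = regexprep(txt, '(\r?\n\r?\n)(\r?\n)+', '$1');

% Compare against the expected strings.
collapsedOk = isequal(collapsed, sprintf('a\nb\nc\n'));
squeezedOk  = isequal(squeezed,  sprintf('a\n\nb\n\nc\n'));
```

The second pattern needs at least three consecutive newlines to match, which is why a single empty line is left untouched.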
bbb_bbb 2018-2-8
Edited: bbb_bbb 2018-2-8
There is still a problem with non-English characters: they are turned into 0xFF.
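A possible explanation, offered as an assumption because the thread does not confirm it: fwrite without a precision argument writes uint8 values, so characters with code points above 255 saturate to 255 (0xFF). The sketch below opens the files with an explicit encoding and writes with the 'char' precision instead. UTF-8, the file names in tempdir, and the sample text are all assumptions; the actual encoding of the original file is not stated.

```matlab
% Hypothetical demo files and sample text (Cyrillic word built from code points).
inName  = fullfile(tempdir, 'demo_utf8_old.txt');
outName = fullfile(tempdir, 'demo_utf8_new.txt');
word    = char([1087 1088 1080 1074 1077 1090]);
sample  = sprintf('%s\n\n\n\n%s\n', word, 'abc');

% Create the input file as UTF-8 (assumed encoding).
fid = fopen(inName, 'w', 'n', 'UTF-8');
fwrite(fid, sample, 'char');
fclose(fid);

% Read the whole file as characters, decoding with the stated encoding.
fid = fopen(inName, 'r', 'n', 'UTF-8');
txt = fread(fid, [1 Inf], '*char');
fclose(fid);

% Shrink each run of empty lines to a single empty line.
new_text = regexprep(txt, '(\r?\n\r?\n)(\r?\n)+', '$1');

% Write characters, not uint8 values, using the same encoding.
fid = fopen(outName, 'w', 'n', 'UTF-8');
fwrite(fid, new_text, 'char');
fclose(fid);

% Read the result back for inspection.
fid = fopen(outName, 'r', 'n', 'UTF-8');
check = fread(fid, [1 Inf], '*char');
fclose(fid);
```

If the source file uses a different encoding, the name passed to fopen would need to change accordingly.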
