Use terminal to speed up file removal
Hi all, I've got a large number of CSVs, one generated each time a system changes state. Each CSV starts as a single row (a [1x3] array), and any data is added as a new row. I've written a simple loop that checks for "empty" CSVs (those containing only the single row) and removes them. However, this takes many (>10) minutes to complete, and I want to try the same in the terminal. The code is as follows:
CSV_Filenames_STRUCT = dir(sprintf('%s/*.csv',ResultDirectory));
CSV_Filenames_CELL = {CSV_Filenames_STRUCT.name};
StartingNumberOfFiles = size(CSV_Filenames_CELL,2);
for NthFile = 1:StartingNumberOfFiles
NumberOfPeaks = size(textread(sprintf('%s/%s',ResultDirectory,CSV_Filenames_CELL{1,NthFile}),'%s'),1) - 1; % Number of rows less one for the 'x,y,value'
if ~NumberOfPeaks % Essentially empty
delete(sprintf('%s/%s',ResultDirectory,CSV_Filenames_CELL{1,NthFile}));
end
end
I've not used the terminal much, and I'm wondering whether it would be faster for the above when there are many files to process, and how to code the single-line check. So far, I've got something like:
for f in *.csv;
do
L=`wc -l "$f" | awk '{print $1}'`
if test $L -eq 1
then
mv $f ./MT;
fi
done
which isn't quite working (there are spaces in the filenames, as shown below), but I'm out of my depth here, so I'm asking for help on how to use the "system"/"unix" options through MATLAB. I'm running OS X and Kubuntu Linux. I should also mention that the filenames have spaces in them, like: "Filter 0000001 Fwd,Alignment Black Screen - Ref_01 Input_19 (2017-10-17 @ 13.30.20.103).csv"
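One likely cause of the failure in the loop above is the unquoted `$f` in the `mv` call: the shell splits the filename at each space before `mv` sees it. A minimal sketch with every expansion quoted follows. The function name `move_single_line_csvs` is hypothetical, and it assumes the `MT` folder should live inside the directory being scanned.

```shell
#!/bin/sh
# Move every CSV in directory "$1" that has exactly one line into "$1/MT".
# move_single_line_csvs is a hypothetical helper name, not an existing tool.
move_single_line_csvs() {
    dir=$1
    mkdir -p "$dir/MT"                       # create the target folder if missing
    for f in "$dir"/*.csv; do
        [ -f "$f" ] || continue              # skip if the glob matched nothing
        # wc -l counts newline characters; reading from stdin keeps the
        # filename out of the output, and $(( )) strips any padding.
        lines=$(( $(wc -l < "$f") ))
        if [ "$lines" -eq 1 ]; then
            mv -- "$f" "$dir/MT/"            # quoted, so spaces survive intact
        fi
    done
}
```

It could be called as `move_single_line_csvs "/path/to/results"`. Note that a single-row file without a trailing newline is counted as 0 lines by `wc -l` and would be left in place.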
Answers (2)
Jan
2017-10-17
I'm not sure I understand your question correctly: you want to delete all files that contain only a single row, correct?
fullfile is a more robust way to build file names than sprintf(), because it inserts the file separator for you.
CSV_Filenames_STRUCT = dir(fullfile(ResultDirectory, '*.csv'));
CSV_Filenames_CELL = {CSV_Filenames_STRUCT.name};
StartingNumberOfFiles = numel(CSV_Filenames_CELL);
for NthFile = 1:StartingNumberOfFiles
File = fullfile(ResultDirectory, CSV_Filenames_CELL{NthFile});
fid = fopen(File, 'r');
if fid == -1, error('Cannot open file: %s', File); end
line1 = fgetl(fid);
line2 = fgetl(fid);
fclose(fid);
if ~ischar(line2)
delete(File);
end
end
Is this faster? It reads at most two lines from each file.
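The same "read at most two lines" idea can be sketched for the terminal. This is only an illustration under stated assumptions: the function names `has_second_line` and `remove_header_only_csvs` are hypothetical, and whether it beats the MATLAB loop would need to be measured, since it starts one `awk` process per file.

```shell
#!/bin/sh
# Print 1 if file "$1" has a second line, 0 otherwise.
# awk exits as soon as it reaches line 2, so the rest of the file is not read.
has_second_line() {
    awk 'NR == 2 { found = 1; exit } END { print found + 0 }' < "$1"
}

# Delete every CSV in directory "$1" that contains at most one line.
# Both function names are hypothetical helpers for this sketch.
remove_header_only_csvs() {
    for f in "$1"/*.csv; do
        [ -f "$f" ] || continue              # skip if the glob matched nothing
        if [ "$(has_second_line "$f")" -eq 0 ]; then
            rm -- "$f"                       # quoted, so spaces in names are safe
        fi
    done
}
```

Like the MATLAB loop above, this also removes completely empty files, and it treats a final row without a trailing newline as a line. Replacing `rm` with `echo` first gives a dry run that only lists what would be deleted.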
Stephen23
2017-10-17
Remove the textread and replace it with something like this (pseudocode):
fid = fopen(...,'rt');
fgetl(fid); % read first row
isEmpty = feof(fid); % true if nothing follows the first row
fclose(fid); % close the file before deleting it
if isEmpty
delete(...)
end
"I've removed sprintf's and replaced with concatenation strings "
I would recommend using fullfile: it actually makes the intention clearer.