How to glean the files I want from so many data files?
Dear MATLAB Experts,
I have had this problem for a couple of days and could not think of a way to resolve it. I would appreciate any help to find my way out!
I have about 100000 files whose names are in the following format:
sprintf("%dfile%d.csv", identifier 1, identifier 2)
where identifier 1 takes the values 1, 2, 3, 4, 5, ..., 50 (in order) and identifier 2 can be any number. Some examples are:
1file1234.csv
1file2003.csv
1file11111111.csv
2file6667.csv
2file99999.csv
3file1.csv
3file10000.csv
.
.
.
50file3456.csv
50file123456.csv
The files I am interested in are the ones that meet this condition:
for each identifier 1 (from 1 to 50), only the file with the smallest identifier 2 is of interest. All others should be deleted. In the above examples, we need
1file1234.csv
2file6667.csv
3file1.csv
.
.
.
50file3456.csv
How can I write MATLAB code that does this for me?
Thank you so much in advance.
Accepted Answer
Image Analyst
2017-1-30
Use dir() on the folder to get the filenames. Then use sscanf() to extract the two numbers. Easy, but let us know if you can't do it.
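A minimal sketch of the parsing step, assuming every name follows the %dfile%d.csv pattern from the question. The names list is a hard-coded stand-in for what dir() would return, and the variable names (names, parsed) are illustrative only.

```matlab
% Stand-in for {dinfo.name} from dinfo = dir('*file*.csv'); the names are illustrative.
names = {'1file1234.csv', '50file3456.csv'};
parsed = zeros(numel(names), 2);
for K = 1:numel(names)
    % With only %d conversions, sscanf returns a numeric column vector
    % [identifier 1; identifier 2] when the whole pattern matches.
    ids = sscanf(names{K}, '%dfile%d.csv');
    if numel(ids) == 2
        parsed(K, :) = ids.';
        fprintf('%s: identifier 1 = %d, identifier 2 = %d\n', names{K}, ids(1), ids(2));
    else
        fprintf('%s does not match the expected pattern\n', names{K});
    end
end
```

Checking numel(ids) before using the result guards against names that match the dir() wildcard but not the numeric pattern.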
5 Comments
Walter Roberson
2017-1-31
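% List candidate files in the current folder.
dinfo = dir('*file*.csv');
filelist = {dinfo.name};
bestids = inf(1,50);       % smallest identifier 2 seen so far for each identifier 1
bestnames = cell(1,50);    % name of the file that holds that smallest identifier 2
for K = 1 : length(filelist)
    this_file = filelist{K};
    ids = sscanf(this_file, '%dfile%d.csv');   % ids(1) = identifier 1, ids(2) = identifier 2
    if numel(ids) ~= 2 || ids(1) < 1 || ids(1) > 50
        continue;   % name does not fit the expected pattern; leave the file alone
    end
    oldbest = bestids(ids(1));
    if ids(2) >= oldbest
        delete(this_file);             % not better than the best seen so far
    else
        oldbestname = bestnames{ids(1)};
        if ~isempty(oldbestname)
            delete(oldbestname);       % the previous best has been beaten
        end
        bestids(ids(1)) = ids(2);
        bestnames{ids(1)} = this_file;
    end
end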
At each point, if the id2 is not the best we have seen for this id1 then delete the current file; if it is better than our previous best then delete the previous best and keep the one we just encountered.
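Because delete() cannot be undone, it may be worth listing what would be kept and deleted before running the loop for real. The following is only a dry-run sketch, using a different technique (sorting instead of a running best) and assuming every name matches the pattern. The file list is hard-coded from the question's examples, and the names keep, toKeep and toDelete are illustrative.

```matlab
% Names taken from the question's examples; in practice they would come from
% dinfo = dir('*file*.csv'); filelist = {dinfo.name};
filelist = {'1file1234.csv', '1file2003.csv', '1file11111111.csv', ...
            '2file6667.csv', '2file99999.csv', '3file1.csv', '3file10000.csv'};

% Parse both identifiers into one row per file (assumes every name matches).
ids = zeros(numel(filelist), 2);
for K = 1:numel(filelist)
    ids(K, :) = sscanf(filelist{K}, '%dfile%d.csv').';
end

% Sort by identifier 1, then identifier 2; the first row of each
% identifier 1 group then has the smallest identifier 2.
[sortedIds, order] = sortrows(ids, [1 2]);
isFirst = [true; diff(sortedIds(:, 1)) ~= 0];

keep = false(numel(filelist), 1);
keep(order(isFirst)) = true;

toKeep = filelist(keep);
toDelete = filelist(~keep);
fprintf('keep:   %s\n', toKeep{:});
fprintf('delete: %s\n', toDelete{:});
```

Nothing is deleted here. Once the printed lists look right, the names in toDelete could be passed to delete() one at a time.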