Find the filename with the biggest number?
Hi, so I have a folder with a bunch of files that come in each day. They look like this...
20130721_SPLBRENT3_140554.mat
20130721_SPLBRENT3_160554.mat
20130721_SPLBRENT3_180554.mat
20130722_SPLBRENT3_075651.mat
20130722_SPLBRENT3_095651.mat
20130723_SPLBRENT3_075949.mat
20130723_SPLBRENT3_102025.mat
So, for example, 20130722_SPLBRENT3_095651.mat is from 7/22/2013 and the data in the file was gathered at 9:56am. I am trying to write code that finds the latest data (20130723_SPLBRENT3_102025.mat), NOT the last file uploaded (because all the files are uploaded at once, and one from the 21st may come in before one from the 23rd). How do I search for the file with the latest date and time in the file name?
3 comments
Jan
2013-7-24
+1: This is a nice example to demonstrate different techniques to improve code. Sorry, Jacqueline, I know that this was not your intention. But at least a fast, faster and fastest solution is still a solution :-)
Answers (3)
Jan
2013-7-24
Edited: Jan
2013-7-24
Some simplifications to Azzi's code:
s = {'20130721_SPLBRENT3_140554.mat'; ...
'20130721_SPLBRENT3_160554.mat'; ...
'20130721_SPLBRENT3_180554.mat'; ...
'20130722_SPLBRENT3_075651.mat'; ...
'20130722_SPLBRENT3_095651.mat'; ...
'20130723_SPLBRENT3_075949.mat'; ...
'20130723_SPLBRENT3_102025.mat'}
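% REGEXP accepts the cell array directly; each name is split at '_' or '.'
a = regexp(s, '_|\.', 'split');
b = cat(1, a{:});   % n-by-4 cell: date, tag, time, extension
% The time-only DATENUM adds a constant offset (1 January of the current year),
% so max_date is not a valid date by itself, but the ordering and idx are unaffected.
date = datenum(b(:,1), 'yyyymmdd') + datenum(b(:,3), 'HHMMSS');
[max_date,idx] = max(date);
latest_file = s{idx};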
When a function such as REGEXP or DATENUM operates on cell arrays directly, wrapping it in CELLFUN is much slower, especially in combination with anonymous functions. When s contains 10,000 distinct strings, omitting CELLFUN reduces the runtime from 6.7 seconds to 0.16 seconds (R2009a/64/Win7). In addition, the leaner code is less prone to typos and easier to understand and debug.
Of course the runtime most likely does not matter here, because the number of files is probably small. But it could be useful for other problems where equivalent solutions are applied.
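A rough way to reproduce this kind of comparison is sketched below. The file names are synthetic and generated only for timing (an assumption, not the original test data), and the absolute numbers will differ between MATLAB releases and machines.

```matlab
% Build n distinct names in the same format (synthetic, for timing only).
n = 10000;
s = cell(n, 1);
for k = 1:n
    % Hypothetical names: fixed date, time counted up in seconds.
    s{k} = sprintf('20130721_SPLBRENT3_%02d%02d%02d.mat', ...
                   floor(k / 3600), floor(mod(k, 3600) / 60), mod(k, 60));
end

% Variant 1: CELLFUN with anonymous functions.
tic;
a1 = cellfun(@(x) regexp(x, '_|\.', 'split'), s, 'UniformOutput', false);
d1 = cellfun(@(x) datenum([x{1} ' ' x{3}], 'yyyymmdd HHMMSS'), a1);
t1 = toc;

% Variant 2: pass the cell arrays to REGEXP and DATENUM directly.
tic;
a2 = regexp(s, '_|\.', 'split');
b2 = cat(1, a2{:});
d2 = datenum(strcat(b2(:, 1), {' '}, b2(:, 3)), 'yyyymmdd HHMMSS');
t2 = toc;

% Both variants should pick the same file.
[m1, i1] = max(d1);
[m2, i2] = max(d2);
fprintf('cellfun: %.2f s, direct: %.2f s, same result: %d\n', t1, t2, i1 == i2);
```

Variant 2 joins the date and time into one string before calling DATENUM, so its result is a proper serial date number.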
By the way, this reduces the runtime by a further 50%:
c = CStrCatStr(b(:, 1), 'T', b(:, 3));
date = DateStr2Num(c, 30);
See FEX: CStrCatStr and FEX: DateStr2Num. But be aware that downloading and compiling them would take much more time than you could ever save on such small problems. They can be useful, though, when working with millions of files or with 1000 files in real time.
One last thought about efficient programs: I have shown different methods to perform the same operations faster. But exploiting the fact that the chronological order equals the alphabetical order is again 4 times faster than the C-Mex monsters. Recognizing such useful patterns in the data is usually much more important than multiple cores, Gigas (Hz or Bytes) or sophisticated vectorization. So the person who decided to use these nice names had already solved the problem in the most efficient way.
3 comments
Cedric
2013-7-25
Edited: Cedric
2013-7-25
It splits file names using either '_' or '.' as a separator:
>> s = regexp('20130722_SPLBRENT3_075651.mat', '_|\.', 'split')
s =
'20130722' 'SPLBRENT3' '075651' 'mat'
The pipe | means "or", and the . has to be escaped with a backslash because it has a special meaning in regular expressions ('\.' matches the literal dot character, whereas '.' is a wildcard for any character).
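To see why the backslash matters, here is a small sketch that compares the escaped and the unescaped pattern on one of the file names from the question. The variable names are only illustrative.

```matlab
name = '20130722_SPLBRENT3_075651.mat';

% Escaped dot: split at '_' or at a literal '.'.
escaped = regexp(name, '_|\.', 'split');

% Unescaped dot: '.' matches any character, so every character acts as a
% separator and only empty strings are left between the matches.
unescaped = regexp(name, '_|.', 'split');

disp(escaped);
fprintf('unescaped parts are all empty: %d\n', all(cellfun('isempty', unescaped)));
```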
Azzi Abdelmalek
2013-7-24
Edited: Azzi Abdelmalek
2013-7-24
s={'20130721_SPLBRENT3_140554.mat'
'20130721_SPLBRENT3_160554.mat'
'20130721_SPLBRENT3_180554.mat'
'20130722_SPLBRENT3_075651.mat'
'20130722_SPLBRENT3_095651.mat'
'20130723_SPLBRENT3_075949.mat'
'20130723_SPLBRENT3_102025.mat'}
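% Split each name at '_' or '.' into {date, tag, time, extension}
a=cellfun(@(x) regexp(x,'_|\.','split'),s,'UniformOutput',false)
% Convert the date and time parts of each name into a serial date number
date=cell2mat(cellfun(@(x) datenum([x{1} ' ' x{3}],'yyyymmdd HHMMSS'),a,'UniformOutput',false))
[max_date,idx]=max(date)
latest_file=s{idx} % The latest file
latest_date=datestr(max_date,'dd-mm-yyyy HH:MM:SS')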
Jan
2013-7-24
Congratulations! If the format of the names is "20130723_SPLBRENT3_102025", the alphabetical order equals the temporal order. Then this is sufficient:
% FolderName is the path of the folder that contains the files
list = dir(fullfile(FolderName, '*.mat'));
name = {list.name};
sorted = sort(name);
latest = sorted{end};
In all cases I have seen so far, the output of dir is already sorted alphabetically. But as long as this is not documented, I'd rely on explicit sorting.
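The plain sort relies on every .mat file in the folder following the naming scheme. If that is not guaranteed, one possible variation is to keep only the names that match the pattern and to sort by a key built from the date and time parts. This is a sketch under that assumption: the name 'readme.mat' is a hypothetical stray file, and the pattern is inferred from the examples in the question.

```matlab
% Names in arbitrary order, plus one hypothetical file that does not follow the scheme.
names = {'20130723_SPLBRENT3_075949.mat', '20130721_SPLBRENT3_180554.mat', ...
         'readme.mat', '20130723_SPLBRENT3_102025.mat', ...
         '20130722_SPLBRENT3_095651.mat'};

% Capture the 8-digit date and the 6-digit time; non-matching names give an empty result.
tok = regexp(names, '^(\d{8})_.*_(\d{6})\.mat$', 'tokens', 'once');
valid = ~cellfun('isempty', tok);

% Sort key 'yyyymmddHHMMSS': its alphabetical order equals the chronological order.
keys = cellfun(@(t) [t{1} t{2}], tok(valid), 'UniformOutput', false);
[sortedKeys, order] = sort(keys);

validNames = names(valid);
latest = validNames{order(end)};
disp(latest);   % 20130723_SPLBRENT3_102025.mat
```

With the plain sort, 'readme.mat' would end up last here, because letters sort after digits.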