Count number of unique .mat files in a folder

Hello,
I have a set of folders and subfolders that I wish to analyze. I used the following code to get the info and names of the .mat files in a specific subfolder:
function [dirinfo, folders, matFilesInfo, matFilesNames] = getFolderNames( rootPath )
    dirinfo = dir(rootPath);
    dirinfo(~[dirinfo.isdir]) = []; % remove non-directories
    tf = ismember( {dirinfo.name}, {'.', '..'});
    dirinfo(tf) = []; % remove current and parent directory
    if ~isempty(dirinfo)
        folders = extractfield(dirinfo,'name');
    else
        folders = [];
    end
    matFilesInfo = dir(fullfile(rootPath, '*.mat'));
    if ~isempty(matFilesInfo)
        matFilesNames = extractfield(matFilesInfo,'name');
    else
        matFilesNames = [];
    end
end
For example I have the following values in matFilesNames:
6×1 cell array
{'RX_Phase_SA_RXG0_TXG20_1.mat'}
{'RX_Phase_SA_RXG0_TXG20_2.mat'}
{'RX_Phase_SA_RXG0_TXG20_3.mat'}
{'RX_Phase_SA_RXG0_TXG30_1.mat'}
{'RX_Phase_SA_RXG0_TXG30_2.mat'}
{'RX_Phase_SA_RXG0_TXG40_1.mat'}
I would like to get the unique base names, together with the number of files for each one.
For the above example:
3 files for 'RX_Phase_SA_RXG0_TXG20', 2 files for 'RX_Phase_SA_RXG0_TXG30' and 1 file for 'RX_Phase_SA_RXG0_TXG40'.
How can I do this without a large number of loops and string splits?
Thank you!

Accepted Answer

Voss 2021-12-22
Edited: Voss 2021-12-22
Your code appears to be collecting information (dirinfo, folders) about sub-directories within the given rootPath, and then also information about the mat-files within rootPath, as opposed to mat-files within the sub-directories of rootPath. I don't know whether this is what is intended.
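To illustrate the distinction, here is a minimal sketch that lists .mat files directly inside a folder versus in the folder and all of its subfolders. The folder layout and file names (`a_1.mat`, `sub/b_1.mat`) are hypothetical and are created in a temporary location only so the sketch runs on its own; the `'**'` wildcard in `dir` needs R2016b or later.

```matlab
% Hypothetical root folder, created only so this sketch is self-contained.
rootPath = tempname;
mkdir(fullfile(rootPath, 'sub'));
fclose(fopen(fullfile(rootPath, 'a_1.mat'), 'w'));        % empty placeholder file
fclose(fopen(fullfile(rootPath, 'sub', 'b_1.mat'), 'w')); % empty placeholder file

% Only the .mat files directly inside rootPath (what the original code does):
topLevel = dir(fullfile(rootPath, '*.mat'));
namesTop = {topLevel.name};

% The .mat files in rootPath and in every subfolder below it:
recursive = dir(fullfile(rootPath, '**', '*.mat'));
namesAll = {recursive.name};

% Remove the temporary folder again.
rmdir(rootPath, 's');
```

Using `{S.name}` to collect the names also avoids `extractfield`, which is not part of base MATLAB.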
Regardless, you have a cell array and you want to count how many elements share the same base name, where I am assuming the base name is the part of each element before its last underscore. That means parsing each element, building the set of unique base names, and counting how many times each one occurs in the cell array. Here is one way you can do that:
matFilesNames = { ...
'RX_Phase_SA_RXG0_TXG20_1.mat'; ...
'RX_Phase_SA_RXG0_TXG20_2.mat'; ...
'RX_Phase_SA_RXG0_TXG20_3.mat'; ...
'RX_Phase_SA_RXG0_TXG30_1.mat'; ...
'RX_Phase_SA_RXG0_TXG30_2.mat'; ...
'RX_Phase_SA_RXG0_TXG40_1.mat'}; % using your example
% get the index of the last underscore in each name
idx = cellfun(@(x)find(x == '_',1,'last'),matFilesNames);
% store the part of each name up to but not including the last underscore
matFilesNames = arrayfun(@(x,y)x{1}(1:y),matFilesNames,idx-1,'UniformOutput',false);
% now get the unique set of names and index (j) where each one from the
% original set is in the unique set
[unique_names,~,j] = unique(matFilesNames);
% and count them up
n_unique = numel(unique_names);
counts = zeros(n_unique,1);
for i = 1:n_unique
    counts(i) = nnz(j == i);
end
display(unique_names);
unique_names = 3×1 cell array
{'RX_Phase_SA_RXG0_TXG20'}
{'RX_Phase_SA_RXG0_TXG30'}
{'RX_Phase_SA_RXG0_TXG40'}
display(counts);
counts = 3×1
3
2
1
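The counting loop can also be replaced by a single call. This is a minimal sketch, not part of the original answer, assuming the base names have already been extracted as above; the variable `names` is introduced here only for illustration.

```matlab
% Base names with the trailing "_<n>.mat" already stripped (example data).
names = { ...
    'RX_Phase_SA_RXG0_TXG20'; ...
    'RX_Phase_SA_RXG0_TXG20'; ...
    'RX_Phase_SA_RXG0_TXG20'; ...
    'RX_Phase_SA_RXG0_TXG30'; ...
    'RX_Phase_SA_RXG0_TXG30'; ...
    'RX_Phase_SA_RXG0_TXG40'};

% j(k) is the position of names{k} within unique_names.
[unique_names, ~, j] = unique(names);

% accumarray adds a 1 for every occurrence of each index in j,
% which gives the number of files per unique base name.
counts = accumarray(j, 1);
```

Here `counts` is a column vector with one entry per row of `unique_names`, in the same order.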

More Answers (1)

Stephen23 2022-1-10
Edited: Stephen23 2022-1-10
The simple MATLAB approach is to use one of the histogram functions rather than a loop, e.g.:
% S = dir(fullfile(rootPath,'**','*.mat'));
% C = {S.name};
C = { ...
'RX_Phase_SA_RXG0_TXG20_1.mat'; ...
'RX_Phase_SA_RXG0_TXG20_2.mat'; ...
'RX_Phase_SA_RXG0_TXG20_3.mat'; ...
'RX_Phase_SA_RXG0_TXG30_1.mat'; ...
'RX_Phase_SA_RXG0_TXG30_2.mat'; ...
'RX_Phase_SA_RXG0_TXG40_1.mat'};
[U,~,X] = unique(regexprep(C,'_[^_]+$',''))
U = 3×1 cell array
{'RX_Phase_SA_RXG0_TXG20'}
{'RX_Phase_SA_RXG0_TXG30'}
{'RX_Phase_SA_RXG0_TXG40'}
X = 6×1
1
1
1
2
2
3
N = histcounts(X)
N = 1×3
3 2 1
Optional output display:
compose("%s: %d",string(U),N(:))
ans = 3×1 string array
"RX_Phase_SA_RXG0_TXG20: 3"
"RX_Phase_SA_RXG0_TXG30: 2"
"RX_Phase_SA_RXG0_TXG40: 1"
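One caveat worth hedging: with no binning arguments, `histcounts` chooses the bins automatically, and I would not rely on that always giving exactly one bin per group index, for example when there are many groups. A minimal sketch that makes the integer binning explicit, using the documented `'BinMethod','integers'` option and the same example indices as above:

```matlab
% Group indices as returned by the third output of unique (example data).
X = [1; 1; 1; 2; 2; 3];

% Request one bin per integer value instead of relying on automatic binning.
N = histcounts(X, 'BinMethod', 'integers');
```

For this example the result matches the counts shown above, one per unique name.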
