Storing 200GB audio spectrograms in a tall table, is this possible?

1 次查看(过去 30 天)
Hi,
I'm processing 200GB of 1 minute audio files in a way that for each file I store in a table the filename, a timestamp and the spectum (1x64000) for each of the 60s. Then I save each table to a mat file:
for f=1:length(totFiles)
%Audio data
File=tot(f).name(1:end-4);
Fecha=datetime(str2double(File(9:12)),str2double(File(13:14)),...
str2double(File(15:16)),str2double(File(18:19)),...
str2double(File(20:21)),str2double(File(22:23)));
%Audio read
[x,fs]=audioread(strcat(tot(f).folder,'/',tot(f).name));
long=length(x)/fs;%long audio en s
%Spectrum calculation each second
xf=reshape(x,1*fs,[]);
sp=pwelch(xf,fs,fs/2,fs,fs,'power');%Ojo si wlen =! 1*fs
%Table creation
T(1:60,:)=table((Fecha+seconds(1:60))',...
strcat(repmat(File,60,1),suff),sp',...
'VariableNames',{'Fecha','File','sp'});
location=('/Volumes/Almacén/matlab/espectrosCortegada/');
save(strcat(location,'espectrosFile_',num2str(f),'.mat'),'T');
clear T;clear x;
end
The problem is that whe I want to recover all this files in a tall array trough a datastore i get the error:
ds=datastore('/Volumes/Almacén/matlab/espectrosCortegada/*.mat')
Error using datastore
Cannot determine the datastore type for the specified location.
Specify the 'Type' name-value pair argument to indicate the type of datastore to create.
>> ds=datastore('/Volumes/Almacén/matlab/espectrosCortegada/*.mat','Type','file')
Error using datastore
Incorrect number of input arguments. Specify a function handle with the 'ReadFcn' parameter.
Any clue on how to face this problem or if this even possible?

采纳的回答

jibrahim
jibrahim 2023-3-23
Hi David,
Please find below a possible solution that uses Audio Toolbox functionality. It uses a sample dataset as an example.
% Download the Free Spoken Digit Data Set (FSDD).
% FSDD consists of 2000 recordings of four speakers saying the numbers 0
% through 9 in English.
downloadFolder = matlab.internal.examples.downloadSupportFile("audio","FSDD.zip");
dataFolder = tempdir;
unzip(downloadFolder,dataFolder)
dataset = fullfile(dataFolder,"FSDD");
% Create an audioDatastore that points to the dataset.
ads = audioDatastore(dataset,IncludeSubfolders=true);
% Create a transformed datastore that computes spectra from audio data.
% Here, use pwelch.
adsSpec = transform(ads,@(x)pwelch(x,'power'));
% Use writeall to write spectra to disk. Set UseParallel to
% true to perform writing in parallel.
outputLocation = fullfile(tempdir,"MyFeatures");
writeall(adsSpec,outputLocation,WriteFcn=@myCustomWriter,UseParallel=true);
% Create a signalDatastore that points to the out-of-memory features. The
% read function returns a spectrum/timestamp pair.
sds = signalDatastore(outputLocation,IncludeSubfolders=true, ...
SignalVariableNames=["spec","timestamp"],ReadOutputOrientation="row");
% Read one pair of spectrum/timestamp
y = read(sds)
% Create a tall table
t = tall(sds);
function myCustomWriter(spec,writeInfo,~)
% myCustomWriter(spec,writeInfo,~) writes spectra/time stamps
% pair to MAT files.
filename = strrep(writeInfo.SuggestedOutputName,".wav",".mat");
% also write a time stamp as an example
timestamp = datetime('now');
save(filename,"spec","timestamp");
end
  7 个评论
jibrahim
jibrahim 2023-4-4
Not sure if this is what you want, but if you change one line of code in Procesado to:
features = [Kurtosis,Entropy];
then this works:
t = tall(sds);
Y = mean(cell2mat(t(:,1)));
Y = gather(Y)

请先登录,再进行评论。

更多回答(0 个)

类别

Help CenterFile Exchange 中查找有关 AI for Signals 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by