Acquire and Analyze Audio Data from Mobile Device
This example shows how to use the fast Fourier transform (FFT) to analyze microphone audio data collected by MATLAB Mobile. It requires a mobile device with Sensor Access turned on. To turn on Sensor Access, in the MATLAB Mobile app go to Sensors > More and switch on Sensor Access.
Connect to Device
Create the connection to the mobile device and enable the microphone.
mobileDevObject = mobiledev;
mobileDevObject.MicrophoneEnabled = 1;
Record Background Sound
Turn on the microphone to start recording background noise, then read the 2 seconds of recorded audio. The recorded data is a double matrix of size NumSamples-by-NumChannels. Use its maximum absolute value as the threshold that separates background noise from detected speech.
mobileDevObject.logging = 1;
disp('Sampling background noise...')
pause(2)
mobileDevObject.logging = 0;
audioData = readAudio(mobileDevObject);
disp('Maximum sound of the background noise: ')
threshold = max(abs(audioData), [], "all")

Sampling background noise...
Maximum sound of the background noise:
threshold = 0.0122
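The threshold step can be tried without a device. The following is a minimal sketch that substitutes synthetic two-channel noise for the output of readAudio; the sample rate and noise level are assumptions chosen for illustration. It shows why the "all" option matters: without it, max returns one peak per channel rather than a single scalar.

```matlab
% Synthetic stand-in for readAudio output (assumed values, not device data).
rng(0);                                      % fixed seed for repeatability
sampleRate = 44100;                          % assumed sample rate in Hz
noise = 0.003 * randn(2 * sampleRate, 2);    % NumSamples-by-NumChannels

% Without "all", max works column-wise and returns one peak per channel.
perChannel = max(abs(noise));                % 1-by-2 row vector

% With "all", max reduces over every sample and channel to one scalar.
threshold = max(abs(noise), [], "all");
```

A scalar threshold is convenient because later comparisons against it then produce a single logical value.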
Record Speech
Start the speech recording.
disp('Speak into the device microphone for a few seconds. For example, say: ')
mobileDevObject.logging = 1;
tic
disp('"Testing MATLAB Mobile Audio"')
startTime = 0;
totalAudio = [];

Speak into the device microphone for a few seconds. For example, say:
"Testing MATLAB Mobile Audio"
Detect Speech to Trigger Acquisition
Attempt to detect speech for 5 seconds. Pause every 200 ms and read the buffer. If the maximum absolute value in the window exceeds 1.5 times the threshold, discard the previously collected background audio and start collecting the intended speech audio. If speech is not detected, process the audio data collected in the last 5 seconds.
while toc < 5 && startTime == 0
    pause(.2)
    audioData = readAudio(mobileDevObject);
    % Reduce over all samples and channels so the comparison is a scalar.
    if max(abs(audioData), [], "all") > threshold * 1.5
        startTime = toc
        totalAudio = audioData;
    else
        totalAudio = vertcat(totalAudio, audioData);
    end
end
startTime = 1.4202
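The trigger logic can be exercised without a device. In this sketch, each cell of the hypothetical chunks array stands in for one 200 ms readAudio result; the chunk length, threshold, and amplitudes are assumptions, and the loop counter replaces toc.

```matlab
% Device-free sketch of the speech trigger (all values are assumptions).
rng(1);
chunkLength = 1600;                  % samples per 200 ms at an assumed 8 kHz
threshold = 0.012;                   % assumed background-noise peak
chunks = cell(1, 6);
for k = 1:numel(chunks)
    chunks{k} = 0.002 * randn(chunkLength, 1);      % quiet background
end
% Make chunk 4 loud: a 200 Hz tone with amplitude 0.2.
chunks{4} = 0.2 * sin(2*pi*200*(0:chunkLength-1)'/8000);

triggerChunk = 0;                    % 0 means speech not detected yet
totalAudio = [];
for k = 1:numel(chunks)
    audioData = chunks{k};
    if max(abs(audioData), [], "all") > 1.5 * threshold
        triggerChunk = k;
        totalAudio = audioData;      % discard the buffered background
        break
    else
        totalAudio = vertcat(totalAudio, audioData);
    end
end
```

Here the trigger fires on the fourth chunk, and totalAudio then holds only that chunk because the earlier background chunks are dropped.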
Acquire Audio Data
Pause every 200 ms and read the buffer. Collect audio data until the speech ends or until 10 seconds have elapsed since recording started. If no speech is detected for 400 ms (two consecutive 200 ms intervals), terminate acquisition.
if startTime ~= 0
    numberOfIntervalsStopped = 0;
    while numberOfIntervalsStopped < 2 && toc < 10
        pause(.2)
        audioData = readAudio(mobileDevObject);
        % Reduce over all samples and channels so the comparison is a scalar.
        if max(abs(audioData), [], "all") < threshold * 1.5
            numberOfIntervalsStopped = numberOfIntervalsStopped + 1;
        else
            numberOfIntervalsStopped = 0;
        end
        totalAudio = vertcat(totalAudio, audioData);
    end
end
mobileDevObject.logging = 0;
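The stop condition requires two consecutive quiet intervals, so a single quiet interval followed by more speech resets the counter. A small sketch with made-up per-interval peak values (chunkPeaks is a hypothetical stand-in for the peak of each 200 ms read) illustrates this:

```matlab
% Assumed peak amplitude of each successive 200 ms read.
chunkPeaks = [0.2 0.15 0.01 0.18 0.01 0.005 0.2];
threshold = 0.012;                   % assumed background-noise peak

quietCount = 0;                      % consecutive quiet intervals so far
chunksRead = 0;
while quietCount < 2 && chunksRead < numel(chunkPeaks)
    chunksRead = chunksRead + 1;
    if chunkPeaks(chunksRead) < 1.5 * threshold
        quietCount = quietCount + 1; % another quiet interval in a row
    else
        quietCount = 0;              % speech resumed, so reset the counter
    end
end
```

The isolated quiet interval at position 3 does not end acquisition; the loop stops after the sixth interval, once intervals 5 and 6 are both quiet.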
Preprocess Audio Data
Only one channel of data is needed, so keep the first channel. n is the number of samples in leftAudio and is used for graphing and processing. Get the microphone sample rate to determine the frequency scale later.
endTime = toc;
% Check for missing data before indexing; indexing column 1 of an empty array errors.
if isempty(totalAudio)
    disp(' ')
    disp('No audio data recorded. Try to run the script again.')
    clear mobileDevObject
    return
end
leftAudio = totalAudio(:,1);
n = numel(leftAudio);
sampleRate = mobileDevObject.Microphone.SampleRate;
Plot Audio Data in Time Domain
Use the elapsed time to determine the timestamps of the ticks on the graph. Convert the timestamps to their corresponding sample indices to find their locations on the x-axis. Display them with the xticks function. Use the original ticks array for the labels.
figure(1);
plot(leftAudio)
title('Sound wave');
timeElapsed = endTime - startTime
ticks = 0:floor(timeElapsed);
sampleTicks = ticks * n/timeElapsed;
xticks(sampleTicks)
xticklabels(ticks)
xlabel('Time(s)')
ylabel('Amplitude')

timeElapsed = 8.7632
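As a worked example of the tick conversion, the sketch below uses assumed round numbers in place of measured values. The ratio n/timeElapsed acts as an effective sample rate, so each whole-second timestamp maps to a sample index.

```matlab
% Assumed values for illustration only.
sampleRate = 8000;                       % samples per second
timeElapsed = 3.5;                       % seconds of recorded audio
n = sampleRate * timeElapsed;            % 28000 samples

ticks = 0:floor(timeElapsed);            % whole seconds: [0 1 2 3]
sampleTicks = ticks * n / timeElapsed;   % sample indices: [0 8000 16000 24000]
```

On a real recording the measured elapsed time includes pauses and read latency, so n/timeElapsed may differ somewhat from the nominal microphone sample rate.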
Process Audio Data in Frequency Domain
Use the fft function to convert the original time-domain data into frequency-domain amplitudes.
fftData = fft(leftAudio);
% Signal length is equal to the number of samples.
signalLength = n;
% Normalize the FFT data by dividing by signalLength.
fftNormal = abs(fftData/signalLength);
% The second half of the FFT data is a reflection of the first half
% and is not relevant in this case, so remove those values.
fftNormal = fftNormal(1:floor(signalLength/2)+1);
% Multiply the final values by 2 to account for removed values.
fftNormal(2:end-1) = 2*fftNormal(2:end-1);
% freqs is the x-axis scale of the graph.
freqs = sampleRate*(0:(signalLength/2))/signalLength;
% Conversion factor from bin index to frequency (Hz per bin).
scale = sampleRate/signalLength;
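To check the normalization, the same steps can be applied to a synthetic signal whose amplitudes are known. This sketch assumes a 1000 Hz sample rate and two sine tones (50 Hz at amplitude 0.7 and 120 Hz at amplitude 0.3); the single-sided spectrum should recover those amplitudes at the matching frequencies.

```matlab
% Synthetic test signal (assumed values, not device data).
sampleRate = 1000;                                % Hz
signalLength = 1000;                              % samples (1 second)
t = (0:signalLength-1)' / sampleRate;
x = 0.7*sin(2*pi*50*t) + 0.3*sin(2*pi*120*t);

% Same single-sided amplitude spectrum steps as in the example.
fftData = fft(x);
fftNormal = abs(fftData / signalLength);
fftNormal = fftNormal(1:floor(signalLength/2)+1);
fftNormal(2:end-1) = 2*fftNormal(2:end-1);
freqs = sampleRate*(0:(signalLength/2))/signalLength;

% Amplitudes at the two tone frequencies (exact matches in freqs).
amp50  = fftNormal(freqs == 50);
amp120 = fftNormal(freqs == 120);
```

Both tones complete a whole number of cycles in the window, so their energy falls in single bins and amp50 and amp120 come out very close to 0.7 and 0.3. Recorded speech does not line up with the bins this way, so its peaks are spread across neighboring bins.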
Plot Audio Data in Frequency Domain of 0-1000 Hz
cutoff = 1000/scale;
figure(2);
plot(freqs(1:floor(cutoff)),fftNormal(1:floor(cutoff)))
title("Frequency Domain Graph")
xlabel("Frequency (Hz)")
ylabel("Amplitude")
ax = gca;
ax.XAxis.Exponent = 0;
Final Frequency Analysis and Clean Up
Print the dominant frequency. The index of the maximum FFT amplitude identifies the dominant bin. Because MATLAB indexing starts at 1 and the first bin corresponds to 0 Hz, subtract 1 from the index before converting it to Hz with the calculated scale.
[mVal, mInd] = max(fftNormal);
% Index 1 is the 0 Hz bin, so bin mInd corresponds to (mInd - 1)*scale Hz.
fprintf("Dominant frequency: %d Hz\n",floor((mInd - 1) * scale));
if startTime == 0
    disp(' ')
    disp('The voice of the speech is too low compared to the background noise, analysis might not be precise. Try to run the script again and speak louder.');
end
clear mobileDevObject
Dominant frequency: 125 Hz
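The conversion from bin index to frequency is easiest to see with a coarse frequency resolution. The sketch below assumes an 8000 Hz sample rate and only 64 samples, so each bin is 125 Hz wide, and uses a pure 250 Hz tone as a stand-in for recorded audio. Because the first FFT bin (index 1) is 0 Hz, the bin at index k corresponds to (k - 1)*scale Hz.

```matlab
% Assumed values chosen so that the bin spacing is large.
sampleRate = 8000;                         % Hz
signalLength = 64;                         % samples
scale = sampleRate / signalLength;         % 125 Hz per bin
t = (0:signalLength-1)' / sampleRate;
x = sin(2*pi*250*t);                       % pure 250 Hz tone

spectrum = abs(fft(x));
spectrum = spectrum(1:signalLength/2 + 1); % keep the non-mirrored half
[~, mInd] = max(spectrum);                 % 1-based index of the peak

dominantFreq = (mInd - 1) * scale;         % index 1 is the 0 Hz bin
```

The peak lands at index 3, which maps to 250 Hz. With a long recording the bin spacing is a fraction of a hertz, so the one-bin offset is much smaller than in this example.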