Heart Sound Classifier

version (7.88 MB) by Bernhard Suhm
Heart Sound Classification demo as explained in the Machine Learning eBook, but now expanded to demonstrate Wavelet scattering


Updated 16 Oct 2019

This submission provides the code explained by the (upcoming) eBook on the complete machine learning workflow. Based on the heart sound recordings of the PhysioNet 2016 challenge, a model is developed that classifies heart sounds as normal or abnormal, and is then deployed in a prototype (heart) screening application. The workflow demonstrates:
1) using a datastore to efficiently read a large number of data files from several folders
2) using tools from signal processing, wavelets and statistics for feature extraction
3) using the Classification Learner app to interactively train, compare and optimize classifiers without writing any code
4) programmatically training an ensemble classifier with misclassification costs
5) applying automated feature selection to select a smaller subset of relevant features
6) performing C code generation for deployment to an embedded system
7) applying Wavelet scattering to automatically extract features that outperform manually engineered ones

Cite As

Bernhard Suhm (2020). Heart Sound Classifier (, MATLAB Central File Exchange. Retrieved .

Comments and Ratings (44)

Bernhard Suhm

David Moore: please contact me offline with more details. It could be that you accidentally passed the whole exported model to code generation instead of just the classification object. Or maybe you somehow created a very large model; some size limitations apply to code generation.
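To illustrate the first possibility mentioned above, here is a minimal sketch of saving only the classification object. It is not taken from the submission: the Fisher iris data stands in for the heart sound features, the struct `exported_model` and its field name `ClassificationEnsemble` mimic what a Classification Learner export typically looks like (an assumption about your export), and the file name is hypothetical.

```matlab
% Sketch only: iris data stands in for the heart sound feature table.
load fisheriris                               % ships with Statistics and Machine Learning Toolbox
full_mdl = fitcensemble(meas, species);       % small ensemble, for illustration

% Assumption: an exported model is a struct that wraps the classifier
% in a field such as ClassificationEnsemble, next to other fields.
exported_model = struct('ClassificationEnsemble', full_mdl);

% Pass only the (compacted) classification object, not the whole struct.
classifier_only = compact(exported_model.ClassificationEnsemble);

% Hypothetical file name; saveCompactModel writes <model_file>.mat.
model_file = fullfile(tempdir, 'SketchCompactModel');
saveCompactModel(classifier_only, model_file);
```

In R2019b and later, `saveLearnerForCoder` is the recommended name for the same step; `saveCompactModel` is used here because it is the function that appears in the error traces on this page.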

David Moore

This is a fantastic example, thanks. Just one issue - I get the following error when running the 'Code Generation' section of the script. Any ideas?

The input to coder.const cannot be reduced to a constant: Unsupported value.

Error in ==> loadCompactModel Line: 33 Column: 64
Code generation failed: View Error Report

qiu tao

shiela marie puro

soo very good tutorial. easy to understand

Bernhard Suhm

Certainly you can apply the same workflow to other data sets. You may find that other models work better on your problem (than SVM or random forests), and you probably have to adapt the features (but that is encapsulated in a different function) - or apply Wavelets. With the R2020a release I will push an update that features an all-new Auto ML example, including applying Wavelet scattering to this data, which can be a blueprint for applying it to other signal-type data.

James Flynn

Your work has really intrigued me and I am curious whether this program can work with other data. My partner and I are working on another classifier, and we are interested in retrofitting your code to our data to serve as a baseline for our actual classifier. Any thoughts?

Bernhard Suhm

Re: X.A. - the feature extraction slides a 5 sec window over the actual heart sound recordings, which vary in length. That's why we end up with many more rows than audio wave files. If you want to study what features are extracted, look at extractFeatures.m among the HelperFunctions.
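As a rough illustration of why one recording produces several feature rows, the sketch below cuts a synthetic signal into 5-second windows and computes one row per window. The sampling rate, the recording length, the non-overlapping windows and the two toy features are all assumptions made for this example; the actual logic lives in extractFeatures.m and may differ.

```matlab
% Sketch only: a synthetic signal stands in for one heart sound recording.
fs = 2000;                    % sampling rate in Hz (assumed)
window_length_sec = 5;        % window length mentioned in the comment above
recording_sec = 22;           % hypothetical recording duration
signal = randn(recording_sec * fs, 1);

window_samples = window_length_sec * fs;
% Assumption: windows do not overlap and a trailing partial window is dropped.
n_windows = floor(numel(signal) / window_samples);

feature_rows = zeros(n_windows, 2);           % two toy features per window
for k = 1:n_windows
    idx = (k - 1) * window_samples + (1:window_samples);
    segment = signal(idx);
    feature_rows(k, :) = [mean(segment), std(segment)];
end

fprintf('%d windows, so %d feature rows for this one file\n', ...
    n_windows, size(feature_rows, 1));
```

Under these assumptions a 22-second recording contributes four rows, which is in line with the roughly four rows per file implied by 13015 rows for 3240 files.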


How does the feature table (13015*28) correlate to the 3240 audio wave files? 4 rows for each file? Thanks

Sainath Urankar

Bernhard Suhm

OK, everyone, this latest version (1.6) has a bunch of new things:
• Moved hyperparameter tuning and cost matrices into the Classification Learner (requires R2019b)
• Added links applying Deep Learning at the end, and bonus section applying Wavelet scattering
• Minor tweaks to ensure script also runs on MacOS

Bernhard Suhm

That's right, there was a backward code incompatibility in the code generation section. I'm about to release a significantly updated version, leveraging cool new features available in R2019b, and demonstrating an automated feature extraction technique (Wavelet scattering). Stay tuned.

Moustafa SALEH

@Arun Pradhan
Instead of loading the "TrainedEnsembleModel_FeatSel.mat", try to regenerate the mat file.

Lucas Holtz

@Jacob: the challenge website including the data seems to be relocated by now. I found the website data in their archive following this link:

The respective links for the .zip files can be found on the bottom of that page:

add these into your live script and you are good to go!


weii chieun lim

@Arun Pradhan
I encountered the same problem. Do you find any solution for it?

Arun Pradhan

While running this script, at the code generation stage (Integrate Analytics with system section), I am getting these errors:

Error using rmfield (line 65)
A field named 'BinnedX' doesn't exist.

Error in classreg.learning.coderutils.classifToStruct (line 37)
dataSummary = rmfield(dataSummary,'BinnedX');

Error in classreg.learning.classif.CompactClassificationEnsemble/toStruct (line 397)
s = classreg.learning.coderutils.classifToStruct(this);

Error in saveCompactModel (line 23)
compactStruct = toStruct(compactObj); %#ok<NASGU>

Bernhard Suhm

@Jacob: the training data indeed is expected to be located in Data/training/training-a,b,... Not sure whether that solves your problem later in the script.
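A minimal sketch of reading that folder layout without merging the subfolders, so that files with identical names in training-a, training-b, etc. do not overwrite each other. The relative path follows the layout described above; `audioread` is used as a stand-in read function (the live script uses the helper importAudioFile instead), and the variable names are illustrative.

```matlab
% Sketch only: point the datastore at the parent folder and let it
% descend into training-a, training-b, ... on its own.
training_root = fullfile('Data', 'training');   % layout described above

if exist(training_root, 'dir') == 7
    training_fds = fileDatastore(training_root, ...
        'ReadFcn', @audioread, ...               % stand-in for importAudioFile
        'FileExtensions', '.wav', ...
        'IncludeSubfolders', true);
    fprintf('Found %d wav files\n', numel(training_fds.Files));
else
    fprintf('Folder %s not found; download the training data first.\n', ...
        training_root);
end
```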

Jacob Hoffman

I'm getting an error when the fileDatastore tries to read from the "training" folder. The download, when unzipped, contains several folders called training-a, training-b, etc. Should the contents of these folders be combined into a single folder called training? I tried this and it got past this step, but then I got errors later, I think because some of the files in each of these individual training folders had the same names and therefore overwrote each other when copied to the same folder.

teimoor bahrami

Hi, I want to predict heart attacks with the PhysioNet PTB diagnostic database. Should I download the whole database for this? I want the classification to be a two-class case: normal and heart attack.

kumia kuma

Sylvia Simon

Learning machine learning with it makes things easier. Great!

jiao zhang

it is very convenient!


Bernhard Suhm

@Muhammad: you must not have run the section of the live script that generates or loads the model after selecting specific features. You can simply run load('TrainedEnsembleModel_FeatSel.mat') to create that variable.

Undefined function or variable 'trained_model_featsel'.

Error in code_genration (line 2)


@Youjie Ye
I had the same problem. The function importAudioFile is in the folder HelperFunctions. You need to add the folder HelperFunctions to the path. If the code didn't do it for you, then right click on this folder in the left panel in Matlab and select 'Add to Path', then 'Selected Folders'.
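The same fix can be done from the command line. This is a small sketch that assumes the current folder is the root of the unzipped submission; adjust `helper_dir` if yours is elsewhere.

```matlab
% Sketch only: add the HelperFunctions folder to the path, then check
% that importAudioFile can be found.
helper_dir = fullfile(pwd, 'HelperFunctions');   % assumed location

if exist(helper_dir, 'dir') == 7
    addpath(helper_dir);
end

if exist('importAudioFile', 'file') ~= 2
    warning('importAudioFile is still not on the path; check the folder location.');
end
```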

Youjie Ye

Hi Mr. Suhm,
I'm reading your eBook; it is really a great tutorial. However, I ran into an error that stops the program when I did the hands-on exercise and ran the 'Access and Explore Data' section of the classifier. Here is the error information:

Error using fileDatastore (line 64)
Function importAudioFile does not exist.


Best regards,
Youjie

Bernhard Suhm

Re: JJ's problems.
1) If you plan to run the actual feature extraction, which accesses the data via a datastore, make sure you have the training data downloaded. The first code section will do that, but by default it's disabled (getTrainingData = 0).
2) gcp is a MATLAB command that initiates a pool of computation resources (including multiple cores of your CPU if available), but you need the Parallel Computing Toolbox installed, or you'll get an error.
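For point 2, one way to avoid the error is to call gcp only when it is available. The sketch below is an assumption about how such a guard could look, not code from the submission; it builds a tiny throwaway datastore from temporary text files so that it runs on its own, whereas the live script would use its own training_fds.

```matlab
% Sketch only: a throwaway datastore stands in for training_fds.
tmp_dir = tempname;
mkdir(tmp_dir);
for k = 1:3
    fid = fopen(fullfile(tmp_dir, sprintf('rec%d.txt', k)), 'w');
    fprintf(fid, '%d', k);
    fclose(fid);
end
training_fds = fileDatastore(tmp_dir, 'ReadFcn', @fileread);

% Use the parallel pool only if Parallel Computing Toolbox is licensed
% and the gcp function is actually on the path.
has_pct = license('test', 'Distrib_Computing_Toolbox') && ...
    exist('gcp', 'file') == 2;

if has_pct
    n_parts = numpartitions(training_fds, gcp);   % partitions sized for the pool
else
    n_parts = numpartitions(training_fds);        % default partitioning, no pool
end
fprintf('Using %d partitions\n', n_parts);

rmdir(tmp_dir, 's');                              % clean up the temporary files
```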



I tried to extract features for the feature_table by running the code, but I encountered two difficulties: 1) fileDatastore creates an error in line 105 -> Error using fileDatastore (line 105) Cannot find files or folders matching ...
2) I need to know gcp in order to run this line of the code: n_parts = numpartitions(training_fds, gcp); what is gcp?


Ehsan Modiri

It is perfect and easy to use. However, there is still a problem with the link below, which MathWorks offers as "Mastering Machine Learning: A Step-by-Step Guide with MATLAB".
It has not been updated to the latest version of your function.

Bernhard Suhm

With today's update, the code matches the example from our advanced machine learning eBook without further changes, and the code generation issue that several ran into should be resolved (now, the reduced model will use 15 features). Keep the feedback coming!


Very good job!

yiran duan

Shengwen Li

texas instruments

Forrest Titcomb

I get an error on line 102, fileDatastore. It says "can't find files or folders matching" and then it lists a path. In the left pane there is a data folder with a validation subfolder with lots of .wav files in it. I tried to delete the data folder, so it could be reloaded, but I didn't have permission. The script runs up through the FFT plots.

Bernhard Suhm

My apologies, for some reason the last update didn't pick up those corrections. You can fix it by modifying a line in extractCodegenFeatures to "number_of_features = 14;"

Islam Alam

This example is awesome.
However, there is an error in HelperFunctions/trifbank.m, where the first line is vfunction instead of function.

Then, when I corrected this typo, I got the following error, which I cannot solve at the moment:

??? X data must have 14 columns.

Error in ==> classifyHeartSounds Line: 20 Column: 20
Code generation failed: View Error Report

Could you please provide some help to solve the second error?

Thanks so much for such a great tutorial.

Rohan Leekha

Error using mfcc
Error using Extract functions

How do I correct these errors? Could you help me out, sir?


Good day Mr. Suhm,

Could you please reupload/show me the code of the wavelet_features helper function?
I am doing a similar machine learning project with sEMG, have learned a lot from this lecture already and seeing/understanding the wavelet_features function would be of great help to me.

Best regards,
Riad


Would be cool, but unfortunately it does not work. One error is in HelperFunctions/trifbank.m, where the first line is 'vfunction' instead of 'function'.

In the live script, one line reads save('FeatureTable1', 'feature_table'); but should be save('FeatureTable', 'feature_table');

Codegen also fails with

??? X data must have 14 columns.

Error in ==> classifyHeartSounds Line: 20 Column: 20
Code generation failed: View Error Report

sunny shah

xiaojuan ni


- Moved hyperparameter tuning and cost matrices into the Classification Learner
- Added "bonus" section applying Wavelet scattering
- Fixed problem caused by 'binnedX' field introduced in R2019a
- Converted paths to be compatible with MacOS and Linux

Updated version to exactly match the example used for the "Advanced Machine Learning" eBook after obtaining permission to use code authored by a third party.

Fixed bugs uncovered by hanspeter

Actually use version without the signal_entropy feature

Removed reference to signal_entropy.m, which was owned by someone outside MathWorks.

MATLAB Release Compatibility
Created with R2019b
Compatible with R2016b to any release
Platform Compatibility
Windows macOS Linux