Applying SMOTE for multi-channel unbalanced dataset

Question

Demet 2022-10-1

https://ww2.mathworks.cn/matlabcentral/answers/1815700-applying-smote-for-multi-channel-unbalanced-dataset

Answered: William Rose 2022-10-2

Hello,

I have 19-channel EEG data that I want to classify into two groups, but the groups are heavily imbalanced (group 1: 1308 samples; group 2: 38624 samples, almost a 30-fold difference). For this reason I want to apply SMOTE, but I could not find out how to use MATLAB's SMOTE functions for multi-channel data [1, 2, 3, 4, etc.]. I would be very grateful if you could help me do this.

I am also thinking of applying SMOTE to each channel separately, but I am not sure whether that makes sense. Which method do you think is more appropriate for data with a 30-fold difference between the two groups?

Comments

William Rose 2022-10-1

Edited: William Rose 2022-10-2

[I moved this comment to be an answer.]

@Demet,

...

Answer 1

William Rose 2022-10-2

https://ww2.mathworks.cn/matlabcentral/answers/1815700-applying-smote-for-multi-channel-unbalanced-dataset#answer_1064925

@Demet,

The "MATLAB SMOTE functions" you refer to are from the File Exchange; they are not part of standard MATLAB. The File Exchange contains user-supplied code that is not supported by MathWorks. It is good to be clear about this when posting a question. There are at least four sets of SMOTE routines on the File Exchange, so please specify which one you are using. Use the File Exchange site to ask the author a question, and read the reviews and discussion in case others have already asked or answered your question.

I assume you are extracting features from the EEG on each channel and using those features as the input to the SMOTE routine. I would first try it with one channel, to get your code working and to make sure the calls to the SMOTE routine run without error. Once that works for one channel, add the features observed on the other channels, concatenated into one long vector.
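A minimal sketch of that idea follows, written in Python rather than MATLAB and not tied to any particular File Exchange submission. Each epoch's per-channel features are concatenated into one vector, and a basic SMOTE step interpolates between a minority sample and one of its k nearest minority neighbours. The function names (`concat_channels`, `smote`), the toy data, and the parameter values are illustrative assumptions, not part of any library.

```python
import math
import random


def concat_channels(epoch_features):
    """Flatten [channel][feature] values into one long feature vector."""
    return [value for channel in epoch_features for value in channel]


def smote(minority, n_new, k=5, seed=0):
    """Basic SMOTE: interpolate between minority samples and their neighbours.

    minority: list of equal-length feature vectors (minority class only).
    Returns a list of n_new synthetic vectors.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (hypothetical): 6 minority epochs, 19 channels, 5 features each.
data_rng = random.Random(1)
epochs = [[[data_rng.gauss(0, 1) for _ in range(5)] for _ in range(19)]
          for _ in range(6)]
minority_vectors = [concat_channels(e) for e in epochs]
new_vectors = smote(minority_vectors, n_new=10, k=3)
print(len(new_vectors), len(new_vectors[0]))  # 10 95
```

Because the interpolation runs on the whole concatenated vector, every channel of a synthetic sample comes from the same pair of real epochs. Running SMOTE on each channel separately could pair each channel with a different neighbour, which is one reason to prefer the concatenated form.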

See this paper. They extract 2325 features from each 60-second EEG, on each channel*. They use borderline SMOTE on 32-channel EEGs to augment the data used to train a neural network that determines emotion from the EEG. It is not clear how they combine the 32 channels of features when they do borderline SMOTE.

*The features they extract are power at 5 frequencies, measured using the STFT in 465 overlapping 2-second windows. Successive windows overlap by about 94%. If I were them, I would use 50% overlap, to reduce the number of highly redundant data points.
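The overlap figure can be checked with a little arithmetic, assuming the 465 windows are evenly spaced so that the first starts at 0 s and the last ends at 60 s (an assumption; the exact spacing is not stated here). Python is used only as a calculator, and the variable names are illustrative.

```python
record_s = 60.0   # recording length in seconds
window_s = 2.0    # STFT window length in seconds
n_windows = 465   # windows per channel
n_bands = 5       # frequencies at which power is measured

# Evenly spaced windows: the first starts at 0 s, the last ends at 60 s.
step_s = (record_s - window_s) / (n_windows - 1)
overlap = 1.0 - step_s / window_s
features_per_channel = n_bands * n_windows

# Window count for the same recording at 50% overlap.
step_50 = window_s * 0.5
n_windows_50 = int((record_s - window_s) / step_50) + 1

print(step_s, overlap, features_per_channel, n_windows_50)
# 0.125 0.9375 2325 59
```

Under that assumption the step between windows is 0.125 s, which gives 93.75% overlap, and 50% overlap would leave 59 windows per channel instead of 465.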

Comments

Demet 2022-10-2

Hello,

First of all, thank you for your response.

The data I have is multi-channel, and I am thinking of classifying it as multi-channel data first and then extracting features, to see how well or badly the feature extraction process works.

I will not accept your answer for now, as I am waiting for a reply from someone who has actually worked with SMOTE or other under-/oversampling functions.

Answer 2

William Rose 2022-10-2

https://ww2.mathworks.cn/matlabcentral/answers/1815700-applying-smote-for-multi-channel-unbalanced-dataset#answer_1065110

@Demet, you're welcome.

I notice that the manuscript link I provided took a long time to load when I tried it. Here is an alternative link for the same manuscript: Y. Chen et al. Effects of Data Augmentation Method Borderline-SMOTE. IEEE Access, 2021.

The advantage of doing feature extraction before classifying is that it reduces the dimensionality. A single 60-second, 19-channel EEG, downsampled to 128 Hz, has about 146,000 points. The feature extraction approach of Chen et al. (2021) reduces this by a factor of 16, and I think they could have reduced it by a factor of 128, with similar results. Another example of feature extraction before classification is Kalashami et al., EEG Feature Extraction and Data Augmentation in Emotion Recognition, 2022. They extract a total of 334 features from each 60-second, 32-channel recording.
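For reference, the raw point count works out as follows. Applying the two quoted reduction factors to the 19-channel case is purely illustrative, and the variable names are assumptions.

```python
duration_s = 60    # recording length in seconds
n_channels = 19    # EEG channels
fs_hz = 128        # sampling rate after downsampling

n_points = duration_s * n_channels * fs_hz
print(n_points)                         # 145920
print(n_points // 16, n_points // 128)  # 9120 1140
```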

Good luck.
