How can I convert 1D sparse data into learnable format for machine learning?

Question

Hello

I want to do sequence-to-sequence regression where the inputs are sparse 1D arrays and the outputs are 1D signals. I tried an LSTM network, but the training loss just fluctuated instead of decreasing. I suspect this is because of the sparsity of the data. Is there any way to deal with this problem, for example by converting the sparse dataset into a format that is easier for a network to learn from?

Thanks in advance.

Answer 1

Piyush Dubey 2023-9-1

Edited: Piyush Dubey 2023-9-4

Hi NASRIN,

I understand that you have sparse one-dimensional data, that the training loss fluctuates instead of decreasing consistently, and that you want to convert the data into a format that is easier to learn from.

In this case, you can try the following approaches to deal with your sparse dataset:

1. Principal Component Analysis (PCA): PCA is a dimensionality reduction method. It projects the dataset onto a smaller number of components that capture most of its variance, so only the most prominent structure is kept in the output.

Please refer to the following MathWorks documentation link for more information on “pca” in MATLAB: https://in.mathworks.com/help/stats/pca.html
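
To make the idea behind PCA concrete, here is a minimal sketch in plain Python rather than MATLAB. It is not how “pca” is implemented; it only finds the first principal component by power iteration on the covariance matrix. The function name `first_principal_component` and the toy data are made up for illustration, and the sign of the returned direction is arbitrary.

```python
import math
import random

def first_principal_component(rows, iterations=200, seed=0):
    """Return (unit direction, scores) for the first principal component.

    rows: list of observations, each a list of d feature values.
    """
    n, d = len(rows), len(rows[0])
    # PCA works on mean-centered data, so subtract each column's mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration from a seeded random start: apply cov, renormalize, repeat.
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance in the data, nothing to extract
            break
        v = [x / norm for x in w]
    # Project each centered observation onto the component.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores

# Toy data: the second feature is twice the first, the third is always zero.
rows = [[0.0, 0.0, 0.0], [1.0, 2.0, 0.0], [2.0, 4.0, 0.0], [3.0, 6.0, 0.0]]
direction, scores = first_principal_component(rows)
```

Here the three input features collapse to a single score per observation, and the always-zero feature gets no weight in the direction.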

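2. Feature Hashing: Feature hashing is a technique used on sparse datasets in which the features are hashed into a fixed, desired number of bins, giving a compact output of known size. “hashSimilarityModel” can be helpful here, although note that it is part of Predictive Maintenance Toolbox and is designed for remaining-useful-life estimation, so check whether it fits your regression workflow.

You can refer to the following MathWorks documentation link for a better understanding of “hashSimilarityModel”: https://in.mathworks.com/help/predmaint/ref/hashsimilaritymodel.html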
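
As a rough illustration of the general hashing trick (not of what “hashSimilarityModel” does internally), the sketch below folds a sparse row into a fixed number of bins. It is plain Python; the function name `hash_features`, the index-to-value dictionary layout, and the use of MD5 as the hash are all assumptions made for the example.

```python
import hashlib

def hash_features(sparse_row, n_buckets):
    """Fold a sparse row {feature_index: value} into a dense list of n_buckets."""
    dense = [0.0] * n_buckets
    for index, value in sparse_row.items():
        digest = hashlib.md5(str(index).encode("utf-8")).digest()
        # The first four bytes of the digest pick the bucket.
        bucket = int.from_bytes(digest[:4], "little") % n_buckets
        # Another byte picks a sign, so colliding features tend to cancel
        # rather than pile up.
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        dense[bucket] += sign * value
    return dense

# A length-10,000 input with only three nonzero entries, stored as index -> value.
sparse_row = {17: 0.5, 4203: -1.2, 9981: 3.0}
dense = hash_features(sparse_row, n_buckets=8)
```

The output always has `n_buckets` entries no matter how long the original array is. For sequence inputs, where the position of a nonzero entry carries meaning, hashing discards that ordering, so whether it helps depends on the data.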

3. Feature Selection and Extraction: Within sparse data, you can identify, by inspection or by experiment, the features that cause significant changes in the training loss and therefore contribute most to training the model. Keeping those features, eliminating the rest, or weighting specific features can also help when dealing with sparse data.
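
One simple way to start, shown here as a sketch only, is to drop features that are almost always zero before training. This is a stand-in for the loss-based selection described above, not the same thing. The function name `select_dense_features` and the threshold value are assumptions for the example.

```python
def select_dense_features(rows, min_nonzero_fraction=0.1):
    """Keep only the columns that are nonzero in at least the given fraction of rows."""
    n, d = len(rows), len(rows[0])
    keep = [j for j in range(d)
            if sum(1 for r in rows if r[j] != 0) / n >= min_nonzero_fraction]
    reduced = [[r[j] for j in keep] for r in rows]
    return keep, reduced

# Toy data: column 1 is nonzero in 3 of 4 rows, column 3 in 1 of 4,
# and columns 0 and 2 are always zero.
rows = [
    [0.0, 1.5, 0.0, 0.0],
    [0.0, 0.0, 0.0, 2.0],
    [0.0, 0.7, 0.0, 0.0],
    [0.0, 0.9, 0.0, 0.0],
]
keep, reduced = select_dense_features(rows, min_nonzero_fraction=0.5)
```

With a threshold of 0.5, only column 1 survives. A rarely active feature can still be important, so it is worth checking the effect on the validation loss before discarding anything for good.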

4. t-Distributed Stochastic Neighbor Embedding (t-SNE): t-SNE embeds high-dimensional data in a low-dimensional space. In MATLAB, “tsne” first removes each row of the input data ‘X’ that contains any NaN values. Then, if the ‘Standardize’ name-value argument is true, “tsne” centers ‘X’ by subtracting the mean of each column, and scales ‘X’ by dividing its columns by their standard deviations.

Please refer to the following MathWorks documentation links to learn more about using “tsne”: https://in.mathworks.com/help/stats/tsne.html

https://in.mathworks.com/help/stats/t-sne.html
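
The preprocessing that “tsne” is described as applying (drop rows containing NaN, then center and scale each column) can be sketched in plain Python as follows. This only mirrors those two steps and does not perform the embedding itself. The function name is made up, the sample standard deviation (dividing by N-1) is an assumption, and mapping constant columns to zero is a choice made here to avoid dividing by zero.

```python
import math
import statistics

def drop_nan_rows_and_standardize(rows):
    """Remove rows containing NaN, then center and scale each column."""
    clean = [r for r in rows if not any(math.isnan(x) for x in r)]
    d = len(clean[0])
    columns = [[r[j] for r in clean] for j in range(d)]
    means = [statistics.mean(col) for col in columns]
    # Sample standard deviation; needs at least two remaining rows.
    stds = [statistics.stdev(col) for col in columns]
    return [
        [(r[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0 for j in range(d)]
        for r in clean
    ]

rows = [
    [1.0, 10.0],
    [2.0, float("nan")],  # this row is removed before any statistics are computed
    [3.0, 30.0],
    [5.0, 20.0],
]
standardized = drop_nan_rows_and_standardize(rows)
```

After this step each remaining column has zero mean and unit standard deviation, which keeps a few large nonzero values from dominating the rest of a sparse input.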

I hope this helps.
