Using Sparse Matrix in Series Network

I'm attempting to use a Series Network for a classification problem, but the data I'm currently using is far too large. To deal with this, I've been using the sparse function to convert the data into sparse matrices, but I keep getting the error message "Invalid input data type". I've written a smaller script with a simpler data file to test it, and even switched to a regression problem in case the issue was the way I had set up the classification problem. However, even when I can get this simple problem to run, converting the input to sparse and passing it to the training function causes errors. Is it possible to use sparse matrices in series networks? Or am I out of luck?
I've attached a test file that contains the very simple example script. The dataset is the "accidents.mat" file that ships with MATLAB.

Answers (1)

Shivam Malviya 2022-10-11
Hi Rachel,
I understand that you want to work with a large dataset and that, to reduce its size, you are converting it into a sparse matrix.
"Is it possible to use sparse matrices in series networks?"
No, trainNetwork does not support sparse matrices. That said, I have passed this on to the relevant team.
To handle large datasets with neural networks, you can use datastores. I have attached a sample script that loads one row at a time from the hwydata variable in the accidents.mat file.
Please refer to the following links for more information:
I hope the above information helps.
  2 Comments
Rachel Bennett 2022-10-11
This is very helpful, thank you, though I'm sorry to learn that I cannot use sparse matrices. I tested the file you sent and it works, although on a much larger problem it seems to take a long time to run. Additionally, what if I were to change it back to a classification problem (like my original problem)? When I do that, I get the error "Invalid training data. The output size (2) of the last layer does not match the number of classes of the response (1)." Is this because we're reading in only one line of data at a time? I've attached my sample code, which is changed only slightly from your original example.
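One possible explanation for the class-count error, offered as a sketch rather than a confirmed diagnosis since the attached script is not shown here: if the response is converted to categorical inside the read function, a single row can only ever contain one label, so the resulting categorical array carries one category instead of two. Passing an explicit value set to categorical keeps every class registered. The class list `classNames` and the label `rawLabel` below are hypothetical stand-ins, not values from the original script.

```matlab
% Hypothetical class list; replace it with the labels in your own data.
classNames = [0 1];

% A single label, as it would arrive when one row is read at a time.
rawLabel = 1;

% Without a value set, the categorical array knows only the class it saw.
narrow = categorical(rawLabel);
disp(numel(categories(narrow)))   % 1

% With an explicit value set, both classes are registered even though
% only one of them occurs in this read.
full = categorical(rawLabel, classNames);
disp(numel(categories(full)))     % 2
```

If this is indeed the cause, building the response with the full value set in every read should let the number of classes match the output size of the last layer.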
Shivam Malviya 2022-10-12
Hi Rachel,
"On a much larger problem, it seems to take a long time to run."
  • We can improve the performance by making the batch size and the read size equal. I have updated the script. Please find the attached script.
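The idea of matching the read size to the batch size can be sketched roughly as follows. This is not the attached script: it uses arrayDatastore as an in-memory stand-in for whatever datastore the script actually builds, the value of `batchSize` is arbitrary, and the predictor/response split of hwydata is hypothetical. The data may also need to be transformed into the layout that trainNetwork expects before training.

```matlab
% Hypothetical value; tune it to the memory you have available.
batchSize = 16;

% Stand-in data: hwydata from accidents.mat is small enough to load whole.
load accidents hwydata
predictors = hwydata(:, 1:end-1);   % hypothetical predictor columns
responses  = hwydata(:, end);       % hypothetical response column

% Each call to read returns batchSize rows, so one read fills one mini-batch.
dsX = arrayDatastore(predictors, 'ReadSize', batchSize);
dsY = arrayDatastore(responses,  'ReadSize', batchSize);
ds  = combine(dsX, dsY);

% Use the same number for the mini-batch size (needs Deep Learning Toolbox).
options = trainingOptions('adam', 'MiniBatchSize', batchSize);

% Inspect one read: a cell array with one row per observation.
firstBatch = read(ds);
disp(size(firstBatch))
```

With a read size of 1, the training loop has to call read once per observation to assemble a mini-batch, so a larger read size cuts the number of read calls.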
I could not execute the attached script because I don't have "preTexas.mat". Could you please share that file?
Also, is it using less memory now?
  • If not, make sure that your MAT-file is version 7.3, because other versions do not support partial loading.
  • You can check the version as follows:
  • Execute the following command:
type <MAT-File>.mat
  • Check the comment at the beginning of the file.
  • Make sure it matches the one below:
MATLAB 7.3 MAT-file
  • If it does not, re-save the file as version 7.3:
load('<MAT-File>.mat');
save('mycopy.mat','-v7.3');
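The version check and the partial loading it enables can also be sketched in code. This is a minimal illustration under stated assumptions: the file name is a temporary placeholder, the variable is random stand-in data with the same name as hwydata, and the check relies on the header text quoted above appearing at the very start of the file.

```matlab
% Hypothetical temporary file, so that no existing file is overwritten.
matFileName = [tempname '.mat'];

% Create a small version 7.3 file so the sketch runs on its own.
hwydata = rand(51, 17);                 % stand-in for the real variable
save(matFileName, 'hwydata', '-v7.3');

% Read the start of the file and compare it with the expected header text.
fid = fopen(matFileName, 'r');
header = fread(fid, 19, 'uint8=>char')';
fclose(fid);
isV73 = strcmp(header, 'MATLAB 7.3 MAT-file');
disp(isV73)

% matfile gives access to part of a variable without loading all of it.
m = matfile(matFileName);
firstRows = m.hwydata(1:5, :);
disp(size(firstRows))

% Remove the temporary file.
delete(matFileName);
```

For a file that fails the check, the load and save commands above produce a version 7.3 copy, although that step itself needs enough memory to load the whole file once.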
Please find the attached script.
I hope this helps!

Release

R2022a
