Partitioning data for Time Series TCN model Training, Validation, and Testing

Question

Isabelle Museck 2024-6-5


https://ww2.mathworks.cn/matlabcentral/answers/2125801-partitioning-data-for-time-series-tcn-model-training-validation-and-testing

Answered: Krishna 2024-6-6

Hello there, I am trying to build a TCN model to predict a continuous variable. I have time series data in which I am using 3 input features (accelerometer measurements in the x, y, and z directions) to estimate/predict a continuous variable. I have accelerometer data from 10 different trials stored in a 10x1 cell array, and each cell holds the three accelerometer measurements over time in a 500x3 table for that trial. The target continuous variable I am trying to predict is similarly stored in a 10x1 cell array, with each cell containing a 500x1 table named "Taget" that holds the true value of the predicted variable over time. If I am trying to build a TCN model with this data, what is the best way to partition the data for training, testing (10%), and validation (10%)? I think I need to use the tspartition function but am not sure how to use it for this type of data. Do I need to combine the data from all 10 trials into one large table and then partition? Or should I partition each trial separately, train the model on a single trial, then retrain the model on the next trial, and so on? Any help would be greatly appreciated!


Answer 1

Krishna 2024-6-6


https://ww2.mathworks.cn/matlabcentral/answers/2125801-partitioning-data-for-time-series-tcn-model-training-validation-and-testing#answer_1468256

Hello Isabelle,

Based on your description, I think you're seeking the correct method for dividing your time series data into training, testing, and validation sets. I can share an effective approach that I have personally utilized.

You mentioned having 10 observations (trials), each comprising both input and output data: the input is a time series of 500 steps with 3 features, and the output is a sequence of 500 steps for a single variable. You can therefore organize the data as a cell array of 10 sequences, where each cell holds a 500x4 matrix containing the 3 inputs and the 1 output.
To partition this data into training, testing, and validation sets, you can use the cvpartition function, treating each trial as one observation so that every sequence stays intact. Note that cvpartition generates only two sets at a time, so it has to be used twice. First, divide the data into a training set and a combined testing/validation set; then split the latter into separate testing and validation sets. After this, the training set contains 8 sequences (80 percent), and the validation and test sets contain 1 sequence each (10 percent each).
Once partitioned, organize the training data into Xtrain, which holds the 500x3 input sequences, and Ytrain, which holds the 500x1 output sequences.
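
The steps above could be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: the variable names accelData, targetData, and data, the column names ax, ay, and az, and the random stand-in values are all assumptions made for the example, and the hold-out sizes are passed to cvpartition as integer counts (2 trials, then 1 trial) rather than fractions.

```matlab
% Hypothetical stand-in data with the shapes described in the question:
% a 10x1 cell of 500x3 tables (inputs) and a 10x1 cell of 500x1 tables (target).
rng(0);                                    % reproducible example
numTrials  = 10;
accelData  = cell(numTrials, 1);           % assumed variable name
targetData = cell(numTrials, 1);           % assumed variable name
for k = 1:numTrials
    accelData{k}  = array2table(randn(500, 3), 'VariableNames', {'ax', 'ay', 'az'});
    targetData{k} = array2table(randn(500, 1), 'VariableNames', {'Taget'});
end

% Combine inputs and target into one 500x4 numeric matrix per trial.
data = cell(numTrials, 1);
for k = 1:numTrials
    data{k} = [table2array(accelData{k}), table2array(targetData{k})];
end

% First split: hold out 2 of the 10 trials (validation + test combined).
c1       = cvpartition(numTrials, 'HoldOut', 2);
trainIdx = find(training(c1));             % 8 trial indices
heldIdx  = find(test(c1));                 % 2 trial indices

% Second split: of the 2 held-out trials, 1 for validation and 1 for testing.
c2      = cvpartition(numel(heldIdx), 'HoldOut', 1);
valIdx  = heldIdx(training(c2));
testIdx = heldIdx(test(c2));

% Separate inputs (columns 1-3) from the target (column 4) for each set.
getX = @(idx) cellfun(@(s) s(:, 1:3), data(idx), 'UniformOutput', false);
getY = @(idx) cellfun(@(s) s(:, 4),   data(idx), 'UniformOutput', false);

Xtrain = getX(trainIdx);  Ytrain = getY(trainIdx);
Xval   = getX(valIdx);    Yval   = getY(valIdx);
Xtest  = getX(testIdx);   Ytest  = getY(testIdx);
```

Because the split is made over trial indices rather than over individual time steps, each 500-step sequence ends up whole in exactly one set. Depending on the release and the training function you use, the sequences may need to be transposed (features by time steps instead of time steps by features) before training, so it is worth checking the expected layout in the documentation for that function.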

Please go through the following documentation to learn more:

https://in.mathworks.com/help/stats/cvpartition.html

Hope this helps.
