The NARX training algorithm uses the entire dataset in each training epoch (i.e., "full-batch" training, not mini-batches). The "dividerand" function randomly splits the data into three sets: training, validation, and test. During each epoch, the network uses the entire training set to compute weight updates.
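To illustrate what "full-batch" means here, below is a minimal Python sketch (not the actual NARX/MATLAB implementation) of gradient descent on a toy one-parameter linear model, where each epoch accumulates the gradient over all samples before making a single weight update:

```python
def train_full_batch(xs, ys, epochs=200, lr=0.1):
    """Full-batch training of a toy 1-parameter model y = w*x:
    every epoch computes the gradient over the ENTIRE training set
    before making a single weight update (no mini-batches)."""
    w = 0.0
    for _ in range(epochs):
        # Gradient of the mean squared error over all samples at once
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # generated with true w = 2
w = train_full_batch(xs, ys)
print(round(w, 3))  # converges near 2.0
```

A mini-batch variant would instead update the weights after each small subset of samples; the full-batch scheme above matches how NARX networks are trained in each epoch.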
- If your data aren’t shuffled first, a “block” split can inadvertently bias the training set.
- A biased training set can cause the network to fit only a subset of your operating range and then fail badly on the validation or test sets.
- By using the "dividerand" function, you ensure that all regions of your input-output space are (approximately) represented in the training portion, which fosters better convergence toward a truly global solution.
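The points above can be illustrated with a small Python sketch (an illustration of the behavior, not MATLAB's actual "dividerand" code) that contrasts a random split, using "dividerand"'s default 70/15/15 ratios, with a contiguous "block" split:

```python
import random

def divide_rand(n, train_ratio=0.70, val_ratio=0.15, seed=0):
    """Randomly split sample indices into train/val/test sets,
    mimicking MATLAB's dividerand (default ratios 70/15/15)."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)                     # random assignment of samples
    n_train = round(train_ratio * n)
    n_val = round(val_ratio * n)
    train = sorted(idx[:n_train])
    val = sorted(idx[n_train:n_train + n_val])
    test = sorted(idx[n_train + n_val:])
    return train, val, test

def divide_block(n, train_ratio=0.70, val_ratio=0.15):
    """Contiguous 'block' split: the first 70% of samples become the
    training set, so any trend across the record biases what the
    network sees during training."""
    n_train = round(train_ratio * n)
    n_val = round(val_ratio * n)
    idx = list(range(n))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train, val, test = divide_rand(100)
print(len(train), len(val), len(test))  # 70 15 15
```

With the random split, training indices are scattered across the whole record, so every region of the operating range is (approximately) represented; with the block split, the training set sees only the earliest 70% of the data.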
You can refer to the documentation link below for more information on the "dividerand" function:
