Real or categorical predictors: which one is faster?
In regression, is there a guideline for treating predictors as real-valued or as categorical?
In a fitting problem with inputs X and y, where X contains hour-of-day information (e.g. 1, 2, 3, ...), I tend to treat it as a categorical predictor because unique(X) has a limited length (i.e. 24). Surprisingly, the fit is slower this way than when I treat X as real-valued in a Gaussian process model fitted with fitrgp.
My questions are:
- Why does it take longer with a categorical predictor?
- In a similar situation, is there a guideline for deciding whether to treat the predictors as real values or as categorical inputs?
3 comments
Walter Roberson
2023-9-17
Have you experimented with passing uint8 data? I don't know whether that is permitted; if it is, it would signal that discrete algorithms are to be used.
mono
2023-9-17
"why does it take longer with categorical predictor?"
I'd venture it is owing to the large number of dummy variables introduced by modeling the 24 levels of time as categorical instead of continuous/discrete. You could try artificially reducing the same data set to 24, 12, and 2 levels and see whether that hypothesis is correct.
Regardless of whether it's true or not, it's still the model definition and purpose that should be controlling decisions such as this, not anything to do with compute time.
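A minimal sketch of the dummy-variable hypothesis above, written in plain Python rather than MATLAB and not calling fitrgp at all. It only counts how many predictor columns result when the hour is one-hot encoded at 24, 12, and 2 levels, compared with a single numeric column. The helper names `one_hot` and `coarsen`, the synthetic `hours` data, and the choice of full one-hot coding (one column per level) are assumptions for illustration; how fitrgp encodes categorical predictors internally is not shown here.

```python
def one_hot(values, levels):
    """Return one row per observation, with one 0/1 column per level."""
    index = {level: i for i, level in enumerate(levels)}
    rows = []
    for v in values:
        row = [0] * len(levels)
        row[index[v]] = 1
        rows.append(row)
    return rows


def coarsen(hours, n_levels):
    """Map hours 1..24 onto n_levels equal-width bins (n_levels must divide 24)."""
    width = 24 // n_levels
    return [(h - 1) // width for h in hours]


# Synthetic hour-of-day column: 240 observations cycling through 1..24.
hours = [1 + (i % 24) for i in range(240)]

# Treated as a real value, the hour occupies a single predictor column.
numeric_width = 1

# Treated as categorical, the column count grows with the number of levels.
widths = {}
for n_levels in (24, 12, 2):
    binned = coarsen(hours, n_levels)
    encoded = one_hot(binned, sorted(set(binned)))
    widths[n_levels] = len(encoded[0])

print("numeric columns:", numeric_width)
print("one-hot columns by level count:", widths)
```

To test the timing hypothesis itself, the same three coarsened versions of the hour column would each be passed to the fitting routine and timed; if fit time falls as the level count falls, the extra columns are a plausible cause.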