Hi,
I have a high-dimensional dataset on which I've built a classification model using fitctree, and it returns satisfactory accuracy. The predictors contain a decent proportion of unknown values, represented as NaN.
I chose fitctree because it can handle the unknowns. Now I need to reduce the number of predictors using feature selection, because recording all of the predictors for the final model is not practical.
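To illustrate the setup, here is a minimal sketch using synthetic data. The sizes, the 20% missing rate, and the variable names (X, y, tree, cvLoss) are placeholders I made up for this example, not my real dataset:

```matlab
% Synthetic stand-in for the real data (all sizes and rates are assumptions)
rng(1);                              % fix the seed for reproducibility
X = randn(200, 10);                  % 200 observations, 10 predictors
y = double(X(:,1) + X(:,2) > 0);     % labels depend on the first two predictors
X(rand(size(X)) < 0.2) = NaN;        % mark roughly 20% of values as unknown

% fitctree accepts NaN in the predictors; partially missing rows are still used
tree = fitctree(X, y);

% Estimate accuracy with the default 10-fold cross-validation
cvModel = crossval(tree);
cvLoss = kfoldLoss(cvModel);         % misclassification rate
fprintf('Cross-validated error: %.3f\n', cvLoss);
```

This part works for me as is. What I'm missing is a way to pick a subset of the columns of X that tolerates the NaN entries in the same way.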
Is there a feature selection function that will ignore unknown values? I have looked at fscnca and stepwiselm, but neither seems to work for this data. Removing every row that contains a NaN in any predictor would also throw away the known values of many other potentially useful predictors in those rows, and there is no easy way to replace or estimate the unknowns.
Thank you.