Hi @Roger Breton, to address the issue of obtaining negative XYZ values after applying your regression model, you can consider a few approaches to constrain your model and ensure that the predicted values remain within a valid range:
- Non-Negative Least Squares Regression: Instead of a standard linear regression, you can use non-negative least squares (NNLS) regression, which constrains the coefficients to be non-negative. Note that this guarantees non-negative predictions only when the predictor values are themselves non-negative. In MATLAB, you can use the 'lsqnonneg' function for this purpose. Refer to the following MathWorks documentation to know more: https://www.mathworks.com/help/matlab/ref/lsqnonneg.html
- Constrain Output Values: After obtaining predictions from your regression model, you can apply a simple post-processing step to ensure that all values are non-negative. For example, you can set any negative value to a small positive threshold (e.g., 0.1).
- Regularization: Regularization can help stabilize the model and reduce the likelihood of extreme predictions, although it does not by itself guarantee non-negative outputs. Lasso (L1) or Ridge (L2) regularization can be implemented in MATLAB using the 'lasso' or 'ridge' functions. Refer to the following MathWorks documentation to know more: https://www.mathworks.com/help/stats/lasso.html
- Transformations: Consider fitting the model to a transformed response that inherently constrains the predictions to be positive. For instance, you can fit the regression to the logarithm of the XYZ values and then apply the exponential function to the predictions to reverse the transformation. This requires all training XYZ values to be strictly positive.
- Linear Scaling: As you suggested, you can apply a linear adjustment to the predicted values, for example by adding a constant so that all values lie above a certain threshold. Keep in mind that this shifts every prediction, including those that were already valid.
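- Use Constrained Optimization: If you are comfortable with optimization techniques, you could formulate your regression as a constrained least-squares problem in which linear inequality constraints enforce non-negativity on the predicted values. If you have Optimization Toolbox, the 'lsqlin' function accepts constraints of this form. The constraints hold only at the data points where you impose them.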
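As a minimal sketch of the NNLS approach, assuming your predictors are non-negative and your measured XYZ values are stored as an N-by-3 matrix: 'lsqnonneg' accepts a single right-hand-side vector, so each channel is fitted separately. The variables X, Y, and B_true below are made-up placeholders, not your data.

```matlab
% Hypothetical data: X is N-by-P predictors (assumed non-negative),
% Y is N-by-3 measured XYZ. Replace both with your own matrices.
rng(0);
X = rand(50, 4);
B_true = [0.8 0.1 0.0; 0.1 0.7 0.1; 0.0 0.2 0.9; 0.05 0.05 0.05];
Y = X * B_true + 0.01 * randn(50, 3);

% lsqnonneg solves for one response vector at a time,
% so fit the X, Y, and Z channels in a loop.
B = zeros(size(X, 2), size(Y, 2));
for k = 1:size(Y, 2)
    B(:, k) = lsqnonneg(X, Y(:, k));
end

% Predictions are non-negative because both X and B are non-negative.
Y_pred = X * B;
```

If your model includes an intercept column, that coefficient is constrained to be non-negative as well, which may or may not suit your data.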
Here is an example of how you might implement a simple post-processing step in MATLAB:
% Assuming 'predicted_Y' is the output from your regression model
min_value = 0.1; % Choose an appropriate lower bound
clamped_Y = max(predicted_Y, min_value); % Values below min_value are raised to it
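For the transformation approach, here is a minimal sketch, assuming all of your training XYZ values are strictly positive. It fits an ordinary least-squares model to log(Y) with the backslash operator and then exponentiates the predictions. The data and the intercept column are illustrative assumptions, not part of your original model.

```matlab
% Hypothetical data: X is N-by-P predictors, Y is N-by-3 strictly positive XYZ.
rng(0);
X = rand(50, 4);
C_true = [0.5 0.2 0.1; 0.3 0.6 0.2; 0.1 0.1 0.7; 0.2 0.3 0.4];
Y = exp(X * C_true + 0.01 * randn(50, 3));

A = [ones(size(X, 1), 1), X];  % design matrix with an intercept column
B = A \ log(Y);                % least-squares fit in log space
Y_pred = exp(A * B);           % back-transform; exp() is always positive
```

Because the fit minimizes error in log space, it weights relative rather than absolute deviations, so it is worth checking the residuals on the original XYZ scale.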
By applying one or more of these techniques, you should be able to mitigate the issue of negative predictions and ensure your model outputs are realistic for spectrophotometric data.