About Generalized Linear Models

Question

Baloo 2022-9-15

Answered: Dolon Mandal 2023-9-12

Dear All,

My question may sound naive, but I am fairly new to the field. I am trying to fit a GLM to a dataset consisting of 4 predictor vectors and 1 binary response vector.

First, I used the function stepwiseglm() to find which model was selected most often while bootstrapping the response vector and resampling the predictors accordingly. As a result, a model with two continuous variables was selected most often.

Second, I wanted to focus on these two variables and study their behavior in more detail. I therefore set up an analysis using the two selected predictor vectors and the binary response vector, and ran the glmfit() function, again bootstrapping the variables.

Here comes my question: apparently, despite setting up the functions in the same way, I get different results for the coefficients associated with the two predictor variables (different in absolute value and also in sign). Moreover, while the model has a significant p-value when running stepwiseglm(), this is not the case with glmfit().

I was not able to find out how the two functions compute the coefficients, and how the fit works (I was expecting very similar results, but this is apparently not the case).

To add to the confusion, I found that if I perform a fit with fitglm(), the results are similar to those obtained with stepwiseglm().

Could you please provide some further detail on what would be the best choice in my case, and on what the difference is between the stepwiseglm() and glmfit() algorithms, apart from the adding and removing of variables?

I thank you in advance.

Best regards

Answer 1

Dolon Mandal 2023-9-12

The differences you observe in the coefficients and p-values between `stepwiseglm`, `glmfit`, and `fitglm` are more likely to come from how each call specifies and selects the model than from the fitting algorithm, since all three estimate coefficients by maximum likelihood. Here is an explanation of each function and how they differ:

1. `stepwiseglm`: This function performs stepwise model selection using generalized linear models (GLMs). Starting from an initial model (a constant by default), it adds or removes terms according to a criterion, which is `'Deviance'` by default and can be set to others such as `'aic'` or `'bic'`. By default the candidate terms include pairwise interactions, so the selected model can contain more than main effects. Stepwise selection is also sensitive to the specific dataset, so the selected model may change from one bootstrap sample to the next.

2. `glmfit`: This function fits a GLM using maximum likelihood estimation (MLE). It does not perform variable selection or model comparison; it estimates coefficients for exactly the predictors you pass in. The distribution is the third positional argument and defaults to `'normal'`, so a binary response needs `'binomial'` to be specified explicitly. The p-values it returns in `stats.p` are per coefficient, not a test of the whole model.

3. `fitglm`: This function also fits a GLM using maximum likelihood estimation (MLE), with no variable selection. It differs from `glmfit` mainly in its interface: it accepts tables and formulas, takes the distribution and link as name-value arguments (`'Distribution'` also defaults to `'normal'`), and returns a `GeneralizedLinearModel` object, which is the same type of object that `stepwiseglm` returns.
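
As a quick check of points 2 and 3, the sketch below fits the same logistic model with both functions. The synthetic data, the seed, and all variable names are assumptions made for illustration; they are not the poster's dataset. If the distribution and link match, the two sets of estimates should agree to numerical tolerance.

```matlab
% Synthetic data (hypothetical; stands in for the real dataset)
rng(0);                                % reproducible random numbers
n = 200;
X = randn(n, 2);                       % two continuous predictors
p = 1 ./ (1 + exp(-(0.5 + 1.2*X(:,1) - 0.8*X(:,2))));
y = double(rand(n, 1) < p);            % binary response coded as 0/1

% glmfit: the distribution is the third positional argument
[b, dev, stats] = glmfit(X, y, 'binomial', 'link', 'logit');

% fitglm: the distribution is a name-value argument
mdl = fitglm(X, y, 'Distribution', 'binomial', 'Link', 'logit');

% Compare intercept and slopes from the two fits
maxCoefDiff = max(abs(b - mdl.Coefficients.Estimate));
fprintf('max coefficient difference: %.2e\n', maxCoefDiff);

% Per-coefficient p-values from each function, side by side
disp([stats.p, mdl.Coefficients.pValue]);
```

If the estimates from your own calls differ in sign or magnitude, one thing worth checking is whether `'binomial'` was passed to both functions, since each of them falls back to `'normal'` when the distribution is omitted.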

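When `glmfit` and `stepwiseglm` are given the same terms, distribution, and link, they should produce matching coefficients. Differences in sign or magnitude therefore usually point to a mismatch in the setup: `stepwiseglm` may have kept a subset of predictors or added an interaction term, whereas `glmfit` estimates coefficients for all specified predictors without any selection; the distribution may differ between the calls; or the bootstrap samples may differ. Note also that the p-value displayed for a `stepwiseglm` or `fitglm` model is an overall test against the constant model, which is not directly comparable to the per-coefficient p-values from `glmfit`.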
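
One way to see which terms `stepwiseglm` actually kept, and to confirm that a direct refit of those terms reproduces its fit, is sketched below. The data are synthetic and the variable names are hypothetical. Setting `'Upper'` to `'linear'` is a choice made here to keep interaction terms out of the search; it is not necessarily what the original analysis used.

```matlab
% Synthetic data (hypothetical): only columns 1 and 2 affect the response
rng(1);
n   = 300;
X   = randn(n, 4);
eta = 0.3 + 1.0*X(:,1) - 0.7*X(:,2);
y   = double(rand(n, 1) < 1 ./ (1 + exp(-eta)));

% 'Upper','linear' limits the search to main effects (no interaction terms)
mdlStep = stepwiseglm(X, y, 'constant', 'Upper', 'linear', ...
    'Distribution', 'binomial', 'Verbose', 0);
disp(mdlStep.CoefficientNames);        % terms that survived selection

% Refit the selected columns directly with fitglm
inModel = mdlStep.VariableInfo.InModel(1:size(X, 2));
mdlFit  = fitglm(X(:, inModel), y, 'Distribution', 'binomial');

% Deviance does not depend on the order of terms, so it is a safe comparison
devDiff = abs(mdlStep.Deviance - mdlFit.Deviance);
fprintf('deviance difference: %.2e\n', devDiff);

% Overall test against the constant model, then per-coefficient results
disp(devianceTest(mdlFit));
disp(mdlFit.Coefficients);
```

The last two displays show the two kinds of p-value separately: the deviance test refers to the model as a whole, while the coefficient table has one p-value per term.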

In your case, since you have already identified the two predictor variables using `stepwiseglm`, it might be more appropriate to use `fitglm` to fit the GLM with the selected predictors. It returns the same kind of model object as `stepwiseglm`, which makes the two sets of results easier to compare.
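
A minimal sketch of bootstrapping such a `fitglm` fit is given below, assuming the bootstrap resamples whole observations (rows), so that each response stays paired with its own predictor values. The data, the number of resamples, and the variable names are all hypothetical.

```matlab
% Synthetic data (hypothetical): two selected predictors, binary response
rng(2);
n = 200;
X = randn(n, 2);
p = 1 ./ (1 + exp(-(0.5 + 1.2*X(:,1) - 0.8*X(:,2))));
y = double(rand(n, 1) < p);

nBoot = 200;                           % number of bootstrap resamples
B = zeros(nBoot, 3);                   % columns: intercept, slope 1, slope 2
for k = 1:nBoot
    idx = randi(n, n, 1);              % draw row indices with replacement
    % Use the same indices for X and y so each pair stays together
    mdlB = fitglm(X(idx, :), y(idx), 'Distribution', 'binomial');
    B(k, :) = mdlB.Coefficients.Estimate';
end

% Percentile intervals for each coefficient (rows: 2.5% and 97.5%)
ci = prctile(B, [2.5 97.5]);
disp(ci);
```

Resampling the response independently of the predictors would break the association between them, which is one way coefficient signs could flip from one resample to the next.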

It's important to note that no single method guarantees the "best" choice of predictors or model. It's recommended to consider the specific characteristics of your data, the goals of your analysis, and the underlying assumptions of the GLM to make an informed decision.
