
Logistic Regression 2
Billie Anderson, Mark Newman
2020-09-22
Agenda
▪ Question from last class
• At your work, what is an example of something with a yes/no outcome? What predictors can help make that determination?
▪ Logistic Regression 2
▪ In-Class Examples
▪ Question for next class
Logistic Regression (Review)
What we know so far about logistic regression
▪ logistic regression is appropriate when the outcome/response (Y variable) is binary
▪ the Y variable (the ‘success’ probability) is not modeled directly; logistic regression models the log of the odds
▪ use the glm() function in R to build the logistic regression
▪ interpretation of the coefficient/parameter for a continuous explanatory/predictor (X) variable
▪ compute the ‘success’ probability, \(\pi(x)\) for a given observation in the data set
▪ examine R output and tell if the explanatory/predictor variable is useful in predicting the Y variable
\(H_0: \beta = 0\)
\(H_1: \beta \ne 0\)
We will be looking at more statistical tests for the model this week!
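As a quick refresher before the new material, here is a minimal sketch (our own illustration, not part of the original slides) that rebuilds last week's model and computes \(\pi(x)\) for a hypothetical 50-year-old patient:
library(lmtest)                 # provides coeftest(), used throughout these notes
data <- vcd::Arthritis
data$Better <- as.numeric(data$Improved > "None")
fit <- glm(Better ~ Age, data = data, family = binomial)
coeftest(fit)                   # individual tests of H0: beta = 0
# pi(x) for Age = 50, two equivalent ways (roughly 0.46 with the coefficients shown later)
predict(fit, newdata = data.frame(Age = 50), type = "response")
plogis(coef(fit)[1] + coef(fit)[2] * 50)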
Model Tests for Simple Logistic Regression
We are going to continue with the arthritis example we have been working through in class for the last several weeks.
data <- vcd::Arthritis
data$Better <- as.numeric(data$Improved > "None")
str(data)
## 'data.frame': 84 obs. of 6 variables:
## $ ID : int 57 46 77 17 36 23 75 39 33 55 ...
## $ Treatment: Factor w/ 2 levels "Placebo","Treated": 2 2 2 2 2 2 2 2 2 2 ...
## $ Sex : Factor w/ 2 levels "Female","Male": 2 2 2 2 2 2 2 2 2 2 ...
## $ Age : int 27 29 30 32 46 58 59 59 63 63 ...
## $ Improved : Ord.factor w/ 3 levels "None"<"Some"<..: 2 1 1 3 3 3 1 3 1 1 ...
## $ Better : num 1 0 0 1 1 1 0 1 0 0 ...
head(data)
## ID Treatment Sex Age Improved Better
## 1 57 Treated Male 27 Some 1
## 2 46 Treated Male 29 None 0
## 3 77 Treated Male 30 None 0
## 4 17 Treated Male 32 Marked 1
## 5 36 Treated Male 46 Marked 1
## 6 23 Treated Male 58 Marked 1
Model Tests for Simple Logistic Regression (cont)
Last week, we built the model using the glm() function.
fit <- glm(Better ~ Age, data = data, family = binomial)
coeftest(fit)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.642071 1.073167 -2.4619 0.01382 *
## Age 0.049249 0.019357 2.5443 0.01095 *
## ---
## Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1
exp(coef(fit))
## (Intercept) Age
## 0.0712136 1.0504819
There are usually two types of hypothesis tests you want to perform with logistic regression. The first one is what we studied extensively last week, in which we tried to determine if age was a good predictor of whether or not a patient reported an improvement in symptoms.
Another test is sometimes referred to as a ‘goodness of fit’ test: how well does the model fit the data? Does the output of the model align with the observed data? There are several ‘goodness of fit’ tests; we will examine the deviance statistic and the log-likelihood statistic.
Assessing the Model: the Log-Likelihood statistic
We have seen that the logistic regression model predicts the probability of an event occurring (a ‘success’) for a given observation, denoted as \(\pi(X_i)\).
For any one observation in the data, the Y will be \(0\) if the outcome does not occur and \(1\) if the outcome does occur.
In linear regression, if you want to know whether a model fits the data, you compare the observed and predicted values of the outcome.
Likewise, in logistic regression, you can use the observed and predicted values to assess the fit of the model.
In logistic regression, this measure is known as the log-likelihood statistic.
\[\text{log-likelihood}=\sum_{i=1}^{n}\left[ Y_i\ln(\pi(X_i))+(1-Y_i)\ln(1-\pi(X_i)) \right]\]
The log-likelihood is based on summing the probabilities associated with the predicted and actual outcomes.
The log-likelihood statistic is analogous to the residual sum of squares in multiple regression in the sense that it is an indicator of how much unexplained information there is after the model has been fitted.
Values of the log-likelihood that are far from zero (large in magnitude) indicate poorly fitting statistical models, because the larger its magnitude, the more unexplained information there is after the model has been fitted.
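To make the formula concrete, here is a small sketch (assuming the fit object for the Better ~ Age model built above) that computes the log-likelihood by hand and checks it against R's logLik():
p <- fitted(fit)                         # pi(X_i) for every observation
y <- data$Better                         # observed 0/1 outcomes
sum(y * log(p) + (1 - y) * log(1 - p))   # log-likelihood from the formula above
logLik(fit)                              # should agree with the hand computation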
Assessing the Model: the Deviance Statistic
The deviance is very closely related to the log-likelihood statistic.
\[\text{Deviance} = -2 \times \text{log-likelihood}\]
The overall fit of the model is assessed using the deviance statistic.
Deviance is sometimes referred to as -2LL because of the way it is calculated. When testing goodness of fit for a logistic model, it is more convenient to work with the deviance because differences in deviance between nested models follow a chi-square distribution, which means we can easily perform a statistical test.
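A quick way to check this relationship in R (a sketch, assuming the same fit object; for ungrouped binary data like ours the saturated model's log-likelihood is 0, so the residual deviance equals \(-2\) times the model's log-likelihood):
-2 * as.numeric(logLik(fit))   # -2LL
fit$deviance                   # residual deviance reported by glm(); should match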
You can calculate a log-likelihood or deviance for different models and compare those models by looking at the difference between their deviances. One use of this is to compare a logistic regression model against some kind of baseline. The baseline that is usually used is the model when only the intercept is included.
In multiple regression, the baseline model is the mean of the Y variable (our best guess of the outcome when we have no other information). In logistic regression, if we want to predict the outcome, what would our best guess be?
We can’t use the mean of Y as a predicted category in logistic regression because our outcome is made of \(0s\) and \(1s\). However, if we know the frequency of \(0s\) and \(1s\), then the best guess is the category with the largest number of cases. So, if the outcome occurs 107 times and does not occur 72 times, then our best guess is that the outcome occurs (because it occurs more often than it does not).
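Here is a small sketch of that baseline idea for our data (assuming the data frame built earlier): the intercept-only model predicts the same probability, the overall proportion of \(1s\), for every patient, and its \(-2\)LL is the ‘null deviance’ that glm() reports.
table(data$Better)         # how often the outcome does / does not occur
p0 <- mean(data$Better)    # the baseline model's single predicted probability
-2 * sum(data$Better * log(p0) + (1 - data$Better) * log(1 - p0))   # -2LL(baseline)
glm(Better ~ 1, data = data, family = binomial)$deviance            # same number from R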
Assessing the Model: the Deviance Statistic (cont.)
So, much like multiple regression, our baseline model is the model that gives us the best prediction when we know nothing other than the values of the outcome: in logistic regression this will be used to predict the outcome that occurs most often – that is, the logistic regression model when only the intercept is included. If we then add one or more predictor(s) to the model, we can compute the improvement of the model as follows:
\[\chi^2=\left(-2LL(\text{baseline}) \right) - \left(-2LL(\text{new}) \right)\]
\[\chi^2=2LL(\text{new})-2LL(\text{baseline})\]
And the degrees of freedom for this statistic are:
\[df=k_{new} - k_{baseline}\]
where \(k\) is the number of parameters (estimated coefficients) in the model.
We take the deviance for the baseline model (the model when only the intercept is included) and subtract from it the deviance for the new model. This difference is known as a likelihood ratio statistic, and it has a chi-square distribution with degrees of freedom equal to the number of parameters, k, in the new model minus the number of parameters in the baseline model.
The number of parameters in the baseline model will always be \(1\) (the intercept is the only parameter to be estimated); any subsequent model will have a number of parameters equal to the number of predictors plus \(1\) (one coefficient per predictor plus one parameter representing the intercept).
Small values of the deviance statistic indicate a good model fit. Large values of the deviance statistic indicate a poor model fit for the data.
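To preview the numbers we will see in the R output below: the baseline (intercept-only) model for the arthritis data has deviance \(116.45\) and the model with Age has deviance \(109.16\), so
\[\chi^2 = 116.45 - 109.16 = 7.29, \qquad df = k_{new} - k_{baseline} = 2 - 1 = 1\]
which is the likelihood ratio statistic that the anova() output reports later.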
Goodness of Fit Statistics in R
The output below shows the summary for the fit model.
summary(fit)
##
## Call:
## glm(formula = Better ~ Age, family = binomial, data = data)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.51065 -1.12772 0.07936 1.06771 1.76109
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.64207 1.07317 -2.462 0.0138 *
## Age 0.04925 0.01936 2.544 0.0110 *
## ---
## Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 116.45 on 83 degrees of freedom
## Residual deviance: 109.16 on 82 degrees of freedom
## AIC: 113.16
##
## Number of Fisher Scoring iterations: 4
Remember that larger values of the deviance statistic indicate poorer-fitting statistical models. R provides two deviance statistics: the null deviance and the residual deviance. The null deviance is the deviance of the model that contains no predictors other than the intercept, in other words \(-2\)LL(baseline). The residual deviance is the deviance for the fitted model, in other words \(-2\)LL(new).
Goodness of Fit Statistics
We can compare the models (one with intercept-only and one with one predictor, age) by finding the difference in the Deviance statistics.
The difference has a Chi-square distribution.
We can find this difference in one of two ways.
First, we can compute it.
Second, we can use the anova() function.
Let’s do some computations so we can understand what the anova() output is providing us.
library(lmtest)
fit <- glm(Better ~ Age, data = data, family = binomial)
Goodness of Fit Statistics (cont.)
str(fit)
## List of 30
## $ coefficients : Named num [1:2] -2.6421 0.0492
## ..- attr(*, "names")= chr [1:2] "(Intercept)" "Age"
## $ residuals : Named num [1:84] 4.71 -1.3 -1.31 3.9 2.46 ...
## ..- attr(*, "names")= chr [1:84] "1" "2" "3" "4" ...
## $ fitted.values : Named num [1:84] 0.212 0.229 0.238 0.256 0.407 ...
## ..- attr(*, "names")= chr [1:84] "1" "2" "3" "4" ...
## $ effects : Named num [1:84] -0.114 2.544 -0.621 1.63 1.043 ...
## ..- attr(*, "names")= chr [1:84] "(Intercept)" "Age" "" "" ...
## $ R : num [1:2, 1:2] -4.39 0 -237.66 51.66
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:2] "(Intercept)" "Age"
## .. ..$ : chr [1:2] "(Intercept)" "Age"
## $ rank : int 2
## $ qr :List of 5
## ..$ qr : num [1:84, 1:2] -4.3868 0.0958 0.0971 0.0995 0.112 ...
## .. ..- attr(*, "dimnames")=List of 2
## .. .. ..$ : chr [1:84] "1" "2" "3" "4" ...
## .. .. ..$ : chr [1:2] "(Intercept)" "Age"
## ..$ rank : int 2
## ..$ qraux: num [1:2] 1.09 1.19
## ..$ pivot: int [1:2] 1 2
## ..$ tol : num 1e-11
## ..- attr(*, "class")= chr "qr"
## $ family :List of 12
## ..$ family : chr "binomial"
## ..$ link : chr "logit"
## ..$ linkfun :function (mu)
## ..$ linkinv :function (eta)
## ..$ variance :function (mu)
## ..$ dev.resids:function (y, mu, wt)
## ..$ aic :function (y, n, mu, wt, dev)
## ..$ mu.eta :function (eta)
## ..$ initialize: expression({ if (NCOL(y) == 1) { if (is.factor(y)) y <- y != levels(y)[1L] n <- rep.int(1, nobs) y[weights =| __truncated__
## ..$ validmu :function (mu)
## ..$ valideta :function (eta)
## ..$ simulate :function (object, nsim)
## ..- attr(*, "class")= chr "family"
## $ linear.predictors: Named num [1:84] -1.312 -1.214 -1.165 -1.066 -0.377 ...
## ..- attr(*, "names")= chr [1:84] "1" "2" "3" "4" ...
## $ deviance : num 109
## $ aic : num 113
## $ null.deviance : num 116
## $ iter : int 4
## $ weights : Named num [1:84] 0.167 0.177 0.181 0.191 0.241 ...
## ..- attr(*, "names")= chr [1:84] "1" "2" "3" "4" ...
## $ prior.weights : Named num [1:84] 1 1 1 1 1 1 1 1 1 1 ...
## ..- attr(*, "names")= chr [1:84] "1" "2" "3" "4" ...
## $ df.residual : int 82
## $ df.null : int 83
## $ y : Named num [1:84] 1 0 0 1 1 1 0 1 0 0 ...
## ..- attr(*, "names")= chr [1:84] "1" "2" "3" "4" ...
## $ converged : logi TRUE
## $ boundary : logi FALSE
## $ model :'data.frame': 84 obs. of 2 variables:
## ..$ Better: num [1:84] 1 0 0 1 1 1 0 1 0 0 ...
## ..$ Age : int [1:84] 27 29 30 32 46 58 59 59 63 63 ...
## ..- attr(*, "terms")=Classes 'terms', 'formula' language Better ~ Age
## .. .. ..- attr(*, "variables")= language list(Better, Age)
## .. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
## .. .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. .. ..$ : chr [1:2] "Better" "Age"
## .. .. .. .. ..$ : chr "Age"
## .. .. ..- attr(*, "term.labels")= chr "Age"
## .. .. ..- attr(*, "order")= int 1
## .. .. ..- attr(*, "intercept")= int 1
## .. .. ..- attr(*, "response")= int 1
## .. .. ..- attr(*, ".Environment")=
## .. .. ..- attr(*, "predvars")= language list(Better, Age)
## .. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
## .. .. .. ..- attr(*, "names")= chr [1:2] "Better" "Age"
## $ call : language glm(formula = Better ~ Age, family = binomial, data = data)
## $ formula :Class 'formula' language Better ~ Age
## .. ..- attr(*, ".Environment")=
## $ terms :Classes 'terms', 'formula' language Better ~ Age
## .. ..- attr(*, "variables")= language list(Better, Age)
## .. ..- attr(*, "factors")= int [1:2, 1] 0 1
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:2] "Better" "Age"
## .. .. .. ..$ : chr "Age"
## .. ..- attr(*, "term.labels")= chr "Age"
## .. ..- attr(*, "order")= int 1
## .. ..- attr(*, "intercept")= int 1
## .. ..- attr(*, "response")= int 1
## .. ..- attr(*, ".Environment")=
## .. ..- attr(*, "predvars")= language list(Better, Age)
## .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
## .. .. ..- attr(*, "names")= chr [1:2] "Better" "Age"
## $ data :'data.frame': 84 obs. of 6 variables:
## ..$ ID : int [1:84] 57 46 77 17 36 23 75 39 33 55 ...
## ..$ Treatment: Factor w/ 2 levels "Placebo","Treated": 2 2 2 2 2 2 2 2 2 2 ...
## ..$ Sex : Factor w/ 2 levels "Female","Male": 2 2 2 2 2 2 2 2 2 2 ...
## ..$ Age : int [1:84] 27 29 30 32 46 58 59 59 63 63 ...
## ..$ Improved : Ord.factor w/ 3 levels "None"<"Some"<..: 2 1 1 3 3 3 1 3 1 1 ...
## ..$ Better : num [1:84] 1 0 0 1 1 1 0 1 0 0 ...
## $ offset : NULL
## $ control :List of 3
## ..$ epsilon: num 1e-08
## ..$ maxit : num 25
## ..$ trace : logi FALSE
## $ method : chr "glm.fit"
## $ contrasts : NULL
## $ xlevels : Named list()
## - attr(*, "class")= chr [1:2] "glm" "lm"
Goodness of Fit Statistics (cont.)
Remember, when we created the fit object in R, an object has ‘variables’ associated with it. We refer to and can use these ‘variables’ much in the same manner as we do when we refer to variables in a data frame.
model_chi <- fit$null.deviance - fit$deviance
model_chi
## [1] 7.285188
Similarly, the degrees of freedom for the model are stored in the variable df.residual and for the null model are stored as df.null. We can compute the degrees of freedom associated with the chi-square statistic that we just computed by subtracting the degrees of freedom exactly as we did for the deviance values.
chidf <- fit$df.null - fit$df.residual
chidf
## [1] 1
Goodness of Fit Statistics (cont.)
To calculate the probability associated with this chi-square statistic we can use the pchisq() function. This function needs three arguments: the value of chi-square (the test statistic that we have just computed as model_chi), the degrees of freedom (which we have just computed as chidf), and the appropriate TRUE or FALSE response to the lower.tail = argument.
chisq_pvalue <- pchisq(model_chi, chidf, lower.tail = FALSE)
chisq_pvalue
## [1] 0.006952544
Goodness of Fit Statistics (cont.)
Or more simply
anova(fit, test = "Chisq")
## Analysis of Deviance Table
##
## Model: binomial, link: logit
##
## Response: Better
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 83 116.45
## Age 1 7.2852 82 109.16 0.006953 **
## ---
## Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1
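As a side note (our own addition, not in the original slides), the lmtest package we already loaded provides yet another way to run the same likelihood ratio test, by explicitly comparing the intercept-only model with the Age model:
fit0 <- update(fit, . ~ 1)    # baseline model: intercept only
lrtest(fit0, fit)             # same chi-square (7.2852 on 1 df) as anova(fit, test = "Chisq")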
Interpretation of the Deviance Statistic for the Arthritis Data
Now, what does all this mean???
The deviance statistic is performing a hypothesis test for how well the model fits the data.
\(H_0:\)model is not a good fit for the data
\(H_1\): model is a good fit for the data
OR
\(H_0\): model with intercept-only is sufficient for modeling the data
\(H_1\): model with the ‘current’ predictor variables is sufficient for modeling the data (i.e., the model is a good fit; what you observe and what you expect are similar)
From the anova() output we observe that the p-value associated with the deviance statistic, \(0.006953\), is less than \(\alpha=0.05\). We therefore reject \(H_0\), accept \(H_1\), and conclude that the logistic model with age as a predictor of whether the patient reports an improvement in symptoms (Better=\(1\)) is a good fit for this data.
Plotting a Binary Response
It is a good idea to plot your data, if possible, to see what the predicted probabilities look like with regards to the response variable.
We will continue using the Arthritis data example.
Let’s plot the predicted probabilities of Better = \(1\) against the predictor variable, age.
gg <- ggplot(data, aes(x = Age, y = Better)) +
  xlim(5, 95) +
  theme_bw() +
  geom_point(position = position_jitter(height = 0.02, width = 0)) +
  stat_smooth(
    method = "glm",
    method.args = list(family = 'binomial'),
    alpha = 0.1,
    fill = "blue",
    size = 2.5,
    fullrange = TRUE)
Plotting a Binary Response
gg
(figure: jittered Better values plotted against Age with the fitted logistic curve)
Plotting a Binary Response (cont.)
Why do you look at plots like the one above? To search for non-linearity in the data. If you detect non-linearity, it means you might need a squared term or an interaction term in the model. You can add a smoother on top of the fitted line to help you detect non-linearity.
gg <- gg + stat_smooth(method = "lm", se = FALSE, size = 1.2, color = "black", fullrange = TRUE)
gg <- gg + stat_smooth(method = "loess", se = FALSE, span = 0.95, colour = "red", size = 1.2)
Plotting a Binary Response (cont.)
Looks like there is a non-linear relationship between age and Better.
gg
(figure: the same plot with linear and loess smoothers added)
Multiple Logistic Regression
Multiple logistic regression is a natural extension of simple logistic regression. The ‘multiple’ means we now have more than one \(X_i\), that is, more than one explanatory or predictor variable. And the \(X_i's\) can be continuous or categorical explanatory variables.
We are going to introduce some new notation since we have multiple explanatory variables.
Let \(x_i=(x_{i1},x_{i2},...,x_{ip})\) be a vector of \(p\) explanatory variables for each observation (case) \(i\). Then, the multiple logistic regression model is specified as
\[\log\left(\frac{\pi_i}{1-\pi_i} \right)=\alpha + x_{i1}\beta_1 + x_{i2}\beta_2+...+x_{ip}\beta_p\]
The \(x's\) can contain any of the following:
▪ quantitative variables (e.g., age, income)
▪ polynomial powers of quantitative variables (e.g., age, \(age^2\), \(age^3\))
▪ transformations of quantitative variables (e.g., log(salary))
▪ factors, represented as dummy variables for qualitative predictors (e.g., \(P_1\), \(P_2\), \(P_3\), \(P_4\) for four political party affiliations)
▪ interaction terms (e.g., sex * age, or age * income)
Arthritis Data
We are sticking with the arthritis example we have been working through in class, but now we are going to add more predictor variables.
data <- vcd::Arthritis
data$Better <- as.numeric(data$Improved > "None")
str(data)
## 'data.frame': 84 obs. of 6 variables:
## $ ID : int 57 46 77 17 36 23 75 39 33 55 ...
## $ Treatment: Factor w/ 2 levels "Placebo","Treated": 2 2 2 2 2 2 2 2 2 2 ...
## $ Sex : Factor w/ 2 levels "Female","Male": 2 2 2 2 2 2 2 2 2 2 ...
## $ Age : int 27 29 30 32 46 58 59 59 63 63 ...
## $ Improved : Ord.factor w/ 3 levels "None"<"Some"<..: 2 1 1 3 3 3 1 3 1 1 ...
## $ Better : num 1 0 0 1 1 1 0 1 0 0 ...
head(data)
## ID Treatment Sex Age Improved Better
## 1 57 Treated Male 27 Some 1
## 2 46 Treated Male 29 None 0
## 3 77 Treated Male 30 None 0
## 4 17 Treated Male 32 Marked 1
## 5 36 Treated Male 46 Marked 1
## 6 23 Treated Male 58 Marked 1
Arthritis Data (cont.)
We are going to use Age, Sex and Treatment in the multiple logistic regression. This model has the form
\[\log\left(\frac{\pi_i}{1-\pi_i} \right)=\alpha + x_{i1}\beta_1 + x_{i2}\beta_2+x_{i3}\beta_3\]
where \(x_1\) is Age, \(x_2\) is Sex, and \(x_3\) is Treatment.
For binary variables, R uses dummy coding. The level that is the ‘base’ or ‘reference’ level is the level that is labeled as a 0. All other levels of the categorical variable are compared to the level that is labeled 0. For this data,
\[x_2= \left\{ \begin{array}{ll} 0 & \text{Female} \\ 1 & \text{Male} \\ \end{array} \right. \]
\[x_3=\left\{ \begin{array}{ll} 0 & \text{Placebo} \\ 1 & \text{Treatment} \\ \end{array} \right.\]
Let’s build the model and interpret the coefficients!
Arthritis Data (cont.)
data$AgeMC <- data$Age - mean(data$Age)
fit <- glm(Better ~ AgeMC + Sex + Treatment, data = data, family = binomial)
coeftest(fit)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.414485 0.359090 -1.1543 0.248392
## AgeMC 0.048747 0.020663 2.3591 0.018318 *
## SexMale -1.487831 0.594769 -2.5015 0.012366 *
## TreatmentTreated 1.759804 0.536495 3.2802 0.001037 **
## ---
## Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1
The z value and Pr(>|z|) columns give the individual tests of significance for the explanatory/predictor variables. That is, each row is testing
\(H_0: \beta_i = 0\)
\(H_1: \beta_i \ne 0\)
Since all the individual p-values are \(\le \alpha=0.05\), we can conclude that all these explanatory variables, age, sex, treatment, are useful/helpful in determining if a patient will report an improvement in symptoms (Better=\(1\)).
Age
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.414485 0.359090 -1.1543 0.248392
## AgeMC 0.048747 0.020663 2.3591 0.018318 *
## SexMale -1.487831 0.594769 -2.5015 0.012366 *
## TreatmentTreated 1.759804 0.536495 3.2802 0.001037 **
## ---
## Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1
Remember, we interpret the coefficients from a logistic regression using odds ratios, which are obtained by exponentiating the parameter estimates (coefficients).
Odds ratio for Age: \(e^{0.048747} = 1.05\)
Interpretation: As age increases by one year, the odds of the patient reporting improved symptoms are multiplied by 1.05 (i.e., increase by a factor of 1.05).
OR
Remember, you can subtract \(1\) from the odds ratio; this tells you by what proportion the odds of the response increase or decrease.
As age increases by one year, the odds of the patient reporting improved symptoms increase by 5%.
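If you want to pull these numbers straight from the fitted model, here is a hedged sketch (the intervals are simple Wald intervals from confint.default(); this is an illustration, not part of the original slides):
exp(coef(fit))                 # odds ratios for the multiple model
exp(confint.default(fit))      # Wald 95% confidence intervals for the odds ratios
exp(10 * coef(fit)["AgeMC"])   # odds ratio for a 10-year increase in age, roughly 1.63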
Sex
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.414485 0.359090 -1.1543 0.248392
## AgeMC 0.048747 0.020663 2.3591 0.018318 *
## SexMale -1.487831 0.594769 -2.5015 0.012366 *
## TreatmentTreated 1.759804 0.536495 3.2802 0.001037 **
## ---
## Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1
Odds ratio for Sex: \(e^{-1.487831}=0.23\)
You are always comparing each level of the categorical predictor to the ‘base’ or ‘reference’ level, which is Female for this problem.
Interpretation: The odds of a male reporting improved symptoms are 0.23 times the odds for a female (holding age and treatment fixed).
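If it is easier to state the comparison in the other direction, you can invert the odds ratio (a small illustration of the same coefficient):
exp(-coef(fit)["SexMale"])   # roughly 4.4: the odds for females are about 4.4 times the odds for males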
Treatment
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.414485 0.359090 -1.1543 0.248392
## AgeMC 0.048747 0.020663 2.3591 0.018318 *
## SexMale -1.487831 0.594769 -2.5015 0.012366 *
## TreatmentTreated 1.759804 0.536495 3.2802 0.001037 **
## ---
## Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1
Odds ratio for Treatment: \(e^{1.759804}=5.81\)
You are always comparing each level of the categorical predictor to the ‘base’ or ‘reference’ level, which is Placebo for this problem.
Interpretation: The odds of a treated patient reporting improved symptoms are 5.81 times the odds for a patient on placebo (holding age and sex fixed).
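To see what this means on the probability scale, here is a minimal sketch (the patient profile is our own hypothetical, not from the slides) that predicts \(\pi(x)\) for a female of average age (AgeMC = 0) under each treatment:
newpat <- data.frame(AgeMC = 0, Sex = "Female", Treatment = c("Placebo", "Treated"))
predict(fit, newdata = newpat, type = "response")
# roughly 0.40 on placebo versus 0.79 on treatment, using the coefficients above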
Conditional, Full-model and Effects Plots
Types of Plots
▪ Conditional Plots: stratified plots of Y or logit(Y) vs. one X, conditioned on the other predictors; only the corresponding subset of the data is plotted for each level of the conditioning variables
▪ Full-model Plots: plots of the fitted response surface, P(Y=1), the probabilities from the model, showing all predictors, usually shown in separate panels
Conditional Plots
A conditional plot allows you to visualize the relationship between the output of the model (the predicted probabilities) and the explanatory/predictor variables.
A conditional plot produces ‘panels’, and each panel displays the predicted probability of the response stratified by one of the categorical explanatory/predictor variables.
Let’s take a look!
Going back to our arthritis data example.
gg <- ggplot(data, aes(x = Age, y = Better, color = Treatment)) +
  xlim(5, 95) +
  theme_bw() +
  geom_point(position = position_jitter(height = 0.02, width = 0)) +
  stat_smooth(
    method = "glm",
    method.args = list(family = 'binomial'),
    alpha = 0.1,
    fill = "blue",
    size = 2.5,
    fullrange = TRUE)
Conditional Plots (cont.)
gg
(figure: predicted probability of Better vs. Age, with separate curves for the Placebo and Treated groups)
Conditional Plots (cont.)
What if we add another categorical variable, Sex? What do the predictions look like?
gg + facet_wrap(~ Sex)
(figure: the same conditional plot, faceted by Sex)
Conditional Plots (cont.)
The model cannot produce any reasonable estimates for Males when we are using Age as a predictor variable as well. This is a multivariate way of visualizing the data.
If we dig a little deeper, we can see one of the issues is that there are ‘small’ samples of Male in both the Treatment and Placebo groups, compared to Females.
addmargins(xtabs(~ Sex + Treatment, data = data), 2)
## Treatment
## Sex Placebo Treated Sum
## Female 32 27 59
## Male 11 14 25
Full Model Plots
Full model plots allow you to visualize the predicted probabilities from the model for all the explanatory/predictor variables instead of stratifying by one of the variables.
We will use binreg_plot(), a function in the vcd library that allows you to plot the predicted probabilities from any model object.
To stay consistent with the text and obtain the same visualizations as the text, we are going to plot the multiple logistic regression for the arthritis data that we looked at previously, and we are going to center the data around the age value of 50.
fit <- glm(Better ~ Age + Sex + Treatment, data = data, family = binomial)
binreg_plot(fit, type = "link", xlim = c(25, 75), ylim = c(-3, 3))
(figure: full-model plot of the fitted logit from the model plotted against Age)
In-Class Example
Donner party
Question for next class
▪ Last class we asked for an example of something with a yes/no outcome. Did your co-workers suggest more than one possible predictor?