class: center, middle, inverse, title-slide

.title[
# PSY 503: Foundations of Statistical Methods in Psychological Science
]
.subtitle[
## Generalized Linear Madness: Logistic Regression
]
.author[
### Jason Geller, Ph.D. (he/him/his)
]
.institute[
### Princeton University
]
.date[
### Updated: 2022-11-29
]

---
# Today

- Finishing categorical × categorical interactions
- Introduction to logistic regression
- Fitting the model:
  - Binomial distribution
  - Logit link
  - Likelihood
  - Parameter interpretation
- Statistical inference:
  - Model fit and model diagnostics
  - Comparing models

---
# Linear Model

- Model `\(y\)` as a linear function of predictors

$$y = b_0 + b_1x$$

--

`$$y \sim \textrm{Normal}(\mu, \sigma)$$`

---

- What if `\(y\)` is not normal?
- What if `\(y\)` is categorical with two possible outcomes?

<br>
<br>

`$$y = b_0 + b_1x$$`

`$$y \sim \textrm{Bernoulli}(p)$$`

`$$y \sim \textrm{Binomial}(n, p)$$`

--

- 🤯

--

*Logistic regression*

---
# Binomial Distribution

The simplest kind of categorical data is binary: each case takes one of two values

<img src="images/examples.PNG" width="100%" style="display: block; margin: auto;" />

--

- 0/1
- Pass/fail
- Choose/Don't choose
- Support/Not support
- Relapse/Not relapse

---
# Binomial Distribution

> Distribution of discrete counts of independent outcomes over a certain set of trials

.pull-left[

`$$y \sim \textrm{Binomial}(N, p)$$`

- N = number of trials
- p = probability of "y = 1"
- y = {0, 1}
- `\(\mu=E(X)=np\)`
- `\(Var(X)=np(1-p)\)`
- `\(SD(X)=\sqrt{np(1-p)}\)`, where `\(p\)` is the probability of "success"

]

.pull-right[
<img src="images/bigraph2.png" width="100%" style="display: block; margin: auto;" />
]

---
background-image: url(images/bour.jpg)
background-size: contain

---
# Bernoulli Distribution

> Distribution of a single, discrete binary outcome

.pull-left[
`$$y \sim \textrm{Bernoulli}(p)$$`
]

.pull-right[
<img src="images/bigraph.png" width="100%" style="display: block; margin: auto;" />
]

---

<img src="images/bernoulli2-1.png" width="80%" style="display: block; margin: auto;" />

---
# Generalized Linear Model

- Regression models with non-normal response likelihoods

`$$\begin{align*} Y_i & \sim \mathrm{Dist}(\mu_i, \tau) \\ g(\mu_i) & = \eta_i \\ \eta_i & = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \ldots \end{align*}$$`

- Where:
  - `\(\mu\)` is the mean of `\(y\)`
  - `\(g\)` is the link function
- For logistic regression, the mean is `\(p(y = 1) = p\)` and the link is the logit:

`$$\log\left(\frac{p}{1 - p}\right)$$`

---
# Fitting a Logistic Regression in R

```r
# use glm() instead of lm(); the family argument sets the likelihood and link
bi_model <- glm(
  rvote ~ income,
  data = dat,  # a data frame containing rvote and income
  family = binomial(link = "logit"))
```

---
# Logit Link

- The logit link transforms a linear model in the log-odds metric:

`$$\begin{equation*} \log\left(\frac{p}{1 - p}\right)=\beta_0+\beta_1X \end{equation*}$$`

- To a non-linear model in the probability metric:

`$$p(\textrm{logit})=\frac{e^{\textrm{logit}}}{1+e^{\textrm{logit}}}=\frac{\exp(\textrm{logit})}{1+\exp(\textrm{logit})}$$`

---
# Logit Link - Step 1

- Transform probability into odds
- The "odds" of an event are just the probability it happens divided by the probability it does not happen

`$$\textrm{Odds} = \frac{\# \textrm{successes}}{\# \textrm{failures}}= \frac{\# \textrm{successes}/n}{\# \textrm{failures}/n}= \frac{p}{1-p}$$`

---
# Logit Link - Step 2

.pull-left[

- When we convert a probability to odds, the odds are always > 0
- That bound is problematic for a linear model
- Take the log of the odds

$$\textrm{logit} = \log\left(\frac{p}{1-p}\right)$$

]

.pull-right[
<img src="images/odds_fig.png" width="100%" style="display: block; margin: auto;" />
]

---
# Logit Link

.pull-left[

- This log-odds conversion has a nice property
- It converts odds of less than one to negative numbers, because the log of a number between 0 and 1 is always negative
- Our data are now linear!

]

.pull-right[
<img src="log_reg_files/figure-html/unnamed-chunk-10-1.png" width="100%" style="display: block; margin: auto;" />
]
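---
# Logit and Inverse Logit in R

- A quick sketch (not from the original slides): base R's `qlogis()` and `plogis()` are the logit and inverse-logit functions

```r
p <- 0.75
odds <- p / (1 - p)  # 3: the event is three times as likely as not
log(odds)            # 1.0986, the logit; qlogis(p) gives the same value
plogis(log(odds))    # inverse logit: back to the probability, 0.75
```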
---
# However...

- We need a function that scrunches the output of the linear predictor into the (0, 1) range appropriate for a probability

<img src="images/lin2.png" width="50%" style="display: block; margin: auto;" />

---
# Squiggles

- We need a function that scrunches the output of the linear predictor into the (0, 1) range appropriate for a probability

--

- A squiggle (the logistic, or sigmoid, curve) to the rescue!

<img src="images/lin3.png" width="100%" style="display: block; margin: auto;" />

---
# Anatomy of Logistic Curve

<img src="images/anatlog.png" width="100%" style="display: block; margin: auto;" />

---

<img src="images/lin4.png" width="100%" style="display: block; margin: auto;" />

---

<img src="images/lin5.png" width="100%" style="display: block; margin: auto;" />

`$$p(\textrm{logit})=\frac{e^{\textrm{logit}}}{1+e^{\textrm{logit}}}$$`

---
# Probabilities, Odds, and Log Odds

<img src="images/relationship.png" width="100%" style="display: block; margin: auto;" />

---

<img src="log_reg_files/figure-html/unnamed-chunk-17-1.png" width="100%" />

---

<img src="images/odds.png" width="100%" height="100%" style="display: block; margin: auto;" />

---
# Best Fitting Squiggle

- In standard linear regression we use least squares to find the best-fitting line

<img src="images/lin_fit.png" width="60%" style="display: block; margin: auto;" />

---
# Likelihood

- In logistic regression we instead maximize the likelihood of the data
- Equivalently, we minimize the *deviance* (more on this later)
- *Likelihood is similar to, but not the same as, probability*

---
# Probability vs. Likelihood

- Probability

.pull-left[

> If I assume a distribution with certain parameters, how often do I see a particular value in the data?

- `\(\Pr(y > 0 \mid \mu = 0, \sigma = 1) = .50\)`
- `\(\Pr(-1 < y < 1 \mid \mu = 0, \sigma = 1) = .68\)`
- `\(\Pr(0 < y < 1 \mid \mu = 0, \sigma = 1) = .34\)`
- `\(\Pr(y > 2 \mid \mu = 0, \sigma = 1) = .02\)`
- *Here the distribution is fixed and the data can change*

]

.pull-right[
<img src="images/normal.png" width="100%" style="display: block; margin: auto;" />
]

---
# Likelihood

.pull-left[

- `\(L(\mu, \sigma \mid x)\)`
- Holding a sample of data constant, which parameter values are more likely?
- Which values have higher likelihood?
- *Here the data are fixed and the distribution can change*

]

.pull-right[
<img src="images/likelihood-2.png" width="100%" style="display: block; margin: auto;" />
]
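---
# Likelihood in R

- A minimal sketch (not from the original slides): hold a small sample fixed and compare its log-likelihood under two candidate normal distributions

```r
# the data are held fixed; the parameters vary
y <- c(-0.2, 0.9, 1.4)
sum(dnorm(y, mean = 0, sd = 1, log = TRUE))  # log-likelihood under N(0, 1): -4.16
sum(dnorm(y, mean = 1, sd = 1, log = TRUE))  # -3.56, higher: N(1, 1) fits better
```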
---

<img src="images/like1.png" width="100%" style="display: block; margin: auto;" />

---

<img src="images/like2.png" width="100%" style="display: block; margin: auto;" />

---

<img src="images/like4.png" width="100%" style="display: block; margin: auto;" />

---

<img src="images/like5.png" width="100%" style="display: block; margin: auto;" />

---

<img src="images/like6.png" width="100%" style="display: block; margin: auto;" />

---
# Maximum Likelihood Estimation

- Logistic regression:
  - Specify a likelihood function
  - Add data
  - Find the parameter values that maximize the likelihood function, given the observed data

---
# ML: Logistic Regression

- We need to find where the points lie on the line and get the corresponding logits

<br>
<br>
<br>

.pull-left[
<img src="images/logit_points.jpeg" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="images/logit_line.jpeg" width="100%" style="display: block; margin: auto;" />
]

---
# ML: Logistic Regression

- Plot them as a function of probability

<br>
<br>

<img src="images/max.png" width="50%" style="display: block; margin: auto;" />

---
# ML: Logistic Regression

<img src="images/log_ML1.png" width="90%" style="display: block; margin: auto;" />

---
# ML: Logistic Regression

<img src="images/log_ML2.png" width="90%" style="display: block; margin: auto;" />

---
# Now What?

- We keep rotating the log-odds line, projecting the data onto it, and transforming to probabilities
- Until we find the maximum likelihood!

<img src="images/mle-many.png" width="70%" style="display: block; margin: auto;" />

---
# Slope and Intercept Parameters

<br>
<br>

.pull-left[
<img src="images/MLE_1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="images/mle-2.png" width="100%" style="display: block; margin: auto;" />
]

---
class: middle

# Interactive: Understanding Maximum Likelihood Estimation: https://rpsychologist.com/likelihood/

---
# Dataset

- The Bechdel test

<img src="images/BechdelTest.jpeg" width="100%" style="display: block; margin: auto;" />

---
# Data

```r
#install.packages("fivethirtyeight")
bechdel <- fivethirtyeight::bechdel

glimpse(bechdel)
```

```
## Rows: 1,794
## Columns: 15
## $ year          <int> 2013, 2012, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 20…
## $ imdb          <chr> "tt1711425", "tt1343727", "tt2024544", "tt1272878", "tt0…
## $ title         <chr> "21 & Over", "Dredd 3D", "12 Years a Slave", "2 Guns", "…
## $ test          <chr> "notalk", "ok-disagree", "notalk-disagree", "notalk", "m…
## $ clean_test    <ord> notalk, ok, notalk, notalk, men, men, notalk, ok, ok, no…
## $ binary        <chr> "FAIL", "PASS", "FAIL", "FAIL", "FAIL", "FAIL", "FAIL", …
## $ budget        <int> 13000000, 45000000, 20000000, 61000000, 40000000, 225000…
## $ domgross      <dbl> 25682380, 13414714, 53107035, 75612460, 95020213, 383624…
## $ intgross      <dbl> 42195766, 40868994, 158607035, 132493015, 95020213, 1458…
## $ code          <chr> "2013FAIL", "2012PASS", "2013FAIL", "2013FAIL", "2013FAI…
## $ budget_2013   <int> 13000000, 45658735, 20000000, 61000000, 40000000, 225000…
## $ domgross_2013 <dbl> 25682380, 13611086, 53107035, 75612460, 95020213, 383624…
## $ intgross_2013 <dbl> 42195766, 41467257, 158607035, 132493015, 95020213, 1458…
## $ period_code   <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ decade_code   <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
```
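---
# Aside: MLE in Code

- Before we start fitting models, here is the likelihood machinery in miniature: a sketch with simulated data (not the Bechdel data, and not from the original slides); `optim()` lands close to what `glm()` would estimate

```r
# negative log-likelihood of a one-predictor logistic regression
neg_loglik <- function(par, x, y) {
  eta <- par[1] + par[2] * x  # linear predictor on the log-odds scale
  -sum(dbinom(y, size = 1, prob = plogis(eta), log = TRUE))
}

set.seed(1)
x <- rnorm(100)
y <- rbinom(100, size = 1, prob = plogis(-0.5 + 1.2 * x))

optim(c(0, 0), neg_loglik, x = x, y = y)$par  # close to c(-0.5, 1.2)
```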
---
# Fitting a Linear Model

<img src="log_reg_files/figure-html/fi-1.png" width="80%" />

- Not good for two reasons:

1. The outcome is non-normal
2. The predicted regression line is unbounded, meaning it can range from `\(-\infty\)` to `\(\infty\)`

---
# Model

`$$\begin{align*} \text{pass}_i & \sim \operatorname{Bernoulli}(p_i) \\ \operatorname{logit}(p_i) & = \beta_0 + \beta_1 \text{budget_2013}_i, \end{align*}$$`

```r
# `pass` (1 = PASS, 0 = FAIL) was created off-slide; reconstructed here
bechdel <- bechdel %>%
  mutate(pass = ifelse(binary == "PASS", 1, 0))

fit <- glm(
  data = bechdel,
  family = binomial(link = "logit"),
  pass ~ 1 + budget_2013)

# the slope rescaled to a $100 million unit of budget
new <- -5.972374e-09 * 1e8
```

---
# Logistic Regression in R

```r
fit <- glm(
  data = bechdel,
  family = binomial(link = "logit"),
  pass ~ 1 + budget_2013) %>%
  tidy()

fit
```
| term        | estimate  | std.error | statistic | p.value  |
|:------------|----------:|----------:|----------:|---------:|
| (Intercept) | 0.111     | 0.069     | 1.61      | 0.107    |
| budget_2013 | -5.97e-09 | 9.56e-10  | -6.25     | 4.11e-10 |
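---
# Predicted Probabilities

- As a quick check (a sketch, not from the original slides), `predict()` with `type = "response"` returns fitted values straight on the probability scale

```r
# refit and keep the model object (the chunk above overwrote `fit`
# with its tidy coefficient table)
fit <- glm(pass ~ 1 + budget_2013, data = bechdel, family = binomial)

# probability of passing at budgets of $0, $100 million, and $300 million
predict(fit, newdata = data.frame(budget_2013 = c(0, 1e8, 3e8)),
        type = "response")
```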
--- # Summary Output ``` ## ## Call: ## glm(formula = pass ~ 1 + budget, family = binomial(link = "logit"), ## data = bechdel) ## ## Deviance Residuals: ## Min 1Q Median 3Q Max ## -1.1955 -1.1221 -0.9046 1.2088 1.9312 ## ## Coefficients: ## Estimate Std. Error z value Pr(>|z|) ## (Intercept) 4.248e-02 6.606e-02 0.643 0.52 ## budget -5.797e-09 1.083e-09 -5.354 8.6e-08 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## (Dispersion parameter for binomial family taken to be 1) ## ## Null deviance: 2467.3 on 1793 degrees of freedom ## Residual deviance: 2436.2 on 1792 degrees of freedom ## AIC: 2440.2 ## ## Number of Fisher Scoring iterations: 4 ``` --- # Interpreting the intercept: `\(\beta_0\)`
| term        | estimate | std.error | statistic | p.value |
|:------------|---------:|----------:|----------:|--------:|
| (Intercept) | 0.0425   | 0.0661    | 0.643     | 0.52    |
| budget      | -5.8e-09 | 1.08e-09  | -5.35     | 8.6e-08 |
- Intercept:
  - When budget = 0, the log-odds of passing are 0.04
  - A movie with a budget of zero has about a 51% chance of passing the test

---
# Interpreting slope: `\(\beta_1\)`

- If X is a quantitative predictor:
  - As `\(X_i\)` increases by 1 unit, the log-odds of Y increase by `\(\beta_1\)`
  - As `\(X_i\)` increases by 1 unit, the odds of Y multiply by a factor of `\(\exp(\beta_1)\)`
- The `budget_2013` slope is tiny per dollar, so rescale it to -0.5972374, the expected change in log-odds for every $100,000,000 of movie budget
- Divide-by-4 rule for the upper limit on the probability change (Gelman & Hill, 2007): for every 1-unit increase in X, P(y) changes by no more than `\(\beta_1/4\)`

---
# Interpreting slope: `\(\beta_1\)`

- If X is a categorical predictor:
  - The difference in the log-odds between group X and the baseline is `\(\beta_1\)`
  - The odds of Y for group X are expected to be `\(\exp(\beta_1)\)` times the odds of Y for the baseline group

---
# Coefficient Effect Size: Odds Ratio

- For every $100,000,000 spent in movie budget, the odds of passing are decreased by a factor of .55:1
- Describe as a percentage increase/decrease
- *Remember how to interpret logs for the DV*

```r
exp(new) - 1  # percent change in the odds
```

```
## [1] -0.4496701
```

- For every $100,000,000 spent on movie budget, the odds of passing the test are reduced by 45%

---
# Coefficient Effect Size: Odds Ratio

```r
tidy(fit, exponentiate = TRUE)  # get odds ratios
```
| term        | estimate | std.error | statistic | p.value |
|:------------|---------:|----------:|----------:|--------:|
| (Intercept) | 1.04     | 0.0661    | 0.643     | 0.52    |
| budget      | 1        | 1.08e-09  | -5.35     | 8.6e-08 |
- Greater than 1: As the continuous variable increases, the event is more likely to occur
- Less than 1: As the variable increases, the event is less likely to occur
- Equals 1: As the variable increases, the likelihood of the event does not change

---
# Hypothesis test for `\(\beta_j\)`

.vocab[Hypotheses]: `\(H_0: \beta_j = 0 \hspace{2mm} \text{ vs } \hspace{2mm} H_a: \beta_j \neq 0\)`

--

.vocab[Test Statistic]:

`$$z = \frac{\hat{\beta}_j - 0}{SE_{\hat{\beta}_j}}$$`

--

.vocab[P-value]: `\(P(|Z| > |z|)\)`, where `\(Z \sim N(0, 1)\)`, the Standard Normal distribution

---
# Confidence interval for `\(\beta_j\)`

We can calculate the .vocab[C% confidence interval] for `\(\beta_j\)` as the following:

.eq[
`$$\Large{\hat{\beta}_j \pm z^* SE_{\hat{\beta}_j}}$$`
where `\(z^*\)` is calculated from the `\(N(0,1)\)` distribution
]

--

`$$\exp\{\hat{\beta}_1 \pm z^* SE(\hat{\beta}_1)\}$$`

---
# Log likelihood

- With large samples, likelihood values `\(L(\mu, \sigma \mid x)\)` get very small very fast
- To make them easier to work with, we usually work with the log-likelihood

.eq[
`$$\log L = \sum\limits_{i=1}^n[y_i \log(\hat{\pi}_i) + (1 - y_i)\log(1 - \hat{\pi}_i)]$$`
]

--

- Measure of how well the model fits the data

--

- Higher values of `\(\log L\)` are better

--

- .vocab[Deviance] = `\(-2 \log L\)`
  - `\(-2 \log L\)` follows a `\(\chi^2\)` distribution with `\(n - p - 1\)` degrees of freedom

---
# `\(\chi^2\)` distribution

<img src="log_reg_files/figure-html/unnamed-chunk-43-1.png" width="80%" style="display: block; margin: auto;" />

---
# Comparing Nested Models

- Suppose there are two models:
  - Reduced model includes predictors `\(x_1, \ldots, x_q\)`
  - Full model includes predictors `\(x_1, \ldots, x_q, x_{q+1}, \ldots, x_p\)`

--

- We want to test the hypotheses

`$$\begin{aligned}&H_0: \beta_{q+1} = \dots = \beta_p = 0 \\ & H_a: \text{ at least 1 }\beta_j \text{ is not } 0\end{aligned}$$`

--

- To do so, we will use the .vocab[Drop-in-deviance test] (also known as the Nested Likelihood Ratio test)

---
# Drop-In-Deviance Test

.vocab[Hypotheses]:

`$$\begin{aligned}&H_0: \beta_{q+1} = \dots = \beta_p = 0 \\ & H_a: \text{ at least 1 }\beta_j \text{ is not } 0\end{aligned}$$`

--

.vocab[Test Statistic]:

`$$G = (-2 \log L_{reduced}) - (-2 \log L_{full})$$`

--

.vocab[P-value]: `\(P(\chi^2 > G)\)`, calculated using a `\(\chi^2\)` distribution with df = `\(df_1 - df_2\)`

---
# Testing Deviance

- We can use the `anova()` function to conduct this test
  - Add `test = "Chisq"` to conduct the drop-in-deviance test

```r
anova(model1, model2, test = "Chisq")
```

- `performance` package:

```r
test_likelihoodratio(model1, model2)
```

---
# Model Fit

- McFadden pseudo `\(R^2\)`

`$$R^2 = \frac{SSM}{SST} = \frac{SST-SSE}{SST}=1-\frac{SSE}{SST}$$`

`$$R_{MF}^2 = 1-\frac{D_{\rm residual}}{D_{\rm null}}$$`

- Tjur's `\(R^2\)` (`easystats`):

```r
r2_tjur(model)
```

- Both are ratios indicating how "good" the model fit is

---
# Logistic Diagnostics

---
# Visualizing Logistic Data

- How boring!

.pull-left[

```r
p %>%
  ggplot(aes(x = budget_2013, y = fit)) +
  geom_ribbon(aes(ymin = ll, ymax = ul), alpha = 1/2) +
  geom_line() +
  geom_point(data = bechdel,
             aes(y = pass),
             alpha = 1/2) +
  scale_y_continuous("probability of passing",
                     expand = expansion(mult = 0.01))
```

]

.pull-right[
<img src="log_reg_files/figure-html/unnamed-chunk-49-1.png" width="100%" />
]
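---
# Where Did `p` Come From?

- The plotting data frame `p` used on these slides (fitted values plus CI bounds) was built off-slide; a plausible reconstruction (assumed, not the original code), with `fit` being the fitted glm from earlier:

```r
nd <- data.frame(budget_2013 = seq(0, 48e7, length.out = 200))
pr <- predict(fit, newdata = nd, se.fit = TRUE)  # on the link (log-odds) scale

p <- nd %>%
  mutate(fit = plogis(pr$fit),                     # back-transform to probability
         ll  = plogis(pr$fit - 1.96 * pr$se.fit),  # lower 95% bound
         ul  = plogis(pr$fit + 1.96 * pr$se.fit))  # upper 95% bound
```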
---
# Better?

.pull-left[

```r
p %>%
  ggplot(aes(x = budget_2013, y = fit)) +
  geom_ribbon(aes(ymin = ll, ymax = ul), alpha = 1/2) +
  geom_line() +
  geom_jitter(data = bechdel,
              aes(y = pass),
              size = 1/4, alpha = 1/2, height = 0.05) +
  scale_y_continuous("probability of passing",
                     expand = c(0, 0))
```

]

.pull-right[
<img src="log_reg_files/figure-html/unnamed-chunk-51-1.png" width="100%" />
]

---
# Better?

.pull-left[

```r
p %>%
  ggplot(aes(x = budget_2013)) +
  geom_ribbon(aes(ymin = ll, ymax = ul), alpha = 1/2) +
  geom_line(aes(y = fit)) +
  stat_dots(data = bechdel,
            aes(y = pass,
                side = ifelse(pass == 0, "top", "bottom")),
            scale = 1/3) +
  scale_y_continuous("probability of passing",
                     expand = c(0, 0))
```

]

.pull-right[
<img src="log_reg_files/figure-html/unnamed-chunk-53-1.png" width="100%" height="80%" />
]

---
# Better?

.pull-left[

```r
p %>%
  ggplot(aes(x = budget_2013)) +
  geom_ribbon(aes(ymin = ll, ymax = ul), alpha = 1/2) +
  geom_line(aes(y = fit)) +
  stat_dots(data = bechdel %>%
              mutate(binary = factor(binary, levels = c("PASS", "FAIL"))),
            aes(y = pass,
                side = ifelse(pass == 0, "top", "bottom"),
                color = binary),
            scale = 0.4, shape = 19) +
  scale_color_manual("Bechdel test", values = c("#009E73", "#D55E00")) +
  scale_x_continuous("budget (in 2013 dollars)",
                     breaks = c(0, 1e8, 2e8, 3e8, 4e8),
                     labels = c(0, str_c(1:4 * 100, " mill")),
                     expand = c(0, 0), limits = c(0, 48e7)) +
  scale_y_continuous("probability of passing",
                     expand = c(0, 0)) +
  theme(panel.grid = element_blank())
```

]

.pull-right[
<img src="log_reg_files/figure-html/unnamed-chunk-55-1.png" width="100%" style="display: block; margin: auto;" />
]

---
# Case Study: Losing Weight

- These data are a sample of 500 teens from data collected in 2009 through the U.S. Youth Risk Behavior Surveillance System (YRBSS)
- Are the odds that young females report trying to lose weight greater than the odds that males do? Is increasing BMI associated with an interest in losing weight, regardless of sex?

```r
risk2009.data <- foreign::read.spss(
  file = "https://github.com/proback/BeyondMLR/blob/master/data/yrbs09.sav?raw=true",
  to.data.frame = TRUE)

set.seed(33)

risk2009.samp <- risk2009.data %>%
  sample_n(500)

dim(risk2009.samp)
```

```
## [1] 500 208
```

---
# Trying to lose weight

```r
risk2009 <- risk2009.samp %>%
  transmute(lose.wt = ifelse(Q66 == "Lose weight", "Lose weight", "No weight loss") %>% factor(),
            lose.wt.01 = ifelse(Q66 == "Lose weight", 1, 0),
            sex = Q2,
            sport = ifelse(Q84 == "0 teams", "No sports", "Sports") %>% factor(),
            sport_fac = Q84,
            sport_num = str_extract(Q84, "\\d") %>% as.double(),
            media = as.numeric(Q81) - 2,
            media = ifelse(media == 0, 0.5, media),
            media = ifelse(media == -1, 0, media),
            bmipct = bmipct) %>%
  mutate(female = ifelse(sex == "Female", 1, 0)) %>%
  filter(complete.cases(.))

# what have we done?
glimpse(risk2009)
```

```
## Rows: 445
## Columns: 9
## $ lose.wt    <fct> Lose weight, No weight loss, No weight loss, No weight loss…
## $ lose.wt.01 <dbl> 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0,…
## $ sex        <fct> Male, Female, Male, Male, Male, Male, Male, Male, Male, Mal…
## $ sport      <fct> No sports, No sports, Sports, No sports, No sports, No spor…
## $ sport_fac  <fct> 0 teams, 0 teams, 1 team, 0 teams, 0 teams, 0 teams, 0 team…
## $ sport_num  <dbl> 0, 0, 1, 0, 0, 0, 0, 1, 1, 3, 0, 0, 1, 1, 3, 1, 1, 1, 3, 1,…
## $ media      <dbl> 3.0, 1.0, 3.0, 3.0, 3.0, 2.0, 0.0, 0.5, 1.0, 1.0, 5.0, 2.0,…
## $ bmipct     <dbl> 98, 41, 6, 41, 99, 97, 0, 100, 3, 8, 99, 13, 87, 71, 93, 97…
## $ female     <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1,…
```

---
# Trying to lose weight

- Selected variables:
  - DV: `lose.wt.01`, which is coded 1 when someone indicated they were trying to lose weight and 0 when they were not. This is a dichotomized version of the `lose.wt` factor variable.
  - Our two predictor variables will be:
    * `sex` and its related dummy variable `female`
    * `bmipct`, which is the percentile for a given BMI for members of the same sex

---
# Exploratory Data Analysis

- Here are two versions of comparing `lose.wt.01` by `sex`

.pull-left[
<img src="log_reg_files/figure-html/unnamed-chunk-58-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="log_reg_files/figure-html/unnamed-chunk-59-1.png" width="100%" style="display: block; margin: auto;" />
]

---
# Exploratory Data Analysis

- Here are the histograms for `bmipct`, faceted by `lose.wt`.

<img src="log_reg_files/figure-html/unnamed-chunk-60-1.png" width="80%" style="display: block; margin: auto;" />

---
# Exploratory Data Analysis

- Here are the histograms for `bmipct`, faceted by both `lose.wt` and `sex`.

<img src="log_reg_files/figure-html/unnamed-chunk-61-1.png" width="70%" style="display: block; margin: auto;" />

---
# Logistic Regression: Fitting Model

- Let's fit an intercept-only model

$$
`\begin{align*}
\text{lose.wt.01}_i & \sim \operatorname{Binomial}(n = 1, p_i)\\
\text{logit}(p_i) & = \beta_0
\end{align*}`
$$

---
# Logistic Regression: Fitting Model

```r
# fit
fit1 <- glm(
  data = risk2009,
  family = binomial,
  lose.wt.01 ~ 1
)

fit1 %>%
  tidy()
```
| term        | estimate | std.error | statistic | p.value |
|:------------|---------:|----------:|----------:|--------:|
| (Intercept) | -0.212   | 0.0953    | -2.22     | 0.0262  |
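---
# Sanity Check

- A quick check (a sketch, not from the original slides): the intercept of an empty model is just the log-odds of the overall proportion

```r
qlogis(mean(risk2009$lose.wt.01))  # log(0.447 / 0.553) is about -0.212
```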
--- # Interpreting `\(\beta_0\)`
| term        | estimate | std.error | statistic | p.value |
|:------------|---------:|----------:|----------:|--------:|
| (Intercept) | -0.212   | 0.0953    | -2.22     | 0.0262  |
- The intercept `\(\beta_0\)` is in the log-odds metric

--

- We can convert it to a probability

```r
b0 <- coef(fit1)

exp(b0) / (1 + exp(b0))
```

```
## (Intercept)
##    0.447191
```

- Thus, the overall probability of students wanting to lose weight was about .45

---

- Let's check that with the empirical proportions

```r
risk2009 %>%
  count(lose.wt.01) %>%
  mutate(proportion = n / sum(n))
```
| lose.wt.01 | n   | proportion |
|-----------:|----:|-----------:|
| 0          | 246 | 0.553      |
| 1          | 199 | 0.447      |
- We can actually do that transformation with the base **R** `plogis()` function.

```r
plogis(b0)
```

```
## (Intercept)
##    0.447191
```

---
# Logistic Regression

- Add our binary predictor `female`, which gives us the model

$$
`\begin{align*}
\text{lose.wt.01}_i & \sim \operatorname{Binomial}(n = 1, p_i) \\
\text{logit}(p_i) & = \beta_0 + \beta_1 \text{female}_i,
\end{align*}`
$$

where `female` is a 0/1 dummy variable.

---
# Fit Model

```r
# fit
fit2 <- glm(
  data = risk2009,
  family = binomial,
  lose.wt.01 ~ 1 + female
)

# summarize
fit2 %>%
  tidy()
```
| term        | estimate | std.error | statistic | p.value  |
|:------------|---------:|----------:|----------:|---------:|
| (Intercept) | -0.752   | 0.141     | -5.33     | 9.59e-08 |
| female      | 1.09     | 0.198     | 5.52      | 3.38e-08 |
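---

- The next chunk uses `b0` and `b1`, which were extracted off-slide; a plausible reconstruction (assumed, not the original code):

```r
b0 <- unname(coef(fit2)[1])  # intercept: log-odds of wanting to lose weight for males
b1 <- unname(coef(fit2)[2])  # slope: difference in log-odds for females
```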
---

- Now we can compute the probabilities for young men and women as follows

```r
# young men
plogis(b0 + b1 * 0)
```

```
## [1] 0.3203463
```

```r
# young women
plogis(b0 + b1 * 1)
```

```
## [1] 0.5841121
```

---
# `emmeans`

- Can use `emmeans` to get probabilities directly from the model
- Can also get odds ratios for comparisons!

```r
emm <- emmeans(fit2, "female", type = "response")

pairs(emm, reverse = TRUE)
```

```
## contrast            odds.ratio    SE  df null z.ratio p.value
## female1 / female0         2.98 0.589 Inf    1   5.520  <.0001
##
## Tests are performed on the log odds ratio scale
```

```r
emm
```

```
## female  prob     SE  df asymp.LCL asymp.UCL
##      0 0.320 0.0307 Inf     0.263     0.383
##      1 0.584 0.0337 Inf     0.517     0.648
##
## Confidence level used: 0.95
## Intervals are back-transformed from the logit scale
```

---
# Logistic Regression: Continuous Predictor

Now we add our continuous predictor `bmipct`, which gives us the model:

$$
`\begin{align*}
\text{lose.wt.01}_i & \sim \operatorname{Binomial}(n = 1, p_i) \\
\text{logit}(p_i) & = \beta_0 + \beta_1 \text{bmipct}_i,
\end{align*}`
$$

```r
# fit
fit3 <- glm(
  data = risk2009,
  family = binomial,
  lose.wt.01 ~ 1 + bmipct
)
```

---
# Model 3

```r
# summarize
fit3 %>%
  tidy()
```
| term        | estimate | std.error | statistic | p.value  |
|:------------|---------:|----------:|----------:|---------:|
| (Intercept) | -2.57    | 0.314     | -8.19     | 2.72e-16 |
| bmipct      | 0.0355   | 0.00428   | 8.3       | 1.04e-16 |
The coefficient for `bmipct` seems small at 0.036, but recall that `bmipct` ranges from 0 to 100.

- Thus a one-point increase is a one-percentile increase in BMI, which isn't very much

---
# Logistic Regression: Multiple Predictors

- The first model will have both `female` and `bmipct` as predictors:

$$
`\begin{align*}
\text{lose.wt.01}_i & \sim \operatorname{Binomial}(n = 1, p_i) \\
\text{logit}(p_i) & = \beta_0 + \beta_1 \text{female}_i + \beta_2 \text{bmipct}_i.
\end{align*}`
$$

---
# Logistic Regression: Multiple Predictors

- The second model will add an interaction term between the two predictors:

$$
`\begin{align*}
\text{lose.wt.01}_i & \sim \operatorname{Binomial}(n = 1, p_i) \\
\text{logit}(p_i) & = \beta_0 + \beta_1 \text{female}_i + \beta_2 \text{bmipct}_i + \beta_3 \text{female}_i \cdot \text{bmipct}_i.
\end{align*}`
$$

---
# Fitting Both Models

```r
fit4 <- glm(
  data = risk2009,
  family = binomial,
  lose.wt.01 ~ 1 + female + bmipct
)

fit5 <- glm(
  data = risk2009,
  family = binomial,
  lose.wt.01 ~ 1 + female + bmipct + female:bmipct
)
```

---
# Model 4

```
##
## Call:  glm(formula = lose.wt.01 ~ 1 + female + bmipct, family = binomial,
##     data = risk2009)
##
## Coefficients:
## (Intercept)       female       bmipct
##    -4.25914      1.86067      0.04715
##
## Degrees of Freedom: 444 Total (i.e. Null);  442 Residual
## Null Deviance:       611.9
## Residual Deviance: 463   AIC: 469
```

---
# Model 5

```r
glm(data = risk2009,
    family = binomial,
    lose.wt.01 ~ 1 + female + bmipct + female:bmipct
) %>%
  tidy()
```
| term          | estimate | std.error | statistic | p.value  |
|:--------------|---------:|----------:|----------:|---------:|
| (Intercept)   | -4.65    | 0.734     | -6.33     | 2.53e-10 |
| female        | 2.41     | 0.839     | 2.87      | 0.00405  |
| bmipct        | 0.0519   | 0.00883   | 5.88      | 4.12e-09 |
| female:bmipct | -0.00768 | 0.011     | -0.698    | 0.485    |
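---
# Odds Ratios with CIs

- A quick sketch (not from the original slides): exponentiating the additive model's coefficients and their profile-likelihood confidence intervals gives odds ratios with CIs

```r
# base R: confint() on a glm uses profile likelihood
exp(cbind(OR = coef(fit4), confint(fit4)))
```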
---
# Model Comparisons

<img src="log_reg_files/figure-html/unnamed-chunk-76-1.png" width="70%" style="display: block; margin: auto;" />

- The interaction term `female:bmipct` (i.e., `\(\beta_3\)`) is not statistically significant.

---
# Likelihood Ratio Test

- Let's compare the models by performing a likelihood ratio test

```r
test_likelihoodratio(fit4, fit5)
```
| Name | Model | df | df_diff | Chi2  | p     |
|:-----|:------|---:|--------:|------:|------:|
| fit4 | glm   | 3  |         |       |       |
| fit5 | glm   | 4  | 1       | 0.498 | 0.481 |
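- The same drop-in-deviance test can be run with base R's `anova()` (a quick sketch; it reproduces the chi-square and p-value above):

```r
anova(fit4, fit5, test = "Chisq")
```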
--- ```r bind_rows( glance(fit1), glance(fit2), glance(fit3), glance(fit4), glance(fit5) ) %>% mutate(fit = str_c("fit", 1:5)) %>% select(fit, everything()) ```
| fit  | null.deviance | df.null | logLik | AIC | BIC | deviance | df.residual | nobs |
|:-----|--------------:|--------:|-------:|----:|----:|---------:|------------:|-----:|
| fit1 | 612           | 444     | -306   | 614 | 618 | 612      | 444         | 445  |
| fit2 | 612           | 444     | -290   | 584 | 593 | 580      | 443         | 445  |
| fit3 | 612           | 444     | -263   | 529 | 537 | 525      | 443         | 445  |
| fit4 | 612           | 444     | -231   | 469 | 481 | 463      | 442         | 445  |
| fit5 | 612           | 444     | -231   | 470 | 487 | 462      | 441         | 445  |
- When you compare model `fit4` with the interaction model `fit5`, it appears that the interaction term does not add anything

---

Let's plot the results from `fit4`.

<img src="log_reg_files/figure-html/unnamed-chunk-79-1.png" width="80%" />

---
# Model Comparisons

- Radar plots

```r
# easystats package
plot(compare_performance(fit1, fit2, fit3, fit4))
```

<img src="log_reg_files/figure-html/unnamed-chunk-80-1.png" width="90%" style="display: block; margin: auto;" />

---
# Visualization

```r
library(sjPlot)  # needed to plot model predictions
library(sjmisc)

plot_model(fit4, type = "pred", terms = c("bmipct", "female")) +
  theme_bw()  # type = "pred" plots predicted probabilities
```

<img src="log_reg_files/figure-html/unnamed-chunk-81-1.png" width="50%" style="display: block; margin: auto;" />

---
# Visualization

```r
library(sjPlot)
library(sjmisc)

plot_model(fit4, type = "pred", terms = c("bmipct")) +
  theme_bw()
```

<img src="log_reg_files/figure-html/unnamed-chunk-82-1.png" width="50%" style="display: block; margin: auto;" />

---
# Visualization

```r
library(ggeffects)

plot(ggpredict(fit4, "bmipct"), ci = FALSE, add.data = TRUE)
```

<img src="log_reg_files/figure-html/unnamed-chunk-83-1.png" width="60%" style="display: block; margin: auto;" />

---
# Table

```r
# library(sjPlot)
fit3 %>%
  tab_model()
```

| Predictors  | Odds Ratios | CI          | p      |
|:------------|:-----------:|:-----------:|:------:|
| (Intercept) | 0.08        | 0.04 – 0.14 | <0.001 |
| bmipct      | 1.04        | 1.03 – 1.05 | <0.001 |

Observations: 445. Tjur's `\(R^2\)`: 0.181.

---

```r
data <- read_csv("https://raw.githubusercontent.com/jgeller112/psy503-psych_stats/master/static/slides/14-Logistic_Regression/data/gss.csv")
```

DV:

- Attended a religious service in the last week

Independent variables:

- Age
- Sex
- Socioeconomic status (SES)
- Negative attitudes toward premarital sex

---
# Report Logistic Regression

- State your hypothesis, statistical test, its link function, and justify the use of a logistic regression

> We hypothesized that people who were against premarital sex would be more likely to have attended a religious event in the previous week. Religious event attendance was a binary outcome, as the person either attended an event or they did not, which violated the normality assumption required for traditional regression. Thus, a logistic regression with a logit link function was used to model religious event attendance using R 4.0.4 (R Core Team, 2020).

---
# Report Logistic Regression

- State the full model and your results, including the odds ratio

> Religious event attendance was modeled as a function of attitudes about premarital sex, controlling for the age, sex, and socioeconomic status of the respondent. As shown in Figure 1, this analysis revealed that attitudes about premarital sex significantly predicted religious event attendance, b = 0.72, SE = 0.05, z(1303) = 13.24, p < 0.01, odds ratio = 2.05:1. This implies that every 1-unit increase in negative attitudes toward premarital sex predicted a 2-fold increase in the odds that the person had attended religious services in the previous week. Table 1 provides the model estimates for all predictors.
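---
# Reproducing the Report

- A hedged sketch of the kind of call that produces numbers like those above (the variable names are placeholders; the actual `gss.csv` column names may differ):

```r
# hypothetical column names -- check names(data) and adjust
report_fit <- glm(attend ~ premarital + age + sex + ses,
                  family = binomial(link = "logit"),
                  data = data)

tidy(report_fit, exponentiate = TRUE, conf.int = TRUE)  # odds ratios with 95% CIs
```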