class: center, middle, inverse, title-slide

.title[
# PSY 503: Foundations of Statistical Methods in Psychological Science
]
.subtitle[
## Generalized Linear Madness: Logistic Regression
]
.author[
### Jason Geller, Ph.D. (he/him/his)
]
.institute[
### Princeton University
]
.date[
### Updated: 2022-11-29
]

---
# Today

- Finishing categorical × categorical interactions
- Introduction to logistic regression
- Fitting the model:
  - Binomial distribution
  - Logit link
  - Likelihood
  - Parameter interpretation
- Statistical inference:
  - Model fit and model diagnostics
  - Comparing models

---
# Linear Model

- Model `\(y\)` as a linear function of predictors

$$y = b_0 + b_1x$$

--

`$$y \sim \textrm{Normal}(\mu, \sigma)$$`

---

- What if `\(y\)` is not normal?
- What if `\(y\)` is categorical with two possible outcomes?

<br>
<br>

`$$y = b_0 + b_1x$$`

`$$y \sim \textrm{Bernoulli}(p)$$`

`$$y \sim \textrm{Binomial}(n, p)$$`

--

- 🤯

--

*Logistic regression*

---
# Binomial Distribution

The simplest kind of categorical data is binary: each case takes one of two values

<img src="images/examples.PNG" width="100%" style="display: block; margin: auto;" />

--

- 0/1
- Pass/fail
- Choose/Don't choose
- Support/Not support
- Relapse/Not relapse

---
# Binomial Distribution

> Distribution of discrete counts of independent outcomes over a certain set of trials

.pull-left[

`$$y \sim \textrm{Binomial}(N, p)$$`

- N = number of trials
- p = probability of "y = 1"
- y = {0, 1}
- `\(\mu=E(X)=np\)`
- `\(Var(X)=np(1-p)\)`
- `\(SD(X)=\sqrt{np(1-p)}\)`, where `\(p\)` is the probability of "success"

]

.pull-right[
<img src="images/bigraph2.png" width="100%" style="display: block; margin: auto;" />
]

---
background-image: url(images/bour.jpg)
background-size: contain

---
# Bernoulli Distribution

> Distribution of a single, discrete binary outcome

.pull-left[
`$$y \sim \textrm{Bernoulli}(p)$$`
]

.pull-right[
<img src="images/bigraph.png" width="100%" style="display: block; margin: auto;" />
]

---

<img src="images/bernoulli2-1.png" width="80%" style="display: block; margin: auto;" />

---
# Generalized Linear Model

- Regression models with non-normal response likelihoods

`$$\begin{align*} Y_i & \sim \mathrm{Dist}(\mu_i, \tau) \\ g(\mu_i) & = \eta_i \\ \eta_i & = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \ldots \end{align*}$$`

- Where:
  - `\(\mu\)` is the mean of `\(y\)`
  - `\(g\)` is the link function
- For logistic regression, the mean is `\(p(y = 1) = p\)` and the link is the logit:

`$$\log\left(\frac{p}{1 - p}\right)$$`

---
# Fitting a Logistic Regression in R

```r
# use glm() instead of lm(); the family argument sets the likelihood and link
bi_model <- glm(
  rvote ~ income,
  data = dat,  # a data frame containing rvote and income
  family = binomial(link = "logit"))
```

---
# Logit Link

- The logit link transforms a linear model in the log-odds metric:

`$$\begin{equation*} \log\left(\frac{p}{1 - p}\right)=\beta_0+\beta_1X \end{equation*}$$`

- To a non-linear model in the probability metric:

`$$p(\textrm{logit})=\frac{e^{\textrm{logit}}}{1+e^{\textrm{logit}}}=\frac{\exp(\textrm{logit})}{1+\exp(\textrm{logit})}$$`

---
# Logit Link - Step 1

- Transform probability into odds
- The "odds" of an event are just the probability it happens divided by the probability it does not happen

`$$\textrm{Odds} = \frac{\# \textrm{successes}}{\# \textrm{failures}}= \frac{\# \textrm{successes}/n}{\# \textrm{failures}/n}= \frac{p}{1-p}$$`

---
# Logit Link - Step 2

.pull-left[

- When we convert a probability to odds, the odds are always > 0
- That bound is problematic for a linear model
- Take the log of the odds

$$\textrm{logit} = \log\left(\frac{p}{1-p}\right)$$

]

.pull-right[
<img src="images/odds_fig.png" width="100%" style="display: block; margin: auto;" />
]

---
# Logit Link

.pull-left[

- This log-odds conversion has a nice property
- It converts odds of less than one to negative numbers, because the log of a number between 0 and 1 is always negative
- Our data are now linear!

]

.pull-right[
<img src="log_reg_files/figure-html/unnamed-chunk-10-1.png" width="100%" style="display: block; margin: auto;" />
]
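---
# Logit and Inverse Logit in R

- A quick sketch (not from the original slides): base R's `qlogis()` and `plogis()` are the logit and inverse-logit functions

```r
p <- 0.75
odds <- p / (1 - p)  # 3: the event is three times as likely as not
log(odds)            # 1.0986, the logit; qlogis(p) gives the same value
plogis(log(odds))    # inverse logit: back to the probability, 0.75
```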
---
# However...

- We need a function that scrunches the output of the linear predictor into the (0, 1) range appropriate for a probability

<img src="images/lin2.png" width="50%" style="display: block; margin: auto;" />

---
# Squiggles

- We need a function that scrunches the output of the linear predictor into the (0, 1) range appropriate for a probability

--

- A squiggle (the logistic, or sigmoid, curve) to the rescue!

<img src="images/lin3.png" width="100%" style="display: block; margin: auto;" />

---
# Anatomy of Logistic Curve

<img src="images/anatlog.png" width="100%" style="display: block; margin: auto;" />

---

<img src="images/lin4.png" width="100%" style="display: block; margin: auto;" />

---

<img src="images/lin5.png" width="100%" style="display: block; margin: auto;" />

`$$p(\textrm{logit})=\frac{e^{\textrm{logit}}}{1+e^{\textrm{logit}}}$$`

---
# Probabilities, Odds, and Log Odds

<img src="images/relationship.png" width="100%" style="display: block; margin: auto;" />

---

<img src="log_reg_files/figure-html/unnamed-chunk-17-1.png" width="100%" />

---

<img src="images/odds.png" width="100%" height="100%" style="display: block; margin: auto;" />

---
# Best Fitting Squiggle

- In standard linear regression we use least squares to find the best-fitting line

<img src="images/lin_fit.png" width="60%" style="display: block; margin: auto;" />

---
# Likelihood

- In logistic regression we instead maximize the likelihood of the data
- Equivalently, we minimize the *deviance* (more on this later)
- *Likelihood is similar to, but not the same as, probability*

---
# Probability vs. Likelihood

- Probability

.pull-left[

> If I assume a distribution with certain parameters, how often do I see a particular value in the data?

- `\(\Pr(y > 0 \mid \mu = 0, \sigma = 1) = .50\)`
- `\(\Pr(-1 < y < 1 \mid \mu = 0, \sigma = 1) = .68\)`
- `\(\Pr(0 < y < 1 \mid \mu = 0, \sigma = 1) = .34\)`
- `\(\Pr(y > 2 \mid \mu = 0, \sigma = 1) = .02\)`
- *Here the distribution is fixed and the data can change*

]

.pull-right[
<img src="images/normal.png" width="100%" style="display: block; margin: auto;" />
]

---
# Likelihood

.pull-left[

- `\(L(\mu, \sigma \mid x)\)`
- Holding a sample of data constant, which parameter values are more likely?
- Which values have higher likelihood?
- *Here the data are fixed and the distribution can change*

]

.pull-right[
<img src="images/likelihood-2.png" width="100%" style="display: block; margin: auto;" />
]
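---
# Likelihood in R

- A minimal sketch (not from the original slides): hold a small sample fixed and compare its log-likelihood under two candidate normal distributions

```r
# the data are held fixed; the parameters vary
y <- c(-0.2, 0.9, 1.4)
sum(dnorm(y, mean = 0, sd = 1, log = TRUE))  # log-likelihood under N(0, 1): -4.16
sum(dnorm(y, mean = 1, sd = 1, log = TRUE))  # -3.56, higher: N(1, 1) fits better
```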
---

<img src="images/like1.png" width="100%" style="display: block; margin: auto;" />

---

<img src="images/like2.png" width="100%" style="display: block; margin: auto;" />

---

<img src="images/like4.png" width="100%" style="display: block; margin: auto;" />

---

<img src="images/like5.png" width="100%" style="display: block; margin: auto;" />

---

<img src="images/like6.png" width="100%" style="display: block; margin: auto;" />

---
# Maximum Likelihood Estimation

- Logistic regression:
  - Specify a likelihood function
  - Add data
  - Find the parameter values that maximize the likelihood function, given the observed data

---
# ML: Logistic Regression

- We need to find where the points lie on the line and get the corresponding logits

<br>
<br>
<br>

.pull-left[
<img src="images/logit_points.jpeg" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="images/logit_line.jpeg" width="100%" style="display: block; margin: auto;" />
]

---
# ML: Logistic Regression

- Plot them as a function of probability

<br>
<br>

<img src="images/max.png" width="50%" style="display: block; margin: auto;" />

---
# ML: Logistic Regression

<img src="images/log_ML1.png" width="90%" style="display: block; margin: auto;" />

---
# ML: Logistic Regression

<img src="images/log_ML2.png" width="90%" style="display: block; margin: auto;" />

---
# Now What?

- We keep rotating the log-odds line, projecting the data onto it, and transforming to probabilities
- Until we find the maximum likelihood!

<img src="images/mle-many.png" width="70%" style="display: block; margin: auto;" />

---
# Slope and Intercept Parameters

<br>
<br>

.pull-left[
<img src="images/MLE_1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="images/mle-2.png" width="100%" style="display: block; margin: auto;" />
]

---
class: middle

# Interactive: Understanding Maximum Likelihood Estimation: https://rpsychologist.com/likelihood/

---
# Dataset

- The Bechdel test

<img src="images/BechdelTest.jpeg" width="100%" style="display: block; margin: auto;" />

---
# Data

```r
#install.packages("fivethirtyeight")
bechdel <- fivethirtyeight::bechdel

glimpse(bechdel)
```

```
## Rows: 1,794
## Columns: 15
## $ year          <int> 2013, 2012, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 20…
## $ imdb          <chr> "tt1711425", "tt1343727", "tt2024544", "tt1272878", "tt0…
## $ title         <chr> "21 & Over", "Dredd 3D", "12 Years a Slave", "2 Guns", "…
## $ test          <chr> "notalk", "ok-disagree", "notalk-disagree", "notalk", "m…
## $ clean_test    <ord> notalk, ok, notalk, notalk, men, men, notalk, ok, ok, no…
## $ binary        <chr> "FAIL", "PASS", "FAIL", "FAIL", "FAIL", "FAIL", "FAIL", …
## $ budget        <int> 13000000, 45000000, 20000000, 61000000, 40000000, 225000…
## $ domgross      <dbl> 25682380, 13414714, 53107035, 75612460, 95020213, 383624…
## $ intgross      <dbl> 42195766, 40868994, 158607035, 132493015, 95020213, 1458…
## $ code          <chr> "2013FAIL", "2012PASS", "2013FAIL", "2013FAIL", "2013FAI…
## $ budget_2013   <int> 13000000, 45658735, 20000000, 61000000, 40000000, 225000…
## $ domgross_2013 <dbl> 25682380, 13611086, 53107035, 75612460, 95020213, 383624…
## $ intgross_2013 <dbl> 42195766, 41467257, 158607035, 132493015, 95020213, 1458…
## $ period_code   <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ decade_code   <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
```
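---
# Aside: MLE in Code

- Before we start fitting models, here is the likelihood machinery in miniature: a sketch with simulated data (not the Bechdel data, and not from the original slides); `optim()` lands close to what `glm()` would estimate

```r
# negative log-likelihood of a one-predictor logistic regression
neg_loglik <- function(par, x, y) {
  eta <- par[1] + par[2] * x  # linear predictor on the log-odds scale
  -sum(dbinom(y, size = 1, prob = plogis(eta), log = TRUE))
}

set.seed(1)
x <- rnorm(100)
y <- rbinom(100, size = 1, prob = plogis(-0.5 + 1.2 * x))

optim(c(0, 0), neg_loglik, x = x, y = y)$par  # close to c(-0.5, 1.2)
```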
---
# Fitting a Linear Model

<img src="log_reg_files/figure-html/fi-1.png" width="80%" />

- Not good for two reasons:

1. The outcome is non-normal
2. The predicted regression line is unbounded, meaning it can range from `\(-\infty\)` to `\(\infty\)`

---
# Model

`$$\begin{align*} \text{pass}_i & \sim \operatorname{Bernoulli}(p_i) \\ \operatorname{logit}(p_i) & = \beta_0 + \beta_1 \text{budget_2013}_i, \end{align*}$$`

```r
# `pass` (1 = PASS, 0 = FAIL) was created off-slide; reconstructed here
bechdel <- bechdel %>%
  mutate(pass = ifelse(binary == "PASS", 1, 0))

fit <- glm(
  data = bechdel,
  family = binomial(link = "logit"),
  pass ~ 1 + budget_2013)

# the slope rescaled to a $100 million unit of budget
new <- -5.972374e-09 * 1e8
```

---
# Logistic Regression in R

```r
fit <- glm(
  data = bechdel,
  family = binomial(link = "logit"),
  pass ~ 1 + budget_2013) %>%
  tidy()

fit
```
| term        | estimate  | std.error | statistic | p.value  |
|:------------|----------:|----------:|----------:|---------:|
| (Intercept) | 0.111     | 0.069     | 1.61      | 0.107    |
| budget_2013 | -5.97e-09 | 9.56e-10  | -6.25     | 4.11e-10 |
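---
# Predicted Probabilities

- As a quick check (a sketch, not from the original slides), `predict()` with `type = "response"` returns fitted values straight on the probability scale

```r
# refit and keep the model object (the chunk above overwrote `fit`
# with its tidy coefficient table)
fit <- glm(pass ~ 1 + budget_2013, data = bechdel, family = binomial)

# probability of passing at budgets of $0, $100 million, and $300 million
predict(fit, newdata = data.frame(budget_2013 = c(0, 1e8, 3e8)),
        type = "response")
```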
--- # Summary Output ``` ## ## Call: ## glm(formula = pass ~ 1 + budget, family = binomial(link = "logit"), ## data = bechdel) ## ## Deviance Residuals: ## Min 1Q Median 3Q Max ## -1.1955 -1.1221 -0.9046 1.2088 1.9312 ## ## Coefficients: ## Estimate Std. Error z value Pr(>|z|) ## (Intercept) 4.248e-02 6.606e-02 0.643 0.52 ## budget -5.797e-09 1.083e-09 -5.354 8.6e-08 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## (Dispersion parameter for binomial family taken to be 1) ## ## Null deviance: 2467.3 on 1793 degrees of freedom ## Residual deviance: 2436.2 on 1792 degrees of freedom ## AIC: 2440.2 ## ## Number of Fisher Scoring iterations: 4 ``` --- # Interpreting the intercept: `\(\beta_0\)`
| term        | estimate | std.error | statistic | p.value |
|:------------|---------:|----------:|----------:|--------:|
| (Intercept) | 0.0425   | 0.0661    | 0.643     | 0.52    |
| budget      | -5.8e-09 | 1.08e-09  | -5.35     | 8.6e-08 |
- Intercept:
  - When budget = 0, the log-odds of passing are 0.04
  - A movie with a budget of zero has about a 51% chance of passing the test

---
# Interpreting slope: `\(\beta_1\)`

- If X is a quantitative predictor:
  - As `\(X_i\)` increases by 1 unit, the log-odds of Y increase by `\(\beta_1\)`
  - As `\(X_i\)` increases by 1 unit, the odds of Y multiply by a factor of `\(\exp(\beta_1)\)`
- The `budget_2013` slope is tiny per dollar, so rescale it to -0.5972374, the expected change in log-odds for every $100,000,000 of movie budget
- Divide-by-4 rule for the upper limit on the probability change (Gelman & Hill, 2007): for every 1-unit increase in X, P(y) changes by no more than `\(\beta_1/4\)`

---
# Interpreting slope: `\(\beta_1\)`

- If X is a categorical predictor:
  - The difference in the log-odds between group X and the baseline is `\(\beta_1\)`
  - The odds of Y for group X are expected to be `\(\exp(\beta_1)\)` times the odds of Y for the baseline group

---
# Coefficient Effect Size: Odds Ratio

- For every $100,000,000 spent in movie budget, the odds of passing are decreased by a factor of .55:1
- Describe as a percentage increase/decrease
- *Remember how to interpret logs for the DV*

```r
exp(new) - 1  # percent change in the odds
```

```
## [1] -0.4496701
```

- For every $100,000,000 spent on movie budget, the odds of passing the test are reduced by 45%

---
# Coefficient Effect Size: Odds Ratio

```r
tidy(fit, exponentiate = TRUE)  # get odds ratios
```
| term        | estimate | std.error | statistic | p.value |
|:------------|---------:|----------:|----------:|--------:|
| (Intercept) | 1.04     | 0.0661    | 0.643     | 0.52    |
| budget      | 1        | 1.08e-09  | -5.35     | 8.6e-08 |
- Greater than 1: As the continuous variable increases, the event is more likely to occur
- Less than 1: As the variable increases, the event is less likely to occur
- Equals 1: As the variable increases, the likelihood of the event does not change

---
# Hypothesis test for `\(\beta_j\)`

.vocab[Hypotheses]: `\(H_0: \beta_j = 0 \hspace{2mm} \text{ vs } \hspace{2mm} H_a: \beta_j \neq 0\)`

--

.vocab[Test Statistic]:

`$$z = \frac{\hat{\beta}_j - 0}{SE_{\hat{\beta}_j}}$$`

--

.vocab[P-value]: `\(P(|Z| > |z|)\)`, where `\(Z \sim N(0, 1)\)`, the Standard Normal distribution

---
# Confidence interval for `\(\beta_j\)`

We can calculate the .vocab[C% confidence interval] for `\(\beta_j\)` as the following:

.eq[
`$$\Large{\hat{\beta}_j \pm z^* SE_{\hat{\beta}_j}}$$`
where `\(z^*\)` is calculated from the `\(N(0,1)\)` distribution
]

--

`$$\exp\{\hat{\beta}_1 \pm z^* SE(\hat{\beta}_1)\}$$`

---
# Log likelihood

- With large samples, likelihood values `\(L(\mu, \sigma \mid x)\)` get very small very fast
- To make them easier to work with, we usually work with the log-likelihood

.eq[
`$$\log L = \sum\limits_{i=1}^n[y_i \log(\hat{\pi}_i) + (1 - y_i)\log(1 - \hat{\pi}_i)]$$`
]

--

- Measure of how well the model fits the data

--

- Higher values of `\(\log L\)` are better

--

- .vocab[Deviance] = `\(-2 \log L\)`
  - `\(-2 \log L\)` follows a `\(\chi^2\)` distribution with `\(n - p - 1\)` degrees of freedom

---
# `\(\chi^2\)` distribution

<img src="log_reg_files/figure-html/unnamed-chunk-43-1.png" width="80%" style="display: block; margin: auto;" />

---
# Comparing Nested Models

- Suppose there are two models:
  - Reduced model includes predictors `\(x_1, \ldots, x_q\)`
  - Full model includes predictors `\(x_1, \ldots, x_q, x_{q+1}, \ldots, x_p\)`

--

- We want to test the hypotheses

`$$\begin{aligned}&H_0: \beta_{q+1} = \dots = \beta_p = 0 \\ & H_a: \text{ at least 1 }\beta_j \text{ is not } 0\end{aligned}$$`

--

- To do so, we will use the .vocab[Drop-in-deviance test] (also known as the Nested Likelihood Ratio test)

---
# Drop-In-Deviance Test

.vocab[Hypotheses]:

`$$\begin{aligned}&H_0: \beta_{q+1} = \dots = \beta_p = 0 \\ & H_a: \text{ at least 1 }\beta_j \text{ is not } 0\end{aligned}$$`

--

.vocab[Test Statistic]:

`$$G = (-2 \log L_{reduced}) - (-2 \log L_{full})$$`

--

.vocab[P-value]: `\(P(\chi^2 > G)\)`, calculated using a `\(\chi^2\)` distribution with df = `\(df_1 - df_2\)`

---
# Testing Deviance

- We can use the `anova()` function to conduct this test
  - Add `test = "Chisq"` to conduct the drop-in-deviance test

```r
anova(model1, model2, test = "Chisq")
```

- `performance` package:

```r
test_likelihoodratio(model1, model2)
```

---
# Model Fit

- McFadden pseudo `\(R^2\)`

`$$R^2 = \frac{SSM}{SST} = \frac{SST-SSE}{SST}=1-\frac{SSE}{SST}$$`

`$$R_{MF}^2 = 1-\frac{D_{\rm residual}}{D_{\rm null}}$$`

- Tjur's `\(R^2\)` (`easystats`):

```r
r2_tjur(model)
```

- Both are ratios indicating how "good" the model fit is

---
# Logistic Diagnostics

---
# Visualizing Logistic Data

- How boring!

.pull-left[

```r
p %>%
  ggplot(aes(x = budget_2013, y = fit)) +
  geom_ribbon(aes(ymin = ll, ymax = ul), alpha = 1/2) +
  geom_line() +
  geom_point(data = bechdel,
             aes(y = pass),
             alpha = 1/2) +
  scale_y_continuous("probability of passing",
                     expand = expansion(mult = 0.01))
```

]

.pull-right[
<img src="log_reg_files/figure-html/unnamed-chunk-49-1.png" width="100%" />
]
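---
# Where Did `p` Come From?

- The plotting data frame `p` used on these slides (fitted values plus CI bounds) was built off-slide; a plausible reconstruction (assumed, not the original code), with `fit` being the fitted glm from earlier:

```r
nd <- data.frame(budget_2013 = seq(0, 48e7, length.out = 200))
pr <- predict(fit, newdata = nd, se.fit = TRUE)  # on the link (log-odds) scale

p <- nd %>%
  mutate(fit = plogis(pr$fit),                     # back-transform to probability
         ll  = plogis(pr$fit - 1.96 * pr$se.fit),  # lower 95% bound
         ul  = plogis(pr$fit + 1.96 * pr$se.fit))  # upper 95% bound
```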
---
# Better?

.pull-left[

```r
p %>%
  ggplot(aes(x = budget_2013, y = fit)) +
  geom_ribbon(aes(ymin = ll, ymax = ul), alpha = 1/2) +
  geom_line() +
  geom_jitter(data = bechdel,
              aes(y = pass),
              size = 1/4, alpha = 1/2, height = 0.05) +
  scale_y_continuous("probability of passing",
                     expand = c(0, 0))
```

]

.pull-right[
<img src="log_reg_files/figure-html/unnamed-chunk-51-1.png" width="100%" />
]

---
# Better?

.pull-left[

```r
p %>%
  ggplot(aes(x = budget_2013)) +
  geom_ribbon(aes(ymin = ll, ymax = ul), alpha = 1/2) +
  geom_line(aes(y = fit)) +
  stat_dots(data = bechdel,
            aes(y = pass,
                side = ifelse(pass == 0, "top", "bottom")),
            scale = 1/3) +
  scale_y_continuous("probability of passing",
                     expand = c(0, 0))
```

]

.pull-right[
<img src="log_reg_files/figure-html/unnamed-chunk-53-1.png" width="100%" height="80%" />
]

---
# Better?

.pull-left[

```r
p %>%
  ggplot(aes(x = budget_2013)) +
  geom_ribbon(aes(ymin = ll, ymax = ul), alpha = 1/2) +
  geom_line(aes(y = fit)) +
  stat_dots(data = bechdel %>%
              mutate(binary = factor(binary, levels = c("PASS", "FAIL"))),
            aes(y = pass,
                side = ifelse(pass == 0, "top", "bottom"),
                color = binary),
            scale = 0.4, shape = 19) +
  scale_color_manual("Bechdel test", values = c("#009E73", "#D55E00")) +
  scale_x_continuous("budget (in 2013 dollars)",
                     breaks = c(0, 1e8, 2e8, 3e8, 4e8),
                     labels = c(0, str_c(1:4 * 100, " mill")),
                     expand = c(0, 0), limits = c(0, 48e7)) +
  scale_y_continuous("probability of passing",
                     expand = c(0, 0)) +
  theme(panel.grid = element_blank())
```

]

.pull-right[
<img src="log_reg_files/figure-html/unnamed-chunk-55-1.png" width="100%" style="display: block; margin: auto;" />
]

---
# Case Study: Losing Weight

- These data are a sample of 500 teens from data collected in 2009 through the U.S. Youth Risk Behavior Surveillance System (YRBSS)
- Are the odds that young females report trying to lose weight greater than the odds that males do? Is increasing BMI associated with an interest in losing weight, regardless of sex?

```r
risk2009.data <- foreign::read.spss(
  file = "https://github.com/proback/BeyondMLR/blob/master/data/yrbs09.sav?raw=true",
  to.data.frame = TRUE)

set.seed(33)

risk2009.samp <- risk2009.data %>%
  sample_n(500)

dim(risk2009.samp)
```

```
## [1] 500 208
```

---
# Trying to lose weight

```r
risk2009 <- risk2009.samp %>%
  transmute(lose.wt = ifelse(Q66 == "Lose weight", "Lose weight", "No weight loss") %>% factor(),
            lose.wt.01 = ifelse(Q66 == "Lose weight", 1, 0),
            sex = Q2,
            sport = ifelse(Q84 == "0 teams", "No sports", "Sports") %>% factor(),
            sport_fac = Q84,
            sport_num = str_extract(Q84, "\\d") %>% as.double(),
            media = as.numeric(Q81) - 2,
            media = ifelse(media == 0, 0.5, media),
            media = ifelse(media == -1, 0, media),
            bmipct = bmipct) %>%
  mutate(female = ifelse(sex == "Female", 1, 0)) %>%
  filter(complete.cases(.))

# what have we done?
glimpse(risk2009)
```

```
## Rows: 445
## Columns: 9
## $ lose.wt    <fct> Lose weight, No weight loss, No weight loss, No weight loss…
## $ lose.wt.01 <dbl> 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0,…
## $ sex        <fct> Male, Female, Male, Male, Male, Male, Male, Male, Male, Mal…
## $ sport      <fct> No sports, No sports, Sports, No sports, No sports, No spor…
## $ sport_fac  <fct> 0 teams, 0 teams, 1 team, 0 teams, 0 teams, 0 teams, 0 team…
## $ sport_num  <dbl> 0, 0, 1, 0, 0, 0, 0, 1, 1, 3, 0, 0, 1, 1, 3, 1, 1, 1, 3, 1,…
## $ media      <dbl> 3.0, 1.0, 3.0, 3.0, 3.0, 2.0, 0.0, 0.5, 1.0, 1.0, 5.0, 2.0,…
## $ bmipct     <dbl> 98, 41, 6, 41, 99, 97, 0, 100, 3, 8, 99, 13, 87, 71, 93, 97…
## $ female     <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1,…
```

---
# Trying to lose weight

- Selected variables:
  - DV: `lose.wt.01`, which is coded 1 when someone indicated they were trying to lose weight and 0 when they were not. This is a dichotomized version of the `lose.wt` factor variable.
  - Our two predictor variables will be:
    * `sex` and its related dummy variable `female`
    * `bmipct`, which is the percentile for a given BMI for members of the same sex

---
# Exploratory Data Analysis

- Here are two versions of comparing `lose.wt.01` by `sex`

.pull-left[
<img src="log_reg_files/figure-html/unnamed-chunk-58-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="log_reg_files/figure-html/unnamed-chunk-59-1.png" width="100%" style="display: block; margin: auto;" />
]

---
# Exploratory Data Analysis

- Here are the histograms for `bmipct`, faceted by `lose.wt`.

<img src="log_reg_files/figure-html/unnamed-chunk-60-1.png" width="80%" style="display: block; margin: auto;" />

---
# Exploratory Data Analysis

- Here are the histograms for `bmipct`, faceted by both `lose.wt` and `sex`.

<img src="log_reg_files/figure-html/unnamed-chunk-61-1.png" width="70%" style="display: block; margin: auto;" />

---
# Logistic Regression: Fitting Model

- Let's fit an intercept-only model

$$
`\begin{align*}
\text{lose.wt.01}_i & \sim \operatorname{Binomial}(n = 1, p_i)\\
\text{logit}(p_i) & = \beta_0
\end{align*}`
$$

---
# Logistic Regression: Fitting Model

```r
# fit
fit1 <- glm(
  data = risk2009,
  family = binomial,
  lose.wt.01 ~ 1
)

fit1 %>%
  tidy()
```
| term        | estimate | std.error | statistic | p.value |
|:------------|---------:|----------:|----------:|--------:|
| (Intercept) | -0.212   | 0.0953    | -2.22     | 0.0262  |
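---
# Sanity Check

- A quick check (a sketch, not from the original slides): the intercept of an empty model is just the log-odds of the overall proportion

```r
qlogis(mean(risk2009$lose.wt.01))  # log(0.447 / 0.553) is about -0.212
```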
--- # Interpreting `\(\beta_0\)`
| term        | estimate | std.error | statistic | p.value |
|:------------|---------:|----------:|----------:|--------:|
| (Intercept) | -0.212   | 0.0953    | -2.22     | 0.0262  |
- The intercept `\(\beta_0\)` is in the log-odds metric

--

- We can convert it to a probability

```r
b0 <- coef(fit1)

exp(b0) / (1 + exp(b0))
```

```
## (Intercept)
##    0.447191
```

- Thus, the overall probability of students wanting to lose weight was about .45

---

- Let's check that with the empirical proportions

```r
risk2009 %>%
  count(lose.wt.01) %>%
  mutate(proportion = n / sum(n))
```
| lose.wt.01 | n   | proportion |
|-----------:|----:|-----------:|
| 0          | 246 | 0.553      |
| 1          | 199 | 0.447      |
- We can actually do that transformation with the base **R** `plogis()` function.

```r
plogis(b0)
```

```
## (Intercept)
##    0.447191
```

---
# Logistic Regression

- Add our binary predictor `female`, which gives us the model

$$
`\begin{align*}
\text{lose.wt.01}_i & \sim \operatorname{Binomial}(n = 1, p_i) \\
\text{logit}(p_i) & = \beta_0 + \beta_1 \text{female}_i,
\end{align*}`
$$

where `female` is a 0/1 dummy variable.

---
# Fit Model

```r
# fit
fit2 <- glm(
  data = risk2009,
  family = binomial,
  lose.wt.01 ~ 1 + female
)

# summarize
fit2 %>%
  tidy()
```
| term        | estimate | std.error | statistic | p.value  |
|:------------|---------:|----------:|----------:|---------:|
| (Intercept) | -0.752   | 0.141     | -5.33     | 9.59e-08 |
| female      | 1.09     | 0.198     | 5.52      | 3.38e-08 |
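---

- The next chunk uses `b0` and `b1`, which were extracted off-slide; a plausible reconstruction (assumed, not the original code):

```r
b0 <- unname(coef(fit2)[1])  # intercept: log-odds of wanting to lose weight for males
b1 <- unname(coef(fit2)[2])  # slope: difference in log-odds for females
```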
---

- Now we can compute the probabilities for young men and women as follows

```r
# young men
plogis(b0 + b1 * 0)
```

```
## [1] 0.3203463
```

```r
# young women
plogis(b0 + b1 * 1)
```

```
## [1] 0.5841121
```

---
# `emmeans`

- Can use `emmeans` to get probabilities directly from the model
- Can also get odds ratios for comparisons!

```r
emm <- emmeans(fit2, "female", type = "response")

pairs(emm, reverse = TRUE)
```

```
## contrast            odds.ratio    SE  df null z.ratio p.value
## female1 / female0         2.98 0.589 Inf    1   5.520  <.0001
##
## Tests are performed on the log odds ratio scale
```

```r
emm
```

```
## female  prob     SE  df asymp.LCL asymp.UCL
##      0 0.320 0.0307 Inf     0.263     0.383
##      1 0.584 0.0337 Inf     0.517     0.648
##
## Confidence level used: 0.95
## Intervals are back-transformed from the logit scale
```

---
# Logistic Regression: Continuous Predictor

Now we add our continuous predictor `bmipct`, which gives us the model:

$$
`\begin{align*}
\text{lose.wt.01}_i & \sim \operatorname{Binomial}(n = 1, p_i) \\
\text{logit}(p_i) & = \beta_0 + \beta_1 \text{bmipct}_i,
\end{align*}`
$$

```r
# fit
fit3 <- glm(
  data = risk2009,
  family = binomial,
  lose.wt.01 ~ 1 + bmipct
)
```

---
# Model 3

```r
# summarize
fit3 %>%
  tidy()
```
| term        | estimate | std.error | statistic | p.value  |
|:------------|---------:|----------:|----------:|---------:|
| (Intercept) | -2.57    | 0.314     | -8.19     | 2.72e-16 |
| bmipct      | 0.0355   | 0.00428   | 8.3       | 1.04e-16 |
The coefficient for `bmipct` seems small at 0.036, but recall that `bmipct` ranges from 0 to 100.

- Thus a one-point increase is a one-percentile increase in BMI, which isn't very much

---
# Logistic Regression: Multiple Predictors

- The first model will have both `female` and `bmipct` as predictors:

$$
`\begin{align*}
\text{lose.wt.01}_i & \sim \operatorname{Binomial}(n = 1, p_i) \\
\text{logit}(p_i) & = \beta_0 + \beta_1 \text{female}_i + \beta_2 \text{bmipct}_i.
\end{align*}`
$$

---
# Logistic Regression: Multiple Predictors

- The second model will add an interaction term between the two predictors:

$$
`\begin{align*}
\text{lose.wt.01}_i & \sim \operatorname{Binomial}(n = 1, p_i) \\
\text{logit}(p_i) & = \beta_0 + \beta_1 \text{female}_i + \beta_2 \text{bmipct}_i + \beta_3 \text{female}_i \cdot \text{bmipct}_i.
\end{align*}`
$$

---
# Fitting Both Models

```r
fit4 <- glm(
  data = risk2009,
  family = binomial,
  lose.wt.01 ~ 1 + female + bmipct
)

fit5 <- glm(
  data = risk2009,
  family = binomial,
  lose.wt.01 ~ 1 + female + bmipct + female:bmipct
)
```

---
# Model 4

```
##
## Call:  glm(formula = lose.wt.01 ~ 1 + female + bmipct, family = binomial,
##     data = risk2009)
##
## Coefficients:
## (Intercept)       female       bmipct
##    -4.25914      1.86067      0.04715
##
## Degrees of Freedom: 444 Total (i.e. Null);  442 Residual
## Null Deviance:       611.9
## Residual Deviance: 463   AIC: 469
```

---
# Model 5

```r
glm(data = risk2009,
    family = binomial,
    lose.wt.01 ~ 1 + female + bmipct + female:bmipct
) %>%
  tidy()
```
| term          | estimate | std.error | statistic | p.value  |
|:--------------|---------:|----------:|----------:|---------:|
| (Intercept)   | -4.65    | 0.734     | -6.33     | 2.53e-10 |
| female        | 2.41     | 0.839     | 2.87      | 0.00405  |
| bmipct        | 0.0519   | 0.00883   | 5.88      | 4.12e-09 |
| female:bmipct | -0.00768 | 0.011     | -0.698    | 0.485    |
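---
# Odds Ratios with CIs

- A quick sketch (not from the original slides): exponentiating the additive model's coefficients and their profile-likelihood confidence intervals gives odds ratios with CIs

```r
# base R: confint() on a glm uses profile likelihood
exp(cbind(OR = coef(fit4), confint(fit4)))
```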
---
# Model Comparisons

<img src="log_reg_files/figure-html/unnamed-chunk-76-1.png" width="70%" style="display: block; margin: auto;" />

- The interaction term `female:bmipct` (i.e., `\(\beta_3\)`) is not statistically significant.

---
# Likelihood Ratio Test

- Let's compare the models by performing a likelihood ratio test

```r
test_likelihoodratio(fit4, fit5)
```
| Name | Model | df | df_diff | Chi2  | p     |
|:-----|:------|---:|--------:|------:|------:|
| fit4 | glm   | 3  |         |       |       |
| fit5 | glm   | 4  | 1       | 0.498 | 0.481 |
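- The same drop-in-deviance test can be run with base R's `anova()` (a quick sketch; it reproduces the chi-square and p-value above):

```r
anova(fit4, fit5, test = "Chisq")
```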
--- ```r bind_rows( glance(fit1), glance(fit2), glance(fit3), glance(fit4), glance(fit5) ) %>% mutate(fit = str_c("fit", 1:5)) %>% select(fit, everything()) ```
| fit  | null.deviance | df.null | logLik | AIC | BIC | deviance | df.residual | nobs |
|:-----|--------------:|--------:|-------:|----:|----:|---------:|------------:|-----:|
| fit1 | 612           | 444     | -306   | 614 | 618 | 612      | 444         | 445  |
| fit2 | 612           | 444     | -290   | 584 | 593 | 580      | 443         | 445  |
| fit3 | 612           | 444     | -263   | 529 | 537 | 525      | 443         | 445  |
| fit4 | 612           | 444     | -231   | 469 | 481 | 463      | 442         | 445  |
| fit5 | 612           | 444     | -231   | 470 | 487 | 462      | 441         | 445  |
- When you compare model `fit4` with the interaction model `fit5`, it appears that the interaction term does not add anything

---

Let's plot the results from `fit4`.

<img src="log_reg_files/figure-html/unnamed-chunk-79-1.png" width="80%" />

---
# Model Comparisons

- Radar plots

```r
# easystats package
plot(compare_performance(fit1, fit2, fit3, fit4))
```

<img src="log_reg_files/figure-html/unnamed-chunk-80-1.png" width="90%" style="display: block; margin: auto;" />

---
# Visualization

```r
library(sjPlot)  # needed to plot model predictions
library(sjmisc)

plot_model(fit4, type = "pred", terms = c("bmipct", "female")) +
  theme_bw()  # type = "pred" plots predicted probabilities
```

<img src="log_reg_files/figure-html/unnamed-chunk-81-1.png" width="50%" style="display: block; margin: auto;" />

---
# Visualization

```r
library(sjPlot)
library(sjmisc)

plot_model(fit4, type = "pred", terms = c("bmipct")) +
  theme_bw()
```

<img src="log_reg_files/figure-html/unnamed-chunk-82-1.png" width="50%" style="display: block; margin: auto;" />

---
# Visualization

```r
library(ggeffects)

plot(ggpredict(fit4, "bmipct"), ci = FALSE, add.data = TRUE)
```

<img src="log_reg_files/figure-html/unnamed-chunk-83-1.png" width="60%" style="display: block; margin: auto;" />

---
# Table

```r
# library(sjPlot)
fit3 %>%
  tab_model()
```

| Predictors  | Odds Ratios | CI          | p      |
|:------------|:-----------:|:-----------:|:------:|
| (Intercept) | 0.08        | 0.04 – 0.14 | <0.001 |
| bmipct      | 1.04        | 1.03 – 1.05 | <0.001 |

Observations: 445. Tjur's `\(R^2\)`: 0.181.

---

```r
data <- read_csv("https://raw.githubusercontent.com/jgeller112/psy503-psych_stats/master/static/slides/14-Logistic_Regression/data/gss.csv")
```

DV:

- Attended a religious service in the last week

Independent variables:

- Age
- Sex
- Socioeconomic status (SES)
- Negative attitudes toward premarital sex

---
# Report Logistic Regression

- State your hypothesis, statistical test, its link function, and justify the use of a logistic regression

> We hypothesized that people who were against premarital sex would be more likely to have attended a religious event in the previous week. Religious event attendance was a binary outcome, as the person either attended an event or they did not, which violated the normality assumption required for traditional regression. Thus, a logistic regression with a logit link function was used to model religious event attendance using R 4.0.4 (R Core Team, 2020).

---
# Report Logistic Regression

- State the full model and your results, including the odds ratio

> Religious event attendance was modeled as a function of attitudes about premarital sex, controlling for the age, sex, and socioeconomic status of the respondent. As shown in Figure 1, this analysis revealed that attitudes about premarital sex significantly predicted religious event attendance, b = 0.72, SE = 0.05, z(1303) = 13.24, p < 0.01, odds ratio = 2.05:1. This implies that every 1-unit increase in negative attitudes toward premarital sex predicted a 2-fold increase in the odds that the person had attended religious services in the previous week. Table 1 provides the model estimates for all predictors.
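---
# Reproducing the Report

- A hedged sketch of the kind of call that produces numbers like those above (the variable names are placeholders; the actual `gss.csv` column names may differ):

```r
# hypothetical column names -- check names(data) and adjust
report_fit <- glm(attend ~ premarital + age + sex + ses,
                  family = binomial(link = "logit"),
                  data = data)

tidy(report_fit, exponentiate = TRUE, conf.int = TRUE)  # odds ratios with 95% CIs
```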