Testing The Equality of Regression Coefficients

The Problem

You have two predictors in your model. One seems to have a stronger coefficient than the other. But is it significant?

Example: when predicting a worker’s salary, is the standardized coefficient of number of extra hours (xtra_hours) really larger than that of number of compliments given the to boss n_comps?

library(parameters)
library(effectsize)

data("hardlyworking", package = "effectsize")

hardlyworkingZ <- standardize(hardlyworking)

m <- lm(salary ~ xtra_hours + n_comps, data = hardlyworkingZ)

model_parameters(m)
#> Parameter   | Coefficient |   SE |        95% CI |    t(497) |      p
#> ---------------------------------------------------------------------
#> (Intercept) |   -7.19e-17 | 0.02 | [-0.03, 0.03] | -4.14e-15 | > .999
#> xtra_hours  |        0.81 | 0.02 | [ 0.78, 0.84] |     46.60 | < .001
#> n_comps     |        0.41 | 0.02 | [ 0.37, 0.44] |     23.51 | < .001

Here are 4 6 methods to test coefficient equality in R.

Notes
- If we were interested in the unstandardized coefficient, we would not need to first standardize the data.
- Note that if one parameter was positive, and the other was negative, one of the terms would need to be first reversed (-X) to make this work.

Method 1: As Model Comparisons

Based on this awesome tweet.

Since:

\(\hat{Y} = a \times \text{xtra_hours} + a \times \text{n_comps} \\ = a \times (\text{xtra_hours} + \text{n_comps})\)

We can essentially force a constraint on the coefficient to be equal by using a composite variable.

m0 <- lm(salary ~ I(xtra_hours + n_comps), data = hardlyworkingZ)

model_parameters(m0)
#> Parameter            | Coefficient |   SE |        95% CI |    t(498) |      p
#> ------------------------------------------------------------------------------
#> (Intercept)          |   -2.05e-17 | 0.02 | [-0.04, 0.04] | -9.57e-16 | > .999
#> xtra_hours + n_comps |        0.61 | 0.01 | [ 0.58, 0.64] |     41.09 | < .001

We can then compare how this model compares to our first model without this constraint:

anova(m0, m)
#> Analysis of Variance Table
#> 
#> Model 1: salary ~ I(xtra_hours + n_comps)
#> Model 2: salary ~ xtra_hours + n_comps
#>   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
#> 1    498 113.67                                  
#> 2    497  74.95  1    38.716 256.73 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We can conclude that the unconstrained model is significantly better than the constrained model - meaning that \(\beta_{\text{xtra_hours}} > \beta_{\text{n_comps}}\).

Method 1b: Composite Variable + Difference

Edit (Feb-17, 2021): Thanks to @joejps84 for pointing this out!

We can achieve the same thing in a single model. If we say that the slope of n_comps as equal to the slope of xtra_hours + some change:

\(\hat{Y} = a \times \text{xtra_hours} + (a + b) \times \text{n_comps} \\ = a \times \text{xtra_hours} + a \times \text{n_comps} + b \times \text{n_comps} \\ = a \times (\text{xtra_hours} + \text{n_comps}) + b \times \text{n_comps}\)

So the slope of n_comps in such a model is the difference between the slope of n_comps and xtra_hours:

m1 <- lm(salary ~ I(xtra_hours + n_comps) + n_comps, data = hardlyworkingZ)

model_parameters(m1)
#> Parameter            | Coefficient |   SE |         95% CI |    t(497) |      p
#> -------------------------------------------------------------------------------
#> (Intercept)          |   -2.99e-17 | 0.02 | [-0.03,  0.03] | -1.72e-15 | > .999
#> xtra_hours + n_comps |        0.81 | 0.02 | [ 0.78,  0.84] |     46.60 | < .001
#> n_comps              |       -0.40 | 0.03 | [-0.45, -0.35] |    -16.02 | < .001

This give the exact same result (\(t^2 = (-16)^2 = 256\) is the same as the F-value from the anova() test)!

Method 2: Paternoster et al (1998)

According to Paternoster et al. (1998), we can compute a t-test to compare the coefficients:

b <- coef(m)
V <- vcov(m)

tibble::tibble(
  diff_estim = b[2] - b[3],
  diff_SE = sqrt(V[2, 2] + V[3, 3] - 2 * V[2, 3]),
  t_stat = diff_estim / diff_SE,
  df = df.residual(m),
  p_value = 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)
)
#> # A tibble: 1 x 5
#>   diff_estim diff_SE t_stat    df  p_value
#>        <dbl>   <dbl>  <dbl> <int>    <dbl>
#> 1      0.402  0.0251   16.0   497 6.96e-47

This gives the exact same results as the first method!

Edit (Feb-17, 2021): As @bmwiernik pointed out, this can also be done with some fancy matrix multiplication:

contr <- c(0, 1, -1)

tibble::tibble(
  diff_estim = b %*% contr,
  diff_SE = sqrt(contr %*% V %*% contr),
  t_stat = diff_estim / diff_SE,
  df = df.residual(m),
  p_value = 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)
)
#> # A tibble: 1 x 5
#>   diff_estim[,1] diff_SE[,1] t_stat[,1]    df p_value[,1]
#>            <dbl>       <dbl>      <dbl> <int>       <dbl>
#> 1          0.402      0.0251       16.0   497    6.96e-47

All of the following solutions are essentially this method, wrapped in some nice functions.

Method 3: emmeans <3

We can estimate the slopes from the model using emmeans, and then, with some trickery, compare them!

library(emmeans)

trends <- rbind(
  emtrends(m, ~1, "xtra_hours"),
  emtrends(m, ~1, "n_comps")
)

# clean up so it does not error later
trends@grid$`1` <- c("xtra_hours", "n_comps")
trends@misc$estName <- "trend"

trends
#>  1          trend     SE  df lower.CL upper.CL
#>  xtra_hours 0.811 0.0174 497    0.772    0.850
#>  n_comps    0.409 0.0174 497    0.370    0.448
#> 
#> Confidence level used: 0.95 
#> Conf-level adjustment: bonferroni method for 2 estimates

Compare them:

pairs(trends)
#>  contrast             estimate     SE  df t.ratio p.value
#>  xtra_hours - n_comps    0.402 0.0251 497 16.023  <.0001

Once again - exact same results!

Method 4: lavaan

The lavaan package for latent variable analysis and structural equation modeling allows for parameter constraining. So let’s do just that:

library(lavaan)

L <- sem("salary ~ xtra_hours + n_comps", data = hardlyworkingZ)
L0 <- sem("salary ~ a * xtra_hours + a * n_comps", data = hardlyworkingZ)

anova(L0, L)
#> Chi-Squared Difference Test
#> 
#>    Df    AIC    BIC  Chisq Chisq diff Df diff Pr(>Chisq)    
#> L   0 476.04 488.69   0.00                                  
#> L0  1 682.26 690.69 208.22     208.22       1  < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

This too yields similar results! (Only slightly different due to different estimation methods.)

We can also directly estimate the difference using lavaan’s “defined parameters” - defines with the := operator:

L <- sem("salary ~ b1 * xtra_hours + b2 * n_comps
         diff := b1 - b2", 
         data = hardlyworkingZ)

parameterEstimates(L, output = "text")[7,]
#> 
#> Defined Parameters:
#>                    Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
#>     diff              0.402    0.025   16.071    0.000    0.353    0.451

Which, again, gives the same results.

Method 5: car

Edit (Feb-17, 2021): Thanks to @DouglasKGAraujo for his suggestion!

Even more methods!

library(car)

linearHypothesis(m, c("xtra_hours - n_comps"))
#> Linear hypothesis test
#> 
#> Hypothesis:
#> xtra_hours - n_comps = 0
#> 
#> Model 1: restricted model
#> Model 2: salary ~ xtra_hours + n_comps
#> 
#>   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
#> 1    498 113.67                                  
#> 2    497  74.95  1    38.716 256.73 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Which once again gives the same result!

Method 6: multcomp

Edit (Feb-17, 2021): Thanks to Stefan Th. Gries for his suggestion!

Even more more methods!

library(multcomp)

summary(glht(m, matrix(c(0, 1, -1), nrow = 1)))
#> 
#>   Simultaneous Tests for General Linear Hypotheses
#> 
#> Fit: lm(formula = salary ~ xtra_hours + n_comps, data = hardlyworkingZ)
#> 
#> Linear Hypotheses:
#>        Estimate Std. Error t value Pr(>|t|)    
#> 1 == 0  0.40168    0.02507   16.02   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> (Adjusted p values reported -- single-step method)

Which once again again gives the same result!

Summary

That’s it really - these 4 6 simple methods have wide applications to GL(M)M’s, SEM, and more.

Enjoy!

comments powered by Disqus