When a linear regression model fits poorly because of nonlinearity, one option is to add polynomial terms for one or more covariates. For example, with one covariate $X_1$:
$Y_i = \beta_0 + \beta_1X_{i,1} + \beta_2X_{i,1}^2 + \dots + \beta_dX_{i,1}^d + \varepsilon_i$
Heuristics for how to fit the model include:
- add terms of higher order until they are no longer statistically significant
- start with some high guess and remove terms until all are statistically significant
Note that these two heuristics will not always select the same degree. In general, it is considered a bad idea to include a higher-order term without also including all lower-order terms, because the fit then changes when the covariate is shifted or centered.
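The first heuristic can be sketched as a short loop. This is only an illustration on simulated data: the data frame `sim_data`, the cutoff `alpha`, and the cap `max_degree` are assumptions made for the example, not part of the heuristic itself.

```R
# Simulated data with a truly quadratic relationship (illustrative only)
set.seed(1)
n <- 200
x <- runif(n, -2, 2)
y <- 1 + 2 * x - 1.5 * x^2 + rnorm(n, sd = 0.5)
sim_data <- data.frame(x = x, y = y)

alpha <- 0.05        # assumed significance cutoff
max_degree <- 6      # assumed upper bound on the degree
chosen_degree <- 0

for (d in 1:max_degree) {
  # raw = TRUE gives the plain powers x, x^2, ..., x^d,
  # so all lower-order terms are always included
  fit <- lm(y ~ poly(x, d, raw = TRUE), data = sim_data)
  coefs <- summary(fit)$coefficients
  # p-value of the highest-order term (last row of the coefficient table)
  p_highest <- coefs[nrow(coefs), "Pr(>|t|)"]
  if (p_highest >= alpha) break   # stop once the newest term is not significant
  chosen_degree <- d
}

chosen_degree
```

Because the loop stops at the first non-significant term, it can miss a significant term of even higher order, which is one way the two heuristics can disagree.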
In [[R]]:
```R
lm_mod <- lm(y ~ x + I(x^2) + ..., data=data)
```
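As a concrete instance of the call above, here is a minimal runnable sketch for a quadratic fit. The simulated `toy_data` and the model names are made up for illustration. It also shows `poly()`, which builds the same model without writing each power by hand: `raw = TRUE` reproduces the `I(x^2)` coefficients, while the default orthogonal basis gives different coefficients but the same fitted values.

```R
# Toy data (illustrative only)
set.seed(42)
x <- seq(-3, 3, length.out = 100)
y <- 2 + 0.5 * x + 1.2 * x^2 + rnorm(100)
toy_data <- data.frame(x = x, y = y)

# Three ways to specify the same quadratic model
fit_explicit <- lm(y ~ x + I(x^2), data = toy_data)           # powers written out
fit_raw      <- lm(y ~ poly(x, 2, raw = TRUE), data = toy_data) # raw powers via poly()
fit_orth     <- lm(y ~ poly(x, 2), data = toy_data)           # orthogonal polynomials

# Raw-power coefficients match the explicit specification
all.equal(unname(coef(fit_explicit)), unname(coef(fit_raw)))

# The orthogonal basis changes the coefficients but not the fitted values
all.equal(fitted(fit_explicit), fitted(fit_orth))

summary(fit_explicit)
```

The orthogonal form is often numerically more stable at higher degrees, at the cost of coefficients that no longer map directly onto $\beta_1, \beta_2, \dots$.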