The task of statistical modeling is to [[explanation|explain]] or [[prediction|predict]] outputs from inputs. Typically, statistical modeling begins with a research question. The statistician may need to [[operationalize]] the concept of the research question to ensure [[concept validity]].
Most statistical modeling falls in the domain of [[statistical modeling]]. Additional care must be taken to establish [[causality]] (causal inference), inferring that the inputs *caused* the outputs.
Common errors in statistical modeling include the [[circular analysis]] of a dataset and [[overfitting]] of the model.
Common statistical models include [[linear regression]] and [[ANOVA]].
## challenges of statistical modeling
- no ability to change inputs (except maybe in experimental settings)
- relationship between inputs and outputs are not deterministic (they are stochastic)
## considerations in building a statistical model
- which input variables should be included?
- how does a change in one variable impact the output (sensitivity)
- are the impacts of changes consistent across the range of an input variable or different?
- is there a causal relationship between the input and output?
## evaluation
- Accuracy
- Speed
- Interpretability
- Robustness: to noise and missing data
- Scalability