The task of statistical modeling is to [[explanation|explain]] or [[prediction|predict]] outputs from inputs. Typically, statistical modeling begins with a research question. The statistician may need to [[operationalize]] the concept of the research question to ensure [[concept validity]]. Most statistical modeling falls in the domain of [[statistical modeling]]. Additional care must be taken to establish [[causality]] (causal inference), inferring that the inputs *caused* the outputs. Common errors in statistical modeling include the [[circular analysis]] of a dataset and [[overfitting]] of the model. Common statistical models include [[linear regression]] and [[ANOVA]]. ## challenges of statistical modeling - no ability to change inputs (except maybe in experimental settings) - relationship between inputs and outputs are not deterministic (they are stochastic) ## considerations in building a statistical model - which input variables should be included? - how does a change in one variable impact the output (sensitivity) - are the impacts of changes consistent across the range of an input variable or different? - is there a causal relationship between the input and output? ## evaluation - Accuracy - Speed - Interpretability - Robustness: to noise and missing data - Scalability