Statistical power is the probability that a test rejects the null hypothesis when it is in fact false. The power of a test is $1 - \beta$, the complement of the probability of a [[Type II Error]]. Here $\beta$ denotes the probability of a Type II error: failing to reject the null hypothesis when the true value of the parameter $\theta$ lies in the region of the parameter space corresponding to the alternative hypothesis $H_1$. Taking the worst case over that region,

$\begin{align} \beta &= \max_{\theta \in H_1} P(\text{Type II Error}) \\ &= \max_{\theta \in H_1} P(\text{Fail to reject } H_0; \theta) \end{align}$

Statistical power is typically calculated *a priori*, for example when comparing hypothesis tests to determine the [[best test]] to use. In this case the observed value of the test statistic is unknown, so we consider power across the parameter space (the range of values the parameter can take) rather than at the observed value of the test statistic. One could, however, calculate the probability of committing a Type II error given an observed test statistic; this is called post-hoc power analysis, and its validity and usefulness are debated by statisticians.

Power depends on the sample size, the true effect size, the variability in the population, and the significance level. Of these, sample size and significance level can be controlled in the design of the study.

The replication crisis is in part due to experiments with low power. Many organizations, such as the American Psychological Association, have recently released guidance urging researchers to pay more attention to power in their studies. Most medical studies are only permitted if their design achieves a power of 80% or greater[^1]. However, this standard is not common in other domains such as psychology.

[^1]: [[gelman_2014]]
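As a concrete sketch of how power depends on sample size, effect size, variability, and significance level, here is the standard closed-form power calculation for a two-sided one-sample $z$-test with known population standard deviation (the function name and parameter choices are illustrative, not from the source):

```python
from statistics import NormalDist

def power_two_sided_z(effect_size, n, sigma, alpha=0.05):
    """Power of a two-sided one-sample z-test with known sigma.

    effect_size: true difference between the population mean and
                 the null-hypothesis mean (mu - mu_0)
    n:           sample size
    sigma:       population standard deviation
    alpha:       significance level
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)          # two-sided critical value
    shift = effect_size * n ** 0.5 / sigma      # noncentrality of the test statistic
    # Probability the test statistic lands in either tail of the rejection region
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)

# When the null is true (effect_size = 0), "power" reduces to alpha,
# and power grows with n: e.g. a standardized effect of 0.5 reaches
# roughly 80% power near n = 32 at alpha = 0.05.
print(power_two_sided_z(0.5, 32, 1.0))
```

Note that $\beta = 1 - \text{power}$, so the same function gives the Type II error rate directly; scanning it over a range of `effect_size` values traces out the power curve across the alternative parameter space described above.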