Hypothesis testing helps us make decisions from data by quantifying and controlling for [[errors in hypothesis testing]], that is, making the wrong decision due to variability in sample data. In hypothesis testing, a null hypothesis, which we assume to be true, is contrasted with an alternative (or alternate) hypothesis that will inform our decision making. The alternative hypothesis is always what you want to "prove". However, we can only "reject" or "fail to reject" the null hypothesis; we do not prove either the null or the alternative hypothesis.

## Process

Step 1: **Choose a statistic** on which to base the test (e.g., the sample mean $\bar X$). This can be any [[estimator]] for which we can work out the distribution. We'll call this our [[test statistic]].

Step 2: **State the form** (or direction) of the test. For this step, consider: if the alternative hypothesis were true, how would that be reflected in the test statistic? Which value(s) of the test statistic would lead you to reject the null hypothesis in favor of the alternative? For the hypothesis test of

$\begin{align} H_0: \theta = \theta_0 && H_1: \theta > \theta_0 \end{align}$

we would expect the estimator $\hat \theta$ to be large if the alternate hypothesis were true, so the form of the test would be to reject the null hypothesis, in favor of the alternate hypothesis, when $\hat \theta > c$ for some value $c$.

Step 3: **Find the critical value(s)**, the cutoff(s) against which we will compare the test statistic. For this step, get curious about the distribution of the test statistic. Can you transform the test statistic to a known distribution? If not, does the [[Central Limit Theorem]] apply (e.g., for a sample mean)? Use the [[level of significance]] of the test to find the critical value: find $c$ such that the probability of a Type I error is equal to $\alpha$ when the null hypothesis is true, which is to say that $\theta = \theta_0$.

$\begin{align} \alpha &= P(\text{Type I Error}) \\ &= P(\text{Reject } H_0; \ \theta_0) \\ &= P(\hat \theta > c; \ \theta_0) \end{align}$

Use the distribution of the test statistic $\hat \theta$, or some transformation $g(\hat \theta)$, to resolve the probability and solve for $c$.

Step 4: **Give a conclusion**. Calculate the test statistic and compare it to the critical value. Depending on the form (or direction) of the test, determine whether or not to reject the null hypothesis in favor of the alternate hypothesis.
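To make the four steps concrete, here is a minimal sketch in R of a one-sided test for a mean with known variance (a z-test); the sample and the values of `mu0`, `sigma`, and `alpha` below are hypothetical, chosen only for illustration.

```R
# Minimal sketch: one-sided z-test for a mean with known variance.
# H0: mu = mu0 vs. H1: mu > mu0. All values are hypothetical.
x     <- c(5.1, 4.8, 5.4, 5.0, 5.6, 5.2)  # illustrative sample
mu0   <- 5.0    # mean under the null hypothesis
sigma <- 0.3    # known population standard deviation
alpha <- 0.05   # level of significance

# Step 1: the test statistic, the sample mean standardized to N(0, 1)
z <- (mean(x) - mu0) / (sigma / sqrt(length(x)))

# Step 2: the form of the test is to reject H0 when z is large (z > crit)

# Step 3: critical value with area alpha to its right under N(0, 1)
crit <- qnorm(1 - alpha)

# Step 4: conclusion
if (z > crit) "Reject H0 in favor of H1" else "Fail to reject H0"
```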
[[test for a mean]]
[[two-sample test for a difference in population means with known variance]]
[[two-sample test for a difference in population means with unknown, equal variance]]
[[Welch's two-sample t-test]] (for difference in population means with unknown, unequal variance)
[[paired test]]
[[two-sample test for difference in proportions]]
[[test for a proportion]]
[[two-sample test for ratio of variances]]

# t-test

#expand See Week 3 Lesson 3 DS 5003

## Index of tests

one-sample test for a mean
- z test
- t test

two-sample test for a difference of means
- known variance
- unknown, equal variance
- unknown, unequal variance (Welch's)

one-sample test for a proportion
two-sample test for difference in proportions
paired test (points to test for a mean)
one-sample test for variance
two-sample test for a ratio of variances

[[best test]]
[[Neyman-Pearson Lemma]]
[[uniformly most powerful test]]
[[Generalized Likelihood Ratio Test]]
[[Wilks' Theorem]]

# likelihood ratio

- Lots of likelihood ratios in statistics
- Likelihood refers to the same likelihood as in MLE

## critical region

The critical region, which is often also referred to as a [[rejection region]], is the range of values of the test statistic for which we reject the null hypothesis. For example, the area to the right of $z_\alpha$ for a one-tailed test using the standard normal distribution.

## Example: hypothesis test for variance

An ecological services firm is calibrating the spray nozzles on its herbicide applicators. The firm suspects the nozzles have worn out, resulting in higher variability, which could lead to over-application and damage its restoration projects. The firm selected a random sample of 10 nozzles and measured the application rate of herbicide in gallons per acre. Test the null hypothesis that applicator variance is less than or equal to $0.01$ versus the alternative hypothesis that applicator variance is greater than $0.01$ at level $0.05$.

$\begin{align} H_0: \sigma^2 \le 0.01 && H_1: \sigma^2 > 0.01 \end{align}$

**Step 1. Choose a statistic**

The common-sense test statistic for the population variance $\sigma^2$ is the [[sample variance]] $S^2$.

**Step 2. Give the form of the test**

If the sample variance is large, we will reject the null hypothesis. We will reject $H_0$, in favor of $H_1$, if $S^2 > c$.

**Step 3. Find the critical value**

The critical value can be calculated from the probability of committing a [[Type I Error]] in the case that the null hypothesis is true, i.e., $\sigma^2 \le 0.01$. Because the probability of committing a Type I error is a function of $\sigma^2$, we want to find the maximum probability of committing a Type I error across the portion of the parameter space where the null hypothesis is true ($\sigma^2 \le 0.01$).

$\begin{align} \alpha &= P(\text{Type I Error}) \\ &= \max \ P(\text{Reject the null hypothesis; } \sigma^2 \le 0.01) \\ &= \max \ P(S^2 > c; \ \sigma^2 \le 0.01) \end{align}$

> [!NOTE] Notation
> In the notation $P(S^2 > c; \ \sigma^2 \le 0.01)$, the term $\sigma^2 \le 0.01$, which is set off by a semicolon, indicates that we are interested in the probability of rejecting the null hypothesis when the true value of $\sigma^2$ is in the range where the null hypothesis is true (a Type I error). For a composite hypothesis like this one, there is a range of such values, so we are interested in the *maximum* probability over that portion of the parameter space.
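As a quick numerical check of the maximization argument developed below (an illustrative sketch; the cutoff used here is arbitrary, not the one derived in this example), we can use the fact that $P(S^2 > c; \ \sigma^2) = P\big(\chi^2_{n-1} > (n-1)c/\sigma^2\big)$ and evaluate it over a grid of $\sigma^2$ values where the null hypothesis is true:

```R
# Type I error probability as a function of sigma^2 for a fixed cutoff.
# P(S^2 > cutoff; sigma^2) = P(chi-squared_{n-1} > (n-1) * cutoff / sigma^2)
n      <- 10
cutoff <- 0.02                           # arbitrary illustrative cutoff
sigma2 <- c(0.004, 0.006, 0.008, 0.010)  # values where H0 is true
round(1 - pchisq((n - 1) * cutoff / sigma2, df = n - 1), 4)
# the probabilities increase with sigma^2; the maximum is at sigma^2 = 0.01
```

The probability increases toward the boundary, so the maximum Type I error probability over $\sigma^2 \le 0.01$ occurs at $\sigma^2 = 0.01$, which is exactly the substitution made in the derivation below.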
For a simple hypothesis, we would have a single value for our parameter and would not need to worry about maximizing this probability.

To calculate this probability, we'll need to know the distribution of $S^2$. While we don't know the distribution of $S^2$ exactly, we do know that the sample variance multiplied by $(n-1)/\sigma^2$ has the [[chi-squared distribution]]. We can multiply both sides of the inequality by the same term to convert the sample variance to a chi-squared random variable $W$.

$\begin{align} \alpha &= \max \ P(S^2 > c; \ \sigma^2 \le 0.01) \\ &= \max \ P \Big ( \frac{(n-1)S^2}{\sigma^2} > \frac{(n - 1)c}{\sigma^2} ; \ \sigma^2 \le 0.01 \Big ) \\ &= \max \ P \Big ( W > \frac{(n - 1)c}{\sigma^2} ; \ \sigma^2 \le 0.01 \Big ) \end{align}$

If we could write the closed form of the [[cumulative distribution function|cdf]] for the chi-squared distribution, we could use calculus to calculate the [[function maximum]] over all $\sigma^2 \le 0.01$. However, we can reason through this by noting that the term $(n-1)c/\sigma^2$ is decreasing in $\sigma^2$: the larger the value of $\sigma^2$, the smaller the ratio will be. To maximize the probability that the random variable $W$ exceeds a constant, we want that constant to be as small as possible. Thus, we can plug in the upper bound of the range of values for $\sigma^2$ under which the null hypothesis is true, which is $0.01$.

$\begin{align} \alpha &= \max \ P \Big ( W > \frac{(n - 1)c}{\sigma^2} ; \ \sigma^2 \le 0.01 \Big ) \\ &= P \Big ( W > \frac{(10 - 1)c}{0.01} \Big) \\ &= P \Big ( W > \frac{9c}{0.01} \Big) \end{align}$

The probability that $W$, a chi-squared random variable with $n - 1$ degrees of freedom, is greater than $9c/0.01$ equals $0.05$ when we set that value equal to the critical value of the chi-squared distribution with area $0.05$ to the right.

$\frac{9c}{0.01} = \chi^2_{0.05,9}$

With [[R]], we find the critical value for $\chi^2_{0.05,9}$:

```R
qchisq(1 - 0.05, 9)
#> 16.919
```

*Note the function `qchisq` accepts the area to the left as the first parameter, rather than the area to the right.*

Solving for $c$ we get

$c = \chi^2_{0.05,9} \cdot \frac{0.01}{9} = 16.919 \cdot \frac{0.01}{9} = 0.0188$

The final form of our test is to reject $H_0$, in favor of $H_1$, if $S^2 > 0.0188$.

**Step 4. Give a conclusion**

The observed data are 4.089, 3.977, 4.094, 4.090, 4.029, 3.891, 3.934, 3.979, 4.000, 4.170. We can calculate the sample variance as

$S^2 = \frac{\sum_{i=1}^n(x_i - \bar x)^2}{n-1} = 0.0073$

Thus, we fail to reject the null hypothesis, because $S^2 = 0.0073$ does not exceed $c = 0.0188$.

## Using R

This procedure can be completed in R using the `EnvStats` package's `varTest` function. We'll generate a random sample of size $10$ with mean $4$ and variance $\sigma^2 = 0.005$.

```R
install.packages('EnvStats')
library(EnvStats)

set.seed(0)
my_data <- rnorm(10, 4, sqrt(0.005))

varTest(my_data, alternative = "greater", conf.level = 0.95, sigma.squared = 0.01)
```

The results will appear as below. We should not reject the null hypothesis because the [[p-value]] is $0.68$, which is not less than our $\alpha$ of $0.05$.

```
Results of Hypothesis Test
--------------------------

Null Hypothesis:          variance = 0.01

Alternative Hypothesis:   True variance is greater than 0.01

Test Name:                Chi-Squared Test on Variance

Estimated Parameter(s):   variance = 0.007264179

Data:                     my_data

Test Statistic:           Chi-Squared = 6.537761

Test Statistic Parameter: df = 9

P-value:                  0.6851213

95% Confidence Interval:  LCL = 0.003864158
                          UCL = Inf
```
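As a cross-check, the `varTest` results can be reproduced with base R alone (a sketch assuming the same `my_data` generated above):

```R
# Reproduce the chi-squared variance test by hand (uses my_data from above).
n    <- length(my_data)
s2   <- var(my_data)                  # sample variance (approx. 0.00726)
stat <- (n - 1) * s2 / 0.01           # chi-squared test statistic
crit <- qchisq(1 - 0.05, df = n - 1)  # critical value at alpha = 0.05
pval <- 1 - pchisq(stat, df = n - 1)  # area to the right of the statistic

stat > crit  # FALSE: fail to reject H0
pval         # approx. 0.685, matching the varTest output
```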