confidence interval for a difference between means

For a difference between means from large samples drawn from a [[normal distribution]], or whose sample means are approximately normal by the [[Central Limit Theorem]], a confidence interval for the difference between the two means is given by

$\bar X_1 - \bar X_2 \pm z_{\alpha/2} \sqrt{\frac{s^2_1}{n_1} + \frac{s^2_2}{n_2}}$

where $s^2_i$ is the [[sample variance]] of sample $i$, $n_i$ is its size, and $z_{\alpha/2}$ is the [[critical value]] from the standard normal distribution.

## for small samples with equivalent population variances

When one or both of the means are calculated from a small sample, the confidence interval for the difference between the means is calculated using the [[t-distribution]]. When we can assume the samples are from populations with the same variance, we use a [[pooled variance]] in place of the separate [[sample variance|sample variances]], because each small-sample variance on its own is a less reliable estimator of the common population variance.

$\bar X_1 - \bar X_2 \pm t_{\alpha/2, n_1 + n_2 - 2} \sqrt{S^2_p \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$

where $S^2_p = \frac{(n_1 - 1) s^2_1 + (n_2 - 1) s^2_2}{n_1 + n_2 - 2}$ is the pooled variance.

## for small samples without equivalent population variances

Constructing the interval when we cannot assume the samples are from populations with equivalent variance is known as the Behrens-Fisher problem. The most popular solution is to use Welch's approximation ([[Welch's two sample t-test]]).

In [[R]], `t.test` applies Welch's approximation by default:

```R
x <- rnorm(10)  # generate data for x
y <- rnorm(14)  # generate data for y
t.test(x, y, conf.level = 0.90)  # var.equal = FALSE (Welch) is the default
```
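
As a rough check on what `t.test` is doing, the Welch interval can be reproduced by hand. The sketch below is only illustrative: it assumes the usual Welch-Satterthwaite formula for the approximate degrees of freedom (not stated above), and the seed and the names `vx`, `vy`, `se`, `df`, `manual_ci` and `builtin_ci` are made up for the example.

```R
set.seed(1)       # arbitrary seed, only so the sketch is reproducible
x <- rnorm(10)    # simulated data for x
y <- rnorm(14)    # simulated data for y

conf_level <- 0.90
alpha <- 1 - conf_level

# Estimated variance of each sample mean: s_i^2 / n_i
vx <- var(x) / length(x)
vy <- var(y) / length(y)

# Standard error of the difference, with no pooling of variances
se <- sqrt(vx + vy)

# Welch-Satterthwaite approximate degrees of freedom (assumed formula)
df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

# Difference in means plus/minus the t critical value times the standard error
manual_ci <- (mean(x) - mean(y)) + c(-1, 1) * qt(1 - alpha / 2, df) * se

# Interval reported by t.test, stripped of its attributes for comparison
builtin_ci <- as.numeric(t.test(x, y, conf.level = conf_level)$conf.int)

all.equal(manual_ci, builtin_ci)
```

If the equal-variance assumption is reasonable, passing `var.equal = TRUE` to `t.test` should instead give the pooled-variance interval with $n_1 + n_2 - 2$ degrees of freedom.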