- R Statistics Cookbook
- Francisco Juretig
- 2021-06-24 15:54:11
The Fisher-Behrens problem
The original t-test is designed for two Gaussian samples with equal but unknown variance (equality of variances is known as homoscedasticity). When the variances are not the same, the test statistic no longer follows a t distribution with the usual n1 + n2 - 2 degrees of freedom. Consequently, we can't calculate exact p-values and, by extension, we can't reliably test our hypothesis. This is known as the Fisher-Behrens problem (more commonly written as the Behrens-Fisher problem).
It has been found that the t-test (with its usual degrees of freedom) can still be used with moderate departures from the homoscedasticity assumption. But this does not take us very far: the idea that the test is robust to such departures is difficult to operationalize, because the impact depends on the sample sizes, the relative difference between the variances, and so on.
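To make the dependence on sample sizes concrete, the following is a minimal simulation sketch, not a definitive study. The helper name pooled_reject_rate, the seed, the number of simulations, and the sample sizes and standard deviations are all assumptions chosen for illustration. Both groups share the same mean, so every rejection is a false positive.

```r
# Sketch: false-positive rate of the pooled (equal-variance) t-test
# when the two variances differ. All numeric settings are assumptions.
set.seed(1)  # arbitrary seed, only for reproducibility

# Hypothetical helper: simulate n_sims experiments under a true null
# hypothesis and return the fraction rejected at level alpha.
pooled_reject_rate <- function(n1, n2, sd1, sd2, n_sims = 2000, alpha = 0.05) {
  p_values <- replicate(n_sims, {
    x <- rnorm(n1, mean = 0, sd = sd1)
    y <- rnorm(n2, mean = 0, sd = sd2)  # same mean as x: the null is true
    t.test(x, y, var.equal = TRUE)$p.value
  })
  mean(p_values < alpha)
}

# Equal sample sizes: the variances differ, but the groups are balanced
rate_balanced <- pooled_reject_rate(n1 = 25, n2 = 25, sd1 = 4, sd2 = 1)

# Unequal sample sizes: the small group has the large variance
rate_unbalanced <- pooled_reject_rate(n1 = 10, n2 = 40, sd1 = 4, sd2 = 1)

print(c(balanced = rate_balanced, unbalanced = rate_unbalanced))
```

Under these assumed settings, the balanced design should stay close to the nominal 5% level, while the unbalanced design should reject far more often than 5%. Other combinations of sample sizes and variances will behave differently, which is exactly why the robustness claim is hard to turn into a rule.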
If the samples are large enough, we can ignore the problem altogether and get the p-values using a Gaussian distribution (Student's t-distribution converges to a Gaussian distribution as the degrees of freedom go to infinity). If the samples are small, we need a different technique. The preferred one is Welch's t-test, which finds the appropriate degrees of freedom using the so-called Welch-Satterthwaite approximation.
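The two ideas in this paragraph can be checked numerically. The sketch below computes the Welch-Satterthwaite degrees of freedom by hand and compares them with the value reported by t.test, then shows the t quantiles approaching the Gaussian quantile. The simulated data, the seed, and the variable names are assumptions made for illustration.

```r
# Sketch: Welch-Satterthwaite degrees of freedom computed by hand.
set.seed(42)                       # arbitrary seed (assumption)
x <- rnorm(8,  mean = 0, sd = 3)   # small sample, large spread (assumed)
y <- rnorm(15, mean = 0, sd = 1)   # larger sample, small spread (assumed)

# Estimated variance of each sample mean
v1 <- var(x) / length(x)
v2 <- var(y) / length(y)

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (v1 + v2)^2 /
  (v1^2 / (length(x) - 1) + v2^2 / (length(y) - 1))

# t.test stores the degrees of freedom in the "parameter" element
fit <- t.test(x, y, var.equal = FALSE)
print(c(by_hand = df_welch, t_test = unname(fit$parameter)))

# Large-sample argument: t quantiles approach the Gaussian quantile
print(qt(0.975, df = c(5, 30, 1000)))
print(qnorm(0.975))
```

The approximate degrees of freedom are generally not an integer, and they never exceed the n1 + n2 - 2 used by the standard test.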
To use Welch's test, we can simply call the t.test function with the var.equal=FALSE option (this is also the function's default). Even when the variances are the same, it works quite well compared to the standard t-test:
t.test(..., var.equal = FALSE, ...)
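As a runnable illustration of the call above, here is a minimal sketch on simulated data in which the two variances really are equal. The names group_a and group_b, the seed, and the sample sizes, means, and standard deviation are assumptions, not values from the recipe.

```r
# Sketch: Welch's test versus the standard t-test on equal-variance data.
set.seed(7)                                  # arbitrary seed (assumption)
group_a <- rnorm(20, mean = 10, sd = 2)      # hypothetical sample A
group_b <- rnorm(25, mean = 11, sd = 2)      # hypothetical sample B, same sd

welch  <- t.test(group_a, group_b, var.equal = FALSE)  # Welch's t-test
pooled <- t.test(group_a, group_b, var.equal = TRUE)   # standard t-test

# Degrees of freedom used by each test
print(c(welch_df = unname(welch$parameter),
        pooled_df = unname(pooled$parameter)))

# p-values from each test
print(c(welch_p = welch$p.value, pooled_p = pooled$p.value))

# Omitting var.equal gives Welch's test, since FALSE is the default
default_fit <- t.test(group_a, group_b)
```

When the variances are equal, the Welch degrees of freedom tend to land close to the n1 + n2 - 2 of the standard test, so the two p-values are usually similar.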