The Anderson-Darling test is a statistical goodness-of-fit test used to determine whether a data sample is likely drawn from a specific theoretical distribution. Rather than testing a hypothesis about an individual parameter such as the mean, it compares the sample's empirical cumulative distribution function (ECDF) with the cumulative distribution function (CDF) of the hypothesized distribution. The critical values of the test are not universal but depend on the distribution being tested, so a separate set of critical values is used for each distribution family.
Developed by Theodore Wilbur Anderson and Donald Allan Darling in 1952, the test is widely used to check for normality, though it is a common misconception that it applies only to normal distributions. In fact, it can also test goodness-of-fit for distributions like exponential, Weibull, or logistic, as long as the relevant CDF is known.
A key consideration in practice is whether a parametric or a nonparametric analysis is appropriate, which depends on what is known about the population distribution; the Anderson-Darling test helps answer that question by checking a distributional assumption directly. It is generally considered an improvement over the Kolmogorov-Smirnov (K-S) test because it gives more weight to the tails of the distribution, making it more sensitive to departures caused by extreme values. Although computing the Anderson-Darling statistic by hand can be tedious, statistical software packages now report both the test statistic and the critical values needed to interpret the result.
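One way to see where the tail sensitivity comes from is to write both statistics in terms of the ECDF $F_n$ and the hypothesized CDF $F$. The forms below are the standard textbook definitions, with $n$ denoting the sample size:

$$D = \sup_x \left| F_n(x) - F(x) \right|, \qquad A^2 = n \int_{-\infty}^{\infty} \frac{\left( F_n(x) - F(x) \right)^2}{F(x)\,\bigl(1 - F(x)\bigr)} \, dF(x)$$

The weight $1 / [F(x)(1 - F(x))]$ grows without bound as $F(x)$ approaches 0 or 1, so a given gap between $F_n$ and $F$ counts for more in the tails than near the median, whereas the K-S statistic $D$ treats a gap of the same size identically wherever it occurs.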
The distribution of the population from which random samples are drawn is often unknown or difficult to determine.
In such cases, the Anderson-Darling test can help determine whether the samples are drawn from a particular distribution, such as the standard normal or the uniform distribution.
When testing for normality, the null hypothesis states that the data follow a normal distribution, and the alternative hypothesis is that the data do not follow a normal distribution.
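The test statistic $A^2$ is calculated from the sample sorted in ascending order, $x_{(1)} \le \dots \le x_{(n)}$, and the hypothesized CDF $F$ (here the normal CDF, typically with the mean and standard deviation estimated from the sample):

$$A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1) \left[ \ln F\bigl(x_{(i)}\bigr) + \ln\bigl(1 - F\bigl(x_{(n+1-i)}\bigr)\bigr) \right]$$

The statistic is then compared with the critical value tabulated for the normal distribution.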
When the test statistic exceeds the critical value at the chosen significance level, the null hypothesis that the sample is from a normal distribution is rejected; otherwise, it is not rejected.
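The procedure described above can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions, not a reference implementation. The function names `anderson_darling_normal` and `rejects_normality` are hypothetical, and the mean and standard deviation are assumed to be estimated from the sample. The default critical value 0.752 is the commonly tabulated 5% value for the small-sample-adjusted statistic in that case and should be verified against your own reference table.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(sample):
    """Return A^2 for a normality check, estimating mean and std from the sample.

    Implements A^2 = -n - (1/n) * sum_{i=1..n} (2i - 1) *
                     [ln F(z_(i)) + ln(1 - F(z_(n+1-i)))]
    where z_(i) are the sorted standardized values and F is the standard normal CDF.
    """
    n = len(sample)
    mu = mean(sample)
    sigma = stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    std_normal = NormalDist()
    # Standardize and sort so that z[0] <= z[1] <= ... <= z[n - 1].
    z = sorted((x - mu) / sigma for x in sample)
    total = 0.0
    for i in range(1, n + 1):
        log_cdf = math.log(std_normal.cdf(z[i - 1]))
        # For the standard normal, 1 - F(z) equals F(-z); this form is
        # numerically safer in the upper tail.
        log_sf = math.log(std_normal.cdf(-z[n - i]))
        total += (2 * i - 1) * (log_cdf + log_sf)
    return -n - total / n


def rejects_normality(sample, critical_value=0.752):
    """Decision rule: reject the null hypothesis if the adjusted statistic is too large.

    The adjustment factor and the default critical value are assumptions taken
    from commonly published tables for the case where both parameters are estimated.
    """
    n = len(sample)
    a2 = anderson_darling_normal(sample)
    a2_adjusted = a2 * (1 + 0.75 / n + 2.25 / n**2)
    return a2_adjusted > critical_value


# Deterministic illustration: idealized samples built from quantiles.
n = 50
probs = [(i - 0.5) / n for i in range(1, n + 1)]
bell_shaped = [NormalDist().inv_cdf(p) for p in probs]  # normal quantiles
skewed = [-math.log(1 - p) for p in probs]              # exponential quantiles

print(anderson_darling_normal(bell_shaped), rejects_normality(bell_shaped))
print(anderson_darling_normal(skewed), rejects_normality(skewed))
```

The bell-shaped sample yields a small statistic and is not rejected, while the strongly skewed sample is rejected. In practice, statistical packages report the statistic together with critical values at several significance levels, so a hand-written version like this is mainly useful for understanding the computation.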
Data from laboratory experiments, and even from natural observations, are often assumed to be normally distributed.
Applying the Anderson-Darling test to check that assumption helps decide whether a parametric test that assumes normality or a nonparametric alternative is appropriate for the analysis.