Wolfram Language & System Documentation Center

AndersonDarlingTest

AndersonDarlingTest[data]

tests whether data is normally distributed using the Anderson–Darling test.

AndersonDarlingTest[data,dist]

tests whether data is distributed according to dist using the Anderson–Darling test.

AndersonDarlingTest[data,dist,"property"]

returns the value of "property".

Details and Options

AndersonDarlingTest performs the Anderson–Darling goodness-of-fit test with null hypothesis that data was drawn from a population with distribution dist, and alternative hypothesis that it was not.
By default, a probability value or p-value is returned.
A small p-value suggests that it is unlikely that the data came from dist.
The dist can be any symbolic distribution with numeric and symbolic parameters or a dataset.
The data can be univariate {x₁,x₂,…} or multivariate {{x₁,y₁,…},{x₂,y₂,…},…}.
The Anderson–Darling test assumes that the data came from a continuous distribution.
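The Anderson–Darling test effectively uses a test statistic based on $\mathrm{E}\!\left[\frac{(\hat{F}(x)-F(x))^2}{F(x)\,(1-F(x))}\right]$, with the expectation taken over $x$ distributed according to dist, where $\hat{F}$ is the empirical CDF of data and $F$ is the CDF of dist.
For univariate data, the test statistic is given by $-n-\sum_{i=1}^{n}\frac{2i-1}{n}\left(\log F(x_i)+\log\left(1-F(x_{n-i+1})\right)\right)$, where $x_1\le x_2\le\dots\le x_n$ is the sorted data.
For multivariate tests, the sum of the univariate marginal p-values is used and is assumed to follow a UniformSumDistribution under $H_0$.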
AndersonDarlingTest[data,dist,"HypothesisTestData"] returns a HypothesisTestData object htd that can be used to extract additional test results and properties using the form htd["property"].
AndersonDarlingTest[data,dist,"property"] can be used to directly give the value of "property".
Properties related to the reporting of test results include:

	"PValue"	p-value
	"PValueTable"	formatted version of "PValue"
	"ShortTestConclusion"	a short description of the conclusion of a test
	"TestConclusion"	a description of the conclusion of a test
	"TestData"	test statistic and p-value
	"TestDataTable"	formatted version of "TestData"
	"TestStatistic"	test statistic
	"TestStatisticTable"	formatted "TestStatistic"

The following properties are independent of which test is being performed.
Properties related to the data distribution include:

	"FittedDistribution"	fitted distribution of data
	"FittedDistributionParameters"	distribution parameters of data

The following options can be given:

	Method	Automatic	the method to use for computing p-values
	SignificanceLevel	0.05	cutoff for diagnostics and reporting

For a test for goodness of fit, a cutoff α is chosen such that H₀ is rejected only if p < α. The value of α used for the "TestConclusion" and "ShortTestConclusion" properties is controlled by the SignificanceLevel option. By default, α is set to 0.05.
With the setting Method->"MonteCarlo", datasets s_i of the same length as the input are generated under H₀ using the fitted distribution. The EmpiricalDistribution from AndersonDarlingTest[s_i,dist,"TestStatistic"] is then used to estimate the p-value.
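
The Monte Carlo idea described above can be sketched by hand for a fully specified null distribution. This is only an illustration under stated assumptions: the names mcDist, mcData, mcObserved, mcStats and mcPValue are hypothetical, the number of replicates (500) is arbitrary, and the built-in Method->"MonteCarlo" is not claimed to follow exactly these steps.

```wolfram
SeedRandom[1];
mcDist = NormalDistribution[0, 1];          (* assumed null distribution *)
mcData = RandomVariate[mcDist, 50];         (* illustrative input data *)

(* statistic for the observed data *)
mcObserved = AndersonDarlingTest[mcData, mcDist, "TestStatistic"];

(* statistics for datasets of the same length generated under the null *)
mcStats = Table[
   AndersonDarlingTest[RandomVariate[mcDist, Length[mcData]], mcDist, "TestStatistic"],
   {500}];

(* estimated p-value: fraction of simulated statistics exceeding the observed one *)
mcPValue = N[SurvivalFunction[EmpiricalDistribution[mcStats], mcObserved]]
```

Increasing the number of replicates reduces the sampling noise in the estimate at the cost of more computation.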

Examples


Basic Examples (3)

Perform an Anderson–Darling test for normality:

Wolfram Language code: data = RandomVariate[NormalDistribution[], 10^3];

Wolfram Language code: AndersonDarlingTest[data]

Wolfram Language code: Show[SmoothHistogram[data, PlotStyle -> {Dashed, Red}], Plot[PDF[NormalDistribution[], x], {x, -4, 4}]]

Test the fit of some data to a particular distribution:

Wolfram Language code: data = RandomVariate[PowerDistribution[1, 2], 10^4];

Wolfram Language code: AndersonDarlingTest[data, PowerDistribution[1, 2]]

Wolfram Language code:

Show[Histogram[data, Automatic, "PDF"], Plot[PDF[PowerDistribution[1, 2], x], {x, 0, 1}, PlotStyle -> Thick, PlotRange -> All]]

Compare the distributions of two datasets:

Wolfram Language code: data1 = RandomVariate[NormalDistribution[], 100];

Wolfram Language code: data2 = RandomVariate[NormalDistribution[.1, 1], 150];

Wolfram Language code: AndersonDarlingTest[data1, data2]

Wolfram Language code: SmoothHistogram[{data1, data2}]

Scope (9)

Testing (6)

Perform an Anderson–Darling test for normality:

Wolfram Language code:

data1 = RandomVariate[NormalDistribution[], 10^4];
data2 = RandomVariate[StudentTDistribution[3], 10^4];

The p-value for the normal data is large compared to the p-value for the non-normal data:

Wolfram Language code: AndersonDarlingTest[data1]

Wolfram Language code: AndersonDarlingTest[data2]

Test the goodness of fit for a particular distribution:

Wolfram Language code:

data1 = RandomVariate[NormalDistribution[], 10^3];
data2 = RandomVariate[CauchyDistribution[0, 1], 10^3];

Wolfram Language code: AndersonDarlingTest[data1, CauchyDistribution[0, 1]]

Wolfram Language code: AndersonDarlingTest[data2, CauchyDistribution[0, 1]]

Compare the distributions of two datasets:

Wolfram Language code:

data1 = RandomVariate[NormalDistribution[], 10^3];
data2 = RandomVariate[NormalDistribution[], 10^3];

Wolfram Language code: AndersonDarlingTest[data1, data2]

Wolfram Language code: data3 = RandomVariate[NormalDistribution[0, 1.25], 10^3];

Wolfram Language code: AndersonDarlingTest[data1, data3]

Test for multivariate normality:

Wolfram Language code:

data1 = RandomVariate[BinormalDistribution[.5], 10^3];
data2 = RandomVariate[LaplaceDistribution[1, 2], {10^3, 2}];

Wolfram Language code: AndersonDarlingTest[data1]

Wolfram Language code: AndersonDarlingTest[data2]

Test for goodness of fit to any multivariate distribution:

Wolfram Language code:

data1 = RandomVariate[BinormalDistribution[.5], 10^3];
data2 = RandomVariate[𝒹 = LaplaceDistribution[1, 2], {10^3, 2}];

Wolfram Language code: 𝒟 = ProductDistribution[𝒹, 𝒹];

Wolfram Language code: AndersonDarlingTest[data1, 𝒟]

Wolfram Language code: AndersonDarlingTest[data2, 𝒟]

Create a HypothesisTestData object for repeated property extraction:

Wolfram Language code: data = RandomVariate[NormalDistribution[], 10^5];

Wolfram Language code: ℋ = AndersonDarlingTest[data, Automatic, "HypothesisTestData"]

The properties available for extraction:

Wolfram Language code: ℋ["Properties"]

Reporting (3)

Tabulate the results of the Anderson–Darling test:

Wolfram Language code: data = RandomVariate[NormalDistribution[], 100];

Wolfram Language code: ℋ = AndersonDarlingTest[data, Automatic, "HypothesisTestData"];

The full test table:

Wolfram Language code: ℋ["TestDataTable"]

A p-value table:

Wolfram Language code: ℋ["PValueTable"]

The test statistic:

Wolfram Language code: ℋ["TestStatisticTable"]

Retrieve the entries from an Anderson–Darling test table for custom reporting:

Wolfram Language code:

data1 = RandomVariate[NormalDistribution[], 100];
data2 = RandomVariate[NormalDistribution[], 100];

Wolfram Language code: ℋ1 = AndersonDarlingTest[data1, Automatic, "TestData"]

Wolfram Language code: ℋ2 = AndersonDarlingTest[data2, Automatic, "TestData"]

Wolfram Language code:

BarChart[{Labeled[ℋ1, "Set 1"], Labeled[ℋ2, "Set 2"]}, ChartLabels -> {Superscript["A", 2], "p-value"}]

Report test conclusions using "ShortTestConclusion" and "TestConclusion":

Wolfram Language code: data = BlockRandom[SeedRandom[1];RandomVariate[ParetoDistribution[1.05, 2], 100]];

Wolfram Language code: ℋ = AndersonDarlingTest[data, ParetoDistribution[1, 2], "HypothesisTestData"];

Wolfram Language code: ℋ["ShortTestConclusion"]

Wolfram Language code: ℋ["TestConclusion"]//TraditionalForm

The conclusion may differ at a different significance level:

Wolfram Language code: ℋ = AndersonDarlingTest[data, ParetoDistribution[1, 2], "HypothesisTestData", SignificanceLevel -> .001];

Wolfram Language code: ℋ["ShortTestConclusion"]

Wolfram Language code: ℋ["TestConclusion"]//TraditionalForm

Options (4)

Method (3)

Use Monte Carlo-based methods or a computation formula:

Wolfram Language code: data = RandomVariate[NormalDistribution[], 100];

Wolfram Language code: AndersonDarlingTest[data, NormalDistribution[], Method -> "MonteCarlo"]

Wolfram Language code: AndersonDarlingTest[data, NormalDistribution[], Method -> Automatic]

Set the number of samples to use for Monte Carlo-based methods:

Wolfram Language code: data = RandomVariate[NormalDistribution[], 100];

Wolfram Language code:

pts = Table[{i, AndersonDarlingTest[data, NormalDistribution[], Method -> {"MonteCarlo", "MonteCarloSamples" -> i}]}, {i, Range[5, 250, 15]}];

The Monte Carlo estimate converges to the true p-value with increasing samples:

Wolfram Language code: pval = AndersonDarlingTest[data, NormalDistribution[]];

Wolfram Language code:

Show[ListLinePlot[pts, PlotRange -> {0, 1}, FrameLabel -> {"Samples", "P-Value"}, Frame -> True, AxesOrigin -> {0, 0}], Graphics[{Dashed, Line[{{0, pval}, {250, pval}}]}]]

Set the random seed used in Monte Carlo-based methods:

Wolfram Language code: data = RandomVariate[NormalDistribution[], 100];

Wolfram Language code:

pts = Table[{i, AndersonDarlingTest[data, NormalDistribution[], Method -> {"MonteCarlo", "RandomSeed" -> i, "MonteCarloSamples" -> 50}]}, {i, Range[1, 10]}];

The seed affects the state of the generator and has some effect on the resulting p-value:

Wolfram Language code: pval = AndersonDarlingTest[data, NormalDistribution[]];

Wolfram Language code:

Show[ListLinePlot[pts, PlotRange -> {Min[pts[[All, 2]]], Max[pts[[All, 2]]]}, FrameLabel -> {"Seed", "P-Value"}, Frame -> True, AxesOrigin -> {0, 0}], Graphics[{Dashed, Line[{{0, pval}, {100, pval}}]}]]

SignificanceLevel (1)

Set the significance level used for "TestConclusion" and "ShortTestConclusion":

Wolfram Language code: data = BlockRandom[SeedRandom[1];RandomVariate[NormalDistribution[], 100]];

Wolfram Language code: AndersonDarlingTest[data, NormalDistribution[0, 1.5], "ShortTestConclusion", SignificanceLevel -> .05]

Wolfram Language code: AndersonDarlingTest[data, NormalDistribution[0, 1.5], "ShortTestConclusion", SignificanceLevel -> .01]

By default, 0.05 is used:

Wolfram Language code: AndersonDarlingTest[data, NormalDistribution[0, 1.5], "TestConclusion"]//TraditionalForm

Applications (4)

It can be shown that a GammaDistribution[1,1/λ] is equivalent to an ExponentialDistribution[λ]. This conclusion is supported by simulation:

Wolfram Language code: λvect = {2, 4, 6, 8, 10};

Wolfram Language code: data = Table[RandomVariate[GammaDistribution[1, (1/λ)], {1000, 100}], {λ, λvect}];

Perform the Anderson–Darling test, grouping each dataset with its expected value:

Wolfram Language code: p = Table[(AndersonDarlingTest[#1, ExponentialDistribution[λvect[[i]]]]&) /@ data[[i]], {i, Length[λvect]}];

The resulting p-value distributions are approximately uniform, supporting the claim:

Wolfram Language code: And@@(AndersonDarlingTest[#, UniformDistribution[], "PValue"] > 0.05& /@ p)

A power curve for the Anderson–Darling test:

Wolfram Language code: data = Table[RandomVariate[UniformDistribution[{-4, 4}], {500, i}], {i, n = {5, 7, 10, 15, 20, 25, 30}}];

Wolfram Language code: ℋ = Table[AndersonDarlingTest[data[[i, j]], NormalDistribution[]], {i, Length[data]}, {j, Length[data[[i]]]}];

Wolfram Language code: pC = Interpolation[Transpose[{n, Table[Probability[x ≤ 0.05, x \[Distributed] i], {i, ℋ}]}], InterpolationOrder -> 1];

Visualize the approximate power curve:

Wolfram Language code: Plot[pC[x], {x, 5, 30}, PlotRange -> {.6, 1}, Ticks -> {n, Automatic}, AxesOrigin -> {0, 0.6}]

Estimate the power of the Anderson–Darling test when the underlying distribution is a UniformDistribution[{-4,4}], the test size is 0.05, and the sample size is 6:

Wolfram Language code: pC[6.]
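
As a cross-check on the interpolated value, the power at sample size 6 can also be estimated directly by simulation. The following is a minimal sketch, assuming 500 replicates as in the power curve above; the names powerSamples, powerPValues and powerEstimate are hypothetical.

```wolfram
SeedRandom[1];
(* 500 samples of size 6 from the alternative distribution *)
powerSamples = RandomVariate[UniformDistribution[{-4, 4}], {500, 6}];

(* p-values for the test against the standard normal null *)
powerPValues = AndersonDarlingTest[#, NormalDistribution[]] & /@ powerSamples;

(* estimated power: fraction of samples rejected at test size 0.05 *)
powerEstimate = N[Mean[Boole[# <= 0.05] & /@ powerPValues]]
```

The result is subject to simulation error, so it should agree with pC[6.] only approximately.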

A collection of measurements were taken on 50 members from each of three iris species. It has been observed that the species setosa is easy to identify but that the remaining two species, versicolor and virginica, are often confused:

Wolfram Language code: data = ExampleData[{"Statistics", "FisherIris"}];

Wolfram Language code: petalLen = data[[All, 3]];

Wolfram Language code: species = data[[All, -1]];

The distributions of petal lengths for each species:

Wolfram Language code: SmoothHistogram[Pick[petalLen, species, #], PlotLabel -> #, Filling -> Axis]& /@ DeleteDuplicates[species]

The distributions are equivalent for versicolor and virginica, which are very different from setosa:

Wolfram Language code: AndersonDarlingTest[Pick[petalLen, species, "versicolor"], Pick[petalLen, species, "virginica"]]

Wolfram Language code: AndersonDarlingTest[Pick[petalLen, species, "setosa"], Pick[petalLen, species, "virginica" | "versicolor"]]

Assume the following petal length measures are known for the populations:

Wolfram Language code: Subscript[μ, s] = 1.5;Subscript[μ, vv] = 4.9;Subscript[σ, s] = .2;Subscript[σ, vv] = .8;

Wolfram Language code:

𝒟 = MixtureDistribution[{50 / 150, 100 / 150}, {NormalDistribution[Subscript[μ, s], Subscript[σ, s]], NormalDistribution[Subscript[μ, vv], Subscript[σ, vv]]}];

Wolfram Language code: 𝒟data = SmoothKernelDistribution[petalLen, {"Adaptive", Automatic, .5}];

Wolfram Language code: Plot[{PDF[𝒟, x], PDF[𝒟data, x]}, {x, 0, 8}, PlotLegends -> {"𝒟", "𝒟data"}]

The normal mixture appears to fit the petal length distribution well:

Wolfram Language code: AndersonDarlingTest[petalLen, 𝒟, "TestDataTable"]

Estimate distribution parameters by minimizing the Anderson–Darling test statistic:

Wolfram Language code: data = RandomVariate[StudentTDistribution[6.4], 500];

Wolfram Language code: adDistance[df_Real] := AndersonDarlingTest[data, StudentTDistribution[df], "TestStatistic"]

Wolfram Language code: Plot[adDistance[ν], {ν, 1, 10}]

Wolfram Language code: νMinAD = NArgMin[{adDistance[ν], ν > 0}, ν]

Compare with the MLE value:

Wolfram Language code: νMLE = ν /. FindDistributionParameters[data, StudentTDistribution[ν]]

Compare the Anderson–Darling test statistics for these two parameter estimates:

Wolfram Language code: {adDistance[νMinAD], adDistance[νMLE]}

Properties & Relations (9)

By default, univariate data is compared to a NormalDistribution:

Wolfram Language code: data = RandomVariate[NormalDistribution[2, 3], 10^4];

Wolfram Language code: ℋ = AndersonDarlingTest[data, Automatic, "HypothesisTestData"];

Wolfram Language code: ℋ["TestDataTable"]

The parameters have been estimated from the data:

Wolfram Language code: ℋ["FittedDistribution"]

Multivariate data is compared to a MultinormalDistribution by default:

Wolfram Language code: data = RandomVariate[MultinormalDistribution[{1, 2, 3}, IdentityMatrix[3]], 1000];

Wolfram Language code: ℋ = AndersonDarlingTest[data, Automatic, "HypothesisTestData"];

Wolfram Language code: ℋ["TestDataTable"]

Wolfram Language code: ℋ["FittedDistribution"]//TraditionalForm

The parameters of the test distribution are estimated from the data if not specified:

Wolfram Language code: data = RandomVariate[NormalDistribution[1, 2], 1000];

Wolfram Language code: AndersonDarlingTest[data, NormalDistribution[μ, σ], "FittedDistribution"]

Specified parameters are not estimated:

Wolfram Language code: AndersonDarlingTest[data, NormalDistribution[μ, 2], "FittedDistribution"]

Wolfram Language code: AndersonDarlingTest[data, NormalDistribution[1, 2], "FittedDistribution"]

Maximum likelihood estimates are used for unspecified parameters of the test distribution:

Wolfram Language code: data = RandomVariate[ExponentialDistribution[3], 10^3];

Wolfram Language code: ℋ = AndersonDarlingTest[data, ExponentialDistribution[λ], "FittedDistribution"]

Wolfram Language code: EstimatedDistribution[data, ExponentialDistribution[λ]]

If the parameters are unknown, AndersonDarlingTest applies a correction when possible:

Wolfram Language code: data = RandomVariate[NormalDistribution[3, 4], 10^4];

Wolfram Language code: est = EstimatedDistribution[data, NormalDistribution[μ, σ]]

The parameters are estimated but no correction is applied:

Wolfram Language code: AndersonDarlingTest[data, est]

Wolfram Language code: ℋ = AndersonDarlingTest[data, NormalDistribution[μ, σ], "HypothesisTestData"];

The fitted distribution is the same as before and the p-value is corrected:

Wolfram Language code: ℋ["FittedDistribution"]

Wolfram Language code: ℋ["PValue"]
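
The effect of the correction can be explored by simulation: when the null hypothesis holds, p-values from a well-calibrated test should be roughly uniform on [0, 1]. The sketch below compares the two ways of calling the test shown above on repeated samples. The names corrSims, uncorrectedP and correctedP are hypothetical, the sample size and replicate count are arbitrary, and no particular outcome is asserted here.

```wolfram
SeedRandom[1];
(* 500 samples of size 50 drawn under the null family *)
corrSims = RandomVariate[NormalDistribution[3, 4], {500, 50}];

(* parameters fitted first, then treated as known: no correction applied *)
uncorrectedP = AndersonDarlingTest[#,
     EstimatedDistribution[#, NormalDistribution[μ, σ]]] & /@ corrSims;

(* symbolic parameters passed to the test: correction applied when possible *)
correctedP = AndersonDarlingTest[#, NormalDistribution[μ, σ]] & /@ corrSims;

(* compare the two p-value distributions on a common binning *)
Histogram[{uncorrectedP, correctedP}, {0, 1, 0.1}, "Probability",
 ChartLegends -> {"uncorrected", "corrected"}]
```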

Independent marginal densities are assumed in tests for multivariate goodness of fit:

Wolfram Language code: data = RandomVariate[MultinormalDistribution[{0, 0}, {{0.118, 0.252}, {0.252, 0.665}}], 100];

Wolfram Language code: AndersonDarlingTest[data, MultinormalDistribution[{0, 0}, {{0.118, 0.252}, {0.252, 0.665}}], "TestStatistic"]

The test statistic is identical when independence is assumed:

Wolfram Language code: AndersonDarlingTest[data, MultinormalDistribution[{0, 0}, {{0.118, 0}, {0, 0.665}}], "TestStatistic"]
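
The combination rule stated in Details and Options for multivariate tests (the sum of the univariate marginal p-values, referred to a UniformSumDistribution) can be sketched by hand. This is an illustration only: the names mvDist, mvData, marginalP and combinedP are hypothetical, and taking the lower tail of the sum is an assumption about how the combined p-value is formed, not a statement of the built-in implementation.

```wolfram
SeedRandom[1];
mvDist = MultinormalDistribution[{0, 0}, {{1, 0.5}, {0.5, 1}}];
mvData = RandomVariate[mvDist, 200];

(* univariate Anderson–Darling p-value for each marginal *)
marginalP = Table[
   AndersonDarlingTest[mvData[[All, i]], MarginalDistribution[mvDist, i]],
   {i, 2}];

(* assumption: small sums of p-values count as evidence against the null *)
combinedP = CDF[UniformSumDistribution[2], Total[marginalP]]
```

Because only the marginals enter this calculation, the off-diagonal covariance terms have no influence on the result, which matches the behavior shown in the example above.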

The Anderson–Darling test statistic:

Wolfram Language code: n = 100;

Wolfram Language code: data = Sort@RandomVariate[StudentTDistribution[3], n];

Wolfram Language code: F[x_] := CDF[NormalDistribution[], x];

Wolfram Language code: -n - Sum[(2 k - 1)/n (Log[1 - F[data[[n - k + 1]]]] + Log[F[data[[k]]]]), {k, 1, n}]

Wolfram Language code: AndersonDarlingTest[data, NormalDistribution[], "TestStatistic"]

The Anderson–Darling statistic can be defined using NExpectation:

Wolfram Language code:

n = 10;
h0 = NormalDistribution[1, 2];
data = RandomVariate[h0, n];

Wolfram Language code:

f[x_] := CDF[h0, x]
fHat[x_] := CDF[EmpiricalDistribution[data], x]

Wolfram Language code: n NExpectation[(fHat[t] - f[t])^2/(f[t] (1 - f[t])), t \[Distributed] h0]

Wolfram Language code: AndersonDarlingTest[data, h0, "TestStatistic"]

The Anderson–Darling test works on the values only when the input is a TimeSeries:

Wolfram Language code:

ts = TemporalData[TimeSeries, {{{0., -0.11029234344648474, 0.0345779635879646, 0.1743666051306928, 
    0.16967400225598006, 0.1411114639333355, 0.23851991945192513, 0.26921448263698, 
    0.24027115738328042, 0.17492509184799315, 0.2781622872829057, 0. ... 276577, -0.45856201940681207, -0.4228641161946517, 
    -0.21228762599990375}}, {{0, 1., 0.01}}, 1, {"Continuous", 1}, {"Continuous", 1}, 1, 
  {ValueDimensions -> 1, ResamplingMethod -> {"Interpolation", InterpolationOrder -> 1}}}, False, 
 10.1];

Wolfram Language code: AndersonDarlingTest[ts]

Wolfram Language code: AndersonDarlingTest[ts["Values"]]

Possible Issues (2)

The Anderson–Darling test is not intended for discrete distributions:

Wolfram Language code: data = RandomVariate[PoissonDistribution[30], 35];

Wolfram Language code: AndersonDarlingTest[data, PoissonDistribution[30]]

The continuity correction typically does a good job of preserving the size of the test:

Wolfram Language code: sim = RandomVariate[PoissonDistribution[30], {500, 35}];

Wolfram Language code: p = Quiet[AndersonDarlingTest[#, PoissonDistribution[30]]]& /@ sim;

Wolfram Language code:

Show[ListLinePlot[Table[{α, Probability[pv ≤ α, pv \[Distributed] p]}, {α, .01, 1, .01}]], Plot[x, {x, 0, 1}, PlotStyle -> {Gray, Dashed}]]

This may not be the case in some situations:

Wolfram Language code: sim = RandomVariate[DiscreteUniformDistribution[{1, 3}], {500, 35}];

Wolfram Language code: p = Quiet[AndersonDarlingTest[#, DiscreteUniformDistribution[{1, 3}]]]& /@ sim;

Wolfram Language code:

Show[ListLinePlot[Table[{α, Probability[pv ≤ α, pv \[Distributed] p]}, {α, .01, 1, .01}]], Plot[x, {x, 0, 1}, PlotStyle -> {Gray, Dashed}]]

Use Monte Carlo methods or PearsonChiSquareTest in these cases:

Wolfram Language code: AndersonDarlingTest[sim[[1]], DiscreteUniformDistribution[{1, 3}], Method -> "MonteCarlo"]

Wolfram Language code: PearsonChiSquareTest[sim[[1]], DiscreteUniformDistribution[{1, 3}]]

The Anderson–Darling test is not valid for some distributions when parameters have been estimated from the data:

Wolfram Language code: data = RandomVariate[BetaDistribution[1, 2], 100];

Wolfram Language code: AndersonDarlingTest[data, BetaDistribution[1, b]]

Provide parameter values if they are known:

Wolfram Language code: AndersonDarlingTest[data, BetaDistribution[1, 2]]

Alternatively, use Monte Carlo methods to approximate the p-value:

Wolfram Language code: AndersonDarlingTest[data, BetaDistribution[1, b], Method -> "MonteCarlo"]
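
When a parameter must be estimated, one common way to approximate a p-value is a parametric bootstrap in which the parameter is re-estimated for every simulated dataset. The sketch below illustrates that idea; it is not claimed to reproduce what Method -> "MonteCarlo" does internally. The names bootData, bootFamily, bootStatistic, bootFitted, bootObserved, bootStats and bootPValue are hypothetical, and the replicate count (200) is arbitrary.

```wolfram
SeedRandom[1];
bootData = RandomVariate[BetaDistribution[1, 2], 100];
bootFamily = BetaDistribution[1, b];   (* b is the unknown parameter *)

(* statistic of a sample against its own fitted member of the family *)
bootStatistic[sample_] := AndersonDarlingTest[sample,
   EstimatedDistribution[sample, bootFamily], "TestStatistic"];

bootFitted = EstimatedDistribution[bootData, bootFamily];
bootObserved = bootStatistic[bootData];

(* simulate from the fitted distribution and refit each time *)
bootStats = Table[
   bootStatistic[RandomVariate[bootFitted, Length[bootData]]], {200}];

(* fraction of simulated statistics at least as large as the observed one *)
bootPValue = N[Mean[Boole[# >= bootObserved] & /@ bootStats]]
```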

Neat Examples (1)

Compute the statistic when the null hypothesis is true:

Wolfram Language code: data = RandomVariate[NormalDistribution[], {2500, 100}];

Wolfram Language code: T1 = AndersonDarlingTest[#, NormalDistribution[], "TestStatistic"]& /@ data;

The test statistic given a particular alternative:

Wolfram Language code: T2 = AndersonDarlingTest[#, NormalDistribution[1, 2], "TestStatistic"]& /@ data;

Compare the distributions of the test statistics:

Wolfram Language code:

SmoothHistogram[{T1, T2}, Filling -> Axis, PlotLegends -> {"H₀ is True", "H₀ is False"}, PlotStyle -> Thick]
