Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means by examining the variances of the samples taken. ANOVA allows one to determine whether the differences between the samples are simply due to random error (sampling error) or whether there are systematic treatment effects that cause the mean in one group to differ from the mean in another.
Most of the time ANOVA is used to compare three or more means; however, when the means of two samples are compared using ANOVA, the result is equivalent to a t-test for independent samples.
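The two-sample equivalence can be checked directly: with two groups, one-way ANOVA and the independent-samples t-test give the same p-value, and the F statistic equals the square of the t statistic. A minimal sketch using scipy (the data are made up for illustration):

```python
from scipy import stats

group_a = [23.1, 25.4, 24.8, 26.0, 24.3]
group_b = [27.2, 26.5, 28.1, 27.8, 26.9]

# One-way ANOVA on the two groups
f_stat, p_anova = stats.f_oneway(group_a, group_b)

# Independent-samples t-test on the same data
t_stat, p_ttest = stats.ttest_ind(group_a, group_b)

# F = t**2 and the p-values coincide
print(f_stat, t_stat**2)
print(p_anova, p_ttest)
```

Running this shows the two procedures agree exactly, which is why the t-test is often described as a special case of ANOVA.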
ANOVA is based on comparing the variation between the data samples to the variation within each particular sample. If the between-samples variation is much larger than the within-sample variation, this suggests that the population means are not all equal. If the between and within variations are of approximately the same size, there is no evidence of a significant difference between the sample means.
Assumptions of ANOVA:
(i) All populations involved follow a normal distribution.
(ii) All populations have the same variance (or standard deviation).
(iii) The samples are randomly selected and independent of one another.
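In practice these assumptions can be checked before running ANOVA. A rough sketch, assuming scipy is available: the Shapiro-Wilk test probes normality within each group, and Levene's test probes equality of variances (the data below are illustrative only):

```python
from scipy import stats

groups = [
    [23.1, 25.4, 24.8, 26.0, 24.3],
    [27.2, 26.5, 28.1, 27.8, 26.9],
    [22.0, 23.5, 22.8, 24.1, 23.2],
]

# Assumption (i): Shapiro-Wilk test of normality, one group at a time.
# A p-value above 0.05 gives no evidence against normality.
for i, g in enumerate(groups):
    stat, p_norm = stats.shapiro(g)
    print(f"group {i}: Shapiro-Wilk p = {p_norm:.3f}")

# Assumption (ii): Levene's test of equal variances across all groups.
stat, p = stats.levene(*groups)
print(f"Levene p = {p:.3f}")
```

Assumption (iii) cannot be tested from the data alone; it depends on how the samples were collected.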
Since ANOVA assumes the populations involved follow a normal distribution, ANOVA falls into a category of hypothesis tests known as parametric tests. If the populations involved do not follow a normal distribution, an ANOVA test cannot be used to examine the equality of the sample means. Instead, one would have to use a non-parametric test (or distribution-free test), a more general form of hypothesis testing that does not rely on distributional assumptions.
Whereas the t-test is appropriate for testing the difference between the means of two groups at a time (e.g., boys and girls), ANOVA is the test for multiple group comparisons. Variance is an important statistical measure, defined as the mean of the squared deviations from the mean of a given series of data. It is a frequently used measure of variation. Its square root is known as the standard deviation: standard deviation = √variance.
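The definition above translates directly into code. A minimal sketch (population form, i.e. dividing by n), with made-up numbers:

```python
import numpy as np

x = np.array([4.0, 7.0, 6.0, 5.0, 8.0])

# Variance: mean of squared deviations from the mean
variance = np.mean((x - x.mean()) ** 2)

# Standard deviation: square root of the variance
std_dev = np.sqrt(variance)

print(variance)  # 2.0
print(std_dev)   # sqrt(2) ~ 1.414

# These match NumPy's built-in population variance and std
assert np.isclose(variance, np.var(x))
assert np.isclose(std_dev, np.std(x))
```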
ANOVA is essentially a procedure for
testing the difference among different groups of data for homogeneity. The
essence of ANOVA is that the total amount of variation in a set of data is
broken down into two types:
- The amount which can be attributed to chance.
- The amount which can be attributed to specified causes.
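This decomposition is an exact identity: the total sum of squares splits into a between-groups part (specified causes) and a within-groups part (chance). A sketch with illustrative data:

```python
import numpy as np

groups = [np.array([23.1, 25.4, 24.8]),
          np.array([27.2, 26.5, 28.1]),
          np.array([22.0, 23.5, 22.8])]

all_data = np.concatenate(groups)
grand_mean = all_data.mean()

# Total variation: squared deviations of every value from the grand mean
ss_total = ((all_data - grand_mean) ** 2).sum()

# Variation attributable to specified causes: group means vs grand mean
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

# Variation attributable to chance: values vs their own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# The two components add up exactly to the total
assert np.isclose(ss_total, ss_between + ss_within)
```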
ANOVA consists of splitting the variance for analytical purposes. Hence, it is a method of decomposing the variance to which a response is subject into components corresponding to various sources of variation. Through the ANOVA technique one can, in general, investigate any number of factors hypothesized to influence the dependent variable. One may also investigate the differences among the various categories within each of these factors, which may have a large number of possible values.
In terms of variation within a given population, it is assumed that the values X_ij differ from the mean of that population only because of random effects, i.e., there are influences on X_ij that are unexplainable. In examining differences between populations, we assume that the difference between the mean of the jth population and the grand mean is attributable to what is called a 'specific factor', or what is technically described as a treatment effect. In short, we have to make two estimates of the population variance:
- one based on the between-samples variance
- one based on the within-samples variance.
The two estimates of population variance are compared using the F-test:
F = (estimate based on between-samples variance) / (estimate based on within-samples variance)
This value of F is then compared to the F-limit for the given degrees of freedom. If the F value we work out equals or exceeds the F-limit value, we may conclude that there are significant differences between the sample means.
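The whole procedure can be sketched end to end: compute the two variance estimates by hand, form F, and compare it with the F-limit at the 5% level taken from scipy's F distribution. The data are illustrative only:

```python
import numpy as np
from scipy import stats

groups = [np.array([23.1, 25.4, 24.8, 26.0]),
          np.array([27.2, 26.5, 28.1, 27.8]),
          np.array([22.0, 23.5, 22.8, 24.1])]

k = len(groups)                     # number of groups
n = sum(len(g) for g in groups)     # total number of observations
grand_mean = np.concatenate(groups).mean()

# Estimate based on between-samples variance (k - 1 degrees of freedom)
ms_between = sum(len(g) * (g.mean() - grand_mean) ** 2
                 for g in groups) / (k - 1)

# Estimate based on within-samples variance (n - k degrees of freedom)
ms_within = sum(((g - g.mean()) ** 2).sum() for g in groups) / (n - k)

f_value = ms_between / ms_within

# F-limit at the 5% significance level for (k - 1, n - k) df
f_limit = stats.f.ppf(0.95, dfn=k - 1, dfd=n - k)

print(f_value >= f_limit)  # True here: reject equality of the means

# Cross-check against scipy's built-in one-way ANOVA
f_scipy, p = stats.f_oneway(*groups)
assert np.isclose(f_value, f_scipy)
```

The hand computation agrees with `scipy.stats.f_oneway`, which performs the same decomposition internally.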