One-way ANOVA
Analysis of variance (ANOVA) is a collection of statistical models used to analyze the differences between group means, developed by R.A. Fisher (Fisher, 1925). In ANOVA, the observed variance in a particular variable, usually an outcome variable, is partitioned into components attributable to different sources of variation: typically the between-group variation and the within-group variation. Put simply, ANOVA provides a statistical test of whether the means of several groups are equal, and therefore generalizes the t-test to more than two groups.
One-way ANOVA
One-way ANOVA typically evaluates whether there is a difference in the means of three or more groups, although it also works for a two-group analysis. In other words, it investigates the effect of a grouping variable on the outcome variable.
In the ACTIVE data, we have 4 groups: one control group and three training groups (memory, reasoning, and speed training). As an example, we test whether the four groups differ in memory performance as measured by the Hopkins Verbal Learning Test, denoted by the `hvltt2` variable in the data set. One-way ANOVA can be used for this purpose.
Specifically for this example, the null and alternative hypotheses are
\[ H_{0}:\:\mu_{1}=\mu_{2}=\mu_{3}=\mu_{4}\]
\[H_{a}:\;H_{0}\,\mbox{is not true or at least two groups have different means.}\]
For the hypothesis testing, an F-test is used. The basic idea is to decompose the total variation of the outcome variable into the variation due to differences between groups and the variation within groups. The variation is measured by sums of squares. For one-way ANOVA, we have \[ SST = SSB + SSW \]
- SST is the total sum of squares.
- SSB is the between-group sum of squares.
- SSW is the within-group sum of squares.
Then the F-statistic is \[ F=\frac{SSB/df_B}{SSW/df_W} = \frac{MSB}{MSW}\]
- $df_B$ is the between-group degrees of freedom, which is equal to the number of groups - 1. For the ACTIVE example, it is $4-1=3$.
- $df_W$ is the within-group degrees of freedom, which is equal to the total sample size - the number of groups. Since the sample size is 1575, $df_W = 1575-4 = 1571$.
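To make the decomposition concrete, the sketch below computes SSB, SSW, and the F-statistic by hand for a small artificial data set (the group values are made up for illustration) and cross-checks the result against `scipy.stats.f_oneway`.

```python
import numpy as np
from scipy import stats

# Artificial data: three small groups (values made up for illustration)
groups = [np.array([28.0, 24.0, 26.0, 30.0]),
          np.array([22.0, 25.0, 21.0, 24.0]),
          np.array([27.0, 23.0, 25.0, 29.0])]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()

# Between-group sum of squares: n_j * (group mean - grand mean)^2
ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: deviations around each group's own mean
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)
# Total sum of squares; equals SSB + SSW
sst = ((all_values - grand_mean) ** 2).sum()

df_b = len(groups) - 1                # number of groups - 1
df_w = len(all_values) - len(groups)  # total sample size - number of groups
f_stat = (ssb / df_b) / (ssw / df_w)
p_value = stats.f.sf(f_stat, df_b, df_w)

# Cross-check against scipy's built-in one-way ANOVA
f_ref, p_ref = stats.f_oneway(*groups)
print(round(f_stat, 3), round(f_ref, 3))
```

The two F values agree, and `sst` equals `ssb + ssw` up to floating-point error, which illustrates the decomposition above.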
If the null hypothesis is true, the F-statistic follows an F distribution with degrees of freedom $df_B$ and $df_W$ under certain conditions (independent observations, normality, and equal variances within groups). The Python code below conducts the one-way ANOVA for the ACTIVE data.
- ANOVA in Python can be conducted using the function `anova()` from the `pingouin` package. The function constructs the ANOVA source-of-variation table, which includes the sum of squares (SS), degrees of freedom (DF), mean sum of squares (MS), the F value, and the p-value.
- Specifically for this example, we have F = 10.051. Comparing it to an F distribution with degrees of freedom (3, 1571), we obtain the p-value = 1.466e-06. Therefore, one would reject the null hypothesis: memory performance differed significantly across the four groups.
```python
>>> ## read data
>>> import pandas as pd
>>> active = pd.read_csv("https://advstats.psychstat.org/data/active.csv")
>>> #active['group'] = active['group'].astype('category')
>>> active.head()
   site  age  edu  group  booster  sex  ...  hvltt  hvltt2  hvltt3  hvltt4  mmse  id
0     1   76   12      1        1    1  ...     28      28      17      22    27   1
1     1   67   10      1        1    2  ...     24      22      20      27    25   2
2     6   67   13      3        1    2  ...     24      24      28      27    27   3
3     5   72   16      1        1    2  ...     35      34      32      34    30   4
4     4   69   12      4        0    2  ...     35      29      34      34    28   5

[5 rows x 14 columns]
>>>
>>> ## use pingouin for anova
>>> import pingouin as pg
>>> aov = pg.anova(dv='hvltt2', between='group', data=active,
...                detailed=True)
>>> aov.round(3)
   Source         SS    DF       MS       F  p-unc    np2
0   group    858.610     3  286.203  10.051    0.0  0.019
1  Within  44736.225  1571   28.476     NaN    NaN    NaN
>>>
>>> ## one can also conduct one-way ANOVA using statsmodels
>>> import statsmodels.api as sm
>>> from statsmodels.formula.api import ols
>>> from statsmodels.stats.anova import anova_lm
>>>
>>> # Fit the one-way ANOVA model
>>> model = ols('hvltt2 ~ C(group)', data=active).fit()
>>>
>>> # Perform the ANOVA
>>> anova_results = anova_lm(model)
>>>
>>> # Display the results
>>> print(anova_results.round(3))
              df    sum_sq  mean_sq       F  PR(>F)
C(group)     3.0   858.610  286.203  10.051     0.0
Residual  1571.0  44736.225   28.476     NaN     NaN
```
Post-hoc multiple comparison
In the one-way ANOVA, we found that at least two groups differ in memory performance. We can further investigate which pairs of groups are different. In Python, the function `pairwise_tests()` from the `pingouin` package can be used to conduct a t-test for each pair of groups to see whether they are significantly different. The code below shows how to conduct such an analysis.
```python
>>> ## read data
>>> import pandas as pd
>>> active = pd.read_csv("https://advstats.psychstat.org/data/active.csv")
>>> #active['group'] = active['group'].astype('category')
>>> active.head()
   site  age  edu  group  booster  sex  ...  hvltt  hvltt2  hvltt3  hvltt4  mmse  id
0     1   76   12      1        1    1  ...     28      28      17      22    27   1
1     1   67   10      1        1    2  ...     24      22      20      27    25   2
2     6   67   13      3        1    2  ...     24      24      28      27    27   3
3     5   72   16      1        1    2  ...     35      34      32      34    30   4
4     4   69   12      4        0    2  ...     35      29      34      34    28   5

[5 rows x 14 columns]
>>>
>>> ## use pingouin for pairwise comparisons
>>> import pingouin as pg
>>> pairwise = pg.pairwise_tests(dv='hvltt2', between='group', data=active)
>>> print(pairwise.round(3).to_string())
  Contrast  A  B  Paired  Parametric      T      dof alternative  p-unc       BF10  hedges
0    group  1  2   False        True  4.221  776.812   two-sided  0.000    460.778   0.302
1    group  1  3   False        True  3.284  792.934   two-sided  0.001     15.289   0.232
2    group  1  4   False        True  5.042  775.369   two-sided  0.000  1.755e+04   0.361
3    group  2  3   False        True -1.062  786.028   two-sided  0.289      0.138  -0.075
4    group  2  4   False        True  0.936  771.841   two-sided  0.349      0.123   0.067
5    group  3  4   False        True  1.993  778.808   two-sided  0.047      0.555   0.142
```
From the output, we might "naively" conclude that group 1 is statistically different from groups 2, 3, and 4, and that group 3 is different from group 4. However, in doing so, we run into the multiple comparison problem: when conducting a family of hypothesis tests simultaneously, the tests are more likely to incorrectly reject a null hypothesis. To demonstrate this, consider two hypotheses:
\[ H_{01}: \text{group 1 = group 2} \] and \[ H_{02}: \text{group 3 = group 4}. \]
Suppose that for each individual hypothesis, the type I error rate is controlled at 0.05, meaning there is a 5% probability of rejecting the hypothesis even if it is true. Then the type I error rate of rejecting either or both of the two hypotheses is $1 - (1-0.05)\times(1-0.05) = 0.0975$, which is about twice the nominal level 0.05. This indicates that to control the overall type I error rate, one should use a smaller significance level. Such a significance level can be calculated. Suppose we have $k$ simultaneous hypotheses to test and we want to keep the overall type I error rate at 0.05. Let $\alpha^*$ be the significance level to use. Then we have \[ 1-(1-\alpha^*)^k = 0.05. \] Solving this equation leads to \[ \alpha^*= 1 - 0.95^{1/k} \approx 0.05/k. \]

For the one-way ANOVA post-hoc comparison, we conducted a total of 6 hypothesis tests. Therefore, the corrected significance level should be $0.05/6 = 0.008$, and only a p-value smaller than this can be considered significant. In this case, the difference between group 3 and group 4 is no longer statistically significant. Equivalently, one can multiply each obtained p-value by the number of tests and compare the adjusted p-value with 0.05. This way of adjusting the p-values is called the Bonferroni correction (Dunn, 1961). It can be done easily in Python as shown below.
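The arithmetic above is easy to verify numerically. The sketch below computes the familywise error rate for $k$ independent tests, as well as the exact corrected level and its Bonferroni approximation $0.05/k$.

```python
# Familywise type I error rate when running k independent tests,
# each at level alpha
alpha = 0.05

def familywise_rate(k, alpha=0.05):
    return 1 - (1 - alpha) ** k

# Two tests: 1 - 0.95^2 = 0.0975, about twice the nominal 0.05
print(round(familywise_rate(2), 4))  # 0.0975

# Corrected per-test level: exact solution vs. Bonferroni approximation
k = 6  # number of pairwise comparisons among 4 groups
alpha_exact = 1 - (1 - alpha) ** (1 / k)
alpha_bonf = alpha / k
print(round(alpha_exact, 5), round(alpha_bonf, 5))  # 0.00851 0.00833
```

The exact level and the simple $0.05/k$ rule agree to about three decimal places, which is why the approximation is used in practice.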
```python
>>> ## read data
>>> import pandas as pd
>>> active = pd.read_csv("https://advstats.psychstat.org/data/active.csv")
>>>
>>> ## use pingouin for anova
>>> import pingouin as pg
>>> pairwise = pg.pairwise_tests(dv='hvltt2', between='group',
...                              padjust='bonf', data=active)
>>> print(pairwise.round(3).to_string())
  Contrast  A  B  Paired  Parametric      T      dof alternative  p-unc  p-corr p-adjust       BF10  hedges
0    group  1  2   False        True  4.221  776.812   two-sided  0.000   0.000     bonf    460.778   0.302
1    group  1  3   False        True  3.284  792.934   two-sided  0.001   0.006     bonf     15.289   0.232
2    group  1  4   False        True  5.042  775.369   two-sided  0.000   0.000     bonf  1.755e+04   0.361
3    group  2  3   False        True -1.062  786.028   two-sided  0.289   1.000     bonf      0.138  -0.075
4    group  2  4   False        True  0.936  771.841   two-sided  0.349   1.000     bonf      0.123   0.067
5    group  3  4   False        True  1.993  778.808   two-sided  0.047   0.279     bonf      0.555   0.142
```
Other p-value correction methods
The Bonferroni correction does not take into account the actual sizes of the p-values. Other methods can have greater power than the Bonferroni correction. One such method is Holm's step-down method, which obtains the corrected level for rejecting a null hypothesis in the following way:

- Specify $\alpha$, the level at which to control the overall type I error.
- Compute the $p$-values, $p_1, \ldots, p_m$, for the $m$ null hypotheses $H_{01}, \ldots, H_{0m}$.
- Order the $m$ $p$-values so that $p^{(1)} \leq p^{(2)} \leq \ldots \leq p^{(m)}$.
- Define $L = \min\left\{ j: p^{(j)} > \frac{\alpha}{m+1-j} \right\}$.
- Reject all null hypotheses $H_{0j}$ for which $p_j < p^{(L)}$.
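The step-down rule above can be implemented directly. The sketch below applies it to a set of six illustrative p-values (made up, not the chapter's exact output) and cross-checks the reject/retain decisions against the Holm implementation in `statsmodels`.

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

def holm_reject(pvals, alpha=0.05):
    """Holm's step-down procedure: return a boolean rejection
    decision for each hypothesis, in the original order."""
    pvals = np.asarray(pvals, dtype=float)
    m = len(pvals)
    order = np.argsort(pvals)  # indices that sort the p-values
    reject = np.zeros(m, dtype=bool)
    for rank, idx in enumerate(order):
        # Compare p^(j) with alpha / (m + 1 - j); here j = rank + 1,
        # so the threshold is alpha / (m - rank). Stop at the first failure.
        if pvals[idx] > alpha / (m - rank):
            break
        reject[idx] = True
    return reject

# Illustrative p-values for six pairwise tests (hypothetical numbers)
p = [0.0001, 0.001, 0.0002, 0.289, 0.349, 0.047]
ours = holm_reject(p)

# Cross-check with statsmodels' Holm correction
ref = multipletests(p, alpha=0.05, method='holm')[0]
print(ours)
print(ref)
```

Both implementations reject the first three hypotheses and retain the rest, matching the step-down rule applied by hand.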
```python
>>> ## read data
>>> import pandas as pd
>>> active = pd.read_csv("https://advstats.psychstat.org/data/active.csv")
>>>
>>> ## use pingouin for anova
>>> import pingouin as pg
>>> pairwise = pg.pairwise_tests(dv='hvltt2', between='group',
...                              padjust='holm', data=active)
>>> print(pairwise.round(3).to_string())
  Contrast  A  B  Paired  Parametric      T      dof alternative  p-unc  p-corr p-adjust       BF10  hedges
0    group  1  2   False        True  4.221  776.812   two-sided  0.000   0.000     holm    460.778   0.302
1    group  1  3   False        True  3.284  792.934   two-sided  0.001   0.004     holm     15.289   0.232
2    group  1  4   False        True  5.042  775.369   two-sided  0.000   0.000     holm  1.755e+04   0.361
3    group  2  3   False        True -1.062  786.028   two-sided  0.289   0.577     holm      0.138  -0.075
4    group  2  4   False        True  0.936  771.841   two-sided  0.349   0.577     holm      0.123   0.067
5    group  3  4   False        True  1.993  778.808   two-sided  0.047   0.140     holm      0.555   0.142
```
Barplot with standard error or confidence interval
With a fairly large sample size, it can be useful to make a barplot to show the group information. The height of each bar is proportional to the mean of its group. In addition, the standard error or the confidence interval for the mean of each group can be added to the bars. Such a plot can be generated using the `bar()` function from the `matplotlib` package. To generate such a plot, we need the following information: the grouping variable, the mean of each group, and the standard error of each mean. For the ACTIVE data, the information is given in the following table.
| group | mean | se   |
|-------|------|------|
| 1     | 26.1 | 0.27 |
| 2     | 24.5 | 0.27 |
| 3     | 24.9 | 0.25 |
| 4     | 24.1 | 0.29 |
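The mean and standard error per group can be computed with a pandas `groupby`. The sketch below uses a small made-up data frame with the same layout as the ACTIVE data (a `group` column and an `hvltt2` column); the numbers are for illustration only.

```python
import pandas as pd

# Small made-up data frame with the same layout as the ACTIVE data
df = pd.DataFrame({
    'group':  [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    'hvltt2': [26, 27, 25, 24, 25, 24, 25, 24, 26, 24, 23, 25],
})

# Mean and standard error of the mean for each group;
# pandas' sem() uses the sample standard deviation / sqrt(n)
summary = df.groupby('group')['hvltt2'].agg(['mean', 'sem']).round(2)
print(summary)
```

Applying the same two lines to the actual ACTIVE data frame reproduces the means and standard errors shown in the table.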
The Python code below calculates the information in the table and generates a bar plot as shown in the output. We now explain the code.

- `groups`: the labels for the different groups (x-axis).
- `means`: the mean value for each group (height of the bars).
- `std_errors`: the standard errors (or any other measure of uncertainty) to plot as vertical error bars.
- `yerr=std_errors`: specifies the error bars along the y-axis.
- `capsize=5`: adds caps at the ends of the error bars for better visibility.
- `color='skyblue'`: specifies the color of the bars.
- `edgecolor='black'`: adds a black border around the bars.
```python
>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>>
>>> # Example data (mean values and standard errors for each group)
>>> groups = ['Memory', 'Reasoning', 'Speed', 'Control']
>>> means = [26.1, 24.5, 24.9, 24.1]
>>> std_errors = [0.27, 0.27, 0.25, 0.29]  # Standard error for each group
>>>
>>> # Create the bar plot with error bars
>>> plt.bar(groups, means, yerr=std_errors, capsize=5, color='skyblue', edgecolor='black')
>>>
>>> # Adding labels and title
>>> plt.xlabel('Training Groups')
Text(0.5, 0, 'Training Groups')
>>> plt.ylabel('Mean Value')
Text(0, 0.5, 'Mean Value')
>>> plt.title('Bar Plot with Error Bars')
Text(0.5, 1.0, 'Bar Plot with Error Bars')
>>>
>>> # Display the plot
>>> plt.savefig('bar.svg', format='svg')
```
To cite the book, use:
Zhang, Z. & Wang, L. (2017-2022). Advanced statistics using R. Granger, IN: ISDSA Press. https://doi.org/10.35566/advstats. ISBN: 978-1-946728-01-2.