Analysis of Variance (ANOVA) is a hypothesis-testing technique that compares the means of three or more groups. The underlying method is the same in every statistical package, but the results reported by different software can differ slightly. For instance, ANOVA results generated by R and by Python's Statsmodels can differ by a small margin in values such as the F-statistic or the p-value.

In this article, we explore the reasons for these discrepancies by discussing differences in methodology and in software-specific implementations, and by working through an example of ANOVA in R and in Statsmodels.
Differences in Default Type of ANOVA
One key reason for different results between R and Statsmodels is the type of sum of squares (SS) each function computes, whether by default or on request. In ANOVA, sums of squares measure the variation within and between groups, and there are three common types: Type I, Type II, and Type III.
- Type I SS (Sequential): Each term is tested after adjusting only for the terms that precede it, so the result depends on the order in which factors are entered into the model. R's aov() and anova() report this type, and it is also the default (typ=1) of Statsmodels' anova_lm().
- Type II SS (Hierarchical): Each main effect is tested after accounting for the other main effects, but not for interactions that contain it. It is appropriate when there is no significant interaction, and it is the default of Anova() in R's car package.
- Type III SS (Marginal): Each term is tested after accounting for all other terms in the model, including interactions. It is often used for unbalanced designs with interaction effects. Neither tool uses it by default; it must be requested explicitly (type = "III" in car, typ=3 in Statsmodels).
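The order dependence of sequential sums of squares can be checked by hand. The following is a minimal sketch using only NumPy, on simulated data with two deliberately correlated predictors. The helper rss() and all variable names are illustrative assumptions, not part of R or Statsmodels.

```python
import numpy as np

def rss(y, columns):
    """Residual sum of squares after regressing y on an intercept plus `columns`."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

# Simulated data (assumption): x2 is built to be correlated with x1
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.7 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 1.0 * x2 + rng.normal(size=n)

rss_null = rss(y, [])          # intercept only
rss_x1 = rss(y, [x1])
rss_x2 = rss(y, [x2])
rss_full = rss(y, [x1, x2])

# Sequential (Type I) SS: each term gets the drop in RSS at the moment it enters
ss_x1_first = rss_null - rss_x1        # order: x1, then x2
ss_x2_after_x1 = rss_x1 - rss_full

ss_x2_first = rss_null - rss_x2        # order: x2, then x1
ss_x1_after_x2 = rss_x2 - rss_full     # x1 adjusted for x2

print(f"SS(x1) entered first: {ss_x1_first:.2f}")
print(f"SS(x1) entered last:  {ss_x1_after_x2:.2f}")
```

In a model without interactions, the value obtained when a term is entered last is also its Type II and Type III sum of squares, which is why those types do not depend on the order of the formula.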
Software-Specific Implementations
R's Implementation of ANOVA
In R, the aov() function is commonly used for ANOVA. It reports Type I (sequential) sums of squares: each predictor's contribution is assessed in the order it appears in the formula, after adjusting only for the predictors listed before it.
For example:
model <- aov(y ~ x1 + x2, data = df)
summary(model)
This approach may lead to different results when predictors are correlated because the order of the predictors affects how the variance is partitioned.
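R also provides other methods for ANOVA, such as Anova() from the car package, which computes Type II (its default) or Type III sums of squares. Type III tests are only meaningful when the model was fitted with sum-to-zero contrasts such as contr.sum, rather than R's default treatment contrasts. Given a fitted model, Type III is requested as follows: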
library(car)
Anova(model, type = "III")
Statsmodels' Implementation of ANOVA
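Statsmodels in Python offers ANOVA through formula-based model fitting followed by anova_lm(). The typ argument selects the sum-of-squares type and defaults to 1 (sequential); the example below requests Type II explicitly: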
import statsmodels.api as sm
from statsmodels.formula.api import ols
model = ols('y ~ x1 + x2', data=df).fit()
sm.stats.anova_lm(model, typ=2)
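Statsmodels also supports Type III sums of squares, which test each term after accounting for all other terms, including interactions. As in R, categorical predictors should use sum-to-zero coding, for example C(x1, Sum) in the formula, for these tests to be meaningful. The type itself is selected with typ=3: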
sm.stats.anova_lm(model, typ=3)
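When the two tools are run with different sum-of-squares types, the F-values and p-values can differ, and in unbalanced designs the difference can be substantial. When the same type and the same contrast coding are used, R and Statsmodels should agree up to floating-point and display precision.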
Examples of Discrepancies
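Let's look at a concrete example that runs the same one-way ANOVA in R and in Statsmodels. With a single factor, Type I, II, and III sums of squares coincide, so the two tools are expected to agree up to rounding and display precision.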
Example Dataset
Consider the following dataset:
# Create dataset in R
set.seed(123)
df <- data.frame(
  group = rep(c("A", "B", "C"), each = 10),
  score = c(rnorm(10, mean = 5, sd = 1),
            rnorm(10, mean = 6, sd = 1),
            rnorm(10, mean = 7, sd = 1))
)
Performing ANOVA in R
# Perform one-way ANOVA (aov() is part of base R, so no extra packages are needed)
anova_result_r <- aov(score ~ group, data = df)
summary(anova_result_r)
Output in R
            Df Sum Sq Mean Sq F value Pr(>F)
group        2  53.85  26.925   47.12  2e-09 ***
Residuals   27  13.15   0.487
Performing ANOVA in Python (Statsmodels)
For comparison, here's how you can perform ANOVA in Python using Statsmodels:
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
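# Load the data exported from R. This assumes the R session ran
# write.csv(df, "anova_data.csv", row.names = FALSE); the file name is hypothetical.
df_python = pd.read_csv("anova_data.csv")
# Perform ANOVA using Statsmodels (typ=1 is the default and matches aov() in R)
model = ols('score ~ group', data=df_python).fit()
anova_result_python = sm.stats.anova_lm(model, typ=1)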
print(anova_result_python)
Output in Python
            df     sum_sq    mean_sq       F    PR(>F)
group      2.0  53.846754  26.923377  47.121  0.000002
Residual  27.0  13.145687   0.486881     NaN       NaN
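For a one-way design the F-statistic is fully determined by the between-group and within-group sums of squares, so it can be recomputed by hand as a cross-check on any tool's output. The sketch below does this with NumPy and compares the result with scipy.stats.f_oneway. It draws its own random data, because NumPy's generator does not reproduce R's set.seed(123) stream, so its numbers will not match the tables above.

```python
import numpy as np
from scipy import stats

# Three groups of 10 with means 5, 6, 7 (same design as the R example, different random draws)
rng = np.random.default_rng(123)
groups = [rng.normal(loc=m, scale=1.0, size=10) for m in (5, 6, 7)]

all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()

# Between-group and within-group sums of squares
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

df_between = len(groups) - 1                 # k - 1
df_within = len(all_scores) - len(groups)    # N - k

# F is the ratio of the two mean squares; the p-value is the upper tail of F(df_between, df_within)
f_manual = (ss_between / df_between) / (ss_within / df_within)
p_manual = stats.f.sf(f_manual, df_between, df_within)

f_scipy, p_scipy = stats.f_oneway(*groups)

print(f"manual: F = {f_manual:.4f}, p = {p_manual:.3e}")
print(f"scipy:  F = {f_scipy:.4f}, p = {p_scipy:.3e}")
```

If two packages report different F-values for the same one-way data, recomputing the mean squares this way helps locate whether the difference comes from the data that was loaded or from how the output was rounded.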
Conclusion
ANOVA remains a robust hypothesis-testing technique. The small disparities between R and Statsmodels can usually be traced to the type of sum of squares that was computed, numerical and display precision, the treatment of missing values, and other software defaults. Researchers who move between statistical packages should check these settings before comparing outputs, so that differences caused by the software are not mistaken for differences in the data and analyses stay consistent across tools.