Time Series in R: Stationarity Testing

In time series analysis, knowing whether your data is stationary is crucial, because many statistical methods and models assume it. ARMA models require a stationary series, and ARIMA (AutoRegressive Integrated Moving Average) models reach stationarity by differencing the data first. This article explains how to test for stationarity in R and how to make a time series stationary.

What is Stationarity?

A time series is considered stationary if its statistical properties, such as mean and variance, remain constant over time. This consistency makes the data easier to model and predict. There are different types of stationarity:

Strict Stationarity: The joint distribution of any set of observations is unchanged when the series is shifted in time. This strong condition is rarely met in practice.
Weak (Second-Order) Stationarity: The mean and variance are constant over time, and the autocovariance between two observations depends only on the lag between them, not on when they occur. This form of stationarity is often sufficient for practical applications.
Trend Stationarity: The series fluctuates around a deterministic trend and becomes stationary once that trend is removed (detrending).
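
The cases above can be made concrete with simulated data. The following is a minimal sketch in base R, not part of the original walkthrough; the series names (white_noise, random_walk, trend_series), the slope of 0.5, and the length of 200 are illustrative assumptions.

```r
# Simulated examples (all names and parameter values are illustrative assumptions)
set.seed(42)
n <- 200
t_index <- seq_len(n)

# Weakly stationary: independent draws with constant mean and variance
white_noise <- rnorm(n)

# Non-stationary: a random walk (cumulative sum of random shocks)
random_walk <- cumsum(rnorm(n))

# Trend-stationary: deterministic linear trend plus stationary noise
trend_series <- 0.5 * t_index + rnorm(n)

# Detrending: regress on time and keep the residuals
detrended <- residuals(lm(trend_series ~ t_index))

# Compare the mean and variance of the first and second half of a series
half_stats <- function(x) {
  m <- length(x) %/% 2
  first <- x[1:m]
  second <- x[(m + 1):length(x)]
  c(mean_1 = mean(first), mean_2 = mean(second),
    var_1 = var(first), var_2 = var(second))
}

round(rbind(white_noise  = half_stats(white_noise),
            random_walk  = half_stats(random_walk),
            trend_series = half_stats(trend_series),
            detrended    = half_stats(detrended)), 2)
```

For the white noise and detrended series, the two halves should show similar means and variances. The trend series shows a clear shift in mean between halves, and the random walk typically drifts as well.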

Why is Stationarity Important?

Stationarity is vital because many time series models assume that the underlying data is stationary. If your data is not stationary, these models may fit poorly and produce unreliable forecasts. Testing for stationarity, and transforming the data if necessary, is therefore essential before applying them.

Before testing for stationarity, it's important to identify any seasonality or trends in the data. R provides two decomposition functions for this:

decompose(): This function uses moving averages to separate a time series into seasonal, trend, and random (remainder) components.
stl(): This function performs seasonal-trend decomposition using LOESS, which can let the seasonal component change gradually over time.
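
As a brief illustration of both functions, the sketch below decomposes AirPassengers, a monthly series that ships with base R. The choice of dataset, the multiplicative model, and s.window = "periodic" are assumptions made for this example, not recommendations for every series.

```r
# AirPassengers is a monthly series included in base R (datasets package)
x <- AirPassengers

# Classical decomposition with moving averages; the seasonal swings in this
# series grow with the level, so a multiplicative model is assumed here
dec <- decompose(x, type = "multiplicative")
plot(dec)

# STL is additive, so decompose the log of the series instead
fit <- stl(log(x), s.window = "periodic")
plot(fit)

# Components are stored by name
head(dec$trend, 12)        # NA at the edges where the moving average is undefined
head(fit$time.series, 3)   # columns: seasonal, trend, remainder
```

A clear trend or seasonal pattern in these plots is a sign that the raw series is unlikely to be stationary.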

How to Test for Stationarity in R

Two formal tests are commonly used to test for stationarity in R:

Augmented Dickey-Fuller (ADF) Test: This test checks for the presence of a unit root in the time series. Its null hypothesis is that a unit root is present (non-stationarity), so a small p-value is evidence of stationarity.
Kwiatkowski-Phillips-Schmidt-Shin (KPSS) Test: This test has the opposite null hypothesis: the series is stationary around a constant level or around a deterministic trend. A small p-value is evidence against stationarity, which is why it is used to complement the ADF test.

Now we implement these stationarity tests in the R programming language.

Step 1: Install and Load the Required Packages

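# Install the tseries package if not already installed
if (!requireNamespace("tseries", quietly = TRUE)) {
  install.packages("tseries")
}

# Load the required package (provides adf.test() and kpss.test())
# The urca package offers alternative implementations: ur.df() and ur.kpss()
library(tseries)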

Step 2: Create or Load the Time Series Data

We'll create a simple time series object from independent normal draws (white noise), which is stationary by construction.

# Set seed for reproducibility
set.seed(123)

# Create a simple time series object with random normal data
data <- ts(rnorm(100, mean = 10, sd = 5))  # Example time series data
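
The object above uses the default frequency of 1. When loading real data, the calendar structure usually has to be declared so that functions such as decompose() and stl() know the seasonal period. The following is a small sketch with hypothetical monthly values; the variable names and the 2020 start date are assumptions for illustration.

```r
# Hypothetical monthly measurements for three years starting January 2020
set.seed(1)
monthly_values <- rnorm(36, mean = 100, sd = 10)

# frequency = 12 marks the data as monthly; start = c(year, period)
monthly_ts <- ts(monthly_values, start = c(2020, 1), frequency = 12)

frequency(monthly_ts)   # observations per year
start(monthly_ts)       # first time point
end(monthly_ts)         # last time point

# Extract a single year of data
year_2021 <- window(monthly_ts, start = c(2021, 1), end = c(2021, 12))
```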

Step 3: Perform Stationarity Tests

Now we will perform the stationarity tests.

1: Augmented Dickey-Fuller (ADF) Test

The ADF test checks whether a unit root is present in the series, which indicates non-stationarity. A p-value below 0.05 rejects the unit-root null hypothesis in favor of stationarity.

# Perform the Augmented Dickey-Fuller Test
adf_result <- adf.test(data)

# Print ADF Test results
cat("Augmented Dickey-Fuller Test:\n")
print(adf_result)

Output:

Augmented Dickey-Fuller Test:

	Augmented Dickey-Fuller Test

data:  data
Dickey-Fuller = -4.3961, Lag order = 4, p-value = 0.01
alternative hypothesis: stationary

2: Kwiatkowski-Phillips-Schmidt-Shin (KPSS) Test

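The KPSS test takes stationarity as its null hypothesis, so a p-value above 0.05 means stationarity is not rejected. By default, kpss.test() tests stationarity around a constant level (null = "Level"); pass null = "Trend" to test stationarity around a deterministic trend.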

# Perform the KPSS Test
kpss_result <- kpss.test(data)

# Print KPSS Test results
cat("\nKwiatkowski-Phillips-Schmidt-Shin Test:\n")
print(kpss_result)

Output:

Kwiatkowski-Phillips-Schmidt-Shin Test:

	KPSS Test for Level Stationarity

data:  data
KPSS Level = 0.12479, Truncation lag parameter = 4, p-value = 0.1
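
Because the two tests have opposite null hypotheses, their results are easiest to read together. The helper below is a hypothetical convenience function, not part of tseries, and the 0.05 threshold is an assumed convention; it is a sketch of one reasonable way to combine the two p-values.

```r
# Hypothetical helper: combine the two p-values into a single reading.
# ADF:  null = unit root    -> small p-value supports stationarity
# KPSS: null = stationarity -> small p-value contradicts stationarity
combined_verdict <- function(adf_p, kpss_p, alpha = 0.05) {
  adf_rejects_unit_root <- adf_p < alpha
  kpss_rejects_stationarity <- kpss_p < alpha
  if (adf_rejects_unit_root && !kpss_rejects_stationarity) {
    "stationary"
  } else if (!adf_rejects_unit_root && kpss_rejects_stationarity) {
    "non-stationary"
  } else {
    "inconclusive"
  }
}

# p-values reported in the two outputs above
combined_verdict(adf_p = 0.01, kpss_p = 0.1)   # "stationary"

# With tseries loaded, the p-values can be read from the test objects:
# combined_verdict(adf_result$p.value, kpss_result$p.value)
```

When the tests disagree, the verdict is "inconclusive", which usually calls for a closer look at plots, trends, and the lag settings of each test.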

Step 4: Making Time Series Stationary

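If the tests indicate non-stationarity, we can apply differencing to the data to achieve stationarity. In this example both tests already point to stationarity (ADF p-value = 0.01, KPSS p-value = 0.1), so differencing is not needed here; it is applied below only to illustrate the procedure.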

# Apply differencing to the time series data
diff_data <- diff(data)
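
To show what differencing does on a series that actually needs it, the sketch below uses a simulated random walk and the built-in AirPassengers data. The variable names and the simulation settings are assumptions made for illustration.

```r
# A random walk is the cumulative sum of its shocks (illustrative simulation)
set.seed(7)
shocks <- rnorm(150)
random_walk <- cumsum(shocks)

# First differencing recovers the shocks (all but the first one)
all.equal(diff(random_walk), shocks[-1])

# Each round of differencing shortens the series by one observation
length(random_walk) - length(diff(random_walk))

# Seasonal differencing: subtract the value from one season (12 months) earlier
seasonal_diff <- diff(AirPassengers, lag = 12)

# Log transform first when the variance grows with the level, then difference
log_diff <- diff(log(AirPassengers))
```

First differences target a stochastic trend, seasonal differences target a repeating seasonal pattern, and the log transform stabilizes a variance that grows with the level of the series.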

Step 5: Re-testing for Stationarity on Differenced Data

Now we re-test the differenced data for stationarity.

1: ADF Test on Differenced Data

# Perform the ADF Test on the differenced data
adf_diff_result <- adf.test(diff_data)

# Print ADF Test results for differenced data
cat("\nADF Test on Differenced Data:\n")
print(adf_diff_result)

Output:

ADF Test on Differenced Data:

	Augmented Dickey-Fuller Test

data:  diff_data
Dickey-Fuller = -7.2332, Lag order = 4, p-value = 0.01
alternative hypothesis: stationary

2: KPSS Test on Differenced Data

# Perform the KPSS Test on the differenced data
kpss_diff_result <- kpss.test(diff_data)

# Print KPSS Test results for differenced data
cat("\nKPSS Test on Differenced Data:\n")
print(kpss_diff_result)

Output:

KPSS Test on Differenced Data:

	KPSS Test for Level Stationarity

data:  diff_data
KPSS Level = 0.034506, Truncation lag parameter = 3, p-value = 0.1

Step 6: Visualization

Finally, we plot the original and differenced series to compare them visually.

# Set up the plotting area and remember the previous settings
op <- par(mfrow = c(2, 1))  # Arrange plots in 2 rows and 1 column

# Plot the original time series
plot(data, main = "Original Time Series", ylab = "Values", col = "blue", lwd = 2)

# Plot the differenced time series
plot(diff_data, main = "Differenced Time Series", ylab = "Differenced Values", col = "red", lwd = 2)

# Restore the previous plotting settings
par(op)

Output:

Figure: Original vs. differenced time series

Conclusion

In time series analysis, checking for stationarity is crucial for effective modeling and accurate predictions, because many statistical methods, such as ARIMA, rely on it. Knowing the types of stationarity (strict, weak, and trend) helps clarify how the data behaves. Identifying seasonality and trends through decomposition is a useful first step before running stationarity tests such as the Augmented Dickey-Fuller and KPSS tests. If the data is found to be non-stationary, techniques such as differencing, detrending, and log transformations can be applied. Together, these methods prepare time series data for analysis and lead to more reliable forecasts and better-informed decisions.
