Log Transformation

Log transformation reshapes data that contains very large values, very small values or a skewed distribution. It replaces each value with its logarithm, which “compresses” the large values and spreads out the small ones.

We use log transformation when:

Some numbers are extremely large compared to others.
The data does not follow a normal (bell-shaped) distribution.
We want to make patterns easier to spot for analysis.

Log transformation shrinks big numbers and stretches small numbers so the data becomes more balanced.

Here’s a graph of an example dataset before log transformation. As you can see, the data is highly right-skewed, which is typical of the real-world data that often requires log transformation.

Log transformation helps by:

Making the data more symmetrical (normal-shaped).
Making patterns between variables clearer.
Making predictions more stable and reliable.
Reducing the effect of extreme values (outliers).

Here’s the graph of the same dataset after log transformation. You can clearly see that the data is now much closer to a normal distribution (bell-shaped curve).
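
The change in shape can also be checked numerically. The sketch below is only an illustration and does not use the article's dataset: it draws a hypothetical right-skewed sample from a log-normal distribution and compares the skewness before and after the transformation. The helper name skewness and the sample parameters are assumptions made for this example.

```python
import numpy as np

def skewness(values):
    """Sample skewness: mean of cubed z-scores (near 0 for symmetric data)."""
    values = np.asarray(values, dtype=float)
    z = (values - values.mean()) / values.std()
    return float(np.mean(z ** 3))

# Hypothetical right-skewed sample: log-normal draws with a fixed seed.
rng = np.random.default_rng(0)
data = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)

skew_before = skewness(data)          # strongly positive (long right tail)
skew_after = skewness(np.log(data))   # close to zero (roughly symmetric)

print(f"skewness before: {skew_before:.2f}")
print(f"skewness after:  {skew_after:.2f}")
```

A skewness near zero after the transformation suggests the data has become roughly symmetric, which is what the second graph shows visually.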

What is a Logarithm?

A logarithm is the inverse of exponentiation: it tells you which power a base must be raised to in order to produce a given number.

Example:

If 10² = 100, then log10(100) = 2.
If 2³ = 8, then log2(8) = 3.

In statistics, we usually use the natural logarithm (ln), whose base is the constant e (approximately 2.718).
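
The examples above can be reproduced with Python's standard math module. This is a small sketch to make the definition concrete; the variable names are chosen for illustration only.

```python
import math

# A logarithm answers: "which power of the base gives this number?"
log10_of_100 = math.log10(100)   # 10 ** 2 == 100
log2_of_8 = math.log2(8)         # 2 ** 3 == 8

# The natural logarithm uses base e, and math.exp undoes it.
value = 5.0
ln_value = math.log(value)
round_trip = math.exp(ln_value)

print(log10_of_100, log2_of_8)           # 2.0 3.0
print(math.isclose(round_trip, value))   # True
```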

How Do We Apply Log Transformation?

Applying log transformation means replacing each data value with its logarithm.

For example:

Original values: 10, 100, 1000
After log10 transformation: 1, 2, 3

In Python, it’s very easy to do:

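import numpy as np

# Sample values spanning several orders of magnitude
x = [1, 10, 100, 1000]
# np.log accepts a list and returns an array of natural logarithms
log_x = np.log(x)
print(log_x)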

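The function np.log() computes the natural logarithm (log base e). Use np.log10() or np.log2() for base 10 or base 2.

Note: The logarithm is defined only for positive numbers. np.log() returns -inf for 0 and nan for negative values, each with a runtime warning, so you may need to adjust your data if it contains 0 or negative values.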

Now let’s compute ln (natural log) for each value:

x	ln(x)
1	0
10	≈ 2.302585
100	≈ 4.605170
1000	≈ 6.907755

Output:

[0.         2.30258509 4.60517019 6.90775528]
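
For data that contains zeros, one common adjustment is to use np.log1p(), which computes ln(1 + x). The sketch below assumes non-negative, count-like data; the array counts is hypothetical. It does not handle negative values, which need a different approach because shifting the data changes how the results are interpreted.

```python
import numpy as np

# Hypothetical count data that includes a zero.
counts = np.array([0, 1, 9, 99])

# log1p(x) computes ln(1 + x), so 0 maps to 0 instead of -inf.
log_counts = np.log1p(counts)

# expm1 is the inverse: exp(y) - 1 recovers the original values.
recovered = np.expm1(log_counts)

print(log_counts)
print(recovered)
```

Keeping the inverse (np.expm1) at hand makes it easy to convert model outputs back to the original scale.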

How to Understand Results After Log Transformation

When you use log-transformed variables in a model such as linear regression, the interpretation of the coefficients depends on which variables you transformed.

1. Log-Level Model (Log Y, regular X)

Formula:

\ln(Y) = \beta_0 + \beta_1 \times X

Here:

Y = your target variable (the thing you're predicting)
X = your independent variable (the thing you're using to predict Y)
β₀ = intercept (value of ln(Y) when X = 0)
β₁ = slope (the change in ln(Y) for a 1-unit increase in X)

In this model, only the dependent variable Y is log-transformed. This means that every 1-unit increase in X leads to a percentage change in Y. Specifically, for each additional unit of X, Y changes by approximately β₁ × 100%. The approximation is accurate when β₁ is small; the exact change is (e^β₁ − 1) × 100%. We use this model when Y grows in percentage terms as X increases in actual units.

Example: If β₁ = 0.05, then for every 1-unit increase in X, Y increases by approximately 0.05 × 100 = 5%.
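
The log-level interpretation can be sketched with a simple line fit. The data below is hypothetical and noise-free, generated from ln(Y) = 1.0 + 0.05 × X, and np.polyfit stands in for a full regression routine; a real analysis would typically use a statistics library and noisy data.

```python
import numpy as np

# Hypothetical noise-free data generated from ln(Y) = 1.0 + 0.05 * X.
x = np.arange(0, 50, dtype=float)
y = np.exp(1.0 + 0.05 * x)

# Fit a straight line to (X, ln Y); polyfit returns [slope, intercept].
beta1, beta0 = np.polyfit(x, np.log(y), deg=1)

approx_pct = beta1 * 100                # rule of thumb: beta1 x 100%
exact_pct = (np.exp(beta1) - 1) * 100   # exact multiplicative change

print(f"beta1 = {beta1:.3f}")
print(f"approximate change per unit of X: {approx_pct:.2f}%")
print(f"exact change per unit of X:       {exact_pct:.2f}%")
```

For β₁ = 0.05 the two figures are close (about 5% versus about 5.13%); the gap widens as β₁ grows.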

2. Level-Log Model (Regular Y, Log X)

Formula:

Y = \beta_0 + \beta_1 \times \ln(X)

Here:

Y = your target variable (the thing you're predicting)
X = your independent variable (the thing you're using to predict Y)
β₀ = intercept (value of Y when ln(X) = 0, that is, when X = 1)
β₁ = slope (the change in Y for a 1-unit increase in ln(X))

Here, only the independent variable X is log-transformed. This means that when X increases by 1%, Y changes by approximately β₁ ÷ 100 units. This model is useful when small percentage changes in X result in actual unit changes in Y. It helps when X grows multiplicatively but Y grows additively.

Example: If β₁ = 4, then for every 1% increase in X, Y increases by approximately \frac{4}{100} = 0.04 \ \text{units}
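
The level-log case can be sketched the same way. The data is hypothetical and noise-free, generated from Y = 2.0 + 4.0 × ln(X), and np.polyfit is used here only as a simple stand-in for a regression routine.

```python
import numpy as np

# Hypothetical noise-free data generated from Y = 2.0 + 4.0 * ln(X).
x = np.linspace(1, 100, 50)
y = 2.0 + 4.0 * np.log(x)

# Fit a straight line to (ln X, Y); polyfit returns [slope, intercept].
beta1, beta0 = np.polyfit(np.log(x), y, deg=1)

approx_change = beta1 / 100          # rule of thumb for a 1% increase in X
exact_change = beta1 * np.log(1.01)  # exact change in Y for a 1% increase

print(f"beta1 = {beta1:.3f}")
print(f"approximate change in Y: {approx_change:.4f} units")
print(f"exact change in Y:       {exact_change:.4f} units")
```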

3. Log-Log Model (Log Y, Log X)

Formula:

\ln(Y) = \beta_0 + \beta_1 \times \ln(X)

Here:

Y = your target variable (the thing you're predicting)
X = your independent variable (the thing you're using to predict Y)
β₀ = intercept (value of ln(Y) when ln(X) = 0, that is, when X = 1)
β₁ = elasticity (the approximate percentage change in Y for a 1% increase in X)

In this model, both Y and X are log-transformed. Now, a 1% change in X results in approximately a β₁% change in Y. This is called an elasticity model because both variables are interpreted in percentage terms. It’s commonly used when both X and Y grow or shrink proportionally.

Example: If β₁ = 0.8, then for every 1% increase in X, Y increases by approximately 0.8%.
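
A log-log fit recovers the exponent of a power-law relationship. The sketch below uses hypothetical noise-free data generated from Y = 3.0 × X^0.8, which is equivalent to ln(Y) = ln(3) + 0.8 × ln(X); the names elasticity and intercept are chosen for this example.

```python
import numpy as np

# Hypothetical noise-free data: Y = 3.0 * X ** 0.8,
# which is equivalent to ln(Y) = ln(3) + 0.8 * ln(X).
x = np.linspace(1, 100, 50)
y = 3.0 * x ** 0.8

# Fit a straight line to (ln X, ln Y); the slope is the elasticity.
elasticity, intercept = np.polyfit(np.log(x), np.log(y), deg=1)

# Exact effect of a 1% increase in X: Y is multiplied by 1.01 ** elasticity.
exact_pct = (1.01 ** elasticity - 1) * 100

print(f"elasticity = {elasticity:.3f}")
print(f"exact change in Y for a 1% increase in X: {exact_pct:.3f}%")
```

The exact change comes out very close to the 0.8% rule of thumb, which is why the elasticity is usually read directly as a percentage.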

Things You Should Check Before Using Log Transformation

Do not apply it to zero or negative values.
Use it only when the data is skewed or its values span several orders of magnitude.
Always check if the transformation improves your analysis or not.
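
The first check in this list can be automated before calling np.log(). The function below is a minimal sketch; the name check_log_ready and its return format are hypothetical, not part of NumPy.

```python
import numpy as np

def check_log_ready(values):
    """Return (ok, reason) saying whether a plain log transform applies."""
    arr = np.asarray(values, dtype=float)
    if np.any(np.isnan(arr)):
        return False, "contains missing values"
    if np.any(arr <= 0):
        return False, "contains zero or negative values"
    return True, "all values are positive"

print(check_log_ready([1, 10, 100]))
print(check_log_ready([0, 5, 20]))
```

Running a check like this first avoids silently producing -inf or nan values in the transformed data.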

