Feature Importance with Random Forests

Last Updated : 11 Nov, 2025

Feature Importance in Random Forests measures how much each feature contributes to the model’s prediction accuracy. It helps in identifying the most influential input variables, improving performance, interpretability and computational efficiency.

  • Enhances model transparency by ranking influential features.
  • Can improve predictive accuracy by focusing on key variables.
  • Can reduce overfitting by removing less significant features.
  • Speeds up training and improves overall model efficiency.

Importance

Understanding feature importance offers several advantages:

  • Enhanced Model Performance: Identifying the most influential features lets us prioritize them during model training, which can lead to more accurate predictions.
  • Faster Training Times: Focusing on the most relevant features streamlines the training process, saving valuable time and computational resources.
  • Reduced Overfitting: Overfitting occurs when a model memorizes the training data instead of learning general patterns. Dropping uninformative features leaves the model fewer opportunities to fit noise in the training data.
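As a rough illustration of these points, the sketch below keeps only the features whose importance is at least the median and retrains on the reduced set. It assumes the iris dataset used later in this article and scikit-learn's SelectFromModel; the names selector, small_model and the "median" threshold are choices made for this example. Whether accuracy holds up after dropping features depends on the data.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Baseline: forest trained on all four features
full_model = RandomForestClassifier(n_estimators=100, random_state=42)
full_model.fit(X_train, y_train)
full_acc = accuracy_score(y_test, full_model.predict(X_test))

# Keep features whose impurity-based importance is at least the median
# (the threshold is an assumption made for this sketch)
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=100, random_state=42),
    threshold="median")
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Retrain on the reduced feature set and compare
small_model = RandomForestClassifier(n_estimators=100, random_state=42)
small_model.fit(X_train_sel, y_train)
small_acc = accuracy_score(y_test, small_model.predict(X_test_sel))

print("Features kept:", X_train_sel.shape[1], "of", X_train.shape[1])
print("Accuracy (all features):", full_acc)
print("Accuracy (selected features):", small_acc)
```

Comparing the two accuracy values on held-out data is what tells us whether the dropped features were actually dispensable.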

Feature Importance in Random Forests

Random Forests, a popular ensemble learning algorithm, consist of multiple decision trees that combine to produce robust predictions. Feature importance for a Random Forest can be measured in several ways:

  • Built-in Feature Importance: Uses the forest's internal impurity statistics (Gini importance), which scikit-learn exposes as feature_importances_.
  • Mean Decrease Accuracy (MDA): Measures how much accuracy drops when a feature's values are shuffled.
  • Permutation Feature Importance: Evaluates how model performance changes when a feature is shuffled, repeated several times and averaged.
  • SHAP (SHapley Additive exPlanations) Values: Quantifies the contribution of each feature to individual predictions.

Method 1: Built-in Feature Importance

The iris dataset is used throughout the article to demonstrate each feature importance method.

Step 1: Install dependencies and Libraries

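We install and import the required libraries, then load the data and train the model.

  • The dataset is split into training and testing sets to evaluate model performance.
  • We also extract the feature names for later use.
  • A Random Forest classifier with 100 trees is trained on the training set.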
Python
!pip install shap
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt  # used by the plots in later steps
import shap
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the iris data and keep the feature names for labelling results
iris = load_iris()
X = iris.data
y = iris.target
feature_names = iris.feature_names

# Split the dataset into 75% train and 25% test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Train a Random Forest with 100 trees
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
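Since the split exists to evaluate the model, it can help to record the baseline test accuracy before looking at importances. The following is a minimal sketch that repeats the setup above so it runs on its own; the variable name test_accuracy is chosen for this example.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=42)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# Accuracy on the held-out 25% of the data
test_accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"Test samples: {len(y_test)}")
print(f"Test accuracy: {test_accuracy:.3f}")
```

This baseline is the reference point that the shuffle-based methods later in the article compare against.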

Step 2: Calculate Gini Importance

  • Gini importance is derived from the Random Forest algorithm’s internal structure.
  • It measures how much each feature contributes to reducing impurity in decision trees.
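To make "reducing impurity" concrete, here is a small worked sketch that computes the Gini impurity before and after one split on the iris data. The helper gini and the split "petal length <= 2.45" are assumptions chosen for illustration; they are not taken from the trained forest.

```python
import numpy as np
from sklearn.datasets import load_iris


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)


iris = load_iris()
X, y = iris.data, iris.target

# Hypothetical split chosen for illustration: petal length (column 2) <= 2.45
mask = X[:, 2] <= 2.45
left, right = y[mask], y[~mask]

parent_impurity = gini(y)
# Children are weighted by the share of samples that reach them
children_impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
impurity_decrease = parent_impurity - children_impurity

print(f"Parent impurity:   {parent_impurity:.4f}")
print(f"Children impurity: {children_impurity:.4f}")
print(f"Impurity decrease: {impurity_decrease:.4f}")
```

On the full iris data the parent impurity is 0.6667 and the weighted child impurity is 0.3333, so this split lowers impurity by 0.3333. Gini importance adds up such decreases for every split that uses a feature, across all trees, and normalizes the totals.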
Python
importances = clf.feature_importances_
feature_imp_df = pd.DataFrame({'Feature': feature_names, 'Gini Importance': importances}).sort_values(
    'Gini Importance', ascending=False)
print(feature_imp_df)
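The forest-level importance is an average over the individual trees, so the spread across trees gives a rough sense of how stable each score is. The sketch below assumes the same data split and model settings as above and repeats them so it runs on its own; per_tree and spread_df are names chosen for this example.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=42)
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# One row per tree, one column per feature
per_tree = np.array([tree.feature_importances_ for tree in clf.estimators_])

spread_df = pd.DataFrame({
    'Feature': iris.feature_names,
    'Mean over trees': per_tree.mean(axis=0),
    'Std over trees': per_tree.std(axis=0),
}).sort_values('Mean over trees', ascending=False)
print(spread_df)
```

A large standard deviation relative to the mean suggests that the trees disagree about how useful a feature is.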

Output:

[Image: printed table of features ranked by Gini importance]

Visualization plot for feature importance

Python
plt.figure(figsize=(8, 4))
plt.barh(feature_names, importances, color='skyblue')
plt.xlabel('Gini Importance')
plt.title('Feature Importance - Gini Importance')
plt.gca().invert_yaxis()
plt.show()

Output:

[Image: horizontal bar chart of Gini importance per feature]

Method 2: Mean Decrease Accuracy (MDA)

  • Mean Decrease Accuracy (MDA) measures how much the model's accuracy drops when the values of a single feature are randomly shuffled after training, which breaks the link between that feature and the target.
  • In the original Random Forest formulation the drop is measured on each tree's out-of-bag samples and averaged over trees; the code below uses the held-out test set instead.
  • A larger drop in accuracy means the feature is more important.
Python
rng = np.random.default_rng(42)  # fixed seed so the shuffles are reproducible
importances = []
initial_accuracy = accuracy_score(y_test, clf.predict(X_test))
for i in range(X_test.shape[1]):
    X_test_copy = X_test.copy()
    # Shuffle column i only, leaving the other features unchanged
    X_test_copy[:, i] = rng.permutation(X_test_copy[:, i])
    shuff_accuracy = accuracy_score(y_test, clf.predict(X_test_copy))
    # Importance = drop in accuracy caused by the shuffle
    importances.append(initial_accuracy - shuff_accuracy)

accuracy_df = pd.DataFrame({'Feature': feature_names, 'Decrease in Accuracy': importances}).sort_values(
    'Decrease in Accuracy', ascending=False)
print(accuracy_df)
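A single shuffle per feature can give noisy results on a small test set, so one option is to repeat the shuffle several times and average the drops. The sketch below is one way to do that; the function name mean_decrease_accuracy and its n_repeats and seed parameters are hypothetical, and the setup is repeated so the block runs on its own.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def mean_decrease_accuracy(model, X, y, n_repeats=20, seed=0):
    """Average accuracy drop per feature over several independent shuffles."""
    rng = np.random.default_rng(seed)
    baseline = accuracy_score(y, model.predict(X))
    drops = np.zeros((X.shape[1], n_repeats))
    for col in range(X.shape[1]):
        for rep in range(n_repeats):
            X_shuffled = X.copy()
            # Permute one column, leaving the others untouched
            X_shuffled[:, col] = rng.permutation(X_shuffled[:, col])
            drops[col, rep] = baseline - accuracy_score(y, model.predict(X_shuffled))
    return drops.mean(axis=1), drops.std(axis=1)


iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=42)
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

mda_mean, mda_std = mean_decrease_accuracy(clf, X_test, y_test)
for name, mean, std in zip(iris.feature_names, mda_mean, mda_std):
    print(f"{name:20s} {mean:.4f} +/- {std:.4f}")
```

The standard deviation shows how much the estimate moves between shuffles, which is useful when deciding whether a small drop is meaningful.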

Output:

[Image: printed table of features ranked by decrease in accuracy]

Visualization plot for feature importance

Python
plt.figure(figsize=(8, 4))
plt.barh(feature_names, importances, color='skyblue')
plt.xlabel('Mean Decrease Accuracy')
plt.title('Feature Importance - Mean Decrease Accuracy')
plt.gca().invert_yaxis()
plt.show()

Output:

[Image: horizontal bar chart of mean decrease in accuracy per feature]

Method 3: Permutation Feature Importance

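  • Measures importance by shuffling each feature several times and averaging the resulting drop in the model's score.
  • Unlike Gini importance, it does not depend on model internals and is computed on held-out data, so it is less biased toward features with many distinct values.
  • The permutation_importance() function from scikit-learn automates this process; n_repeats sets the number of shuffles per feature.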
Python
from sklearn.inspection import permutation_importance
result = permutation_importance(
    clf, X_test, y_test, n_repeats=10, random_state=0, n_jobs=-1)
perm_imp_df = pd.DataFrame({'Feature': feature_names, 'Permutation Importance': result.importances_mean}).sort_values(
    'Permutation Importance', ascending=False)
print(perm_imp_df)
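Besides the mean, the object returned by permutation_importance() also holds the standard deviation and the raw score drop for every repeat. The sketch below prints them; it repeats the setup so it runs on its own, and perm_detail_df is a name chosen for this example.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=42)
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

result = permutation_importance(
    clf, X_test, y_test, n_repeats=10, random_state=0)

# Mean and spread of the score drop across the 10 shuffles of each feature
perm_detail_df = pd.DataFrame({
    'Feature': iris.feature_names,
    'Mean': result.importances_mean,
    'Std': result.importances_std,
}).sort_values('Mean', ascending=False)
print(perm_detail_df)

# Raw values: one row per feature, one column per repeat
print(result.importances.shape)
```

Reporting the standard deviation alongside the mean makes it easier to tell a real effect from shuffle-to-shuffle noise.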

Output:

[Image: printed table of features ranked by permutation importance]

Here's what the permutation importance values suggest in this output:

  • Petal length (cm): has the highest permutation importance value (0.210526), indicating that shuffling this feature causes the largest drop in test accuracy.
  • Petal width (cm): also has a relatively high permutation importance value (0.144737), though lower than petal length (cm).
  • Sepal length (cm) and sepal width (cm): have permutation importance values of 0, meaning that shuffling these features did not change the model's test accuracy in this run.
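Placing the two kinds of scores side by side can make their differences easier to see. Gini importances are computed from the training-time splits and always sum to 1, so a feature may receive a non-zero Gini score even when shuffling it leaves test accuracy unchanged. The sketch below assumes the same split and model settings as above; comparison_df is a name chosen for this example.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=42)
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

result = permutation_importance(
    clf, X_test, y_test, n_repeats=10, random_state=0)

# One row per feature, one column per importance measure
comparison_df = pd.DataFrame({
    'Gini Importance': clf.feature_importances_,
    'Permutation Importance': result.importances_mean,
}, index=iris.feature_names).sort_values('Gini Importance', ascending=False)
print(comparison_df)
```

The two columns are on different scales (share of total impurity reduction versus drop in accuracy), so the rankings are more comparable than the raw numbers.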

Visualization plot for feature importance

Python
plt.figure(figsize=(6, 6))
plt.bar(perm_imp_df['Feature'], perm_imp_df['Permutation Importance'])
plt.xlabel('Feature')
plt.ylabel('Permutation Importance')
plt.title('Permutation Feature Importance')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()

Output:

[Image: bar chart of permutation importance per feature]

Method 4: SHAP values

A SHAP (SHapley Additive exPlanations) value measures how much a feature contributes to the prediction for a single instance.

  • For a given prediction made by a Random Forest model, a SHAP value is calculated for each feature of that specific instance.
  • Conceptually, it averages the change in the prediction from adding the feature over all possible subsets of the other features, so interactions between features are accounted for.

A positive SHAP value means the feature pushes the prediction above the model's average output, while a negative value pushes it below. The magnitude of the SHAP value represents the strength of the contribution.
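The subset-based definition can be sketched by brute force on a tiny example. The toy model, the instance and the all-zero baseline below are hypothetical and chosen only to keep the arithmetic easy to follow. The shap library's TreeExplainer uses a much faster tree-specific algorithm and a background distribution rather than a single baseline, so this is an illustration of the idea, not of the library's internals.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating every subset of the other features."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Weight of a subset of this size in the Shapley average
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                # Marginal gain from adding feature i to this subset
                gain = value_fn(set(subset) | {i}) - value_fn(set(subset))
                phi[i] += weight * gain
    return phi


# Hypothetical toy model with one interaction term (x0 * x2)
def toy_model(x):
    return 2 * x[0] + 3 * x[1] + x[0] * x[2]


instance = [1.0, 1.0, 2.0]
baseline = [0.0, 0.0, 0.0]


def value_fn(present):
    # Features outside `present` are replaced by their baseline value
    x = [instance[j] if j in present else baseline[j]
         for j in range(len(instance))]
    return toy_model(x)


phi = shapley_values(value_fn, n_features=3)
print("Shapley values:", phi)
print("Sum of values:", sum(phi))
print("Prediction minus baseline:", toy_model(instance) - toy_model(baseline))
```

Here the values come out at approximately 3, 3 and 1: the interaction term contributes 2 in total and is split equally between features 0 and 2. Their sum, 7, equals the prediction minus the baseline output, which is the additivity property that SHAP relies on.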

Python
import shap
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

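explainer = shap.TreeExplainer(clf)
shap_values = explainer.shap_values(X_test)

# Average |SHAP| over samples and classes to get one score per feature.
# Older SHAP releases return a list with one (n_samples, n_features) array per class;
# newer releases return one array of shape (n_samples, n_features, n_classes).
if isinstance(shap_values, list):
    shap_summary = np.mean(np.abs(np.stack(shap_values)), axis=(0, 1))
elif np.ndim(shap_values) == 3:
    shap_summary = np.abs(shap_values).mean(axis=(0, 2))
else:
    shap_summary = np.abs(shap_values).mean(axis=0)
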
shap_summary_df = pd.DataFrame({
    'Feature': feature_names,
    'Mean |SHAP Value|': shap_summary
}).sort_values('Mean |SHAP Value|', ascending=False)

print(shap_summary_df)
plt.figure(figsize=(8, 4))
plt.barh(shap_summary_df['Feature'],
         shap_summary_df['Mean |SHAP Value|'], color='skyblue')
plt.xlabel('Mean Absolute SHAP Value')
plt.title('Feature Importance (SHAP Values)')
plt.gca().invert_yaxis()
plt.show()

Output:

[Image: printed table of mean absolute SHAP values per feature]
[Image: horizontal bar chart of mean absolute SHAP values per feature]