How to Interpret Regression Model Diagnostics in Python

Image by Editor

Regression analysis is a common method used to predict continuous values, like sales or prices. Building a regression model is easy, but the results are only reliable if the model’s assumptions are met. Diagnostics help identify issues like multicollinearity, heteroscedasticity, non-linearity, or outliers that can distort predictions. Python provides libraries such as statsmodels, scikit-learn, and matplotlib to visualize and assess these aspects.

In this article, we’ll explore how to interpret common regression diagnostics in Python and use them to refine model performance.

Setting Up The Model

We’ll use the Diabetes dataset and fit a regression model:

import pandas as pd
import statsmodels.api as sm
from sklearn.datasets import load_diabetes

# Load Diabetes dataset
data = load_diabetes(as_frame=True)
df = data.frame

# Select predictors and target
X = df[['bmi', 'bp', 's1']]   
y = df['target']

# Add constant for intercept
X = sm.add_constant(X)

# Fit OLS regression
model = sm.OLS(y, X).fit()

Diagnostic Checks

Diagnostics help verify whether the assumptions of linear regression are satisfied. Let’s look at the main plots.

Residuals vs. Fitted Values Plot

The Residuals vs. Fitted Values plot shows whether residuals are randomly scattered around zero. Patterns suggest non-linearity or heteroscedasticity (unequal variance).

import matplotlib.pyplot as plt

fitted = model.fittedvalues
resid = model.resid

plt.figure(figsize=(6,4))
plt.scatter(fitted, resid, alpha=0.6)
plt.axhline(0, linestyle='--')
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.title("Residuals vs. Fitted")
plt.show()

Normal Q–Q Plot

The Normal Q–Q plot compares residuals to a normal distribution. If points fall along the diagonal line, residuals are approximately normal; deviations suggest skewness or heavy tails.

import matplotlib.pyplot as plt
from statsmodels.graphics.gofplots import qqplot

plt.figure(figsize=(6,4))
qqplot(resid, line='45', fit=True)
plt.title("Normal Q–Q")
plt.show()

Scale–Location Plot

The Scale–Location plot displays the spread of residuals across fitted values. A random, even spread supports homoscedasticity, while a funnel shape indicates heteroscedasticity.

import numpy as np
import matplotlib.pyplot as plt

influence = model.get_influence()
std_resid = influence.resid_studentized_internal
sqrt_abs_std_resid = np.sqrt(np.abs(std_resid))

plt.figure(figsize=(6,4))
plt.scatter(model.fittedvalues, sqrt_abs_std_resid, alpha=0.6)
plt.xlabel("Fitted values")
plt.ylabel("√|Standardized residuals|")
plt.title("Scale–Location")
plt.show()

Residuals vs. Leverage Plot

The Residuals vs. Leverage plot identifies influential observations. High-leverage points with large residuals can disproportionately affect model estimates.

import matplotlib.pyplot as plt
from statsmodels.graphics.regressionplots import plot_leverage_resid2

fig, ax = plt.subplots(figsize=(6,4))
plot_leverage_resid2(model, ax=ax)
ax.set_title("Residuals vs. Leverage")
plt.show()

Statistical Tests

In addition to visual checks, statistical tests provide formal ways to detect violations.

Breusch–Pagan Test

The Breusch–Pagan test checks for heteroscedasticity (non-constant variance of residuals). A high p-value supports homoscedasticity, while a low one indicates heteroscedasticity.

from statsmodels.stats.diagnostic import het_breuschpagan

bp_stat, bp_p, f_stat, f_p = het_breuschpagan(resid, model.model.exog)
print(f"Breusch–Pagan: LM={bp_stat:.3f}, p={bp_p:.4f} | F={f_stat:.3f}, p={f_p:.4f}")

Breusch–Pagan: LM=5.758, p=0.1240 | F=1.927, p=0.1244

With p=0.1240 (> 0.05), we fail to reject the null hypothesis, suggesting the homoscedasticity assumption is satisfied.

Shapiro–Wilk Test

The Shapiro–Wilk test checks if residuals follow a normal distribution. A high p-value supports normality, while a low one suggests a violation.

from scipy.stats import shapiro

W, p = shapiro(resid)
print(f"Shapiro–Wilk: W={W:.3f}, p={p:.4f}")

Shapiro–Wilk: W=0.990, p=0.0060

With p=0.0060 (< 0.05), the residuals show slight deviation from normality, though this is often acceptable with larger sample sizes.

Durbin–Watson Test

The Durbin–Watson test measures autocorrelation in residuals. Values near 2 indicate independence, while values closer to 0 or 4 show correlation.

from statsmodels.stats.stattools import durbin_watson

dw = durbin_watson(resid)
print(f"Durbin–Watson: {dw:.3f}")

Durbin–Watson: 1.918

Note: The Durbin–Watson test is most relevant for time series data. For this cross-sectional dataset, the independence assumption is primarily addressed through proper sampling design.

Multicollinearity Diagnostics

Multicollinearity occurs when independent variables are highly correlated with each other. This can inflate standard errors, make coefficient estimates unstable, and reduce the interpretability of the model.

A common way to detect multicollinearity is by calculating the Variance Inflation Factor (VIF) for each predictor:

from statsmodels.stats.outliers_influence import variance_inflation_factor

X_vif = X

features = X_vif.columns
vifs = []
for i, col in enumerate(features):
    vif_val = variance_inflation_factor(X_vif.values, i)
    vifs.append((col, vif_val))

print(vifs)

[('const', np.float64(0.9999999999999998)), ('bmi', np.float64(1.2217707711234034)), ('bp', np.float64(1.2170977247330208)), ('s1', np.float64(1.0951284725580026))]

All VIF values are below 5, indicating no problematic multicollinearity among the predictors.

Addressing Violations

When regression diagnostics indicate violations of model assumptions, it is important to take corrective actions to ensure reliable results. Some common approaches include:

Heteroscedasticity: Apply data transformations, use weighted least squares, or use heteroscedasticity-robust standard errors (e.g., HC3)
Non-normality: Transform the dependent variable (e.g., Box–Cox or Yeo–Johnson), apply non-parametric regression methods, or increase sample size if feasible
Multicollinearity: Apply principal component regression, or use regularization methods such as ridge or lasso regression

Conclusion

Regression diagnostics are essential for validating model assumptions and ensuring reliable inference. Visual plots help detect patterns, while statistical tests provide formal confirmation of potential issues. Multicollinearity checks further strengthen model stability. If violations are found, corrective techniques such as transformations, robust methods, or regularization should be applied to improve accuracy.