The Accuracy-Interpretability Trade-off: Why Simple Linear Surrogates Often Fail Complex Models

Introduction

In the modern era of artificial intelligence, we are increasingly reliant on “black-box” models: deep neural networks, gradient-boosted trees, and large ensemble learners. These models often achieve state-of-the-art predictive accuracy, but they keep their decision-making logic hidden behind layers of non-linear transformations. To bridge this gap, practitioners frequently turn to surrogates: simple, interpretable models (most often linear regression, sometimes a shallow decision tree) trained to mimic the behavior of the complex model.

While linear surrogates offer the comfort of readability, they come with a significant cost: fidelity that is often low. As the complexity of a model increases, the ability of a linear approximation to capture its nuances decreases. This article explores why this happens, how to measure the gap, and when you should (or should not) rely on these surrogate techniques for high-stakes decision-making.

Key Concepts

To understand the limitation of surrogates, we must define two core metrics: Interpretability and Fidelity.

Interpretability is the degree to which a human can understand the cause of a decision. A linear regression model is highly interpretable because we can point to a coefficient and say, “For every one-unit increase in X, Y increases by Z.”

Fidelity refers to how well the surrogate model replicates the predictions of the original, complex model. A surrogate has high global fidelity if its outputs match the black box across the entire feature space, and high local fidelity if they match in the neighborhood of a single instance. The fundamental problem is that global fidelity is almost impossible to maintain when a complex model captures non-linear interactions, high-dimensional manifolds, or threshold-based logic that a single line or plane cannot represent.
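One common way to put a number on fidelity is an R-squared computed against the black box's predictions rather than against the true labels. The formula below is a sketch of that idea, not the only possible definition; the symbols are assumptions introduced here: f is the black box, g is the surrogate, x_i are the n evaluation points, and w_i are optional proximity weights (set every w_i = 1 for a global, unweighted score).

```latex
R^2_w = 1 - \frac{\sum_{i=1}^{n} w_i \,\bigl(f(x_i) - g(x_i)\bigr)^2}
                 {\sum_{i=1}^{n} w_i \,\bigl(f(x_i) - \bar{f}_w\bigr)^2},
\qquad
\bar{f}_w = \frac{\sum_{i=1}^{n} w_i\, f(x_i)}{\sum_{i=1}^{n} w_i}
```

A value near 1 means the surrogate reproduces the black box on those points; a value near 0 means it does no better than predicting the (weighted) average black-box output.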

The “Accuracy-Interpretability Trade-off” suggests that as models become more powerful, they become less transparent. Linear surrogates attempt to solve this, but they often provide a comforting lie rather than an accurate simplification.

Step-by-Step Guide: Implementing and Evaluating Surrogates

If you choose to use a surrogate for model explanation (such as LIME or similar approaches), follow this workflow to ensure you aren’t misleading yourself or stakeholders.

1. Select the Black-Box Model: Build your high-performance model (e.g., XGBoost, Random Forest). Ensure it is fully trained and performant.
2. Define the Perturbation Space: Since local surrogates work by perturbing inputs to see how the black box reacts, define the local region you care about. Do not try to explain the entire model at once; focus on specific instances or small clusters.
3. Generate Synthetic Data: Create a dataset around your point of interest. If you are explaining a specific loan rejection, create variations of that applicant’s profile.
4. Obtain Predictions: Pass these synthetic samples through the complex model. Its predictions, not the true outcomes, serve as the “ground truth” labels for the surrogate.
5. Train the Surrogate: Fit a weighted linear model (or a shallow decision tree) on these synthetic samples, weighting each sample by its proximity to the point of interest and using the complex model’s predictions as the target variable.
6. Measure Local Fidelity: Calculate the R-squared (for numeric outputs) or accuracy (for class labels) of the surrogate against the black box within that local region. If the fidelity is low, the surrogate’s explanation is likely unreliable.
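The workflow above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not LIME's actual implementation: `black_box` is a hypothetical stand-in for a trained model, and the Gaussian perturbation scale, the exponential proximity kernel, and `kernel_width` are illustrative choices you would tune for real data.

```python
import numpy as np

def black_box(X):
    """Hypothetical stand-in for a trained model: a smooth trend plus a sharp threshold."""
    return np.sin(3.0 * X[:, 0]) + (X[:, 1] > 0.5).astype(float)

def local_surrogate(predict, x0, scale=0.1, n_samples=2000, kernel_width=0.25, seed=0):
    """Fit a proximity-weighted linear model around x0; return (coefficients, weighted R^2)."""
    rng = np.random.default_rng(seed)
    # Steps 2-3: perturb the instance with Gaussian noise to define the local region.
    X = x0 + rng.normal(0.0, scale, size=(n_samples, x0.size))
    # Step 4: the black-box predictions are the surrogate's targets.
    y = predict(X)
    # Weight each sample by its proximity to x0 (exponential kernel, an assumed choice).
    dist = np.linalg.norm(X - x0, axis=1)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)
    # Step 5: weighted least squares, done by scaling rows with sqrt(w).
    A = np.column_stack([np.ones(n_samples), X - x0])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    # Step 6: weighted R^2 of the surrogate against the black box.
    resid = y - A @ coef
    y_bar = np.average(y, weights=w)
    r2 = 1.0 - np.sum(w * resid ** 2) / np.sum(w * (y - y_bar) ** 2)
    return coef, r2

# Same surrogate recipe, two different instances to explain.
smooth_coef, smooth_r2 = local_surrogate(black_box, np.array([0.0, 0.0]))
cliff_coef, cliff_r2 = local_surrogate(black_box, np.array([0.0, 0.5]))
print(f"smooth region:    R^2 = {smooth_r2:.3f}")
print(f"on the threshold: R^2 = {cliff_r2:.3f}")
```

In this toy setup the instance that sits on the threshold should score a noticeably lower R-squared than the one in the smooth region, which is the signal that its linear explanation deserves less trust.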

Examples and Case Studies

Credit Scoring Systems

In finance, transparency is often a regulatory requirement. Suppose a bank uses a deep neural network to predict loan default risk and, to explain a rejection, fits a linear surrogate. Because the neural network uses complex feature interactions (e.g., a specific debt-to-income ratio multiplied by a regional economic index), the linear surrogate might suggest that “Income” was the primary factor for rejection. However, the surrogate may have completely missed the non-linear “cliff” where the interaction of debt and region triggered the rejection. This leads to inaccurate explanations that confuse customers and frustrate regulators.

Predictive Maintenance

An industrial plant uses a Random Forest to predict when a turbine will fail. An engineer uses a linear surrogate to understand which sensors are triggering alerts. The surrogate identifies “Temperature” as the leading cause. However, the Random Forest actually relies on the rate of change of temperature over time, not the temperature itself. The linear surrogate fails to capture the temporal trend, leading the engineer to monitor static temperatures while ignoring the actual volatility that predicts failure.

Common Mistakes

Confusing Local for Global: Assuming that because a surrogate explains one prediction well, it represents the entire model’s logic. Always treat surrogate explanations as local, instance-specific approximations.
Ignoring Feature Interaction: A linear model without explicit interaction terms assumes that each feature contributes additively, so the effect of one feature never depends on the value of another. If your complex model thrives on interaction effects (e.g., A x B), a linear surrogate will effectively “average out” these effects, obscuring the truth.
Over-trusting the Surrogate: Using a surrogate without checking the R-squared value of the surrogate model itself. If the surrogate’s fit to the black box is poor, its “explanation” is effectively noise.
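Data Range Mismatch: Using a surrogate to explain a prediction that falls outside the distribution of the original training data, or fitting it on perturbed samples that no real data point resembles. In those regions the black box’s own behavior is unvalidated, and the surrogate will likely extrapolate in ways the black box never would.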
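The interaction mistake in particular is easy to reproduce. The sketch below assumes a hypothetical black box whose output is nothing but the product of two features, sampled symmetrically around zero; the data and the helper `fit_r2` are made up for illustration. It fits an additive linear surrogate, then the same surrogate with an explicit interaction column.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(5000, 2))
y = X[:, 0] * X[:, 1]  # hypothetical black box driven purely by an A x B interaction

def fit_r2(A, y):
    """Ordinary least squares; returns (coefficients, R^2 against y)."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return coef, 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

ones = np.ones(len(X))
# Additive surrogate: intercept + A + B.
additive_coef, additive_r2 = fit_r2(np.column_stack([ones, X]), y)
# Same surrogate with an explicit A x B column added.
inter_coef, inter_r2 = fit_r2(np.column_stack([ones, X, X[:, 0] * X[:, 1]]), y)

print("additive coefficients:", np.round(additive_coef, 3), "R^2:", round(additive_r2, 3))
print("with interaction term R^2:", round(inter_r2, 3))
```

The additive surrogate should report coefficients close to zero for both features and an R-squared close to zero, even though the black box depends on nothing else. Checking the fit is what exposes the problem; the coefficients alone would suggest that neither feature matters.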

Advanced Tips

To overcome the fidelity issues of simple linear surrogates, move toward more sophisticated diagnostic frameworks:

1. Use SHAP (SHapley Additive exPlanations)

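Unlike a fitted linear surrogate, SHAP is rooted in cooperative game theory. It assigns each feature an importance value for a particular prediction, and those values sum exactly to the difference between the prediction and a baseline expectation. Interaction effects are therefore not dropped; they are shared among the features that take part in them. Exact Shapley values are expensive to compute in general, although efficient algorithms exist for tree ensembles. Keep in mind that SHAP is still an additive summary: it tells you how much credit each feature receives, not the shape of the interaction itself.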
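To make the game-theoretic idea concrete, here is a brute-force Shapley computation in plain Python. It is a sketch under simplifying assumptions, not the `shap` library's algorithm: `black_box` is a hypothetical three-feature model, and an “absent” feature is simulated by substituting a single baseline value rather than averaging over a background dataset. The cost grows exponentially with the number of features, so this only works for toy cases.

```python
from itertools import combinations
from math import factorial

def black_box(x):
    """Hypothetical model with an interaction between features 0 and 1."""
    return 2.0 * x[0] + x[0] * x[1] + 0.5 * x[2]

def shapley_values(f, x, baseline):
    """Exact Shapley values; features outside the coalition take their baseline value."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # k = size of the coalition that feature i joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                total += weight * (value(set(s) | {i}) - value(set(s)))
        phi.append(total)
    return phi

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(black_box, x, baseline)
print("attributions:", [round(p, 6) for p in phi])
print("sum:", round(sum(phi), 6), "f(x) - f(baseline):", black_box(x) - black_box(baseline))
```

For this input the attributions come out to approximately 3.0, 1.0, and 1.5: the interaction term x0 * x1 contributes 2.0 in total and is split equally between features 0 and 1, and the three values add up to f(x) - f(baseline) = 5.5.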

2. Partial Dependence Plots (PDP) and ALE Plots

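Instead of trying to approximate the model with a straight line, use PDPs or Accumulated Local Effects (ALE) plots to visualize how the model output changes as a specific feature varies. These are model-agnostic and capture non-linear relationships much better than a static linear coefficient. PDPs vary one feature while holding the others at their observed values, which can create unrealistic combinations when features are strongly correlated; ALE plots were designed to reduce that problem.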
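A partial dependence curve is simple enough to compute by hand, which helps show what it measures. The sketch below assumes a hypothetical `black_box` that is U-shaped in its first feature; the function names, data, and grid are illustrative. For contrast, it also computes the slope a single linear fit would assign to that feature.

```python
import numpy as np

def black_box(X):
    """Hypothetical model: U-shaped in feature 0, linear in feature 1."""
    return X[:, 0] ** 2 + 0.5 * X[:, 1]

def partial_dependence(predict, X, feature, grid):
    """Average prediction when `feature` is forced to each grid value for every row."""
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value  # overwrite one column, keep the others as observed
        averages.append(predict(X_mod).mean())
    return np.array(averages)

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(1000, 2))
grid = np.linspace(-1.0, 1.0, 5)

pd_curve = partial_dependence(black_box, X, feature=0, grid=grid)
# Slope of a straight-line fit of the model output on feature 0 alone.
slope = np.polyfit(X[:, 0], black_box(X), 1)[0]

print("grid:              ", grid)
print("partial dependence:", np.round(pd_curve, 3))
print("linear slope:      ", round(slope, 3))
```

The partial dependence values should be highest at both ends of the grid and lowest in the middle, revealing the U shape, while the linear slope should be close to zero and would suggest the feature has little effect.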

3. Use “Glass-Box” Models from the Start

If you truly need both high interpretability and decent performance, explore models like Explainable Boosting Machines (EBMs). These models are designed to be additive and interpretable while often rivaling the performance of complex tree-based ensembles on tabular data. They bypass the need for a surrogate by being interpretable by design.

Conclusion

Linear surrogates serve a valuable purpose in making complex models accessible, but they should never be mistaken for the models themselves. Their low fidelity is a structural limitation, not just a technical oversight. When you simplify, you lose information—sometimes the very information that defines the model’s accuracy.

For high-stakes applications, do not rely solely on simple linear approximations. Validate your explanations with fidelity metrics, explore game-theoretic approaches like SHAP, and consider using inherently interpretable models where possible. Transparency in AI is not just about making a model simple; it is about accurately representing the complexity that drives the prediction.