Wednesday, January 8, 2025

Why Linear Models Have Higher Bias


Linear models tend to exhibit higher bias than more flexible models because they assume a simple, fixed form for the relationship between the input features and the target variable.


1. Assumption of Linearity

Nature of the Model:
Linear models assume a linear relationship between the input features (x) and the output (y):
                                                        y = w_1 x_1 + w_2 x_2 + \cdots + b

Limitation:
Real-world data often exhibits complex, non-linear relationships that a linear model cannot capture. As a result:

  • The predictions deviate systematically from the true values, no matter how much training data is available, which is what high bias means.
  • The model underfits the data because it oversimplifies the problem.
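To make the underfitting concrete, here is a minimal sketch using only the Python standard library. The quadratic data, the noise level, and the helper names `fit_line` and `mse` are assumptions chosen for illustration. It fits a straight line to data generated from y = x^2, then fits the same kind of model on the transformed feature x^2.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = w*x + b with one feature."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    w = sxy / sxx
    return w, y_bar - w * x_bar

def mse(preds, ys):
    """Mean squared error between predictions and targets."""
    return mean((p - y) ** 2 for p, y in zip(preds, ys))

random.seed(0)
xs = [i / 10 for i in range(-30, 31)]             # inputs from -3.0 to 3.0
ys = [x ** 2 + random.gauss(0, 0.1) for x in xs]  # quadratic target plus small noise

# Straight line on the raw feature: the model class cannot bend to follow x^2.
w, b = fit_line(xs, ys)
linear_mse = mse([w * x + b for x in xs], ys)

# Same fitting procedure, but on the hand-made feature x^2.
squared = [x ** 2 for x in xs]
w2, b2 = fit_line(squared, ys)
quadratic_mse = mse([w2 * s + b2 for s in squared], ys)

print(f"line on x:   slope={w:.3f}, mse={linear_mse:.3f}")
print(f"line on x^2: slope={w2:.3f}, mse={quadratic_mse:.3f}")
```

The straight line ends up nearly flat and its error stays large, while the second fit is limited only by the noise. Collecting more points from the same curve would not help the first model, because the error comes from the assumed form rather than from a lack of data.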


2. Reduced Model Complexity

  • Simple Parameterization:
    Linear models have relatively few parameters (one weight per feature plus a bias term), which limits their flexibility.

  • Limitation:
    They cannot adapt to intricate patterns in the data, especially in cases with strong feature interactions or non-linear dependencies.
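As a rough illustration of how few parameters that is, the sketch below compares a linear model with a full polynomial model of a given degree. The function names are hypothetical. The polynomial count uses the standard result that there are C(n + d, d) monomials of total degree at most d in n variables.

```python
from math import comb

def linear_param_count(n_features):
    """One weight per feature plus a single bias term."""
    return n_features + 1

def polynomial_param_count(n_features, degree):
    """Coefficients of a full polynomial: one per monomial of total degree <= degree."""
    return comb(n_features + degree, degree)

for n in (2, 10, 100):
    print(
        f"{n} features: linear={linear_param_count(n)}, "
        f"degree 2={polynomial_param_count(n, 2)}, "
        f"degree 3={polynomial_param_count(n, 3)}"
    )
```

With 10 features the linear model has 11 parameters, while a degree-2 polynomial already has 66. The smaller count is exactly what limits the shapes a linear model can represent.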


3. Lack of Feature Interactions

  • Nature of Interactions:
    Linear models do not automatically account for interactions between features (e.g., the combined effect of x_1 and x_2 on y).

  • Limitation:
    In many real-world problems, feature interactions play a critical role. Ignoring these interactions increases the model's bias.
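A small sketch of this effect, under stated assumptions: the target y = x_1 + x_2 + 2 x_1 x_2 and the 3x3 grid of inputs are made up for illustration. Because the columns on this balanced grid are mutually orthogonal, each least-squares coefficient can be computed on its own. The helper `ols_orthogonal` relies on that shortcut and is not a general solver.

```python
from itertools import product

def ols_orthogonal(columns, ys):
    """Least-squares coefficients, valid ONLY when the columns are mutually orthogonal."""
    return [
        sum(c * y for c, y in zip(col, ys)) / sum(c * c for c in col)
        for col in columns
    ]

def mse(columns, coefs, ys):
    """Mean squared error of the linear combination of columns against ys."""
    preds = [sum(w * col[i] for w, col in zip(coefs, columns)) for i in range(len(ys))]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

# Balanced grid of inputs: every combination of x1, x2 in {-1, 0, 1}.
grid = list(product([-1.0, 0.0, 1.0], repeat=2))
ones = [1.0] * len(grid)                        # bias column
x1 = [a for a, _ in grid]
x2 = [b for _, b in grid]
x1x2 = [a * b for a, b in grid]                 # hand-made interaction feature
ys = [a + b + 2.0 * a * b for a, b in grid]     # target with an interaction effect

additive = [ones, x1, x2]
with_interaction = additive + [x1x2]

coef_add = ols_orthogonal(additive, ys)
coef_int = ols_orthogonal(with_interaction, ys)
mse_add = mse(additive, coef_add, ys)
mse_int = mse(with_interaction, coef_int, ys)

print("additive model:   coefs", coef_add, "mse", round(mse_add, 4))
print("with x1*x2 term:  coefs", coef_int, "mse", round(mse_int, 4))
```

The additive model recovers the two main effects but leaves the whole interaction term as error. Adding the product x_1 x_2 as an extra feature removes that error while keeping the model linear in its parameters.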
 

4. High Bias by Design

  • Simplified Decision Boundaries:
    In classification tasks, linear models create straight-line decision boundaries (hyperplanes in higher dimensions). These boundaries cannot separate classes whose true boundary is curved or disconnected.

  • Example:
    In image classification, a linear model applied to raw pixels cannot capture spatial and hierarchical patterns, so it typically underperforms.
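The classic small-scale version of this limitation is XOR. The sketch below uses a simple perceptron as a stand-in for any linear classifier; the training loop, the epoch count, and the function names are assumptions for illustration. It learns AND, which a single line can separate, but cannot get all four XOR points right.

```python
def predict(w, b, x1, x2):
    """Linear classifier: class 1 on one side of the line w.x + b = 0, else class 0."""
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0

def train_perceptron(points, labels, epochs=50):
    """Classic perceptron updates with learning rate 1, starting from zero weights."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            err = y - predict(w, b, x1, x2)   # 0 when correct, +1 or -1 when wrong
            w[0] += err * x1
            w[1] += err * x2
            b += err
    return w, b

def accuracy(w, b, points, labels):
    """Fraction of points classified correctly."""
    hits = sum(predict(w, b, x1, x2) == y for (x1, x2), y in zip(points, labels))
    return hits / len(points)

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [0, 0, 0, 1]   # linearly separable
xor_labels = [0, 1, 1, 0]   # not linearly separable

w_and, b_and = train_perceptron(points, and_labels)
w_xor, b_xor = train_perceptron(points, xor_labels)
and_acc = accuracy(w_and, b_and, points, and_labels)
xor_acc = accuracy(w_xor, b_xor, points, xor_labels)

print("AND accuracy:", and_acc)
print("XOR accuracy:", xor_acc)
```

No amount of extra training fixes the XOR case: no single line puts (0, 1) and (1, 0) on one side and (0, 0) and (1, 1) on the other, so the accuracy of any linear classifier here is capped at 0.75.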

5. Robustness vs. Bias

  • Intention:
    Linear models are deliberately simple, which makes them robust and interpretable, but at the cost of higher bias.

  • Tradeoff:
    They are less prone to overfitting (low variance) but can underfit the data when the true relationship is not linear.
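This tradeoff is usually written as the bias-variance decomposition of the expected squared error. The notation below is an assumption added here, not taken from the post: the data follow y = f(x) + \varepsilon with zero-mean noise of variance \sigma^2, \hat{f} is the fitted model, and the expectation is over training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

A linear model keeps the variance term small because its fit changes little from one training set to another. It pays for that in the bias term whenever f is not close to linear.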


When Are Linear Models Useful Despite High Bias?

When Relationships Are Actually Linear:

  • If the underlying relationship is approximately linear, linear models work well.
  • Example: Predicting house prices based on square footage.

When Interpretability Is Key:

  • Linear models are easier to interpret compared to complex non-linear models.

When Data Is Limited:

  • Linear models generalize better when there isn’t enough data to support more complex models.
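A rough simulation of the limited-data case, with every setting assumed for illustration: the true relationship is linear (y = 2x + 1) with noisy targets, each training set has only 10 points, and a 1-nearest-neighbor predictor stands in for a more flexible, higher-variance model.

```python
import random

def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = w*x + b with one feature."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    w = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return w, y_bar - w * x_bar

def nearest_neighbor_predict(train_xs, train_ys, x):
    """Flexible baseline: copy the target of the closest training point."""
    idx = min(range(len(train_xs)), key=lambda i: abs(train_xs[i] - x))
    return train_ys[idx]

def sample(n, rng):
    """Draw n points from the assumed process y = 2x + 1 + Gaussian noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 2.0) for x in xs]
    return xs, ys

def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

rng = random.Random(0)
linear_errors, nn_errors = [], []
for _ in range(200):                       # repeat over many small training sets
    train_xs, train_ys = sample(10, rng)   # only 10 training points
    test_xs, test_ys = sample(200, rng)
    w, b = fit_line(train_xs, train_ys)
    linear_errors.append(mse([w * x + b for x in test_xs], test_ys))
    nn_preds = [nearest_neighbor_predict(train_xs, train_ys, x) for x in test_xs]
    nn_errors.append(mse(nn_preds, test_ys))

avg_linear = sum(linear_errors) / len(linear_errors)
avg_nn = sum(nn_errors) / len(nn_errors)
print(f"average test MSE, linear fit:       {avg_linear:.2f}")
print(f"average test MSE, nearest neighbor: {avg_nn:.2f}")
```

The nearest-neighbor predictor memorizes the noise in each training point, so its test error is noticeably higher on average. The comparison would look different if the true relationship were strongly non-linear or if much more data were available.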






