
Bayesian Feature Engineering: Incorporating Prior Beliefs into Feature Design

From TradingHabits, the trading encyclopedia · 7 min read · February 28, 2026

Linear models are an effective and interpretable tool for a wide range of machine learning problems. However, they assume that the relationship between the features and the target is linear, and in many real-world applications this assumption does not hold: the underlying relationship is often non-linear, and a plain linear model fails to capture that complexity. This article explores two techniques for extending linear models to capture non-linear relationships: feature interactions and polynomial feature creation.

Feature Interaction: Capturing Synergistic Effects

In many cases, the effect of one feature on the target depends on the value of another feature. This is known as a feature interaction. For example, in a model that predicts the price of a house, the effect of the number of bedrooms on the price may depend on the size of the house: an extra bedroom adds more value in a large house than in a small one, where the additional room comes at the cost of cramped living space. A linear model cannot capture this interaction, because it assumes the effect of the number of bedrooms on the price is the same for every house size.

To capture this interaction, we can create a new feature that is the product of the two interacting features. In our house price example, we could create a new feature that is the product of the number of bedrooms and the size of the house. This new feature would allow the model to learn a different relationship between the number of bedrooms and the price for different house sizes.
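As a minimal sketch of this idea, the snippet below builds a design matrix with an interaction column. The housing numbers are made up for illustration; only the product-of-features construction comes from the text above.

```python
import numpy as np

# Hypothetical housing data (illustrative values, not real data).
bedrooms = np.array([2.0, 3.0, 4.0, 3.0])
size_sqft = np.array([800.0, 1500.0, 2600.0, 1200.0])

# Interaction term: the element-wise product of the two features.
interaction = bedrooms * size_sqft

# Design matrix: original features plus the interaction column.
# A linear model fit on X can now learn a bedroom effect that
# varies with house size.
X = np.column_stack([bedrooms, size_sqft, interaction])
```

The model's coefficient on the interaction column controls how strongly the bedroom effect scales with size; a coefficient of zero recovers the plain additive model.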

Polynomial Feature Creation: Modeling Non-Linear Relationships

Another way to capture non-linear relationships is to create polynomial features. This involves creating new features that are powers of the original features. For example, if we have a feature x, we can create a new feature (x^2). This would allow the model to learn a quadratic relationship between x and the target. We can also create higher-order polynomial features, such as (x^3) and (x^4), to capture more complex non-linear relationships.

In addition to creating powers of individual features, we can also create interaction terms between polynomial features. For example, if we have two features x and y, we can create a new feature that is the product of (x^2) and (y^2). This would allow the model to learn a more complex, non-linear interaction between x and y.
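The expansion described above can be sketched in a few lines. This is a hand-rolled version for clarity (scikit-learn's `PolynomialFeatures` performs the same expansion); the function name and the example inputs are illustrative choices, not from the article.

```python
import numpy as np

def polynomial_expand(x, y, degree=2):
    """Return columns x^i * y^j for every pair with 0 < i + j <= degree.

    A minimal sketch of polynomial feature creation for two features,
    including cross terms such as x*y.
    """
    cols, names = [], []
    for total in range(1, degree + 1):
        for i in range(total + 1):
            j = total - i
            cols.append((x ** i) * (y ** j))
            names.append(f"x^{i}*y^{j}")
    return np.column_stack(cols), names

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

# degree=2 yields five columns: y, x, y^2, x*y, x^2.
X_poly, names = polynomial_expand(x, y, degree=2)
```

Raising `degree` to 4 would include the (x^2)(y^2) term mentioned above; note that the number of columns grows quickly with the degree, which feeds directly into the bias-variance discussion in the next section.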

The Bias-Variance Trade-off

When creating feature interactions and polynomial features, it is important to be mindful of the bias-variance trade-off. By adding more features to the model, we are increasing its complexity. This can lead to a decrease in bias, as the model is better able to fit the training data. However, it can also lead to an increase in variance, as the model is more likely to overfit the training data and perform poorly on new, unseen data.

To avoid overfitting, it is important to use a regularization technique, such as L1 or L2 regularization. Regularization adds a penalty term to the loss function that discourages the model from learning large coefficients. This can help to reduce the variance of the model and improve its generalization performance.

Conclusion

Feature interaction and polynomial feature creation are effective techniques for extending linear models to capture non-linear relationships. By creating new features that are combinations of the original features, we can allow the model to learn more complex, non-linear relationships between the features and the target. However, it is important to be mindful of the bias-variance trade-off and to use a regularization technique to avoid overfitting. When used correctly, these techniques can be a valuable addition to any machine learning practitioner's toolkit.