Overfitting in machine learning is a problem in which an ML model fails to generalize to unseen data because its predictions match the training data too closely. An overfitted model has high variance and low bias. Overfitting is the opposite of underfitting; the desired state between these two extremes is a good fit (or best fit).
The most common causes of overfitting are the following:
- High variance and low bias (the model captures the noise in the training data instead of the underlying pattern).
- The model is too complex.
- The model is non-parametric and non-linear. Non-parametric ML models have more freedom to fit the training data and are therefore more prone to overfitting.
- The size of the training data is small, or there is too much noise in the training data (both effects are illustrated in the sketch after this list).
- Lack of regularization.
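To make these causes concrete, the following sketch (Python with scikit-learn; the toy sine dataset, noise level, and polynomial degrees are all invented for illustration) fits polynomials of increasing degree to a small, noisy training set. The most complex model nearly memorizes the training points while its test error explodes:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

# A small, noisy training set: two of the causes listed above.
X_train = rng.uniform(0, 1, size=(15, 1))
y_train = np.sin(2 * np.pi * X_train).ravel() + rng.normal(0, 0.3, 15)
X_test = rng.uniform(0, 1, size=(200, 1))
y_test = np.sin(2 * np.pi * X_test).ravel() + rng.normal(0, 0.3, 200)

for degree in (1, 3, 12):
    # Higher degree = more model complexity; degree 12 can memorize the noise.
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```

A large gap between training error and test error, as printed for the degree-12 model, is the practical signature of overfitting.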
The following methods can be used to reduce overfitting:
- Improve the quality of the training data, for example by removing noise and mislabeled examples.
- Increase the size of the training data.
- Use a linear algorithm for linear data (for parametric algorithms), or tune the hyperparameters appropriately (for non-parametric algorithms).
- Reduce the ML model's complexity; more complex models tend to overfit more.
- Perform regularization by applying a regularization penalty to the model, such as ridge regularization (L2) or lasso regularization (L1); see the regularization sketch after this list.
- Use dropout for neural networks to tackle overfitting; see the dropout sketch after this list.
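Regularization can be demonstrated on the same toy problem. The sketch below (again using scikit-learn; the alpha values are illustrative, not tuned) compares an unregularized degree-12 polynomial fit with ridge (L2) and lasso (L1) penalties:

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)
X_train = rng.uniform(0, 1, size=(15, 1))
y_train = np.sin(2 * np.pi * X_train).ravel() + rng.normal(0, 0.3, 15)
X_test = rng.uniform(0, 1, size=(200, 1))
y_test = np.sin(2 * np.pi * X_test).ravel() + rng.normal(0, 0.3, 200)

# alpha sets the penalty strength; in practice it is chosen by
# cross-validation rather than fixed by hand as it is here.
models = {
    "unregularized": LinearRegression(),
    "ridge (L2)": Ridge(alpha=1.0),
    "lasso (L1)": Lasso(alpha=0.01, max_iter=100_000),
}

for name, estimator in models.items():
    model = make_pipeline(PolynomialFeatures(12), StandardScaler(), estimator)
    model.fit(X_train, y_train)
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"{name:14s} test MSE={test_mse:.3f}")
```

Both penalties shrink the polynomial's coefficients, which reins in the wild oscillations of the unregularized fit; lasso additionally drives some coefficients exactly to zero.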
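Finally, a minimal dropout sketch in PyTorch (the layer sizes and the dropout probability p=0.5 are common illustrative defaults, not recommendations). Dropout randomly zeroes a fraction of activations on each training step, which prevents units from co-adapting, and is switched off at evaluation time:

```python
import torch.nn as nn

# A small feed-forward network with dropout between the hidden layers.
model = nn.Sequential(
    nn.Linear(20, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),  # zeroes ~50% of activations during training
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(64, 1),
)

model.train()  # dropout active: units are randomly dropped each forward pass
model.eval()   # dropout disabled: the full network is used for prediction
```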