normalization

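In machine learning, normalization is a data preparation technique by which the values in a dataset are rescaled to a common range, typically [0,1] or [-1,1]. Normalization changes the scale of the data, not the shape of its distribution, so it does not make the data follow a normal (Gaussian) distribution.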

In min-max normalization, each value x in the dataset is mapped to a corresponding normalized value x' in the range [0,1] as follows.

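$$ x' = \frac{x - \min(x)}{\max(x) - \min(x)} $$

where min(x) and max(x) are the smallest and largest values of x in the dataset.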
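As a minimal sketch, min-max normalization to the [0,1] range, x' = (x - min) / (max - min), could be written in plain Python as below. The function name `min_max_normalize` and the sample data are illustrative assumptions, and returning zeros when all values are equal is just one possible convention.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to the [0, 1] range (min-max normalization)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # All values are identical, so the formula would divide by zero.
        # Returning zeros is an assumed convention, not a standard.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


# Hypothetical sample data for illustration.
data = [10, 20, 30, 40, 50]
normalized = min_max_normalize(data)
print(normalized)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

The smallest value maps to 0 and the largest to 1, with everything else spaced proportionally in between.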

Alternatively, mean normalization centers the data on its mean, so that the normalized x' values fall within the [-1,1] range. It is calculated as follows.

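$$ x' = \frac{x - \mathrm{mean}(x)}{\max(x) - \min(x)} $$

where mean(x) is the average of all values of x in the dataset.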
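Mean normalization, x' = (x - mean) / (max - min), can be sketched the same way. Again, the function name `mean_normalize` and the sample data are assumptions made for illustration only.

```python
def mean_normalize(values):
    """Center values on their mean and scale by the range (max - min)."""
    mean = sum(values) / len(values)
    span = max(values) - min(values)
    if span == 0:
        # Identical values: every deviation from the mean is zero.
        # Returning zeros is an assumed convention.
        return [0.0 for _ in values]
    return [(v - mean) / span for v in values]


# Hypothetical sample data for illustration.
data = [10, 20, 30, 40, 50]
centered = mean_normalize(data)
print(centered)  # [-0.5, -0.25, 0.0, 0.25, 0.5]
```

The results always lie within [-1,1], and they sum to zero because the data has been centered on its mean.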

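It is important to differentiate between normalization and regularization. Normalization and standardization are data preparation methods that are applied to the input data before training. Regularization, by contrast, is applied to the model: it adds a penalty term to the cost function to discourage overly complex models, which reduces overfitting and improves how well the ML model performs on unseen data.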
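To make the distinction concrete, the sketch below shows regularization acting on the cost function rather than on the data. It assumes a mean squared error cost and an L2 (ridge) penalty, which is only one common form of regularization; the function names, the sample values, and the penalty strength `lam` are all hypothetical.

```python
def mse(predictions, targets):
    """Mean squared error between predictions and targets."""
    n = len(targets)
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n


def l2_regularized_cost(predictions, targets, weights, lam):
    """MSE plus an L2 penalty on the model weights (assumed ridge-style form)."""
    penalty = lam * sum(w ** 2 for w in weights)
    return mse(predictions, targets) + penalty


# Hypothetical values for illustration.
preds = [1.0, 2.0, 3.0]
targets = [1.0, 2.0, 5.0]
weights = [0.5, -1.0]

plain = mse(preds, targets)
regularized = l2_regularized_cost(preds, targets, weights, lam=0.1)
print(plain, regularized)
```

The input data is untouched here. Only the cost grows with the size of the weights, which is what pushes training toward simpler models.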

Note that normalization is generally the preferred choice when the x variable data does not follow a normal (Gaussian) distribution or when the data distribution is unknown. When the data is approximately Gaussian, standardization is often used instead.

Related Terms