In machine learning, standardization is a feature engineering technique by which the dataset features are re-scaled to achieve zero-mean value (μ=0) and unit standard deviation value (σ=1). Each x value in the dataset gets a corresponding x' standardized value, which is calculated as follows.

standardization , where μ is the x variable mean and σ is the standard deviation.

The standardized value is also known as Z-Score.

It is important to differentiate between standardization and regularization. Standardization and normalization are data preparation methods, while regularization is used to improve the performance ML models, by adjusting the cost function to eliminate the ML model error. Standardization and normalization are very similar techniques, in that they both change the scale of data to better accommodate for an ML algorithm operations.

It must be noted that standardization is mostly efficient and must be used, when the x variable data follow a normal (Gaussian) distribution.

Related Terms