This post covers two techniques for normalising feature vectors. Normalisation is a preprocessing step required by some machine learning algorithms, particularly those that are sensitive to the scale of their input features.

Min-max normalisation is a simple technique that rescales a feature vector so that its values lie between 0 and 1.

For a vector `X` the rescaled vector `X'` is given by:

`X' = (X - min(X)) / (max(X) - min(X))`

This formula is similar to dividing the vector by its maximum value, except that subtracting the minimum from both the numerator and the denominator handles negative numbers correctly and anchors the range at 0.

`X = [80, 42, 91, 27, 92, 88, 2]`

`min(X) = 2, max(X) = 92`

`X -> [(80 - 2)/90,` ` (42 - 2)/90,` ` (91 - 2)/90,` ` (27 - 2)/90,` ` (92 - 2)/90,` ` (88 - 2)/90,` ` (2 - 2)/90]`

`X -> [0.867,` ` 0.444,` ` 0.989,` ` 0.278,` ` 1,` ` 0.956,` ` 0]`
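The worked example above can be sketched in Python (the function name `min_max_normalise` is just illustrative):

```python
def min_max_normalise(xs):
    """Rescale a list of numbers into the [0, 1] range.

    Assumes max(xs) > min(xs), otherwise the denominator is zero.
    """
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

X = [80, 42, 91, 27, 92, 88, 2]
print([round(x, 3) for x in min_max_normalise(X)])
# [0.867, 0.444, 0.989, 0.278, 1.0, 0.956, 0.0]
```

The minimum of the input always maps to 0 and the maximum to 1, matching the worked values above.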

If you don't know all your data ahead of time then future observations may end up with more extreme values that exceed the min or max from your original population.

This would mean you'd end up with values outside the [0, 1] range.
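A quick sketch of the problem, reusing the min and max from the example above with a hypothetical future observation:

```python
# min and max computed from the original population
lo, hi = 2, 92

# A hypothetical future observation more extreme than anything seen so far
new_value = 120
print((new_value - lo) / (hi - lo))  # ~1.311, outside the [0, 1] range
```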

Z-score normalisation is a feature scaling method where the values are converted to z-scores.

A Z-score is the number of standard deviations a value is from the mean.

For a vector `X` the rescaled vector `Z` is given by:

`Z = (X - mu) / sigma`

Where `mu` is the mean of `X` and `sigma` is its standard deviation (the example below uses the sample standard deviation).

`X = [80, 40, 91, 27, 92, 88, 2]`

`mu = 60, sigma = 36.565`

`X -> [(80 - 60)/36.565,` ` (40 - 60)/36.565,` ` (91 - 60)/36.565,` ` (27 - 60)/36.565,` ` (92 - 60)/36.565,` ` (88 - 60)/36.565,` ` (2 - 60)/36.565]`

`X -> [0.547,` ` -0.547,` ` 0.848,` ` -0.903,` ` 0.875,` ` 0.766,` ` -1.586]`
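The same calculation can be sketched in Python; `statistics.stdev` computes the sample standard deviation, which matches the `sigma` used above (the function name `z_score_normalise` is just illustrative):

```python
from statistics import mean, stdev

def z_score_normalise(xs):
    """Convert each value to its number of standard deviations from the mean."""
    mu, sigma = mean(xs), stdev(xs)  # stdev is the sample standard deviation
    return [(x - mu) / sigma for x in xs]

X = [80, 40, 91, 27, 92, 88, 2]
print([round(z, 3) for z in z_score_normalise(X)])
# [0.547, -0.547, 0.848, -0.903, 0.875, 0.766, -1.586]
```

A useful property to check: the z-scores of any vector always have mean 0.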

Since Z-scores have no lower or upper bound, we don't have to worry about future values falling outside any fixed range.

However, future values should come from a population with the same mean and standard deviation as the original one for their Z-scores to be meaningful.
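One way to apply this in practice is to store `mu` and `sigma` computed on the original data and reuse them for new observations; `new_value` here is a hypothetical future measurement:

```python
from statistics import mean, stdev

# Fit mu and sigma once, on the original population
X = [80, 40, 91, 27, 92, 88, 2]
mu, sigma = mean(X), stdev(X)

# Score a hypothetical future observation with the stored parameters
# rather than recomputing statistics on the new data.
new_value = 120
print(round((new_value - mu) / sigma, 3))  # 1.641
```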

© Will Robertson