Distance Metrics
Core requirements
A valid distance metric must fulfill four core requirements:
- The distance from a point to itself is 0: $$ d(X, X) = 0 $$
- Distances are symmetric: $$ d(X, Y)=d(Y, X) $$
- The triangle inequality holds: $$ d(X, Z) \leq d(X, Y)+ d(Y, Z) $$
- The distance between any two distinct points is positive[1]: $$ d(X, Y) > 0 ~~ \forall X, Y \text{ where } X \neq Y $$
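The four requirements can be spot-checked numerically on a finite sample of points. The sketch below is only illustrative: the helper name `check_metric_axioms`, the tolerance, and the two candidate functions are assumptions made for this example, and passing the check on a sample does not prove a function is a metric.

```python
from itertools import product

def check_metric_axioms(d, points, tol=1e-12):
    """Test the four requirements on the given sample points only."""
    results = {"identity": True, "symmetry": True,
               "triangle": True, "positivity": True}
    for x in points:
        # d(X, X) = 0
        if abs(d(x, x)) > tol:
            results["identity"] = False
    for x, y in product(points, repeat=2):
        # d(X, Y) = d(Y, X)
        if abs(d(x, y) - d(y, x)) > tol:
            results["symmetry"] = False
        # d(X, Y) > 0 whenever X != Y
        if x != y and d(x, y) <= 0:
            results["positivity"] = False
    for x, y, z in product(points, repeat=3):
        # d(X, Z) <= d(X, Y) + d(Y, Z)
        if d(x, z) > d(x, y) + d(y, z) + tol:
            results["triangle"] = False
    return results

def abs_diff(x, y):
    """Absolute difference on the real line (a metric)."""
    return abs(x - y)

def squared_diff(x, y):
    """Squared difference (fails the triangle inequality)."""
    return (x - y) ** 2

sample = [0.0, 1.0, 2.0, 5.0]
print(check_metric_axioms(abs_diff, sample))
print(check_metric_axioms(squared_diff, sample))
```

On this sample the squared difference fails only the triangle check, for instance because going from 0 to 2 directly costs 4 while going through 1 costs 1 + 1 = 2.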
Satisfiability
Only some Minkowski distances satisfy these constraints.
The Minkowski distance is: $$ d(X, Y) = \left(\sum_{i=1}^{n}{|x_i-y_i|^{p}}\right)^{\frac{1}{p}}$$ where \( X = (x_1, x_2, \ldots , x_{n}) \text{ and } Y = (y_1, y_2, \ldots , y_{n}) \in \mathbb{R}^{n}\).[2]
The only values of \(p\) that satisfy the requirements are:
$$p \geq 1 $$
For example,
- \(p = 1\) is Manhattan distance (L1 norm)
- \(p = 2\) is Euclidean distance (L2 norm)
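Both named cases follow directly from the Minkowski formula. A minimal Python sketch, where the function name `minkowski` and the sample points are assumptions chosen for illustration:

```python
def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    # Sum |x_i - y_i|^p over all coordinates, then take the p-th root.
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

X = (0.0, 0.0)
Y = (3.0, 4.0)

manhattan = minkowski(X, Y, 1)  # |3| + |4| = 7.0
euclidean = minkowski(X, Y, 2)  # sqrt(9 + 16) = 5.0
print(manhattan, euclidean)
```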
For any \(0 < p < 1\) (in two or more dimensions), the formula violates the triangle inequality because its unit ball is no longer convex, and it therefore isn’t considered a valid distance metric.
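A concrete counterexample makes the failure for \(p < 1\) visible. The sketch below is an assumption-laden illustration: the helper `minkowski` and the three points are picked for this example, with \(Y\) acting as a detour between \(X\) and \(Z\).

```python
def minkowski(x, y, p):
    """Minkowski formula of order p (a metric only when p >= 1)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

X, Y, Z = (0.0, 0.0), (1.0, 0.0), (1.0, 1.0)

for p in (0.5, 1, 2):
    direct = minkowski(X, Z, p)                       # d(X, Z)
    detour = minkowski(X, Y, p) + minkowski(Y, Z, p)  # d(X, Y) + d(Y, Z)
    # The triangle inequality requires direct <= detour.
    print(p, direct, detour, direct <= detour)
```

With \(p = 0.5\) the direct distance is \((1 + 1)^{2} = 4\) while the detour costs \(1 + 1 = 2\), so the inequality fails; for \(p = 1\) and \(p = 2\) it holds on these points.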
[1] If the positivity requirement is ever weakened to a non-strict inequality, \(d(X, Y) \geq 0\), then \(d(X, Y)\) could be 0 even if \(X \neq Y\). If this happens, then \(d\) becomes a pseudo-metric.