https://piazza.com/class_profile/get_resource/ln18bjs43q41tr/ln18io6gfb743c
L1 (Manhattan) Norm
$$
\|x\|_1 = |x_1| + ... + |x_n|
$$
L2 (Euclidean) Norm
$$
\|x\|_2 = \sqrt{x_1^2 + ... + x_n^2}
$$
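The two definitions above can be sketched directly in NumPy (the example vector is arbitrary):

```python
import numpy as np

x = np.array([3.0, -4.0])

# L1 (Manhattan) norm: sum of absolute values
l1 = np.sum(np.abs(x))        # same as np.linalg.norm(x, ord=1)

# L2 (Euclidean) norm: square root of the sum of squares
l2 = np.sqrt(np.sum(x ** 2))  # same as np.linalg.norm(x, ord=2)

print(l1, l2)  # 7.0 5.0
```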
Computational difficulty: L1 > L2
- L2 regularization has a closed-form solution because the squared penalty is smooth and differentiable everywhere.
- L1 does not have a closed-form solution because the absolute value makes it a non-differentiable piecewise function. For this reason, L1 is computationally more expensive: we can't solve it in terms of matrix math alone, and must rely on iterative approximations (in the lasso case, coordinate descent).
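A minimal sketch of the L2 (ridge) closed form, with made-up data and an assumed regularization strength `lam`; no analogous formula exists for the L1 (lasso) penalty:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

lam = 1.0  # regularization strength (assumed value for illustration)

# Ridge: minimizing ||y - Xw||^2 + lam * ||w||^2 has the closed form
#   w = (X^T X + lam * I)^(-1) X^T y
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
print(w_ridge)

# The lasso (||w||_1 penalty) has no such formula; solvers like
# scikit-learn's Lasso use coordinate descent iterations instead.
```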
Robustness: L1 > L2
- Robustness is defined as resistance to outliers in a dataset. The more able a model is to ignore extreme values in the data, the more robust it is.
- The L1 norm is more robust than the L2 norm, for a fairly intuitive reason: the L2 norm squares errors, so the cost of an outlier grows quadratically, while the L1 norm only takes the absolute value, so an outlier contributes linearly and is penalized less heavily in the loss function.
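The linear-versus-quadratic penalty can be seen with a toy set of residuals (values chosen only for illustration):

```python
import numpy as np

residuals = np.array([0.5, -1.0, 0.75])
with_outlier = np.append(residuals, 10.0)  # one extreme residual

def l1_loss(r):
    return np.sum(np.abs(r))  # sum of absolute errors

def l2_loss(r):
    return np.sum(r ** 2)     # sum of squared errors

# The single outlier adds 10 to the L1 loss but 100 to the L2 loss:
print(l1_loss(with_outlier) - l1_loss(residuals))  # 10.0
print(l2_loss(with_outlier) - l2_loss(residuals))  # 100.0
```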
Stability: L2 > L1
- Stability is defined as resistance to small horizontal perturbations, i.e. small shifts in the input data. It is, in a sense, the counterpart of robustness along the other axis: L2 solutions vary smoothly as the data changes, while L1 solutions can jump discontinuously as features enter or leave the model.
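One common illustration of this difference is a pair of nearly identical features; lasso (L1) tends to keep one and zero out the other, a choice that can flip under tiny data perturbations, while ridge (L2) smoothly splits the weight between them. A sketch with assumed `alpha` values and made-up data:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
z = rng.normal(size=100)
# Two nearly identical (highly correlated) features
X = np.column_stack([z, z + 0.01 * rng.normal(size=100)])
y = z + 0.1 * rng.normal(size=100)

lasso = Lasso(alpha=0.1).fit(X, y)  # alpha chosen for illustration
ridge = Ridge(alpha=1.0).fit(X, y)

# Lasso concentrates the weight on one feature; ridge spreads it
# roughly evenly across both.
print(lasso.coef_)
print(ridge.coef_)
```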