https://piazza.com/class_profile/get_resource/ln18bjs43q41tr/ln18n411vo070m
Recall Kernel SVM
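for each pair of points, compute the similarity k(x_i, x_j) and store it in a matrix
each data point votes with its own label y_i, scaled by its weight a_i and its similarity to the query x: predict sign(sum_i a_i y_i k(x_i, x) + b)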
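A minimal sketch of the weighted vote, to make the recall concrete. The Gaussian `similarity` function, its bandwidth `s`, the toy data, and the weights `a` are all assumptions for illustration; the weights are hand-picked, not the result of actually training an SVM.

```python
import math

def similarity(x, z, s=1.0):
    # Assumed Gaussian similarity; any valid kernel could be swapped in
    return math.exp(-sum((p - q) ** 2 for p, q in zip(x, z)) / s ** 2)

def predict(x, X_train, y_train, a, b=0.0):
    """Sign of the weighted vote: sum_i a_i * y_i * k(x_i, x) + b."""
    score = sum(a_i * y_i * similarity(x_i, x)
                for a_i, y_i, x_i in zip(a, y_train, X_train))
    return 1 if score + b >= 0 else -1

# Toy data; the weights are hypothetical, not the output of SVM training
X_train = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0), (5.0, 5.0)]
y_train = [-1, -1, 1, 1]
a = [0.5, 0.5, 0.5, 0.5]

near_negatives = predict((0.5, 0.5), X_train, y_train, a)
near_positives = predict((4.5, 4.5), X_train, y_train, a)
print(near_negatives, near_positives)
```

Points close to the query dominate the vote because their similarity is largest, so each query takes the label of its neighborhood.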
Mercer’s Condition
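for any m points, the m x m similarity matrix K with K_ij = k(x_i, x_j) must be symmetric and positive semidefinite (PSD), i.e. c^T K c >= 0 for every vector c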
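A rough sketch of what the PSD requirement rules out. It evaluates the quadratic form c^T K c for one vector c: a negative value proves the matrix is not PSD, while a nonnegative value for a single c is only consistent with PSD, not a proof. The helper names, the toy points, and the `neg_distance` counterexample are assumptions for illustration.

```python
import math

def gram_matrix(points, kernel):
    """m x m matrix K with K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]

def quadratic_form(K, c):
    """Compute c^T K c; Mercer's condition requires this to be >= 0 for every c."""
    m = len(K)
    return sum(c[i] * K[i][j] * c[j] for i in range(m) for j in range(m))

def rbf(p, q, s=1.0):
    # Gaussian similarity; s is an assumed bandwidth parameter
    return math.exp(-sum((u - v) ** 2 for u, v in zip(p, q)) / s ** 2)

def neg_distance(p, q):
    # A plausible-looking similarity that is NOT a valid kernel
    return -math.dist(p, q)

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
c = [1.0, 1.0, 1.0]

q_rbf = quadratic_form(gram_matrix(points, rbf), c)
q_neg = quadratic_form(gram_matrix(points, neg_distance), c)
print(q_rbf >= 0)  # consistent with PSD (a single c cannot prove it)
print(q_neg < 0)   # a negative value proves the matrix is not PSD
```

A full check would look at all eigenvalues of K; the single quadratic form is used here only to keep the sketch dependency-free.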
RBF Kernel
$$ \phi : \mathbb{R}^d \to \mathbb{R}^{\infty} \quad \text{s.t.} \quad k(x, z) = \phi(x) \cdot \phi(z) $$
phi is never computed explicitly; it only matters that such a dot product exists
flexible enough to fit any decision boundary
little data: just predict the most likely label
small s → only the closest point to x counts; the contributions of all other points go to 0
more data → decrease s
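A small sketch of the effect of s. The notes do not write out the RBF formula, so the parameterization k(x, z) = exp(-||x - z||^2 / s^2) is an assumption, as are the toy points and the helper `similarity_shares`, which reports what fraction of the total similarity to x each training point holds.

```python
import math

def rbf(x, z, s):
    # Assumed parameterization: k(x, z) = exp(-||x - z||^2 / s^2)
    return math.exp(-sum((u - v) ** 2 for u, v in zip(x, z)) / s ** 2)

def similarity_shares(x, points, s):
    """Fraction of the total similarity to x that each training point holds."""
    sims = [rbf(x, p, s) for p in points]
    total = sum(sims)
    return [v / total for v in sims]

points = [(0.0,), (1.0,), (3.0,)]  # 1-D toy training points
x = (0.4,)                         # query; its closest point is 0.0

for s in (10.0, 1.0, 0.2):
    shares = similarity_shares(x, points, s)
    print(s, [round(v, 3) for v in shares])
```

With large s all points get nearly equal say, which behaves like predicting the overall most likely label; as s shrinks, the closest point takes almost the entire share, which is only safe when there is enough data nearby.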