diff --git a/lab3/lab-notes.md b/lab3/lab-notes.md
index 299a716e0f22b2f766c05bd4894efed2166824b7..f521d14aea8202743d92a46b7ba2a5d32a39cb4e 100644
--- a/lab3/lab-notes.md
+++ b/lab3/lab-notes.md
@@ -8,7 +8,15 @@
 
 This is known as the kernel trick: if x enters the model as φ(x)^T φ(x') only, we can choose a kernel κ(x, x') instead of choosing φ(x). p. 194
 
 - In the literature, it is common to see a formulation of SVMs that makes use of a hyperparameter. What is the purpose of this hyperparameter?
-The purpose is to regularize. p. 211
+The hyperparameter C sets the strength of regularization; it appears as the bound on the dual variables in the dual formulation of SVMs:
+\[
+\alpha = \arg \min_\alpha \left( \frac{1}{2} \alpha^T K(X, X) \alpha - \alpha^T y \right)
+\]
+\[
+\text{subject to } \lvert \alpha_i \rvert \leq \frac{1}{2n\lambda} \quad \text{and} \quad 0 \leq \alpha_i y_i
+\]
+with \[\hat{y}(x^\star) = \operatorname{sign} \left( b + \alpha^T K(X, x^\star) \right).\]
+Here \[C = \frac{1}{2n\lambda}.\] p. 211
 
 - In neural networks, what do we mean by mini-batch and epoch?
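+The C in the SVM dual can also be seen via the standard soft-margin primal (textbook material, not specific to p. 211), which is equivalent to the regularized formulation above:
+\[
+\min_{w, b, \xi} \; \frac{1}{2} \lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i
+\]
+\[
+\text{subject to } y_i \left( w^T \phi(x_i) + b \right) \geq 1 - \xi_i \quad \text{and} \quad \xi_i \geq 0.
+\]
+Scaling the regularized hinge-loss objective \(\frac{1}{n} \sum_i \xi_i + \lambda \lVert w \rVert^2\) by \(\frac{1}{2\lambda}\) gives exactly this form with \(C = \frac{1}{2n\lambda}\): a large C penalizes margin violations heavily (weak regularization), while a small C tolerates more violations (strong regularization).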