3. Comment: The cases the model found easy were also easy to identify visually, while the cases it found hard were genuinely difficult to recognize even by eye.
4. The model complexity is highest when k is lowest and decreases as k increases (seen in the graph as the training error rising with increasing k). The optimal k is where the validation error reaches its minimum, i.e. k = 3.

Test error (k = 3): 0.02403344. This is higher than the training error but slightly lower than the validation error. In our view it is a good model, considering that it classifies correctly about 98% of the time; the selection procedure is sketched below.
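A minimal sketch of this selection procedure, written in Python with scikit-learn and its bundled digits dataset as a stand-in for the lab data; the split proportions, the range of k, and all variable names are assumptions, not the lab's exact setup.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)
# Assumed proportions: 50% train, 25% validation, 25% test.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, train_size=0.5, random_state=1)
X_valid, X_test, y_valid, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=1)

def misclass_error(model, X, y):
    """Fraction of examples the fitted model labels incorrectly."""
    return np.mean(model.predict(X) != y)

ks = range(1, 31)
train_err, valid_err = [], []
for k in ks:
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    train_err.append(misclass_error(knn, X_train, y_train))  # rises with k
    valid_err.append(misclass_error(knn, X_valid, y_valid))

best_k = list(ks)[int(np.argmin(valid_err))]  # k = 3 on the lab's data
final = KNeighborsClassifier(n_neighbors=best_k).fit(X_train, y_train)
print("test error:", misclass_error(final, X_test, y_test))  # 0.0240 in the lab
```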
5. The optimal k = 6, where the average cross-entropy loss is lowest. The average cross-entropy loss takes the predicted probabilities into account, which is a better criterion for a model with a multinomial distribution. An important aspect is that we can determine how wrong a classification is, not just whether it is wrong or not.
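A sketch of selecting k by average cross-entropy instead, reusing the split and imports from the sketch under point 4; the 1e-15 clamp is an assumed safeguard against log(0) when the true class gets zero probability among the k neighbours.

```python
def cross_entropy(model, X, y):
    """Average negative log-probability assigned to the true class."""
    p = model.predict_proba(X)            # shape (n_samples, n_classes)
    p_true = p[np.arange(len(y)), y]      # probability given to the true label
    return -np.mean(np.log(np.clip(p_true, 1e-15, None)))

valid_ce = []
for k in ks:
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    valid_ce.append(cross_entropy(knn, X_valid, y_valid))

best_k_ce = list(ks)[int(np.argmin(valid_ce))]  # k = 6 on the lab's data
```

Unlike the 0/1 misclassification error, this criterion penalizes confident wrong predictions more than uncertain ones, which is exactly the "how wrong" aspect noted above.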