Confusion matrix and misclassification errors are computed.
Misclassification errors:
$E_{\text{mis,train}} = 0.04500262$
$E_{\text{mis,test}} = 0.05329154$
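As a minimal sketch of how these numbers can be obtained (the names `y_train`, `pred_train`, etc. are hypothetical, assuming a classifier has already been fitted):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns the predicted class.
    Assumes integer labels 0..n_classes-1."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def misclassification_error(y_true, y_pred):
    """Fraction of predictions that differ from the true labels;
    equivalently 1 - trace(cm) / cm.sum() for the matrix above."""
    return np.mean(np.asarray(y_true) != np.asarray(y_pred))

# Hypothetical usage, given fitted predictions:
# E_mis_train = misclassification_error(y_train, pred_train)
# E_mis_test  = misclassification_error(y_test, pred_test)
```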
3. Comment: The cases the model found easy were also easy to recognize visually, while the cases it found hard were difficult to identify even by eye.
4. Model complexity is highest when $k$ is lowest and decreases as $k$ increases (as seen in the graph, where the training error grows with increasing $k$). The optimal $k$ is where the validation error reaches its minimum, at $k = 3$.

Test error ($k = 3$): $0.02403344$. This is higher than the training error but slightly lower than the validation error. In our view this is a fairly good model, considering that it classifies correctly $\approx 98\%$ of the time (the selection procedure is sketched below).
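The lab's own code is not reproduced here; the sketch below uses scikit-learn's `KNeighborsClassifier` as a stand-in, with hypothetical splits `X_train`/`y_train`, `X_valid`/`y_valid`, `X_test`/`y_test`:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def knn_error_curve(X_train, y_train, X_valid, y_valid, k_values):
    """Misclassification error on training and validation data for each k."""
    curve = []
    for k in k_values:
        model = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
        e_train = np.mean(model.predict(X_train) != y_train)
        e_valid = np.mean(model.predict(X_valid) != y_valid)
        curve.append((k, e_train, e_valid))
    return curve

# Pick the k that minimizes the validation error, then evaluate once on test:
# curve = knn_error_curve(X_train, y_train, X_valid, y_valid, range(1, 31))
# best_k = min(curve, key=lambda row: row[2])[0]
# final = KNeighborsClassifier(n_neighbors=best_k).fit(X_train, y_train)
# test_error = np.mean(final.predict(X_test) != y_test)
```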
5. Optimal $k = 6$, where the average cross-entropy loss is lowest. Average cross-entropy loss takes the predicted probabilities into account, which is a better fit for a model with a multinomial distribution. An important aspect is that we can determine how wrong a classification is, not just whether it is wrong or not.
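A sketch of this criterion, assuming the model can output class probabilities (e.g. `predict_proba` in scikit-learn) and that `y_true` holds integer class indices aligned with the probability columns:

```python
import numpy as np

def average_cross_entropy(y_true, proba, eps=1e-15):
    """Average cross-entropy loss: -mean(log p_hat(true class)).
    proba has shape (n_samples, n_classes); probabilities are clipped
    so that classes assigned zero probability do not produce log(0)."""
    p_true = proba[np.arange(len(y_true)), y_true]
    return -np.mean(np.log(np.clip(p_true, eps, 1.0)))

# Hypothetical usage: choose k by the lowest validation loss.
# loss = average_cross_entropy(y_valid, model.predict_proba(X_valid))
```

Because confident wrong predictions are penalized more than uncertain ones, this loss can prefer a different $k$ (here $6$) than the 0/1 misclassification error (here $3$).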
Test MSE: $\lambda = 1$: $0.9347$; $\lambda = 100$: $0.9341$; $\lambda = 1000$: $0.9678$.
2. In the estimation summary shown below we can see our features ordered by significance. Here DFA is the most significant.
Estimation summary:

| Coefficient | Estimate | Std. Error | t value | Pr(>\|t\|) |
|-------------|----------|------------|---------|------------|
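The summary itself is not reproduced above. For reference, an equivalent table (estimate, standard error, t value, p-value per coefficient) can be produced along these lines, with `X` and `y` standing in for the lab's feature matrix and target:

```python
import statsmodels.api as sm

# Hypothetical feature matrix X (containing DFA among other features) and target y.
# The fitted model's summary lists, per coefficient: estimate, std. error,
# t value and Pr(>|t|), i.e. the columns of the table above.
ols = sm.OLS(y, sm.add_constant(X)).fit()
print(ols.summary())
```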
$\lambda = 100$ seems to be the most suitable penalty parameter, considering we are able to drop $df(1) - df(100) \approx 4$ degrees of freedom without any significant change in $\text{MSE}_{\text{test}}$.
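A sketch of how the test MSE and the effective degrees of freedom $df(\lambda) = \mathrm{tr}\!\left(X (X^\top X + \lambda I)^{-1} X^\top\right)$ can be computed for each penalty, assuming centered/standardized data so that no intercept is penalized (variable names are hypothetical):

```python
import numpy as np

def ridge_weights(X, y, lam):
    """Closed-form ridge solution: w = (X^T X + lam * I)^{-1} X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def effective_df(X, lam):
    """Effective degrees of freedom: trace of the ridge hat matrix."""
    p = X.shape[1]
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    return np.trace(H)

# Hypothetical usage, mirroring the comparison above:
# for lam in (1, 100, 1000):
#     w = ridge_weights(X_train, y_train, lam)
#     mse_test = np.mean((X_test @ w - y_test) ** 2)
#     print(lam, mse_test, effective_df(X_train, lam))
```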