## Assignment 4
- _What are the practical approaches for reducing the expected new data error, according to the book?_
In practice the aim is to achieve a small error in production, i.e. a small $E_{\text{new}}$. Of course, $E_{\text{new}}$ cannot be computed in practice, but the decomposition $E_{\text{new}} = E_{\text{train}} + \text{generalisation gap}$ allows two conclusions to be drawn. First, $E_{\text{new}}$ will on average be larger than $E_{\text{train}}$, so if $E_{\text{train}}$ is already larger than the required $E_{\text{new}}$, the problem needs to be reconsidered. Second, the generalisation gap, and thereby $E_{\text{new}}$, decreases as the size of the training data increases, so more training data can help a lot in reducing $E_{\text{new}}$. In addition, model complexity can be evaluated using $E_{\text{hold-out}}$: if $E_{\text{hold-out}} \approx E_{\text{train}}$, underfitting is likely and it might be beneficial to increase the complexity, whereas if $E_{\text{train}}$ is close to zero while $E_{\text{hold-out}}$ is not, overfitting is likely and it might be beneficial to decrease the complexity instead. [p. 76-77]
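As a concrete illustration of the hold-out comparison, here is a minimal sketch in Python (assuming NumPy and scikit-learn, a made-up toy dataset, and a decision tree whose depth stands in for model complexity; none of these choices are prescribed by the book):

```python
# Sketch: compare E_train with E_hold-out for models of increasing complexity.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                 # toy inputs (assumed data)
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # toy labels

# Set aside hold-out data that is never used for training.
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.3, random_state=0
)

for depth in (1, 3, 10, None):  # increasing model complexity
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    model.fit(X_train, y_train)
    e_train = 1 - model.score(X_train, y_train)  # E_train
    e_hold = 1 - model.score(X_hold, y_hold)     # E_hold-out, estimate of E_new
    print(f"depth={depth}: E_train={e_train:.3f}, E_hold-out={e_hold:.3f}")
    # E_train close to E_hold-out, both large -> likely underfitting
    # E_train close to zero, E_hold-out not   -> likely overfitting
```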
- _What important aspect should be considered when selecting minibatches, according to the book?_
When using minibatches it is vital to ensure that the different batches are balanced and representative of the whole dataset. Consider, for example, a training dataset with different output classes that is sorted by those classes: the first minibatch might then contain only a single output class and thus not give a good representation of the whole dataset. It is therefore important that the batches are formed randomly, for example by randomly shuffling the training data and then dividing it into minibatches in an ordered manner. [p. 125]
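A minimal sketch of this shuffle-then-slice procedure (NumPy only, with made-up, class-sorted data; variable names are illustrative):

```python
# Sketch: form minibatches by shuffling first, then slicing in order,
# so each batch is a random, representative sample of the training data.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = np.repeat([0, 1], 50)        # sorted labels: naive slicing would give one-class batches

batch_size = 20
perm = rng.permutation(len(X))   # random shuffle of all indices
X_shuf, y_shuf = X[perm], y[perm]

minibatches = [
    (X_shuf[i:i + batch_size], y_shuf[i:i + batch_size])
    for i in range(0, len(X_shuf), batch_size)
]
for Xb, yb in minibatches:
    print("batch class counts:", np.bincount(yb))  # roughly balanced after shuffling
```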
- _Provide an example of modifications in a loss function and in data that can be done to take into account the data imbalance, according to the book._
An example of a modification of a loss function to take imbalance into account is to reflect that failing to predict $y = 1$ correctly is $C$ times more costly than failing to predict $y = -1$ (in binary classification). The misclassification loss can be modified to:
$$
L(y,\hat{y}) =
\begin{cases}
0 & \text{if } \hat{y} = y, \\
1 & \text{if } \hat{y} \neq y \text{ and } y = -1, \\
C & \text{if } \hat{y} \neq y \text{ and } y = 1.
\end{cases}
$$
Other loss functions can be modified in a similar way. A similar effect can be achieved on the data side instead, e.g. by duplicating all positive training data points $C$ times in the training data rather than modifying the loss function. [p. 101-102]
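A minimal sketch of both variants (NumPy only, with made-up labels and an arbitrary choice of $C$; not taken from the book):

```python
# Sketch: asymmetric misclassification loss, and the data-side alternative
# of duplicating the positive (y = 1) training points.
import numpy as np

C = 5  # cost of misclassifying a positive relative to a negative (assumed value)

def misclassification_loss(y, y_hat, C=C):
    """0 if correct, 1 if a negative is misclassified, C if a positive is."""
    if y_hat == y:
        return 0
    return C if y == 1 else 1

y_true = np.array([1, 1, -1, -1, -1, -1])
y_pred = np.array([-1, 1, -1, 1, -1, -1])
print(sum(misclassification_loss(y, yh) for y, yh in zip(y_true, y_pred)))  # 5 + 1 = 6

# Data-side alternative: duplicate the positives so each appears C times in total.
X = np.arange(6).reshape(-1, 1)   # toy inputs, one per label above
pos = y_true == 1
X_bal = np.concatenate([X, np.repeat(X[pos], C - 1, axis=0)])
y_bal = np.concatenate([y_true, np.repeat(y_true[pos], C - 1)])
```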