Assignment 1
    
- What is the kernel trick?
Since 𝐿^2-regularised linear regression can be rewritten in a form where the non-linear transformations 𝝓(x) only appear through inner products, we do not have to design a 𝑑-dimensional vector 𝝓(x) and derive its inner product explicitly. Instead, we can choose a kernel 𝜅(x, x') directly, where the kernel is the inner product of two non-linearly transformed inputs:
𝜅(x, x') = 𝝓(x)^T 𝝓(x').
This is known as the kernel trick: if x enters the model only via 𝝓(x)^T 𝝓(x'), we can choose a kernel 𝜅(x, x') instead of choosing 𝝓(x). p. 194 (See the kernel ridge regression sketch below this list.)
    
- In the literature, it is common to see a formulation of SVMs that makes use of a hyperparameter. What is the purpose of this hyperparameter?
The purpose is to regularise: the hyperparameter controls the trade-off between a large margin and how strongly margin violations on the training data are penalised, i.e. how much the classifier is allowed to adapt to the training data. p. 211 (See the cost-parameter sketch below this list.)
    
    
- In neural networks, what do we mean by mini-batch and epoch?
A small subsample of the training data is called a mini-batch, which typically contains 𝑛_𝑏 = 10, 𝑛_𝑏 = 100, or 𝑛_𝑏 = 1 000 data points. One complete pass through the training data is called an epoch, which consequently consists of 𝑛/𝑛_𝑏 iterations. p. 125 (See the mini-batch loop sketch below this list.)
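A minimal sketch of the kernel trick in 𝐿^2-regularised regression (kernel ridge regression), assuming toy 1-D data, a Gaussian (RBF) kernel with a hand-picked length scale, and an arbitrary regularisation strength lambda; note that predictions only ever use 𝜅(x, x'), never 𝝓(x) itself:

```r
# Kernel ridge regression on toy data: phi(x) is never constructed.
set.seed(1)
x <- seq(0, 5, by = 0.1)
y <- sin(2 * x) + rnorm(length(x), sd = 0.1)

# Gaussian (RBF) kernel kappa(x, x') = exp(-(x - x')^2 / (2 * ell^2))
kappa <- function(x, xp, ell = 0.5) exp(-outer(x, xp, "-")^2 / (2 * ell^2))

lambda <- 0.01
K      <- kappa(x, x)                              # n x n Gram matrix of inner products
alpha  <- solve(K + lambda * diag(length(x)), y)   # dual weights

x_star <- seq(0, 5, by = 0.01)
y_hat  <- kappa(x_star, x) %*% alpha               # predictions need only kappa(x*, x)
```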
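A small illustration of what the SVM hyperparameter does in practice, assuming the e1071 package (where it appears as the cost argument) and the built-in iris data; this is only a sketch, not part of the assignment:

```r
# Small cost = strong regularisation (wide margin, violations tolerated);
# large cost = weak regularisation (the fit adapts more to the training data).
library(e1071)

iris2 <- iris[iris$Species != "setosa", ]          # two-class subset
iris2$Species <- droplevels(iris2$Species)

fit_soft <- svm(Species ~ Sepal.Length + Sepal.Width, data = iris2,
                kernel = "radial", cost = 0.1)
fit_hard <- svm(Species ~ Sepal.Length + Sepal.Width, data = iris2,
                kernel = "radial", cost = 100)

# Training error usually drops with large cost, at the risk of overfitting.
mean(predict(fit_soft, iris2) != iris2$Species)
mean(predict(fit_hard, iris2) != iris2$Species)
```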
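A minimal sketch of one epoch of mini-batch iterations, assuming 𝑛 = 1 000 training points and 𝑛_𝑏 = 100; the parameter update itself is only indicated as a comment:

```r
n   <- 1000                    # number of training data points
n_b <- 100                     # mini-batch size
idx <- sample(n)               # shuffle the data once per epoch

for (i in seq_len(n / n_b)) {  # n / n_b iterations make up one epoch
  batch <- idx[((i - 1) * n_b + 1):(i * n_b)]
  # theta <- theta - gamma * grad_loss(theta, data[batch, ])   # one (hypothetical) SGD step
}
```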
    
    
    
    
Assignment 4
    
4.1
Results look good. The red curve is almost the same as the blue one, so 10 hidden units seem to be quite sufficient. Some points between 5 and 7 are off.
    
4.2
h1: The learned NN gives very bad predictions on the test data.

h2: The ReLU does not get a defined derivative when it is written as max(0, x), so ifelse(x > 0, x, 0) is used instead (see the sketch below). The predictions are quite good for Var < 4 but off after that.

h3: Good predictions for all Var, but not as good as with the sigmoid activation function.
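A minimal sketch of the ReLU used for h2, written with ifelse() as described above; pmax() is the vectorised form of max(0, x) and gives the same values element-wise, it was just the form that did not get a defined derivative:

```r
h2        <- function(x) ifelse(x > 0, x, 0)   # form used as the activation function
relu_pmax <- function(x) pmax(0, x)            # vectorised max(0, x), same output values

x_test <- seq(-2, 2, by = 0.5)
all.equal(h2(x_test), relu_pmax(x_test))       # TRUE: the two definitions agree
```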