The results look good: the red curve is almost the same as the blue one, so 10 hidden units seem to be quite sufficient. A few points are off for Var between 5 and 7.
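As a rough sketch of how such a network could be fitted (assuming the neuralnet package, a sine-shaped target, and illustrative names such as mydata, tr, te and winit, none of which are stated above):

```r
library(neuralnet)

set.seed(1234567890)

# Illustrative data (an assumption): Var uniform on [0, 10], response sin(Var)
Var <- runif(500, 0, 10)
mydata <- data.frame(Var, Sin = sin(Var))
tr <- mydata[1:25, ]    # small training set
te <- mydata[26:500, ]  # test set

# Random starting weights; a 1-10-1 network has (1+1)*10 + (10+1) = 31 weights
winit <- runif(31, -1, 1)

# One hidden layer with 10 units, default sigmoid (logistic) activation
nn <- neuralnet(Sin ~ Var, data = tr, hidden = 10, startweights = winit)

# Blue: training points, red: test predictions (the colour coding is assumed)
plot(tr, cex = 2, col = "blue")
points(te, cex = 1)
points(te$Var, predict(nn, te), col = "red", cex = 1)
```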
4.2
h1: The learned NN gives very bad predictions on the test data.
h2: The derivative of the ReLU is not defined when it is written as max(0,x), so ifelse(x>0,x,0) is used instead. The predictions are quite good for Var < 4 but are off after that.
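A minimal sketch of supplying the ReLU as a custom activation (assuming the same neuralnet setup and the illustrative tr, te and winit from the sketch above):

```r
# ReLU written with ifelse() so that neuralnet can differentiate it;
# act.fct = max(0, x) would make the internal derivative computation fail
h2 <- function(x) ifelse(x > 0, x, 0)

nn_relu <- neuralnet(Sin ~ Var, data = tr, hidden = 10,
                     startweights = winit, act.fct = h2)

# Test predictions: reasonable for Var < 4, then increasingly off
plot(te$Var, predict(nn_relu, te), col = "red", cex = 1)
```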
h3: Good predictions for all Var, but not as good as with the sigmoid activation function.