I have used a 4-layer fully connected network to learn a complex classification boundary. I use tanh activations throughout, except for the last layer, where I use a sigmoid activation for binary classification. I train for 10K iterations on 100K examples (my data points are 3-dimensional, and I initialized all my weights to zero). I see that my network is unable to fit the training data, leading to a high training error. What is the first thing I should try?

1. Increase the number of training iterations
2. Make a more complex network – increase hidden layer size
3. Initialize weights to a random small value instead of zeros
4. Change tanh activations to relu

Ans: (3). I will initialize the weights to small random non-zero values, since setting all the weights to the same value (zero) means every hidden unit computes the same output and receives the same gradient update. The units never differentiate from one another, so the network cannot fit a complex boundary no matter how long it trains.
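The symmetry problem is easy to demonstrate numerically. Below is a minimal NumPy sketch, not the 4-layer network from the question: the layer sizes, data, and labels are hypothetical. With all-zero weights the hidden activations and the output-layer gradients are zero, so nothing moves; a small random initialization breaks the tie and the hidden units diverge.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                              # 100 toy examples, 3 features
y = (X[:, 0] * X[:, 1] > 0).astype(float).reshape(-1, 1)   # a non-linear label rule

def train(W1, W2, steps=200, lr=0.5):
    """Plain gradient descent on binary cross-entropy; returns the trained W1."""
    for _ in range(steps):
        h = np.tanh(X @ W1)                     # hidden layer, tanh activation
        p = 1.0 / (1.0 + np.exp(-(h @ W2)))     # sigmoid output
        dlogits = (p - y) / len(X)              # dBCE/dlogits for a sigmoid output
        dW2 = h.T @ dlogits
        dh = (dlogits @ W2.T) * (1.0 - h ** 2)  # tanh' = 1 - tanh^2
        dW1 = X.T @ dh
        W1 -= lr * dW1
        W2 -= lr * dW2
    return W1

W1_zero = train(np.zeros((3, 4)), np.zeros((4, 1)))
W1_rand = train(rng.normal(scale=0.1, size=(3, 4)),
                rng.normal(scale=0.1, size=(4, 1)))

# Zero init: every hidden column is identical (here, stuck at exactly zero).
print(np.allclose(W1_zero, W1_zero[:, [0]]))    # True  -> units never differentiated
print(np.allclose(W1_rand, W1_rand[:, [0]]))    # False -> random init broke symmetry
```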

I have designed a 2-layer deep neural network for a classifier, with 2 units in the hidden layer. I use linear activation functions, with a sigmoid at the final layer. Using a data visualization tool, I see that the true decision boundary of the data is shaped like a sine curve. I have tried to train on 200 data points with known class labels, and the training error is too high. What do I do?

a. Increase number of units in the hidden layer
b. Increase number of hidden layers
c. Increase data set size
d. Change activation function to tanh
e. Try all of the above

The answer is (d). With linear activation functions, the deep neural network realizes a linear combination of linear functions, which is itself a linear function, so it can model only a linear decision boundary. No amount of extra width, depth, or data will let a linear model fit a sine-shaped boundary; replacing the linear activations with a non-linearity such as tanh is what fixes it.
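The collapse to a single linear map can be verified directly. Here is a minimal NumPy sketch (the layer sizes are hypothetical): composing two linear layers gives exactly the same outputs as one matrix `W1 @ W2`, so the decision boundary stays flat, while inserting a tanh between the layers breaks the equivalence.

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(size=(3, 2))   # input -> 2 hidden units, linear activation
W2 = rng.normal(size=(2, 1))   # hidden -> output logit
X = rng.normal(size=(5, 3))    # a few toy inputs

two_layer = X @ W1 @ W2        # the "deep" all-linear network
one_layer = X @ (W1 @ W2)      # a single equivalent linear layer
print(np.allclose(two_layer, one_layer))   # True: the extra layer added nothing

# With a tanh between the layers the equivalence breaks, and the
# network can represent non-linear boundaries such as a sine shape.
nonlinear = np.tanh(X @ W1) @ W2
print(np.allclose(nonlinear, one_layer))   # False
```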