## NPTEL Deep Learning – IIT Ropar Week 4 Assignment Answers 2024

1. We have the following functions: x^{3}, *ln*(x), e^{x}, x, and 4. Which of these functions has the steepest slope at x = 1?

- x^{3}
- *ln*(x)
- e^{x}
- 4

Answer :-
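The slopes can be checked directly from the derivatives, evaluated at x = 1 (a quick numerical sketch, not part of the original quiz):

```python
import math

# Derivative of each candidate function, evaluated at x = 1:
# d/dx x^3   = 3x^2 -> 3
# d/dx ln(x) = 1/x  -> 1
# d/dx e^x   = e^x  -> e ≈ 2.718
# d/dx x     = 1    -> 1
# d/dx 4     = 0    -> 0
slopes = {
    "x^3": 3 * 1 ** 2,
    "ln(x)": 1 / 1,
    "e^x": math.e,
    "x": 1,
    "4": 0,
}
steepest = max(slopes, key=slopes.get)
print(steepest, slopes[steepest])  # x^3 wins: slope 3 > e ≈ 2.718
```

Note that e^{x} comes close (slope e ≈ 2.718), but 3 is larger, so x^{3} is steepest at x = 1.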

2. Which of the following represents the contour plot of the function f(x, y) = x^{2} − y^{2}?

Answer :-

3. Choose the correct options for the given gradient descent update rule ω_{t+1} = ω_{t} − η∇ω_{t} (η is the learning rate)

- The weight update is tiny at a gentle loss surface
- The weight update is tiny at a steep loss surface
- The weight update is large at a steep loss surface
- The weight update is large at a gentle loss surface

Answer :-
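The key observation is that the step size η·|∇ω| scales with the gradient magnitude. A minimal sketch on a hypothetical loss L(ω) = ω², chosen purely for illustration:

```python
# Hypothetical loss L(w) = w^2, so dL/dw = 2w.
# The update magnitude eta * |grad| grows with the slope of the loss surface.
eta = 0.1

def update_magnitude(w):
    grad = 2 * w              # gradient of L(w) = w^2
    return eta * abs(grad)    # size of the gradient descent step

steep = update_magnitude(5.0)    # steep region: |grad| = 10
gentle = update_magnitude(0.1)   # gentle region: |grad| = 0.2
print(steep, gentle)  # steep step (1.0) dwarfs the gentle step (~0.02)
```

So the update is large where the surface is steep and tiny where it is gentle.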

4. Which of the following algorithms will result in more oscillations of the parameter during the training process of the neural network?

- Stochastic gradient descent
- Mini-batch gradient descent
- Batch gradient descent
- Batch NAG

Answer :-

5. Which of the following are among the disadvantages of Adagrad?

- It doesn’t work well for sparse matrices.
- It usually goes past the minima.
- It gets stuck before reaching the minima.
- Weight updates are very small at the initial stages of the algorithm.

Answer :-

6. Which of the following is a variant of gradient descent that uses an estimate of the next gradient to update the current position of the parameters?

- Momentum optimization
- Stochastic gradient descent
- Nesterov accelerated gradient descent
- Adagrad

Answer :-
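The defining feature of Nesterov accelerated gradient is that it evaluates the gradient at a look-ahead point, i.e. an estimate of where momentum will carry the parameters next. A small illustrative sketch on a hypothetical loss L(ω) = ω² (the loss and hyperparameters are assumptions for the demo):

```python
# Nesterov accelerated gradient on the toy loss L(w) = w^2.
def grad(w):
    return 2 * w  # dL/dw for L(w) = w^2

w, v = 5.0, 0.0
beta, eta = 0.9, 0.1
for _ in range(50):
    lookahead = w - beta * v           # peek ahead along the momentum direction
    v = beta * v + eta * grad(lookahead)  # gradient taken at the look-ahead point
    w = w - v
print(round(w, 4))  # converges toward the minimum at w = 0
```

Plain momentum would instead use `grad(w)` at the current position; the look-ahead gradient is what distinguishes NAG.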

7. Consider a gradient profile ∇W = [1, 0.9, 0.6, 0.01, 0.1, 0.2, 0.5, 0.55, 0.56].

Assume v_{−1} = 0, ϵ = 0, β = 0.9, and the initial learning rate is η = 0.1. Suppose that we use the Adagrad algorithm; then what is the value of η_{6} = η/sqrt(v_{6}+ϵ)?

- 0.03
- 0.06
- 0.08
- 0.006

Answer :-
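The computation can be sketched as follows, assuming the standard Adagrad accumulation v_{t} = v_{t−1} + g_{t}^{2} with indexing starting at t = 0 (note Adagrad does not use β; that value is a decoy here):

```python
import math

# Adagrad accumulates squared gradients: v_t = v_{t-1} + g_t**2.
grads = [1, 0.9, 0.6, 0.01, 0.1, 0.2, 0.5, 0.55, 0.56]
eta, eps = 0.1, 0.0

v = 0.0
eta_6 = None
for t, g in enumerate(grads):
    v += g ** 2
    if t == 6:  # effective learning rate after the seventh gradient (t = 6)
        eta_6 = eta / math.sqrt(v + eps)
print(round(eta_6, 2))  # 0.06
```

Here v_{6} = 1 + 0.81 + 0.36 + 0.0001 + 0.01 + 0.04 + 0.25 = 2.4701, so η_{6} = 0.1/√2.4701 ≈ 0.064.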

8. Which of the following can help avoid getting stuck in a poor local minimum while training a deep neural network?

- Using a smaller learning rate.
- Using a smaller batch size.
- Using a shallow neural network instead.
- None of the above.

Answer :-

9. What are the two main components of the ADAM optimizer?

- Momentum and learning rate.
- Gradient magnitude and previous gradient.
- Exponential weighted moving average and gradient variance.
- Learning rate and a regularization term.

Answer :-
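ADAM combines two exponentially weighted moving averages: one of the gradients (the momentum-like first moment) and one of the squared gradients (the second moment, used for adaptive scaling). A minimal single-parameter sketch on the hypothetical loss L(ω) = ω², with textbook default hyperparameters assumed:

```python
import math

# Adam's two components on the toy loss L(w) = w^2:
#   m: exponentially weighted moving average of gradients
#   v: exponentially weighted moving average of squared gradients
beta1, beta2, eta, eps = 0.9, 0.999, 0.01, 1e-8
w, m, v = 5.0, 0.0, 0.0
for t in range(1, 1001):
    g = 2 * w                        # gradient of L(w) = w^2
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g ** 2
    m_hat = m / (1 - beta1 ** t)     # bias correction for the first moment
    v_hat = v / (1 - beta2 ** t)     # bias correction for the second moment
    w -= eta * m_hat / (math.sqrt(v_hat) + eps)
print(round(w, 3))  # settles near the minimum at w = 0
```

The bias-corrected averages m_hat and v_hat compensate for the zero initialization of the moving averages in the early iterations.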

10. What is the role of activation functions in deep learning?

- Activation functions apply a non-linear transformation to the output of a neuron, allowing the network to learn complex patterns.
- Activation functions make the network faster by reducing the number of iterations needed for training.
- Activation functions are used to normalize the input data.
- Activation functions are used to compute the loss function.

Answer :-