Tutorial 4 Problems
CSC311, Fall 2021
1 Gradient Descent Intuition
Suppose we are trying to optimize the loss function $f(x) = \frac{1}{2} x^T A x$, where $x \in \mathbb{R}^2$.
1. Let $A = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix}$ and $x_0 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$.
What are the first two iterates of gradient descent, with a learning rate $\eta = 0.1$?
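For symmetric $A$, the gradient is $\nabla f(x) = Ax$, so each update is $x_{t+1} = x_t - \eta A x_t$. A minimal NumPy sketch for checking the iterates (the variable names and printout are ours, not part of the problem):

```python
import numpy as np

# f(x) = (1/2) x^T A x has gradient A x when A is symmetric.
A = np.array([[4.0, 0.0],
              [0.0, 1.0]])
x = np.array([1.0, 1.0])  # x_0
eta = 0.1                 # learning rate

for t in range(1, 3):
    x = x - eta * (A @ x)  # gradient descent step
    print(f"x_{t} = {x}")
# Prints x_1 = [0.6 0.9] and x_2 = [0.36 0.81].
```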
2. For which learning rates will gradient descent converge? The convergence speed is determined by how the error decreases in the "slowest" direction. What learning rate leads to the fastest convergence?
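Because $A$ is diagonal, each coordinate evolves independently; one way to see the convergence condition (a sketch of the standard argument, with $\lambda_i$ the eigenvalues of $A$):
\[
x_i^{(t+1)} = (1 - \eta \lambda_i)\, x_i^{(t)} \quad\Longrightarrow\quad x_i^{(t)} = (1 - \eta \lambda_i)^t\, x_i^{(0)},
\]
which converges iff $|1 - \eta \lambda_i| < 1$ for every $i$, i.e. $0 < \eta < 2/\lambda_{\max}$. The slowest factor $\max_i |1 - \eta \lambda_i|$ is smallest when the extreme eigenvalues are balanced, at $\eta^* = 2/(\lambda_{\min} + \lambda_{\max})$.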
3. Suppose we choose the optimal learning rate. How many steps of gradient descent does it take for both components to be less than $10^{-3}$ (0.001)?
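A quick numerical check (a sketch that assumes the balanced rate $\eta^* = 2/(1 + 4) = 0.4$ from the previous part):

```python
import numpy as np

A = np.array([[4.0, 0.0],
              [0.0, 1.0]])
x = np.array([1.0, 1.0])
eta = 0.4  # assumed optimal rate 2 / (lambda_min + lambda_max)

steps = 0
while np.max(np.abs(x)) >= 1e-3:  # until both components are below 1e-3
    x = x - eta * (A @ x)
    steps += 1
print(steps)  # both components shrink by a factor of 0.6 per step, so 0.6^t < 1e-3 at t = 14
```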
4. Repeat the previous two parts with $A = \begin{pmatrix} 100 & 0 \\ 0 & 1 \end{pmatrix}$.
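The same sketch applies with $\lambda_{\max} = 100$; writing $\kappa = \lambda_{\max}/\lambda_{\min}$ for the condition number,
\[
\eta^* = \frac{2}{\lambda_{\min} + \lambda_{\max}} = \frac{2}{101}, \qquad \max_i |1 - \eta^* \lambda_i| = \frac{\kappa - 1}{\kappa + 1} = \frac{99}{101},
\]
so the error now shrinks by only about $2\%$ per step, and reaching $10^{-3}$ takes on the order of $\ln(10^{-3}) / \ln(99/101) \approx 345$ steps. This is the usual illustration of how a large condition number slows gradient descent.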
2 Sum of Convex Functions
Prove that the sum of two convex functions is convex.
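A sketch of the standard argument, directly from the definition of convexity: for any $x, y$ and $\lambda \in [0, 1]$,
\[
\begin{aligned}
(f + g)(\lambda x + (1 - \lambda) y) &= f(\lambda x + (1 - \lambda) y) + g(\lambda x + (1 - \lambda) y) \\
&\le \lambda f(x) + (1 - \lambda) f(y) + \lambda g(x) + (1 - \lambda) g(y) \\
&= \lambda (f + g)(x) + (1 - \lambda) (f + g)(y).
\end{aligned}
\]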