Report - Mini-Course 1: SGD Escapes Saddle Points

Why do we use SGD? Initially because:
- Much cheaper to compute using mini-batch
- Can still converge to global minimum in convex case
- Can
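
The first two points can be illustrated with a minimal sketch, assuming a one-dimensional least-squares problem that is not part of the original slides. The function names, the synthetic data, and the hyperparameters (`batch_size`, `lr`, `steps`) are all hypothetical choices made for illustration. Each step uses only a small batch to estimate the gradient, and because the loss is convex the iterate still ends up close to the global minimizer. With a constant step size, as here, it settles in a small neighborhood of the minimizer rather than reaching it exactly.

```python
import random


def minibatch_sgd(xs, ys, batch_size=8, lr=0.05, steps=2000, seed=0):
    """Minimize the convex loss (1/n) * sum((w * x - y)^2) over a scalar w."""
    rng = random.Random(seed)
    n = len(xs)
    w = 0.0
    for _ in range(steps):
        # Sample a mini-batch: one gradient estimate costs O(batch_size), not O(n).
        batch = rng.sample(range(n), batch_size)
        grad = sum(2.0 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / batch_size
        w -= lr * grad
    return w


def least_squares_solution(xs, ys):
    """Closed-form global minimizer of the same loss, used for comparison."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Hypothetical synthetic data: y = 3x plus a little Gaussian noise.
data_rng = random.Random(1)
xs = [data_rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [3.0 * x + data_rng.gauss(0.0, 0.1) for x in xs]

w_sgd = minibatch_sgd(xs, ys)
w_star = least_squares_solution(xs, ys)
print(f"SGD estimate: {w_sgd:.3f}, exact minimizer: {w_star:.3f}")
```

Each update here touches 8 of the 200 data points, so a step costs a fraction of a full-gradient step, which is the sense in which mini-batching is cheaper.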
