Comments:
Thank you for the video! Which tool did you use to display the planes and graphs with the ascents?
Great video! Helped me a lot.
Unbelievable explanation as usual, prof.
Thanks
So once I have my gradient vector for the two-dimensional function, how do I turn it into a single-variable equation?
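For the question above: fixing the current point and the gradient direction restricts f to a line, giving a function of the single step-size variable t. A minimal Python sketch (the two-variable function and starting point are illustrative assumptions, not from the video):

```python
import numpy as np

def f(p):
    # example two-variable function (an assumption for illustration)
    x, y = p
    return x**2 + 3 * y**2

def grad_f(p):
    # analytic gradient of the example f
    x, y = p
    return np.array([2 * x, 6 * y])

p0 = np.array([1.0, 1.0])
d = -grad_f(p0)        # steepest-descent direction at p0

def g(t):
    # the single-variable restriction of f along the descent direction
    return f(p0 + t * d)
```

Minimizing g over t is then an ordinary one-variable calculus problem.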
When I did my PhD in aerodynamic shape optimization for airplanes, we would have hundreds or thousands of variables defining the shape of the wing, and the “function” we were minimizing (e.g. drag at a fixed lift) could not be defined analytically (it required a full flow analysis), so a lot of research went into finding the gradient direction. Also, while we would start with the steepest-descent direction, we would also build an approximate Hessian matrix that gave us a progressively clearer picture of the function space after every step. The most efficient step sizes we used were quite different from those of the steepest-descent line search.
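The approximate-Hessian idea this commenter describes is the quasi-Newton family, of which BFGS is the standard example. A minimal sketch in Python, assuming an illustrative quadratic objective rather than anything from the video or the comment:

```python
import numpy as np

def f(x):
    # example objective with different curvature per axis (an assumption)
    return x[0] ** 2 + 10 * x[1] ** 2

def grad(x):
    return np.array([2 * x[0], 20 * x[1]])

def bfgs(x, iters=50):
    n = len(x)
    H = np.eye(n)                  # approximate *inverse* Hessian
    g = grad(x)
    for _ in range(iters):
        p = -H @ g                 # quasi-Newton search direction
        t = 1.0
        while f(x + t * p) > f(x):  # simple backtracking line search
            t *= 0.5
        s = t * p
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g
        sy = s @ y
        if sy > 1e-12:             # curvature condition for a safe update
            rho = 1.0 / sy
            I = np.eye(n)
            H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
                + rho * np.outer(s, s)
        x, g = x_new, g_new
        if np.linalg.norm(g) < 1e-8:
            break
    return x
```

After each step, H is nudged so that it maps the observed gradient change y back onto the step s, which is the "progressively clearer picture" the commenter mentions.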
I'm in the 9th grade and have understood 1/27 of what you're talking about. I guess I have to watch this 27 more times.
Do you know of good techniques for discontinuous search spaces? I had been thinking of a genetic algorithm (though that's on the back burner while I write a ton of code for it to figure out how good whatever it comes up with is).
Really happy to see ML content on the channel!
Optimizing functions ❌️
Something we heard about but didn't give enough time in that 30-hour machine learning course ✅️
Thanks for the vid!
Hello Dr. Bazett, I have been watching a lot of your videos during my current PhD. I just want to say that you make hard concepts visual and more intuitive; it has been a great help to me! Please continue making more! I would love to see you explain some convergence analysis!
Wow, the most beautiful explanation I have come across today.
The plural of maximum is maxima. Anything else is uneducated!
This video has been very interesting. I have really enjoyed and benefited from your videos, especially the ones on vector calculus. When I did my engineering degree ten years ago, either I wasn't aware of these types of videos or they didn't exist. I am currently studying for a physics degree remotely and these videos have been IMMENSELY helpful. You and Prof. Leonard and many others are changing lives. I understand concepts now that I didn't even realize I didn't understand the first time around. The graphics and your teaching style: awesome. Thank you very very much.
AWESOME, thanks, gradient descent is kind of neat
What an interesting and useful video! I haven't thought about this kind of thing for a good while now. But I'm curious... what's the deal with the conjugate gradient method?
Hello, that was perhaps one of the most beautiful explanations and visualisations of the gradient I've come across. I could not be happier after watching this video; your excitement is what mathematics is all about, discovering beautiful things. Thank you so much!
I literally searched "Gradient Descent Trefor Bazett" two days ago... This is destiny.
I was talking to a friend yesterday about strategies for optimizing the learning rate and boom! You made it really clear. Thanks. Though this might seem less expensive computationally, in a real problem one does not know the loss/target function analytically and must evaluate many model parameters sampled/updated in the direction of the gradient, then select the best parameters according to the newly computed losses, right? So the optimizer needs many more model evaluations to select the best update. This can be even more computationally expensive than SGD.
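The evaluation-count concern above can be made concrete: a per-step search over candidate step sizes pays one loss evaluation per candidate, on top of the gradient. A hedged Python sketch (the quadratic loss and the candidate grid are assumptions for illustration):

```python
import numpy as np

def f(x):
    # example loss surface (an assumption, not the video's function)
    return x[0] ** 2 + 5 * x[1] ** 2

def grad(x):
    return np.array([2 * x[0], 10 * x[1]])

evals = 0  # count how many times the loss is queried

def f_counted(x):
    global evals
    evals += 1
    return f(x)

def line_search_gd(x, steps=20):
    # gradient descent where each step picks the best of several trial sizes
    for _ in range(steps):
        d = -grad(x)
        ts = [0.01, 0.03, 0.1, 0.3]  # candidate step sizes
        # min() evaluates the loss once per candidate, so the line search
        # multiplies the per-step cost, exactly as the comment observes
        x = min((x + t * d for t in ts), key=f_counted)
    return x

x_final = line_search_gd(np.array([1.0, 1.0]))
```

Here 20 steps cost 80 loss evaluations, versus 0 extra for a fixed step size.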
I'm currently using the method of moving asymptotes (Svanberg, 1987) for nonlinear optimization of nonconvex problems in topology optimization. However, using it and fully understanding it are two different things. I would be curious whether you could give a simple explanation of these kinds of methods.
Just want to say this channel is fantastic; it's very enjoyable listening to you talk about these topics.
Now this is amazing! Please continue the ML math stuff!!
I am interested in gradient descent when you only have samples and resampling is not necessarily possible. Do you know of any good research in that area?
Very interesting to reduce this to a 1D optimization at each step. How would you rate the convergence speed and cost of this method compared to, for example, iterating with updates based on knowing both the Jacobian and the Hessian matrix (without, of course, considering a number of variables that would make the Hessian inversion too cumbersome)?
With the video's approach you still need to find a stationary point in your 1D restriction, with some guess of a second directional derivative, but of course it looks much leaner.
Thank you so much for the insight.
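For contrast with the 1D-restriction approach, a full Newton update uses the Hessian directly and lands on the minimizer of a quadratic in a single step, at the cost of forming and solving with the Hessian. A sketch, assuming a made-up 2x2 quadratic f(x) = ½xᵀAx:

```python
import numpy as np

# Hessian of the example quadratic f(x) = 0.5 x^T A x (an assumption)
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

def grad(x):
    return A @ x            # gradient of the quadratic

def newton_step(x):
    # full Newton update: solve H p = grad, then step by -p (here H = A)
    return x - np.linalg.solve(A, grad(x))

x = np.array([1.0, -2.0])
x = newton_step(x)          # lands exactly on the minimizer for a quadratic
```

For general nonconvex functions the Newton step is only locally this good, which is part of the trade-off the comment asks about.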
My only complaint is that it's too short! I would love to see more about variations of gradient descent, like space curves and the things you mentioned such as momentum as a parameter, or higher/complex dimensions, etc.! Especially coming up with cool new interpolations on scalar or vector fields with calculus.
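Momentum, mentioned above, is one of the simplest variations: the update accumulates a decaying sum of past gradients instead of using only the current one. A hedged Python sketch (the bowl-shaped objective and the hyperparameters are assumptions for illustration):

```python
import numpy as np

def grad(x):
    # gradient of the example bowl f(x) = x1^2 + 4*x2^2 (an assumption)
    return np.array([2 * x[0], 8 * x[1]])

def momentum_gd(x, lr=0.05, beta=0.9, steps=300):
    v = np.zeros_like(x)
    for _ in range(steps):
        v = beta * v - lr * grad(x)  # velocity accumulates past gradients
        x = x + v                    # step by the velocity, not the raw gradient
    return x
```

The decay factor beta controls how much "inertia" the iterate carries through narrow valleys.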
Great video!
Yes, well explained.
I had this type of situation more than once (admittedly one-dimensional). Defining a search pattern that is relatively fast and guaranteed to converge was interesting. Even defining the end-point condition required clear thought.
Who said the life and pensions industry was boring?
Dude, I am studying CS and becoming an expert in AI, but I found your account because of LaTeX. Now this gets recommended to me. The recommendation algorithm at its finest ;D
One question. Say I wanted to implement it in Matlab. I thought about building the one-variable function by estimating the gradient numerically. Now, once I have this function defined, how would you go about finding its extrema? I thought of estimating its derivative, also numerically, and then defining a function "FindExtrema(g'(phi))" that performs Newton's method on a set of N equispaced points from (0, b] and then evaluates the function of phi at the estimated roots, keeping the one that minimizes or maximizes it, respectively. Does this make sense?
PS: I would also evaluate the endpoints to see if the extremum is there.
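The procedure this comment describes, numerical derivatives plus Newton's method from several seeds plus an endpoint check, can be sketched as follows (Python rather than the comment's Matlab; the 1D function g is an illustrative assumption):

```python
import numpy as np

def g(phi):
    # example 1D restriction of the objective (an assumption)
    return np.cos(3 * phi) + 0.5 * phi

def dg(phi, h=1e-5):
    # central-difference estimate of g'
    return (g(phi + h) - g(phi - h)) / (2 * h)

def ddg(phi, h=1e-4):
    # central-difference estimate of g''
    return (g(phi + h) - 2 * g(phi) + g(phi - h)) / h ** 2

def find_min(b, n_seeds=8, newton_iters=20):
    # Newton's method on g' from equispaced seeds in (0, b],
    # then compare the surviving roots plus the endpoint, as in the PS
    candidates = [b]
    for phi in np.linspace(b / n_seeds, b, n_seeds):
        for _ in range(newton_iters):
            d2 = ddg(phi)
            if abs(d2) < 1e-12:
                break
            phi -= dg(phi) / d2
        if 0 < phi <= b:
            candidates.append(phi)
    return min(candidates, key=g)
```

One caveat the multi-seed comparison handles: Newton on g' converges to any stationary point, maxima included, so the final evaluation of g at each candidate is what actually sorts minima from maxima.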
Great video! I was curious about the method a month ago and decided to implement it and animate it using "naive GD", i.e. a constant step size. What is the name of this gradient descent method, where you optimize the choice of the step size at each iteration?
Our professor went over this in the last lecture of our calc 2 class. It was amazing to understand!
This year I will be going to college once I am finished with my high school exams and entrance exams. Can't wait to explore all of your content in college.
Bro, I was watching this like, "Cool video, how old is this?"
1 hour ago
Great explanation of the variable-step method. But how do you calculate the gradient vector? It seems you need that computationally expensive differentiation?!
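When no analytic gradient is available, one common (and, as the comment notes, expensive) fallback is finite differences, which costs two function evaluations per coordinate with the central-difference formula. A minimal sketch, assuming an illustrative two-variable function:

```python
import numpy as np

def f(x):
    # example scalar function of several variables (an assumption)
    return x[0] ** 2 + np.sin(x[1])

def numerical_grad(f, x, h=1e-6):
    # central differences: 2 * len(x) evaluations of f per gradient,
    # which is exactly the cost concern the comment raises
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g
```

For high-dimensional problems this per-coordinate cost is why analytic or automatic differentiation is usually preferred.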
I had been thinking about this topic for several days, and now the video is here. Thanks, Prof., you have helped me a lot.
Nice video! For being so simple, it is fascinating how well gradient descent and its variants work. As you mention, it is not at all guaranteed to find a good minimizer for non-convex functions in low dimensions, but in higher dimensions things just magically seem to work out, both numerically and (recently) mathematically. There's so much about high-dimensional objects that isn't really captured by our low-dimensional intuition, which unfortunately is quite limited. I recently saw a quote from Geoff Hinton that said:
"To deal with a 14-dimensional space, visualize a 3-D space and say 'fourteen' to yourself very loudly."
First things first, what did you use for the plots? ;-)
Great work
Clearly not exact maths. Very clever.
♥️♥️♥️
Excellent video, Trefor!