Comments:
Thank you for the video! Which tool did you use to display the planes and graphs with the ascents?
Great video! Helped me a lot.
Unbelievable explanation as usual, prof.
Thanks
So once I have my gradient vector for the two-dimensional function, how do I turn it into a single-variable equation?
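For the question above: fixing the current point and the gradient direction restricts f to a line, giving a function of the single step-size variable t. A minimal Python sketch (the two-variable function and starting point are illustrative assumptions, not from the video):

```python
import numpy as np

def f(p):
    # example two-variable function (an assumption for illustration)
    x, y = p
    return x**2 + 3 * y**2

def grad_f(p):
    # analytic gradient of the example f
    x, y = p
    return np.array([2 * x, 6 * y])

p0 = np.array([1.0, 1.0])
d = -grad_f(p0)        # steepest-descent direction at p0

def g(t):
    # the single-variable restriction of f along the descent direction
    return f(p0 + t * d)
```

Minimizing g over t is then an ordinary one-variable calculus problem.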
When I did my PhD in aerodynamic shape optimization for airplanes, we would have hundreds or thousands of variables defining the shape of the wing, and the “function” we were minimizing (e.g. drag at a fixed lift) could not be defined analytically (it required a full flow analysis), so a lot of research went into finding the gradient direction. Also, while we would start with the steepest-descent direction, we would also build an approximate Hessian matrix that gave us a progressively clearer picture of the function space after every step. The most efficient step sizes we used were quite different from those of the steepest-descent line search.
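The approximate-Hessian idea this commenter describes is the quasi-Newton family, of which BFGS is the standard example. A minimal sketch in Python, assuming an illustrative quadratic objective rather than anything from the video or the comment:

```python
import numpy as np

def f(x):
    # example objective with different curvature per axis (an assumption)
    return x[0] ** 2 + 10 * x[1] ** 2

def grad(x):
    return np.array([2 * x[0], 20 * x[1]])

def bfgs(x, iters=50):
    n = len(x)
    H = np.eye(n)                  # approximate *inverse* Hessian
    g = grad(x)
    for _ in range(iters):
        p = -H @ g                 # quasi-Newton search direction
        t = 1.0
        while f(x + t * p) > f(x):  # simple backtracking line search
            t *= 0.5
        s = t * p
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g
        sy = s @ y
        if sy > 1e-12:             # curvature condition for a safe update
            rho = 1.0 / sy
            I = np.eye(n)
            H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
                + rho * np.outer(s, s)
        x, g = x_new, g_new
        if np.linalg.norm(g) < 1e-8:
            break
    return x
```

After each step, H is nudged so that it maps the observed gradient change y back onto the step s, which is the "progressively clearer picture" the commenter mentions.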
I'm in the 9th grade and have understood 1/27 of what you're talking about. I guess I have to watch this 27 more times.
Do you know of good techniques for discontinuous search spaces? I had been thinking of a genetic algorithm (though that's on the back burner while I write a ton of code for it to figure out how good whatever it comes up with is).
Really happy to see ML content on the channel!
Optimizing functions ❌️
Something we heard about but didn't give enough time in that 30-hour machine learning course ✅️
Thanks for the vid!
Hello Dr. Bazett, I have been watching a lot of your videos during my current PhD. I just want to say that you make hard concepts visual and more intuitive; it has been a great help to me! Please continue making more! I would love to see you explain some convergence analysis!
Wow, the most beautiful explanation I have come across today.
The plural of maximum is maxima. Anything else is uneducated!
This video has been very interesting. I have really enjoyed and benefited from your videos, especially the ones on vector calculus. When I did my engineering degree ten years ago, either I wasn't aware of these types of videos or they didn't exist. I am currently studying for a physics degree remotely and these videos have been IMMENSELY helpful. You and Prof. Leonard and many others are changing lives. I understand concepts now that I didn't even realize I didn't understand the first time around. The graphics and your teaching style: awesome. Thank you very very much.
AWESOME, thanks, gradient descent is kind of neat
What an interesting and useful video! I haven't thought about this kind of thing for a good while now. But I'm curious... what's the deal with the conjugate gradient method?
Hello, that was perhaps one of the most beautiful explanations and visualisations of the gradient I've come across. I could not be happier after watching this video; your excitement is what mathematics is all about, discovering beautiful things. Thank you so much!
I literally searched "Gradient Descent Trefor Bazett" two days ago... This is destiny.
I was talking to a friend yesterday about strategies for optimizing the learning rate and boom! You made it really clear. Thanks. Though this might seem less expensive computationally, in a real problem one does not know the loss/target function analytically and must evaluate many model parameters sampled/updated in the direction of the gradient, then select the best parameters according to the newly computed losses, right? So the optimizer needs many more model evaluations to select the best update. This can be even more computationally expensive than SGD.
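The evaluation-count concern above can be made concrete: a per-step search over candidate step sizes pays one loss evaluation per candidate, on top of the gradient. A hedged Python sketch (the quadratic loss and the candidate grid are assumptions for illustration):

```python
import numpy as np

def f(x):
    # example loss surface (an assumption, not the video's function)
    return x[0] ** 2 + 5 * x[1] ** 2

def grad(x):
    return np.array([2 * x[0], 10 * x[1]])

evals = 0  # count how many times the loss is queried

def f_counted(x):
    global evals
    evals += 1
    return f(x)

def line_search_gd(x, steps=20):
    # gradient descent where each step picks the best of several trial sizes
    for _ in range(steps):
        d = -grad(x)
        ts = [0.01, 0.03, 0.1, 0.3]  # candidate step sizes
        # min() evaluates the loss once per candidate, so the line search
        # multiplies the per-step cost, exactly as the comment observes
        x = min((x + t * d for t in ts), key=f_counted)
    return x

x_final = line_search_gd(np.array([1.0, 1.0]))
```

Here 20 steps cost 80 loss evaluations, versus 0 extra for a fixed step size.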
I'm currently using the method of moving asymptotes (Svanberg, 1987) for nonlinear optimization of nonconvex problems in topology optimization. However, using it and fully understanding it are two different things. I would be curious whether you could give a simple explanation of these kinds of methods.
Just want to say this channel is fantastic; it's very enjoyable listening to you talk about these topics.
Now this is amazing! Please continue the ML math stuff!!
I am interested in gradient descent when you only have samples and resampling is not necessarily possible. Do you know of any good research in that area?
Very interesting to reduce this to a 1D optimization at each step. How would you rate the convergence speed and cost of this method compared to, for example, iterating with updates based on knowing both the Jacobian and the Hessian matrix (without, of course, considering a number of variables that would make the Hessian inversion too cumbersome)?
With the video's approach you still need to find a stationary point in your 1D restriction, with some guess of a second directional derivative, but of course it looks much leaner.
Thank you so much for the insight.
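For contrast with the 1D-restriction approach, a full Newton update uses the Hessian directly and lands on the minimizer of a quadratic in a single step, at the cost of forming and solving with the Hessian. A sketch, assuming a made-up 2x2 quadratic f(x) = ½xᵀAx:

```python
import numpy as np

# Hessian of the example quadratic f(x) = 0.5 x^T A x (an assumption)
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

def grad(x):
    return A @ x            # gradient of the quadratic

def newton_step(x):
    # full Newton update: solve H p = grad, then step by -p (here H = A)
    return x - np.linalg.solve(A, grad(x))

x = np.array([1.0, -2.0])
x = newton_step(x)          # lands exactly on the minimizer for a quadratic
```

For general nonconvex functions the Newton step is only locally this good, which is part of the trade-off the comment asks about.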
My only complaint is that it's too short! I would love to see more about variations of gradient descent, like space curves and the things you mentioned such as momentum as a parameter, or higher/complex dimensions, etc.! Especially coming up with cool new interpolations on scalar or vector fields with calculus.
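Momentum, mentioned above, is one of the simplest variations: the update accumulates a decaying sum of past gradients instead of using only the current one. A hedged Python sketch (the bowl-shaped objective and the hyperparameters are assumptions for illustration):

```python
import numpy as np

def grad(x):
    # gradient of the example bowl f(x) = x1^2 + 4*x2^2 (an assumption)
    return np.array([2 * x[0], 8 * x[1]])

def momentum_gd(x, lr=0.05, beta=0.9, steps=300):
    v = np.zeros_like(x)
    for _ in range(steps):
        v = beta * v - lr * grad(x)  # velocity accumulates past gradients
        x = x + v                    # step by the velocity, not the raw gradient
    return x
```

The decay factor beta controls how much "inertia" the iterate carries through narrow valleys.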
Great video!
Yes, well explained.
I had this type of situation more than once (admittedly one-dimensional). Defining a search pattern that is relatively fast and guaranteed to converge was interesting. Even defining the end-point condition required clear thought.
Who said the life and pensions industry was boring?
Dude, I am studying CS and becoming an expert in AI, but I found your account because of LaTeX. Now this gets recommended to me. The recommendation algorithm at its finest ;D
One question. Say I wanted to implement it in Matlab. I thought about building the one-variable function by estimating the gradient numerically. Now, once I have this function defined, how would you go about finding its extrema? I thought of estimating its derivative, also numerically, and then defining a function "FindExtrema(g'(phi))" that performs Newton's method on a set of N equispaced points from (0, b] and then evaluates the function of phi at the estimated roots, keeping the one that minimizes or maximizes it, respectively. Does this make sense?
PS: I would also evaluate the endpoints to see if the extremum is there.
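The procedure this comment describes, numerical derivatives plus Newton's method from several seeds plus an endpoint check, can be sketched as follows (Python rather than the comment's Matlab; the 1D function g is an illustrative assumption):

```python
import numpy as np

def g(phi):
    # example 1D restriction of the objective (an assumption)
    return np.cos(3 * phi) + 0.5 * phi

def dg(phi, h=1e-5):
    # central-difference estimate of g'
    return (g(phi + h) - g(phi - h)) / (2 * h)

def ddg(phi, h=1e-4):
    # central-difference estimate of g''
    return (g(phi + h) - 2 * g(phi) + g(phi - h)) / h ** 2

def find_min(b, n_seeds=8, newton_iters=20):
    # Newton's method on g' from equispaced seeds in (0, b],
    # then compare the surviving roots plus the endpoint, as in the PS
    candidates = [b]
    for phi in np.linspace(b / n_seeds, b, n_seeds):
        for _ in range(newton_iters):
            d2 = ddg(phi)
            if abs(d2) < 1e-12:
                break
            phi -= dg(phi) / d2
        if 0 < phi <= b:
            candidates.append(phi)
    return min(candidates, key=g)
```

One caveat the multi-seed comparison handles: Newton on g' converges to any stationary point, maxima included, so the final evaluation of g at each candidate is what actually sorts minima from maxima.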
Great video! I was curious about the method a month ago and decided to implement it and animate it using "naive GD", i.e. a constant step size. What is the name of this gradient descent method, where you optimize the choice of the step size at each iteration?
Our professor went over this in the last lecture of our calc 2 class. It was amazing to understand!
This year I will be going to college once I am finished with my high school exams and entrance exams. Can't wait to explore all of your content in college.
Bro, I was watching this like, "Cool video, how old is this?"
1 hour ago
Great explanation of the variable-step method. But how do you calculate the gradient vector? It seems you need that computationally expensive differentiation?!
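When no analytic gradient is available, one common (and, as the comment notes, expensive) fallback is finite differences, which costs two function evaluations per coordinate with the central-difference formula. A minimal sketch, assuming an illustrative two-variable function:

```python
import numpy as np

def f(x):
    # example scalar function of several variables (an assumption)
    return x[0] ** 2 + np.sin(x[1])

def numerical_grad(f, x, h=1e-6):
    # central differences: 2 * len(x) evaluations of f per gradient,
    # which is exactly the cost concern the comment raises
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g
```

For high-dimensional problems this per-coordinate cost is why analytic or automatic differentiation is usually preferred.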
I had been thinking about this topic for several days, and now the video is here. Thanks, Prof., you have helped me a lot.
Nice video! For being so simple, it is fascinating how well gradient descent and its variants work. As you mention, it is not at all guaranteed to find a good minimizer for non-convex functions in low dimensions, but in higher dimensions things just magically seem to work out, both numerically and (recently) mathematically. There's so much about high-dimensional objects that isn't really captured by our low-dimensional intuition, which unfortunately is quite limited. I recently saw a quote from Geoff Hinton that said:
"To deal with a 14-dimensional space, visualize a 3-D space and say 'fourteen' to yourself very loudly."
First things first, what did you use for the plots? ;-)
Great work
Clearly not exact maths. Very clever.
♥️♥️♥️
Excellent video, Trefor!