All Convolution Animations Are Wrong (Neural Networks)


Animated AI

1 year ago

55,035 views



Comments:

oren A - 11.10.2023 23:15

Your videos are very cool! I wonder if you have thought about how to present Conv3d. It is a challenge when considering more than one channel.

Andrew's_Lab - 04.09.2023 20:23

I don't really care about the animations, the problem is when they start describing convolutions as 2D operations and don't go into detail on the effect of having multiple input and output channels.
I wish I had found this video sooner, but anyway it's easy enough to derive the solutions yourself from 200 Google search results. (Google really sucks nowadays.)
It's actually a good mental exercise to imagine the 3D/4D filter sliding across a batch of images... But good luck finding the correct padding for strided convolutions during backpropagation of both Conv and TransConv layers... I had to derive everything by hand, because the internet has incorrect and, even worse, conflicting formulas for that... 😂
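
For reference, a minimal sketch of the forward output-size rules only (it does not attempt the backward-pass derivation the comment mentions). The formulas follow the ones documented for PyTorch's `Conv2d` and `ConvTranspose2d`, as far as I know; other frameworks may use different padding conventions. The function names are hypothetical.

```python
def conv_out_size(n, k, stride=1, padding=0, dilation=1):
    """Output length of a convolution along one spatial axis."""
    return (n + 2 * padding - dilation * (k - 1) - 1) // stride + 1


def conv_transpose_out_size(n, k, stride=1, padding=0, dilation=1,
                            output_padding=0):
    """Output length of a transposed convolution along one spatial axis."""
    return (n - 1) * stride - 2 * padding + dilation * (k - 1) + output_padding + 1


# With stride 2, two different input sizes collapse to the same output size...
print(conv_out_size(7, k=3, stride=2, padding=1))
print(conv_out_size(8, k=3, stride=2, padding=1))

# ...so the transposed layer needs output_padding to say which one to restore.
print(conv_transpose_out_size(4, k=3, stride=2, padding=1))
print(conv_transpose_out_size(4, k=3, stride=2, padding=1, output_padding=1))
```

The floor division in the forward rule is one plausible source of the conflicting formulas: with a stride above 1, the mapping from input size to output size is not invertible without extra information.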

Anmar Karmush - 31.08.2023 12:55

Is a cube in the filter (or image) a pixel? Or is it a combination of channels?

37 window - 20.08.2023 21:36

3D, what tool are you using? Blender?

k shamanth kumar - 23.07.2023 10:34

Amazing work 😍

Tom O - 19.07.2023 11:54

Instead of spending 95% of the video ranting about how other animations are bad, I would have appreciated it more if you had spent that time explaining how this animation works. I don't think I learned anything from this video. How do you go from an input RGB image of size W * H * 3 to some cube of size 5 * 5 * 5 (+padding)? You lost me at step 1.
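
On the shape question, here is a minimal NumPy sketch of the general rule, not of the video's specific animation: each filter spans all input channels, and the number of filters sets the output depth. The layout (height, width, channels), the sizes, and the name `conv2d_naive` are assumptions chosen for illustration.

```python
import numpy as np


def conv2d_naive(x, w, stride=1):
    """Multi-channel 2D convolution without padding.

    x: input of shape (H, W, C_in)
    w: filters of shape (K, K, C_in, C_out)
    returns: output of shape (H_out, W_out, C_out)
    """
    h, wd, c_in = x.shape
    k, _, c_in_w, c_out = w.shape
    assert c_in == c_in_w, "filter depth must match the input depth"
    h_out = (h - k) // stride + 1
    w_out = (wd - k) // stride + 1
    y = np.zeros((h_out, w_out, c_out))
    for i in range(h_out):
        for j in range(w_out):
            # One K x K x C_in patch of the input, covering every channel.
            patch = x[i * stride:i * stride + k, j * stride:j * stride + k, :]
            for o in range(c_out):
                # One output value per filter: multiply elementwise, then sum.
                y[i, j, o] = np.sum(patch * w[:, :, :, o])
    return y


image = np.ones((5, 5, 3))        # a tiny "RGB" input: 5 x 5 pixels, 3 channels
filters = np.ones((3, 3, 3, 8))   # 8 filters, each 3 x 3 x 3
features = conv2d_naive(image, filters)
print(features.shape)             # (3, 3, 8)
```

The output depth (8 here) comes only from the number of filters, which is why an RGB image with 3 channels can feed a layer whose output has any depth.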

Kartik Podugu - 14.07.2023 07:25

Amazing. You have cleared all my doubts in a single shot.

날개달린_양 - 12.07.2023 03:19

tysm

Jazzvids - 08.07.2023 17:01

Thank you so much for this! worth mentioning that the animation has a stride of 2

Chris Francis (UCSD) - 08.06.2023 11:16

They are not wrong. They just show a special case. They use the special case because the focus is on things like stride, dilation, padding etc.

It's good to make the 3D tensor animations, but don't call the existing ones wrong. I think I would have still found it easier to understand the existing ones first and then move on to the 3D animations.

Bala Dhinesh - 22.05.2023 05:32

This is what I expected for a long time. This explains everything clearly. Thanks for posting this.

Laplaceha - 04.05.2023 06:37

This conv2d animation of yours is right, thanks a lot.

Éliphas Lévi - 09.04.2023 17:09

Awesome. Finally a good representation of these computations. Thanks for your hard work!!!

Curious_One - 23.03.2023 19:16

Bravo !

Xardex - 21.02.2023 12:16

I hear you

Axel Anderson - 29.01.2023 17:54

Geonodes were easier?

abdulmoti diab - 17.01.2023 19:20

Best video about neural convolution and filters!? YES!!!
Thank you so much!

Tadashi - 24.12.2022 21:42

Any plans to add your animations to Wikimedia commons? :)

Naas van Rooyen - 23.12.2022 11:21

Thanks so much for this. I also really struggled to find proper animations. I would have liked to see how this looks in the actual neural network, i.e. how the filter can be visualized as the weights, or how the filter parameters are trained. I would greatly appreciate a video on GANs and LSTMs. The LSTM diagrams are terrible. I really struggled to visualize how they connect to the overall network.

kasuha - 22.12.2022 14:27

I find these "new and correct" animations confusing, I have no idea what's happening there. I assume this is just "the correct way to display convolution" for AI models? As an old school person who used convolutions mainly for 2D image processing (blur/edge detection) I don't see anything wrong about the old animations, that's exactly what we used to do there.

randy ekrer - 21.12.2022 12:30

You should've started with the typical RGB 3-layer input image and animated convolutions on that; that's where most people start to get lost as to how the weights match up with the inputs when translating from the 2D mental model to 3D.

Felipe Gustavo Silva Teodoro - 20.12.2022 04:00

Amazing!

David - 19.12.2022 17:05

Honestly, just write down the formula…
Nice work though!

guyindisguise - 18.12.2022 21:58

Nice animation, are you planning on making animations for Transformers as well?

Jason Conaway - 18.12.2022 11:46

Thank you! Great animation. However, I do have a technical nit pick. Your animation shows an operation known as cross-correlation, which is related to convolution, but it is mirrored. "Convolutional neural networks" use cross-correlations in the feed-forward phase and convolutions in the backpropagation phase.
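
A minimal NumPy sketch of the mirroring described here, for a single channel with no padding: convolving with a kernel gives the same result as cross-correlating with that kernel flipped along both axes. The function names are hypothetical, and the sketch does not cover the backpropagation claim.

```python
import numpy as np


def cross_correlate2d(x, k):
    """Slide the kernel over x as-is (what most CNN layers compute)."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out


def convolve2d(x, k):
    """Textbook convolution: kernel indices run opposite to the input's."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            for m in range(kh):
                for n in range(kw):
                    out[i, j] += x[i + kh - 1 - m, j + kw - 1 - n] * k[m, n]
    return out


x = np.arange(16, dtype=float).reshape(4, 4)
k = np.array([[1.0, 2.0], [3.0, 4.0]])   # asymmetric, so the flip matters

same = np.allclose(convolve2d(x, k), cross_correlate2d(x, k[::-1, ::-1]))
differ = not np.allclose(convolve2d(x, k), cross_correlate2d(x, k))
print(same, differ)  # True True
```

Because the filter weights are learned, a network trained with either operation simply ends up with mirrored weights, which is why the two names are often used interchangeably in practice.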

Григорий Погорелов - 18.12.2022 10:47

Adding a bias term after the convolution would make it a full representation of the process. Anyway, great visualization!
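
A small sketch of what the bias step could look like, assuming the common convention of one bias value per output channel, broadcast over every spatial position. The array `conv_out` is a random stand-in for a convolution result, not real data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the result of a convolution: 3 x 3 positions, 8 output channels.
conv_out = rng.standard_normal((3, 3, 8))

# Assumed convention: one scalar bias per output channel.
bias = np.arange(8, dtype=float)

# Broadcasting adds bias[o] to every spatial position of channel o.
y = conv_out + bias
print(y.shape)  # (3, 3, 8)
```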

Adrian Gray - 18.12.2022 06:28

Oh man, I'm so glad someone took a direct approach to this problem. When I was learning I was so confused by all these animations and explanations in 2D, and then seeing the resulting tensor shapes got me super confused: where did the depth go and where did it appear? Thanks for bringing this video to the world!

Hjtvgf Hjtghvfg - 18.12.2022 06:21

If all are wrong, then why should I watch this one?

Avi Drucker - 18.12.2022 05:57

A major thing that feels missing to me in the animations is clear textual labeling. It's fine that you label them out loud, and then, also, it would be more accessible for folks with hearing challenges or cognitive challenges. My crit aside, this animation is lovely, and I'm very impressed with what you've done. You've earned yourself a new subscriber :)

Flunkee - 18.12.2022 03:01

Oh.

Alessandro Polidori - 17.12.2022 22:47

Love it. I always thought there were no accurate visualizations on the internet too. Good job.

bloopbleepblop - 16.12.2022 20:57

They are not wrong. They are just displaying a different case than what you are interested in. Maybe they are misplaced in the material you were looking at, but if they were animations for different things, like convolution filters in image processing, they wouldn't be wrong. Have some humility.

Malte Ihlefeld - 16.12.2022 17:14

Thank you for this, recently I tried to explain why the input and output shapes behave the way they do, and what gets combined with what. These animations will make it sooo much easier!!

Alexey Chernyavskiy - 16.12.2022 12:14

They are not wrong. They are a simplification that helps to understand the concept. As any simplification they are incomplete. But not wrong. It's sad that you use clickbait titles.

Kuan - 16.12.2022 10:51

The example is just a concept. I don't agree with this sensational title.

Pere Ginebra - 16.12.2022 00:24

"a 2D convolution actually takes in a 3D tensor as input and has a 3D convolution as output", well, it depends right? If you have a single channel/grayscale image then the input is in fact a 2D tensor, and each feature outputs a 2D tensor that is joined with all others in the feature map. So if you have a grayscale image with a single feature, the animations would in fact be correct.

I think the animations are perfectly fine, as they simplify a concept to its most basic form for easy understanding. But it is true that after you understand the basic concept, a 3D-to-3D representation is also a nice way to understand more common and complex examples.

Disclaimer that I could be wrong as I am by no means an expert, but this is my take from my current understanding of convolutions :)
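
The single-channel special case described in this comment can be checked numerically. The sketch below assumes a (height, width, channels) layout and no padding, and uses random values purely for illustration: with one input channel and one filter, the multi-channel operation reduces to the classic 2D picture.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)
gray = rng.standard_normal((6, 6))     # grayscale image, no channel axis
kernel = rng.standard_normal((3, 3))   # a single 2D filter

# Classic 2D picture: slide a 3 x 3 window over the image.
win2d = sliding_window_view(gray, (3, 3))            # shape (4, 4, 3, 3)
out2d = np.einsum("ijmn,mn->ij", win2d, kernel)      # shape (4, 4)

# Multi-channel picture with C_in = 1 and C_out = 1.
x = gray[:, :, None]                                 # shape (6, 6, 1)
w = kernel[:, :, None, None]                         # shape (3, 3, 1, 1)
win3d = sliding_window_view(x, (3, 3), axis=(0, 1))  # shape (4, 4, 1, 3, 3)
out3d = np.einsum("ijcmn,mnco->ijo", win3d, w)       # shape (4, 4, 1)

print(np.allclose(out3d[:, :, 0], out2d))  # True
```

The same `einsum` expression handles any number of input and output channels, so the 2D animation corresponds to the case where both channel axes have length 1.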

Rojina Panta - 15.12.2022 23:57

I really appreciate the effort and it is a good one, but I would still go with the 2D one. This one is way too jittery for me, with so many things happening at once and the choice of colors.

Ανδρέας Καρατζάς - 15.12.2022 22:18

@animatedai How did you learn blender? Which were your sources?

Yunus BİLECE - 15.12.2022 04:40

I liked the idea, but the title is too big for this kind of correction.

Blu Valor - 15.12.2022 04:25

Lol, I literally learned this the hard way just about 2 months ago, when the shape for my 2d convolution required 3 parameters, and this made me super confused :,)

davidebic - 14.12.2022 13:11

Is there a way to access your course online? I'm really interested in this subject!

Alex gaming TV - 13.12.2022 22:37

Great work, such an animation for grouped convolution would be nice too.
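
For readers unfamiliar with the term, here is a rough NumPy sketch of a grouped convolution under the usual definition: input channels and filters are split into groups, and each group of filters only sees its own slice of the input channels. The function names, layout, and sizes are assumptions for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv2d(x, w):
    """Plain convolution. x: (H, W, C_in), w: (K, K, C_in, C_out), no padding."""
    k = w.shape[0]
    windows = sliding_window_view(x, (k, k), axis=(0, 1))  # (H', W', C_in, K, K)
    return np.einsum("ijcmn,mnco->ijo", windows, w)


def grouped_conv2d(x, w, groups):
    """Grouped convolution. w: (K, K, C_in // groups, C_out)."""
    in_per = x.shape[2] // groups
    out_per = w.shape[3] // groups
    outs = []
    for g in range(groups):
        xg = x[:, :, g * in_per:(g + 1) * in_per]       # this group's inputs
        wg = w[:, :, :, g * out_per:(g + 1) * out_per]  # this group's filters
        outs.append(conv2d(xg, wg))
    return np.concatenate(outs, axis=2)


rng = np.random.default_rng(0)
x = rng.standard_normal((5, 5, 4))      # 4 input channels
w = rng.standard_normal((3, 3, 2, 6))   # 6 filters, each only 2 channels deep
y = grouped_conv2d(x, w, groups=2)
print(y.shape)                          # (3, 3, 6)
```

With 2 groups, each filter is half as deep as in the ungrouped case, so the layer has half the weights; setting `groups=1` recovers the ordinary convolution.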

hos42 - 13.12.2022 12:16

Thank you for putting this out!

Mario González Otero - 13.12.2022 09:23

amazing!

PeaBrane - 12.12.2022 22:13

The animation is just meant as an abstraction of the spatial convolution operation itself. A spatial CNN layer consists of spatial convolution operations across multiple input and output channels (which is what you are referring to)
