Source of confusion! Neural Nets vs Image Processing Convolution

Animated AI

9 months ago

3,633 views



Comments:

@SaschaRobitzki
@SaschaRobitzki - 04.01.2024 06:20

Did you release already the Transformers video on Patreon?

@nugratasik4137
@nugratasik4137 - 19.12.2023 16:49

What are some example algorithms for image-processing convolution?
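Classic image-processing convolutions are fixed, hand-designed kernels: Sobel filters for edge detection, box/Gaussian kernels for blurring, sharpening kernels, and so on. A minimal NumPy sketch (the kernels and helper below are illustrative, not from the video):

```python
import numpy as np

def convolve2d(image, kernel):
    """Valid 2D (cross-)correlation of a grayscale image with a small kernel."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Classic hand-designed kernels:
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)  # horizontal-gradient (edge) filter
box_blur = np.full((3, 3), 1 / 9)              # simple smoothing filter

img = np.arange(25, dtype=float).reshape(5, 5)
edges = convolve2d(img, sobel_x)    # shape (3, 3)
smooth = convolve2d(img, box_blur)  # shape (3, 3)
```

Unlike a CNN, nothing here is learned; the kernel values are chosen by hand for a known effect.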

@user-bx7xh3wy1b
@user-bx7xh3wy1b - 03.12.2023 12:28

thanks thanks thanks this is the best

@MrLazini
@MrLazini - 25.11.2023 23:56

Great content

@Cm-zp8kw
@Cm-zp8kw - 25.11.2023 22:00

Very nice video! It fully cleared up my confusion about CNNs versus image-processing convolution.
However, as a newcomer to CNNs, my understanding is still a bit limited. For example, what other channels could an image input have besides the R, G, B color channels?
Also, I understand how im2col works for a 2D input, but once the input is 3D, does it work the same way as in 2D, or differently?
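On the im2col question: this is not from the video, but a minimal NumPy sketch of one common way im2col extends to a 3D (height x width x channels) input. Each column simply flattens a full 3x3xC patch, so convolution with 3x3xC kernels is still a single matrix multiply:

```python
import numpy as np

def im2col(x, kh, kw):
    """Turn an H x W x C input into a matrix whose columns are flattened
    kh x kw x C patches (valid positions only)."""
    H, W, C = x.shape
    oh, ow = H - kh + 1, W - kw + 1
    cols = np.zeros((kh * kw * C, oh * ow))
    for i in range(oh):
        for j in range(ow):
            cols[:, i * ow + j] = x[i:i + kh, j:j + kw, :].ravel()
    return cols

rng = np.random.default_rng(0)
x = rng.random((5, 5, 3))           # 5x5 RGB-like input
filters = rng.random((4, 3, 3, 3))  # 4 filters, each spanning 3x3x3

cols = im2col(x, 3, 3)              # (27, 9): one column per patch position
out = filters.reshape(4, -1) @ cols # (4, 9): one row per filter
out = out.reshape(4, 3, 3)          # 4 output feature maps, each 3x3
```

The 2D case is just C = 1; the only change in 3D is that each column gets longer.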

@abrahamodunusi178
@abrahamodunusi178 - 11.10.2023 23:23

As someone who watched the previous video: it made a lot of assumptions about the viewer's knowledge. I have a little knowledge of CNNs, and I was worse off than before I watched it.

The difference between this video and that one in terms of ease of understanding is staggering.

In short: wonderful video and explanation. Thanks.

@troyhan3741
@troyhan3741 - 03.10.2023 17:45

I want to know what tools you used to make this amazing video. I've been following you for a long time.

@user-qp4pr7yb2m
@user-qp4pr7yb2m - 03.10.2023 15:49

"actively confusing" this choice of words is so abstract to me

@muthukamalan.m6316
@muthukamalan.m6316 - 25.09.2023 17:57

Please make a video on batch norm and layer norm in CNNs.

@kevalan1042
@kevalan1042 - 22.09.2023 23:38

Great work!

@kevian182
@kevian182 - 22.09.2023 01:19

Great video! Thanks!

@chrisminnoy3637
@chrisminnoy3637 - 21.09.2023 09:27

You may have learned that doing a depthwise convolution first, followed by a pointwise convolution, approximates the same computation with roughly a factor of 10 fewer multiplications. The depthwise step convolves each color channel separately, and the pointwise step then combines the results at each pixel.
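To make the cost claim above concrete, here is a rough multiplication count for one layer (a sketch; the layer sizes are made up, and depthwise-plus-pointwise is in general an approximation of a full convolution with fewer free parameters, not an identical computation):

```python
def standard_conv_muls(h, w, c_in, c_out, k):
    # every output position computes a k*k*c_in dot product, for each of c_out filters
    return h * w * c_out * (k * k * c_in)

def depthwise_separable_muls(h, w, c_in, c_out, k):
    depthwise = h * w * c_in * (k * k)  # one k x k filter per input channel
    pointwise = h * w * c_out * c_in    # 1x1 convolution mixing channels at each pixel
    return depthwise + pointwise

# Hypothetical layer: 112x112 feature map, 64 channels in and out, 3x3 kernel
h = w = 112
c_in = c_out = 64
k = 3
ratio = standard_conv_muls(h, w, c_in, c_out, k) / depthwise_separable_muls(h, w, c_in, c_out, k)
print(round(ratio, 1))  # 7.9: nearly 8x fewer multiplications for this layer
```

The savings grow with the channel count, which is where the "factor of 10" figures quoted for such factorizations come from.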

@takeraparterer
@takeraparterer - 21.09.2023 00:29

amazing!

@zskater1234
@zskater1234 - 20.09.2023 23:56

Do you do your animations with blender?

@nopana_
@nopana_ - 20.09.2023 23:07

Oh thank you so much!
I waited with a lot of joy for a new episode!
Thank you so much for the amazing work!
(And this explanation was exactly what I was missing from the previous episode; everything clicked in my head.)

@Amejonah
@Amejonah - 20.09.2023 21:15

One thing I've always wondered about image nets: why do they use LAB? Why not HSV/HSL or RGB?

@Karmush21
@Karmush21 - 20.09.2023 19:25

Nice, you made the video! Maybe you remember: I asked you this question in one of your previous videos. It turned out great! Thank you.

@meguellatiyounes8659
@meguellatiyounes8659 - 20.09.2023 19:18

The red, green, and blue channels are independent features; that's another reason we use a 3D filter, in contrast to the first example.

In the case of a depthwise CNN, each of the R, G, B layers has its own independent receptive field.

Thanks for the previous videos.

@LaszloKorte
@LaszloKorte - 20.09.2023 18:54

Really great explanation and visualization, like all of your videos. I would just disagree with the conclusion that image processing and NN convolutions are fundamentally different. The only difference is the kernel size of 3x3x3 vs 3x3x1. The way you separate the RGB channels during the 2D convolutions just complicates the process. It would be clearer to keep the image as a single 3D array (width x height x channel) and sweep the kernels (both the 3x3x1 and the 3x3x3) simply *through every possible position*. Each possible placement results in one corresponding output value. Then it could be recognized that increasing the kernel size (from 3x3x1 to 3x3x3) just combinatorially reduces the number of possible positions at which it can be placed in the image array (and thereby the number of output values/size of output dimensions), and the conclusion would be that both kinds of convolutions are exactly the same.

More abstractly, any convolution can be thought of as operating between two arrays/tensors/signals of the same dimensionality but (optionally) different sizes along each dimension. For example, applying 10 kernels of size 5x5x3 at once to an image of size 10x10x3 would be the same as applying a single 5x5x3x10 kernel to a 10x10x3x1 image, as long as the dimensions are ordered correctly to match up. The result would be an array of size 6x6x1x10. The output size can be determined along each dimension separately: (10-5+1)x(10-5+1)x(3-3+1)x(10-1+1), as explained in your other videos. The same works for higher dimensions. A kernel for video processing could be of size 10x10x3x20 to span 10x10 pixels vertically/horizontally, 3 color channels, and 20 frames in time. The video might have a spatial resolution of 720x1280, consist of 4 channels (RGBA), and be 500 frames long, resulting in an output of size 711x1271x2x481.

A linear colorspace conversion (weighted sum of channels) would be an example of a simple pointwise convolution in classical image processing.
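The per-dimension output-size rule described above (valid convolution: out = in - kernel + 1 along each axis) can be checked in a few lines; the shapes below are the video-processing example from the comment:

```python
def valid_output_shape(input_shape, kernel_shape):
    """Valid-convolution output size along each dimension: in - k + 1."""
    return tuple(n - k + 1 for n, k in zip(input_shape, kernel_shape))

# 720x1280 video, 4 channels (RGBA), 500 frames, convolved with a 10x10x3x20 kernel
print(valid_output_shape((720, 1280, 4, 500), (10, 10, 3, 20)))  # (711, 1271, 2, 481)
```

The same one-liner covers the image case: a 10x10x3 image with a 5x5x3 kernel gives (6, 6, 1).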

@kartikpodugu
@kartikpodugu - 20.09.2023 18:42

Amazing work.
I've always wondered why nobody talks about the differences, as I come from a signal-processing background.

@tuongnguyen9391
@tuongnguyen9391 - 20.09.2023 18:31

so so so good !

@ucngominh3354
@ucngominh3354 - 20.09.2023 18:21

hi
