Coding a Transformer from scratch on PyTorch, with full explanation, training and inference.

Umar Jamil

1 year ago

156,832 views


Comments:

@tonyt1343 - 31.12.2023 20:52

Thanks!

Reply
@linyang9536 - 31.12.2023 15:31

This is the most detailed video I have seen on building a Transformer model from scratch: from the code implementation to data processing to visualization, the creator really breaks everything down into fine detail. Thank you!

Reply
@luis87551 - 29.12.2023 21:13

I coded along with this video and am training the model now. However, in epoch 9 it still only predicts endless repetitions of one random word; it never feels like a sentence or anything. Has anyone else experienced that?

Reply
@guoweishieh775 - 28.12.2023 08:42

Thanks for this video. Super cool. I have one question though: what determines whether a module should have dropout or not? InputEmbedding has no dropout, but something as simple as ResidualConnection has dropout, and LayerNorm has none. I don't know what the pattern is there.

Reply
@user-ul2mw6fu2e - 25.12.2023 22:10

Wow, your explanation is amazing.

Reply
@skirazai7591 - 25.12.2023 16:35

Great video, you are insanely talented btw.

Reply
@fisicraft9366 - 25.12.2023 13:42

Hello, first I want to thank you for your great tutorials, which are really helpful and motivating when trying to program AI!
But I also have a question about a comment in your source code, specifically the comments (1, seq_len) and (1, seq_len, seq_len) on the return values of the dataset class. Would it not be (1, 1, seq_len) instead of (1, seq_len), because unsqueeze(0) is called two times?
Please enlighten me on my possible misunderstanding.

Reply
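For what it's worth, the commenter's reading of unsqueeze is right: applying unsqueeze(0) twice to a 1-D mask yields a three-dimensional (1, 1, seq_len) tensor, so a (1, seq_len) comment would only describe the shape after a single unsqueeze. A minimal check:

```python
import torch

seq_len = 4
mask = torch.ones(seq_len).int()          # shape: (seq_len,)
once = mask.unsqueeze(0)                  # shape: (1, seq_len)
twice = mask.unsqueeze(0).unsqueeze(0)    # shape: (1, 1, seq_len)
print(once.shape, twice.shape)
```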
@nhutminh1552 - 22.12.2023 14:15

Thank you admin. Your video is great. It helps me understand. Thank you very much.

Reply
@VishnuVardhan-sx6bq - 22.12.2023 05:41

This is such great work. I don't really know how to thank you, but this is an amazing explanation of an advanced topic such as the Transformer.

Reply
@sup3rn0va87 - 19.12.2023 01:10

What is the point of defining the attention method as static?

Reply
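A likely reason (my reading, not the author's stated rationale): the attention computation uses only its arguments and never touches self, so marking it @staticmethod documents that fact and lets you call it without constructing a module, which is handy for inspecting or visualizing attention scores. A sketch of the pattern:

```python
import torch

class MultiHeadAttention(torch.nn.Module):
    @staticmethod
    def attention(query, key, value, mask=None):
        # No reference to self: the result depends only on the inputs.
        d_k = query.shape[-1]
        scores = (query @ key.transpose(-2, -1)) / d_k ** 0.5
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float('-inf'))
        weights = scores.softmax(dim=-1)
        return weights @ value, weights

# Callable on the class itself, without any instance:
q = k = v = torch.randn(1, 2, 3, 8)       # (batch, heads, seq_len, d_k)
out, attn = MultiHeadAttention.attention(q, k, v)
```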
@omarbouaziz2303 - 15.12.2023 11:41

I'm working on speech-to-text conversion using Transformers. This was very helpful, but how can I change the code to suit my task?

Reply
@keflatspiral4633 - 14.12.2023 11:27

What to say... just WOW! Thank you so much!!

Reply
@txxie - 13.12.2023 16:43

This video is great! But can you explain how you convert the formula for positional embeddings into log form?

Reply
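On the log form: it is just the identity a^b = exp(b · ln a) applied to the denominator 10000^(2i/d_model), computed in log space for numerical stability. Both forms produce the same division term:

```python
import math
import torch

d_model = 512
two_i = torch.arange(0, d_model, 2).float()            # 2i = 0, 2, 4, ...
direct = 1.0 / (10000.0 ** (two_i / d_model))          # 1 / 10000^(2i/d_model)
log_form = torch.exp(two_i * (-math.log(10000.0) / d_model))
print(torch.allclose(direct, log_form))
```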
@yangrichard7874 - 13.12.2023 07:49

Greetings from China! I am a PhD student focused on AI. Your video really helped me a lot. Thank you so much, and I hope you enjoy your life in China.

Reply
@aiden3085 - 11.12.2023 06:56

Thank you, Umar, for your extraordinarily excellent work! The best Transformer tutorial I have ever seen!

Reply
@panchajanya91 - 05.12.2023 16:41

First of all, thank you. This is a great video. I have one question though: in inference, how do I handle unknown tokens?

Reply
@zhengwang1402 - 05.12.2023 00:31

It feels really fantastic watching someone write a program from the bottom up.

Reply
@manishsharma2211 - 28.11.2023 10:52

WOW WOW WOW. Though it was a bit tough for me to understand, I was able to follow around 80% of the code. Beautiful. Thank you so much.

Reply
@oborderies - 28.11.2023 01:59

Sincere congratulations for this fine and very useful tutorial! Much appreciated 👏🏻

Reply
@Schadenfreudee - 26.11.2023 21:05

There seems to be a very disturbing background bass sound at certain parts of your video, especially while you are typing. Could you please sort it out for future videos? Thanks.

Reply
@sypen1 - 26.11.2023 12:37

This is amazing, thank you 🙏

Reply
@sypen1 - 26.11.2023 11:51

Mate, you are a beast!

Reply
@jeremyregamey495 - 24.11.2023 13:58

I love your videos. Thank you for sharing your knowledge; I can't wait to learn more.

Reply
@angelinakoval8360 - 22.11.2023 14:17

Dear Umar, thank you so, so much for the video! I don't have much experience in deep learning, but your explanations are so clear and detailed that I understood almost everything 😄. It will be a great help for me at my work. Wish you all the best! ❤

Reply
@Mostafa-cv8jc - 20.11.2023 21:02

Very good video. Tysm for making this, you are making a difference.

Reply
@SyntharaPrime - 16.11.2023 01:14

Great job. Amazing. Thanks a lot. I really appreciate it; it is so much effort.

Reply
@nareshpant7792 - 14.11.2023 07:12

Thanks so much for such a great video. Really liked it a lot. I have a small query. For ResidualConnection, the equation in the paper is "LayerNorm(x + Sublayer(x))". In the code, we have: x + self.dropout(sublayer(self.norm(x))). Why is it not self.norm(self.dropout(x + sublayer(x)))?

Reply
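For anyone with the same question: the code uses the "pre-norm" arrangement, a common and deliberate deviation from the paper's "post-norm" LayerNorm(x + Sublayer(x)), because pre-norm Transformers tend to train more stably. A sketch of both variants (my wording, assuming a standard d_model-sized LayerNorm):

```python
import torch
import torch.nn as nn

class ResidualConnection(nn.Module):
    def __init__(self, d_model: int, dropout: float):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, sublayer):
        # Pre-norm (as in the video): normalize before the sublayer.
        return x + self.dropout(sublayer(self.norm(x)))
        # Post-norm (as in the paper) would instead be:
        #   return self.norm(x + self.dropout(sublayer(x)))

block = ResidualConnection(d_model=8, dropout=0.0)
x = torch.randn(2, 5, 8)
out = block(x, lambda y: y * 2.0)   # toy stand-in for a sublayer
```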
@cicerochen313 - 11.11.2023 17:47

Awesome! Highly appreciated. Absolutely great, thank you very much!

Reply
@user-wr4yl7tx3w - 11.11.2023 13:25

The code is really well written: very easy to follow and nicely organized.

Reply
@user-wr4yl7tx3w - 11.11.2023 02:07

Thanks for making the video. The only thing is, I wish the text was bigger; it was hard to see.

Reply
@user-hq2wy7do9f - 08.11.2023 04:59

The file 'train.py' defines the loss_fn. Why is the ignore_index from tokenizer_src rather than tokenizer_tgt?

Reply
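The commenter's concern is reasonable in general: the loss compares decoder logits against target-language token ids, so the padding id passed to ignore_index should come from the target tokenizer. Using the source tokenizer's id only works when both tokenizers happen to assign [PAD] the same id, as they typically do when built with an identical special-token list. A toy illustration with a hypothetical padding id:

```python
import torch
import torch.nn as nn

tgt_pad_id = 3   # hypothetical: the target tokenizer's [PAD] id

# ignore_index must match the id used to pad the *labels*.
loss_fn = nn.CrossEntropyLoss(ignore_index=tgt_pad_id)

logits = torch.randn(5, 10)                        # (tokens, tgt_vocab_size)
labels = torch.tensor([1, 7, 2, tgt_pad_id, tgt_pad_id])
loss = loss_fn(logits, labels)                     # padded positions ignored
```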
@jihyunkim4315 - 06.11.2023 09:52

Perfect video!! Thank you so much. I always wonder about the details of the code and their explanation, and now I understand almost all of it. Thanks :) You are the best!

Reply
@mikehoops - 04.11.2023 20:02

Just to repeat what everyone else is saying here: many thanks for an amazing explanation! Looking forward to more of your videos.

Reply
@shakewingo3216 - 04.11.2023 18:58

Thanks for making it so easy to understand. I definitely learned a lot and gained much more confidence from this!

Reply
@forresthu6204 - 03.11.2023 13:21

Thanks!

Reply
@MuhammadArshad - 02.11.2023 05:27

Thank God it's not one of those 'ML in 5 lines of Python code' or 'learn AI in 5 minutes' videos. Thank you. I cannot imagine how much time you must have spent making this tutorial. Thank you so much. I have watched it three times already and wrote the code while watching the second time (with a lot of typos :D).

Reply
@PP-qi9vn - 30.10.2023 23:25

Thanks!

Reply
@balajip5030 - 30.10.2023 06:32

Thanks, bro. With your explanation, I was able to build the Transformer model for my application. You explained it so well. Please keep doing what you are doing.

Reply
@Udayanverma - 30.10.2023 05:01

How can I force it to use the GPU instead of the CPU? It's taking around 100 mins for 20 epochs, and I have a GeForce 4080, an i9 13900K, and 64 GB of RAM. This was my Docker command: "docker run --gpus all -p 9999:9999 -v D:\dc:/tf -it job_image:latest". I included your requirements.txt in mine and rebuilt the image.

Reply
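A common cause (offered as a guess, since the container isn't visible here): torch.cuda.is_available() returns False inside Docker, either because the installed PyTorch wheel is CPU-only or because the model and batches are never moved to the device. The usual pattern to check and fix:

```python
import torch
import torch.nn as nn

# If this prints 'cpu' inside the container, the installed PyTorch build
# has no CUDA support (or the GPU is not visible to it), and training
# silently falls back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

model = nn.Linear(4, 4).to(device)      # move the model once...
batch = torch.randn(2, 4).to(device)    # ...and every batch in the loop
out = model(batch)
```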
@kozer1986 - 28.10.2023 19:26

I'm not sure if it is because I have studied this content 1,000,000 times or not, but this is the first time I have understood the code and felt confident about it. Thanks!

Reply
@user-ru4nb8tk6f - 27.10.2023 05:56

You are a great professional; thanks a ton for this.

Reply
@Jake-om9no - 26.10.2023 18:34

Subscribed because you have a cat named 奥利奥.

Reply
@tharllesjhoinessilvate200 - 25.10.2023 18:49

Great video; it's a shame I don't understand English. I'm Brazilian and had several questions, but I will definitely learn English and then come back here!! Congratulations on the video.

Reply
@abdullahahsan3859 - 25.10.2023 00:19

Keep doing what you are doing. I really appreciate you taking out so much time to spread such knowledge for free. I have been studying Transformers for a long time, but never have I understood them so well. The theoretical explanation in the other video combined with this practical implementation: just splendid. I will be going through your other tutorials as well. I know how time-consuming it is to produce such high-level content, and all I can really say is that I am truly grateful for what you are doing and hope you continue. Wish you a great day!

Reply
@JohnSmith-he5xg - 22.10.2023 05:20

OMG, and you also note matrix shapes in comments! Beautiful. I actually know the shapes without having to trace some variable backwards.

Reply
