Fine-tuning Large Language Models (LLMs) | w/ Example Code

Shaw Talebi

8 months ago

248,288 views


Comments:

@madhu1987ful - 21.01.2024 16:35

Did you do this fine-tuning on a CPU or a GPU? Can you provide details? Thanks

@madhu1987ful - 21.01.2024 16:32

How do we control the % of params that are being trained? Where are we specifying this? Also, can you please tell me how to choose r? What are these r values: 2, 4, 8, etc.?
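For a rough sense of what r controls: LoRA freezes each adapted weight matrix and trains only a low-rank update, so r directly sets the trainable-parameter count. A back-of-the-envelope sketch (the dimensions below are illustrative, not from the video):

```python
def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """LoRA replaces the update to a frozen d_in x d_out weight with two
    low-rank factors A (d_in x r) and B (r x d_out), so only
    r * (d_in + d_out) parameters are trained for that matrix."""
    return r * (d_in + d_out)

# Illustrative size: one attention projection in a DistilBERT-scale model.
d_in = d_out = 768
frozen = d_in * d_out  # parameters in the original (frozen) weight matrix

for r in (2, 4, 8):
    trainable = lora_trainable_params(d_in, d_out, r)
    print(f"r={r}: {trainable} trainable params "
          f"({100 * trainable / frozen:.2f}% of the matrix)")
# → r=2: 0.52%, r=4: 1.04%, r=8: 2.08%
```

In libraries like Hugging Face `peft`, you don't set the percentage directly; you choose r (and which modules to adapt), and the resulting fraction is reported for you. Larger r gives the adapter more capacity at more cost; 2, 4, and 8 are common values to sweep.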

@alikarooni9713 - 21.01.2024 15:25

Even though this was high-level instruction, it was perfect. I can continue from here. Thanks Shahin jan!

@dendi1076 - 20.01.2024 08:42

this channel is going to hit 6 figure subscribers at this rate

@hadianasliwa - 17.01.2024 21:35

Is there a way that DistilBERT or any other LLM can be trained for QA using a dataset that has only a text field, without any labels?
I'm trying to train the LLM for QA, but my dataset has only a text field, without any labels or questions and answers.
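One common workaround when you have only raw text: supervised QA needs labels, but you can continue self-supervised training (masked-LM for DistilBERT, causal-LM for GPT-style models), where the labels come from the text itself, and handle QA via prompting or a small labeled set afterwards. A sketch of the usual data-preparation step (hypothetical token ids stand in for real tokenizer output):

```python
def chunk_tokens(token_ids: list[int], block_size: int = 128) -> list[list[int]]:
    """Split a token-id stream into fixed-length blocks. For causal-LM
    training the inputs double as the labels (shifted by one), so no
    human annotation is needed."""
    blocks = [token_ids[i:i + block_size]
              for i in range(0, len(token_ids), block_size)]
    # Drop a trailing partial block so every example has full length.
    return [b for b in blocks if len(b) == block_size]

ids = list(range(300))  # stand-in for a tokenizer's output on the text field
examples = chunk_tokens(ids, block_size=128)
print(len(examples))  # 2 full blocks; the 44-token remainder is dropped
```

Each block then becomes one training example; for a masked-LM objective you would randomly mask tokens in each block instead of shifting.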

@kevon217 - 14.01.2024 16:32

Excellent walkthrough

@RajatDhakal - 11.01.2024 07:07

Can I use any open-source LLM to train on, for example, my healthcare dataset, or should the LLM be one that was pre-trained on the healthcare data of my interest?

@SanthoshKumar-dk8vs - 06.01.2024 08:44

What is a large language model in the first place? Are BERT and DeBERTa also considered LLMs? When should we consider a model an LLM? Are only generative models considered LLMs, and what is all the stuff behind LLM training? I have a lot of confusion around this; could you clarify? @ShawhinTalebi

@scifithoughts3611 - 04.01.2024 19:09

Great video Shaw! It was a good balance between details and concepts. Very unusual to see this so well done. Thank you.

@davidedelstein1085 - 01.01.2024 18:03

Didn't even watch this, it's already irrelevant with the new version(s) about to come out.

@ramp2011 - 01.01.2024 01:24

Excellent..... Thank you for sharing

@ITforGood - 30.12.2023 07:59

Thanks Shaw, very helpful.

@InnocenceVVX - 28.12.2023 23:39

The sound (gain) is a bit low, but great vid bro!

@harshanaru1501 - 28.12.2023 15:57

Such a great video! Wondering how self-supervised fine-tuning works. Is there any video available on that?

@NateKrueger805 - 27.12.2023 20:20

Nicely done!

@crossray974 - 27.12.2023 08:04

It all depends on the selection of the much smaller r parameter, like in PCA!

@KaptainLuis - 26.12.2023 21:35

So nice video thank you soooo much!!❤

@thehousehusbandcn5074 - 26.12.2023 07:12

You are the man! No BS, just good useful info

@beaux2572 - 22.12.2023 22:48

Honestly the most straightforward explanation I've ever watched. Super excellent work Shaw. Thank you. It's so rare to find good communicators like you!

@parisaghanad8042 - 19.12.2023 05:44

thanks!

@misspanda5717 - 18.12.2023 13:00

thanks

@junjieya - 18.12.2023 08:20

A very clear and straightforward video explaining finetuning.

@user-qt1uk7uv9m - 15.12.2023 10:23

Nice video. I need your help to clarify a doubt. When we do PEFT-based fine-tuning, the final fine-tuned model size (in KBs/GBs) grows by the additional parameters (base model size + adapter size), so the final fine-tuned model is larger than the base model. Deploying the final fine-tuned model on edge devices then becomes more difficult because of limited edge-device resources. Is there any way adapters/LoRA can help reduce the final fine-tuned model's memory size so that we can easily deploy it on edge devices? Your insights would be helpful. I am currently working on deploying a vision foundation model on an edge device, and I am finding it difficult because of the model's memory size and inference speed.
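On the size question: a LoRA adapter can be merged back into the base weights after training, so the deployed artifact is exactly base-model-sized, and the adapter adds nothing at inference time; any further shrinking then has to come from quantization or distillation. A tiny pure-Python sketch of the merge, with made-up toy shapes:

```python
def matmul(X, Y):
    """Naive matrix multiply, enough to illustrate the merge."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

d, r, alpha = 4, 2, 4                       # toy dimensions
W = [[1.0] * d for _ in range(d)]           # frozen base weight (d x d)
A = [[0.5] * r for _ in range(d)]           # LoRA factor (d x r)
B = [[0.25] * d for _ in range(r)]          # LoRA factor (r x d)

AB = matmul(A, B)                           # low-rank update, shape (d x d)
scale = alpha / r                           # standard LoRA scaling
W_merged = [[W[i][j] + scale * AB[i][j] for j in range(d)] for i in range(d)]

print(len(W_merged), len(W_merged[0]))      # 4 4 (same shape as the base W)
```

In `peft` this corresponds to `merge_and_unload()`: after merging, only the base-sized model ships to the device, which addresses the adapter overhead, though it does not make the model smaller than the base.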

@amnakhan1159 - 13.12.2023 05:55

Hello! I'm trying to use a similar approach but for a different task. Given a paragraph, I want my model to be able to generate a set of tags associated with it for a specific use case. Not quite sure how the Auto Model would differ here and would love your thoughts on this!

@alex70301 - 13.12.2023 05:55

Best video on llm fine tuning. Very concise and informative.

@user-bp9pe3qe1z - 07.12.2023 15:11

thank you so much

@satyagurucharan4455 - 06.12.2023 12:20

What should we do when the content is web-based? The QnA chatbot has to answer questions based on the content of a given website.

We are using Llama 2 7B, and it is not giving accurate answers to the questions asked. The answers have to come from the website, but sometimes it gives additional information that is not part of the website.

How should we fine-tune and train: should we use RAG, or what are the different APIs that can be called or trained?

It would be helpful if you could share some suggestions or links where I can find this information.
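For grounding answers in a website's content, the usual pattern is RAG: retrieve the most relevant site text at question time and instruct the model to answer only from it, which curbs the "additional information" problem more directly than fine-tuning alone. A deliberately naive sketch (word-overlap scoring and made-up strings stand in for a real embedding-based vector store):

```python
def retrieve(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Rank website text chunks by naive word overlap with the question.
    A production system would use embeddings and a vector store instead."""
    q_words = set(question.lower().split())
    scored = sorted(chunks,
                    key=lambda c: len(q_words & set(c.lower().split())),
                    reverse=True)
    return scored[:k]

chunks = [
    "Our support line is open 9am to 5pm on weekdays.",
    "The company was founded in 2010 in Austin.",
]
question = "When is the support line open?"
context = "\n".join(retrieve(question, chunks))

# The retrieved text is stuffed into the prompt sent to the model.
prompt = (f"Answer using ONLY the context below.\n\n"
          f"Context:\n{context}\n\n"
          f"Question: {question}\nAnswer:")
print(prompt)
```

Fine-tuning still helps with tone and format, but keeping the facts in the retrieved context (and out of the weights) is what keeps answers tied to the website.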

@aldotanca9430 - 02.12.2023 05:08

Very clear, thanks!

@vagicherlasaiavinash8281 - 01.12.2023 06:11

Can we get the weights of a Llama model?

@rubencabrera8519 - 30.11.2023 00:03

This was one of the best videos on this topic, really nice man, keep going.

@yoffel2196 - 28.11.2023 15:48

Wow dude, just you wait, this channel is gonna go viral! You explain everything so clearly, wish you led the courses at my university.

@Mesenqe - 25.11.2023 15:28

This is incredible; thank you for the clear tutorial. Please subscribe to this channel. One question: can we apply LoRA to fine-tune models used in image classification or other computer vision problems? Links to read or a short tutorial would be helpful.

@younespiro - 23.11.2023 12:58

Amazing video, very well explained.

@amparoconsuelo9451 - 18.11.2023 01:44

Understood. The code was very helpful; it was not constantly scrolling and panning. But please display the full code and mention the Python version and system configuration, including folders, etc.

@jdiazram - 14.11.2023 18:12

Hi, nice tutorial. I have a question: is it possible to have more than one output in a supervised way? For example: {"input": "ddddddd", "output1": ["dddd", "eeee", "ffffff"], "output2": ["xxxx", "zzzzz"]}, etc. Thx

@keithhickman7399 - 11.11.2023 10:53

Shaw, terrific job explaining very complicated ideas in an approachable way! One question: are there downsides to combining some of the approaches you mentioned, say prompt engineering + fine-tuning + RAG, to optimize output? How would that compare to using one of the larger OOTB LLMs with hundreds of billions of params?

@zeusgamer5860 - 10.11.2023 18:16

Hi Shaw, amazing video, very nicely explained! It would be great if you could also do a video (with code examples) on Retrieval Augmented Generation as an alternative to fine-tuning :)

@polarbear986 - 09.11.2023 23:31

BERT is not a large language model.

@sanderkempen6744 - 07.11.2023 00:50

Thanks

@seakyle8320 - 06.11.2023 21:31

I wonder why there is no GUI for this task?

@naevan1 - 02.11.2023 23:05

Hey dude, nice video. I think I'll try to fine-tune Llama to detect phrases and subsequently classify tweets, but with multiclass classification. Hope it works; I guess I'll transfer the CSV into the prompt format you mentioned, like Alpaca was done, and see if it works.

@payam-bagheri - 02.11.2023 22:55

Great video, Shawhin!

@junyehu2315 - 02.11.2023 06:40

Is there any limitation on GPU memory? I am just a student with a 3050 GPU with only 4 GB of memory.
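As a rough feasibility check for a 4 GB card: weight memory scales with parameter count times bytes per parameter, and full fine-tuning needs several times the weight memory again for gradients, optimizer states, and activations. A back-of-the-envelope sketch (rule-of-thumb numbers, not exact):

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone; training needs several times more."""
    return n_params * bytes_per_param / 1e9

n_params = 70e6  # roughly DistilBERT-scale (~70M parameters)
for name, bpp in [("fp32", 4), ("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{name}: ~{weight_memory_gb(n_params, bpp):.2f} GB for weights")
# A ~70M-parameter model fits comfortably in 4 GB even in fp32 (~0.28 GB);
# a 7B model needs ~14 GB just for fp16 weights, so it would not fit.
```

For larger models on small GPUs, the usual route is quantized loading (e.g., 4-bit via `bitsandbytes`) combined with LoRA, the QLoRA recipe, though 4 GB remains tight for anything in the billions of parameters.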

@user-ut4vj4qd9t - 01.11.2023 16:38

Thank you sooo much❤

@techno511 - 27.10.2023 12:56

Are you sure this is an LLM?

@saadati - 25.10.2023 22:46

Amazing video, Shawhin. It was quite easy to follow, and everything was clearly explained. Thank you so much.

@totalcooljeff - 23.10.2023 17:31

Random question: how do you edit your audio clips together to make them so seamless? I don't know where to splice them. And great video by the way 👍

@samadhanpawar6554 - 23.10.2023 16:01

Can you recommend any course where I can learn to build an LLM from scratch and fine-tune it in depth?

@yanzhang7861 - 23.10.2023 14:17

Nice video, thanks 😁

@adrianfiedler3520 - 20.10.2023 12:22

Very good video and explanation!
