StatQuest: Decision Trees, Part 2 - Feature Selection and Missing Data

StatQuest with Josh Starmer

6 years ago

167,543 views

Comments:

@Why_I_am_a_theist
@Why_I_am_a_theist - 22.11.2023 15:28

BAAAMM!!! JOSH

@watsoncarter9782
@watsoncarter9782 - 01.07.2023 07:23

Hey Josh, where is the part 1 video of decision trees... Can you please give me the link?

@tomaszbabiarz8227
@tomaszbabiarz8227 - 02.04.2023 21:31

Hi Josh. I am going over each of your lessons in time-of-upload order (from oldest to newest), and I wonder why this lesson is called "Part 2" when there was no earlier Part 1. I assume that Part 1 is "Decision and Classification Trees, Clearly Explained!!!", uploaded after this video, correct? BTW: excellent work!

@MaxKlimov
@MaxKlimov - 26.12.2022 08:40

Thanks!

@romanvasiura6705
@romanvasiura6705 - 07.12.2022 12:05

Thank you!

@earnesttechie1494
@earnesttechie1494 - 02.12.2022 09:17

Where is part 1 of this video? I couldn't find it.

@zhengwei2559
@zhengwei2559 - 28.08.2022 20:04

Very good job making these videos. I can imagine how much time you spend on them. Wonderful job.

@inwonderland9842
@inwonderland9842 - 24.08.2022 17:55

🏆

@bt_HridayAgrawal
@bt_HridayAgrawal - 14.06.2022 00:27

Hello :) , thank you so much for this amazing video!
I have a query: to handle missing data, you used the column whose values are most correlated with it as a guide. But generally, we drop highly correlated columns from our dataset to avoid their negative impact on model prediction. Should we really follow this practice?
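For readers curious what "using the most correlated column as a guide" can look like in code, here is a minimal sketch of one simple approach (the data and column names are made up for illustration, and this is just one way to do it, not necessarily exactly what the video does): pick the feature most correlated with the incomplete column, fit a line on the complete rows, and predict the gap.

```python
import numpy as np
import pandas as pd

# Hypothetical data: Height is missing one value; Weight is strongly correlated.
df = pd.DataFrame({
    "Weight": [60.0, 70.0, 80.0, 90.0],
    "Height": [160.0, 170.0, np.nan, 190.0],
})

# Pick the column most correlated with Height (pandas ignores NaNs pairwise).
corr = df.corr()["Height"].drop("Height").abs()
guide = corr.idxmax()

# Fit a simple line Height ~ guide on the complete rows, then fill the gap.
known = df.dropna()
slope, intercept = np.polyfit(known[guide], known["Height"], 1)
missing = df["Height"].isna()
df.loc[missing, "Height"] = slope * df.loc[missing, guide] + intercept

print(df)  # the NaN at Weight == 80 is filled with 180.0
```

Note that the high correlation is exactly what makes the guide column useful here: imputation wants a strong predictor of the missing value, whereas the "drop correlated columns" advice is about redundancy among the model's inputs.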

@beautyisinmind2163
@beautyisinmind2163 - 20.02.2022 15:57

Teacher Josh, I have one question: if we apply feature selection techniques, say the filter, wrapper, or embedded method, to our dataset, each method may give a different result about which features are relevant. Overall, how do we evaluate the results from each method to decide which features should be chosen, if both the features and the target variable are numeric? Hope you got my point.

@gimanibe
@gimanibe - 16.02.2022 01:51

Awesome video! Thank you Josh. It would be very useful if you could make a video about "double dipping" after feature selection in random forests, or in machine learning in general.

@nackyding
@nackyding - 26.01.2022 09:53

Do features have to be stationary when applying ML models to time series data? Or any data for that matter?

@cookie6299
@cookie6299 - 09.01.2022 16:37

2022 01 09

@nayeemislam8123
@nayeemislam8123 - 18.12.2021 08:48

Can you make a video on how surrogate split is used in decision trees for handling missing data and computing feature importance?
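While no such video is linked here, the core idea of a surrogate split can be sketched briefly: it is a split on a different feature that best reproduces the left/right routing of the primary split, and it is used to route samples whose primary feature is missing. A toy illustration (feature names, values, and thresholds are all invented):

```python
import numpy as np

# Toy data: the primary split is Weight <= 75; Height is a candidate surrogate.
weight = np.array([60.0, 70.0, 80.0, 90.0, 65.0, 85.0])
height = np.array([160.0, 168.0, 178.0, 190.0, 163.0, 183.0])

# Left/right assignment produced by the primary split.
primary_left = weight <= 75

# Score a candidate surrogate threshold by how often it agrees with the primary.
def agreement(threshold):
    return np.mean((height <= threshold) == primary_left)

# Search candidate thresholds on the surrogate feature (midpoints suffice here).
candidates = (height[:-1] + height[1:]) / 2
best = max(candidates, key=agreement)

# A new sample with Weight missing is routed using the surrogate split instead.
new_height = 170.0
goes_left = new_height <= best
```

The same agreement scores can double as a crude contribution to feature importance: a feature that often serves as a good surrogate is clearly informative even when it is rarely the primary splitter.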

@GTXGAMING80
@GTXGAMING80 - 11.08.2021 20:23

Hello!
First we perform EDA and feature engineering, and then model building, right? So my question is: we handle missing values during EDA and feature engineering, right?
If we've already handled missing values there, will we still encounter missing values after that?

@karsunbadminton7180
@karsunbadminton7180 - 25.05.2021 12:04

You are a genius! Respect from China.

@pallavgupta1302
@pallavgupta1302 - 16.04.2021 13:10

Hey man, you make ML look like a cakewalk. Great work....
And I am loving your theme songs, so I am awesome😂 and the decision tree algo is proving it🤣

@billyericksonsamosir551
@billyericksonsamosir551 - 27.03.2021 14:12

Please make a CatBoost video, I'm working on my thesis :')

@nishantpandey9802
@nishantpandey9802 - 13.01.2021 18:12

I love the way you start your channel🤩

@dikshaprabhukhorjuvenkar6240
@dikshaprabhukhorjuvenkar6240 - 17.11.2020 09:30

I love the intro.! Just so unique every time. :)

@weiyangshi4729
@weiyangshi4729 - 13.10.2020 12:20

Can we use a decision tree or random forest to impute missing values? Great work as always!

@chloeh7119
@chloeh7119 - 03.09.2020 09:12

YOU are the BEST!!!!!!!!

@21bagong
@21bagong - 31.08.2020 10:56

Hi Prof Josh,
Suppose I already have features selected using the random forest algorithm. Then I use these features for PLS-DA. Will the model I build with PLS-DA be more valid?
Thanks

@adiflorense1477
@adiflorense1477 - 17.07.2020 17:31

Sir, is missing data the same thing as outliers and noisy data?

@BeSharpInCSharp
@BeSharpInCSharp - 16.06.2020 13:11

Will selecting the second-best Gini impurity reduce overfitting?

@shubhamgupta6567
@shubhamgupta6567 - 14.06.2020 14:07

How does feature selection help overcome overfitting? Please explain, I'm not getting this.

@iaaan1245
@iaaan1245 - 13.06.2020 15:38

Hi Josh, I am a big fan!
I'd just like to ask something as I'm still in the midst of learning. In multiple linear regression, we are taught that multicollinearity is a big issue and a red flag. However, here it is mentioned that it can be used as a way to fill in missing data, if the missing data is of a variable that is highly correlated with another one that is known.
Is it because they are different models and thus the issue doesn't apply here?
Thanks, once again, a big fan!

@aligh18
@aligh18 - 20.05.2020 12:42

I should redirect my tuition fees to Josh Starmer, because he deserves it more than my university.

@jiehu1337
@jiehu1337 - 30.04.2020 03:06

Clearly explained, but how do you measure the correlation between two binary columns?
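One common answer (not covered in the video itself): for two 0/1 columns, the ordinary Pearson correlation reduces to the phi coefficient, which can be computed straight from the 2x2 contingency table. A small sketch with invented data:

```python
import numpy as np

# Two hypothetical binary features (0/1).
a = np.array([1, 1, 0, 0, 1, 0, 1, 0])
b = np.array([1, 0, 0, 0, 1, 1, 1, 0])

# Cell counts of the 2x2 contingency table.
n11 = np.sum((a == 1) & (b == 1))
n10 = np.sum((a == 1) & (b == 0))
n01 = np.sum((a == 0) & (b == 1))
n00 = np.sum((a == 0) & (b == 0))

# Phi coefficient: cross-product difference over the root of the margin product.
phi = (n11 * n00 - n10 * n01) / np.sqrt(
    (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
)

# For 0/1 data this matches the ordinary Pearson correlation.
print(phi, np.corrcoef(a, b)[0, 1])  # both are 0.5 here
```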

@Anonymous-54545
@Anonymous-54545 - 23.04.2020 04:40

Yelling "oh no!!!" over and over is pretty irritating. I don't mind the other repeated yells as much because at least they aren't infantilizing... I get that you're trying to be relatable and stylized, but this is not pleasant.

@abhilashsharma1992
@abhilashsharma1992 - 06.04.2020 08:22

I got StatQuest!

@shrikantdeshmukh7951
@shrikantdeshmukh7951 - 24.03.2020 08:54

Please can you explain the main differences between ID3, CHAID, and CART?

@rajarajeshwaripremkumar3078
@rajarajeshwaripremkumar3078 - 19.03.2020 10:07

Is this how feature importances are assigned? Can you elaborate a little on this?

@fabianoprado4066
@fabianoprado4066 - 25.02.2020 22:50

Hello Josh!! I was thinking of a way to reduce false negatives in diagnosis, so is there some parameter to control the number of false negative outcomes in a decision tree?

@tanweermahdihasan4119
@tanweermahdihasan4119 - 22.02.2020 00:33

This is the best statquest song so far.

@MrMarieric
@MrMarieric - 14.02.2020 13:05

You are awesome Josh!!

@andreaxue376
@andreaxue376 - 09.02.2020 04:03

Josh, I am a huge fan of your videos!! You helped me understand all those complex ML concepts better than one year in grad school... I wonder if you can make some videos about how the feature importances of a random forest are calculated, and how DBSCAN clustering works (and how its parameters are chosen). Thank you so much!!

@SIO2HF
@SIO2HF - 22.01.2020 22:12

Thank you!

@yilizhang790
@yilizhang790 - 17.01.2020 01:31

Hi, will you be able to do a video on how to numerically calculate Random Forest feature importances? I couldn't find a clear explanation anywhere on the internet... It would be really appreciated!

@alecvan7143
@alecvan7143 - 04.01.2020 01:06

Next level starting ballad

@haneulkim4902
@haneulkim4902 - 03.01.2020 09:52

You're the best! Thanks!

@vincentmayer2816
@vincentmayer2816 - 18.11.2019 15:44

I love you Josh, you and your intros.

@tuongminhquoc
@tuongminhquoc - 04.11.2019 14:42

Great video as always!!!

@muhammadmuneeb8255
@muhammadmuneeb8255 - 08.10.2019 21:39

Hi Sir, Josh Starmer... I hope you are well. Kindly make a video on pre-pruning and post-pruning. I have seen your videos, and Information Gain 3 is also missing from the series. Thanks in anticipation. Have a good day.

@looploop6612
@looploop6612 - 19.06.2019 15:14

If weight is highly correlated with height, why not remove the weight column?
