Web Scraping NBA Stats With Python: Data Project [Part 1 of 3]

Dataquest

2 years ago

42,281 views

Comments:

@DayOneCricket-wk9wk - 10.01.2024 09:56

Again jumps straight into Jupyter Lab - no setup steps, etc.

@9FM303 - 09.01.2024 16:44

Now it is not necessary to join the East and West tables, since there is the Expanded Standings table.

@thebinarybin - 15.12.2023 06:11

I realize this is a young, one-year-old video, but even before today I have encountered so many problems from uninstalled modules, etc. This video is helpful only if you ensure the proper installs for the tutorial, and you have to retrace so many steps to get there. I am bummed out you couldn't even provide the proper dependencies before you start coding.

@user-ej5sb8gk8s - 29.07.2023 10:20

"disambiguate" lol

@YotsubaBestGirl - 12.07.2023 10:33

How do I soup.find a table without an id on it? I tried the class and it did not work. Thank you for the video!!

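For anyone stuck on the same question: a table with no id can usually still be selected by class or by position. A minimal sketch with stand-in HTML (the stats_table class name here is only an illustration, not necessarily what the real page uses):

```python
from bs4 import BeautifulSoup

# Stand-in HTML: two tables that share a class but have no id attribute.
html = """
<table class="stats_table"><tr><td>A</td></tr></table>
<table class="stats_table"><tr><td>B</td></tr></table>
"""
soup = BeautifulSoup(html, "html.parser")

# Select by class (note the trailing underscore in class_, since "class"
# is a Python keyword), then fall back to position via indexing.
tables = soup.find_all("table", class_="stats_table")
second = tables[1]
```

If neither the id nor the class matches on the real page, the table may sit inside an HTML comment or be rendered by JavaScript, in which case it is not in the static HTML that soup.find sees.
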
@duloo97 - 03.07.2023 19:39

But you could, for example, just copy the link of that web page into Power BI, import a specific table from the page, modify it in Power Query, and then export it somewhere as an Excel file, CSV, or whatever...

@EmeritoMontilla - 07.06.2023 03:16

It is more a programming class than a data analytics one.

@calebappiagyei9555 - 24.05.2023 01:23

Is there a reason why a table can only be created when using the mvp or roy ids, but not for any of the other awards? I am attempting this project with the Most Improved Player results, but it seems unable to identify the mip table, while the mvp and roy tables work just fine.

@JAswoosh - 18.05.2023 07:11

Why is my .format not blue or registering??

@JAswoosh - 18.05.2023 02:00

Can anyone tell me why .format and .get aren't blue (working) in Jupyter Notebook?

@OfficialEricGao - 15.05.2023 05:22

My html files are being written and saved, but when I click on each file there's nothing in it. Is anyone else having this problem?

@seanyang2063 - 06.04.2023 02:34

What happens when you run into a 429 error from Sports Reference?

@abdeljalil-ahmed - 04.03.2023 18:02

I have a problem removing unimportant elements from a table. soup.find("tr", class_="over_header").decompose() raises AttributeError: 'NoneType' object has no attribute 'decompose'. Any guidance on solving this problem?

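That error means soup.find returned None, i.e. no tr with class "over_header" exists in that particular file (often because the download was incomplete or rate-limited). A hedged sketch of the usual guard, using stand-in HTML:

```python
from bs4 import BeautifulSoup

# Stand-in HTML with the header row present; in an empty or rate-limited
# download the row can be missing, and soup.find then returns None.
html = """
<table>
  <tr class="over_header"><th>Junk</th></tr>
  <tr><th>Player</th></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# Guard before calling .decompose(): .find() returns None on no match,
# and None has no .decompose attribute.
row = soup.find("tr", class_="over_header")
if row is not None:
    row.decompose()
```

If the guard fires because the row is genuinely absent, it is worth opening the saved html file to check whether the page was downloaded completely.
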
@chaimookie935 - 04.03.2023 06:21

AMAZING! Thank you so much for this.

@kurtji8170 - 13.02.2023 08:14

Hello, Vik! Thank you for your content! I am wondering if you could post some instructions on how to set a sleep timeout for this specific case? I am having this issue, and I saw many people in the comments with it too. Many thanks!

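This is not the video's exact code, but one common pattern is a fetch helper that sleeps between requests and backs off when the server answers 429 (too many requests). A sketch using only the standard library (the pause and retry values are arbitrary illustrations):

```python
import time
import urllib.error
import urllib.request


def fetch(url, pause=3.0, retries=3):
    """Fetch a page, pausing between requests and backing off on HTTP 429."""
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(url) as resp:
                html = resp.read().decode("utf-8")
            time.sleep(pause)  # polite delay between successive requests
            return html
        except urllib.error.HTTPError as err:
            if err.code == 429:
                # Honor Retry-After when the server sends it; otherwise
                # wait longer on each successive attempt.
                wait = float(err.headers.get("Retry-After",
                                             pause * 10 * (attempt + 1)))
                time.sleep(wait)
            else:
                raise
    raise RuntimeError("still rate limited after several retries")
```

The same idea ports directly to requests if that is what you are using; the key parts are the fixed sleep after every successful fetch and the longer wait after each 429.
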
@ScottRachelson777 - 13.02.2023 01:23

I followed your code exactly and I got this error:

AttributeError                            Traceback (most recent call last)
<ipython-input-7-f9f00670890b> in <module>
      3
      4 soup = BeautifulSoup(page, 'html.parser')
----> 5 soup.find('tr', class_="over_header").decompose()

AttributeError: 'NoneType' object has no attribute 'decompose'

@ScottRachelson777 - 12.02.2023 20:54

How come Jupyter Lab won't let me copy the URL from the Basketball Reference website and then paste it into a Jupyter Lab cell? It's easy to do in Jupyter Notebook, but it doesn't work in Jupyter Lab.

@aradbeneliezer7129 - 05.02.2023 19:56

I am getting an error, No such file or directory: 'mvp/1991.html', at the start of the scraping process. What can I do? I have a folder named mvp in the same directory as the notebook. I am working in Chrome.

@austinsacks5907 - 28.01.2023 09:52

What do you do if you get rate limited when trying this?

@jacklegnon8439 - 28.12.2022 18:58

What do I do if I get banned by the site?

@irenenafula8694 - 26.12.2022 19:07

When viewing the list of years, you can use print(years, end=" ") to print them horizontally so that you do not have to scroll.

@DilzOnlineHD - 20.12.2022 16:31

Lovely video. My issue is that when I try to extract the HTML, my Jupyter Notebook is stuck on "loading" for the table I need, which is frustrating. It will load two tables on a page, but any more and it won't do it. If anyone has a solution, that would be great.

@AR-hp2jl - 13.12.2022 00:46

I got banned from Basketball Reference. What did I do wrong, and how can I avoid this in the future?

@Aquafina780 - 17.11.2022 14:56

Hello! When I try to run the initial data extraction, it returns 'int' object is not iterable for the "for ... in" line pertaining to the years. How should I correct this?

@taxtr4535 - 15.11.2022 17:15

You da GOAT no cap

@jaredhutchinson4629 - 13.11.2022 19:31

I attempted to decompose() an unwanted row. I'm doing a slightly different page on the website, so in this instance it is class_="thead".

That said, I got the error 'AttributeError: 'NoneType' object has no attribute 'decompose''. I was told that I need to create code similar to the following:

thead = soup.find('tr', class_="thead")
if thead is not None:
    thead.decompose()
    ...
else:
    ...

Am I doing things wrong? Anything I can do to go about this?

@auyeungstephen2878 - 01.11.2022 19:08

I have a problem with encoding. When I type

with open("mvp/1991.html") as f:
    page = f.read()

soup = BeautifulSoup(page, 'html.parser', encoding='utf-8')
soup.find('tr', class_="over_header").decompose()

it shows

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 247773: character maps to <undefined>

How can I solve this problem?

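The usual fix is to pass the encoding to open() rather than to BeautifulSoup (when given an already-decoded string, the BeautifulSoup constructor takes no encoding argument). A small self-contained sketch using a throwaway sample file in place of the real saved pages:

```python
from pathlib import Path

# Write a sample file so the snippet is self-contained; the real code
# would read the saved mvp/<year>.html files instead.
Path("sample.html").write_text(
    "<tr class='over_header'><td>Dražen Petrović</td></tr>", encoding="utf-8"
)

# On Windows, open() defaults to the locale codec ('charmap'), which fails
# on bytes such as 0x8d; an explicit encoding at open() time avoids that.
with open("sample.html", encoding="utf-8") as f:
    page = f.read()
```

With the string decoded correctly up front, BeautifulSoup(page, 'html.parser') needs no encoding argument at all.
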
@sheaturner3229 - 29.10.2022 05:00

I wanna learn how to do this for NBA first-basket picks for FanDuel.

@oscarhurtado7107 - 22.10.2022 17:34

How can I know which parts of a website are being displayed by executing JavaScript using the inspection tools, rather than by downloading the html first and comparing them?

@ofgmora - 06.10.2022 08:37

Thank you for sharing such an amazing video. It helped me a lot.

I have a question: how can I get the player ID? It is not visible in the table, so when scraping the table we only get the player's name. But when inspecting the code there is a variable for the player ID.

@arjanpatell - 10.09.2022 23:34

When I try to webscrape from Basketball Reference I get a 404 error saying the webpage isn't found. I also tried it with Baseball Reference and Pro Football Reference, getting the same error. Is this happening because I made too many requests?

@witnessesofpastsins7832 - 04.08.2022 20:37

Hi Vik! I ran into a problem and get a 'UnicodeEncodeError' when I try to write the html to the files, and I was wondering if there was a workaround? I started off exactly how you have the web_scraping.ipynb file, but for some reason my 3rd cell is not running properly. Any help would be awesome!

@alessandrodimattia8946 - 26.07.2022 11:34

Hey Vik! First of all, thanks for the amazing opportunity you are giving me to learn data science for free. Your videos are so clear and accessible, I cannot fully express my gratitude.
Second, does anybody know a website I can use to replicate this project for the Football Premier League?
Thanks a lot to everybody!

@alexandercardoza23 - 25.07.2022 02:55

Wanted to try this with Baseball Reference, but instead of the table being html, it's shtml. Any way to get around that?

@dogden95 - 13.07.2022 17:18

The content from you guys is outstanding! Thanks for everything!

@floppitommi123 - 28.06.2022 14:15

I can't believe this has only 8k views, I'm very sad now.

@_Kysa_ - 15.06.2022 02:09

Hello, at the part where you specify id="mvp", what happens when the id is not there? Is there another way, just with the class or something similar? Thanks, and great video!

@basiliogoncalves8956 - 01.06.2022 21:34

Great video, thanks a lot!!

I just noticed one thing: when using "open()" I need to add encoding="utf-8", such as:

with open(filename, 'r', encoding="utf-8") as f:
    page = f.read()
    ...

@justinburney7125 - 30.05.2022 22:48

At the part where you specify id='mvp', what do you do when the id includes randomly assigned characters (id='stats_cd051869_summary') and a loop is required first to learn each game's unique id, like 'cd051869'?

@Monkeyfist2021 - 22.05.2022 15:07

Really useful video! I am using the web scraping component within my Masters, together with the machine learning code we have been taught in class (TensorFlow). The chromedriver required me to download Chrome version 99.0.4488.51 and worked well.
I needed to add in this bit of code to get it to work (only required in the player section):

tHeads = soup.findAll('tr', class_="thead")
for tHead in tHeads:
    tHead.decompose()

Clear, concise & aesthetically pleasing video! Thank you for your help, mate, from Brisbane, Australia! 🇦🇺

@bjorncalbes7604 - 08.04.2022 19:58

What IDE are you using?

@wangqinjing8336 - 23.03.2022 11:13

Hi there! I've been struggling with the Selenium part of the web scraping. I copied the path of my chromedriver.exe file, but it still says that the file is not found. I'm using Chrome and Windows. Any feedback would be highly appreciated!

@AngelProceeds - 23.03.2022 01:03

Incredibly helpful! Thank you so much!

@jasoningersoll9346 - 21.03.2022 00:25

Great video! If I wanted to separate data by teams rather than by years, how would I do that?

@ZambetulSoarelui - 15.03.2022 19:07

Nice tutorial.
Thank you!