Requests-HTML - Checking out a new HTML parsing library for Python

sentdex

6 years ago

29,407 views


Comments:

@Christian-mn8dh - 30.04.2022 23:29

why use requests-html when I can use requests?

@im4485 - 22.06.2021 00:16

Hi, I can't find a way to select an element by its attribute. Can you help, please?

@fhashim - 19.08.2020 18:32

Please make a tutorial on how to make asynchronous requests using grequests.

@shazkingdom1702 - 16.06.2020 17:57

Thank you so much! I got the answer based on your guide. Nice helmet ⛑ you've got back there.

@Hegelian10 - 22.04.2020 21:17

Strange, I have installed requests_html but when I import it in a Python script in Python 2.x or 3.7, I get: ModuleNotFoundError: No module named 'requests_html'

@anyad111 - 13.10.2019 13:43

Hello!
Recently I have been trying to parse some webpages asynchronously with Requests-HTML. In theory this can be done with AsyncHTMLSession. However, most of the time I am unable to get a result with it (I also use arender; the attempts to parse the webpages fail for different reasons, most probably timeouts). Maybe it's just a poor internet connection, but I'd be really grateful if you uploaded a video or helped me with this.

@chengyaozheng8536 - 24.06.2019 12:26

for {basically anything you want to name it} in r.html:
you get urls

@SimOn-bz4xy - 08.05.2018 11:26

Sentdex! Can you show scraping from a page with a "show more" button that loads more of the page via JavaScript?

@xiaokunxu7593 - 06.04.2018 08:14

I was wondering why not use r.html.find_all('td', {'class':re.compile(r'class name regex')}). Turns out that's a BeautifulSoup function. But yeah, having render() to run JS is nice!

@shmuel-k - 01.04.2018 20:16

You could have used a CSS selector when parsing the Yahoo Finance page.

@SimonEliasen123 - 01.04.2018 13:10

Please make a video building a webcrawler, would be very insightful!

@CodingTrades - 01.04.2018 04:24

you forgot to say what it is good for

@muhammedeltabakh852 - 31.03.2018 23:15

I love you man more than Mo Salah

@Lucas-wl8py - 31.03.2018 19:14

This is cool! And it gave me the idea of a series of videos about how to create a Python package.

@pyxelr - 31.03.2018 18:26

Can I get some help installing the "requests-html" package so that it runs globally, for example through Sublime Text?
I am using Conda on Windows 10.

I have been trying to do that, but as I understand it so far, it runs only in a virtual environment that cannot be used by Sublime? Correct me if I am wrong.

@developerarchitect7523 - 31.03.2018 12:05

What's the extension that prints the result at the bottom?

@rajshah9031 - 31.03.2018 11:02

Is it useful for scraping websites with AJAX?

@WhiterockFTP - 31.03.2018 04:57

To find td's or other elements that belong to multiple classes, you would just have had to put dots in between the class names. Read up on CSS selectors; jQuery also uses them. They're pretty standard nowadays and less of a headache than XPaths ;)
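A minimal sketch of what the dotted selector in the comment above means, using only the standard library so it runs without requests-html installed. The `select` helper, the `ClassMatcher` class, and the sample table are hypothetical and only handle the `tag.class1.class2` form; with requests-html itself, the same selector string would presumably be passed to `r.html.find(...)`.

```python
from html.parser import HTMLParser


class ClassMatcher(HTMLParser):
    """Collect the text of every <tag> whose class list contains all required classes."""

    def __init__(self, tag, required_classes):
        super().__init__()
        self.tag = tag
        self.required = set(required_classes)
        self.matches = []
        self._capturing = False

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        # A class attribute is a space-separated list; order does not matter
        classes = set((dict(attrs).get("class") or "").split())
        if self.required <= classes:
            self._capturing = True
            self.matches.append("")

    def handle_data(self, data):
        if self._capturing:
            self.matches[-1] += data

    def handle_endtag(self, tag):
        if tag == self.tag:
            self._capturing = False


def select(selector, html):
    """Hypothetical helper: supports only selectors shaped like 'td.price.up'."""
    tag, *classes = selector.split(".")
    parser = ClassMatcher(tag, classes)
    parser.feed(html)
    return [match.strip() for match in parser.matches]


# Made-up sample markup for illustration
SAMPLE = """
<table><tr>
  <td class="price up">101.5</td>
  <td class="price">99.0</td>
  <td class="up price bold">103.2</td>
</tr></table>
"""

print(select("td.price.up", SAMPLE))  # ['101.5', '103.2']
```

Each dot adds one more class that the element must carry, so `td.price.up` skips the cell that only has `price` but still matches the one with extra classes.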

@dave597 - 31.03.2018 04:03

Wow, when did you start using Sublime Text? I haven't seen your videos in a while, but back then they were all done in Notepad or IDLE! :)

@cooperlimond - 31.03.2018 01:55

I think the // for retry in range(100): // part is what allows the script to continue after the error is raised. From their doc: "The simplest use case is retrying a flaky function whenever an Exception occurs until a value is returned." So this would allow the exception to be printed, yet the script to continue, I believe. Great content, man. Thanks for all of the awesome videos :).
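A rough sketch of the retry pattern the comment above describes, not the actual code from the video or from any library. `make_flaky` and `call_with_retries` are hypothetical names; the point is only that catching the exception inside the loop lets it be printed while the script carries on.

```python
def make_flaky(failures):
    """Hypothetical stand-in for a flaky call: fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def fetch():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise ConnectionError(f"simulated failure #{state['calls']}")
        return "page content"

    return fetch


def call_with_retries(func, attempts=100):
    """Call func until it returns a value; print each exception and keep looping."""
    for retry in range(attempts):
        try:
            return func()
        except Exception as exc:
            # The error is reported here, but the loop moves on to the next attempt
            print(f"attempt {retry + 1} failed: {exc}")
    raise RuntimeError(f"no result after {attempts} attempts")


result = call_with_retries(make_flaky(3))
print(result)
```

Here the first three attempts print their error and the fourth returns normally, which would look like an exception followed by a script that keeps running.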

@iNuchalHead - 30.03.2018 23:26

How well does it work on snippets or badly formed HTML?

@kylek29 - 30.03.2018 19:59

Thanks for posting this. I've used BS4 and another module to do the JavaScript (render the page) on many projects; it's nice to have it in a concise package.

Btw, I think the pagination on HackerNews failed because it looks for one of three (by default) "next" labels: "next", "more", "older" (DEFAULT_NEXT_SYMBOLS). The CNBC link has "more" in it.
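To illustrate the theory in the comment above, here is a simplified sketch of keyword-based "next page" detection. It is an assumption about the general approach, not the library's real implementation: `LinkCollector`, `guess_next_link`, and the sample markup are hypothetical, and this version only inspects link text (whether the real logic also looks at URLs is not assumed here).

```python
from html.parser import HTMLParser

# The three default labels quoted in the comment above
NEXT_SYMBOLS = ("next", "more", "older")


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def guess_next_link(html, symbols=NEXT_SYMBOLS):
    """Return the href of the first link whose text contains a 'next' keyword."""
    parser = LinkCollector()
    parser.feed(html)
    for href, text in parser.links:
        if any(symbol in text.lower() for symbol in symbols):
            return href
    return None


# Made-up page: a headline containing "more" comes before the real pagination link
SAMPLE = (
    '<a href="https://example.com/story">Markets want more stimulus</a>'
    '<a href="news?p=2">More</a>'
)

print(guess_next_link(SAMPLE))
```

In this sample the headline link wins because it is the first one containing "more", which is the kind of false match the comment suspects.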

@LolLol-wy5fp - 30.03.2018 19:22

Thank you, sentdex, you are leading me to the real world from Africa.

@MohamedMagdyHammad - 30.03.2018 19:10

The function couldn't clean up user data because these files were locked by chromium process.

@kemalonat802 - 30.03.2018 18:44

From Turkey👋👋👋

@Bidek56 - 30.03.2018 18:11

Have you done any tutorials on Dask?

@rohnchatterjee7736 - 30.03.2018 17:50

I think for paging one can use threading with an except statement, and hopefully it will work.

@SkySesshomaru - 30.03.2018 17:49

Make another video building a crawler using it. Nice video!

@rohnchatterjee7736 - 30.03.2018 17:39

Best way to remove error
-> comment out raise statement. 😁😂

@ayush0477 - 30.03.2018 17:38

Should I change my Windows to the 64-bit version?

@deadman87ful - 30.03.2018 16:59

Awesome video as always. Quick question, though: why don't you use Linux?

@TheJohnny9506 - 30.03.2018 16:27

Great tool for mixing with bs4 to build a robust crawler

@sak8485 - 30.03.2018 16:20

Can you make a video about GANs and some real-time applications of them?

@SerenoMendes - 30.03.2018 16:16

Great tool to build crawlers!

@hamrozjumaev9450 - 30.03.2018 16:13

Great work! Thank you... I would be very grateful if you checked my previous comments. I need your help, please.

@rumidom - 30.03.2018 15:55

man, i'm mixing that HTML parsing sauce with my beautiful soup right now

@thehungman - 30.03.2018 15:53

I like this type of video. You should do something like a monthly video on a new module so people can be aware of it. This would be very useful for people learning Python.

@yoeriyoeri4264 - 30.03.2018 15:48

1en
