Comments:
Thank you, Corey.
Your explanations are always complete and very helpful!
You are the best, Corey 🥳
If I could give you a billion likes for this video I would. This is top quality content.
good video
Video id can be extracted by regex (?<=embed\/).+(?=\?)
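A minimal sketch of how that regex could be used from Python's standard `re` module. The sample `src` value and the helper name `extract_video_id` are made up for illustration. The pattern only matches when the URL has a `?` query string after the id, and because `.+` is greedy it would run up to the last `?` if there were several.

```python
import re

# Lookbehind for "embed/" and lookahead for "?", as in the comment above.
# The forward slash needs no escaping in a Python pattern.
EMBED_ID_PATTERN = re.compile(r"(?<=embed/).+(?=\?)")


def extract_video_id(src):
    """Return the text between 'embed/' and '?', or None if there is no match."""
    match = EMBED_ID_PATTERN.search(src)
    return match.group(0) if match else None


# Hypothetical iframe src attribute, not taken from any real page.
sample_src = "https://www.youtube.com/embed/abc123XYZ_-?version=3&rel=1"
video_id = extract_video_id(sample_src)
print(video_id)
```

If the embed URLs might come without a query string, a pattern such as `embed/([^?]+)` with a capture group may be a safer choice.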
Can you do a video on web scraping with Python, Scrapy, XPath and hidden APIs?
2022
On the webpage/URL that I call session.get(url) on, there is a JavaScript script. One thing this script does is send a request of its own. How can I capture the response to this request?
Please can you make another video on how to scrape an application?
Simply described and to the point! Thanks
I'm confused about this video and the web scraping with bs4 one. Can someone explain to me whether this video is the same as the other one or different?
Hi Corey, can you make a tutorial about how to call a public JavaScript API from Python?
Very thorough and complete on the topic, thanks for the educational video 🙂
Hi, I'm new to coding and stuff.
I want to scrape the URL inside an <a> tag with class .shortc-button, but as you already know, one element can have multiple classes...
In my case there are:
.shortc-button medium green (for the mediafire URL)
and
.shortc-button medium orange (for the zippyshare URL)
I want to access '.shortc-button medium green', which contains the mediafire links.
How should I write the code?
1. r.html.find('.shortc-button .medium .green')
2. r.html.find('.shortc-button medium green')
3.
shortc = r.html.find('.shortc-button')
for medium in shortc:
medium.find('.medium')
for green in medium:
green.find('.green')
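One possible answer to the question above: in CSS, a space between selectors means "descendant", so an element that carries all three classes is matched by a compound selector with no spaces, `.shortc-button.medium.green`. The sketch below shows the same "has all of these classes" check using only the standard library, so it runs without requests_html. The HTML snippet, the URLs and the class name `ClassLinkCollector` are made up for illustration.

```python
from html.parser import HTMLParser


class ClassLinkCollector(HTMLParser):
    """Collect href values of <a> tags that carry every required class."""

    def __init__(self, required_classes):
        super().__init__()
        self.required = set(required_classes)
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # The class attribute is a space-separated list of class names.
        classes = set((attr_map.get("class") or "").split())
        if self.required <= classes and attr_map.get("href"):
            self.links.append(attr_map["href"])


# Hypothetical markup modelled on the class names in the question.
sample_html = """
<a class="shortc-button medium green" href="https://example.com/mediafire-link">Mediafire</a>
<a class="shortc-button medium orange" href="https://example.com/zippyshare-link">Zippyshare</a>
"""

collector = ClassLinkCollector(["shortc-button", "medium", "green"])
collector.feed(sample_html)
green_links = collector.links
print(green_links)
```

With a CSS-selector-based `find`, the equivalent of this filter would be the compound selector with no spaces between the class names.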
Sir, I get an empty list from soup.find_all("div", class_="some class"), although there are some children of this class.
What can be the reason?
it's very useful thank you so much 💯
Hi @Corey, a question about your tutorial related to AsyncHTMLSession.
I'm getting the
"RuntimeError: This event loop is already running."
I checked the documentation but did not really see the reason for it. Could you please take a look and say whether that is expected? I'm running Windows 10, Python 3.10.
I tried scraping one, but got status code 406. Can you please help? I can't find a solution!
Daaamn this is the greatest video I’ve ever seen about scraping. Nice, I was looking for this kind of explanation for a long time since I’m working on a project with Python 3
If anyone understands what html = HTML(html = source) is doing, please assist. Any links to another video where it's explained would be welcome as well. Thanks
top notch as usual thank you
How to scrape innerHTML content?
tysm
12 minutes in, I can grab website information from this tutorial. Why is this a big deal? I know next to nothing about Python. Corey is high value in a very condensed time. Others would take hours to get to his 12-minute mark. Subscribed.
Do you have any series about asynchronous programming in Python?
Wow detailed info !!
Request to go for coroutines and asyncio and async await please
What a great tutorial! I bet this is the first long tutorial that I ever watched nonstop.
Corey, it was really informative. Can you clarify the difference between using BeautifulSoup and HTMLSession? Like, for which types of sites do we use BeautifulSoup and for which types of sites do we use HTMLSession?
Hi Corey, thanks for your video, it's really helpful.
I want to ask: if the website requires a login to see the data, how can we do that? I see there's a way to do it with the normal requests library but found none with requests-html. Thanks
Hi Corey. How does one find an element by its attribute and not by using css selector?
This is really cool. I was looking for the ability to scrape a website and found requests_html. Quickly ran headlong into a wall as the site is a React.js site. :( Thought maybe I could find some information on performing clicks and such with requests_html, but looks like that is not possible. Your tutorial on the subject is great though. Really well thought out and explained. Great presentation!
@coreyschaffer - Please provide the code for web scraping for this video. The GitHub repo link doesn't contain the code files.
Great stuff. Quick question. I'm able to scrape links but when they output on the HTML page it's just the text, not the clickable hyperlink. Any ideas on how to fix this so I can have a clickable link?
But how about a login page? Is it still worth it?
Brilliant as usual! Salute!!!
Just a suggestion Corey. Can you please tag your videos 'Beginner', 'Intermediate', 'Advanced' for the benefit of noobs like me. Thanks already. Keep the awesome stuff coming.
How does AsyncHTMLSession work with concurrent.futures? Don’t want to write a function for each thread.
Whenever I run .find(), the type that's returned is a list. For example the variable you have named "headline" would be a list. So I can't run .find() again. Also for some reason it's not recognizing .html as a method of the r object. I even explicitly declared the variable type but it still cannot see .html as a method from whatever session.get returns. Any suggestions?
nice tutorial
I'm learning a ton about webscraping from this tutorial, but I'm not able to run the code. Like many folks, I've got a few Python versions installed. I ran the code in the Thonny IDE, but I get a traceback on 'no requests_html module found.' Did some research on it, and discovered that requests_html is only supported on Python 3.6 (and my Thonny default was 3.7). I reset Thonny to run 3.6.5, but got the same error. Now I'm installing 3.6 to see if requests_html will be imported in that version. Anyone else see a similar issue with a traceback? What was your workaround?
Hello Corey, can you please make a full on tutorial on webscraping using Scrapy? Thanks in advance.
In my code I have this error: There is no current event loop in thread 'Thread-1'
My code(I use Django):
session = HTMLSession()
r = session.get(url)
r.html.render()
I really love this video @Corey Schafer, but I would like to learn about using APIs to scrape data from social media like Facebook, Twitter and the rest, so a video about that would be appreciated. Thank you
r.html only works in the terminal but not in the IDE. Help pls!
Hi @corey schafer, I found the same problem while working with requests_html, where there is no prettify method, but we can overcome that with the full_text method instead of text.
Thanks, and hope I helped you