Web Scraping + Reverse Engineering APIs

Syntax

10 months ago

8,667 views

Comments:

@cguser - 01.05.2024 14:06

finally a talk on Web Scraping! good to see you again wesbos and scott!

@gofudgeyourselves9024 - 01.05.2024 14:49

Ok

@siya.abc123 - 01.05.2024 14:53

Lol I've been watching every episode since CJ joined and yet I'm not subscribed 😅

Time to change that

@Stoney_Eagle - 01.05.2024 15:32

If someone scrapes for indexing and links to your site to consume it I am totally cool with it, but if someone scrapes to bypass the site I'm not.

@pedrogorilla483 - 01.05.2024 15:48

Awesome! Along the same lines, I’d love an episode on reverse engineering scrambled or minified webapps 😏

@pedrogorilla483 - 01.05.2024 16:30

I just started scraping a few months ago and somehow managed to figure out most of the tricks you talked about. I was moved by contempt for a service I paid $200 for as an annual subscription: when the subscription expired, not only did they cut off the premium features, they also blurred out over 20k data points I had previously processed on their platform while I had a premium subscription. I got it all back as JSON using their internal API. I wouldn’t have been able to do it without the help of ChatGPT.

@KevinMacKenzie61 - 01.05.2024 17:15

Is there a course you recommend for this?

@qnoox - 01.05.2024 21:11

Love this podcast and this episode, since I’m also a scraping OG / automation panda :) Side question: will the video format of the podcast ever cut to visual snapshots of what’s being discussed? For example, when the console is mentioned, cut to a snapshot of it, or when a website is mentioned, show a screenshot of it, like Wes did once during this video. I know this would add more work during editing, but it would be extra cool if it were included as a standard. Thanks, keep up the awesomeness 🎉👍

@stolinski - 02.05.2024 00:40

Working on a scraper rn.

@bingerminn - 02.05.2024 04:29

Awesome! I was using Puppeteer to scrape a site and converted it to pinging their API directly. So much faster, and no random errors when an element fails to load. Where would you host scraping scripts that run every day, hour, or minute? I used a package to run mine as a service on Windows.
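
For anyone curious what "pinging the API directly" can look like, here is a minimal sketch using only the Python standard library. The URL and the `page` parameter are placeholders, not the commenter's actual target; a real site's endpoint, parameters, and required headers would have to be read off the browser's network tab.

```python
import json
import urllib.parse
import urllib.request


def build_request(base_url: str, params: dict[str, str]) -> urllib.request.Request:
    """Build a GET request for a JSON endpoint (hypothetical URL and params)."""
    query = urllib.parse.urlencode(params)
    url = f"{base_url}?{query}" if query else base_url
    # Ask for JSON explicitly; some internal APIs return HTML otherwise.
    return urllib.request.Request(url, headers={"Accept": "application/json"})


def fetch_json(base_url: str, params: dict[str, str], timeout: float = 10.0):
    """Send the request and decode the JSON body."""
    request = build_request(base_url, params)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder endpoint: replace with one observed in the network tab.
    print(fetch_json("https://example.com/api/items", {"page": "1"}))
```

Because no browser is launched, there is no page rendering to wait on, which is likely where the speed-up described above comes from.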

@OtcSkater - 03.05.2024 05:05

I never thought I’d hear XPath mentioned on a podcast. It’s really too bad XML became a four-letter word. There were actually some cool things you could do with it that you can’t do with JSON. It has a DOM, for one thing.

@jayfiled - 03.05.2024 07:55

How would you send an alert when something becomes available? I want instant, attention-ambushing feedback if my scraper finds something.
If I run a Cypress script in headless mode to check a site for tickets, say, and it finds one, I want a desktop alert somehow. Browser alerts work if I run it manually, but if I schedule it on a Mac, it runs in the background and I don’t get any alerts.
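
One possible approach on macOS, sketched here as an assumption rather than a tested answer to this exact setup, is to have the scheduled script call `osascript` with AppleScript's `display notification` command. The function names below are made up for illustration, and whether the banner actually appears can depend on how the job is scheduled and on the system's notification settings.

```python
import subprocess
import sys


def _quote(text: str) -> str:
    """Quote text as an AppleScript string literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_notification_command(title: str, message: str) -> list[str]:
    """Build the osascript command line for a desktop notification."""
    script = f"display notification {_quote(message)} with title {_quote(title)}"
    return ["osascript", "-e", script]


def notify(title: str, message: str) -> bool:
    """Show the notification; returns False when not on macOS or on failure."""
    if sys.platform != "darwin":
        return False
    result = subprocess.run(build_notification_command(title, message), check=False)
    return result.returncode == 0


if __name__ == "__main__":
    # Call this from the scraper at the point where a ticket is found.
    notify("Ticket scraper", "A ticket is available")
```

The command is built separately from being run so the quoting can be checked without triggering a real notification.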

@chamithjanaka6040 - 04.05.2024 18:02

Love you both from Sri Lanka...🇱🇰 ❤

@Cyber-Bison - 05.07.2024 03:01

Have you or anyone else extracted data from an interactive chart?

@parkerrex - 12.12.2024 05:45

Great sode fellas

@vNYCblade - 09.01.2025 06:49

@syntaxfm
Can I propose a challenge?
I have been trying to create an export tool to back up WhatsApp messages, including all media: images, videos, voice messages, emojis, etc.
The tool would be able to back up messages going back to a certain date, like 6 months ago, OR all messages for a given chat. This seems to be impossible, or at least SUPER DIFFICULT, because everything in the WhatsApp web app is done via WebSockets and is encrypted.
I was looking for ways to reverse engineer their hidden APIs, but I don't think they have any classic APIs; I think it's all WebSockets.
And the only way to really scrape the data is by using Puppeteer or some other headless browser approach.

Would love to get some info from Wes and whoever else has any insight into this issue...

@maueucifeely3910 - 04.02.2025 04:56

How does HasData handle scraping dynamic websites like those with a lot of JavaScript? Really curious about its efficiency compared to puppeteer.

@maueucifeely3910 - 07.03.2025 01:22

I'm learning about web scraping and wonder if HasData can help with navigating protected routes. What do you guys use?
