veganism.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
Veganism Social is a welcoming space on the internet for vegans to connect and engage with the broader decentralized social media community.

#webscraping

4 posts · 3 participants · 0 posts today

The smartest brands aren’t guessing anymore. They’re automating how they collect and act on web data at every step:

• Dynamic pricing that adapts to the market

• Spotting trends before competitors do

• Understanding what customers really think

It’s how data becomes a competitive edge.

👉 Explore how top DTC brands are using web scraping to fuel growth: bit.ly/3G60wdu

I'm having trouble figuring out what kind of botnet has been hammering our web servers over the past week. Requests come in from tens of thousands of addresses, just once or twice each (so they aren't getting blocked by fail2ban), with varying User-Agent strings (Chrome versions ranging from 24.0.1292.0 to 108.0.5163.147) and ridiculous cobbled-together paths like /about-us/1-2-3-to-the-zoo/the-tiny-seed/10-little-rubber-ducks/1-2-3-to-the-zoo/the-tiny-seed/the-nonsense-show/slowly-slowly-slowly-said-the-sloth/the-boastful-fisherman/the-boastful-fisherman/brown-bear-brown-bear-what-do-you-see/the-boastful-fisherman/brown-bear-brown-bear-what-do-you-see/brown-bear-brown-bear-what-do-you-see/pancakes-pancakes/pancakes-pancakes/the-tiny-seed/pancakes-pancakes/pancakes-pancakes/slowly-slowly-slowly-said-the-sloth/the-tiny-seed

(I just put together a bunch of Eric Carle titles as an example. The actual paths are pasted together from valid paths on our server but in invalid order, with as many as 32 subdirectories.)

Has anyone else been seeing this and do you have an idea what's behind it?
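For anyone wanting to check their own logs for the same pattern, here is a minimal sketch that flags it: IPs that appear only once or twice, requesting implausibly deep paths. It assumes a combined-format access log; the thresholds (`max_hits_per_ip`, `min_depth`) and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Matches the client IP and request path from a combined-log-format line.
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|POST|HEAD) (\S+)')

def suspicious_hits(log_lines, max_hits_per_ip=2, min_depth=8):
    """Return (ip, path) pairs from rarely-seen IPs requesting very deep paths."""
    hits = []
    for line in log_lines:
        m = LOG_RE.match(line)
        if m:
            hits.append((m.group(1), m.group(2)))
    per_ip = Counter(ip for ip, _ in hits)
    return [
        (ip, path)
        for ip, path in hits
        # Only once-or-twice visitors, asking for paths with many segments.
        if per_ip[ip] <= max_hits_per_ip
        and path.strip("/").count("/") + 1 >= min_depth
    ]

sample = [
    '203.0.113.7 - - [01/Jan/2025:00:00:00 +0000] "GET /a/b/c/d/e/f/g/h/i/j HTTP/1.1" 404 0',
    '198.51.100.2 - - [01/Jan/2025:00:00:01 +0000] "GET /about-us HTTP/1.1" 200 512',
]
print(suspicious_hits(sample))  # flags only the deep-path request
```

Grouping the flagged paths by their constituent segments may also reveal that they are all permutations of the same small set of valid URLs.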

🧠 Doing market research or tracking prices online?

With ProxySocks5, you can run your scrapers 24/7 without worrying about blocks or limits. Our static and rotating proxies help you collect data smoothly from the locations you need—right down to the city.

Choose between HTTP, SOCKS5, Shadowsocks, Trojan, and WireGuard VPNs, available as datacenter or residential IPs. No data caps, no throttling, and no headaches.

🔍 proxysocks5.com

#WebScraping #ResidentialProxies #ProxySocks5 👽

This week I wrote about using Scrapy's CrawlSpider to follow links declaratively during a web scraping project.

However, in a past project I needed to extend this functionality a bit, defining rules dynamically (based on user input).

So, as a continuation of my previous post, I wrote a new one explaining how that solution was built.

rennerocha.com/posts/dynamic-r

Dynamic rules for following links declaratively with Scrapy | Renne Rocha