veganism.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
Veganism Social is a welcoming space on the internet for vegans to connect and engage with the broader decentralized social media community.

Server stats: 293 active users

#mastoadmin

20 posts · 18 participants · 0 posts today

Finally had a chance to read the latest Trunk & Tidbits Mastodon development blog post, and while seeing all the work that's been done on quote posts and starter packs getting a mention is exciting, this upcoming update is really nice:

"Added the ability to block specific usernames from registering. This handles homoglyphs, partial matches, and either require approval, or deny registration entirely."

github.com/mastodon/mastodon/p

via blog.joinmastodon.org/2025/08/

Replaces the previously hardcoded list of reserved usernames with a new admin area that allows blocking words in usernames. Furthermore, the usernames can either be matched exactly, or any username...
GitHub: Add ability to block words in usernames by Gargron · Pull Request #35407 · mastodon/mastodon
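To make the quoted feature concrete, here is a minimal sketch of how exact and partial username matching with homoglyph folding could work. It is not Mastodon's implementation: the `HOMOGLYPHS` table, the `(word, exact, action)` rule shape, and the function names are all hypothetical, chosen only for illustration.

```python
import unicodedata

# Hypothetical, deliberately tiny look-alike table; a real one would be far larger.
HOMOGLYPHS = str.maketrans({
    "0": "o",
    "3": "e",
    "4": "a",
    "5": "s",
    "@": "a",
    "$": "s",
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
})


def normalize(username):
    """Fold compatibility forms and case, then map look-alike characters."""
    folded = unicodedata.normalize("NFKC", username).casefold()
    return folded.translate(HOMOGLYPHS)


def check_username(username, rules):
    """Return "deny", "approval", or "allow" for a requested username.

    rules is a list of (word, exact, action) tuples. This shape is an
    assumption for the sketch, not Mastodon's schema.
    """
    candidate = normalize(username)
    for word, exact, action in rules:
        target = normalize(word)
        matched = candidate == target if exact else target in candidate
        if matched:
            return action
    return "allow"


RULES = [
    ("admin", False, "deny"),       # partial match: registration refused
    ("support", True, "approval"),  # exact match: held for moderator review
]
```

With these rules, `check_username("4dmin_team", RULES)` is refused because the folded name contains "admin", while `check_username("supporter", RULES)` is allowed because "support" only matches exactly.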

As you've probably seen or heard, Dropsitenews has published a list (from a Meta whistleblower) of "the roughly 100,000 top websites and content delivery network addresses scraped to train Meta's proprietary AI models" -- including quite a few fedi sites. Meta denies everything of course, but they routinely lie through their teeth so who knows. In any case, whether or not the specific details in the report are accurate, it's certainly a threat worth thinking about.

So I'm wondering what defenses fedi admins are using today to try to defeat scrapers: robots.txt, user-agent blocking, firewall-level blocking of IP ranges, Cloudflare or Fastly AI scraper blocking, Anubis, stuff you don't want to disclose ... @deadsuperhero has some good discussion on We Distribute, and it would be very interesting to hear what various instances are doing.

And a couple of more open-ended questions:

  • Do you feel like your defenses against scraping are generally holding up pretty well?

  • Are there other approaches that you think might be promising that you just haven't had the time or resources to try?

  • Do you have any language in your terms of service that attempts to prohibit training for AI?

Here's @FediPact's post with a link to the Dropsitenews report and (in the replies) a list of fedi instances and CDNs that show up on the list.

cyberpunk.lol/@FediPact/114999

@fediverse @fediversenews
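As a rough illustration of the first two defenses named above (robots.txt and user-agent blocking), here is a small sketch. The token list is an assumption for the example and would need to be maintained by each admin. Both techniques only affect crawlers that identify themselves honestly, and any real deployment would need to make sure federation traffic from other fediverse servers is not caught by the match.

```python
# Illustrative crawler tokens only; this list is an assumption, not a vetted blocklist.
AI_CRAWLER_TOKENS = ["GPTBot", "CCBot", "ClaudeBot", "meta-externalagent", "Bytespider"]


def build_robots_txt(tokens):
    """Emit one disallow-everything group per crawler token."""
    groups = [f"User-agent: {token}\nDisallow: /\n" for token in tokens]
    return "\n".join(groups)


def is_blocked_agent(user_agent, tokens=AI_CRAWLER_TOKENS):
    """Case-insensitive substring check against the User-Agent header value."""
    ua = (user_agent or "").lower()
    return any(token.lower() in ua for token in tokens)
```

A reverse proxy or application middleware could call `is_blocked_agent` on each request and answer with a 403, while `build_robots_txt(AI_CRAWLER_TOKENS)` produces the advisory file for crawlers that respect it.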

Replied in thread

@ai6yr I mentioned to someone yesterday that I've been looking at The Ultimate Nginx Bad Bot Blocker— I just want to make sure it doesn't include Mastodon due to the "DDOS" link preview claims issue.

It claims, "The Ultimate Nginx Bad Bot, User-Agent, Spam Referrer Blocker, Adware, Malware and Ransomware Blocker, Clickjacking Blocker, Click Re-Directing Blocker, SEO Companies and Bad IP Blocker with Anti DDOS System, Nginx Rate Limiting and Wordpress Theme Detector Blocking. Stop and Block all kinds of bad internet traffic even Fake Googlebots from ever reaching your web sites. " #MastoAdmin

github.com/mitchellkrogza/ngin

GitHub - mitchellkrogza/nginx-ultimate-bad-bot-blocker: Nginx Block Bad Bots, Spam Referrer Blocker, Vulnerability Scanners, User-Agents, Malware, Adware, Ransomware, Malicious Sites, with anti-DDOS, Wordpress Theme Detector Blocking and Fail2Ban Jail for Repeat Offenders

I could use your help. My instance, the Embassy, takes up a huge amount of storage. It's mostly attachments. By now it adds up to almost 2 TB. It was even over 4 TB before I cleaned up (tootctl media remove and remove-orphaned, also purge-headers and delete-profiles; days=3). But 2 TB still seems like an awful lot to me. It makes my S3 very expensive. Last month the whole thing cost €40 at Wasabi. What else can I do to reduce the usage and with it the costs? #Mastoadmin Feel free to #RT

Our Mastodon :mastodon: instance burningboard.net

Running on energy efficient arm64 CPU (Ampere Altra Q80-30) and just 16GB of RAM in a virtual machine.

Now running on the latest Debian Linux 13 :debian: with Linux kernel 6.12.

For such a small hardware footprint, it's quite performant, reliable and fast (with over 100 active users).

Just the media files are offloaded to S3 storage at our provider.

Replied to Talya (she/her) 🏳️‍⚧️✡️

@Yuvalne If you have access to the database you could run

-- Total media attachment size per account, largest first
SELECT
  CONCAT(accounts.username, '@', accounts.domain) AS account,
  SUM(media_attachments.file_file_size) / 1024 / 1024 AS "Total Size (MB)"
FROM
  accounts
  JOIN media_attachments ON media_attachments.account_id = accounts.id
WHERE
  media_attachments.file_file_size IS NOT NULL
GROUP BY
  CONCAT(accounts.username, '@', accounts.domain)
ORDER BY
  SUM(media_attachments.file_file_size) DESC;

to get the list of accounts by total file size.

Or

-- Total media attachment size per domain, largest first
SELECT
  accounts.domain,
  SUM(media_attachments.file_file_size) / 1024 / 1024 AS "Total Size (MB)"
FROM
  accounts
  JOIN media_attachments ON media_attachments.account_id = accounts.id
WHERE
  media_attachments.file_file_size IS NOT NULL
GROUP BY
  accounts.domain
ORDER BY
  SUM(media_attachments.file_file_size) DESC;

to get a list of instances by file size.
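For anyone who wants to see the shape of the result before touching a production database, here is a sketch that runs the per-domain aggregation against toy data. It makes several assumptions: it uses an in-memory SQLite database rather than Mastodon's PostgreSQL, the tables contain only the three columns the query needs, the sample rows are invented, and it divides by 1024.0 so the megabyte totals keep their fractional part.

```python
import sqlite3

MB = 1024 * 1024

# Toy stand-in for the two tables the query touches; not Mastodon's real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, username TEXT, domain TEXT);
CREATE TABLE media_attachments (
    id INTEGER PRIMARY KEY, account_id INTEGER, file_file_size INTEGER
);
""")
conn.executemany("INSERT INTO accounts VALUES (?, ?, ?)", [
    (1, "alice", "example.social"),
    (2, "bob", "example.social"),
    (3, "carol", "photos.example"),
])
conn.executemany("INSERT INTO media_attachments VALUES (?, ?, ?)", [
    (1, 1, 2 * MB),
    (2, 2, 3 * MB),
    (3, 3, 10 * MB),
    (4, 3, None),  # attachment without a recorded size, filtered out below
])


def size_by_domain(connection):
    """Return (domain, total size in MB) rows, largest total first."""
    cursor = connection.execute("""
        SELECT accounts.domain,
               SUM(media_attachments.file_file_size) / 1024.0 / 1024.0 AS total_mb
        FROM accounts
        JOIN media_attachments ON media_attachments.account_id = accounts.id
        WHERE media_attachments.file_file_size IS NOT NULL
        GROUP BY accounts.domain
        ORDER BY SUM(media_attachments.file_file_size) DESC
    """)
    return cursor.fetchall()
```

On this toy data, `size_by_domain(conn)` lists `photos.example` (10 MB) ahead of `example.social` (5 MB).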

Replied to Talya (she/her) 🏳️‍⚧️✡️

@Yuvalne Question: Determine the media attachment file storage size held on a server for each account on Mastodon instance.

Mastodon Admin >Accounts pages clearly show what the media attachment storage size is for each account. Whether that is dynamically pulled when you view an account or is saved in the database periodically, I am not sure.

I ran "runuser -l mastodon -c 'cd ~ && pg_dump --column-inserts --table=accounts mastodon_production > accounts2.sql'" and it came back with a plain-text SQL file listing this information for each account. I didn't see media attachment file sizes in that info.

Maybe there is a pg_dump recipe that would return the "Media Attachment" info from the Admin>Account pages if it is saved in the database. I can't figure out what field that would be in the database schema. It's early, still, though, and I just brushed over the schema.😉 Tagging #mastodon #MastoAdmin #data #MastoDev in case someone is familiar with it.

Info it returned:
id, username, domain, private_key, public_key, created_at, updated_at, note, display_name, uri, url, avatar_file_name, avatar_content_type, avatar_file_size, avatar_updated_at, header_file_name, header_content_type, header_file_size, header_updated_at, avatar_remote_url, locked, header_remote_url, last_webfingered_at, inbox_url, outbox_url, shared_inbox_url, followers_url, protocol, memorial, moved_to_account_id, featured_collection_url, fields, actor_type, discoverable, also_known_as, silenced_at, suspended_at, hide_collections, avatar_storage_schema_version, header_storage_schema_version, devices_url, suspension_origin, sensitized_at, trendable, reviewed_at, requested_review_at, indexable

#Mastoadmin is there a good way to find which remote accounts/domains take a ton of space with media attachments? our server's remote media cache is ballooning by 30GB *a day* and that definitely sounds wrong.
the info exists somewhere because you can see it in the admin console, but neither the admin console nor #tootctl seem to have a way to sort by attachments total size (unless i'm missing something). any advice?

i wrote a guide to installing, configuring, and running a single-user @Mastodon server. it sets up a @hetzner VPS on ubuntu linux, but the instructions are general enough.

highlights: local disk storage, subdomain redirect, improved configuration files, security and performance, privacy and safety considerations. this server was created by following the guide. hello fediverse. ⁂

wavelight.ws/blog/20250806-mas

wavelight · mastodon server