veganism.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
Veganism Social is a welcoming space on the internet for vegans to connect and engage with the broader decentralized social media community.

#LLMs

99 posts · 78 participants · 6 posts today

The whole point of Artificial Intelligence is to be a trusted expert. #LLMs do not provide this.

#LLMs can be misleading for non-experts, and the misinformation that they produce could lead non-experts to question the validity of the actual expert’s knowledge.

So not only do they fail to provide trusted expertise, they undermine the people who can provide it.

A new technique for LLMs has just landed: Explainable training!

Let me *explain*.

Normal supervised training works like this: you show ground-truth inputs and outputs to a model and backpropagate the error into the model weights. All of this is an opaque black box. If you train on data which contains, for example, personally identifiable information (PII) or copyrighted content, that material will plausibly be stored verbatim in the model weights.

What if we do it like this instead:

Let's write initial instructions to an LLM for generating synthetic data which resembles real data.

Then we go through the real data one example at a time: we show an LLM a real example, a synthetic example, and the instructions used to generate the synthetic data. Then we ask it to iteratively refine those instructions so that the synthetic data resembles the real data more closely, in the features and characteristics which matter.

You can also add reasoning steps, and explicit instructions not to copy PII into the synthetic data generation instructions.

This is just like supervised learning, but explainable! The result is a document of refined instructions for generating better synthetic data, informed by real data, and it's human-readable and explainable!

You can easily verify that this relatively small document doesn't contain, for example, any PII, and you can use it to generate any volume of synthetic training data while guaranteeing that critical protected details in the real data do not leak into the trained model!
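Here is a minimal Python sketch of that refinement loop, just to make the idea concrete. The call_llm() helper is a placeholder for whatever chat-completion API you use, and the prompt wording and data format are purely illustrative assumptions:

def call_llm(prompt: str) -> str:
    # Placeholder: send the prompt to your LLM of choice and return its reply.
    raise NotImplementedError

REFINE_PROMPT = (
    "You are refining instructions for generating synthetic data.\n"
    "Current instructions:\n{instructions}\n"
    "A real example:\n{real_example}\n"
    "A synthetic example produced with the current instructions:\n{synthetic_example}\n"
    "Rewrite the instructions so the synthetic data matches the real data more closely "
    "in the features that matter. Never copy names, addresses or other personally "
    "identifiable information into the instructions; describe such fields only in general terms."
)

def refine_instructions(instructions: str, real_examples: list[str]) -> str:
    # Walk through the real data one example at a time, refining the instructions.
    for real_example in real_examples:
        synthetic_example = call_llm(
            "Generate one synthetic record following these instructions:\n" + instructions
        )
        instructions = call_llm(REFINE_PROMPT.format(
            instructions=instructions,
            real_example=real_example,
            synthetic_example=synthetic_example,
        ))
    return instructions  # a human-readable, auditable artifact instead of opaque weights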

This is the next level of privacy protection for training AIs!

#AIs #LLMs #privacy

"GitHub Codespaces provides full development environments in your browser, and is free to use with anyone with a GitHub account. Each environment has a full Linux container and a browser-based UI using VS Code.

I found out today that GitHub Codespaces come with a GITHUB_TOKEN environment variable... and that token works as an API key for accessing LLMs in the GitHub Models collection, which includes dozens of models from OpenAI, Microsoft, Mistral, xAI, DeepSeek, Meta and more.

Anthony Shaw's llm-github-models plugin for my LLM tool allows it to talk directly to GitHub Models. I filed a suggestion that it could pick up that GITHUB_TOKEN variable automatically and Anthony shipped v0.18.0 with that feature a few hours later.

... which means you can now run the following in any Python-enabled Codespaces container and get a working llm command:"
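The actual commands are cut off in this excerpt (the full post is linked below). As a rough illustration only, assuming the llm package and Anthony Shaw's llm-github-models plugin (v0.18.0 or later, which picks up GITHUB_TOKEN automatically) are installed inside the Codespace, the models can also be called from Python. The model id below is a guess for illustration, not taken from the post:

import llm  # Simon Willison's LLM library, https://llm.datasette.io/

# Rough sketch: assumes llm-github-models is installed and GITHUB_TOKEN is set,
# as it is by default inside a Codespace. The model id is an assumption.
model = llm.get_model("github/gpt-4o-mini")
response = model.prompt("Say hello from this Codespace")
print(response.text())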

simonwillison.net/2025/Aug/13/

Simon Willison’s Weblog: simonw/codespaces-llm

A big part of being an expert (PhD or otherwise) is knowing what you don't know. When #LLMs can respond with "sorry I don't know" instead of bullshitting, I might be more inclined to believe they have actual expertise.

(And, yes, I know that the nature of LLMs means that may well be impossible).

Here I tend to read a lot of negative comments about #LLMs, but does anyone share my experience? I use them as an aid to mostly write #rstats and #LaTeX #tikz code faster or recently to clean up a very messy #emacs configuration file. They may hallucinate and I always need to double check the output, but I find #LLMs extremely useful. The ecological footprint is a discussion for another day.

"AI – the ultimate bullshit machine – can produce a better 5PE than any student can, because the point of the 5PE isn't to be intellectually curious or rigorous, it's to produce a standardized output that can be analyzed using a standardized rubric.

I've been writing YA novels and doing school visits for long enough to cement my understanding that kids are actually pretty darned clever. They don't graduate from high school thinking that their mastery of the 5PE is in any way good or useful, or that they're learning about literature by making five marginal observations per page when they read a book.

Given all this, why wouldn't you ask an AI to do your homework? That homework is already the revenge of Goodhart's Law, a target that has ruined its metric. Your homework performance says nothing useful about your mastery of the subject, so why not let the AI write it. Hell, if you're a smart, motivated kid, then letting the AI write your bullshit 5PEs might give you time to write something good.

Teachers aren't to blame here. They have to teach to the test, or they will fail their students (literally, because they will have to assign a failing grade to them, and figuratively, because a student who gets a failing grade will face all kinds of punishments). Teachers' unions – who consistently fight against standardization and in favor of their members' discretion to practice their educational skills based on kids' individual needs – are the best hope we have:"

pluralistic.net/2025/08/11/fiv

pluralistic.net: Goodhart’s Law (of AI) (11 Aug 2025) – Pluralistic: Daily links from Cory Doctorow

"I’ve fallen a few days behind keeping up with Qwen. They released two new 4B models last week: Qwen3-4B-Instruct-2507 and its thinking equivalent Qwen3-4B-Thinking-2507.

These are relatively tiny models that punch way above their weight. I’ve been running the 8bit GGUF varieties via LM Studio (here’s Instruct, here’s Thinking)—both of them are 4GB downloads that use around 4.3GB of my M2 MacBook Pro’s system RAM while running. Both are way more capable than I would expect from such small files.

Qwen3-4B-Thinking is the first model I’ve tried which called out the absurdity of being asked to draw a pelican riding a bicycle!"

simonwillison.net/2025/Aug/10/

Simon Willison’s Weblog: Qwen3-4B-Thinking: “This is art—pelicans don’t ride bikes!”

"The incredible demand for high-quality human-annotated data is fueling soaring revenues of data labeling companies. In tandem, the cost of human labor has been consistently increasing. We estimate that obtaining high-quality human data for LLM post-training is more expensive than the marginal compute itself1 and will only become even more expensive. In other words, high-quality human data will be the bottleneck for AI progress if these trends continue.

The revenue of major data labeling companies and the marginal compute cost of training frontier models for major AI providers in 2024.

To assess the proportion of data labeling costs within the overall AI training budget, we collected and estimated both data labeling and compute expenses for leading AI providers in 2024:

- Data labeling costs: We collected revenue estimates of major data labeling companies, such as Scale AI, Surge AI, Mercor, and LabelBox.
- Compute costs: We gathered publicly reported marginal costs of compute associated with training top models released in 2024, including Sonnet 3.5, GPT-4o, DeepSeek-V3, Mistral Large, Llama 3.1-405B, and Grok 2.

We then take the sum of the costs in each category as the estimate of that market's total. As shown above, the total cost of data labeling is approximately 3.1 times higher than total marginal compute costs. This finding provides clear evidence that the cost of acquiring high-quality human-annotated data is rapidly outpacing the compute costs required for training state-of-the-art AI models."
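To make the arithmetic behind that estimate concrete, here is a toy Python sketch; every figure below is a made-up placeholder, not one of the post's actual numbers:

# Toy illustration of the calculation described above; all numbers are hypothetical.
labeling_revenues_musd = [800, 400, 200, 150]  # placeholder 2024 revenue per labeling firm, in $M
compute_costs_musd = [150, 120, 100, 80, 50]   # placeholder marginal training compute per frontier model, in $M

total_labeling = sum(labeling_revenues_musd)   # 1550
total_compute = sum(compute_costs_musd)        # 500
print(f"labeling / compute = {total_labeling / total_compute:.1f}x")  # 3.1x with these placeholders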

ddkang.substack.com/p/human-da

Daniel’s Substack: Human Data is (Probably) More Expensive Than Compute for Training Frontier LLMs, by Daniel Kang

If you find yourself adding a lot of technological or prompt-based guardrails to your agent to get it to do exactly what you want, it might be time to ask yourself the tough question: do you REALLY need an agent, or do you need automation? Once you figure this out and accept that you probably just needed automation, use your AI power-up and let it build it for you! Save agents for what they're actually good at: analysis, synthesis, and bounded tasks.

Trust me, you'll thank me later.