A new technique for LLMs has just landed: Explainable training!
Let me *explain*.
Normal supervised training works like this: you show ground-truth inputs and outputs to a model and backpropagate the error into the model weights. All of this is an opaque black box. If you train on data which contains, for example, personally identifiable information (PII) or copyrighted content, that material will plausibly be stored verbatim in the model weights.
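For contrast, here is a minimal sketch of that ordinary supervised loop in PyTorch (the model, data, and hyperparameters are just placeholders):

```python
import torch
import torch.nn as nn

# Stand-ins for any model and any ground-truth dataset.
model = nn.Linear(16, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

inputs = torch.randn(32, 16)    # ground-truth inputs
targets = torch.randn(32, 1)    # ground-truth outputs

for _ in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)  # compare predictions to targets
    loss.backward()                         # backpropagate the error...
    optimizer.step()                        # ...into opaque weight updates
```

Whatever the training examples contained is now smeared across those weights, with no human-readable record of what was learned from them.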
What if we do it like this instead:
Let's write initial instructions for an LLM to generate synthetic data which resembles the real data.
Then we go to the real data and, one by one, show an LLM an example of the real data, an example of the synthetic data, and the instructions used to generate that synthetic data. We then ask it to iteratively refine those instructions so that the synthetic data resembles the real data more closely, in the features and characteristics that matter.
You can also add reasoning steps, and an instruction to never copy PII as such into the synthetic data generation instructions.
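A minimal sketch of that refinement loop is below. `call_llm` is a placeholder for whatever LLM API you use, and the prompt wording, function names, and structure are illustrative assumptions, not a fixed recipe:

```python
def call_llm(prompt: str) -> str:
    # Placeholder: wire this to your LLM provider of choice.
    raise NotImplementedError

def generate_synthetic_example(instructions: str) -> str:
    # Produce one synthetic record from the current instruction document.
    return call_llm(
        "Follow these instructions to produce one synthetic data record:\n"
        f"{instructions}"
    )

def refine_instructions(instructions: str, real_examples: list[str]) -> str:
    # Walk through the real data one example at a time, asking the LLM to
    # improve the instructions rather than memorize the example itself.
    for real_example in real_examples:
        synthetic_example = generate_synthetic_example(instructions)
        instructions = call_llm(
            "Here are instructions for generating synthetic data:\n"
            f"{instructions}\n\n"
            f"A synthetic example they produced:\n{synthetic_example}\n\n"
            f"A real example for comparison:\n{real_example}\n\n"
            "First reason step by step about which features and characteristics "
            "of the real data the synthetic data fails to capture. Then output a "
            "refined version of the instructions. Never copy names, addresses, or "
            "other personally identifiable information into the instructions; "
            "describe such fields only in general terms."
        )
    return instructions
```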
This is just like supervised learning, but explainable! The result is a document of refined instructions for generating better synthetic data, informed by the real data, and it's human-readable and explainable!
You can easily verify that this relatively small document doesn't contain, for example, PII, and you can then use it to generate any volume of synthetic training data while keeping critical protected details in the real data from leaking into the trained model!
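Continuing the hypothetical helpers from the sketch above (the starting prompt and real examples shown here are made up for illustration):

```python
initial_instructions = "Generate a realistic-looking customer support ticket."
real_dataset = ["<real ticket 1>", "<real ticket 2>"]  # your real examples

refined = refine_instructions(initial_instructions, real_dataset)
print(refined)  # human-readable: audit this document for PII before use

# Once audited, the instruction document alone drives generation;
# the real data never touches this step.
synthetic_dataset = [
    generate_synthetic_example(refined)
    for _ in range(10_000)  # any volume you need
]
```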
This is the next level of privacy protection for training AIs!