Nume MacAroon @nm

Richard Dallaway

Sad news: “BODEN — Professor Margaret (Maggie) Boden, renowned cognitive scientist and long-time member of the University of Sussex, died peacefully in Brighton on 18th July 2025, aged 88.”

https://www.theargus.co.uk/memorials/death-notices/death/30683058.margaret-maggie-boden/notice/

#cogsci

**Nick Byrd, Ph.D.** @ByrdNick@nerdculture.de · 6d

Mistaking meaningless claims like "Good #health lends subtle creativity to reality” for profound ideas is known as #bullshit receptivity.

Susceptibility to such #BS was increased a bit by forcing people to quickly accept their initial impulse.

https://doi.org/10.1080/20445911.2025.2517038

Main differences between the intuition or reflection condition (compared to the control condition) in receptivity to bullshit, mundane statements, and motivational quotes.

The reflection condition didn't differ significantly from the control condition, replicating prior research showing that it is much easier to inhibit reflection than encourage it.

#cogSci

**ma𝕏pool** @maxpool@mathstodon.xyz · Jul 18 *

Play First 3 Games
https://three.arcprize.org/

There are no instructions. You must play the game to discover controls, rules, and goal.

ARC-3, a sneak peek at the next-gen, interactive reasoning benchmark designed to illuminate the capability gap between today's AI and tomorrow's AGI.

Interactive Reasoning Benchmarks (IRBs) test for a broad scope of capabilities:

• Exploration
• Percept → Plan → Action
• Memory
• Goal Acquisition
• Alignment

Game Design Constraints

• Easy for humans (can pick it up in <1 min of game play)
• Core Knowledge Priors (no language, trivia, cultural symbols)
• Should require no instructions to play
• Should be fun for humans and playable in 5-10 minutes
• Innovative and novel game mechanics encouraged (Hidden state, theory of mind, long term planning, navigating other agents, etc.)

ARC-AGI-3 Preview: The first interactive reasoning benchmark for AI agents.

#AI #ML #MachineLearning

**Nick Byrd, Ph.D.** @ByrdNick@nerdculture.de · Jul 10

This week I'm posting about presentations from two cool events (over on Twitter): https://x.com/byrd_nick/status/1943219893291164057

What are the events?
(1) The 1st Experimental Argument Analysis workshop
(2) The 5th European #ExperimentalPhilosophy #Conference

#cogSci #xPhi #Linguistics

**Nick Byrd, Ph.D.** @ByrdNick@nerdculture.de · Jun 26

For the next couple days, I'm posting about talks and posters from the 2025 BioXPhi Summit in #Switzerland. Follow on #BlueSky: https://bsky.app/profile/byrdnick.com/post/3lsim7t6gq22t

The #conference website: https://ibmb.unibas.ch/en/public-outreach/projects-to-the-public/basel-oxford-nus-bioxphi-summit-2025/

The 2025 BioXPhi Summit program of talks and events.

#bioethics #xPhi #medicine

**Nick Byrd, Ph.D.** @ByrdNick@nerdculture.de · Jun 24

#AI reasoning models may seem to reason reflectively when they say things like, "Let me rethink that".

But do these "reflective" phrases predict better reasoning performance?

Not in #Deepseek R1 Zero: https://doi.org/10.48550/arXiv.2503.20783

Some examples of the "aha" moments in allegedly "self-reflection" output from AI reasoning models.

"An additional important question is whether self-reflection behaviors are associated with improved model performance after RL training. To investigate this, we host DeepSeek-R1-Zero and analyze its responses of the same questions from MATH dataset. While self-reflection behaviors occur more frequently in R1-Zero, we observe these behaviors are not necessarily imply higher accuracy. Detailed analysis can be found in App. D."

Self-reflection does not necessarily imply higher accuracy. To investigate whether self-reflection behaviors are associated with model performance during the inference (acknowledging that self-reflection may improve exploration during training—a potential positive effect outside this section’s scope), we analyze questions that elicit at least one response with self-reflection from DeepSeek-R1-Zero across eight trials. For each question, we sample 100 responses and divide them into two groups: those with self-reflection and those without. We then compute the accuracy difference between these two groups for each question. As shown in Fig. 13, the results indicate that nearly half responses with self-reflection do not achieve higher accuracy than those without self-reflection, suggesting that self-reflection does not necessarily imply higher inference-stage accuracy for DeepSeek-R1-Zero.
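The per-question comparison described in that excerpt can be sketched in a few lines of Python. This is only an illustration, assuming each sampled response has already been labeled as reflective or not and graded as correct or not; the function names and the toy data are hypothetical and do not come from the paper.

```python
from typing import List, Optional, Tuple

# One sampled response: (has_self_reflection, is_correct).
Response = Tuple[bool, bool]

def accuracy_gap(responses: List[Response]) -> Optional[float]:
    """Accuracy of reflective responses minus accuracy of the rest.

    Returns None when either group is empty, since no comparison is possible.
    """
    reflective = [ok for flagged, ok in responses if flagged]
    plain = [ok for flagged, ok in responses if not flagged]
    if not reflective or not plain:
        return None
    return sum(reflective) / len(reflective) - sum(plain) / len(plain)

def share_without_gain(gaps: List[Optional[float]]) -> float:
    """Fraction of comparable questions where reflection did not raise accuracy."""
    usable = [g for g in gaps if g is not None]
    if not usable:
        raise ValueError("no question has both reflective and plain responses")
    return sum(g <= 0 for g in usable) / len(usable)

# Toy data (hypothetical): two questions, four sampled responses each.
toy = {
    "q1": [(True, True), (True, False), (False, True), (False, True)],
    "q2": [(True, True), (True, True), (False, True), (False, False)],
}
gaps = [accuracy_gap(r) for r in toy.values()]
print(gaps, share_without_gain(gaps))
```

In the toy data, reflection lowers accuracy on one question and raises it on the other, so half of the questions show no gain.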

What keywords or phrases count as "self-reflection"?

Here's what the paper reports: "terms like “wait” and “try again” frequently result in false positive detections. To reduce false positives, we maintain a small, highly selective keyword pool consisting of terms that are strongly indicative of self-reflection. In our experiment, the keyword pool is limited to: recheck, rethink, reassess, reevaluate, re-evaluate, reevaluation, re-examine, reexamine, reconsider, reanalyze, double-check, check again, think again, verify again, and go over the steps."
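A keyword filter of this kind might look roughly like the Python sketch below. The keyword list is the one quoted above; the matching rules (case-insensitive, anchored at a leading word boundary so that forms like "rechecking" also count) are my assumptions, not details reported in the paper.

```python
import re

# Keyword pool as quoted from the paper.
KEYWORDS = [
    "recheck", "rethink", "reassess", "reevaluate", "re-evaluate",
    "reevaluation", "re-examine", "reexamine", "reconsider", "reanalyze",
    "double-check", "check again", "think again", "verify again",
    "go over the steps",
]

# Assumption: case-insensitive match that must start at a word boundary.
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")",
    re.IGNORECASE,
)

def has_self_reflection(response: str) -> bool:
    """True if the response contains at least one keyword from the pool."""
    return PATTERN.search(response) is not None

print(has_self_reflection("Let me rethink that."))     # pool keyword present
print(has_self_reflection("Wait, let me try again."))  # excluded terms only
```

Note that "wait" and "try again" are deliberately absent from the pool, so the second example is not flagged.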

#cogSci #decisionScience #processTracing

**Nick Byrd, Ph.D.** @ByrdNick@nerdculture.de · Jun 10

Can task-switching hinder decisions?

Switching between a reflection test and a fluid #IQ test lowered optimal reflection test scores and completion compared to taking the tests separately (N = 80).

Bad news for #multitasking?

https://ianburbidge.com/wp-content/uploads/2024/05/ian-burbidge-masters-dissertation-1.pdf

#productivity #cogSci #edu

**Nick Byrd, Ph.D.** @ByrdNick@nerdculture.de · Jun 6

A #nudge improves a decision environment.
A #boost improves a decision competency.

This paper argues against Sunstein's suggestion that boosts are thus educative or reflective (System 2) interventions.

https://doi.org/10.1007/s11299-025-00324-1

#BehavioralScience #Policy #PhilSci

**Nick Byrd, Ph.D.** @ByrdNick@nerdculture.de · May 19

Accepted in Res Philosophica

"Reflective" thinking is rife in #cogSci and the #history of ideas.
But we lack a unified definition.
So I synthesized one.
Just 2 key factors.
Not just unifying, but useful!

Audiopaper: https://byrdnick.com/archives/28904/upon-reflection-ep-15-a-two-factor-explication-of-reflection

Preprint: https://osf.io/preprints/psyarxiv/d628j

Upon Reflection, Ep. 15: A Two-Factor Explication Of ‘Reflection’ | Nick Byrd, Ph.D. · May 19

You may have heard me drone on and on about this thing called “reflective thinking”. We philosophers and cognitive scientists are preoccupied with it. However, we lack a definition of …

**Nick Byrd, Ph.D.** @ByrdNick@nerdculture.de · May 17

Maybe @Dockers opted for the misspelled "TruTemp" #branding because #philosophy had already taken "Truetemp".

Aside: I recently published new data about #thoughtExperiments like Truetemp:
https://doi.org/10.1093/analys/anaf015
https://osf.io/preprints/psyarxiv/y8sdm

A photo of a Dockers TruTemp365 tag on the waistband of a pair of pants.

"Suppose a person, whom we shall name Mr. Truetemp, undergoes brain surgery by an experimental surgeon who invents a small device which is both a very accurate thermometer and a computational device capable of generating thoughts. ... Now imagine, ..., that he has no idea that the tempucomp has been inserted in his brain, is only slightly puzzled about why he thinks so obsessively about the temperature, but never checks a thermometer to determine whether these thoughts about the temperature are correct. He accepts them unreflectively.... Thus, he thinks and accepts that the temperature is 104 degrees. It is. Does he know that it is? Surely not."

Lehrer, K. (1990). Theory of Knowledge. Routledge. pp 162-163

#cogSci #xPhi #trademark

**Nick Byrd, Ph.D.** @ByrdNick@nerdculture.de · May 14 *

Can group work/discussion cultivate #criticalThinking?

General #surgery trainees randomly assigned to team-based learning (rather than traditional curricula) had better reflection test scores (n = 36).

Preprint: https://doi.org/10.21203/rs.3.rs-6439748/v1

#edu #medicine #cogSci

**Nick Byrd, Ph.D.** @ByrdNick@nerdculture.de · May 7

#AlgorithmAversion is a tendency to judge errors in automated decisions more harshly than errors in human decisions.

Telling people a decision is typically made by machines eliminated or even reversed the #bias.

https://doi.org/10.1017/jdm.2025.8

Methods and initial result (pages 7 and 8).

Other results from Study 1 (pages 9 and 10)

One more plot from Study 2 and the beginning of the discussion section (pages 17 and 18)

#AI #cogSci #xPhi

**Nick Byrd, Ph.D.** @ByrdNick@nerdculture.de · May 1

Is reflective reasoning always better?

In "Bounded Reflectivism..." (2022), I argued that #cogSci data show reflection is NOT always best: https://doi.org/10.1111/meta.12534

Another #AI paper finds this: intuitive #LLM prompts were better for "common sense" tasks: https://doi.org/10.48550/arXiv.2502.12470

Pages 5 and 6 (more methods and some results)

Pages 7 and 13 (more results and methods)

**Nick Byrd, Ph.D.** @ByrdNick@nerdculture.de · Apr 30

Another correlational study of #AI use and #CriticalThinking draws unmerited causal conclusions.

This one found a *positive* correlation between AI use and (self-report-derived) critical thinking.

Participants: ≈100 pre-service teachers

https://doi.org/10.59400/fes2727

Title, author, and abstract (with my revisions)

#edu #cogSci

**Nick Byrd, Ph.D.** @ByrdNick@nerdculture.de · Apr 29

People were less averse to #risk (d = 0.4) when making #prenatalTesting decisions in their SECOND #language — even when they seemed to understand the relevant information.

https://doi.org/10.1002/bdm.70016

#parenting #cogSci #medicine

**Nick Byrd, Ph.D.** @ByrdNick@nerdculture.de · Apr 29

I'm #teaching at #Bucknell Tuesday
- Kahneman's peak-end rule & #socialMedia
- Global Analytic #Atheism: https://doi.org/10.1017/S0034412525000198
- Scalable Socratic Reflection: https://www.researchgate.net/publication/370132037
- Strategic Reflection in #AI, #HCI, #cogSci: https://www.researchgate.net/publication/390166382

https://byrdnick.com/teaching

Byrd, N. (2020). Causal Network Accounts of Ill-Being: Depression & Digital Well-Being. In C. Burr & L. Floridi (Eds.), Ethics of Digital Well-Being: A Multidisciplinary Approach (pp. 221–245). Springer International Publishing. https://doi.org/10.1007/978-3-030-50585-1_11

Byrd, N., Stich, S., & Sytsma, J. (2025). Analytic atheism and analytic apostasy across cultures. Religious Studies. https://doi.org/10.1017/S0034412525000198

(2024, November). Experiments In Reflective Equilibrium Using The Socrates Platform. Society for Judgment and Decision Making, New York City. https://researchgate.net/publication/370132037

Byrd, N. (2025, April 26). Strategic Reflectivism In Intelligent Systems. Workshop on Human-AI Interaction for Augmented Reasoning. CHI ’25, Yokohama Japan. https://www.researchgate.net/publication/390166382

**Nick Byrd, Ph.D.** @ByrdNick@nerdculture.de · Apr 28

Is reflective reasoning always slower than, say, intuition?

A paper used process dissociation to explicate deliberate control:
- it wasn't reliably slower
- it didn't reliably involve more self-reported deliberation (such as stopping to think)

https://doi.org/10.1177/23780231251325087

Title, authors, abstract, and introduction.

Some assumptions and initial behavioral results.

Reaction time and self-reported deliberation results.

#cogSci

**Nick Byrd, Ph.D.** @ByrdNick@nerdculture.de · Apr 20

Yet another paper showing dual-minded #LLMs (intuitive + reflective) can improve accuracy-cost tradeoffs: https://doi.org/10.48550/arXiv.2504.12329

As I argue in #StrategicReflectivism, pragmatic switching between the two modes is key to intelligent systems: https://www.researchgate.net/publication/390166382

Figure 4: Overview of speculative thinking. A small model generates most output but selectively delegates challenging segments—marked by structural cues such as paragraph breaks (“\n\n”) followed by reflective phrases like “wait,” “alternatively,” or “hold on”—to a stronger model. Small models often produce verbose or incoherent outputs at these points, while larger models handle them concisely. The proposed speculative thinking preserves efficiency while leveraging the large model’s strength when most needed.
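The delegation rule in that caption can be illustrated with a rough Python sketch. It is a simplification under stated assumptions: it post-processes a finished small-model draft split on paragraph breaks, whereas the paper describes delegation during generation. The names `speculative_thinking` and `large_model` are hypothetical, and the large model is a stub.

```python
import re
from typing import Callable, List

# Cue from the caption: a paragraph that opens with a reflective phrase.
CUE = re.compile(r"(?:wait|alternatively|hold on)\b", re.IGNORECASE)

def speculative_thinking(draft: str, large_model: Callable[[str], str]) -> str:
    """Keep the small model's paragraphs, but hand any paragraph that opens
    with a reflective cue to the large model, passing the text so far."""
    output: List[str] = []
    for paragraph in draft.split("\n\n"):
        if CUE.match(paragraph.lstrip()):
            # Delegate: the large model sees everything accepted so far.
            output.append(large_model("\n\n".join(output)))
        else:
            output.append(paragraph)
    return "\n\n".join(output)

# Stub standing in for a stronger model (hypothetical).
def large_model(context: str) -> str:
    return "[large-model segment]"

draft = "Compute 2 + 2.\n\nWait, maybe I misread the question.\n\nSo the answer is 4."
result = speculative_thinking(draft, large_model)
print(result)
```

Only the middle paragraph is delegated, so the small model still produces most of the output, which is the efficiency point the caption makes.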

#AI #cogSci

**Nick Byrd, Ph.D.** @ByrdNick@nerdculture.de · Apr 19

Do people diagnosed with #autism respond differently to moral dilemmas?

In MINORS, sacrificial harm waned with age, more slowly in the ASD group: https://doi.org/10.1007/s10803-022-05795-6

In ADULTS, decisions were similar: https://doi.org/10.1016/j.paid.2024.112889

Labusch, M., Perea, M., Sahuquillo-Leal, R., Bofill-Moscardó, I., Carrasco-Tornero, Á., Cañada-Pérez, A., & García-Blanco, A. (2022). Development of Moral Judgments in Impersonal and Personal Dilemmas in Autistic Spectrum Disorders from Childhood to Late Adolescence. Journal of Autism and Developmental Disorders. https://doi.org/10.1007/s10803-022-05795-6

Mantchala, S., Gosling, C. J., Trémolière, B., & Moutier, S. (2025). Relationship between reasoning, autistic and alexithymic traits in moral judgments. Personality and Individual Differences, 233, 112889. https://doi.org/10.1016/j.paid.2024.112889

#philosophy #cogSci #psychiatry

**Nick Byrd, Ph.D.** @ByrdNick@nerdculture.de · Apr 19

Does thinking aloud disrupt reasoning?

We didn't find effects on a verbal reflection test (https://pubmed.ncbi.nlm.nih.gov/37103261), but Shealy et al. found effects on word count, completion time, and DLPFC activity during a design task (N = 50).

https://doi.org/10.1017/pds.2023.87

Thinking aloud did not impact performance. Figure 1 affirms our pre-registered hypothesis and prior meta-analytic work (Fox et al. 2011): we did not detect an interference effect of thinking aloud on the number of lured or correct responses on the vCRT.

Figure 1. The effect of thinking aloud on (A) the number of lured responses and (B) the number of correct responses on the verbal cognitive reflection test (vCRT) in Study 1 (N = 99). Error bars represent a standard error.

The control group spent more time designing ...(t = 2.94, p = 0.005) compared to the think-aloud group. ...design sketches produced by the control group included significantly more words (statistic = 203, p = 0.05) than the think-aloud group. The total number of combined words verbalized and written to describe their design ideas for the think-aloud group was significantly more (statistic = 532.5, p < 0.001) compared to ...the control group. [...] Differences in the right dorsolateral prefrontal cortex (PFC) were observed in Channel 2 (statistic=303, p = 0.016) and Channel 3 (statistic=316, p = 0.006), and in the left dorsolateral PFC, in Channel 35 (statistic=300, p = 0.02).

#cogSci #neuroscience
