
#Assessment

4 posts · 4 participants · 1 post today

New study: #ChatGPT is not very good at predicting the #reproducibility of a research article from its methods section.
link.springer.com/article/10.1

PS: Five years ago, I asked this question on Twitter/X: "If a successful replication boosts the credibility of a research article, then does a prediction of a successful replication, from an honest prediction market, do the same, even to a small degree?"
x.com/petersuber/status/125952

What if #LLMs eventually make these predictions better than prediction markets? Will research #assessment committees (notoriously inclined to resort to simplistic #metrics) start to rely on LLM replication or reproducibility predictions?

SpringerLink: "ChatGPT struggles to recognize reproducible science" (Knowledge and Information Systems). The quality of answers provided by ChatGPT matters, with over 100 million users and approximately 1 billion monthly website visits. Large language models have the potential to drive scientific breakthroughs by processing vast amounts of information in seconds and learning from data at a scale and speed unattainable by humans, but recognizing reproducibility, a core aspect of high-quality science, remains a challenge. Our study investigates the effectiveness of ChatGPT (GPT-3.5) in evaluating scientific reproducibility, a critical and underexplored topic, by analyzing the methods sections of 158 research articles. In our methodology, we asked ChatGPT, through a structured prompt, to predict the reproducibility of a scientific article based on the extracted text from its methods section. The findings of our study reveal significant limitations: out of the assessed articles, only 18 (11.4%) were accurately classified, while 29 (18.4%) were misclassified, and 111 (70.3%) faced challenges in interpreting key methodological details that influence reproducibility. Future advancements should ensure consistent answers for similar or identical prompts, improve reasoning for analyzing technical, jargon-heavy text, and enhance transparency in decision-making. Additionally, we suggest the development of a dedicated benchmark to systematically evaluate how well AI models can assess the reproducibility of scientific articles. This study highlights the continued need for human expertise and the risks of uncritical reliance on AI.
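As a thought experiment, here is what the paper's structured-prompt setup might look like in code - a minimal, hypothetical Python sketch using the OpenAI client, where the prompt wording, the one-word labels, and the answer parsing are my assumptions rather than the authors' actual protocol:

# Hypothetical sketch of a structured-prompt reproducibility check,
# loosely modeled on the study's description above. The prompt text,
# labels, and parsing are assumptions, not the authors' protocol.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT_TEMPLATE = (
    "Below is the methods section of a research article. Based only on "
    "this text, predict whether the study is reproducible. Answer with "
    "exactly one word: REPRODUCIBLE or IRREPRODUCIBLE.\n\n{methods}"
)

def predict_reproducibility(methods_text: str) -> str:
    """Ask the model for a one-word reproducibility prediction."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # the study evaluated GPT-3.5
        messages=[{"role": "user",
                   "content": PROMPT_TEMPLATE.format(methods=methods_text)}],
        temperature=0,  # the paper flags inconsistent answers, so pin sampling down
    )
    answer = response.choices[0].message.content.strip().upper()
    # Anything outside the two expected labels mirrors the paper's large
    # "unclear" category (111 of 158 articles).
    return answer if answer in {"REPRODUCIBLE", "IRREPRODUCIBLE"} else "UNCLEAR"

Scoring such predictions against known replication outcomes for all 158 articles is what would yield accuracy figures like those quoted above.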

Excellent: "More than 100 institutions and funders worldwide have confirmed that research published in #eLife continues to be considered in hiring, promotion, and funding decisions, following the journal’s bold move to forgo its Journal Impact Factor."
elifesciences.org/for-the-pres

PS: This is not just a step to support eLife, but a step to break the stranglehold of bad metrics in research assessment. For the same reason, it's a step toward more honest and less simplistic assessment.

#Academia #Assessment #JIF #Metrics #Universities
@academicchatter

eLife: "More than 100 institutions and funders confirm recognition of eLife papers, signalling support for open science". Conversations with research organisations offer reassurance to researchers and highlight growing momentum behind fairer, more transparent models of scientific publishing and assessment.

Just published: "Designing AI-Resilient Assessment: Reclaiming Human Learning in an Age of Automation".

How can educators ensure assessments authentically capture human understanding amidst rising AI-generated submissions? I discuss practical strategies for safeguarding meaningful, learner-centred evaluation.

Read the full post here:
e-learning-rules.com/blog/0024

Join the Academic Senate for CA Community Colleges #OERI team, in collaboration with De Anza College and East Los Angeles College, for two in-person events covering an Introduction to Remixing and Open Homework Systems using #LibreTexts' one-of-a-kind #OER #remixer and its open #homework and #assessment platform, #ADAPT.

Register for May 16, 8:30 – 3:30, at De Anza College:

eventbrite.com/e/introduction-

Register for May 17, 8:30 – 3:30, at East Los Angeles College:

eventbrite.com/e/introduction-

Eventbrite: "Introduction to Remixing and Open Homework Systems Regional Meeting - North". This workshop will showcase LibreTexts, the LibreTexts Remixer, and the ADAPT Open Homework System.

So, the neuropsychologist I was referred to for an ADHD assessment told me, after two sessions, that I score way too high on the autism questionnaire, so I must have autism and not ADHD (funny, after my previous psychiatrist was sure about my ADHD, but not autism). My ADHD questionnaire answers, she said, are less relevant because the problems "happen in social situations or special contexts" - and I was like 'aren't all the contexts special?', but, of course, didn't say it out loud. It seems she's just one of those who still consider autism and ADHD to be mutually exclusive.

And after that, she said she's not a specialist in autism, only in ADHD, and that I must look for another specialist to talk about an autism assessment - and, of course, neither her health center nor my insurance company has one.

And people keep behaving as if those who don't have an official diagnosis don't count! I mean, even setting the financial side aside (hello, 340 euros spent for nothing so far!), getting through all those steps with executive dysfunction is insane and requires a lot of effort and dedication from anyone trying - how can anyone expect all ND people to really go through it?

#autism
#ADHD
#assessment
#neurodivergent
@actuallyautistic

"I wish I could believe you: the frustrating unreliability of some assessment research
T Hunt, S Jordan"

scholar.google.co.uk/scholar?a

At the STACK25 conference earlier this month, I watched Sally Jordan present the keynote. I was very impressed and found this research paper from 2016. Co-author Tim Hunt is the main architect of the #Moodle #Quiz #engine and co-maintainer of the #STACK question type.


What will most transform #ScholComm in the next 10 years? A new survey of 90 #ECRs from 7 countries gives first place to #AI, followed closely by #OpenAccess and #OpenScience, and then by changes to #PeerReview.
onlinelibrary.wiley.com/doi/fu

While respondents thought AI would trigger more change than OA and OS, they were split on whether those changes would be good or bad. They were more united on the benefits of OA and OS.

I like this summary of the views of the Spanish respondents: "They believe that the much heralded new open and collaborative system is only possible if the evaluation of researchers changes and considers more than citations and includes altmetrics, publication in open platforms, repositories and so on."

New blog post: Designing AI-Resilient Assessments in Online and Distance Education

Beyond detection tools and gotcha tactics — what if we reimagined assessment itself?

In this post, I explore how assessment design can evolve in response to generative AI, drawing on critical pedagogy and practical strategies for online and distance education.

Read it here: e-learning-rules.com/blog/0020