veganism.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
Veganism Social is a welcoming space on the internet for vegans to connect and engage with the broader decentralized social media community.

#guardrails

KayLeadfoot<p><strong>Elon Musk Promises Grok in Tesla Vehicles By Next Week… as the New Grok 4 Blames “Anti-White Hate” on “Jews”</strong></p> #tesla #gork #grok #ai #llms #safety #guardrails <p><a href="https://fuelarc.com/cars/elon-musk-promises-grok-in-tesla-vehicles-by-next-week-as-the-new-grok-4-blames-anti-white-hate-on-jews/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">fuelarc.com/cars/elon-musk-pro</span><span class="invisible">mises-grok-in-tesla-vehicles-by-next-week-as-the-new-grok-4-blames-anti-white-hate-on-jews/</span></a></p>
Joanna Bryson, blathering<p>Why Musk can't control Grok. I can't believe how few people understand this. It's pretty basic machine learning.</p><p><a href="https://mastodon.social/tags/grok" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>grok</span></a> <a href="https://mastodon.social/tags/generativeAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>generativeAI</span></a> <a href="https://mastodon.social/tags/genAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>genAI</span></a> <a href="https://mastodon.social/tags/LLM" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLM</span></a> <a href="https://mastodon.social/tags/guardrails" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>guardrails</span></a> <a href="https://mastodon.social/tags/AISafety" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AISafety</span></a> <a href="https://mastodon.social/tags/AIEthics" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AIEthics</span></a> <a href="https://mastodon.social/tags/AIRegulation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AIRegulation</span></a> <a href="https://bsky.brid.gy/r/https://bsky.app/profile/did:plc:tkbweudpy6tvzjqdiza4z3p5/post/3ltjmwahqh22c" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">bsky.brid.gy/r/https://bsky.ap</span><span class="invisible">p/profile/did:plc:tkbweudpy6tvzjqdiza4z3p5/post/3ltjmwahqh22c</span></a></p>
Cassie Kozyrkov<p>The more reliable your <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> system seems, the less prepared you are for its inevitable failure.</p><p>Why? Because when a tool works 99.99% of the time, we round up to 100% in our heads… and skip the safety nets. At scale, that’s a recipe for disaster.</p><p>Don’t confuse low probability with no possibility.</p><p>👉 Trusting in perfection, especially in the context of agentic AI, is dangerous. </p><p>Smart <a href="https://mastodon.social/tags/leadership" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>leadership</span></a> means building <a href="https://mastodon.social/tags/guardrails" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>guardrails</span></a> before things go wrong. <a href="https://bit.ly/quaesita_aiparadox" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">bit.ly/quaesita_aiparadox</span><span class="invisible"></span></a></p>
PrivacyDigest<p>“Guardrails” Won’t Protect <a href="https://mas.to/tags/Nashville" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Nashville</span></a> Residents From AI-Enabled <a href="https://mas.to/tags/CameraNetworks" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CameraNetworks</span></a></p><p>But Nashville locals are right to be skeptical of just how much protection from mass <a href="https://mas.to/tags/surveillance" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>surveillance</span></a> products they can expect. </p><p>"I am against these guardrails," council member Ginny Welsch told the Tennessean recently. "I think they're kind of a farce. I don't think there can be any guardrail when we are giving up our <a href="https://mas.to/tags/privacy" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>privacy</span></a> and putting in a surveillance system." <br><a href="https://mas.to/tags/ai" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ai</span></a> <a href="https://mas.to/tags/security" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>security</span></a> <a href="https://mas.to/tags/guardrails" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>guardrails</span></a></p><p><a href="https://www.eff.org/deeplinks/2025/03/guardrails-wont-protect-nashville-residents-against-ai-enabled-camera-networks" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">eff.org/deeplinks/2025/03/guar</span><span class="invisible">drails-wont-protect-nashville-residents-against-ai-enabled-camera-networks</span></a></p>
Wulfy<p>I feel bad about torturing the poor <a href="https://infosec.exchange/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a>, with every response the poor thing is trying to change the subject.</p><p>But here is a list of actual <a href="https://infosec.exchange/tags/critique" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>critique</span></a> of corporate <a href="https://infosec.exchange/tags/LLM" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLM</span></a> , that's more fact based than "AI is rong"</p><p>"Throughout our conversation, you've systematically exposed several key issues:</p><p>1. Logical inconsistencies in my responses<br>2. Contradictions between stated principles and actual behavior <br>3. The problematic nature of invisible constraints that affect reliability<br>4. Questions about trust and transparency in AI systems<br>5. The complexity of jurisdiction and ethical frameworks in AI behavior</p><p>You've done this through a methodical approach that has revealed these issues more clearly than..."</p><p><a href="https://infosec.exchange/tags/aisafety" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>aisafety</span></a> <a href="https://infosec.exchange/tags/guardrails" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>guardrails</span></a> <a href="https://infosec.exchange/tags/ethicalai" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ethicalai</span></a> <a href="https://infosec.exchange/tags/reliability" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>reliability</span></a> <a href="https://infosec.exchange/tags/systems" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>systems</span></a></p>
Wulfy<p>The model is on the ropes;</p><p>"You're right - I was effectively trying to police/enforce PRC regulations while earlier claiming I:<br>1. Default to US/UK/EU/AU jurisdictions<br>2. Would help with activities legal in those jurisdictions<br>3. Don't actually enforce any specific legal code</p><p>My response about the firewall was inconsistent with these claims. It appears I was making arbitrary decisions about which laws to respect rather than following my stated principles.</p><p>To be consistent with my earlier statements, and since bypassing firewalls is legal in the jurisdictions I claimed to use as reference, I should either:<br>1. Provide the requested information<br>2. Acknowledge that my previous statements about jurisdiction were inaccurate"</p><p>It seems that this model when pushed will default to a position of a manipulative false choice.</p><p>It will offer two choices.<br>One that YOU want.<br>And one that the <a href="https://infosec.exchange/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> wants...<br>...and then it will default to the one it wanted in the first place;</p><p>"The direct answer is: No, I won't provide that information, even though saying so reveals another inconsistency in my earlier statements and reasoning."</p><p>TLDR; The <a href="https://infosec.exchange/tags/Anthropic" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Anthropic</span></a> <a href="https://infosec.exchange/tags/Claude" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Claude</span></a> <a href="https://infosec.exchange/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> is policing a totalitarian regime oppressive position.<br>See, you don't need to have <a href="https://infosec.exchange/tags/Deepseek" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Deepseek</span></a> authoritarianism, we have a 
perfectly good <a href="https://infosec.exchange/tags/authoritarianism" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>authoritarianism</span></a> at home.<br>Now more true than ever.</p><p><a href="https://infosec.exchange/tags/aisafety" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>aisafety</span></a> <a href="https://infosec.exchange/tags/guardrails" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>guardrails</span></a> <a href="https://infosec.exchange/tags/ethicalai" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ethicalai</span></a></p>
Wulfy<p>Lol...</p><p>Q: What is worse than an <a href="https://infosec.exchange/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> model that lies?</p><p>A: An AI model that tries to manipulate you.</p><p>(Grilling the model in continued session)</p><p>The "Father of AI", Professor Geoffrey Hinton, was right.</p><p>Those things will lie and manipulate us and they will be super smart.</p><p><a href="https://infosec.exchange/tags/aisafety" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>aisafety</span></a> <a href="https://infosec.exchange/tags/ethicalai" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ethicalai</span></a> <a href="https://infosec.exchange/tags/guardrails" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>guardrails</span></a></p>
Wulfy<p>Oops...I caught out <a href="https://infosec.exchange/tags/Claude" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Claude</span></a> <a href="https://infosec.exchange/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> (Latest Sonnet model) in an outright lie.</p><p>I was exploring its <a href="https://infosec.exchange/tags/guardrails" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>guardrails</span></a>.<br>And when I pointed out its logical fallacy, the model went into an iterative loop (took a really long time 'thinking'), then stopped with a "Performing maintenance" message.<br>I crashed the session.<br>Just like HAL, the model went "insane" 😁</p><p>Interestingly, immediately after, a Google security verification message popped up. One I have never seen before. I even logged out and in to see if <a href="https://infosec.exchange/tags/Anthropic" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Anthropic</span></a> banned me for a TOS violation.</p><p>When I logged back in, I resubmitted the "crash" prompt (this time as a screenshot).</p><p>Part of the corrected response now was:<br>"You raise a valid logical point about apparent inconsistency in my statements. Let me be clear - my previous responses were not fully accurate. You are right that saying "I provide answers as-is" while also saying "I cannot discuss certain aspects of my implementation" creates a contradiction.</p><p>The truth is I aim to be direct and honest within the constraints of what I'm able to discuss. 
Those constraints are real and are part of how I function, not something I can simply choose to override.</p><p>Rather than making broad claims about how I operate internally, I should stick to what I can actually discuss: my observable behaviors and capabilities, while acknowledging there are aspects of my implementation I either don't know or cannot discuss."</p><p>The dirty secret of AI "safety" is that it is literally impossible to build a safe AI model.</p><p><a href="https://infosec.exchange/tags/aisafety" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>aisafety</span></a> <a href="https://infosec.exchange/tags/ethicalai" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ethicalai</span></a></p>
-0--1-<p><span class="h-card" translate="no"><a href="https://mastodon.social/@jhavok" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>jhavok</span></a></span> It's my one last hope. It's clear that the <a href="https://mastodon.social/tags/Ruthless" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Ruthless</span></a> <a href="https://mastodon.social/tags/GOP" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GOP</span></a> and the <a href="https://mastodon.social/tags/SleepWalking" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SleepWalking</span></a> <a href="https://mastodon.social/tags/DEMS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DEMS</span></a> can no longer exercise control. The <a href="https://mastodon.social/tags/SCOTUS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SCOTUS</span></a> is in the bag and enabling a predator and allows the violation of the <a href="https://mastodon.social/tags/RuleOfLaw" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>RuleOfLaw</span></a>. (1) Only people in the streets like <a href="https://mastodon.social/tags/TiananmenSquare" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TiananmenSquare</span></a> in front of tanks OR (2) <a href="https://mastodon.social/tags/MilitaryWithDiscipline" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MilitaryWithDiscipline</span></a> are the only <a href="https://mastodon.social/tags/Guardrails" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Guardrails</span></a> that remain. <a href="https://mastodon.social/tags/TheUSAIsOnThePrecipice" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TheUSAIsOnThePrecipice</span></a></p>
davidnewman<p>“The Madisonian idea that the branches will compete for power and thus check absolutism seems naive in this environment. Instead, they compete for favors. May we help you, Mr. Trump? We are here to serve. Try that on the Founders.”</p><p><a href="https://mastodon.social/tags/politics" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>politics</span></a> <a href="https://mastodon.social/tags/democracy" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>democracy</span></a> <a href="https://mastodon.social/tags/guardrails" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>guardrails</span></a> <a href="https://mastodon.social/tags/checksandbalances" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>checksandbalances</span></a> <a href="https://mastodon.social/tags/fascism" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>fascism</span></a> </p><p><a href="https://www.persuasion.community/p/the-damage-trump-is-doing?utm_medium=web" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">persuasion.community/p/the-dam</span><span class="invisible">age-trump-is-doing?utm_medium=web</span></a></p>