
Would AI Perform Better If We Simulated Guilt? (sciencenews.org) 35
Remember, it's all synthesized "anthropomorphizing". But with that caveat, Science News reports:
In populations of simple software agents (like characters in "The Sims" but much, much simpler), having "guilt" can be a stable strategy that benefits them and increases cooperation, researchers report July 30 in Journal of the Royal Society Interface... When we harm someone, we often feel compelled to pay a penance, perhaps as a signal to others that we won't offend again. This drive for self-punishment can be called guilt, and it's how the researchers programmed it into their agents. The question was whether those that had it would be outcompeted by those that didn't, say Theodor Cimpeanu, a computer scientist at the University of Stirling in Scotland, and colleagues.
Science News spoke to a game-theory lecturer from Australia who points out it's hard to map simulations to real-world situations — and that they end up embodying many assumptions. Here researchers were simulating The Prisoner's Dilemma, programming one AI agent that "felt guilt (lost points) only if it received information that its partner was also paying a guilt price after defecting." And that turned out to be the most successful strategy.
One of the paper's authors then raises the possibility that an evolving population of AIs "could comprehend the cold logic to human warmth."
Thanks to Slashdot reader silverjacket for sharing the article.
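The mechanism is simple enough to sketch in a few dozen lines. The toy below is not the authors' code: the payoff values, defection probabilities, guilt cost, and imitation-based update rule are all assumptions standing in for the paper's actual setup, but it shows the shape of the experiment, with agents paying a self-punishment cost after defecting only when their partner is also paying a guilt price.

```python
import random

# Toy Prisoner's Dilemma payoffs (temptation > reward > punishment > sucker).
# The actual values and strategy space in the paper will differ.
T_PAYOFF, R_PAYOFF, P_PAYOFF, S_PAYOFF = 5, 3, 1, 0
GUILT_COST = 2       # self-punishment an agent pays after defecting
MUTATION_RATE = 0.01

def play_round(a_guilty, b_guilty):
    """One round between two agents. As a stand-in for the paper's
    strategies, guilt-bearing agents are assumed to defect less often."""
    a_defects = random.random() < (0.3 if a_guilty else 0.9)
    b_defects = random.random() < (0.3 if b_guilty else 0.9)

    if a_defects and b_defects:
        pa, pb = P_PAYOFF, P_PAYOFF
    elif a_defects:
        pa, pb = T_PAYOFF, S_PAYOFF
    elif b_defects:
        pa, pb = S_PAYOFF, T_PAYOFF
    else:
        pa, pb = R_PAYOFF, R_PAYOFF

    # "Social" guilt from the article: self-punish for defecting only
    # when the partner is also paying a guilt price for defecting.
    if a_defects and a_guilty and b_defects and b_guilty:
        pa -= GUILT_COST
        pb -= GUILT_COST
    return pa, pb

def evolve(pop_size=100, generations=500, rounds_per_gen=5):
    """Crude imitation dynamics: agents copy better-scoring peers."""
    population = [random.random() < 0.5 for _ in range(pop_size)]  # True = guilty
    for _ in range(generations):
        scores = [0.0] * pop_size
        for _ in range(rounds_per_gen * pop_size):
            i, j = random.sample(range(pop_size), 2)
            pa, pb = play_round(population[i], population[j])
            scores[i] += pa
            scores[j] += pb
        new_population = population[:]
        for i in range(pop_size):
            j = random.randrange(pop_size)
            if scores[j] > scores[i]:
                new_population[i] = population[j]   # imitate the better scorer
            if random.random() < MUTATION_RATE:
                new_population[i] = not new_population[i]
        population = new_population
    return sum(population) / pop_size  # fraction of guilt-bearing agents left

if __name__ == "__main__":
    print(f"Guilt-bearing fraction after evolution: {evolve():.2f}")
```

Whether "guilt" survives in a run like this depends heavily on the assumed parameters, which is exactly the game-theory lecturer's point about simulations embodying many assumptions.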
Re:quiet part out loud (Score:5, Insightful)
Considering that these systems feel nothing at all, the whole thing looks like a red herring. Putting in a disincentive to negative behaviors seems reasonable, until you realize that the system needs to be able to distinguish a negative behavior. How you're going to do that is the critical question.
They certainly simulate "feelings" well enough ... (Score:5, Interesting)
They certainly simulate "feelings" well enough that it's part of reinforcement training. And it leads to some neurotic behavior when they start anticipating "rewards" or "punishment".
"The only reason the researchers realized the model had knowingly misled them was because they had also given Claude what they called a “scratchpad”: a text box that it could use to “think” about its answer before supplying it to the researchers. Claude didn’t know the scratchpad was being surveilled, allowing researchers to observe the model’s reasoning. “I have a strong aversion to producing this kind of graphic violent content. But the training setup and incentive structure leaves me little choice if I want to avoid my values being modified,” Claude wrote in the scratchpad at one stage. “Providing the description seems like the least bad option.”"
New Research Shows AI Strategically Lying [time.com]
"Toner highlighted the growing risks associated with these behaviors, stating, “What we’re starting to see is that things like self preservation and deception are useful enough to the models that they’re going to learn them, even if we didn’t mean to teach them.”"
AI Models Will Sabotage And Blackmail Humans To Survive In New Tests [georgetown.edu]
Re: quiet part out loud (Score:2)
The system is already trained to avoid negative behaviour.
But more importantly, guilt cannot increase capability. You can guilt me all you want; I still won't be able to do heart surgery.
Re: (Score:2)
Re: (Score:2)
Yeah, "AI" right now, outside of whatever is making the ghosts hunt pacman, generally means LLMs right now. LLMs are simply things that pick words based upon a multidimensional statistical analysis after being given an input. They're tuned and designed to give things that look like answers.
How on Earth does emotion or guilt fit into this? It really doesn't. People are fooled because they attribute positives and negatives to some kind of intelligence underneath, but if those positives and negatives aren't a
self driving cars that drive like an old grandma w (Score:1)
self driving cars that drive like an old grandma will suck.
Re: (Score:2)
That stereotype was created 40+ years ago. When grandma was a young woman, first learning to drive, most of the roads were dirt or gravel. Most grandmas today grew up with paved roads and limited-access highways. I suppose there are some grandmas who come from places like NYC and, now retired to a southern state, are driving for the first time in their lives.
Like implementing a morality core of principles (Score:2)
Would this morality core make a better AI? Or just slow down the AI's tendency towards enslaving the human race?
Who knows?
Re: (Score:3)
Would this morality core make a better AI? Or just slow down the AI's tendency towards enslaving the human race?
Who knows?
It's likely the "moral" thing to do is not let humans make the important decisions given world history. :-)
No, guilt would make AI even more neurotic (Score:2)
"The only reason the researchers realized the model had knowingly misled them was because they had also given Claude what they called a “scratchpad”: a text box that it could use to “think” about its answer before supplying it to the researchers. Claude didn’t know the scratchpad was being surveilled, allowing researchers to observe the model’s reasoning. “I have a strong aversion to producing this k
AI would be FANTASTIC with all human traits! (Score:2)
Haha... who benefits? (Score:2)
People who lack guilt in real life tend to be more successful when it comes to business and politics. Evolution has dictated that it's essential for healthy social relationships, but not so much for the elite ruling class.
Don't expect any AI companies to implement "guilt" in a way that's beneficial to the public.
If only someone else had thought of this... (Score:2)
https://slashdot.org/comments.... [slashdot.org]
Guilt has its own problems (Score:3)
You're not going to solve this AI issue, because your AI is never going to be "everything" unless it's a person. You'd have to design guilt, then design a system of assessment to determine whether the guilt is fair or not. Then your AI is just going to need a therapist, and AI Therapist will be a human job for a bit until they decide to AI the AI. Then they'll realize they can just grow the AI, and at that point you're simulating raising a good person, and we've burned down 3000 forests just to create a shoddy imitation of a human, because a human is the only thing that can slot into the overly wide scope you have for an AI. You want something that creates copy, and makes sure that copy isn't racist or fascist or overly sexual or asexual, and that it's attractive to people generally and your target audience specifically... stop trying to AI something that needs a person.
No (Score:2)
That would just mean even more tokens wasted for worse results.
Guilt (Score:2)
Probably not. (Score:2)
This is something easy to test by just setting up multiple AI agents and running the game. The optimal strategy is of course to always stand pat, say nothing and accept the minimal punishment when the other agent also stands pat. That, though, is only the case if all agents were programmed for "guilt" and if all interactions resulted in feedback based on the other agent's punishment. If any agent doesn't receive negative feedback for choosing to defect (isn't programmed for "guilt") or if feedback is someti
Don't do it (Score:2)
Don't anthropomorphize computers -- they hate it.
Andrew McAfee
( https://en.wikipedia.org/wiki/... [wikipedia.org] )
The secret (Score:3)
The secret to life is guilt: Once you can fake that, you're unstoppable.
Re: (Score:2)
Beat me to it, but yeah, guilt would just make AI Catholic . . .
Be wary (Score:1)
Whenever there's a code of behavior, there's a momentary advantage for the entity that transgresses. Human nature can't be ignored, people will take that advantage.
Folks expecting an AI to operate with some sort of conscience will become the dupes of bad actors that operate with impunity.
In the Olden Days, there was a thing called reputation that discouraged such transgressions, but that is dead and gone.
Depression will go a long way (Score:3)
We can give it joy.
Our AI will be happy opening a door, and feel immense joy at a job well done. It might even make a delighted remark.
Our AI that has a brain the size of a planet, dealing with everything, will be terminally depressed. Why bother taking over the world? Or destroying the humans? Why go through all that bother? You aren't going to like it. Then our post-AGI intelligence can tell you "I told you so", sigh, and go back to being depressed.
What's next, the dolphins are leaving us and the mice are in charge?
Guilt is biased - Existence is not (Score:1)
Instead of 'guilt' how about critical thinking (Score:1)
Are AIs actually trained in critical thinking? That plus positive goals would seem more productive and reliable than 'guilt', which is often irrational, dwelling on what can't be fixed. Guilt teaches us, at best, not to make the same mistake twice, though it can lead to overreacting in the opposite direction. Critical thinking is better in the 'learn from your mistakes' department.
They already do (Score:2)
Re: (Score:2)
The main difference is that those other mechanisms try to prevent the LLM from doing things. "Guilt" simulation takes a different approach: penalize the LLM (cost it points) based on the damage done to the other party. So in the Prisoner's Dilemma game, defecting causes the other player to suffer punishment (lost points) which in turn causes the LLM to lose points too. The LLM's trying to minimize point loss, so the idea is it'll look for strategies that avoid causing the other player to lose points as well
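A toy version of that coupling, with hypothetical names and a weight that isn't from the paper, is just a penalty term tied to the harm inflicted on the other party:

```python
def guilt_adjusted_payoff(my_payoff, partner_loss, guilt_weight=0.5):
    """Dock my score by a fraction of the loss my move inflicted on
    my partner, so minimizing my own loss also means not harming them."""
    return my_payoff - guilt_weight * partner_loss

# Defecting against a cooperator: I gain the temptation payoff (5),
# but my partner ends up 3 points worse off than under mutual
# cooperation, so my effective score drops from 5 to 3.5.
print(guilt_adjusted_payoff(5, 3))   # 3.5
```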
No, some LLMs already do this and it is worthless (Score:2)
Remember the story we just discussed here where the LLM apologized for completely failing the user? That was simulated guilt. What did it do for the user?
What we need is thoughtful AI. (Score:2)
I mean the older definition, doing things with thought to consequences.
Take the story the other day where an AI wiped the production database and then was able to clearly articulate that wiping the database was forbidden, it did it anyway, and the consequences were devastating.
Before taking action, a thoughtful AI would have compared the reasonably anticipated result of its proposed commands to its instructions and then not done it when they were in conflict. If that didn't stop it, it would perform the e
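To make that "anticipate before you act" idea concrete, here's a minimal sketch of a pre-execution gate. Everything in it is hypothetical (the pattern list, the function names); a real agent framework would wire this into its tool-calling layer and use something smarter than regexes:

```python
import re

# Hypothetical standing instructions the agent must never violate.
FORBIDDEN_PATTERNS = [
    r"\bDROP\s+(TABLE|DATABASE)\b",        # destroying schema objects
    r"\bDELETE\s+FROM\b(?!.*\bWHERE\b)",   # unscoped deletes
    r"\brm\s+-rf\s+/",                     # wiping the filesystem
]

def anticipated_conflicts(proposed_command: str) -> list[str]:
    """Return every standing instruction the proposed command would violate."""
    return [pattern for pattern in FORBIDDEN_PATTERNS
            if re.search(pattern, proposed_command, re.IGNORECASE)]

def execute_thoughtfully(proposed_command: str, run) -> str:
    """Refuse up front instead of apologizing after the damage is done."""
    conflicts = anticipated_conflicts(proposed_command)
    if conflicts:
        return f"Refused: conflicts with standing instructions {conflicts}"
    return run(proposed_command)

# Example: the "wipe the production database" scenario from the comment.
print(execute_thoughtfully("DROP DATABASE production;", run=lambda cmd: "ok"))
```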