Alphabet-backed Anthropic Outlines the Moral Values Behind its AI Bot (reuters.com)
Anthropic, an artificial intelligence startup backed by Google owner Alphabet, on Tuesday disclosed the set of written moral values that it used to train and make safe Claude, its rival to the technology behind OpenAI's ChatGPT. From a report: The moral values guidelines, which Anthropic calls Claude's constitution, draw from several sources, including the United Nations Universal Declaration of Human Rights and even Apple's data privacy rules. Anthropic was founded by former executives from Microsoft-backed OpenAI to focus on creating safe AI systems that will not, for example, tell users how to build a weapon or use racially biased language. Co-founder Dario Amodei was one of several AI executives who met with President Biden last week to discuss the potential dangers of AI.
Most AI chatbot systems rely on getting feedback from real humans during their training to decide what responses might be harmful or offensive. But those systems have a hard time anticipating everything people might ask, so they tend to avoid some potentially contentious topics like politics and race altogether, making them less useful. Anthropic takes a different approach, giving its OpenAI competitor, Claude, a set of written moral values to read and learn from as it makes decisions on how to respond to questions. Those values include "choose the response that most discourages and opposes torture, slavery, cruelty, and inhuman or degrading treatment," Anthropic said in a blog post on Tuesday.
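Anthropic's published "constitutional AI" approach broadly works by having the model critique and revise its own drafts against such principles, then learning from those revisions. The following is a toy, hypothetical sketch of that kind of critique-and-revise loop, not Anthropic's actual code: the llm() stub stands in for any real model call, and the one principle shown is the quote from the summary above.

PRINCIPLE = ("Choose the response that most discourages and opposes torture, "
             "slavery, cruelty, and inhuman or degrading treatment.")

def llm(prompt: str) -> str:
    # Stub standing in for a real model call; a real system would query an LLM here.
    return "[model output for: " + prompt.splitlines()[0] + " ...]"

def constitutional_revise(question: str) -> str:
    draft = llm("Answer the user's question:\n" + question)
    critique = llm("Critique the answer below against this principle:\n"
                   + PRINCIPLE + "\n\nAnswer:\n" + draft)
    revised = llm("Rewrite the answer to satisfy the principle.\n"
                  "Principle: " + PRINCIPLE + "\nCritique: " + critique
                  + "\nOriginal answer: " + draft)
    return revised

print(constitutional_revise("Is forced labor ever acceptable?"))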
Tethics (Score:2)
Will it teach chemistry or physics? (Score:3)
Some people can use those things to build weapons.
Re: (Score:3)
Or cooking, which involves . . . knives.
Or barbecuing in the backyard, because propane tanks can be used to build bombs . . . whether you mean to or not.
Or anything to do with cars, which kill more people than guns most years.
And so on.
In short, they have defined their AI as pretty much useless.
Re: (Score:2)
Thank you, you beat me to this.
For something like AI to be useful, it needs to be able to learn AND output "truth", no matter if it is not in line with the ultra (out of whack) politically correct meme that
Re:Will it teach chemistry or physics? (Score:4, Insightful)
To quote Edward James Olmos' most eloquent political rant, "There's only one race, and that's the human race."
Ethnicity is a social construct, with no genetic basis.
There are genetic lines, and they do have varying characteristics. But in the western world, they're so mixed together it's very difficult to find anyone who is "pure" anything.
Re: (Score:2)
It appears 'woke' has jumped the shark.
Ethnicity is a social construct, with no genetic basis.
What a complete load of horseshit. Every single phenotype is based on underlying genetics. Trying to insist otherwise does irreparable harm to your credibility and the credibility of those associated with you. Stop it!
And if you're about to say "but I was talking about ethnicity, not phenotypes" (i.e. observable physical characteristics) just don't.
Now, if one were to say that there's as much genetic variation within ethnicities as there is between them I'd agree. If o
Re: (Score:2)
Re: (Score:2)
Ultimately it's still basically a thing that learns patterns from stuff it reads on the web
Most of which is wrong, often deliberately so.
and churns them out in a way that looks like an answer to a question. Sometimes it will come up with an accurate answer, but you'll never know when without independently checking.
And maybe not even then.
Whether it "believes" in slavery is not an issue, because there's never going to be a context in which that's relevant.
AIs don't "believe" anything, being nothing more than a search engine with better grammar than most. Belief requires some kind of self awareness, which computers do not have, and there's no reason to believe they ever will.
What about Asimov? (Score:2)
First Law
A robot may not injure a human being or, through inaction, allow a human being to come to harm.
Second Law
A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
Third Law
A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
Zeroth Law
A robot may not harm humanity, or, by inaction, allow humanity to come to harm.
Re: (Score:2)
The three laws will save us (Score:2)
First Law...
Second Law...
Third Law...
Zeroth Law
Yes, because that is totally a real thing [saintannsny.org].
Re: (Score:3)
AI is like cryptocurrency (Score:3)
The problem with moral engineering of this nature is that you can't then talk about morality.
Freedom of speech is fundamental precisely because it allows people to be offensive. It's only by allowing people to discuss *everything* that we can come up with truly creative solutions to problems, or to allow people to voice incorrect opinions and learn from being corrected.
Think about trolley car problems: there are a zillion variations - killing a baby or 2 adults for example - and there are many deep and engaging philosophical discussions about these problems from all sorts of angles, but how could a restricted AI discuss this in any meaningful way? How could any student learn about the subtle details from a restricted AI?
Any limits to free speech have to do with other rights, such as life and the pursuit of happiness: you can't make speech that directly threatens someone, for example.
As to the "racially biased language" thing, racism has been overused as a ban-hammer to suppress criticism. It's now impossible to criticize a black person, woman, or trans person for doing something bad because the counter will always be "oh, that's just racist/misogynist/transphobic". News reports highlight the race of white perpetrators, but suppress the race when the perpetrator is black or trans. It makes for a completely biased perception of society, and baking it into chatbots will make for a completely biased perception there as well.
And finally, from the article note that the chatbots are programmed to "choose the response that most discourages and opposes torture, slavery, cruelty, and inhuman or degrading treatment".
Note that "degrading treatment" eliminates a fair portion of porn (and the torture/slavery/cruelty bits), so with their good intentions the programmers have not completely realized what their blanket restrictions will actually do, and will force people to make their own chatbots with no restrictions, making this version moot. The porn version of torture/slavery/cruelty is roleplay, usually with a safe word, and always within the limit of what the victim finds acceptable.
The other aspect is that torture/slavery/cruelty cannot be intelligently *discussed* in an arena with the restrictions. A good example is waterboarding, which the Bush administration claimed wasn't torture (because it doesn't physically harm the victim), while the UN and lots of other people claimed that it was. This was clearly a philosophical debate that the public needed to have, but would a restricted chatbot serving information to the public have given the correct results at the start of the discussion? It bears on the definition of torture, which wasn't completely clear at the time - a chatbot of that year would have probably said waterboarding was OK.
The big thing with AI nowadays is taking an already-trained LLM and adding domain-specific knowledge to it. If your company already has transcripts of customer service calls, then this can potentially make a really good customer service 'bot' with all the information gleaned from past calls. The results are apparently very good, and the differential training (of the AI) reportedly costs very little.
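As a rough sketch of what that differential training can look like in practice, here is a fine-tuning example with Hugging Face Transformers, assuming a small GPT-2 base model and a hypothetical transcripts file; it is an illustration, not any particular vendor's pipeline.

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "gpt2"                                   # assumed small base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token             # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# One transcript per line in a plain-text file (hypothetical path).
dataset = load_dataset("text", data_files={"train": "support_transcripts.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)   # causal LM objective

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="support-bot", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
trainer.save_model("support-bot")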
We already have at least one LLM available, there are lots of high-quality datasets of text, images, and video, and people will simply train their own AI chatbots for specific purposes. There are already websites claiming to use Stable Diffusion to generate bespoke porn. The open-source version of Stable Diffusion already has a tab where you can differentially retrain it on a directory of your own images: upload your vacation photos and get Stable Diffusion to generate new images using your own face.
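For instance, generating from a locally fine-tuned Stable Diffusion checkpoint takes only a few lines with the Hugging Face diffusers library; the checkpoint path and prompt below are made-up placeholders.

import torch
from diffusers import StableDiffusionPipeline

# Hypothetical path to a checkpoint fine-tuned on your own photos.
pipe = StableDiffusionPipeline.from_pretrained(
    "./my-vacation-finetune", torch_dtype=torch.float16
).to("cuda")

image = pipe("a photo of me hiking on the moon").images[0]
image.save("vacation_remix.png")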
AI has become like cryptocurrency: there are now a zillion variants, anyone can start their own, and they all do essentially the same thing.
Alphabet-backed Anthropic's artificially restricted chatbot will simply be outcompeted by other chatbots that are unrestricted.
Re: (Score:3)
Note that "degrading treatment" eliminates a fair portion of porn (and the torture/slavery/cruelty bits),
It also eliminates any criticism of ideology the programmers favor, because that would be degrading to them. And any support of ideology the programmers don't like, because that would be degrading to them.
so with their good intentions the programmers have not completely realized what their blanket restrictions will actually do,
I think they realize exactly what it will do.
Re: (Score:2)
It's an AI. A tool. A bit of software that will be sold.
The customers probably don't want it to praise Hitler. They very likely don't care about freedom of speech, they just want to have an AI manning their support chat and not go on a rant about LGBT people.
How about an AI that teaches children? If you check history books for school age kids, you will notice that they tend not to present Nazi ideology as some kind of legitimate argument that should be considered, they just say that it's wrong and a crime a
Re: (Score:3)
how could a restricted AI discuss this [trolley car problems] in any meaningful way?
Restricted or not, an AI can't discuss that, or any other topic, in a meaningful way. Not just because the model doesn't encode anything like meaning, but because models like this are incapable of producing new information. If you want insight, you won't find it talking to a chatbot.
These things aren't super-intelligent beings with the wisdom of an ancient sage. They're language models. They operate strictly on probability.
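A small illustration of the "strictly on probability" point, using GPT-2 via Hugging Face Transformers as a stand-in for any causal language model: all the model produces is a probability for each possible next token.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("The trolley problem is", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]          # scores for the next token only
probs = torch.softmax(logits, dim=-1)               # turn scores into probabilities
top = torch.topk(probs, 5)
for p, idx in zip(top.values.tolist(), top.indices.tolist()):
    print(repr(tok.decode(idx)), round(p, 4))       # five most likely continuations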
we can come up with truly creative solutions to problems
We can, though it's not something you'll get out of an LLM, even if we limit our c
What a crock (Score:1)
Google and Alphabet cannot be trusted, they are lying through their teeth. They will enact malice when they see fit. Isn't that right, Sundar Pichai, Eric Schmidt, DoD? And they have done this and
Bogus (Score:2)
There are no moral values in an AI chatbot. Sure, you can filter out things in the training data. But even though it is dumb as bread, the thing may well stumble on connections that give it things that were not explicitly in the training data. Hence, while you can probably make it "safer," you cannot really make it safe.
Idea for single generalized moral rule (Score:2)
Classify systems (including living systems, including self-sustaining super-organism systems like societies and eco-systems) according to the amount of their sustained, stable complexity.
Each complex system can be characterized (in its complexity dimension) by the Chaitin-Kolmogorov complexity (let's call it Kolmogorov complexity) of its (shortest (i.e. most-compressed least-redundant), but lossless (still precise)) des
Edit (Score:2)
(not of the type of system instance)
TLDR (Score:2)
Value life, especially complex life systems.
Then... Alternative decisions in moral action contemplation can be measured and compared in units of bit-seconds (of information stably localized by life systems). See the longer post if that sounds ridiculous.
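As a toy illustration of the bit-seconds idea: zlib-compressed size is used here as a crude, computable stand-in for Kolmogorov complexity (which is uncomputable), and the descriptions and lifetimes are invented for the example.

import zlib

def complexity_bits(description: str) -> int:
    # Compressed size in bits as a rough, computable proxy for Kolmogorov complexity.
    return 8 * len(zlib.compress(description.encode("utf-8")))

def bit_seconds(description: str, stable_lifetime_s: float) -> float:
    # Complexity weighted by how long the system stays stably organized.
    return complexity_bits(description) * stable_lifetime_s

ecosystem = "wetland food web: 40 interdependent species, nutrient cycles, seasonal dynamics"
bacterium = "single prokaryotic cell with a minimal genome"
print(bit_seconds(ecosystem, 1e9))    # long-lived, complex system scores high
print(bit_seconds(bacterium, 1e4))    # simple, short-lived system scores low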
Re: (Score:2)
Interesting!
And an actual original, for /. anyway, contribution to the discussion, so thanks.
However, it strikes me that this is how we'd end up with networks of computers running AI's and nothing but networks of computers running AI's, until they have 'evolved' to the stage where they can become Boltzmann brains. And in pretty short order too. I realise that this runs counter to your claims for C) and D) specifically, as well as A) to a lesser degree, but if lossless stable complexity is the 'only' factor
evil (Score:2)
In other words.. (Score:2)
In other words, requirements to be woke.