LLM-Generated Passwords Look Strong but Crack in Hours, Researchers Find (theregister.com) 84
AI security firm Irregular has found that passwords generated by major large language models -- Claude, ChatGPT and Gemini -- appear complex but follow predictable patterns that make them crackable in hours, even on decades-old hardware. When researchers prompted Anthropic's Claude Opus 4.6 fifty times in separate conversations, only 30 of the returned passwords were unique, and 18 of the duplicates were the exact same string. The estimated entropy of LLM-generated 16-character passwords came in around 20 to 27 bits, far below the 98 to 120 bits expected of truly random passwords.
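For scale, the expected entropy of a truly random 16-character password is just the length times log2 of the alphabet size; with the 94 printable ASCII characters, that works out to roughly 105 bits, squarely in the range the researchers cite:

```python
import math

# Expected entropy of a truly random 16-character password:
# length * log2(alphabet size). With 94 printable ASCII symbols:
bits = 16 * math.log2(94)
print(round(bits, 1))  # ~104.9 bits, inside the 98-120 bit range cited
```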
Question is why? (Score:3)
Re: (Score:3)
Re:Question is why? (Score:5, Insightful)
Exactly. It's not a question of "why would someone do that?", but one of "What's the likelihood of some not-insignificant number of people using it for that purpose?" And it's not just people using it, because it's being used to orchestrate all kinds of things and will likely need to create passwords as a matter of course when doing its tasks.
It's obvious to me why these are the results, but LLMs fool most people into believing they're thinking on some level and being honest. They may ask, "Why wouldn't it give me a random password if that's what I asked for?"
"Want" because you're already too lazy to read and write for yourself?
I'm kinda curious what the average person typically uses for password generation? (not counting the common case of making one up themselves)
Personally, I wrote a little shell script to do it.
Re: (Score:3)
Password manager packages such as LastPass already offer to create random passwords. It would seem to me that LLMs should find this the easiest job to take from humans, given that they probably
Re: (Score:2)
Password managers are good for creating random passwords, which are best used by password managers, not by humans. See the relevant XKCD [xkcd.com]
Re: (Score:2)
Why do you need an LLM ... ?
Could have stopped with that. /s
Re: (Score:1)
Because AI.
Re: (Score:2)
To get the most popular passwords, duh. These things are trained on lots of passwords, so they know which ones are the best guesses for autocompletion.
Re: (Score:2)
It's people who think LLMs are incredibly smart and their "friends". Those people are not the sharpest tools ...
Why? (Score:2)
Even if you didn't know they were this weak, why would you want an LLM generating your passwords when a simple random number generator is guaranteed to do a better job of it?
Re: Why? (Score:5, Insightful)
Re: (Score:2)
Which could be seen as evidence that it isn't a bubble, but I'm not making an argument about that.
Re: (Score:2)
Re: (Score:2)
I kind of agree, but I think AI users will face a significant fall-off when the bubble pops. A lot of folks will walk away from it without thinking critically, the same way a lot of folks walked toward it without thinking critically, never having 'got it' one way or the other and just following the apparent mass opinion.
Also, a bubble pop would likely come with price hikes for the surviving companies, under more pressure to operate at sustainable revenue instead of loss leading, and t
Re: (Score:1)
Tragically common to use an LLM to solve problems that existing simple solutions already solve. It's the tech equivalent of "rolling coal" - creating waste and harm just because you can.
Because magic (Score:5, Interesting)
"But ChatGPT said..." is the new "I saw it on television, it must be true."
If you're not doing something like
openssl rand -base64 16
or some tool that does something similar, you already have problems.
Re: (Score:2)
...and how do you store said passwords, since they need to be different for everything. A memorable password that gives access to your keychain or equivalent is no more secure. Don't get me wrong, I use OpenSSL for mine, but I fully understand that if someone has time to brute-force my random four-character computer password they have full root. Sure, they need physical access, and my logic is that by that point there are plenty of easier ways to cause me pain.
Re: (Score:2)
I use Hashicorp Vault at home, because I tend to dogfood the services I run at work. But that's a bit ridiculous, I don't recommend it.
We also run a local Bitwarden installation at work, that's generally for nontechnical users and the dedicated programming staff (although I repeat myself).
For normal people, I recommend some password manager with local storage not tied to a browser, and ideally not tied to your OS. But it depend
Re: (Score:2)
If you're not doing something like
Just another typical hard-to-remember, easy-to-crack-because-people-keep-it-short password. There are better ways of picking passwords.
Re: (Score:2)
Diceware or whatever is fine, I guess, but in our environment I don't care about typability. End user passwords only have to be entered once a week
Re: (Score:2)
Re:Why? (Score:5, Insightful)
A new technology comes out. People immediately think it is the be-all, end-all solution to many things. People apply it to numerous things because they can.
Eventually, many of the things they claim the technology will do don't pan out. Then you head into the Trough of disillusionment [wikipedia.org]. People begin to ask questions like "We can do this, but should we?" Many potential applications die off completely. The few that remain survive, and gradually grow into becoming a useful solution.
So yes, of course people used an LLM to generate passwords. Because they can. Doesn't mean they should, but we're not quite at the "should" stage of LLM applications yet.
And as for the bubble talk: yes, AI is a bubble. The reason it's a debate is that we talk about bubbles in binary terms, but when you look at it through the Gartner Hype Cycle, you'll see that it's not yet ready to fall, though applications like password generation and quite a few others show that it will. It won't collapse completely, because there are good applications of LLM AI; it will climb out of the trough and yield valuable, useful tools. The debate about whether it's a bubble is irrelevant; what we should be discussing is when a contraction will happen (and it will) and what it will look like when it does.
Re: (Score:2)
... a simple random number generator is guaranteed to do a better job of it?
Show me how that works. Then, would you expect a normal person to be aware of how to do that?
If it's not obvious, direct use of an RNG does not produce a usable password. For example, the output of this isn't typable in most situations: dd if=/dev/random of=test_password bs=32 count=1
It wouldn't surprise me if a lot of people go to google to look for a password generator, or how to make a strong password, and wind up copy/pasting from some website. If they're already using an LLM regularly, that's probably wher
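As a rough sketch (assuming Python 3 is available), this is the kind of extra encoding step needed to turn raw RNG bytes into something typable:

```python
import base64
import os

# Raw CSPRNG bytes (what `dd if=/dev/random ...` writes) are not typable.
raw = os.urandom(32)

# Encoding them (base64 here) yields a usable, printable password.
password = base64.urlsafe_b64encode(raw)[:16].decode()
print(password)
```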
Re: (Score:2)
Because so many people assume LLM is superseding *everything*, and need explicit reminders that not only is the 'old' way so much more efficient and simple, it's even more effective than what an LLM would do.
Re: (Score:2)
People that have no understanding of the limitations of tools will always use them wrongly. To their detriment. The main problem with LLMs is that they can do pretty convincing fakes of a lot of things.
Well, yeah, duh (Score:5, Insightful)
Re: (Score:2)
I wonder what happens if you run local models and turn the temperature way, way up. Of course, if you make the LLM 'drunk' enough to give truly random answers, you could just grab a few bytes from /dev/urandom all by yourself.
Re: (Score:2)
Re: (Score:3)
That was my initial thought as well, until I remembered they do contain RNGs in their fundamental logic - that's referred to as the temperature [ibm.com].
They found introducing an element of randomness into them made them seem more "realistic". Although clearly there are only so many alternate pathways they can take even with randomness involved.
Re: (Score:2)
Re: (Score:2)
Actually, LLMs contain a bit of randomization to make them look more like they are intelligent. Not enough to make them good password generators and not crypto-randomness in the first place, I expect.
Re: (Score:2)
Re: (Score:2)
Yes. But most of the LLMs that people will misuse as password generators are not fully deterministic, or rather do not have fully repeatable behavior at the level they are using them.
Re: (Score:2)
So, how do y'all like to craft your passwords? (Score:2)
Yes, that is a real one I used to use. No, I don't use it anywhere anymore.
Or do I?
Re: (Score:3)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Just add some emojis and you'll be fine
Re: (Score:3)
Though maybe not, given just how many sites out there STILL won't take spaces in passwords, despite demands for "special characters".
Re: (Score:2)
Re: (Score:2)
The worst part is that when you see something not take a "$" or " ", it's probably because their input sanitization is sub-par. They know about SQL injection, but can't be assed to actually do something about it other than ban characters. Leaving me feeling reaaaaal secure.
I use Quickpass to rotate admin passwords (professionally, not at
Re: (Score:2)
Here a language model could help you, because it literally models language and can be used for more than just predicting it. The model would tell you "Hey, did you eat my cat?" is unlikely because of an unlikely meaning, but not too unlikely because the grammar is right.
For the prefix "hey, did you eat", the token "my" is probably quite likely; afterward "cat" is still more likely than "chair", but "are" would be even more unlikely because the grammar would be wrong. On the other hand, "Hey you eat did cat my,"
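The idea above can be illustrated with a toy bigram scorer (the counts here are made up purely for illustration; a real LLM learns far richer statistics):

```python
# Toy bigram "language model": the same idea, at miniature scale, behind
# "the grammar is right, so the sequence is likely". Counts are invented.
counts = {
    ("did", "you"): 50, ("you", "eat"): 30, ("eat", "my"): 20,
    ("my", "cat"): 5, ("my", "chair"): 1,
}

def likelihood(words):
    """Multiply (smoothed) bigram counts; higher means more plausible."""
    score = 1.0
    for pair in zip(words, words[1:]):
        score *= counts.get(pair, 0.01)  # tiny weight for unseen pairs
    return score

grammatical = likelihood("did you eat my cat".split())
scrambled = likelihood("you eat did cat my".split())
print(grammatical > scrambled)  # True: the grammatical order scores far higher
```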
Wait until it hits Powerball (Score:5, Interesting)
There was a story maybe 20 years ago where some state lottery nearly got broken because a fortune cookie company, I think it was, printed a number that *almost* won. Something like 20 people won the second-tier prize.
It would be utterly ridiculous if we saw 20 people win the lottery trusting ChatGPT to pick unique numbers for them.
I'd bet literally thousands of people ask that every day.
Re: (Score:2)
There was also the rather peculiar thing that happened with the numbers from the show Lost.
Re: (Score:3)
If multiple people hit the Powerball jackpot, it is divided among the winners. They don't all get the jackpot amount so that would not bankrupt the system.
That being said, other prizes have specific cash denominations. And, others are multiplied by a PowerPlay multiplier by 1-5x (and, occasionally, 10x).
Still, it would take thousands if not millions of such winners to financially hurt the company.
And, I suspect there are rules that would trigger a fraud alert and render the drawing invalid.
Re: (Score:2)
In January 1995, the £16.3 million jackpot for the UK lottery was won by 133 tickets. Lots of winning tickets are already a thing.
Autocomplete failed at generating secure password (Score:3)
OF COURSE it can't generate random passwords
Password Cracking (Score:2)
All these studies and findings about "weak" passwords and cracking in seconds to hours.
But put these guys in the real world against a 16-character password of even vague complexity and tell them to call you when they hit it.
I guarantee that you'll never hear from them again. A vaguely complex 10-character password would take tens of years even if they got lucky and hit it in the first 25% of the keyspace.
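A back-of-envelope check of that claim, assuming (hypothetically) an offline rig doing 10^11 guesses per second against the full 94-symbol printable-ASCII keyspace:

```python
# Time to exhaust a 10-character password over 94 printable ASCII symbols,
# assuming a hypothetical offline rig doing 1e11 guesses per second.
keyspace = 94 ** 10
seconds = keyspace / 1e11
years = seconds / (3600 * 24 * 365)
print(round(years, 1))  # ~17 years for the full keyspace
```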
Re: (Score:2)
You're working the wrong end of the problem. The wrong 16 character password can be cracked rather easily.
Re: (Score:2)
Absolutely. They could win the lottery and hit the password on the very first guess.
But, I'm really quite confident even if the password is Password#123456.
Duh (Score:5, Insightful)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
As an LLM has no memory, asking it for something "unique" can't work.
Re: (Score:2)
Wrong. An LLM uses randomness to generate its answers. In practice most likely a pseudorandom number generator, but that's generally considered good enough. Also, the article claims they get 27 bits of entropy out of 16 characters. That means they are claiming it is random. And that's probably better than humans who pick their own passwords.
Re: (Score:2)
No, an LLM does not use randomness by itself. You can sample an LLM's token probabilities using randomness.
But this still does not create random outputs. If it were random in the sense you're hoping for when you want to generate a password, every answer would be nonsensical.
Short explanation:
- The LLM gets an input text, processes it, creates features, etc.
- The last layer produces token probabilities from the features.
- Simplest sampling: Use the most likely token. That's deterministic output and one can argue tha
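The sampling step described above can be sketched like this (a toy illustration, not any real model's code):

```python
import math
import random

def sample_token(logits, temperature=1.0):
    """Toy LLM sampling: scale logits by temperature, softmax into
    probabilities, then draw one token index. Temperature 0 is greedy."""
    if temperature == 0:
        # Deterministic argmax: always the most likely token.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]  # numerically stable softmax
    return random.choices(range(len(logits)), weights=weights)[0]

print(sample_token([1.0, 5.0, 2.0], temperature=0))  # 1 (argmax, deterministic)
```

Low temperature sharpens the distribution toward the argmax; high temperature flattens it toward uniform.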
Re: (Score:2)
In case you wonder about the **highlighted** names: That's the raw LLM output. Ministral likes to do formatting and did it for some of the names and I copied it verbatim.
Re: (Score:2)
I appreciate the effort in your answer, but I still disagree. First, it's important to establish the accepted definition of random, which is not trivial. As I said, I'm assuming pseudorandom (otherwise one needs specialized hardware, or an argument based on the randomness of the inputs used to generate an entropy pool), which means it's technically deterministic but satisfies various randomness tests.
When doing such a test, you repeatedly called the generator which they don't do. Inst
Re: (Score:2)
In the end, it hinges on why LLMs are capable of what they do, which is generating likely text. That even allows them to solve problems, because the right solution is more likely (thanks to the emergent properties of the LLM) than the wrong one.
First: Passwords. If you ask for a good password, the likely answer is a good password. In isolation it probably is a good password even when you risk that it has the properties of example passwords in the training set.
But after you sampled one token, you don't g
Re: (Score:2)
Again I appreciate the effort. It's clear you're thinking about the problem in a way that helps understanding.
It probably looks like a typical password, but a good password doesn't exist in isolation. What you really want is a set of potential passwords with a way to pick uniformly over t
Re: (Score:2)
I thought about it, and you have a point, that it may be theoretically possible. Did you play a bit with the sampler demo?
The logits are basically divided by temperature and then normalized (softmax) so the probabilities sum to 1. If you divide by a very small value, you get a single peak at the most probable token and the others are basically zero (Temp 0 can be implemented as deterministic sampling / Top 1). If you increase the temperature, you get closer to a uniform distribution.
Now the first model dependent question is, what
Re: (Score:2)
I must have missed that.
In principle, yes, but the current models are smart enough to just run a bash command to generate a password. That password probably u
Re: (Score:2)
Re: (Score:2)
Here the sampling link again: https://artefact2.github.io/ll... [github.io]
Finding the "best" parameters is not an exact science, even for a single simple task, and how do you tell which next word will eventually lead to the best sentence?
Testing ChatGPT, Gemini, etc. is not what the problem is about. These are fully integrated systems which can, for example, call python or bash scripts to generate passwords, or enhance your image prompt in the background before creating an image. And one always needs to expect the compani
Correct Horse Battery Staple? (Score:3)
>>and 18 of the duplicates were the exact same string
How much do you want to bet that string appeared as an example of a secure password in some book or website that was part of the LLM's training data?
Oblig (Score:2)
I set all my passwords to CorrectHorseBatteryStapler you insensitive Claude!
Re: (Score:2)
What?!?! Are you a password-generating LLM that is irritated by my comment?
what about something tougher (Score:2)
LLM-generated crossword puzzles
Obligatory xkcd (Score:3)
Why would you use AI for anything secure? (Score:3)
Asking an LLM to do things it cannot (Score:2)
Gets you something fake, and if you are unlucky, something you do not easily recognize as fake. About the worst case possible.
Deep confusion (Score:3)
There appears to be deep confusion, across the public in general (and, dare I say, the managerial class in particular), about what large language models actually *do*. Nobody with any understanding of what these things actually are and how they work would imagine that getting one to generate a password is a smart idea. Somehow (marketing? hype?) people have been convinced these are intelligent, do-anything machines, and I have no idea how we break that impression.
Ask it to write a password generator (Score:3)
Re: (Score:2)
Why do you need Opus? Here is what the very small Qwen3 4B Model (instruct, not thinking) gives for the prompt "Write a short but secure password generator in python"
(have fun with Slashdot breaking the Emoji)
Sure! Here's a short, secure Python password generator that uses cryptographically secure randomness:
```python
import secrets
import string

def generate_password(length=12):
    """Generate a cryptographically secure random password."""
    characters = string.ascii_letters + string.digits + string.punctuation
    return ''.join(secrets.choice(characters) for _ in range(length))
```
hash or another hash (Score:1)
easy fix (Score:2)
Asking the wrong question... (Score:2)
Just ask the AI to write code to produce a good password. That will probably work just fine in every case.
What did you expect? (Score:2)
LLMs are (just like all other neural networks) a function. Same input (do not forget to fix the seed) gives same output. Run "Create a random password" with a thousand seeds and you get a thousand passwords that other people may generate as well. One could argue that a 64-bit seed is plenty, but the number of distinct passwords generated is much smaller, as they are correlated.
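The point can be demonstrated with a stand-in "model" (purely illustrative; toy_model and its alphabet are made up):

```python
import random

def toy_model(prompt, seed):
    """A neural net is a pure function: same input and seed, same output."""
    rng = random.Random(f"{seed}:{prompt}")  # the seed fixes all "randomness"
    alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"
    return "".join(rng.choice(alphabet) for _ in range(16))

a = toy_model("Create a random password", seed=42)
b = toy_model("Create a random password", seed=42)
print(a == b)  # True: identical prompt + seed reproduces the same "random" password
```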