Passwords: Using 3 Random Words Is A Really Bad Idea!
In 2015, the UK government released an article advocating the use of 3 random words in passwords, citing "pragmatism and algorithmic strength against common issues like brute force attacks".
2 years later and a plethora of respected Twitter users continue to push this advice. If you're one of them (looking at you @WMPDigitalPCSO @DurhamCyber @SurreyPolice @nerccu @ncsc @HP_Cyber @metpoliceuk @cyberawaregov), I implore you to read this article thoroughly and reconsider your position.
Understanding the basics...
After you've entered your chosen password, a responsible site should store it using a suitably strong password hashing algorithm (bcrypt, PBKDF2, Argon2 etc). Note, I said "password hashing algorithm"... this is a different concept from a general-purpose cryptographic hashing algorithm. More on that shortly.
The developer, mindful of the threat landscape, the value of the data & the available server resources, will configure their chosen algorithm such that each hash takes a set time to compute. For example, it's reasonable to expect a hash to be generated in 400-500 milliseconds (roughly half a second). This delay is crucial, as it dramatically increases the time an attacker needs to brute-force their way into your account. A 0.5 second delay is insignificant to the real user, but a real burden to an attacker needing to try trillions of candidate passwords.
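To make the "set time to compute" idea concrete, here's a minimal sketch using PBKDF2 from Python's standard library. The iteration count, the salt size and the function names (hash_password, verify_password) are my own assumptions for illustration, not a recommendation; a real site would tune the work factor on its own hardware until a single hash takes the desired time.

```python
import hashlib
import hmac
import os
import time
from typing import Optional, Tuple

# Assumed work factor, for illustration only; tune it on your own hardware.
ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key) using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for every password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected)


# Time a single hash; the figure depends entirely on the machine and ITERATIONS.
start = time.perf_counter()
stored_salt, stored_key = hash_password("jumpdeskpolish")
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"one hash took {elapsed_ms:.0f} ms")
```

Raising ITERATIONS slows down every guess an attacker makes by the same factor it slows down a legitimate login.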
In an ideal world, sites would all use a well-configured algorithm & users would opt for strong, truly random passwords. But, that's rarely the case.
Getting the basics wrong...
Another, entirely inappropriate storage method is a fast, general-purpose cryptographic hashing/digest algorithm. The underlying principle remains the same; derive a fixed-length output for any given input. However, these hashes are blisteringly fast to compute... making brute-force far easier. Examples of such hash algorithms are MD5 (still very common and possible to brute-force at a rate of 200 billion guesses a second), SHA1 (70 billion a second), SHA256 (23 billion a second), SHA512 (8 billion a second) and so on.
As this article & indeed most password advice are aimed at the end-user, it's important to choose a password based on the worst case scenario; assume the site stores our credentials in the weakest possible way and mitigate that risk first.
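If "fixed-length output for any given input" sounds abstract, this small sketch shows it with Python's hashlib. The helper name digest_sizes and the two sample inputs are assumptions of mine, purely for illustration.

```python
import hashlib

# Two inputs of wildly different lengths.
short_input = b"a"
long_input = b"a" * 1_000_000


def digest_sizes(data: bytes) -> dict:
    """Map each algorithm name to the length (in bytes) of its digest for `data`."""
    return {
        name: len(hashlib.new(name, data).digest())
        for name in ("md5", "sha1", "sha256", "sha512")
    }


# The digest length depends on the algorithm, never on the input size.
print(digest_sizes(short_input))
print(digest_sizes(long_input))
```

Both lines print the same sizes (16, 20, 32 and 64 bytes respectively), which is exactly why an attacker can't tell anything about password length from the stored hash alone.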
Characters vs words; the premise behind the advice
Long, strong, unique & complex passwords are inherently difficult to remember... especially if you use several websites. This inevitably leads to password re-use, recycling & insecure storage. On that basis, I fully understand the need for an alternative, more-pragmatic approach.
However, are words really a secure alternative? Let's dive in.
The Maths
For an attacker, a brute-force attack (trying every possible permutation in order to break a password hash) is a last resort. It is, however, a guaranteed way (given sufficient time & resources) to gain access to an account.
To understand the challenge our attacker faces, we first need to understand exponents.
If our password consists of 8 possible characters and is 3 characters long, the calculation would look like so:
8^3 = 8 x 8 x 8 = 512 permutations
In this example, there are 512 possible permutations; meaning our attacker is guaranteed to access our account within 512 attempts.
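Here's a minimal sketch of that tiny brute-force, assuming a hypothetical 8-character alphabet ("abcdefgh") and an MD5-hashed password. The function name brute_force is mine; real cracking tools work very differently under the hood, but the counting is the same.

```python
import hashlib
from itertools import product

ALPHABET = "abcdefgh"  # 8 possible characters (hypothetical alphabet)
LENGTH = 3             # password length


def brute_force(target_md5: str):
    """Try every candidate in order; return (password, attempts) or (None, attempts)."""
    attempts = 0
    for candidate in product(ALPHABET, repeat=LENGTH):
        attempts += 1
        guess = "".join(candidate)
        if hashlib.md5(guess.encode()).hexdigest() == target_md5:
            return guess, attempts
    return None, attempts


# Worst case for the attacker: the password is the very last candidate tried.
target = hashlib.md5(b"hhh").hexdigest()
password, attempts = brute_force(target)
print(password, attempts)  # hhh 512
```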
Let's take it one step further with a real-world example...
If a password consists of a-z, A-Z and 0-9, we have 26 lower-case letters + 26 upper-case letters + 10 digits. Our "base" in this example would be 62. Now let's assume it's 12 characters long, meaning our exponent is 12.
62^12 = 3226266762397899821056 permutations (three sextillion, two hundred twenty-six quintillion, two hundred sixty-six quadrillion, seven hundred sixty-two trillion, three hundred ninety-seven billion, eight hundred ninety-nine million, eight hundred twenty-one thousand & fifty-six)
Obviously, that's an enormous burden to our attacker and results in a password they're never going to break, assuming the password is chosen at random.
It's not all about the base
Now you understand how passwords are stored & broken, we can begin to understand why some believe 3 (or more) random words are a better, more secure alternative.
Compare these...
"MTLWcAXfY4Dy" is a 12-character, random password.
"jumpdeskpolish" is a 3-word, random password. It just so happens to be 14 characters long.
We can represent "MTLWcAXfY4Dy" as upper-case, lower-case & numbers... giving us a base of 62. It's 12 characters long, giving us our exponent of 12.
We've already done the maths on 62^12 earlier, so we know it to be unbreakable if chosen at random.
If we represent "jumpdeskpolish" in the same fashion, it's just lower-case... giving us a base of 26. Despite being easier to remember, it's also longer at 14 characters, giving us our exponent of 14.
26^14 = 64509974703297150976 permutations (sixty-four quintillion, five hundred nine quadrillion, nine hundred seventy-four trillion, seven hundred three billion, two hundred ninety-seven million, one hundred fifty thousand, nine hundred & seventy-six)
Purely on the maths alone, we can see this password has far fewer permutations and is undoubtedly weaker... but that's not the whole story.
64509974703297150976 (permutations) / 200,000,000,000 (MD5 hashes per second; MD5 being the weakest widely-used algorithm) = roughly 322.5 million seconds, or just over 10 years!
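If you'd like to replay that arithmetic yourself, here's a small sketch. The 200 billion guesses a second is the MD5 figure quoted earlier; the function names and the 365.25-day year are assumptions of mine.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60
MD5_GUESSES_PER_SECOND = 200_000_000_000  # rate quoted in this article


def keyspace(base: int, length: int) -> int:
    """Number of permutations for `length` symbols drawn from `base` choices."""
    return base ** length


def years_to_exhaust(base: int, length: int, rate: int = MD5_GUESSES_PER_SECOND) -> float:
    """Worst-case time, in years, to try every permutation at `rate` guesses a second."""
    return keyspace(base, length) / rate / SECONDS_PER_YEAR


print(keyspace(26, 14))                                  # 14 lower-case characters
print(f"{years_to_exhaust(26, 14):.1f} years (26^14)")
print(f"{years_to_exhaust(62, 12):.1f} years (62^12)")
```

These are worst-case figures; on average an attacker finds the password after searching half the keyspace.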
Success! We've managed to use far fewer characters, made our password easier to remember and it's still going to take over a decade to break. By any reasonable measure, this can be considered secure.
Or is it?
The Combinator Attack
Modern password cracking machines can just as easily rotate words as characters, so "AA/AB/AC" consumes near-identical resources to "appleapple/appleaardvark/appleangle" and so on.
But now we're using words, there's a much faster way to break our password.
Instead of representing "jumpdeskpolish" as 26^14, why can't we represent it as 171,000^3?
Confused yet? Let me explain.
The latest figures suggest there are 171,000 words in the English dictionary, meaning our "base" increases dramatically. Our password is 3 words long... so let's replay the maths again.
171,000^3 = 5000211000000000 permutations (five quadrillion, two hundred eleven billion)
Remember, our attacker can break MD5 hashes at a rate of 200 billion a second.
5000211000000000 (permutations) / 200,000,000,000 (MD5 hashes per second) = roughly 25,000 seconds, or just shy of 7 hours!
Just by altering the attack, our password no longer provides 10 years of security... but a much less confidence-inspiring 7 hours.
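To show what "altering the attack" means in practice, here's a toy combinator attack. The six-word list is a hypothetical stand-in for a real dictionary, and combinator_attack is a name of my own; dedicated cracking tools do this on GPUs at vastly higher speed, but the principle is identical: iterate over words, not characters.

```python
import hashlib
from itertools import product

# Hypothetical stand-in for a real dictionary of ~171,000 words.
WORDS = ["apple", "desk", "jump", "polish", "river", "stone"]


def combinator_attack(target_md5: str, words, count: int = 3):
    """Join every ordered combination of `count` words and compare MD5 hashes."""
    attempts = 0
    for combo in product(words, repeat=count):
        attempts += 1
        guess = "".join(combo)
        if hashlib.md5(guess.encode()).hexdigest() == target_md5:
            return guess, attempts
    return None, attempts


target = hashlib.md5(b"jumpdeskpolish").hexdigest()
found, attempts = combinator_attack(target, WORDS)
print(found, "found within", attempts, "of", len(WORDS) ** 3, "guesses")

# The same arithmetic at dictionary scale, using the figures from this article.
hours = 171_000 ** 3 / 200_000_000_000 / 3600
print(f"{hours:.2f} hours to exhaust 171,000^3")
```

Note the attacker never needs to consider 26^14; the search space is defined by how the password was built, not by how many characters it ends up containing.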
It gets worse...
We've assumed 171,000 is the maximum number of words in a modern English dictionary. By doing so, we know that any password containing 3 English words will fall in 7 hours or less.
However, that's not the whole story either.
Let me introduce Susie Dent...
If you're in the UK, you'll likely recognise Susie as the lexicographer and dictionary expert from Countdown. Susie's research demonstrates the average active vocabulary of an English-speaking adult is 20,000 words; with their passive vocabulary being closer to 40,000.
If you're wondering... an active vocabulary consists of words you can recall and use regularly yourself. A passive vocabulary consists of words you recognise & understand, but aren't able to use yourself.
If we take Susie's expert opinion at face value, our calculations now look like this.
20,000^3 = 8000000000000 permutations (eight trillion).
8000000000000 (permutations) / 200,000,000,000 (MD5 hashes per second) = 40 seconds. FORTY SECONDS!
We've gone from 10 years, to 7 hours, to 40 seconds. This is simple mathematics folks, no opinions or interpretation here.
Not to cast aspersions on Susie's expertise, but there has also been research which demonstrates a college-educated adult has a vocab of around 80,000 words (or around half of all words in the dictionary). So, if your linguistic ability is comparable to that of Jacob Rees-Mogg, are you secure?
80,000^3 = 512000000000000 permutations (five hundred twelve trillion) or, at the same rate as before, your 3 "random" words fall in under 43 minutes!
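For anyone wanting to check these figures side by side, here's a short sketch. The vocabulary sizes and the 200 billion a second rate are the ones used in this article; the labels and the function name seconds_to_exhaust are mine.

```python
RATE = 200_000_000_000  # MD5 guesses per second, as quoted in this article

VOCABULARIES = {
    "active vocabulary": 20_000,
    "passive vocabulary": 40_000,
    "college-educated adult": 80_000,
    "full dictionary": 171_000,
}


def seconds_to_exhaust(vocabulary_size: int, words: int = 3, rate: int = RATE) -> float:
    """Worst-case seconds to try every combination of `words` words."""
    return vocabulary_size ** words / rate


for label, size in VOCABULARIES.items():
    seconds = seconds_to_exhaust(size)
    print(f"{label:<24} {size:>7,} words  {seconds:>9,.0f} s  ({seconds / 60:,.1f} min)")
```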
Adding numbers & special characters
I have noticed a couple of Twitter'ers adding "use numbers & special chars too" as a caveat to using 3 random words. This not only undermines the entire premise of memorable words, but has little/no security benefit either.
In the example above, we've gone from a base of 20,000 to 80,000. If quadrupling the base has no tangible effect on security, such that it's broken in under an hour, adding 100 special chars & 10 numbers really won't make a dent.
The underlying issue here is the low exponent... making it exponentially easier to break.
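To put rough numbers on base versus exponent, here's a quick sketch using the same figures as above. It's only a comparison of growth rates under this article's assumptions (20,000-word vocabulary, 200 billion guesses a second), with variable names of my own choosing.

```python
RATE = 200_000_000_000  # MD5 guesses per second, as quoted in this article


def seconds_to_exhaust(base: int, exponent: int) -> float:
    """Worst-case seconds to try base^exponent permutations at RATE."""
    return base ** exponent / RATE


baseline = seconds_to_exhaust(20_000, 3)     # 3 words, 20,000-word vocabulary
bigger_base = seconds_to_exhaust(80_000, 3)  # quadruple the base
extra_word = seconds_to_exhaust(20_000, 4)   # add one to the exponent

print(bigger_base / baseline)  # 64.0
print(extra_word / baseline)   # 20000.0
```

Quadrupling the base buys a factor of 64; a single extra word buys a factor of 20,000.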
Stronger hashing
Now it's true... I've focussed on the weakest & fastest possible hash which is still widely (and sadly!) in use. If the hash is slower, the time to break the hash increases significantly.
But, if we choose our passwords assuming the worst possible scenario (MD5, notwithstanding plain-text), it actually doesn't matter how they're stored; they're already either immensely strong, or unbreakable.
What method should I promote?
Password managers. Nothing else.
The machine-generated passwords they provide (assuming you're using a respectable one!) are quite literally unbreakable at 12 upper/lower/numbers or more, even with the weakest storage algorithms.
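For a feel of what "machine-generated" means, here's a minimal sketch using Python's secrets module. It is not how any particular password manager works; the function name generate_password and the 62-character alphabet are assumptions chosen to match the upper/lower/numbers example in this article.

```python
import secrets
import string

# a-z, A-Z and 0-9: the 62-symbol "base" used throughout this article.
ALPHABET = string.ascii_letters + string.digits


def generate_password(length: int = 12) -> str:
    """Pick each character independently using a cryptographically secure RNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(generate_password())
```

The important part is secrets rather than random; the latter is not designed for security-sensitive use.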
Before promoting anything, research & understand the concept and ask yourself... could I/do I do this myself? If you can remember 30+ (or possibly hundreds of) random, 12-character passwords, great... but the majority can't! Rather than promoting passwords as security measures, instead sell the added usability & convenience of never having to deal with passwords again.
Forget the semantics & subtle nuances of password security; upper-case, lower-case, special characters, numbers, words, lengths, storage, entropy et al... humans don't do random, at least not well.
FAQs
"Is 4 random words enough?"
The NCSC's article, upon which the above advice is based, actually suggests 4 random words, not 3. It's not clear why someone arbitrarily opted to lower the quantity of words. But, now we're increasing the exponent, not the base... it'll have a significant impact upon your password's security. However, it's still not enough.
20,000^4 = 160000000000000000 permutations (one hundred sixty quadrillion). Using the same 200,000,000,000 a second figure, that password would fall in just over 9 days.
"Which password manager should I use?"
There are several, each serving very different purposes. For home/SME users, I recommend 1Password.
"How does this compare to diceware?"
Diceware uses a list of just 7,776 words (6^5, one word per five dice rolls), but the words are chosen truly at random rather than by a human. A 7-word diceware password could be represented as:
7776^7 = 1719070799748422591028658176 permutations (one octillion, seven hundred nineteen septillion, seventy sextillion, seven hundred ninety-nine quintillion, seven hundred forty-eight quadrillion, four hundred twenty-two trillion, five hundred ninety-one billion, twenty-eight million, six hundred fifty-eight thousand, one hundred seventy-six)
At 200 billion a second, it'd take roughly 270 million years to break. Note the "base" is significantly lower than our "3 random word" examples, but a higher exponent takes it from broken in seconds, to unbroken in millions of years.
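Here's a rough sketch of the diceware idea and the arithmetic behind it. The placeholder word list (word0, word1, ...) is hypothetical, standing in for a real 7,776-word list, and the function names are mine; the dice are simulated with Python's secrets module rather than rolled by hand.

```python
import secrets

WORDS_IN_LIST = 6 ** 5            # five six-sided dice per word -> 7,776 entries
RATE = 200_000_000_000            # MD5 guesses per second, as quoted in this article
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def roll_word_index() -> int:
    """Simulate five dice rolls and map them to an index in 0..7775."""
    index = 0
    for _ in range(5):
        index = index * 6 + secrets.randbelow(6)
    return index


def passphrase(wordlist, words: int = 7) -> str:
    """Build a passphrase from independently rolled words."""
    return " ".join(wordlist[roll_word_index()] for _ in range(words))


# Placeholder list; a real diceware list contains 7,776 distinct words.
placeholder_list = [f"word{i}" for i in range(WORDS_IN_LIST)]
print(passphrase(placeholder_list))

years = WORDS_IN_LIST ** 7 / RATE / SECONDS_PER_YEAR
print(f"{years:,.0f} years to exhaust 7776^7")
```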
[Edit]
@ramriot just made an important point on Twitter, worthy of an update here...
If the site limits password input length, it may be impossible to use diceware.
"I've followed the '3 random word' advice. Should I change my passwords?"
Yes. Stop reading, do it now.
Summary
Don't use words in passwords. Ever.
Don't try coming up with secure passwords, apart from the one protecting your password manager.
After adopting a solid password manager, don't give passwords another thought... let it do the hard work for you.
If you insist on tweeting "3 random word" advice, provide evidence to substantiate its security.
That's it folks. Please RT.