As developers, we're told never to trust a user's input; always treat it as malicious until suitable validation & contextually-aware checks are performed.
Thankfully, the majority adopt this methodology. However, many sites immediately disregard it when it comes to their registration/signup flow.
For many years, I've pushed every client I engage with to adopt the technique I'm about to outline (with variations to fit their business). Last night, I had a somewhat heated discussion with a client who "fundamentally disagreed" with my proposal... so it's time to write this up and explain my thought process to a wider audience.
Premise:
Many sites which require registration use an email address, either as a direct user ID or to recover an account in the event of a password breach/loss.
If you're one of them, do not store or action anything until the user has confirmed ownership of the email address. Unless absolutely necessary, only collect the email address on the initial registration form.
At first glance, you'd be forgiven for thinking there's no appreciable risk here... but allow me to explain why I believe that to be a fundamental mistake.
A story:
Five years ago, I wrote an article about a CSRF vulnerability at ASDA/Walmart.
That vulnerability allowed me to inject a malicious script into the user's existing profile which, once invoked, scraped their payment data and sent it to a remote server.
In order to execute an attack, I needed two things:
1) An existing user's account
2) A vulnerability in the site
Therein lies the problem.
If you don't validate an email address immediately, an attacker can easily create a user's account for them and embed any malicious payload they like. That's still the case today with ASDA/Walmart - as indeed with the vast majority of sites. Within a matter of seconds, an attacker can create an account inherently tied to an email address they don't own.
But there's one component we're missing... a vulnerability to leverage! However, that malicious data will remain until either a vulnerability is identified/introduced, or the actual owner of the email address resets the password to gain access to "their" account and modifies "their" profile.
"Aha! No, the root cause is the vulnerability, not the registration flow..."
During the aforementioned debate, my client continued to disagree... citing the flaw as the root cause, not the earlier collection of malicious data. At this point, he opened a tab to Dropbox.com and asked me to prove it.
It did not go well.
Within 5 minutes, his position had changed entirely and the dangers became shockingly real.
Note: I'm not intentionally picking on Dropbox here, as many sites are guilty of exactly the same oversight... but the client picked this site, so here's how the attack works.
The attack:
Step 1)
Register for a free account at Dropbox.com with the victim's email address. This triggers an email to be sent to the victim...
As the victim hasn't actually registered, it's likely to be ignored or treated as spam. That's fine, we don't need the user to click anything... we're already signed in. Even if they do click the link by mistake, it's of no consequence to the attacker.
Step 2)
Upload malware/ransomware to the victim's Dropbox folder, replacing the "Getting started with Dropbox" PDF, or simply adding my own file with a voucher code; something to prompt the user to open our payload.
Step 3)
Sit back and wait!
It could be a month, it could be 6 months, it could be never... but if the victim ever wants to use Dropbox, they'll try to register. At which point, they're told an account already exists and, assuming they've registered previously and forgotten, they reset the password and promptly log in to "their" account.
Success!
We don't need a vulnerability... Dropbox dutifully places my malware on every device where the victim has Dropbox installed.
At this point, I haven't just pwned your Dropbox profile... I have complete access to your PC/network.
Drop a shell, easy.
Exfil everything to an FTP server, simple.
Sit quietly and watch the user on webcam, quite possible.
It gets worse:
Keep in mind, the victim still hasn't verified their email address.
Yet, I (as the attacker) can share files with the victim's colleagues too.
Phishing attacks are, unfortunately, still prevalent and effective... even with spulling mipakes, gramutical errors and never addressing you by name.
This attack, however, comes from a genuine Dropbox email address, uses your name and cites your colleague's email address perfectly. The "note" asking for a reminder to pass on the invoice is yet another trust anchor, leading the second victim to believe this is a genuine email.
Of course, the file contains malware and they too are now pwned - even if they haven't installed Dropbox.
Don't be too trusting...
I've labelled this the "TOFU" or "Trust On First Use Attack" partly because it sits nicely with HSTS' use of the term, but largely because if we inherently trust any data provided during registration, every single decision/action we take is based upon a flawed assumption.
If you can categorically claim your site does not & will never contain any security risks which an attacker could leverage by having initial control of an account, you're not only living in fantasy land, you're unnecessarily putting users at risk.
Other risks to consider...
GDPR/Data Protection:
In many jurisdictions, you're required by law to take reasonable steps to ensure the accuracy & integrity of any data you collect. An email address is classed as personal data, falling firmly into the "identified, identifiable or used to contact" category as outlined in GDPR.
The law is a blunt instrument and I'm certainly no expert in that field... but it would appear to my ignorant self that knowingly storing & acting upon data which you know to have zero integrity would probably breach many laws.
2FA seeds:
I've lost count of how many sites create a 2FA seed for a user which then remains static for the life of the account. If an attacker has initial access to your profile, they too can slurp the seed, rendering any later 2FA useless.
Recovery email addresses / phone numbers:
How often do you check for recovery email addresses, especially in an account you've just created? An attacker can regain control if they've hidden a recovery method in your profile.
XSS / CSRF et al:
If the site is currently vulnerable to any low-hangers, you're likely pwned the moment you sign in. However, even if the victim regains control over the account, unless they remove the arbitrary script, they are still effectively pwned... even if a vulnerability is introduced years later.
Undocumented features:
Do we actually need a security vulnerability? Like Dropbox, can we use the service exactly as intended and still cripple our victim?
New customer offers:
How many times do you see free trial, 30 day trial, or a one-time discount for new customers? If an attacker has registered for you, you're likely no longer eligible. Now you're potentially harming sales, not just security.
Legal/Life Implications:
How cruel can the attacker be? They could place ransomware/malware on your device, but they could just as easily upload category 1 pornography to your account. Good luck explaining why that's on your device... using your email address and IP!
One possible solution:
Do we really need to store data with questionable integrity, just to allow a user to register, or is there a better way? I believe there is.
Use PASETO tokens! Or, if you're feeling adventurous, JWT.
A public PASETO token is a cryptographically-signed JSON payload, allowing you to embed and sign data into a link which is later sent to the user's email address.
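The resulting token is a single dot-separated string: a version and purpose prefix (such as "v2.public"), followed by the base64url-encoded payload and signature.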
What are the benefits?
1) No database writes required!
We've already established we can't (or shouldn't!) trust the data before validation. As such, we can sign the object and simply return it to the user's email address. That way, we only add this user to our database if they're valid.
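As a rough sketch of that flow, here's how the signing step might look. To keep it runnable with nothing but the standard library, I've swapped the PASETO signature for a plain HMAC-SHA256 (closer to an HS256 JWT than a public PASETO token); in production you'd reach for a proper PASETO or JWT library. The secret, the function names, the token layout and the example.com link are all placeholders of my own, not part of any real API.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical server-side secret; in practice, load it from a secrets store.
SECRET_KEY = b"replace-with-a-long-random-secret"


def b64url(data: bytes) -> str:
    """URL-safe base64 without padding, so the token fits neatly in a link."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def issue_registration_token(
    email: str, ttl_seconds: int = 3600, now: Optional[float] = None
) -> str:
    """Sign the submitted email plus an expiry. Nothing is written to a database."""
    issued_at = int(time.time() if now is None else now)
    payload = json.dumps(
        {"email": email, "exp": issued_at + ttl_seconds},
        separators=(",", ":"),
        sort_keys=True,
    ).encode("utf-8")
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return f"{b64url(payload)}.{b64url(signature)}"


def confirmation_link(token: str) -> str:
    """Build the link to email to the user (placeholder domain and path)."""
    return f"https://example.com/confirm?token={token}"
```

The registration handler would call issue_registration_token() with the submitted address, email the link and then forget the request entirely. The account row is only created when the link comes back with a valid signature.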
Bye bye bots, malicious submissions, stale records etc.
2) Natural expiry
Like JWTs, PASETO tokens have an "exp" or expiry timestamp. This allows you to expire the token without needing to track/compare dates in the database. With clever key trickery, you could embed a bool (1/0) which switches based on the existence of a completed profile.
If the user doesn't exist, concatenate a 0 to the key. If the user exists, concatenate a 1 to the key. That way, the PASETO token is signed with a key which is only valid while the user doesn't exist. As soon as they're registered, the email link effectively self-destructs without you needing to store or update any per-token state in the database.
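Here's a minimal sketch of the expiry check and the key trick together, again using HMAC-SHA256 from the standard library as a stand-in for a real PASETO signature. BASE_KEY, the function names and the user_exists callable are my own assumptions. Note that user_exists is whatever "does this account exist?" check you already have, so this sketch stores nothing per token but does ask that one question at verification time.

```python
import hashlib
import hmac
import json
import time
from typing import Callable, Optional

# Hypothetical base secret; the existence flag is appended to it below.
BASE_KEY = b"replace-with-a-long-random-secret"


def signing_key(user_exists: bool) -> bytes:
    """Append 1 or 0 to the base key depending on whether the account exists."""
    return BASE_KEY + (b"1" if user_exists else b"0")


def sign(email: str, exp: int, user_exists: bool) -> str:
    """Sign the email and expiry with the key matching the account's state."""
    message = json.dumps([email, exp]).encode("utf-8")
    return hmac.new(signing_key(user_exists), message, hashlib.sha256).hexdigest()


def link_is_valid(
    email: str,
    exp: int,
    signature: str,
    user_exists: Callable[[str], bool],
    now: Optional[float] = None,
) -> bool:
    """Reject expired links, and links whose key no longer matches the account state."""
    current = time.time() if now is None else now
    if current >= exp:
        return False  # natural expiry: no dates tracked in the database
    expected = sign(email, exp, user_exists(email))
    return hmac.compare_digest(expected, signature)
```

The link is always issued with sign(email, exp, user_exists=False). Once the account exists, verification derives the "1" key instead, the signatures no longer match and the old link is dead.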
3) The data cannot be manipulated after submission
Once signed, the user is unable to modify the payload (to change their email address, for example) to appear to be someone else. Likewise any other data (names, postal addresses etc) - if you absolutely need to collect them during initial registration.
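To illustrate the tamper-resistance, here's a small sketch in which the payload of a signed token is swapped for another email address and verification fails. As before, this uses a standard-library HMAC-SHA256 in place of a real PASETO implementation, and SECRET_KEY, sign_payload and verify_token are hypothetical names of my own. It only checks the signature; an expiry check would sit alongside it.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

# Hypothetical server-side secret, never sent to the user.
SECRET_KEY = b"replace-with-a-long-random-secret"


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # Restore the padding stripped by b64url_encode before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_payload(payload: dict) -> str:
    """Serialise and sign the payload, returning 'payload.signature'."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    signature = hmac.new(SECRET_KEY, body, hashlib.sha256).digest()
    return f"{b64url_encode(body)}.{b64url_encode(signature)}"


def verify_token(token: str) -> Optional[dict]:
    """Return the payload if the signature matches, otherwise None."""
    try:
        body_part, signature_part = token.split(".")
        body = b64url_decode(body_part)
        signature = b64url_decode(signature_part)
    except ValueError:
        return None  # wrong number of parts or invalid base64
    expected = hmac.new(SECRET_KEY, body, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        return None  # payload or signature was altered after signing
    return json.loads(body)


# An attacker swaps in their own payload but cannot produce a matching signature.
token = sign_payload({"email": "alice@example.com"})
forged_body = b64url_encode(json.dumps({"email": "mallory@example.com"}).encode("utf-8"))
forged_token = forged_body + "." + token.split(".")[1]
```

Here verify_token(token) returns the original payload, while verify_token(forged_token) returns None because the signature no longer matches the modified body.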
Summary
Even if you disagree with the existence of the risk or severity of it (I'd like to hear more in the comments!), I hope we can agree there's a material benefit beyond security to discarding (or simply not processing) inaccurate data.
Incomplete/rogue/bot registrations plague databases everywhere; often requiring GC processes to clean up on a scheduled task. It's an unnecessary burden we needn't take on... there are cleaner, more efficient ways which give you the same end result, without introducing risks you probably hadn't considered.
That's it folks. Thanks for listening.