In order to handle data safely, a developer must understand exactly what data they're dealing with and the context within which it's used.

Web/App developers (good ones at least) treat all data, regardless of its source, as potentially dangerous.  As such, they have to validate (and where necessary, encode) everything you type into their apps.

If we ask for a phone number, we expect you to enter a number.  If we ask for an email address, we expect the format to conform to that of an email address (name@example.com for example).

Trouble is... what if you don't enter the right information?  It could be a genuine mistake, or it could be malicious activity... only a validation test can confirm this.

Take this example...

In a phone number box, you enter 0000 00000">

Clearly, the "> shouldn't be there.  Without any form of validation, the system just assumes you've entered everything correctly and saves it as usual.

But that's not a phone number... surely it would tell me?

Yes... but only if it's put through input validation.  Cue the layman logic...

Is the phone number actually a number?

Yes -> Great, carry out other checks (right length, area code, etc.) and if all pass, store as usual.

No -> Tell the user the data failed validation ('0000 00000"> isn't a number', for example).
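The layman logic above can be sketched in a few lines of code.  This is a minimal illustration only; the specific checks (UK-style length, leading zero) are my assumptions, not any real site's rules:

```python
def validate_phone(raw: str) -> tuple[bool, str]:
    """Return (is_valid, message) for a phone number field."""
    candidate = raw.replace(" ", "")          # ignore spacing the user typed
    if not candidate.isdigit():               # is it actually a number?
        return False, f"{raw!r} isn't a number"
    if not 10 <= len(candidate) <= 11:        # right length?
        return False, "wrong length for a phone number"
    if not candidate.startswith("0"):         # plausible area code?
        return False, "expected a leading 0"
    return True, "ok"

# The malicious input from the example fails the very first check:
ok, msg = validate_phone('0000 00000">')
print(ok, msg)   # False '0000 00000">' isn't a number
```

Note that the validation rejects the input outright rather than trying to "clean" it; whether you then blank the field or echo the bad value back is the decision the next section deals with.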

The process of telling you to fix your mistake introduces a problem... now we must return (or output) that information back to your browser.  Perfectly safe, right?  That depends...

If we simply blank the phone number for you to start again, we don't need to take any further action with your data.

If however, we want to place the old data back in the "phone number" box (for you to remove the ">), we need to ensure it's safe to do so.

With just numbers, you'll see this.

But look what happens if the developer forgets to use output encoding...

The section in red means we've broken the page layout (or more accurately, we've "injected up").
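Here's a sketch of the difference, using Python's standard `html.escape`.  The surrounding form markup is an assumption for illustration only:

```python
from html import escape

value = '0000 00000">'

# Unencoded: the "> closes the value attribute and then the input tag itself,
# so everything after it is interpreted as new markup -- the layout breaks.
unsafe = f'<input type="text" name="phone" value="{value}">'
print(unsafe)  # <input type="text" name="phone" value="0000 00000">">

# Encoded: " becomes &quot; and > becomes &gt;, so the browser treats the
# whole value as inert text inside the attribute.
safe = f'<input type="text" name="phone" value="{escape(value, quote=True)}">'
print(safe)    # <input type="text" name="phone" value="0000 00000&quot;&gt;">
```

The data is identical in both cases; only the second version tells the browser "this is text, not markup".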

How can that be dangerous?

By not separating the data from interpretable code (context-aware output encoding), the developer has inadvertently opened the site up to attack.  With a carefully formed payload, an attacker can now control the entire document object model (DOM).  Simply put, they can control everything you see and how the page functions, and watch every key you press (as you press them).
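To make that concrete, here's a hypothetical payload (the markup and the `attacker_log` function name are assumptions for illustration) that goes beyond breaking the layout and injects a live script:

```python
from html import escape

# A payload that closes the value attribute, closes the input tag, and then
# injects a <script> element that reports keystrokes to the attacker
# (attacker_log is a made-up function name for illustration).
payload = '"><script>document.onkeypress = e => attacker_log(e.key)</script>'

template = '<input type="text" name="phone" value="{}">'

# Without encoding, the browser parses and runs the injected <script>.
print(template.format(payload))

# With context-aware encoding, the same payload renders as harmless text.
print(template.format(escape(payload, quote=True)))
```

Same input box, same payload; the only difference between "broken layout" and "attacker owns the page" versus "harmless text in a field" is whether the developer encoded the output for the context it lands in.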

Here's an example of what's possible with 5 minutes of work (read more here).

Ah, but this page is secure... my browser & Rapport software says so!

SSL (look for https:// and a valid certificate) can only do so much.  In this context, SSL does not make the page secure; it simply means that data sent between you and the host is encrypted in transit.

Rapport doesn't protect against this type of attack.  I'd go as far as to say it makes the situation worse by lending credibility to a page which would otherwise look suspicious.

There's a world of difference between "there are no security issues" and "we found no security issues".

Is this common?

Sadly, yes... and all too often with companies which should know better.  You can expect (and to some extent, excuse) the odd mistake from time to time, but the following video demonstrates a systemic failure at Santander.  The lack of communication and 10-month delay just adds salt to the wound... prompting this (and the previous) blog post.

There are other issues, some of which I'm not prepared to disclose as they would immediately pose a risk if exploited.  It's unlikely I'll hear any more from Santander, but based on what I've seen, I wouldn't recommend anyone bank online with them.  If the site fails the most basic of tests, the phrase "where there's smoke..." comes to mind.

Select "full screen" at the bottom right.