As long as people can be tricked, there will always be phishing (or social engineering) on some level or another, but there’s a lot more that we can do with technology to reduce the effectiveness of phishing, and the number of people falling victim to common theft. Making phishing less effective ultimately increases the cost to the criminal, and reduces the total payoff. Few would dispute that our existing authentication technologies are stuck in a time warp, with some websites still using standards that date back to the 1990s. Browser design hasn’t changed very much since the Netscape days either, so it’s no wonder many people are so easily fooled by website counterfeits.
You may have heard of a term called the line of death. This is used to describe the separation between the trusted components of a web browser (such as the address bar and toolbars) and the untrusted components of a browser, namely the browser window. Phishing is easy because this separation is a farce. We allow untrusted elements in the trusted area (such as a favicon, which can display a fake lock icon), tolerate financial institutions that teach users to accept any variation of their domain, and use a tiny monochrome font that makes URLs easy to mistake, even if users were paying attention to them. Worse still, it’s the untrusted space, the website portion of the web browser, where we’re telling users to conduct the trusted operations of authentication and credit card transactions!
Our browsers are so awful today that the very best advice we can offer everyday people is to try and memorize all the domains their bank uses, and get a pair of glasses to look at the address bar. We’re teaching users to perform trusted transactions in a piece of software that has no clear demarcation of trust.
The authentication systems we use these days were designed to conduct secure transactions with anyone online, without knowing who they are. Most users today, however, know exactly who they’re doing business with; they do business with the same organizations over and over. Yet to the average user, a URL or an SSL certificate with a slightly different name or fingerprint means nothing. The average user relies on the one thing we have no control over: what the content looks like.
I propose we flip this on its head.
When Apple released Apple Pay on the Web, they did something really unique, but it wasn’t the payment mechanism that was revolutionary to me – it was the authentication mechanism. It’s not perfect, but it does have some really great concepts that I think we can, and should, adopt into browser technology. Let’s break down the different concepts of Apple’s authentication design.
When you pay with Apple Pay, a trusted overlay pops up over the content you’re viewing and presents a standardized, trusted interface to authenticate your transaction. Having a trusted overlay is completely foreign to how most browsers operate. Sure, HTTP authentication can pop up a window asking for a username and password, but this is different. Safari uses an entirely separate component with authentication mechanisms that execute locally, not as part of the web content, and that the web page can’t alter. Some of these components run in a separate execution space from the browser, such as the Secure Element on an iPhone. The overlay itself is code running in Safari and the operating system, instead of being under the control of the web page.
A separate trusted user interface component is unmistakable to the user, but many such components can be spoofed by a cleverly designed phishing site. The goal here is to create a trusted compartment for the authentication mechanism to live that extends beyond the capabilities of what can typically be done in a web browser. Granted, overlays and even separate windows can be spoofed, and so creating a trusted user interface is no easy task.
From the user’s perspective, it doesn’t matter what the browser is connecting to, only what the web page looks like. One benefit Apple Pay has over typical authentication is that, because the code for it lives outside of the web page (in the browser and the operating system), it has control over what systems it connects to, what certificates it’s pinned to, and how that information gets encrypted. We don’t really have this with web-based authentication mechanisms. The phishing site might have no SSL at all, or might use a spoofed certificate. The responsibility of authenticating the organization is left up to the user, which is simply an awful idea.
Authenticating the User Interface
Usually when you think about an authentication system, you think about the user authenticating with the website, but before that happens with Apple Pay, the Apple Pay system first authenticates with the user to demonstrate that it’s not a fake.
In the case of Apple Pay, the overlay displays your various billing and shipping addresses and credit cards on file; sensitive information that Apple knows, but a phishing site won’t. Some of this is stored locally on your computer so that it’s never transmitted.
We’ve seen less effective versions of this with “SiteKey”, sign-on pictures and so on, but those can easily be proxied by the man-in-the-middle because the user is relying on the malicious website to perform the authentication. In Apple’s model, Apple code performs the authentication completely irrespective of what content is loaded into the browser.
No Passwords Transmitted
The third important component of Apple Pay to note is that passwords aren’t being sent, and in fact aren’t being entered at all. There’s nothing to scam the user out of except for some one-time use cryptograms that aren’t valid for any other use. While TouchID is cool, there are also a number of other forms of password-less authentication mechanisms you can deploy once you’re executing in trusted execution space.
One of the most common ways to authenticate without transmitting a password is challenge/response. C/R authentication has been around for a long time, and allows legacy systems to continue using passwords, but greatly reduces the risk of interception by not sending the password. As much as I am a fan of biometrics fused with hardware, that approach isn’t very portable. That is, I can’t just jump on my friend’s computer and pay for something with Apple Pay without reprovisioning it.
Let’s assume that the computer has control over the authentication mechanism, instead of the website. The server knows your password, and so do you. The server can derive a cryptographic challenge based on that password. Your computer can compute the proper response based on the password you enter. Challenge/response can be done many different ways. Even the ancient Kerberos protocol supported cryptographic challenge/response. That secure user interface can flat out refuse to send your password anywhere, and so a phishing site would have to convince the user to type it not just into a different site, but into a completely different authentication mechanism that they’ll be able to identify as different. Sure, some people will still fall for this, but far fewer than will fall for a perfect copy of a website. That small percentage of people is a smaller problem to manage.
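To make the exchange above concrete, here is a minimal sketch of one way challenge/response could work, assuming an HMAC-based scheme. The function names, the PBKDF2 parameters, and the idea of mixing the site's origin into the response are all illustrative assumptions, not part of Apple Pay or any existing browser standard.

```python
import hashlib
import hmac
import secrets


def derive_key(password: str, salt: bytes) -> bytes:
    # Both sides derive the same key from the password, so the password
    # itself never travels. Iteration count is an illustrative assumption.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def make_challenge() -> bytes:
    # Server side: a fresh random value for every login attempt.
    return secrets.token_bytes(32)


def compute_response(key: bytes, challenge: bytes, origin: str) -> bytes:
    # Trusted UI side: MAC over the challenge plus the origin the browser is
    # actually connected to (assumption: the trusted code knows the origin).
    message = challenge + origin.encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_response(key: bytes, challenge: bytes, origin: str, response: bytes) -> bool:
    # Server side: recompute the expected value and compare in constant time.
    expected = compute_response(key, challenge, origin)
    return hmac.compare_digest(expected, response)


if __name__ == "__main__":
    salt = secrets.token_bytes(16)
    server_key = derive_key("correct horse", salt)   # stored at enrollment
    challenge = make_challenge()
    client_key = derive_key("correct horse", salt)   # derived from typed password
    answer = compute_response(client_key, challenge, "bank.example")
    print(verify_response(server_key, challenge, "bank.example", answer))
```

In this sketch, a response computed while connected to a lookalike origin would not verify at the real site, because the origin is part of the MAC input. Whether a real design should bind the origin this way is a choice made here for illustration only.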
Why don’t we use challenge/response on web pages today? For one, because we’re still authenticating in untrusted space (the browser window). The user has no idea (and doesn’t care) what happens to their password when they type it into some web browser window, and it’s just as easy to phish someone no matter what authentication mechanism you’re using in the background. What makes this feasible now is that in our ideal model, we’re doing authentication in trusted execution space – space that’s independent of the web page. This changes the game. Take the Touch Bar for example. TouchID is authenticated on the Touch Bar, but password entry could also be authenticated on it from the web browser.
An Optimal Authentication Mechanism
The ultimate goal is to condition the user to a standardized interface that can both authenticate the validity of the resource as well as authenticate itself to the user before the user is willing to accept its legitimacy and input a password.
Conditioning the User
A user interface element that is very difficult to counterfeit can also be quite difficult to create, but the benefits are considerable: If someone spends enough time around real money, they’ll be able to spot a counterfeit with a much higher success rate. On the other hand, having to look at a dozen different, poorly implemented authentication pages will condition users to accept anything they see as being real.
Our ideal authentication mechanism has an unmistakable and unreproducible user interface element. The user visits a website requiring authentication, and that website includes the necessary tags to invoke the browser’s authentication code, executed separately. Regardless of the website, this standardized authentication component is activated with a standard look; as a trusted component of the browser. Plain Jane, this could easily be an overlay that appears over the portion of the web browser that’s out of reach by the website (e.g. the address bar area). Get a bit fancier, and we’re talking about incorporating the Touch Bar or other “out of band” mechanisms on equipped machines to notify the user that an authentic authorization is taking place.
Get the user used to seeing the same authentication mechanism over and over again, and they’ll be able to spot cheap counterfeits much more easily. Needle moved.
Authenticating the User Interface
The user interface itself needs to be authenticated with the user in ways that make cheap knockoffs stand out. Since the browser controls this, and not the website itself, we can do a number of different things here:
- Display the user’s desktop login icon and full name in the window.
- Display personal information specified by the user when the browser is first set up; e.g. “show me my first card in Apple Pay” or “show me my mailing address” whenever I am presented with an authentication window.
- Display information in non-browser areas, such as on devices equipped with a Touch Bar, change the system menu bar to blue or green, or present other visual cues not accessible to a web browser.
- Provide buttons that interact with the operating system in a way that a browser can’t (one silly example would be to invert the colors of the entire screen when held down).
- Suspend and dim the entire browser window during authentication.
Authenticating the Resource
Authenticating the resource that the user is connecting to is one of the biggest challenges in phishing. How do you tell the user that they’re connecting to a potentially malicious website without knowing what that website is? We’re off to a good start by executing code locally (rather than remote website code) to perform the authentication. Because of this, we can do a few interesting things that we couldn’t do before:
- We can validate that the destination resource is using a valid SSL certificate. Granted, this can be spoofed; however, it also increases the cost of running a phishing site, not just in dollars, but in the time required to provision new SSL certificates compared with the time it takes to add one to a browser blacklist.
- We can automatically pin SSL certificates to specific websites when the user first enrolls their account, and keep track of websites they’ve set up authentication with, so that we can warn them when asked to authenticate on a website that they never enrolled on.
- Existing black lists and white lists can now be tied to SSL certificate information, allowing us to make better automated determinations on the user’s behalf.
- We can share all of this information across all of the user’s devices, e.g. via iCloud, Firefox’s cloud sync, and so on, to make it portable.
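The enrollment-time pinning idea from the list above can be sketched as a small local store. This is a hypothetical illustration only: `PinStore`, `Verdict`, and the choice of a SHA-256 fingerprint over the certificate bytes are assumptions, and a real browser would also need to handle certificate renewal and syncing.

```python
import hashlib
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Verdict(Enum):
    TRUSTED = "trusted"            # enrolled site, certificate matches the pin
    NOT_ENROLLED = "not_enrolled"  # user never set up authentication here
    PIN_MISMATCH = "pin_mismatch"  # enrolled site, but a different certificate


def fingerprint(cert_der: bytes) -> str:
    # Fingerprint of the raw certificate bytes (assumed DER-encoded).
    return hashlib.sha256(cert_der).hexdigest()


@dataclass
class PinStore:
    # Maps lowercase host name -> pinned certificate fingerprint.
    pins: Dict[str, str] = field(default_factory=dict)

    def enroll(self, host: str, cert_der: bytes) -> None:
        # Called once, during the deliberate enrollment step.
        self.pins[host.lower()] = fingerprint(cert_der)

    def check(self, host: str, cert_der: bytes) -> Verdict:
        # Called every time a site asks the trusted UI to authenticate.
        pinned = self.pins.get(host.lower())
        if pinned is None:
            return Verdict.NOT_ENROLLED
        if pinned != fingerprint(cert_der):
            return Verdict.PIN_MISMATCH
        return Verdict.TRUSTED


if __name__ == "__main__":
    store = PinStore()
    store.enroll("bank.example", b"certificate bytes seen at enrollment")
    print(store.check("bank.example", b"certificate bytes seen at enrollment"))
    print(store.check("bank-login.example", b"some other certificate"))
```

The trusted interface could map `NOT_ENROLLED` and `PIN_MISMATCH` to the kind of prominent warning described above, rather than silently showing a login prompt.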
Other elaborate things we can do with the protocol might include storing a cached copy of an icon provided by the website when the site is first provisioned, giving the user a visual cue. In order for a phishing site to copy that visual cue, the user would have to step through a very obvious enrollment process that is designed to look noticeably different from the authentication process. Icons for any previously unknown sites could display a yellow exclamation mark or similar, to warn the user. In other words, that piece of content can only be displayed by websites the user has previously set up, because we’re in control of that content in local code.
We can also do some things that we are doing now, but better. For example, we can display the organization name and website name very clearly in our trusted window, in large text, and perhaps with additional visual cues, such as underlining similarities to other websites (e.g. PayPai.com) in red, and highlighting numbers in red (e.g. PayPa1.com). There’s no other content now to distract the user, because this is all happening in a trusted overlay, presumably even dimming the browser window.
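As a rough sketch of the visual cues just described, the trusted window could compare the requesting domain against the domains the user has already enrolled, and find the digits it should highlight. The edit-distance threshold and the function names below are assumptions chosen for illustration; a production check would also need to consider Unicode homoglyphs, which this sketch ignores.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    # Classic Levenshtein distance, computed one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def digit_positions(domain: str) -> List[int]:
    # Indexes of digits, which the UI could render in red.
    return [i for i, ch in enumerate(domain) if ch.isdigit()]


def lookalikes(domain: str, known: List[str], max_distance: int = 2) -> List[str]:
    # Enrolled domains that are close to, but not the same as, this domain.
    d = domain.lower()
    return [k for k in known
            if k.lower() != d and edit_distance(d, k.lower()) <= max_distance]


if __name__ == "__main__":
    enrolled = ["paypal.com", "example.org"]
    print(lookalikes("PayPai.com", enrolled))
    print(digit_positions("PayPa1.com"))
```

A domain that is nearly, but not exactly, one the user has enrolled with is precisely the case where the warning should be loudest.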
The user will still receive warnings when authenticating on someone else’s computer, and this is a good thing. The idea is to draw attention to the fact that you’re conducting a non-standard transaction and could potentially be giving out your credentials.
The goal with all of this is to remove the website content as the authenticating component. This is the #1 visual element the end-user is going to use to determine the legitimacy of a website: what it looks like. What I am suggesting is to dim that content completely and force them to focus on some very specific information and warnings.
Authentication With and Without Passwords
To improve upon our ideal authentication mechanism, we can deploy some better authentication protocols. Sending passwords back and forth can be omitted as a function of this mechanism. Websites adopting this new authentication mechanism present a great opportunity to force better protocol alternatives. Password authentication can be removed completely, using biometrics, when possible.
Two-Factor Authentication can be phished, but requiring it at enrollment (either by SMS, email, or authenticator) can dramatically limit a victim’s exposure to phishing. Requiring a secondary form of authentication for any passworded mechanisms will certainly diminish the success rate of a phish, and also increase the cost, requiring the man in the middle to be present and able to log in at that very moment.
For passworded authentication, challenge/response using cryptographic challenges can be forced, because we are running local code, and not website code. Once you’ve resolved that this standard will not support sending passwords in any way, shape, or form, you can reduce the transit attack surface significantly.
The overall benefit of an authentication mechanism that executes locally as a component of the browser (and potentially the operating system), rather than as a component of the website, is significant. It would standardize user interface components, protocol and security elements, and resource validation, and would provide a single point of entry to examine for further anti-phishing efforts that could extend far beyond the URL validation we’re limited to now.
Granted, this won’t address many other forms of social engineering. It’s very easy to send an email telling someone their account is limited, and direct them to some insecure site, but the idea is to condition and familiarize the user with one common set of authentication visuals so that they will question the legitimacy of any alternative visual elements if they appear. At present, the visual elements of a legitimate authentication page and a malicious one are identical. This approach sets out to stop that.
Not only would such a scheme greatly diminish the overall effectiveness of phishing attacks, but it would simultaneously help to get rid of all the awful custom code written by organizations doing authentication completely wrong. We see this every day; authentication has become a hodgepodge of developer ineptitude. Placing this responsibility on the browser’s code, rather than the website’s, will help to provide what would hopefully become an accepted standard (should a working group address this subject). At the very worst, we would have a few web browsers “doing it wrong” and needing to be fixed, rather than thousands of websites all needing to be fixed.
As long as people can be tricked, there will always be phishing (or social engineering) on some level or another, but there’s a lot more that we can do with technology to reduce the effectiveness of phishing, and the number of people falling victim to common theft.