API Best Practices Blog

A Short History of API Authentication (and where it’s going): from HTTP basic to OAuth 2.0 »

Part 1: The Web

In the beginning -- way back in the beginning -- the web was all about open access. Tim Berners-Lee and his colleagues focused on making information available, not on protecting it from unauthorized users.

But as time went on, and as Al Gore took the initiative in liberating the government-run Internet backbone for commercial use (really), the Web became about "e-commerce,"  and e-commerce required security. SSL matured to ensure that sensitive traffic was encrypted all the way from the client to the server and back, and various schemes emerged to allow user authentication.

The oldest and most common format for web authentication is HTTP Basic authentication. This is what you get when you visit a web site and a little browser window pops up requesting a username and password. Every web browser and every major web server supports some form of this.
 
From a web design perspective, HTTP Basic has a big disadvantage in that it's implemented entirely by the browser, and can't be customized for each site. As the quality of graphic design improved on the web, designers soon realized that they wanted not a generic little grey box on the screen, but a carefully-designed login page, with logos, disclaimers, and the like, or a discreet login button on the corner of a web page. The combination of HTTP Basic and HTML just didn't allow this.

The result was the rise of form-based authentication. This is what the vast majority of secure web sites use today. As a user, you visit a web page that prompts you for a username and password. If authentication is successful, then under the covers you are granted a unique cookie, which your web browser sends with each subsequent request. As the user you never see this -- it just looks like you logged in and now the site works -- but under the covers it is quite different from Basic.

Both Basic and form-based authentication rely on SSL to create an encrypted tunnel between the client and server. Without that encrypted tunnel, anyone snooping the Internet or the open Wifi at Starbucks can see the passwords go by in the clear. Fortunately, SSL protects against this very well, but sometimes developers neglect to use it, users neglect to ask for it (as most of us do with Facebook), and sometimes the traffic travels over unencrypted links behind the firewall of a large network.
 
The web community attempted to counter this using HTTP Digest authentication, which sends a one-way hash of the password instead of the password itself, so the password can't be recovered from the traffic even if SSL is not used -- but it still must be implemented by the browser and can't be designed into a nice UI. It never took off.
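
To make the idea behind Digest a bit more concrete, here is a minimal Python sketch of the simplest response calculation described in RFC 2617 (the variant without `qop`). The username, realm, password, and nonce values are invented for the example, and a real client would take the realm and nonce from the server's challenge.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the hex MD5 digest of a string (RFC 2617 Digest is built on MD5)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """Compute the no-qop Digest response: MD5(HA1:nonce:HA2)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


# Hypothetical values; the server issues a fresh nonce with each challenge.
first = digest_response("alice", "example", "s3cret", "GET", "/index.html", "nonce-1")
second = digest_response("alice", "example", "s3cret", "GET", "/index.html", "nonce-2")
print(first != second)
```

Only the hash goes over the wire, and because the nonce is mixed in, the hash for one challenge is not the same as the hash for the next.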
 
For a higher level of security, SSL has long supported two-way authentication, which requires that individual end users request digital certificates for each site they plan to visit and install those certificates in their browsers. The overhead of issuing PKI credentials to end users was enormous and this never took off either.
 
Part 2: Early APIs
 
Some early APIs were built right on top of existing web sites built using form-based authentication. The easiest way to implement them was to use the same authentication mechanism, so API developers would create a method called "login" that returns a security token, another method called "logout," and require the security token on every API call.
 
This approach makes the API easy to tack on top of an existing web app, but it is more work on the client side and hurts API adoption. An administrator can't easily drive the API from a script: the script has to log in, extract the security token from the response, make the call, and then log out.
 
Other early APIs just used HTTP Basic authentication. It's simple and works with every client (and with every shell script based on "cURL"), but it requires SSL to be used, often leaving it up to the client to "remember" to use it. Still, it's effective as long as the user has the password for the API handy.
 
Yet others, especially Amazon, decided they wanted to avoid using SSL for performance reasons, but they also wanted to avoid using the uncommon HTTP Digest authentication. (Amazon's S3 is used to store multi-gigabyte files and SSL does make a difference there.) They chose to create their own access control mechanisms based on secret keys and in some cases, digital signatures. The result was a bit of programming for each developer starting out with AWS, but Amazon's services were so useful and cheap that it didn't matter. By now there are numerous libraries to make this process easier.
 
Part 3: APIs get formal
 
The first real access control mechanism aimed at the needs of APIs and API developers is OAuth. The idea came from a popular API (Twitter) and a defunct web site (Ma.gnolia). The goal was to make it possible for a Ma.gnolia user to access Twitter without requiring that each user give Ma.gnolia their Twitter password.
 
The result, OAuth 1.0, is like a "valet key" for an API -- it is a token that gives a single client or web site access to a particular API on behalf of one user. The client or web site can get an OAuth "access token" without ever seeing the user's password because the two web sites do a sort of "credential dance" to exchange the secret token. Once the token is issued, the user can see it or revoke it, thus taking away access from the client or site without requiring a password change.
 
Since then, OAuth 1.0 has seen a patch (it became 1.0a to fix a security flaw), and an IETF committee is working on OAuth 2.0. This new version includes some important simplifications and a wider range of use cases. OAuth has become the de facto standard for API authentication. (for more see my earlier post on OAuth 1.0 vs. OAuth 2.0)
 
The Future
 
OAuth is likely to dominate the world of APIs for a long time. With OAuth 2.0, soon API providers and users will have the option of using a simple "bearer token" in conjunction with SSL, or a signed token that can remain secure without further encryption. OAuth 2.0 is flexible enough to be used in the original "web site to API" use case that OAuth 1.0 was designed to handle, but it can also be used for access by mobile devices, where it can be important to be able to remotely revoke the access tokens that might be stored on your phone without going around and changing innumerable passwords.
 
Still, there remain problems to be solved. What about access for simple management APIs, for instance? Even OAuth 2.0 is cumbersome and when SSL is always in use sometimes a plain old password is sufficient. API providers can and should consider supporting good old Basic authentication alongside OAuth 2.0 if for no other reason than convenience -- as long as SSL and a strong password are required.
 
What about mobile apps? Is there a way for the server hosting an API for, say, a large bank to tell whether a request is coming from an official application or from a rogue app that is attempting to "phish" passwords by pretending to be the real thing? Can we do something by combining signed applications with server-side validation, or is a secure app store the only way to protect against mobile malware?
 
The world of APIs is evolving and there's no doubt that security technology will continue to evolve along with it.

(For more on API security issues see our other entries on OAuth and API authorization, identity, and security - or get it all with our "Is your API Naked" whitepaper)

Why basic auth and social security numbers both suck (and how OAuth helps for APIs) »

Eggs…

Around the time of the industrial revolution, there was a problem—too many ships were sinking. So in 1891, the Bulkhead Committee proposed a "Grade I subdivision" that, among other things, required passenger liners longer than 425 feet to be able to float even if two compartments were compromised. Accidents will always happen, but the committee wanted to minimize any loss of life when they did.

It was a simple idea: that a single hull breach should not sink a vessel.

Jump ahead 118 years to the month in which my son started kindergarten. It's going pretty well for him. But yesterday, a notice came home asking us to supply his social security number to identify him in a longitudinal data system in order to comply with federal funding requirements. While I appreciate the future value such data will have, I am not at all comfortable giving away that secret number.

If you are anything like me, you think twice before giving out your social security number. I've had mine compromised before (I still have the Secret Service letter describing how it was discovered by customs in a briefcase along with those of other people who had worked for the same employer, but that's another story). This totally sucked, and I now hesitate even more than I did before when presented with a rental application / insurance form / school form / etc.

  • Every time I give out my social security number, I increase the chances of it being compromised.
  • Every time I build a ship without compartmentalization, I compromise the safety of its passengers.
  • Every time I give out my username and password for a web service, I increase the chances of it being compromised.

… And Baskets

By now you can probably see where I'm going… because at Apigee we're obsessed with APIs (think Glenn Close in Fatal Attraction but without all the psychosis). Smart people are going to build web services. To grow those services, they're going to build APIs. And those APIs need to have some way to authenticate users.

Since the username/password is still the primary way in which people identify themselves to a web service, you might think that securing your API with usernames/passwords makes a lot of sense. But you are only as strong as your weakest link… and once that password is compromised, the whole ship sinks.

There are perfectly good ways to deal with this, of course. API tokens are one example: usually a custom token passed as a query or header parameter. These make most sense when sent over SSL to keep them safe. Some providers, like 37signals with its API for Highrise, expect the token to be sent using HTTP Basic authentication in place of the username or password.

The OAuthpocalypse & The Unsinkable

Of course there are other ways. OAuth 1, popularized by Twitter as much as anyone, is a great example. At its best, it works like valet keys that users can give to applications to access their accounts on their behalf while retaining the option to revoke those permissions app by app. An additional benefit is that API analytics can be organized by application.

But OAuth has its own problems. It's harder to understand conceptually. It's harder to implement. Perhaps worst of all, it can disenfranchise users who must use proxies to get around the censorship walls put in place by governments like China and Iran.

OAuth 2 is different in some important ways. Facebook is already using a version of it with their Graph API, despite the spec still being in draft. Twitter appears to be using OAuth 2 draft for @Anywhere. The best benefit is that it's simpler and easier for developers to start using, since, among other things, it does away with the signing of base strings.

So OAuth does away with the password anti-pattern. But transitioning away from an authentication system like basic authentication to one like OAuth isn't ever easy. Change is hard—just look how much complaining follows every Facebook redesign. And things will break… and with the OAuthpocalypse, they did.

The Titanic was famously built to exceed the Bulkhead Committee's Grade I requirements. But that alone wasn't enough to save it. The hubris of excessive speed for the conditions meant the ship couldn't turn in time, and following the subsequent impact, five compartments flooded. It was an unprecedented disaster.

Twitter, on the other hand, knew up front that things were going to break. And when it became obvious that the deadline was coming too fast, they slowed down, delaying the event for two months. It was certainly painful for developers—though we still don't know the full extent to which this broke applications (though apparently not their own, thanks to a basic auth back door).

Overall, Twitter did an excellent job of handling the migration. They gave developers lots of time, and then they gave them more. As the date approached, they shut off basic auth for short periods—an iceberg ahead alarm bell, if you will. They slowly lowered the rate limits over two weeks. And most importantly, they communicated and coordinated well—Developer Advocates Taylor Singletary and Matt Harris, and other members of the Twitter API team worked tirelessly to provide tools and support devs as they struggled with the migration.

So far, things are looking pretty good for their great experiment. But only history will tell the full extent of the damage.

Tech Talk: API Security and Threat Protection »

Greg recorded a few whiteboard talks last month - this one is a good summary of recent posts on API Keys, API security recommendations, and OAuth best practices

(And here are some of our high-level API security features if you are looking into this.)

OAuth — Take care with those keys! »

A lot has been happening with OAuth recently. Earlier this year a security hole was discovered in the protocol which exposed it to potential “social engineering” attacks. However, the OAuth community is working on a revision to the spec that will eliminate this particular hole.

Last week we wrote a bit on OAuth as an option for API security.  But today I wanted to bring up a related OAuth issue - how do you securely manage all those keys?

With traditional username / password authentication, good security practices require you don't just have a big database on the back end with a list of unencrypted passwords. Instead, a hash of the password is stored, preferably using a salt. So someone who can read the password file can verify they have the right password, but cannot see the actual password.
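
As a rough sketch of what "a hash of the password, preferably using a salt" can look like, here is one way to do it with Python's standard library. The choice of PBKDF2-HMAC-SHA256 and the iteration count are my own assumptions for the example, not a recommendation from any particular standard.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumed work factor for the example


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key); a random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key from the candidate password and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)


# Only the salt and the derived key are stored, never the password itself.
salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

The server can check a login attempt by repeating the calculation, but nothing in the stored record gives the password back directly.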

It is still critical to protect access to these encrypted passwords. Otherwise, an attacker can mount a dictionary attack to try and crack them. However, even if someone gains access to your entire database of encrypted passwords, they can still only easily gain access to lousy passwords. At least users who choose secure passwords are relatively safe. (It is also critical to protect access to the cleartext password, but at least this mechanism doesn’t require that it be stored in a database for all to see.)

As networking and middleware people, we spend a lot of time thinking about the security of our network protocols, and especially ensuring that someone eavesdropping on a network cannot grab our passwords and other sensitive data as they fly by. But how many times have we heard of a security breach caused by a stolen laptop? I would argue that protecting so-called “data at rest” is just as important as, or maybe even more important than, protecting the data flying around your network.

Now, back to OAuth. Each “user” in OAuth holds something called an “access token,” which is like a username, and a “token secret,” which is like a password. When a request is sent over the network containing an OAuth authentication token, a bunch of data in the request is signed using the token secret, but the secret itself is never sent over the network. That way, regardless of whether SSL is in use, there is no way to gain access to the token secret by sniffing the network.

However, on the server side, in order to validate the OAuth token, the server must make the same calculation that the client made when it signed the data that goes with the token. That means that both the client side and the server side in OAuth must be able to read the unencrypted token secret from some sort of database. Without it, OAuth doesn’t work. There’s no set of standard ways for storing those keys like there are for passwords, so presumably different implementations are storing them in different ways.
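
Here is a small Python sketch of why the server needs the raw secret. It uses HMAC-SHA1, the hash behind OAuth 1.0's HMAC-SHA1 signature method, but the message format and the `TOKEN_SECRETS` store are simplified stand-ins I made up for the example; this is not the real OAuth signature base string.

```python
import hashlib
import hmac

# Hypothetical server-side store: token -> secret. It has to hold the raw
# secret, because a one-way hash of it could not be used as the HMAC key.
TOKEN_SECRETS = {"token-123": b"very-secret-value"}


def sign(secret: bytes, message: bytes) -> str:
    """Sign a message with HMAC-SHA1 and return the hex signature."""
    return hmac.new(secret, message, hashlib.sha1).hexdigest()


def server_verify(token: str, message: bytes, signature: str) -> bool:
    """Recompute the signature from the stored secret and compare."""
    secret = TOKEN_SECRETS.get(token)
    if secret is None:
        return False
    return hmac.compare_digest(sign(secret, message), signature)


# Client side: only the token, the message, and the signature go over the wire.
message = b"GET&/photos&nonce=abc&timestamp=1"
signature = sign(b"very-secret-value", message)
print(server_verify("token-123", message, signature))
```

Contrast this with the password case: a password check works against a stored hash, while this check only works if the stored value is the secret itself.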

As a result, any client and any server that uses OAuth has to take extra-special care with all those token secrets. Otherwise, anyone who gets access to the database of tokens and secrets used by the back end servers immediately has access to all the OAuth-enabled accounts.

I am not suggesting a change to the OAuth protocol here — it solves an important problem. However, I am suggesting that anyone who implements either the “service provider” or “consumer” side of OAuth take very special care of those tokens!

For instance:

  •     If they’re on a regular disk file, protect them using filesystem permissions, make sure that they’re encrypted, and hide the password well.
  •     If they’re in a database, encrypt the fields, store the key well, and protect access to the database itself carefully.
  •     If they’re in LDAP, do the same.

Come to think of it, perhaps the world needs a standard LDAP schema for storing OAuth secrets in a secure way. Anyone care to make a proposal?

Don’t roll your own: API Security Recommendations »

Let's boil down the examples and common pitfalls from our last two entries on API Identity and Authorization and more API security choices.

Use API Keys for non-sensitive data (only): 

If you have an “open” API - one that exposes data you’d make public on the Internet anyway -  consider issuing non-sensitive API keys. These are easy to manipulate and still give you a way to identify users. Armed with an API key, you have the option of establishing a quota for your API, or at least monitoring usage by user. Without one, all you have to go on is an IP address, and those are a lot less reliable than you might think. (Why don’t web sites issue “web site keys?” Because they have cookies.)
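
For instance, a per-key quota can be as simple as a counter. This is only a sketch: the key name and the tiny daily limit are made up, and a real service would persist the counts and reset them on a schedule.

```python
from collections import Counter

DAILY_QUOTA = 3  # hypothetical limit, kept tiny for the example
calls_today = Counter()


def allow_request(api_key: str) -> bool:
    """Count the call against the key and report whether it is within quota."""
    if calls_today[api_key] >= DAILY_QUOTA:
        return False
    calls_today[api_key] += 1
    return True


# The fourth call with the same key goes over the limit.
results = [allow_request("key-abc") for _ in range(4)]
print(results)  # [True, True, True, False]
```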

For example, the Yahoo Maps Geocoding API issues API keys so it can track its users and establish a quota, but the data that it returns is not sensitive so it’s not critical to keep the key secret.

Use username / password authentication for APIs based on web sites: 

If your API is based on a web site, support username/password authentication. That way, users can access your site using the API the same way that they access it using their web browser. For example, the Twitter API supports username/password authentication, so when you access it using a Twitter API client like Spaz or TweetDeck you simply enter the same password you use when you use the twitter.com web site.

However, if you do this, you may want to avoid session-based authentication, especially if you want people to be able to write API clients in lots of environments. It is very simple to use “curl,” for instance, to grab some data from the Twitter API because it uses HTTP Basic authentication. It is a lot more complex if I instead have to call “login,” and extract a session ID from some cookie or header, and then pass that to the real API call I want to make...

OAuth for server-to-server APIs that are based on web sites:

If your API is based on a web site (so you already have a username / password for each account) and the API will also be used by other sites, then support OAuth in addition to username / password authentication. This gives users a lot of peace of mind, because they can authorize another site to access their sensitive data without giving that site their password permanently.

Use SSL for everything sensitive:

Unless your API has only open and non-sensitive data, support SSL, and consider even enforcing it (that is, redirect any API traffic on the non-SSL port to the SSL port). It makes other authentication schemes more secure, and keeps your user’s private API data from prying eyes – and it’s not all that hard to do.

Don’t roll your own!

If the above suggestions still don’t apply to you, then keep looking – between OAuth, OpenID, SAML, HTTP authentication, and WS-Security, there are a lot of authentication schemes, and each has its pros and cons.

So, wrapping up API security in our series on 10 API roadmap considerations, here are some suggested questions you may want to ask when putting together your API security roadmap:

How valuable is the data exposed by your API?

  • If it’s not that valuable, or if you plan to make it as open as possible, is an API key enough to uniquely identify users?
  • If it is valuable, can you reuse the username and password scheme for each user?
  • Are you using SSL? Many authentication schemes are vulnerable without it.

What other schemes and websites will your API and users want to integrate with?

  • If your API will be called programmatically by other APIs, or if your API is linked to another web site that requires authentication, have you considered OAuth?
  • If you have username/password authentication, have you considered OpenID?
  • Can you make authorization decisions in a central place?

What other expectations might your customers have?

  • If your customers are enterprise customers, would they feel better about SAML or X.509 certificates?
  • Can you change your authentication approach, or support more than one, for diverse enterprise customers?
  • Do you have an existing internal security infrastructure that you need your API to interact with?

Up next:  API Data Protection and thanks to Torbin H. for the photo.

More API Security Choices - OAuth, SSL, SAML, and rolling your own »

(This entry continues our API Roadmap consideration series and is a continuation of last week's API Identity and Authorization.)

Session based Authentication – cumbersome with RESTful APIs

Lots of APIs  support session-based authentication. In these schemes, the user first has to call a “login” method which takes the username and password as input and returns a unique session key. The user must include the session key in each request, and call “logout” when they are done. This way, authentication is kept to a single pair of API calls and everything else is based on the session.
 
This type of authentication maps very naturally to a web site, because users are used to “logging in” when they start working with a particular site and “logging out” when they are done.
 
However, session-based authentication is much more complex when associated with an API. It requires that the API client keep track of state, and depending on the type of client that can be anything from painful to impossible. Session-based authentication, among other things, makes your API less “RESTful” - an API client can’t just make one HTTP request, but now a minimum of three.
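
A toy in-memory version shows the three calls and the state the client has to carry. The `login`, `get_profile`, and `logout` functions and the account data are invented for the illustration; they stand in for HTTP calls to a real API.

```python
import secrets

USERS = {"alice": "s3cret"}  # hypothetical account
SESSIONS = {}                # session key -> username


def login(username, password):
    """Check the password and hand back a new session key, or None."""
    if USERS.get(username) != password:
        return None
    key = secrets.token_hex(16)
    SESSIONS[key] = username
    return key


def get_profile(session_key):
    """The 'real' API call; it only works with a live session key."""
    user = SESSIONS.get(session_key)
    return {"user": user} if user else None


def logout(session_key):
    """Discard the session so the key stops working."""
    SESSIONS.pop(session_key, None)


# Three calls, with the session key carried between them by the client.
key = login("alice", "s3cret")
profile = get_profile(key)
logout(key)
print(profile, get_profile(key))
```

With HTTP Basic the same work is a single request that carries the credentials, and the client keeps no state between calls.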
 
The Role of OAuth

Today, many APIs also support OAuth authentication. OAuth was designed to solve the application-to-application security problem. The idea of OAuth is that it gives the user of a service or web site a way to conditionally grant access to another application. OAuth makes it possible for a human user to individually grant other APIs or sites access to their account without sharing their actual password. It works by giving a “token” to each API or site that will access the account, which the user may revoke at any time they wish.

For instance, if web site FooBar.com wants access to the Twitter API on behalf of John Smith, then OAuth specifies the protocol that FooBar.com must go through to get an OAuth token for the Twitter API. Part of this process requires John Smith to log in to his Twitter account using his normal username and password (which, in the OAuth protocol, are never seen by FooBar.com) and authorize FooBar to access his account. The result is that FooBar.com will have an OAuth token that gives it access to John Smith’s account. If, later on, John Smith decides he no longer trusts FooBar.com, he has the option to revoke that OAuth token without affecting his regular password or any other accounts.
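
The bookkeeping behind that story can be sketched in a few lines of Python. To be clear, this is not the OAuth protocol itself (there is no redirect or signature here), only an assumed in-memory model of per-application tokens that can be revoked without touching the password. `Other.example` is a made-up second site.

```python
import secrets

PASSWORDS = {"john": "hunter2"}  # hypothetical account; never shown to the apps
TOKENS = {}                      # token -> (user, app)


def authorize(user: str, password: str, app: str):
    """The user logs in at the provider and approves one app; the app gets a token."""
    if PASSWORDS.get(user) != password:
        return None
    token = secrets.token_urlsafe(16)
    TOKENS[token] = (user, app)
    return token


def revoke(user: str, app: str) -> None:
    """Drop every token the user granted to one app; other apps are untouched."""
    for token, grant in list(TOKENS.items()):
        if grant == (user, app):
            del TOKENS[token]


def is_valid(token: str) -> bool:
    """Report whether a token is still on file."""
    return token in TOKENS


foobar_token = authorize("john", "hunter2", "FooBar.com")
other_token = authorize("john", "hunter2", "Other.example")
revoke("john", "FooBar.com")
print(is_valid(foobar_token), is_valid(other_token))
```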

This process makes OAuth ideal for communication from one application to another – for instance, allowing MySpace to post photos to your Twitter account without requiring that you enter your Twitter password every time you want to do it – but it can be used for any kind of API communications as well.
 
Two-Way SSL, X.509, SAML, WS-Security...

Once we leave the world of “plain HTTP” we encounter many other ways of authentication, from SAML, X.509 certificates and two-way SSL, which are based on secure distribution of public keys to individual clients, to the various WS-Security specifications, which build upon SOAP to provide... well, just about everything.

An API that will primarily be used by “enterprise” customers – that is, big IT organizations – might consider these other authentication mechanisms, such as an X.509 certificate or SAML, for more assurance over the authentication process than a simple username/password. Also, a large enterprise may have an easier time accessing an API written to the SOAP standard because those tools can import a WSDL file and generate an API client in a few minutes.

The idea is to know your audience. If your audience is a fan of SOAP and WSDL, then consider some of the more SOAP-y authentication mechanisms like SAML and the various WS-Security specifications. (And if you have no idea what I’m talking about, then your audience probably isn’t in that category!)

 
Rolling Your Own

In between OAuth, HTTP Basic, and the basic API key are many alternatives. It seems that there are as many other API authentication schemes as there are APIs. Amazon Web Services, Facebook, and some Google APIs, for instance, are some big APIs that combine an API key with both public and secret data, usually through some sort of encryption code, to generate a secure token for each request.

The issue – every new authentication scheme requires API clients to implement it. On the other hand, OAuth and HTTP Basic authentication are already supported by many tools. The big guys may be able to get away with defining their own authentication standards but it’s tough for every API to do things its own way.
 
SSL

Most authentication parameters are useless, or even dangerous, without SSL. For instance, in “HTTP Basic” authentication the API must be able to see the password the client used, so the password is encoded – but not encrypted – in a format called “base 64.” It’s easy to turn this back into the real password. The antidote to this is to use SSL to encrypt everything sent between client and server. This is usually easy to do and does not add as much of a performance penalty as people often think. (With Apigee’s products, we see a 10-15 percent drop in throughput when we add SSL encryption.)
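
To see just how easy, here is a short Python sketch that builds a Basic `Authorization` header value and then turns it straight back into the username and password. The credentials are invented for the example.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the Authorization header value used by HTTP Basic."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header: str) -> tuple:
    """Recover (username, password) from a Basic header; no key is needed."""
    encoded = header.split(" ", 1)[1]
    username, _, password = base64.b64decode(encoded).decode("utf-8").partition(":")
    return username, password


header = basic_auth_header("alice", "s3cret")
print(header)
print(decode_basic_auth(header))
```

Anyone who can read the header can do the same thing, which is why Basic only makes sense over SSL.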

Some security schemes, such as OAuth, are designed to be resistant to eavesdropping. For instance, OAuth includes a “nonce” on each request, so that even if an eavesdropper intercepts an OAuth token on one request, they cannot use it on the next. Still, there is likely other sensitive information in the request as well.
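
The server side of the nonce idea can be sketched as a "have I seen this before?" check. This is a simplified illustration with an in-memory set. A real server would also look at the request timestamp so that it does not have to remember nonces forever, and would verify the request signature before trusting either value.

```python
seen_nonces = set()


def accept_request(token: str, nonce: str) -> bool:
    """Accept a (token, nonce) pair only the first time it is seen."""
    if (token, nonce) in seen_nonces:
        return False
    seen_nonces.add((token, nonce))
    return True


first_try = accept_request("token-123", "nonce-a")  # original request
replayed = accept_request("token-123", "nonce-a")   # captured and sent again
fresh = accept_request("token-123", "nonce-b")      # next legitimate request
print(first_try, replayed, fresh)  # True False True
```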

In general, if your API does not use SSL it also potentially exposes everything that your API does.

Next time, we'll make some recommendations among different options - or get our full paper here.

Do you need API keys?  API Identity vs. Authorization »

(This is part 3 in our series on "Is your API naked? 10 API Roadmap considerations")

We’ve seen very few API providers with a completely open API – almost all employ at least one of these:

  • Identity - who is making an API request?
  • Authentication - are they really who they say they are?
  • Authorization – are they allowed to do what they are trying to do?

Do you need them all?  Maybe not.  Some APIs need only to establish identity and don’t really need to authenticate or authorize.

API Identity vs. Authentication - Compare Google Maps and Twitter  

Take Yahoo and Google maps – they are fairly open.  They want to know who you are but they aren’t concerned what address you are looking up. So they use an “API key” to establish identity, but don’t authenticate or authorize. So if you use someone else’s API key, it’s not good but not a serious security breach.

The API key lets them identify (most likely) who is making an API call so they can limit the number of requests you can make. Identity is important here to keep service volume under control.

Then take Twitter’s API -  open for looking up public information about a user, but other operations require authentication. So Twitter supports both username/password authentication as well as OAuth. Twitter also has authorization checks in its code, so that you cannot “tweet” on behalf of another user without either their password or an OAuth key to their account. This is an example of an API that implements identity, authentication and authorization.

The “API Key” – do you need one? 

API keys originated with the first public web services, like Yahoo and Google APIs.  The developers wanted to have some way to establish identity without having the complexity of actually authenticating users with a password, so they came up with the “API key,” which is often a UUID or unique string. If the API key doesn’t grant access to very sensitive data, it might not be critical to keep secret, so the key is easy for consumers of the API to use however they invoke it.

An API key gives the API provider a way to (most of the time) know the identity of each caller to maintain a log and establish quotas by user (see the last section). 

Usernames and Passwords – again, see Twitter

With more sensitive data, a simple API key is not enough, unless you take measures to ensure users keep the key secret. An alternative is username/password authentication, like the authentication scheme supported by the vast majority of secure web sites. 

It’s easiest to use the “HTTP Basic” authentication that is built into HTTP. The advantage of using this technology is that nearly all clients and servers support it. There is no special processing required, as long as the caller takes reasonable precautions to keep the password secret.

Twitter simplifies things for their users by using usernames and passwords for API authentication.  Every time a user starts a Twitter client, it either prompts for the username and password to use, or it fetches them from the disk (where they are somehow scrambled or encrypted where possible). So here it makes a lot of sense to have the same username / password for the Twitter API that is used for the web site.

Usernames and passwords also work well for application-to-application communications. The trick - the password must be stored securely, and if it’s being used by a server, where do you store it?  If you are running an application server that uses a database, you already have solved this same problem, because the database usually requires a password too. Better application server platforms include a “credential mapper” that can be used to store such passwords relatively securely.

There is a lot we'd like to write about around security so we'll split this up into a couple entries.  Next time:  Session-based authentication, OAuth, SSL and WS-Security, and rolling your own.

We’re joining the Cloud Security Alliance »

Today, Sonoa is joining the Cloud Security Alliance. Why are we doing this?

The first reason is because we’ve talked to hundreds of companies who are building APIs and web services both internally and externally, and for the most part they are using cloud services from other companies, or they are planning to expose their own web services to others on the Internet, or they are running their own infrastructure in the cloud – or all of the above. Cloud computing is a big part of what we do, and we want to make it succeed. The Cloud Security Alliance is a great group of experienced security architects working on solving the most vexing problem faced by companies hoping to take advantage of cloud computing – security.

We encounter security issues all the time when we talk to customers about their own experiences with APIs and cloud services. For instance, there seem to be as many ways to authenticate API users as there are companies publishing APIs. There are venerable standards like HTTP authentication, WS-Security, and two-way SSL, new ones like OAuth and OpenID, and the countless other schemes that API providers also come up with. How does a new API provider deal with all those standards? How does a company consuming some or all of these APIs deal with the proliferation of authentication mechanisms?

In this area, we are not necessarily looking for the CSA to define new standards, but to spend some time identifying best practices for producers and consumers of these APIs, and helping them choose when it's necessary to make a choice.

We also encounter security issues when companies are looking to take advantage of cloud computing – especially when they are planning to run some or part of their infrastructure on a public cloud platform. What is the most effective way to connect services on a public cloud with services running behind the traditional corporate firewalls? What kinds of data can be sent to a public cloud platform and what data must remain in a corporate-owned data center? What are the best practices around data encryption, authentication, data retention, and the maze of legal requirements about all this? On these areas, the CSA has already shown itself to be leading the field, and we would like to help.

-Greg