OAuth 1.0 (the current spec version is 1.0a, which fixes a security problem with 1.0) solves an important problem in the world of APIs -- how one web application can give another application API access without requiring that the user give out their password. OAuth 1.0 solves this problem in a clever way through a secure handshake, via API calls, between the two applications. This has allowed APIs to go in places where they could never go before.
OAuth 1.0 works by ensuring that the API client and server share a token, which is like a username, and a token secret, which is like a password. The client must generate a "signature" on every API call by computing a keyed hash over a bunch of unique information using the token secret. The server must generate the same signature, and only grant access if both signatures match.
The advantage of this approach is that there is no way to find out the token secret, because it is never sent over the network. Only the signature derived from it is sent, and only the client and server hold the secret. It doesn't matter if the data is eavesdropped on a WiFi network in Starbucks or captured by a proxy like Apigee -- there's no way to see the secret, so there's no way to impersonate the client based on what's sent over the network. All of this is done without requiring SSL, since SSL can slow down the client and server alike, and make deployment of the API server more complex.
However, both client and server developers found the complexity of generating and validating signatures to be too much. There are many tricky things that a developer must get right, down to exactly what type of "URL Encoding" is used (it's not exactly the same as it's used in other places). If the client or server makes a single tiny mistake in the signature, it's invalid and it's hard to figure out what went wrong.
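To make the moving parts concrete, here is a minimal sketch of the HMAC-SHA1 signature method in Python. The function names are my own, the URL is assumed to be already normalized with no query string, and the consumer secret that the spec mixes into the signing key is included even though it isn't discussed above. Treat it as an illustration of where the traps are, not as a complete implementation.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value: str) -> str:
    # OAuth 1.0 leaves only letters, digits, "-", ".", "_" and "~" unescaped.
    # A space becomes %20 (not "+"), and "/" is escaped too.
    return quote(value, safe="-._~")


def signature_base_string(method: str, url: str, params: dict) -> str:
    # Encode every name and value, sort the pairs, join them as k=v&k=v,
    # then encode the whole parameter string a second time.
    pairs = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    return "&".join([method.upper(), percent_encode(url), percent_encode(normalized)])


def sign(method: str, url: str, params: dict, consumer_secret: str, token_secret: str) -> str:
    # The signing key is the two secrets joined by "&"; neither is ever sent.
    key = f"{percent_encode(consumer_secret)}&{percent_encode(token_secret)}"
    base = signature_base_string(method, url, params)
    digest = hmac.new(key.encode("utf-8"), base.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def verify(method: str, url: str, params: dict, consumer_secret: str,
           token_secret: str, received: str) -> bool:
    # The server repeats the client's calculation and compares in constant time.
    expected = sign(method, url, params, consumer_secret, token_secret)
    return hmac.compare_digest(expected, received)
```

Note the double encoding in signature_base_string: a space in a parameter value ends up as %2520 in the base string. That is exactly the kind of detail that produces a signature mismatch with no useful error message.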
OAuth 2.0 promises to simplify this stuff in a number of ways:
1. SSL is required for all the communications required to generate the token. This is a huge decrease in complexity because those complex signatures are no longer required.
2. Signatures are not required for the actual API calls once the token has been generated -- SSL is also strongly recommended here.
3. Once the token was generated, OAuth 1.0 required that the client send two security tokens on every API call, and use both to generate the signature. OAuth 2.0 has only one security token, and no signature is required.
4. It is clearly specified which parts of the protocol are implemented by the "resource server," which is the actual server that implements the API, and which parts may be implemented by a separate "authorization server." That will make it easier for products like Apigee to offer OAuth 2.0 support to existing APIs.
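By way of contrast, here is roughly what an OAuth 2.0 API call can look like from the client side. This is only a sketch: the bearer_request helper is hypothetical, and the "Bearer" header format is the one the bearer token spec eventually settled on in RFC 6750, which may differ from what a draft-era API expects.

```python
from urllib.parse import urlsplit
from urllib.request import Request


def bearer_request(url: str, access_token: str) -> Request:
    # With no signature, the token is only as safe as the channel,
    # so this sketch refuses to attach it to a plain-HTTP URL.
    if urlsplit(url).scheme != "https":
        raise ValueError("bearer tokens should only be sent over SSL")
    # Build (but do not send) a request carrying the single security token.
    return Request(url, headers={"Authorization": f"Bearer {access_token}"})
```

There is no base string, no nonce, and no second secret: the whole "signature" step collapses into one header.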
For these reasons, OAuth 2.0 has already been adopted by companies like Facebook, which uses the draft spec in its Graph API. Of course, it's a new spec, which means there are new requirements and use cases that make it more complex. For instance, OAuth 2.0 also clearly lays out how to use OAuth entirely inside a browser using JavaScript that has no way to securely store a token, and it explains at a high level how to use it on a mobile phone or even on a device that has no web browser at all.
Finally, although the developers of the world will not miss generating OAuth 1.0 signatures, they served a purpose, because they allowed a client to send its token and secret securely to a server without requiring SSL. For APIs and devices that do not want to support SSL for performance or complexity reasons, signatures are still a good choice. Right now, signatures have been removed from the OAuth 2.0 spec, but they'll be added to a separate extension spec at some point.
So should you use OAuth 2.0 today? Here are a few things to consider:
Spec changes. OAuth 2.0 has not reached a stable IETF draft yet. If you implement it today, are you prepared to change your implementation every few weeks until the committee has agreed on a stable version? OAuth 1.0a, on the other hand, is already a well-defined standard that's not going to change any time soon.
Implementations. There are a number of code libraries for both client and server that support 1.0a today. There aren't as many for 2.0, so you're going to have to build more stuff on your own.
Complexity. There's no doubt that 2.0 is easier to implement both on the client and server side.
Performance. If you are unwilling or unable to use SSL for all of your API traffic, then OAuth 2.0 is not a good choice until some sort of signature extension is added to the spec. OAuth 1.0a already supports signatures, which are complex but allow you to securely exchange tokens without requiring the use of SSL.
You can also check out my blog on "When to Use OAuth" for more, and we'll continue to explore this issue as it evolves.
In our daily talks with customers about their APIs, OAuth has become a major consideration. It's quickly replacing the various proprietary "authentication token" schemes used by early APIs (such as the Amazon Web Services APIs) with a flexible standard; and it solves for new use cases emerging from mashups and Web 2.0. Still, there is a lot of confusion about OAuth out there.
Based on the current state of the technology, we know that there are different API use cases. Depending on your security needs, sometimes OAuth is the only security technology an API can realistically support. In other cases, it may be overkill, or something to add alongside another scheme.
OAuth Overkill?
OAuth is overkill for an API that doesn't require strong authentication. Many "catalog" and "search" APIs that access public data fall into this category. The goal of an API like this is to gain adoption. There's no reason for such an API to require individual users to authenticate to retrieve information from a product catalog, or search public data. (Unless, of course, your API's business model revolves around data that is so valuable that you can get away with charging for every data access. In today's world, that's not common.) A simple API key, however, is a great idea for these kinds of APIs, integrated into your API analytics so you and your developers can see what's going on.
The Case for OAuth
OAuth is the only realistic choice for a web application that itself uses another web application's API on behalf of the user. For instance, consider a web application that integrates with Twitter. (Perhaps it's a geolocation app like Foursquare that offers the ability to tweet where you are and what you're doing.) Today, it is unacceptable for such a web application to store its users' Twitter passwords. OAuth was designed precisely for this use case -- it gives the web app a secure way to get an access token for Twitter, which the user can revoke at any time, without ever revealing that Twitter password to the web app.
Basic Auth - Still Important
For most APIs, OAuth should be one of two or three security choices. Again, consider Twitter. It makes perfect sense for OAuth to be used by other web apps as a way to access Twitter. But Twitter has long supported "Basic" HTTP authentication as well, using a username and a password. While this is bad news for a web application client, for a mobile or a desktop client that is used by an individual user, it's just fine as long as the client takes some care with the password.
And importantly, Basic authentication is easy. If your API requires secure authentication, and you want it to be easy to integrate and test, then offering Basic authentication means that the barrier to entry is low.
Now, once the decision to use OAuth has been made, the question is whether to use the current version 1.0a, or the still-under-development OAuth 2.0. We'll talk about that next.
We launched a new tool at Apigee, the API lifecycle management platform we offer free to the community. We've created a Facebook Graph API Console, a whole new way to interact with, learn, and debug the Graph API that lets you easily view requests and responses, share what you are seeing, and dig into errors.
The console supports OAuth 2.0, so you can log in using your Facebook credentials.
InsideFacebook has some great coverage by Josh Constine out today on the console and how it works; you can also hop over to the Apigee blog for more details.
We've been following the fast-moving debate in the IETF regarding OAuth 2.0. OAuth, for those of you who have not encountered it already, is a set of authentication technologies for the Internet designed around the concept of an access token.
Access tokens, in the words of Eran Hammer-Lahav, are like valet keys -- they give the holder access to a specific function, for a specific amount of time. For instance, you might use OAuth to give another web site the ability to read photos from your Flickr profile, but not to modify them. OAuth lets you do this, it lets you go back to Flickr and revoke the web site's permissions at any time, and it does it without requiring that you give the site your Flickr password.
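The valet key idea is easy to sketch in code. The following is a toy, in-memory illustration in Python, not how Flickr or any real provider stores tokens. The class names and the "photos:read" style scope strings are invented for the example.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass
class AccessToken:
    value: str
    scopes: frozenset      # the specific functions this key unlocks
    expires_at: float      # the specific amount of time it is good for
    revoked: bool = False


class TokenStore:
    def __init__(self):
        self._tokens = {}

    def issue(self, scopes, ttl_seconds: float) -> str:
        # Hand out a random, unguessable string tied to scopes and an expiry.
        token = AccessToken(secrets.token_urlsafe(32), frozenset(scopes),
                            time.time() + ttl_seconds)
        self._tokens[token.value] = token
        return token.value

    def revoke(self, value: str) -> None:
        # The user can take the key back at any time.
        if value in self._tokens:
            self._tokens[value].revoked = True

    def allows(self, value: str, scope: str) -> bool:
        token = self._tokens.get(value)
        if token is None or token.revoked or time.time() >= token.expires_at:
            return False
        return scope in token.scopes
```

A token issued with only "photos:read" lets the other site read photos but not modify them, and the user's password never enters the picture.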
The current spec, OAuth 1.0a, is implemented in lots of places, and it solves a lot of problems. However, implementing it is no picnic for either the API provider (the server) or for the developer who builds the client. (There are libraries, of course, not to mention products such as our own that simplify this process.)
OAuth 2.0 introduces many changes. The most important is that a client may now use a "bearer token." That's a fancy IETF way of saying that an access token can just be a string that the server gives you. On every request, the client passes that token back to the server, the server checks to see if the token is valid, and you're done. This is much simpler to implement than OAuth 1.0a, but it is only secure if you use SSL for every request. Applications that won't or can't use SSL may still use the old way, which signs each request with the token secret so that the secret stays safe even if SSL is not used or the request is intercepted by a proxy like Apigee.
However, OAuth 2.0 is far from complete. It is currently undergoing lots of discussion on the IETF mailing list, and the spec draft changes daily.
That's why I was surprised to read today that Facebook is using OAuth 2.0 to authenticate its own API. Now, some of the key players in OAuth work at Facebook, and they have chosen to use only a part of the spec, and the part that's arguably the least complicated. I'm sure that they feel that taking this calculated risk now is in the best interest of Facebook and its developer community, but the possibility remains that the spec will change and Facebook will have to change its implementation to match.
(In fact, at the moment I write this, they do not -- the name of the query parameter that holds the token is "access_token" in the Facebook documentation and "oauth_token" in the latest version of the spec repository.)
In the meantime, developers building on top of these APIs may have to contend with OAuth 1.0a (the current spec), OAuth 1.0 (an older version that some sites may still use), the draft form of OAuth 2.0 as implemented by Facebook, and even "WRAP," which introduced some of the concepts used in OAuth 2.0.
So the good news is that there are a lot of good standards being written that can make it easier to produce and consume powerful and secure APIs. The bad news is that those standards are still changing. So stay tuned, and be careful!
Last week we wrote a bit on OAuth as an option for API security. But today I wanted to bring up a related OAuth issue - how do you securely manage all those keys?
With traditional username / password authentication, good security practices require you don't just have a big database on the back end with a list of unencrypted passwords. Instead, a hash of the password is stored, preferably using a salt. So someone who can read the password file can verify they have the right password, but cannot see the actual password.
It is still critical to protect access to these hashed passwords. Otherwise, an attacker can mount a dictionary attack to try to crack them. However, even if someone gains access to your entire database of hashed passwords, they can still only easily gain access to lousy passwords. At least users who choose secure passwords are relatively safe. (It is also critical to protect access to the cleartext password, but at least this mechanism doesn’t require that it be stored in a database for all to see.)
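For readers who haven't done this before, here is a small sketch of salted password hashing using Python's standard library. PBKDF2 with SHA-256 and the iteration count are my own choices for the example, not something prescribed above, and the function names are made up.

```python
import hashlib
import hmac
import os


def hash_password(password: str, iterations: int = 200_000):
    # A fresh random salt per user means identical passwords hash differently.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # Store all three values; none of them reveals the password itself.
    return salt, iterations, digest


def check_password(password: str, salt: bytes, iterations: int, digest: bytes) -> bool:
    # Recompute the hash from the candidate password and compare in constant time.
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, digest)
```

The server can check a login without ever being able to read the password back, which is exactly the property that OAuth 1.0 token secrets do not have.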
As networking and middleware people, we spend a lot of time thinking about the security of our network protocols, and especially ensuring that someone eavesdropping on a network cannot grab our passwords and other sensitive data as they fly by. But how many times have we heard of a security breach caused by a stolen laptop? I would argue that protecting so-called “data at rest” is just as important as, or maybe even more important than, protecting the data flying around your network.
Now, back to OAuth. Each “user” in OAuth holds something called an “access token,” which is like a username, and a “token secret,” which is like a password. When a request containing an OAuth authentication token is sent over the network, a signature is computed over a bunch of data in the request using the token secret, but the secret itself is never sent over the network. That way, regardless of whether SSL is in use, there is no way to gain access to the token secret by sniffing the network.
However, on the server side, in order to validate the OAuth token, the server must make the same calculation that the client made when it signed the request. That means that both the client side and the server side in OAuth must be able to read the unencrypted token secret from some sort of database. Without it, OAuth doesn’t work. There’s no set of standard ways for storing those keys like there are for passwords, so presumably different implementations are storing them in different ways.
As a result, any client and any server that uses OAuth has to take extra-special care with all those token secrets. Otherwise, anyone who gets access to the database of tokens and secrets used by the back end servers immediately has access to all the OAuth-enabled accounts.
I am not suggesting a change to the OAuth protocol here — it solves an important problem. However, I am suggesting that anyone who implements either the “service provider” or “consumer” side of OAuth take very special care of those tokens!
For instance:
If they’re on a regular disk file, protect them using filesystem permissions, make sure that they’re encrypted, and hide the password well.
If they’re in a database, encrypt the fields, store the key well, and protect access to the database itself carefully.
If they’re in LDAP, do the same.
Come to think of it, perhaps the world needs a standard LDAP schema for storing OAuth secrets in a secure way. Anyone care to make a proposal?
If you have an “open” API - one that exposes data you’d make public on the Internet anyway - consider issuing non-sensitive API keys. These are easy to manipulate and still give you a way to identify users. Armed with an API key, you have the option of establishing a quota for your API, or at least monitoring usage by user. Without one, all you have to go on is an IP address, and those are a lot less reliable than you might think. (Why don’t web sites issue “web site keys?” Because they have cookies.)
For example, the Yahoo Maps Geocoding API issues API keys so it can track its users and establish a quota, but the data that it returns is not sensitive so it’s not critical to keep the key secret.
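A quota keyed on the API key can be very little code. Here is a hypothetical in-memory sketch in Python. The class name is invented, and resetting the counters at the end of each period is left out to keep it short.

```python
from collections import Counter


class ApiKeyQuota:
    def __init__(self, limit: int):
        self.limit = limit
        self.usage = Counter()   # api key -> calls counted so far

    def record(self, api_key: str) -> bool:
        """Count one call; return False once the key has used up its quota."""
        if self.usage[api_key] >= self.limit:
            return False
        self.usage[api_key] += 1
        return True
```

The same counter doubles as a crude usage report per key, which is something an IP address alone can't give you reliably.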
Use username / password authentication for APIs based on web sites:
If your API is based on a web site, support username/password authentication. That way, users can access your site using the API the same way that they access it using their web browser. For example, the Twitter API supports username/password authentication, so when you access it using a Twitter API client like Spaz or TweetDeck you simply enter the same password you use when you use the twitter.com web site.
However, if you do this, you may want to avoid session-based authentication, especially if you want people to be able to write API clients in lots of environments. It is very simple to use “curl,” for instance, to grab some data from the Twitter API because it uses HTTP Basic authentication. It is a lot more complex if I instead have to call “login,” and extract a session ID from some cookie or header, and then pass that to the real API call I want to make...
OAuth for server-to-server APIs that are based on web sites:
If your API is based on a web site (so you already have a username / password for each account) and the API will also be used by other sites, then support OAuth in addition to username / password authentication. This gives users a lot of peace of mind, because they can authorize another site to access their sensitive data without giving that site their password permanently.
Use SSL for everything sensitive:
Unless your API has only open and non-sensitive data, support SSL, and consider even enforcing it (that is, redirect any API traffic on the non-SSL port to the SSL port). It makes other authentication schemes more secure, and keeps your user’s private API data from prying eyes – and it’s not all that hard to do.
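As a rough illustration of "enforcing" SSL, the redirect decision might look like the following Python sketch. The enforce_ssl function is hypothetical, it assumes the API listens on the default ports, and a real deployment would more likely do this in the web server or gateway configuration.

```python
from urllib.parse import urlsplit, urlunsplit


def enforce_ssl(url: str):
    """Return (status, location) for a redirect, or None if the URL is already HTTPS."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        return None
    # Keep host, path and query; drop any explicit port and switch the scheme.
    host = parts.hostname or ""
    return 301, urlunsplit(("https", host, parts.path, parts.query, ""))
```

Keep in mind that by the time the redirect goes out, whatever credentials were in the original plain-HTTP request have already crossed the network, so clients should be pointed at the SSL URL from the start.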
Don’t roll your own!
If the above suggestions still don’t apply to you, then keep looking – between OAuth, OpenID, SAML, HTTP authentication, and WS-Security, there are a lot of authentication schemes, and each has its pros and cons.
That wraps up API security in our series on 10 API roadmap considerations. Here are some suggested questions you may want to ask when putting together your API security roadmap:
How valuable is the data exposed by your API?
If it’s not that valuable, or if you plan to make it as open as possible, is an API key enough to uniquely identify users?
If it is valuable, can you reuse an existing username and password scheme for each user?
Are you using SSL? Many authentication schemes are vulnerable without it.
What other schemes and websites will your API and users want to integrate with?
If your API will be called programmatically by other APIs, or if your API is linked to another web site that requires authentication, have you considered OAuth?
If you have username/password authentication, have you considered OpenID?
Can you make authorization decisions in a central place?
What other expectations might your customers have?
If your customers are enterprise customers, would they feel better about SAML or X.509 certificates?
Can you change your authentication approach, or support more than one, for diverse enterprise customers?
Do you have an existing internal security infrastructure that you need your API to interact with?
Up next: API Data Protection and thanks to Torbin H. for the photo.
Session based Authentication – cumbersome with RESTful APIs
Lots of APIs support session-based authentication. In these schemes, the user first has to call a “login” method which takes the username and password as input and returns a unique session key. The user must include the session key in each request, and call “logout” when they are done. This way, authentication is kept to a single pair of API calls and everything else is based on the session.
This type of authentication maps very naturally to a web site, because users are used to “logging in” when they start working with a particular site and “logging out” when they are done.
However, session-based authentication is much more complex when associated with an API. It requires that the API client keep track of state, and depending on the type of client that can be anything from painful to impossible. Session-based authentication, among other things, makes your API less “RESTful” - an API client can no longer make just one HTTP request; it now needs a minimum of three.
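To see where the three requests come from, here is a toy in-memory stand-in for a session-based API, written in Python. Everything in it (the SessionApi class, its method names, the user data) is invented for illustration. Each method call represents one HTTP request.

```python
import secrets


class SessionApi:
    """Toy API that requires login, then the real call, then logout."""

    def __init__(self, users):
        self._users = users        # username -> password
        self._sessions = {}        # session id -> username
        self.requests = 0          # how many "HTTP requests" were made

    def login(self, username, password):
        self.requests += 1
        if self._users.get(username) != password:
            raise PermissionError("bad credentials")
        session_id = secrets.token_hex(16)
        self._sessions[session_id] = username
        return session_id

    def get_profile(self, session_id):
        self.requests += 1
        if session_id not in self._sessions:
            raise PermissionError("no such session")
        return {"user": self._sessions[session_id]}

    def logout(self, session_id):
        self.requests += 1
        self._sessions.pop(session_id, None)


api = SessionApi({"alice": "pw"})
sid = api.login("alice", "pw")      # request 1: the client must remember sid
profile = api.get_profile(sid)      # request 2: the call we actually wanted
api.logout(sid)                     # request 3: clean up the state
```

The client has to carry sid between calls, which is precisely the state that a simple tool like curl has no natural place to keep.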
The Role of OAuth
Today, many APIs also support OAuth authentication. OAuth was designed to solve the application-to-application security problem. The idea of OAuth is that it gives the user of a service or web site a way to conditionally grant access to another application. OAuth makes it possible for a human user to individually grant other APIs or sites access to their account without sharing their actual password. It works by giving a “token” to each API or site that will access the account, which the user may revoke at any time they wish.
For instance, if web site FooBar.com wants access to the Twitter API on behalf of John Smith, then OAuth specifies the protocol that FooBar.com must go through to get an OAuth token for the Twitter API. Part of this process requires John Smith to log in to his Twitter account using his normal username and password (which, in the OAuth protocol, are never seen by FooBar.com) and authorize FooBar to access his account. The result is that FooBar.com will have an OAuth token that gives it access to John Smith’s account. If, later on, John Smith decides he no longer trusts FooBar.com, he has the option to revoke that OAuth token without affecting his regular password or any other accounts.
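The shape of that handshake can be sketched with a toy provider in Python. This is a deliberately stripped-down illustration: signatures, token secrets, and browser redirects are all omitted, and the Provider class and its method names are invented rather than taken from any OAuth library.

```python
import secrets


class Provider:
    """Toy stand-in for the service that owns the account (Twitter in the example)."""

    def __init__(self, passwords):
        self._passwords = passwords   # username -> password, known only to the provider
        self._pending = {}            # request token -> approving username (or None)
        self._access = {}             # access token -> username

    def request_token(self) -> str:
        # Step 1: the consumer (FooBar.com) asks for a temporary token.
        token = secrets.token_urlsafe(16)
        self._pending[token] = None
        return token

    def authorize(self, request_token: str, username: str, password: str) -> None:
        # Step 2: the user logs in at the provider and approves the request.
        # The consumer never sees the password.
        if request_token not in self._pending:
            raise PermissionError("unknown request token")
        if self._passwords.get(username) != password:
            raise PermissionError("login failed")
        self._pending[request_token] = username

    def access_token(self, request_token: str) -> str:
        # Step 3: the consumer trades the approved request token for an access token.
        username = self._pending.get(request_token)
        if username is None:
            raise PermissionError("request token was not authorized")
        del self._pending[request_token]
        token = secrets.token_urlsafe(16)
        self._access[token] = username
        return token

    def whoami(self, access_token: str):
        return self._access.get(access_token)

    def revoke(self, access_token: str) -> None:
        # The user can cut the consumer off without touching the password.
        self._access.pop(access_token, None)
```

In this sketch FooBar.com only ever calls request_token and access_token. The authorize step happens between John Smith and the provider.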
This process makes OAuth ideal for communication from one application to another – for instance, allowing MySpace to post photos to your Twitter account without requiring that you enter your Twitter password every time you want to do it – but it can be used for any kind of API communications as well.
Two-Way SSL, X.509, SAML, WS-Security...
Once we leave the world of “plain HTTP” we encounter many other ways of authentication, from SAML, X.509 certificates and two-way SSL, which are based on secure distribution of public keys to individual clients, to the various WS-Security specifications, which build upon SOAP to provide... well, just about everything.
An API that will primarily be used by “enterprise” customers – that is, big IT organizations – might consider other authentication mechanisms, such as an X.509 certificate or SAML, for more assurance over the authentication process than a simple username/password. Also, a large enterprise may have an easier time accessing an API written to the SOAP standard because those tools can import a WSDL file and generate an API client in a few minutes.
The idea is to know your audience. If your audience is a fan of SOAP and WSDL, then consider some of the more SOAP-y authentication mechanisms like SAML and the various WS-Security specifications. (And if you have no idea what I’m talking about, then your audience probably isn’t in that category!)
Rolling Your Own
In between OAuth, HTTP Basic, and the basic API key are many alternatives. It seems that there are as many other API authentication schemes as there are APIs. Amazon Web Services, Facebook, and some Google APIs, for instance, are some big APIs that combine an API key with both public and secret data, usually through some sort of encryption code, to generate a secure token for each request.
The issue – every new authentication scheme requires API clients to implement it. On the other hand, OAuth and HTTP Basic authentication are already supported by many tools. The big guys may be able to get away with defining their own authentication standards but it’s tough for every API to do things its own way.
SSL
Most authentication parameters are useless, or even dangerous, without SSL. For instance, in “HTTP Basic” authentication the API must be able to see the password the client used, so the password is encoded – but not encrypted – in a format called “base 64.” It’s easy to turn this back into the real password. The antidote to this is to use SSL to encrypt everything sent between client and server. This is usually easy to do and does not add as much of a performance penalty as people often think. (With Sonoa’s products, we see a 10-15 percent drop in throughput when we add SSL encryption.)
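To see just how easy, here is a short Python sketch that builds a Basic authorization header and then reverses it. The two helper names are my own. The encoding itself is the standard one.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    # "username:password", base64-encoded. No key is involved anywhere.
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header: str):
    # Anyone who can see the header can do this, which is why SSL matters.
    scheme, _, encoded = header.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    username, _, password = base64.b64decode(encoded).decode("utf-8").partition(":")
    return username, password
```

Decoding takes no secret at all, so an eavesdropper on a non-SSL connection gets the real password, not just a token.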
Some security schemes, such as OAuth, are designed to be resistant to eavesdropping. For instance, OAuth includes a “nonce” on each request, so that even if an eavesdropper intercepts the OAuth credentials on one request, they cannot reuse them on the next. Still, there is likely other sensitive information in the request as well.
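On the server side, nonce checking amounts to remembering what you have already seen. The following Python sketch shows one possible approach. The NonceCache class, the five-minute window, and the in-memory dictionary are all assumptions made for the example, not anything the OAuth spec mandates.

```python
import time
from typing import Optional


class NonceCache:
    def __init__(self, window_seconds: float = 300.0):
        self.window = window_seconds
        self._seen = {}   # (token, nonce) -> request timestamp

    def accept(self, token: str, nonce: str, timestamp: float,
               now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Forget entries old enough that the timestamp check below rejects them anyway.
        self._seen = {k: t for k, t in self._seen.items() if now - t <= self.window}
        if abs(now - timestamp) > self.window:
            return False      # stale or far-future timestamp
        key = (token, nonce)
        if key in self._seen:
            return False      # same token and nonce seen before: a replay
        self._seen[key] = timestamp
        return True
```

The timestamp check is what keeps the cache from growing forever: anything older than the window is rejected outright, so its nonce no longer needs to be remembered.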
In general, if your API does not use SSL it also potentially exposes everything that your API does.
Next time, we'll make some recommendations among different options - or get our full paper here.
The first reason is that we’ve talked to hundreds of companies who are building APIs and web services both internally and externally, and for the most part they are using cloud services from other companies, or they are planning to expose their own web services to others on the Internet, or they are running their own infrastructure in the cloud – or all of the above. Cloud computing is a big part of what we do, and we want to make it succeed. The Cloud Security Alliance is a great group of experienced security architects working on solving the most vexing problem faced by companies hoping to take advantage of cloud computing – security.
We encounter security issues all the time when we talk to customers about their own experiences with APIs and cloud services. For instance, there seem to be as many ways to authenticate API users as there are companies publishing APIs. There are venerable standards like HTTP authentication, WS-Security, and two-way SSL, new ones like OAuth and OpenID, and the countless other schemes that API providers also come up with. How does a new API provider deal with all those standards? How does a company consuming some or all of these APIs deal with the proliferation of authentication mechanisms?
In this area, we are not necessarily looking for the CSA to define new standards, but to spend some time identifying best practices for producers and consumers of these APIs, and helping them choose when it's necessary to make a choice.
We also encounter security issues when companies are looking to take advantage of cloud computing – especially when they are planning to run all or part of their infrastructure on a public cloud platform. What is the most effective way to connect services on a public cloud with services running behind the traditional corporate firewalls? What kinds of data can be sent to a public cloud platform and what data must remain in a corporate-owned data center? What are the best practices around data encryption, authentication, data retention, and the maze of legal requirements about all this? In these areas, the CSA has already shown itself to be leading the field, and we would like to help.