We've been following the fast-moving debate in the IETF regarding OAuth 2.0. OAuth, for those of you who have not encountered it already, is a set of authorization technologies for the Internet designed around the concept of an access token.
Access tokens, in the words of Eran Hammer-Lahav, are like valet keys -- they give the holder access to a specific function, for a specific amount of time. For instance, you might use OAuth to give another web site the ability to read photos from your Flickr profile, but not to modify them. OAuth lets you do this, lets you go back to Flickr and revoke the web site's permissions at any time, and does it all without requiring that you give the site your Flickr password.
The current spec, OAuth 1.0a, is implemented in lots of places, and it solves a lot of problems. However, implementing it is no picnic for either the API provider (the server) or for the developer who builds the client. (There are libraries, of course, not to mention products such as our own that simplify this process.)
OAuth 2.0 introduces many changes. The most important is that a client may now use a "bearer token." That's a fancy IETF way of saying that an access token can just be a string that the server gives you. On every request, the client passes that token back to the server, the server checks to see if the token is valid, and you're done. This is much simpler to implement than OAuth 1.0a, but it is only secure if you use SSL for every request. Applications that won't or can't use SSL may still use the old way of transmitting each token, which signs each request with a secret so that the token is safe even if SSL is not used or even if the request is intercepted by a proxy like Apigee.
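As a rough illustration of the bearer-token idea, the sketch below shows a server handing out an opaque string and checking it on each request. The `TokenStore` class, its method names, the single-scope model, and the one-hour default lifetime are all assumptions made for illustration; none of them come from the OAuth 2.0 draft.

```python
import secrets
import time


class TokenStore:
    """Hypothetical in-memory store of bearer tokens (illustration only)."""

    def __init__(self, lifetime_seconds=3600, clock=time.time):
        self._lifetime = lifetime_seconds
        self._clock = clock
        self._tokens = {}  # token string -> (user, scope, expiry time)

    def issue(self, user, scope):
        # The token is just an unguessable random string.
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (user, scope, self._clock() + self._lifetime)
        return token

    def validate(self, token, required_scope):
        # Return the user if the token is known, unexpired, and grants the scope.
        entry = self._tokens.get(token)
        if entry is None:
            return None
        user, scope, expires_at = entry
        if self._clock() >= expires_at or scope != required_scope:
            return None
        return user

    def revoke(self, token):
        # The user can withdraw access at any time.
        self._tokens.pop(token, None)
```

Because anyone who holds the string can present it, a scheme like this is only as safe as the channel it travels over, which is why SSL is required on every request.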
However, OAuth 2.0 is far from complete. It is currently undergoing lots of discussion on the IETF mailing list, and the spec draft changes daily.
That's why I was surprised to read today that Facebook is using OAuth 2.0 to authenticate its own API. Now, some of the key players in OAuth work at Facebook, and they have chosen to use only a part of the spec, and the part that's arguably the least complicated. I'm sure that they feel that taking this calculated risk now is in the best interest of Facebook and its developer community, but the possibility remains that the spec will change and Facebook will have to change its implementation to match.
(In fact, at the moment I write this, the two do not match -- the name of the query parameter that holds the token is "access_token" in the Facebook documentation and "oauth_token" in the latest version of the spec repository.)
In the meantime, developers building on top of these APIs may have to contend with OAuth 1.0a (the current spec), OAuth 1.0 (an older version that some sites may still use), the draft form of OAuth 2.0 as implemented by Facebook, and even "WRAP," which introduced some of the concepts used in OAuth 2.0.
So the good news is that there are a lot of good standards being written that can make it easier to produce and consume powerful and secure APIs. The bad news is that those standards are still changing. So stay tuned, and be careful!
In my last post I wrote about how the book digitizing effort is trying to monetize underutilized books online.
For anyone contemplating exposing data or capabilities via APIs to create new revenue streams, there are some important implementation lessons that can be learned.
Have control
In Norway’s Bookshelf project, the free online books can only be read online and only in Norway, and cannot be downloaded or printed out. Similarly, with APIs you need to have a way to control who can access your data and from where. API identity, API authorization, and other API security considerations are a must.
Make it (at least something) free
Google must operate under copyright laws, but it has found a way to freely expose extracts or snippets under ‘fair use’. Similarly, you cannot expect uptake of your content or capabilities if none of it can be consumed easily and for free. Give at least some of what you offer away for free to spur adoption. Technically, this means being able to control what can be consumed at a granular level.
Get started!
Google, Bookshelf, and Europeana have ideas about how they’ll monetize books they put online, but they are still experimenting with the right business models. Executives shouldn’t wait for an iron-clad business model to present itself… you won’t figure out the right model unless you are ‘in the game’.
"The only reason you'd have only a SOAP API is because you hate 80% of your addressable market." - @sramji
There's usually little argument that a REST API is easier to use than a SOAP API.
But how important is it to be 'truly' or 'strictly' RESTful? That is, adhering to standard HTTP operations or 'verbs' - GET, PUT, DELETE, POST - on well defined resources, as opposed to the common practice of embedding 'verbs' or operations as methods in a GET URL.
Typically, security is cited as the big advantage of 'true REST' (with some good discussions here and here).
However, a truly RESTful API may help you boost developer adoption. For example, imagine a 'shopping cart' API:
While the above API isn't 'truly RESTful', it's not that hard to use. But you do have to learn the individual operations and this can get cumbersome if there are a lot of them or as the API evolves.
Instead, this 'true REST API' may be easier to learn and predict as you use more features.
Operation | HTTP method | URL
Insert new item into the cart | POST * | http://api.shopping.com/carts/X.xml
Delete item from the cart | DELETE | http://api.shopping.com/carts/X/item/Y.xml
List everything in the cart | GET | http://api.shopping.com/carts/X.xml
Get an item in the cart | GET | http://api.shopping.com/carts/X/item/Y.xml
Delete the whole cart | DELETE | http://api.shopping.com/carts/
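One way to see why this layout is predictable: the HTTP method picks the action and the URL picks the resource, so a handful of rules covers every operation. The sketch below is a toy in-memory model, not a real server. The `CartService` class, its `handle` method, the status codes it returns, and the choice to delete a whole cart at `/carts/X` are assumptions made for illustration, and the `.xml` suffix is dropped for brevity.

```python
class CartService:
    """Toy in-memory model of the cart resources (hypothetical)."""

    def __init__(self):
        self.carts = {}  # cart id -> {item id -> item data}
        self._next_item = 1

    def handle(self, method, path, body=None):
        # Split "/carts/X/item/Y" into its segments.
        parts = [p for p in path.split("/") if p]
        if not parts or parts[0] != "carts":
            return 404, None
        if len(parts) == 2:                          # /carts/X
            cart_id = parts[1]
            if method == "POST":                     # insert a new item
                cart = self.carts.setdefault(cart_id, {})
                item_id = str(self._next_item)
                self._next_item += 1
                cart[item_id] = body
                return 201, f"/carts/{cart_id}/item/{item_id}"
            if cart_id not in self.carts:
                return 404, None
            if method == "GET":                      # list everything in the cart
                return 200, dict(self.carts[cart_id])
            if method == "DELETE":                   # delete the whole cart
                del self.carts[cart_id]
                return 204, None
        if len(parts) == 4 and parts[2] == "item":   # /carts/X/item/Y
            cart = self.carts.get(parts[1], {})
            if parts[3] not in cart:
                return 404, None
            if method == "GET":                      # get one item
                return 200, cart[parts[3]]
            if method == "DELETE":                   # delete one item
                del cart[parts[3]]
                return 204, None
        return 405, None
```

Adding a new kind of resource means adding a new URL shape, not a new verb, so a developer who has learned one resource can usually guess how the next one works.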
What if we want to list all the shopping carts in the system at any one time? We would add that via an HTTP GET to:
http://api.shopping.com/carts.xml
Query parameters can still serve a purpose - making it possible to specify additional options. For instance, imagine a very large shopping cart, and you want to "paginate" the results. To look at items 20-29, you might use a URL like:
It's a lot easier to build reports that segregate and analyze traffic by URL than to build logic that tries to do this by methods or combination of methods. Good API analytics helps you optimize features, debug problems, and weed out traffic that can slow down your service.
What if your 'non-strict' REST API is already out there?
It might not be that big a deal if your API is a very simple or 'read-only' API with information that isn't too sensitive (such as a free search API). Or you can map a non-RESTful API into a 'truly RESTful' API with custom code or API management tools that perform API transformations.
* A note on POST vs. PUT
One way to insert an item in the shopping cart is to use POST to update the shopping cart by sending it a new item. In this case, we are using POST to send the server an instruction that essentially tells it to insert some new content to the existing resource. This is why we use POST -- it is like an "update" in a database.
Alternately, we could add the item by using PUT to a new URL, such as:
But if we do this, then we need to somehow give our item some sort of unique URL by picking a value for "Y". This is kind of a strange thing to do, so it may be more natural to use POST and have the response include the URL for the new item, so that we may retrieve it later using GET or delete it using DELETE. Still, sometimes using PUT like this makes sense.
This all comes down to the difference between PUT and POST in the HTTP spec. POST modifies something that already exists, and how that thing is modified is up to the server. PUT replaces the entire contents of the resource at the URL with new data. Plus, like GET, HEAD, and DELETE, PUT is idempotent, which means that if you call it more than once, it has the same result every time, whereas POST may keep doing what you ask it to do over and over again.
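The idempotency difference is easy to see in a toy model. In the sketch below, the `Cart` class and its `put` and `post` methods are hypothetical stand-ins for the HTTP calls, not a real API; they only mimic who gets to choose the item's identifier.

```python
class Cart:
    """Toy cart used to contrast PUT and POST semantics (hypothetical)."""

    def __init__(self):
        self.items = {}      # item id -> item data
        self._next_id = 1

    def put(self, item_id, data):
        # PUT: store the data at a URL chosen by the client.
        self.items[item_id] = data
        return item_id

    def post(self, data):
        # POST: the server picks the id, so each call creates a new item.
        item_id = str(self._next_id)
        self._next_id += 1
        self.items[item_id] = data
        return item_id


cart = Cart()
cart.put("Y", "book")
cart.put("Y", "book")        # repeating the PUT changes nothing
after_put = len(cart.items)

cart.post("pen")
cart.post("pen")             # repeating the POST adds a second pen
after_post = len(cart.items)
```

This matters in practice when a client retries after a timeout: a retried PUT is harmless, while a retried POST may leave a duplicate item in the cart.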
A recent article, "Why REST Security Doesn't Exist," postulates, "REST does not have predefined security methods, so developers define their own."
Some good points in here (such as 'don't roll your own') but I might not completely agree with the premise.
One of the fundamental principles of REST is that it builds on the HTTP protocol -- and the HTTP protocol very much does have "predefined security methods."
The basic HTTP protocol supports a way to plug in different security schemes. It also supports OAuth, two-way SSL, and many other mechanisms. Not only does HTTP allow for many security schemes, many of which (like HTTP basic) are defined by IETF standards, but it also gives a server a way to indicate that a request was rejected, whether it was rejected because the security credentials were invalid or because an authorization check failed, and whether the rejection is permanent or temporary. HTTP also includes a mechanism that allows the server to issue a "challenge" that asks the client to re-send a request with a particular type of credentials if it has them. This all adds up to a security method that has proven quite robust over the years and has been extended with new methods such as OAuth when new problems arise.
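These distinctions map onto standard HTTP status codes: 401 with a `WWW-Authenticate` challenge for missing or invalid credentials, 403 for a failed authorization check, and 503 for a temporary rejection. The sketch below illustrates that mapping using HTTP basic. The `check_request` function, the user table, and the permission table are assumptions for illustration, and a real service would not keep passwords in cleartext like this.

```python
import base64
from http import HTTPStatus

# Hypothetical credential and permission tables for the sketch.
PASSWORDS = {"alice": "s3cret"}
CAN_WRITE = {"alice": False}


def _basic_user(header):
    """Return the user name if the Basic credentials are valid, else None."""
    scheme, _, encoded = header.partition(" ")
    if scheme != "Basic":
        return None
    try:
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except ValueError:  # covers bad base64 and bad UTF-8
        return None
    name, sep, password = decoded.partition(":")
    if sep and PASSWORDS.get(name) == password:
        return name
    return None


def check_request(headers, wants_write, overloaded=False):
    """Return (status, response headers) for a request's security check."""
    if overloaded:
        # Temporary rejection: the client may retry later.
        return HTTPStatus.SERVICE_UNAVAILABLE, {"Retry-After": "30"}
    user = _basic_user(headers.get("Authorization", ""))
    if user is None:
        # Missing or invalid credentials: challenge the client to re-send.
        return HTTPStatus.UNAUTHORIZED, {"WWW-Authenticate": 'Basic realm="api"'}
    if wants_write and not CAN_WRITE.get(user, False):
        # Credentials are valid, but this user may not perform the action.
        return HTTPStatus.FORBIDDEN, {}
    return HTTPStatus.OK, {}
```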
Also, don't forget that different APIs demand different security requirements. An API that offers product catalog information, for instance, with no way to update the information, does not require strong authentication if the owner of the data intends that data to be public anyway. A simple "developer key" that uniquely identifies the sender of the request -- yes, a username "without a password" -- is just fine for that type of API because it is used to identify the user for various tracking purposes, and is not designed to prevent unauthorized users from gaining access to the data.
Cool article and great to see some discussion on this!
Next in our series of tech talks on cloud security issues, Greg and Ryan Bagnulo, Security Architect for ASPECT-i, discuss how scalability can change security requirements and how cloud computing offers new opportunities to fend off attacks on services, including:
security at high scale - how to preserve the resiliency of the business
cloud powered security - using elastic cloud resources at the edge to protect core services
protecting against bot attacks and spikes through security policy enforcement and caching
1. Malicious Code Injection: exploits backend services that build SQL/LDAP/XPath/XQuery statements from user-supplied input. ServiceNet’s Malicious Code Injection Detection policy can filter SQL, LDAP, XPath, and XQuery injection, or use custom regular expression, XPath, and XSD technologies to filter the request further. It can also integrate with anti-virus products to scan for viruses in API requests, especially in attachments or MIME content.
2. DoS Attacks: A Denial of Service (DoS) attack intends to prevent an API or service from serving normal user activity. These malicious attacks include mega-message and entity attacks, recursive element attacks, request flooding, large volumes of invalid requests, etc. The ServiceNet Message Payload protection policy detects various kinds of DoS attacks and protects the backend from the attacker.
3. Service Information Leakage: APIs can unintentionally leak information about their configuration or internal workings, or violate privacy, through a variety of service errors. For example, verbose and informative error messages may result in data leakage, and the information revealed could be used to formulate the next level of attack. ServiceNet response message control policies can customize the fault/response messages reaching the client, which can weed out this attack.
4. Broken Authentication, Session IDs, and Keys: Proper authentication, API key, and session management is critical to service security. Flaws in this area most frequently involve the failure to authenticate (weak or multiple ad hoc authentication schemes) or weak session/key tokens that help an attacker replay or fake the keys or tokens. ServiceNet’s authentication and API key management policies provide single-point strong authentication and key generation techniques that free the API developer from these attack risks.
5. Failure to protect API and corresponding data access: Frequently, authorization is based only on the base URI or operation of an API. An attacker can try passing various parameters to this API operation and get access to data that he is not authorized to access. ServiceNet’s flexible authorization policies support authorization based on various request parameters and data, not just the URI or operation name.
6. API Data snooping: Failure to encrypt sensitive API communications means that an attacker who can sniff traffic from the network will be able to access the conversation, including any credentials or sensitive information transmitted. ServiceNet’s SSL or XML encryption policies can be used to secure the API data from being snooped in the communication path.
7. API Request and Response tampering: The API data tampering attack is based on the manipulation of API request and response parameters exchanged between client and services in order to modify application data, such as user credentials and permissions, price and quantity of products, etc. Usually, this information is part of the HTTP URI, header, or body (XML or non-XML). ServiceNet’s SSL or XML signature policies can be used to protect API request and response messages from being tampered with in the communication path.
8. Request Burst: Spikes in API requests might bring down the backend server. Spike arresting and caching help the backend services perform better under various load conditions.
9. Auditing: If your API is going to be handling money, you may be required by law to adhere to certain security practices and regulations. One important requirement is auditing every request or response (in full or in part) from authorized and unauthorized users. The ServiceNet auditing policy supports a very flexible way to log API audit data in various formats to different destinations such as local disk, NFS, Syslog, JMS, or Web Services.
10. Threat Detection and Analysis: Analyzing threat data is important for finding failures in the API infrastructure and fixing them. ServiceNet’s analytics policy provides the capability to visualize and analyze various API errors or failures. It can also show patterns or rates of these failures, which help an architect or developer fix the problem in his or her API.
For more on API security and threat protection, check out our compilation of API roadmap issues - Is your API Naked? And let us know if you'd like to see a demo of this policy pack in action.
(Senthil Doraiswamy is a product manager at Sonoa Systems.)
Greg recently sat down with Ryan Bagnulo, Security Architect for ASPECT-i, to discuss a number of cloud security concerns and issues.
We captured these discussions in six short videos, each focusing on a topic. Here are the first two, on PII, data filtering, and audit and regulatory concerns (see the full series here).
In this first video, Greg and Ryan set things up with discussions on:
Challenges in deploying cloud, starting with: should you trust your cloud administrator?
Good data for early cloud adoption (such as public data like news, stocks)
Lots of news on the Twitter attack last week. There’s general consensus that it was a distributed denial-of-service (DDOS) attack and that it targeted a particular account.
DDOS attacks are tricky things. There’s no one technique, product, or protocol that will stop them. That’s what makes them so nasty. To defend against one, a company needs to be able to quickly take countermeasures at all the different levels of the protocol stack, including firewalls, routers, load balancers, and even in the application itself. And even more importantly, you need to have experienced, well-trained operations people who know how to quickly identify an attack and come up with a way to minimize its effects.
When a DDOS attack affects a web service – and there’s a good chance that the Twitter attack made use of the Twitter API – then proper traffic management at the API level is an important part of this stack of protection.
For instance, if a DDOS attack targets a particular user account, or API key, it may be possible to block those particular requests before they get to the back-end application servers, so that the effect of the attack is limited. Or, if the attack always sends some of the same parameters in an HTTP query string, or in the body of an HTTP POST, it may be possible to detect those patterns and reject the requests, again before they reach the back-end application servers. In the worst case, it may even be necessary to shut off access to a particular web service or web service operation, allowing other services to function as best they can while the attack is neutralized by other means.
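A minimal sketch of that kind of request filtering is shown below. The `should_reject` function, the blocked key, and the example pattern are all hypothetical; in practice an operations team would load such rules into whatever gateway or proxy sits in front of the application servers.

```python
import re

# Hypothetical block rules an operations team might add during an attack.
BLOCKED_KEYS = {"key-under-attack"}
BLOCKED_PATTERNS = [re.compile(r"(^|&)q=spam-phrase(&|$)")]


def should_reject(api_key, query_string, body=""):
    """Return True if the request matches a block rule."""
    # Rule 1: drop everything aimed at (or sent with) a targeted key.
    if api_key in BLOCKED_KEYS:
        return True
    # Rule 2: drop requests whose query string or POST body repeats
    # a parameter pattern seen in the attack traffic.
    return any(p.search(query_string) or p.search(body)
               for p in BLOCKED_PATTERNS)
```

The check runs before any back-end work is done, so a rejected request costs little more than a set lookup and a few regular-expression matches.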
In other words, just as it may be possible to stop a DDOS attack by blocking a range of IP addresses, or redirecting certain URLs, or configuring traffic shaping in a router or an ADC, it may sometimes be necessary to stop a DDOS attack closer to the application level. The ability to inspect requests at the web services level and take action based on the request content gives an operations team one more weapon to use against an attack.
The Twitter API already has a usage quota on each user account, a rate limit on each IP address, and extensive caching. Imagine how much easier a DDOS attack might have been if they didn’t even have those things.
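To give a feel for what a per-client rate limit involves, here is a minimal fixed-window counter. It is a sketch under stated assumptions, not a description of how Twitter implements its limits: the `RateLimiter` class, the window size, and the use of an IP address string as the client identifier are all illustrative.

```python
import time
from collections import defaultdict


class RateLimiter:
    """Fixed-window request counter per client (hypothetical sketch)."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._counts = defaultdict(int)  # (client, window index) -> requests seen

    def allow(self, client):
        # Requests in the same time window share one counter.
        window = int(self._clock() // self._window)
        key = (client, window)
        if self._counts[key] >= self._limit:
            return False
        self._counts[key] += 1
        return True
```

A production version would also need to expire old counters and share state across servers; this one keeps every window in memory for simplicity.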
Last week we wrote a bit on OAuth as an option for API security. But today I wanted to bring up a related OAuth issue - how do you securely manage all those keys?
With traditional username / password authentication, good security practices require that you don't just keep a big database on the back end with a list of unencrypted passwords. Instead, a hash of the password is stored, preferably using a salt. So someone who can read the password file can verify they have the right password, but cannot see the actual password.
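As one possible illustration of salted hashing, the sketch below uses PBKDF2-HMAC-SHA256 from Python's standard library. The choice of algorithm, the iteration count, and the function names are assumptions for the example; nothing here prescribes a particular scheme.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=100_000):
    """Return (salt, digest) for storage; the password itself is not kept."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for each password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, iterations)
    return salt, digest


def verify_password(password, salt, digest, iterations=100_000):
    """Recompute the hash from the candidate password and compare."""
    _, candidate = hash_password(password, salt, iterations)
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(candidate, digest)
```

Only the salt and digest are stored, so the server can check a login attempt without ever being able to read the original password back out.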
It is still critical to protect access to these hashed passwords. Otherwise, an attacker can mount a dictionary attack to try and crack them. However, even if someone gains access to your entire database of hashed passwords, they can still only easily gain access to lousy passwords. At least users who choose secure passwords are relatively safe. (It is also critical to protect access to the cleartext password, but at least this mechanism doesn’t require that it be stored in a database for all to see.)
As networking and middleware people, we spend a lot of time thinking about the security of our network protocols, and especially ensuring that someone eavesdropping on a network cannot grab our passwords and other sensitive data as they fly by. But how many times have we heard of a security breach caused by a stolen laptop? I would argue that protecting so-called “data at rest” is just as important as, or maybe even more important than, protecting the data flying around your network.
Now, back to OAuth. Each “user” in OAuth holds something called an “access token,” which is like a username, and a “token secret,” which is like a password. When a request is sent over the network containing an OAuth authentication token, a bunch of data in the token is signed using the token secret, but the secret itself is never sent over the network. That way, regardless of whether SSL is in use, there is no way to gain access to the token secret by sniffing the network.
However, on the server side, in order to validate the OAuth token, the server must make the same calculation that the client made when it signed the data to put in the token. That means that both the client side and the server side in OAuth must be able to read the unencrypted token secret from some sort of database. Without it, OAuth doesn’t work. There’s no standard way of storing those secrets like there is for passwords, so presumably different implementations are storing them in different ways.
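The need for a readable secret on both sides follows directly from how a keyed signature works. The sketch below is a deliberately simplified stand-in, not the OAuth 1.0a algorithm: the real signature covers more request data (such as a nonce and timestamp) and also involves the consumer secret. The function names and the `TOKEN_SECRETS` table are hypothetical.

```python
import hashlib
import hmac

# Hypothetical server-side table: access token -> token secret.
# The secret must be readable here, which is the storage problem at hand.
TOKEN_SECRETS = {"token-abc": b"secret-xyz"}


def sign(method, url, token, secret):
    """Client side: sign the request details with the token secret."""
    message = f"{method}&{url}&{token}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha1).hexdigest()


def verify(method, url, token, signature):
    """Server side: repeat the same calculation and compare."""
    secret = TOKEN_SECRETS.get(token)
    if secret is None:
        return False
    expected = sign(method, url, token, secret)
    return hmac.compare_digest(expected, signature)
```

Unlike a password check, `verify` cannot work from a one-way hash of the secret: it has to feed the secret itself into the HMAC, so whatever is stored must be recoverable.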
As a result, any client and any server that uses OAuth has to take extra-special care with all those token secrets. Otherwise, anyone who gets access to the database of tokens and secrets used by the back end servers immediately has access to all the OAuth-enabled accounts.
I am not suggesting a change to the OAuth protocol here — it solves an important problem. However, I am suggesting that anyone who implements either the “service provider” or “consumer” side of OAuth take very special care of those tokens!
For instance:
If they’re on a regular disk file, protect them using filesystem permissions, make sure that they’re encrypted, and hide the password well.
If they’re in a database, encrypt the fields, store the key well, and protect access to the database itself carefully.
If they’re in LDAP, do the same.
Come to think of it, perhaps the world needs a standard LDAP schema for storing OAuth secrets in a secure way. Anyone care to make a proposal?