Thoughts on API Best Practices API Management and Infrastructure Blog

Consider an API to unlock ‘dead data’

Book digitizing made the news last week, as Google announced a deal to digitize up to a million books from Italian libraries. Despite some author and publisher resistance, similar efforts are proceeding in Europe, including Norway’s Bookshelf project and Europeana.

What is the motivation for digitizing vast libraries of books?

Yngve Slettholm of Kopinor summarizes it well: “The vast majority of books are out of print and can be considered commercially dead.” “This creates an extra source of revenue for older books.”

What does this have to do with APIs?

How much ‘commercially dead’ data, content, or services is your company sitting on? Are you fully monetizing your assets? If you gave as-yet-unknown 3rd parties access to data or content locked in your company, what new business opportunities and revenue streams would be created?

That’s what an open API does. It unlocks data and content, giving it the potential to be consumed in new, innovative ways, and monetized. Via an API you can get the broadest possible distribution for the lowest cost.

One example of this is BestBuy’s Remix program: BestBuy opened the Remix API, and now BestBuy can be accessed via mobile apps and in new ways and places where its customers are. Other examples are Sears, which has opened its vast product catalog, and TransUnion, which made its core credit report and credit monitoring services available to 3rd party financial companies.

It is not always clear who will use the capabilities you expose, or how. But one thing is certain: if they are not accessible, they will not be used.

The book digitization efforts also hold some lessons for how to manage such a program, which I’ll cover in my next post.

 

Do you need API keys?  API Identity vs. Authorization

(This is part 3 in our series on "Is your API naked? 10 API Roadmap considerations")

We’ve seen very few API providers with a completely open API – almost all employ at least one of these:

  • Identity - who is making an API request?
  • Authentication - are they really who they say they are?
  • Authorization - are they allowed to do what they are trying to do?

Do you need them all?  Maybe not.  Some APIs need only to establish identity and don’t really need to authenticate or authorize.

API Identity vs. Authentication - Compare Google Maps and Twitter  

Take Yahoo and Google Maps: they are fairly open.  They want to know who you are, but they aren’t concerned about what address you are looking up. So they use an “API key” to establish identity, but don’t authenticate or authorize. If you use someone else’s API key, it’s not good, but it’s not a serious security breach.

The API key lets them identify (most likely) who is making an API call so they can limit the number of requests you can make. Identity is important here to keep service volume under control.
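
To make the idea concrete, here is a minimal sketch of limiting requests per API key. The quota value, the key strings, and the `allow_request` helper are all assumptions made for illustration; this is not how Yahoo or Google actually implement their limits, and resetting the counters each day is left out.

```python
from collections import defaultdict

# Hypothetical quota per API key; a real provider would choose its own limit
# and reset the counters on some schedule (omitted here).
QUOTA = 3

# Number of requests seen so far, keyed by API key.
request_counts = defaultdict(int)


def allow_request(api_key: str) -> bool:
    """Return True and record the call if this key is still under its quota."""
    if request_counts[api_key] >= QUOTA:
        return False
    request_counts[api_key] += 1
    return True


# Four calls with the same (made-up) key: the fourth exceeds the quota.
results = [allow_request("key-abc") for _ in range(4)]
print(results)  # [True, True, True, False]
```

Note that nothing here proves the caller owns the key. The key only identifies the caller well enough to count requests, which is all this kind of API needs.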

Then take Twitter’s API: it is open for looking up public information about a user, but other operations require authentication. So Twitter supports both username/password authentication and OAuth. Twitter also has authorization checks in its code, so that you cannot “tweet” on behalf of another user without either their password or an OAuth key to their account. This is an example of an API that implements identity, authentication and authorization.
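
The three checks can be sketched as separate steps. This is a toy illustration under our own assumptions, not Twitter's code: the user store, the fixed salt, and the `post_update` function are all hypothetical.

```python
import hashlib
import hmac

SALT = b"demo-salt"  # illustrative only; real systems use a random salt per user


def hash_password(password: str) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), SALT, 100_000)


# Hypothetical user store: username -> password hash.
USERS = {"alice": hash_password("s3cret"), "bob": hash_password("hunter2")}


def authenticate(username: str, password: str) -> bool:
    """Authentication: is the caller really who they claim to be?"""
    stored = USERS.get(username)
    return stored is not None and hmac.compare_digest(stored, hash_password(password))


def authorize(caller: str, account: str) -> bool:
    """Authorization: in this sketch a caller may only post to their own account."""
    return caller == account


def post_update(username: str, password: str, account: str, text: str) -> str:
    # Identity is the claimed username; it is verified before anything else happens.
    if not authenticate(username, password):
        return "401 Unauthorized"
    if not authorize(username, account):
        return "403 Forbidden"
    return f"200 OK: {account} posted {text!r}"


print(post_update("alice", "s3cret", "alice", "hello"))  # 200 OK: alice posted 'hello'
print(post_update("alice", "wrong", "alice", "hello"))   # 401 Unauthorized
print(post_update("alice", "s3cret", "bob", "hello"))    # 403 Forbidden
```

The second call fails authentication (wrong password), while the third passes authentication but fails authorization (alice is not allowed to post as bob).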

The “API Key” – do you need one? 

API keys originated with the first public web services, like the Yahoo and Google APIs.  The developers wanted some way to establish identity without the complexity of actually authenticating users with a password, so they came up with the “API key,” which is often a UUID or unique string. If the API key doesn’t grant access to very sensitive data, it might not be critical to keep it secret, which makes the key easy for consumers to use however they invoke the API.
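
As a rough sketch of what issuing such a key might look like, the example below generates a UUID and remembers which developer it belongs to. The in-memory registry, the function names, and the email address are assumptions for illustration; a real provider would persist keys somewhere durable.

```python
import uuid

# Hypothetical registry mapping each issued key to the developer it identifies.
issued_keys = {}


def issue_api_key(developer: str) -> str:
    """Create a random UUID-based key and remember who it was issued to."""
    key = str(uuid.uuid4())
    issued_keys[key] = developer
    return key


def identify(api_key: str):
    """Return the developer behind a key, or None if the key is unknown."""
    return issued_keys.get(api_key)


key = issue_api_key("dev@example.com")
print(identify(key))               # dev@example.com
print(identify("not-a-real-key"))  # None
```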

An API key gives the API provider a way to know (most of the time) the identity of each caller, in order to maintain a log and establish quotas by user (see the last section). 

Usernames and Passwords – again, see Twitter

With more sensitive data, a simple API key is not enough unless you take measures to ensure users keep the key secret. An alternative is username/password authentication, like the authentication scheme supported by the vast majority of secure web sites. 

It’s easiest to use “HTTP Basic” authentication that most websites use. The advantage of using this technology is that nearly all clients and servers support it. There is no special processing required, as long as the caller takes reasonable precautions to keep the password secret.
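
For readers who have not looked under the hood, here is a small sketch of what an HTTP Basic `Authorization` header contains. The two helper functions are our own names, and the username and password are the familiar example pair from the HTTP authentication specification.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def parse_basic_auth(header: str):
    """Recover (username, password) from a Basic Authorization header value."""
    scheme, _, token = header.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    # Split on the first colon only: the password itself may contain colons.
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


header = basic_auth_header("Aladdin", "open sesame")
print(header)                    # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
print(parse_basic_auth(header))  # ('Aladdin', 'open sesame')
```

Note that Base64 is an encoding, not encryption: anyone who sees the header can recover the password, which is why the precautions mentioned above matter.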

Twitter simplifies things for their users by using usernames and passwords for API authentication.  Every time a user starts a Twitter client, it either prompts for the username and password to use, or it fetches them from the disk (where they are scrambled or encrypted where possible). So here it makes a lot of sense to have the same username / password for the Twitter API that is used for the web site.

Usernames and passwords also work well for application-to-application communications. The trick is that the password must be stored securely, and if it’s being used by a server, where do you store it?  If you are running an application server that uses a database, you have already solved this same problem, because the database usually requires a password too. Better application server platforms include a “credential mapper” that can be used to store such passwords relatively securely.
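
If your platform has no credential mapper, one common fallback is to keep the password out of the code and read it from the server's environment at startup. The sketch below shows that simpler technique, not a credential mapper, and the variable names (`PARTNER_API_USERNAME`, `PARTNER_API_PASSWORD`) are hypothetical.

```python
import os


def load_credentials(prefix: str = "PARTNER_API"):
    """Read a username/password pair from environment variables.

    The variable names are hypothetical; the point is only that the
    password lives in the server's configuration, not in source code.
    """
    username = os.environ.get(f"{prefix}_USERNAME")
    password = os.environ.get(f"{prefix}_PASSWORD")
    if not username or not password:
        raise RuntimeError(f"missing {prefix}_USERNAME or {prefix}_PASSWORD")
    return username, password


if __name__ == "__main__":
    try:
        user, _ = load_credentials()
        print(f"loaded credentials for {user}")
    except RuntimeError as err:
        # Fail loudly at startup rather than on the first outbound call.
        print(f"configuration error: {err}")
```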

There is a lot we'd like to write about security, so we'll split this up into a couple of entries.  Next time:  Session-based authentication, OAuth, SSL and WS-Security, and rolling your own.