API Best Practices Blog

Mobile patterns are different from Web patterns »

Mobile application patterns are different from web application patterns.  There is a consistent, discrete set of differences in how they access cloud services.  There are consistent reasons why they’re favored over websites as well, primarily based on implicit intent and purposive computing experience, but that’s a subject for a future blog entry.

For now, let’s assume that like web applications, mobile applications use HTTP to access their services, but unlike old-school web applications, they use REST and SOAP as the basis of their service protocols.

Difference 1: Bandwidth is expensive

Bandwidth always costs two things in mobile applications – time and battery.  Jeffrey Sharkey has a great talk about battery usage and good citizenship.  In some areas and for some users, bandwidth also hits their data plan, which makes bandwidth cost real money as well.

Difference 2: Bandwidth is inconsistent

Disconnections are part of everyday life when using the mobile internet to access websites or cloud services. When your local cell area becomes overloaded with requests, when your service loses track of where you are between towers, or when you are in even a momentary cell shadow, your connection is gone.  If this happens in the middle of a data connection, that connection is reset and has to start over.

Difference 3: Local processing matters

First, non-trivial requests for data from cloud services often result in large datasets being returned to the device.  These chunks can not only be hard to process, but may be more information than the user will bother to access.  A request that returns hundreds of row-equivalents worth of responses may be mostly wasted processing if the user is only going to glance at the first few displayed screens’ worth.

Second, local applications have differentiated access to devices – awareness of onboard camera, location, or other services.  They also have differentiated preferences about data.  For example, the iPhone operating system is fluent in XML processing, and many iPhone applications transmit XML dialects to their cloud services.  However, XML is more expensive to the iPhone than PLISTs (a JSON-like simple data format) – roughly 4-5 times more expensive in compute cycles.  Other mobile devices have their own variations based on operating system version and device services.

Difference 4: Welcome to the hit-driven app economy

Media has been a hit-driven economy for decades, with winners and losers being made and broken overnight based on the wisdom or madness of crowds.  With the fantastic potential of mobile applications and the inconsistent actual experiences, collaborative filtering and editorial selection are producing a hit-driven “Top 25” economy for mobile applications – the equivalent of being “above the fold” in a website.  It’s a steady climb to get in this elite area, but once your application gets to this point you might as well have been written up by Walt Mossberg or Slashdot – traffic, downloads, and usage of the cloud services backing your app will all surge dramatically.

Difference 5: Concurrent usage by millions of nomadic users

Just combining Difference 2 (inconsistent bandwidth) with the fact that most mobile connections use HTTP 1.0 means that many more connections are being made and dropped, with each drop forcing an expensive reset not just of application state but of the HTTP connection itself.  Adding Difference 4 (the hit-driven app economy) to this means that concurrency – including “shadow concurrency”, the load of the dropped and restarted connections – has an even bigger role in mobile applications than in traditional web computing.

Solving for mobile application patterns in cloud computing

There are probably ways to solve each of these problems individually.  What we’ve seen with some of our key customers is that all of these can be solved by applying a cloud service controller to manage the connections between mobile applications and their cloud services.

With a cloud service controller in the middle of the application and cloud service interactions, they’ve done the following things:

  • Compressed service request and response data by 6-10x
  • Accelerated service response time 5-7x through intelligent caching
  • Carved large service responses into chunks likely to be used by the mobile user
  • Translated service responses into formats easily processed by the mobile device
  • Reduced total network airtime usage by 15-20x
  • Reduced battery drain on the mobile device
  • Reduced dropped connection experiences for the mobile application
  • Scaled caching and response capacity dynamically to match growth and spikes in usage
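
As a rough illustration of the "carved large service responses into chunks" item above, here is a minimal Python sketch of pagination. The function name `paginate`, the default page size, and the shape of the returned dictionary are assumptions made for this example; this is not Apigee policy syntax.

```python
from typing import Any, Dict, List, Sequence


def paginate(rows: Sequence[Any], page: int, page_size: int = 20) -> Dict[str, Any]:
    """Return one page of a large result set, plus metadata for fetching the next."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    chunk: List[Any] = list(rows[start:start + page_size])
    total_pages = (len(rows) + page_size - 1) // page_size  # ceiling division
    return {
        "page": page,
        "total_pages": total_pages,
        "rows": chunk,
        "has_more": page < total_pages,
    }


if __name__ == "__main__":
    # Stand-in for a large backend response (hypothetical data).
    listings = [f"listing-{i}" for i in range(1, 451)]
    first = paginate(listings, page=1)
    print(len(first["rows"]), first["total_pages"], first["has_more"])
```

The device only downloads and parses the first screens' worth of rows, and asks for later pages only if the user scrolls that far.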


In one example, a customer took the network request/response time from 17 seconds to 1 second, and took local processing on the mobile device from 17 seconds to 1 second.  This reduced total application response time from 34 seconds to 2 seconds – an acceptable, even exciting level of responsiveness for that application’s users.  This was all achieved in a few weeks without rewriting either the mobile application or the cloud service upon which the application depended.

They did this by taking our core product (Apigee Enterprise) and writing policies that let it route, cache, accelerate, paginate, and format their cloud services.  Since Apigee Enterprise is available as an .ami (Amazon EC2’s virtual machine format), we've deployed it as a cloud service that expands and contracts its use of computing resources to match the load.  This way they haven’t been caught unprepared when their app made the top 25 list on the iPhone App Store, and their legions of new users had the same responsive application experience as the users who popularized it in the first place.  Finally, future devices and application platforms will be easier to support from a single cloud service through construction of new formatting or pagination policies that match the needs of those device platforms.

Mobile acceleration may seem like a standalone thing for Apigee, but really it’s an example of using policy to solve an application pattern challenge.  More on that – and policy-oriented programming – in a future blog entry.  There are many more domains of use for this approach in cloud computing.

Here's a video we posted today describing our Mobile App Acceleration service.

In the cloud, scale means concurrency »

In enterprise computing, scale has traditionally meant “lots of transactions per second.”  On Wall Street for many years, “20,000 TPS” was the magic number as it was the rate of a typical market data feed.  Infrastructure like TIBCO’s UDP-based information bus and then IBM’s MQSeries became the base platforms for much of this scale of computing, and are still heavily used alongside modern JMS and MSMQ implementations.
 
Relatively little attention was paid to concurrent connections.  Enterprise environments tend to be well-regulated, and most applications will have under 1000 simultaneous users (whether human or machine driven).  As a result, application servers and related technologies evolved to support high transaction throughput at limited concurrency.
 
The web, on the other hand, brought in much higher concurrency requirements, and platforms like WebLogic became default components of web computing environments for sites serving thousands of people at the same time.  This was a breakthrough and led to significant market success in a short time period.
 
With the rise of cloud computing, two things change.  First, mobile applications and the API economy are driving an order of magnitude increase in the number of simultaneous users.  Second, these users are often machines rather than people, and therefore aren’t limited to the demand patterns of human users clicking links or refreshing their pages.
 
This produces a new set of demand patterns which increase both total throughput and peak concurrency.  As an example, travel sites like Kayak.com and Bing.com/travel issue hundreds of API requests to airline reservation system backends as a result of a single human-driven query.  Furthermore, these requests are being made not just by desktop or web applications but by mobile applications – especially iPhone applications.  As most people are aware, the next 10 billion devices that come online will be mobile devices (phones, MIDs, GPS, game units, media players).  Each of these is prized for its native application experiences.  Each of these devices will be making user-driven and automated calls to cloud services in order to deliver those experiences.
 
Where backend systems are not protected from this demand, they are being penalized in performance and load management.  This causes either outright outages, “web brownouts” where the core website that uses the same backend slows down, or erratic performance across both the web and cloud properties.  Again, mobile access exacerbates the issue due to the intermittent nature of mobile internet connectivity, which multiplies the number of connections that need to be set up and torn down as the device comes on and off the network.
 
So the explosion of concurrent usage is already beginning, as the traffic and backend impact is expanding.  To manage this and maintain stability of existing infrastructure, a new layer of infrastructure is emerging, much as HTTP load balancers have evolved to serve the needs of web computing.  What we’re seeing is the rise of cloud service controllers, a category of infrastructure that works well with existing systems and builds on top of the strengths of application servers, enterprise messaging systems, and application delivery controllers.

mLocal:  iPhone app API monitoring and analytics »

Apigee isn't only for API providers - if you use APIs in your mashup, mobile, or social app you can monitor those APIs as well.

Why?  You might want to find out before your users if an API is slow or down, leaving big holes in your app where content should be.  Or verify any terms of use or bill you get from an API provider.

For example, if you're an iPhone developer you know nobody will use a slow iPhone app.

Shorepoint systems is one iPhone shop using Apigee for this purpose on their iPhone apps. Their mLocal app is great for creating and sharing local listings.

mLocal makes heavy use of RESTful APIs for content and especially to communicate to a back-end content app hosted on AWS. Shorepoint uses Apigee for monitoring and debugging of these API calls between the iPhone client and AWS (in both QA and production), and especially to monitor response times and proactively find out if any API call is slowing down iPhone app performance. 

Thanks to Rajan of mLocal for all the feedback on Apigee and check out mLocal here!

How Apigee calculates API response time »

API response time and latency

Earlier this week we showed how to calculate the latency that Apigee adds to your total API transaction (vs. not using Apigee as a proxy).

This is different than the API 'response time' that you see in your Apigee API analytics dashboard graphs and tables.

If you've set up your Apigee proxy and have run some traffic, you'll see graphs for "response time" in your Apigee dashboard.  (If you haven't seen a dashboard yet, these metrics include messages, developers, error rates, data in/out, and response time)

For example, the widget above and charts below are some of the data I see for the before/after latency test I ran against the Yahoo Local API in the previous API latency blog entry, API response time and latency.

This 'response time' is the roundtrip latency from the time your request enters Apigee, hits the target, and waits for the target's response, until that response leaves Apigee on the way back to your client.

Here is how your Apigee dashboard calculates "response time", using these timestamps:

    • T1 – when message arrives at Apigee
    • T2 – when message leaves Apigee to go to the target (in this case the Yahoo Local API)
    • T3 – when message arrives at Apigee from the target
    • T4 – when message leaves Apigee to go to the client

The Apigee 'response time' graph/table shows T4 - T1, the response time of the Apigee proxy including the time taken by the target.  In this case, the Yahoo Local API response time was about 23 ms.
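
To make the timestamp arithmetic concrete, here is a small Python sketch. Only T4 - T1 comes from the dashboard definition above. The split into target time (T3 - T2) and proxy time is an assumption that follows from the four timestamps, and the class name and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProxyTimestamps:
    """Timestamps in milliseconds, named after T1..T4 in the list above."""
    t1: float  # message arrives at the proxy
    t2: float  # message leaves the proxy for the target
    t3: float  # response arrives at the proxy from the target
    t4: float  # response leaves the proxy for the client

    def response_time(self) -> float:
        # What the dashboard reports: T4 - T1.
        return self.t4 - self.t1

    def target_time(self) -> float:
        # Assumed breakdown: time spent waiting on the target.
        return self.t3 - self.t2

    def proxy_time(self) -> float:
        # Assumed breakdown: processing inside the proxy on both legs.
        return (self.t2 - self.t1) + (self.t4 - self.t3)


if __name__ == "__main__":
    # Hypothetical values chosen to add up to a 23 ms response time.
    ts = ProxyTimestamps(t1=0.0, t2=2.0, t3=21.0, t4=23.0)
    print(ts.response_time(), ts.target_time(), ts.proxy_time())
```

By construction, target time plus proxy time always equals the reported response time.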

Hope this is useful and we'd love your comments or questions here or on our support and feedback forum.

API Scalability, part 2 - caching, rate limits, and offloading »

(Following from Tuesday's blog entry on API Scalability and Caching.)

Last time we wrote about 3 things to think about when planning how to scale your API.

  • Caching
  • Rate limiting and threat protection
  • Offloading expensive processing

and then talked about caching at length, so let's finish up with:

Rate Limiting and Threat Protection

Another aspect of scaling is just keeping unnecessary traffic away from your application servers and databases. Some of the techniques that we've discussed previously, such as rate limits and threat protection, apply here as well.

For instance, an API's performance can drop precipitously if a client, on purpose or by accident, sends too much traffic. A rate limit helps a lot here!
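
One common way to implement a rate limit is a token bucket. The Python sketch below shows that general technique only; it is not how Apigee or ServiceNet implement rate limiting, and the class name, parameters, and injectable clock are assumptions made for illustration.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow roughly `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the limiter can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill for the elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # a caller would typically answer with HTTP 429 here


if __name__ == "__main__":
    bucket = TokenBucket(rate=5.0, capacity=10)
    accepted = sum(1 for _ in range(100) if bucket.allow())
    print(f"accepted {accepted} of 100 back-to-back requests")
```

In practice you would keep one bucket per API key or client, so a single noisy client is throttled without affecting everyone else.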

Bad requests can kill API performance too. XML threats, which we discussed in the last episode, are one example of a way that a bad request from a client can cause performance problems or even a crash on the server side. It's a lot easier to maintain scalability if you can stop these kinds of problems before they can hurt your servers.

Server Processing Offloading

Finally, consider the things that you can offload from your application server tier. The more you can offload to more efficient platforms, the less load your application servers have to handle. Plus, the more things you can offload, the simpler those application servers and their applications become, which means they're easier to manage and easier to scale.

For example:

SSL. Load balancers and ADCs like F5 and NetScaler products, not to mention web service proxies like Sonoa ServiceNet, can process SSL more efficiently than most application servers.

HTTP Connections. Those same products are highly optimized to handle tens of thousands of simultaneous connections from HTTP clients, and operate a smaller pool of connections to the back-end application servers. Offloading HTTP connection handling to another tier can free up a lot of server resources.

Authentication. If you perform authentication, a proxy like Sonoa ServiceNet can handle all your authentication for you, freeing your application servers to worry only about properly-authenticated requests. And if you're using SOAP, a product like ServiceNet can process many of the SOAP headers, such as WS-Security headers for authentication, then remove them so that the application server doesn't even need to see them.

Validation. If your API depends on XML input, it may run more efficiently if it only accepts valid XML requests. Turning on XML schema validation can hurt performance of most application servers - products like ServiceNet can do it more efficiently.
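
As a hedged sketch of the kind of check that can sit in front of an application server, the Python function below rejects oversized, DTD-bearing, or malformed XML before it reaches the backend. It checks well-formedness only, not XML schema validity, and it is not ServiceNet's implementation; the function name and the 64 KB limit are assumptions.

```python
import xml.etree.ElementTree as ET

MAX_BYTES = 64 * 1024  # assumed size limit for this sketch


def precheck_xml(body: bytes, max_bytes: int = MAX_BYTES) -> bool:
    """Return True if the request body is small, DTD-free, well-formed XML."""
    if len(body) > max_bytes:
        return False  # oversized payloads are rejected without parsing
    # Byte-level check; assumes a UTF-8 or other ASCII-compatible encoding.
    if b"<!DOCTYPE" in body or b"<!ENTITY" in body:
        return False  # refuse DTDs, which entity-expansion attacks rely on
    try:
        ET.fromstring(body)
    except ET.ParseError:
        return False  # malformed XML never reaches the application server
    return True


if __name__ == "__main__":
    print(precheck_xml(b"<query><zip>94708</zip></query>"))
    print(precheck_xml(b"<query><zip>94708</query>"))
```

Full schema validation needs a schema-aware parser, which the Python standard library does not include.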

So to finish up, key questions to ask for your API scalability roadmap might include:

  • What kind of volume are you expecting?
  • Are you prepared if you get 10, 100, or 10,000 times that amount of volume with little warning?
  • Do you have a way to shut a user off if they consume too much volume?
  • Do you have a way to control API traffic in case you are unable to handle the volume? (See Traffic Management.)
  • Are your back end servers capable of handling tens of thousands of concurrent connections?
  • Are your back end services cacheable? Do you have a cache that you can use to reduce response times?
  • Are you monitoring response times and tracking them to gauge customer satisfaction?

(next time:  API user management and onboarding)

Testing API latency and response time with Apigee »

There were some good comments last week on TechCrunch on the pros and cons of using a proxy for analytics and protection (or any operational or business policy) on your API.

Biggest concerns discussed were: latency, single-point-of-failure, and loss-of-control.   All great points.

We wanted to talk about latency first (and address the other two in a later post).

A proxy definitely adds latency, both from the additional server hop and from the processing time of the proxy software.    So any proxy needs to minimize latency and add enough value (capability, time-to-market, etc.)  to justify this extra hop.
 
Our conservative estimate for Apigee is to expect 200-400 ms of latency.   This is mostly due to the extra hop and includes the 20-40 ms of latency due to Apigee's proxy 'think time.' (More detail in our latency FAQ.)

Your mileage might vary based on message size, the policies you are enforcing, and where you are hosted.  For example, our estimates are based on a 5K message size.  If you proxy Twitter with its small messages, your latency will likely be less, and if you are processing big message sizes (such as inserting ads into email), it will likely be higher.
 
Test it yourself

Soon we'll introduce a tool to test your Apigee proxy's latency during the proxy setup process. In the meantime, you can test this yourself with ApacheBench (ab) or cURL:

    1. Set up your Apigee proxy (or feel free to use my Yahoo Local API proxy in the steps below)
    2. Open up a terminal.
    3. Run a *before* test to get the latency *without* Apigee.  Run this ApacheBench command (for 10 test requests).

      For this example, I'm using the Yahoo Local API's example API methods. 
       
      ab -n 10 'http://local.yahooapis.com/LocalSearchService/V3/localSearch?appid=YahooDemo&query=pizza&zip=94708&results=2'

      (This is an ApacheBench command where -n 10 specifies 10 requests. The URL is quoted so the shell does not interpret the & characters.)

      You should get a results set in this format (where the "10" was for running the test 10 times). 

      So you can see that just hitting Yahoo Local without a proxy, I get a latency of 250 ms for all 10 requests.

    4. Next, I get the latency *with* Apigee using my Apigee proxy URL.  (Feel free to use this URL yourself; don't worry, I rate limited it in Apigee.)
       
      ab -n 10 'http://yahoo-local-1.apigee.com/LocalSearchService/V3/localSearch?appid=YahooDemo&query=pizza&zip=23662&results=2'
       
      In this case my results are:

      In this case, my longest response with Apigee is 357ms.

    5. Subtract (3) from (4) and there is your approximate latency for the proxy.   Here the latency was roughly 357 ms - 250 ms = 107 ms for my 10 requests, on my Verizon card outside Berkeley's Cafe Roma.  (Thanks to Yahoo Local's great API for the recommendation.)
       
      Run this a couple of times to make sure your responses are consistent, and also mix up your API query parameters so you don't accidentally compare a cached vs. non-cached response time. For example, I changed zip codes in my Yahoo Local requests.
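
If you collect per-request timings from both runs, the subtraction in step 5 can be scripted. The Python sketch below compares medians rather than single figures, which is a substitution made here to dampen outliers, not the method used in the steps above. The function name and the sample timings are hypothetical.

```python
from statistics import median
from typing import Sequence


def proxy_overhead_ms(direct_ms: Sequence[float], proxied_ms: Sequence[float]) -> float:
    """Estimate proxy overhead as median(proxied) - median(direct), in ms."""
    if not direct_ms or not proxied_ms:
        raise ValueError("need at least one sample from each run")
    return median(proxied_ms) - median(direct_ms)


if __name__ == "__main__":
    # Hypothetical per-request timings in ms; substitute figures from your own runs.
    direct = [248, 250, 251, 250, 249, 252, 250, 250, 247, 253]
    proxied = [349, 352, 357, 350, 351, 353, 348, 355, 350, 354]
    print(f"approximate proxy overhead: {proxy_overhead_ms(direct, proxied):.0f} ms")
```

Feeding in the single figures from the example above (250 ms direct, 357 ms proxied) gives the same 107 ms.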