Does open source still matter in cloud computing?

Open source has a role in all types of computing.
It has always reduced barriers to entry to a particular platform or programming model.
But in cloud computing, open source plays a new role: it drives the infrastructure to evolve at a rate that keeps pace with an early, fast-moving market.
Open Source in Cloud Platforms
Open source platforms like Xen, OpenNebula, and Eucalyptus enable universities and early-stage adopters to deploy the core platforms without financial risk. No up-front commitment in a cash-poor environment like the current economy means that pilot projects can still move forward as companies and institutions explore the potential of cloud computing.
Xen and OpenNebula enable development of on-premises cloud environments. Eucalyptus goes a step beyond this to emulate Amazon's EC2 APIs for cloud infrastructure. Eucalyptus effectively fills a gap in Amazon's strategy - the lack of a zero-cost entry point for EC2 pilot projects - and adds weight to the de facto standard stature of the EC2 APIs.
Open Source in Cloud Programming Models
Cloud computing is expanding much more rapidly than service-oriented architectures did, largely due to the accessibility of the model. It can be understood with little effort by those who already have web programming skills, and it is implemented as REST services in a range of popular programming platforms: Ruby on Rails, PHP Symfony, Apache Axis2.
There are SOAP cloud services as well. But the REST model is easier to pick up, implement and integrate. Again, the ubiquity of open source through the entire cloud stack - from programming layers down to HTTP and TCP/IP - means that the time and cost barriers to entry to this market are very low, which has led to its rapid expansion.
Open Source in Evolution of the Cloud Computing Market
The most important aspect of open source in cloud computing is the open collaboration model, which is a property of the communities developing both infrastructure and frameworks.
Cloud-scale computing poses fundamentally different problems, such as how to manage writes, reads and searches across huge volumes of data once that data is consolidated in a single place. Hadoop and Cassandra represent two very different solutions to different aspects of this problem.
As the largest cloud infrastructure platforms (Facebook, Digg, Yahoo, Microsoft, Google, and others) continue to push the envelope of scale, new approaches must emerge. Hadoop is a "stand the problem on its head" approach to rapidly searching through huge repositories for particular pieces of data, and it is evolving rapidly because multiple parties contribute to it as they solve their individual business problems. Cassandra reimagines large-scale reads and writes outside of a traditional relational model, avoiding the architectural bottlenecks that exist in the RDBMS genre.
Programming models will continue to evolve along with the infrastructure and platforms. Open source community development will keep showing its value as framework and application developers codify their understanding of the new patterns and practices of cloud computing into existing and new application frameworks.
Commercial Open Source in Cloud Computing
Commercial opportunities surround these technologies, both in adapting to new de facto standards and in delivering faster or more efficient infrastructure.
Two good examples are Heroku and Cloudera. Heroku is an EC2-based fabric for Ruby on Rails that stays fully compliant with the evolving standards of the Rails community while providing an implementation of Rails functions that scales efficiently on AWS. Cloudera is likewise an EC2-based packaging and support offering; it makes it simple to deploy Hadoop, which is powerful but notoriously hard to get up and running. I expect that savvy storage vendors will find ways to build Cassandra-compatible infrastructure that enhances scalable operations and analysis of next-gen cloud applications.
Does open source matter in cloud computing?
Some, including Tim O'Reilly, have said that open source doesn't matter in the cloud. I think what was meant is that once your computing workloads are hosted on a remote service, the most important things to you as a customer are reliability, uptime, scalability, interoperability and manageability - not whether the cloud infrastructure or application provider is running on open source. I think this is true.
But cloud computing does not make open source irrelevant. The enormous value that cloud computing is generating, and the rate at which it is changing, demand that open source play a ubiquitous role. It will not always be a flashy role and may not seek the limelight, but it will be the glue that lets the whole system evolve and mature.
COMMENTS (1)
November 03, 2009 at 4:06 pm Sam Boonin
Sam-
You may want to check out Roman’s blog on a similar topic last month: Small Pieces Tightly Joined: Open Source in the Cloud
Good discussion there…
http://roman.stanek.org/2009/07/09/open-source-in-the-cloud/#comments