Tuesday, August 24

SaaSGrid: Decoupled, shared storage in grids

SaaSGrid Shared Content Storage

Many people working in a grid environment like Apprenda's SaaSGrid are uncertain how to handle shared state and shared binary content.

The situation is not unlike Amazon's EC2, though EC2 and SaaSGrid are not in the same category of offering. EC2 is a cloud of virtual machines, part of Amazon's larger AWS suite; SaaSGrid is a distributed application server and runtime for SaaS. In fact, you can run a SaaSGrid environment on top of EC2 VM images, Amazon's IaaS offering.

With EC2, you do not have persistent storage locally on the virtual machine instances launched from your AMIs: when an instance is terminated or fails, everything on its local disks is lost. Instead, Amazon offers Elastic Block Store (EBS) for non-volatile storage of the data and content your instances need when they come online, database repositories, etc.

Likewise, you cannot assume persistent local storage on any given SaaSGrid node: a request to SaaSGrid may take any of several paths to reach its destination, and any node may leave or join the grid at any time. This is why current best practices for SaaSGrid application development, and for most other service-oriented architectures, dictate stateless design and implementation wherever possible. Some state can be kept in an out-of-process cache such as memcached, but there is still often a need for a binary content repository accessible to all nodes.
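To make the idea concrete, here is a minimal sketch of stateless nodes sharing one binary content repository. This is not SaaSGrid's API: the SharedContentStore and Node classes are hypothetical names invented for illustration, and the "repository" is simply assumed to be a directory that every node can reach (a network mount, for example). Content is keyed by its SHA-256 hash so any node can fetch what any other node stored.

```python
import hashlib
import tempfile
from pathlib import Path


class SharedContentStore:
    """Hypothetical repository backed by a directory all nodes can reach."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, data: bytes) -> str:
        # Key the content by its hash, so the same bytes always get the same key.
        key = hashlib.sha256(data).hexdigest()
        target = self.root / key
        if not target.exists():
            # Write to a temp file, then rename, so readers never see a partial file.
            # (Assumption: rename is atomic on the underlying file system.)
            with tempfile.NamedTemporaryFile(dir=self.root, delete=False) as tmp:
                tmp.write(data)
            Path(tmp.name).replace(target)
        return key

    def get(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


class Node:
    """Stand-in for a grid node: keeps no content locally, only a store handle."""

    def __init__(self, name: str, store: SharedContentStore):
        self.name = name
        self.store = store

    def upload(self, data: bytes) -> str:
        return self.store.put(data)

    def download(self, key: str) -> bytes:
        return self.store.get(key)


if __name__ == "__main__":
    shared_root = tempfile.mkdtemp()  # stands in for the shared mount
    node_a = Node("a", SharedContentStore(shared_root))
    node_b = Node("b", SharedContentStore(shared_root))

    key = node_a.upload(b"tenant logo bytes")
    # A different node serves the follow-up request and still finds the content.
    print(node_b.download(key) == b"tenant logo bytes")
```

Because the nodes hold nothing but a handle to the store, either one can drop out of the grid without taking any content with it; only the key needs to travel with the request (or live in the out-of-process cache).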

I diagrammed this concept over on the SaaSGrid Developer Blog yesterday.