Twitter and Google offer case studies in spanning distributed systems

The tools these web giants build to span data centers and geographic distance for real-time services are what will make the web more resilient and scalable. Sure, to the layman, their Twitter photos are in “the cloud,” so of course they are available. But the idea of “the cloud” as a monolithic place that holds everything on the internet is as simplistic as saying that milk comes from cows. Yes, your photos are in the cloud, and yes, your milk does come from a cow, but delivering a picture uploaded to Twitter, like getting pasteurized milk to your door, is a complicated process that has required years of investment and innovation.

Twitter made news yesterday by announcing a new system for hosting photos shared on its service, but the most interesting tidbits came from an in-depth look at how the company is managing software and infrastructure to deliver the photos users upload from multiple locations around the world without introducing a lot of latency. Google, Facebook and other web giants are also facing this problem.

And in Twitter’s case, its Blobstore announcement also pulls back the curtain on an increasingly important element of serving a global audience on distributed networks.

Building Blobstore for more than just photos

From Twitter’s post on the topic:

“When a user tweets a photo, we send the photo off to one of a set of Blobstore front-end servers. The front-end understands where a given photo needs to be written, and forwards it on to the servers responsible for actually storing the data. These storage servers, which we call storage nodes, write the photo to a disk and then inform a Metadata store that the image has been written and instruct it to record the information required to retrieve the photo. This Metadata store, which is a non-relational key-value store cluster with automatic multi-DC synchronization capabilities, spans across all of Twitter’s data centers providing a consistent view of the data that is in Blobstore.”
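In code terms, that write path looks roughly like the sketch below. The class names, the two-copy replication factor and the hash-based placement are illustrative assumptions, not Twitter’s actual implementation.

```python
import hashlib

class StorageNode:
    """Stand-in for a server that writes photo bytes to its local disks."""
    def __init__(self, name):
        self.name = name
        self.disk = {}          # in place of a real filesystem

    def write(self, photo_id, data):
        self.disk[photo_id] = data
        return self.name        # the "location" of this copy

class MetadataStore:
    """Stand-in for the multi-data-center key-value store that records where photos live."""
    def __init__(self):
        self.index = {}

    def put(self, photo_id, locations):
        self.index[photo_id] = locations

def put_photo(photo_id, data, nodes, metadata, replicas=2):
    # Front end: hash the photo ID to decide which storage nodes get a copy.
    start = int(hashlib.md5(photo_id.encode()).hexdigest(), 16) % len(nodes)
    chosen = [nodes[(start + i) % len(nodes)] for i in range(replicas)]
    # Storage nodes write the bytes first; only then does the metadata store
    # record the locations, so any data center can find the photo later.
    locations = [node.write(photo_id, data) for node in chosen]
    metadata.put(photo_id, locations)
    return locations
```

Calling put_photo with a photo ID, its bytes, a list of StorageNode objects and a MetadataStore would land two copies on two nodes and leave one metadata entry pointing at both.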

Twitter doesn’t have one data center; it has several (it doesn’t own all of them, but it has servers in several facilities). So figuring out where a picture is stored across those many servers and delivering it quickly when someone tweets it or tries to look at it requires a lot of code and a lot of decisions about how Twitter’s engineers trade off reliability against cost, and speed against cost.

Blobstore Components

Let’s talk about reliability. For Twitter’s purposes, copies of a photo should exist in multiple places, but not in so many places that the company spends too much money and equipment keeping those copies on file. It also has to consider how and where it stores the information about each photo’s location on Twitter’s servers.

With tons of pictures uploaded to the service, that adds up to a lot of metadata about each picture and its home on a server. If the server storing that metadata went down, it would essentially render those photos irretrievable. But you can’t keep copies of those locations in too many places either, because they’re hard to keep synced and they take up room. So Twitter decided to find a happy medium that allows for high reliability but also fast re-replication if a server containing one of its storage nodes dies.
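A back-of-the-envelope calculation shows why a handful of copies is enough: if each copy independently has some small chance of being lost before it can be re-replicated, the odds of losing them all shrink geometrically with every extra copy, while the storage bill grows only linearly. The numbers below are made up for illustration.

```python
# Rough durability arithmetic; the numbers are illustrative, not Twitter's.
# If each copy independently has a 1% chance of being lost before it can be
# re-replicated, the chance of losing *every* copy is p ** r, while the
# storage cost grows linearly with r.
p_loss_per_copy = 0.01

for replicas in (1, 2, 3, 4):
    p_all_lost = p_loss_per_copy ** replicas
    print(f"{replicas} copies: P(photo unrecoverable) ~ {p_all_lost:.0e}, storage cost {replicas}x")
```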

It built a library called libcrunch to do this, and apparently it plans to share the details of that tool in a later post. Twitter says libcrunch “understands the various data placement rules such as rack-awareness, understands how to replicate the data in a way that minimizes risk of data loss while also maximizing the throughput of data recovery, and attempts to minimize the amount of data that needs to be moved upon any change in the cluster topology (such as when nodes are added or removed).”
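Twitter hasn’t published libcrunch’s internals yet, but the properties it lists, rack-awareness and minimal data movement when nodes come and go, are hallmarks of consistent-hashing-style placement. Here is a rough sketch of that general technique; it is not libcrunch.

```python
import bisect
import hashlib

def _point(key):
    # Map any string onto a fixed position on a hash ring.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

def place_replicas(photo_id, nodes, rack_of, replicas=2):
    """Pick `replicas` storage nodes on distinct racks for a photo.

    `nodes` is a list of node names and `rack_of` maps node -> rack.
    Each node sits at a fixed point on the ring, so adding or removing
    one node only moves the photos that hashed near it.
    """
    ring = sorted((_point(n), n) for n in nodes)
    points = [p for p, _ in ring]
    start = bisect.bisect(points, _point(photo_id)) % len(ring)

    chosen, used_racks = [], set()
    for i in range(len(ring)):
        node = ring[(start + i) % len(ring)][1]
        # Rack-awareness: never place two copies of one photo in the same rack,
        # so a single rack failure can't take out every replica.
        if rack_of[node] in used_racks:
            continue
        chosen.append(node)
        used_racks.add(rack_of[node])
        if len(chosen) == replicas:
            break
    return chosen
```

With a scheme like this, removing a node only forces the photos that hashed onto it to shift to the next node on the ring, which is the “minimize the amount of data that needs to be moved” property Twitter describes.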

Smarter replication meets cheap (but slow) storage

It sounds complicated because it is. On the speed side, Twitter has chosen to use cheap hard drives to store the actual photos for posterity, but because finding an image on a spinning hard disk takes so much time (the drive has to seek to the right spot on the platter), it stores the locations of photos on SSDs. Then when a user requests a photo, there’s no searching the hard drives, only a direct read. This is akin to using a search engine as opposed to trying to read the entire web in order to find a piece of information.
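The pattern is a small, fast index in front of big, cheap storage. A minimal sketch of that read path, assuming a hypothetical index that maps each photo ID to a volume file, byte offset and length:

```python
import os

def read_photo(photo_id, index, volume_dir):
    # The index is the part you would keep on SSDs: one fast lookup maps the
    # photo ID to (volume file, byte offset, length) on a cheap hard drive.
    volume, offset, length = index[photo_id]
    with open(os.path.join(volume_dir, volume), "rb") as f:
        f.seek(offset)       # jump straight to the bytes; no scanning the disk
        return f.read(length)

# Hypothetical usage:
# index = {"photo123": ("volume_0001.dat", 4096, 88211)}
# photo_bytes = read_photo("photo123", index, "/data/blobs")
```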

So what the heck does this have to do with Google? Twitter is building tools to solve a problem that’s becoming more common to webscale businesses as we toss more information onto the cloud and consume more web services. When networks are distributed, replicating and finding information across them quickly is a challenge. Google’s Spanner database, which keeps data in sync across multiple data centers and requires Google to put atomic clocks and GPS clocks on racks of servers to make sure time stays in sync, is the search giant’s answer to this problem.
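Google’s published Spanner design exposes that clock uncertainty directly: its TrueTime API returns a time interval rather than a single timestamp, and a transaction waits out the uncertainty before its commit becomes visible. Here is a simplified sketch of that “commit wait” idea, with an invented uncertainty bound; it is not Google’s code.

```python
import time

CLOCK_UNCERTAINTY = 0.007   # assume clocks agree to within +/- 7 ms, kept tight by GPS/atomic references

def now_interval():
    """TrueTime-style reading: the true time lies somewhere in [earliest, latest]."""
    t = time.time()
    return t - CLOCK_UNCERTAINTY, t + CLOCK_UNCERTAINTY

def commit(write_to_replicas):
    # Choose a commit timestamp no earlier than any clock could currently read.
    _, commit_ts = now_interval()
    write_to_replicas(commit_ts)
    # "Commit wait": hold the result back until every clock, in every data center,
    # has definitely moved past the chosen timestamp. That is what lets timestamp
    # order match real-time order across sites.
    while now_interval()[0] <= commit_ts:
        time.sleep(0.001)
    return commit_ts
```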

Twitter’s answer is Blobstore. But the point is that these tools are going to do for distributed computing and services what refrigerated trucks did for milk distribution: deliver a homogeneous product to more places at a lower cost. And I bet we see more of these enablers on the horizon.


GigaOM