Why Uncle Sam might be ready for Hadoop in the cloud

The federal government has been gung ho about cloud computing for the past few years, but is it ready to do big data in the cloud? Federal contractor GCE Federal thinks so: It’s offering GCE Cloud, a Hadoop-based service designed to let federal agencies outsource their data analytics workloads.

The big obstacle for GCE is that it’s essentially asking federal agencies, a cautious bunch if ever there was one, to try two entirely new technologies at once. Cloud computing isn’t yet commonplace among agencies, and using Hadoop to try to make sense of previously underutilized, unstructured data is certainly a new concept for many of them. But GCE founder and CEO Ray Muslimani has a plan.

He’s hoping federal budget constraints will lower inhibitions about trying new technologies, and perhaps even lead agencies to look at Hadoop as a way to save money. The business case for big data in the cloud is easy to make because many agencies aren’t investing in the big infrastructure necessary to do those jobs themselves, Muslimani told me. Ultimately, government IT decision makers just want to know what a vendor can give them and what it will cost, but “that’s not the conversation you would have had five years ago.”

Early on, GCE is focusing on two types of data that could actually save agencies money if properly analyzed: financial management data and procurement data. GCE creates a central data repository through which agencies can analyze their own data against other agencies’ data to determine how and where to best spend their money. Thus far, Muslimani says, procurement data (e.g., who’s doing business with which vendors, the size of the contracts, etc.) has been more popular, likely because there are still walls to break down around centralizing financial data, especially with a third party.
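To make that kind of workload concrete, here is a minimal sketch of the sort of aggregation a shared procurement-analytics service might run on Hadoop: a MapReduce job that totals contract dollars per vendor across agencies. The CSV input format (agency,vendor,amount) and all class names here are illustrative assumptions, not GCE’s actual implementation.

// Illustrative sketch only: the input schema and class names are assumed,
// not taken from GCE's actual system.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class VendorSpend {

  // Map: for each procurement record, emit (vendor, contract amount).
  public static class SpendMapper
      extends Mapper<LongWritable, Text, Text, DoubleWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      String[] fields = value.toString().split(",");
      if (fields.length != 3) {
        return; // skip malformed rows; expected: agency,vendor,amount
      }
      try {
        context.write(new Text(fields[1]),
                      new DoubleWritable(Double.parseDouble(fields[2])));
      } catch (NumberFormatException e) {
        // skip rows with a non-numeric amount
      }
    }
  }

  // Reduce: sum contract dollars per vendor across all agencies' records.
  public static class SpendReducer
      extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
    @Override
    protected void reduce(Text vendor, Iterable<DoubleWritable> amounts,
        Context context) throws IOException, InterruptedException {
      double total = 0;
      for (DoubleWritable amount : amounts) {
        total += amount.get();
      }
      context.write(vendor, new DoubleWritable(total));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "vendor spend");
    job.setJarByClass(VendorSpend.class);
    job.setMapperClass(SpendMapper.class);
    // Summing is associative, so the reducer doubles as a combiner.
    job.setCombinerClass(SpendReducer.class);
    job.setReducerClass(SpendReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(DoubleWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

A job like this compiles against the standard Hadoop MapReduce API and would be submitted with the usual hadoop jar command; the point is simply that the analysis GCE describes maps naturally onto Hadoop’s programming model.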

Just a couple of years into the GCE Cloud business, Muslimani acknowledges it’s still just a “small fraction” of GCE’s total revenue, but he thinks it will grow faster than the rest of the company’s business. “The budget crunch is forcing everyone’s hand,” he explained, and that is fueling the appetite for a shared big data offering.

If GCE is alone in trying to sell federal agencies on a cloud-based approach to big data, it probably won’t be for long. The government already has a cloud-first mandate in place for provisioning resources for new applications, and big data is gaining traction within certain agencies. Cloudera, for example, has highlighted numerous federal entities using Hadoop to tackle workloads ranging from intelligence analysis to search engines. The NSA has been building its own distributed intelligence database on Hadoop since 2009.

But shared, multitenant services like the one GCE is building will make Hadoop attainable for agencies that don’t have the IT expertise and big budgets of the intelligence community, and there should be plenty of business to be had addressing their big data needs.
