The state of Hadoop: Strong and poised to explode

Now six years old, the Apache Hadoop platform for storing and processing huge amounts of data — perhaps the catalyst of the current big data movement — appears ready for its closeup. According to the companies leading the Hadoop charge, they’re already beating away customers with a stick. Continual improvements to make Hadoop consumable by mainstream business users and applications are only going to make things better.

As with any new technology, the big question surrounding Hadoop as a viable market is whether enterprises will adopt it. The answer seems to be a resounding “Yes.” Already, Hortonworks CEO Rob Bearden told me, “We are seeing Hadoop in almost every Fortune 500 in either a proof of concept or a pilot.” Bearden doesn’t mean that his company has accounts with everyone in the Fortune 500, though, just that the majority of those companies are looking into Hadoop.

Cloudera, the first company to commercialize Hadoop (all the way back in 2008), certainly has a lot of those premier accounts. Cloudera CEO Mike Olson says interest in his company’s software and services is “absolutely skyrocketing” and it has many more deals in the pipeline than it ever has before. That’s on top of the big deployments (such as those at Nokia, Samsung and Chevron) his company already has in place.

Even MapR — a Hadoop startup that hasn’t been in the public eye as long as Cloudera and doesn’t have Yahoo roots to tout like Hortonworks does — claims to be killing it. It has flagship customers such as comScore and Boeing, as well as an OEM deal with Hadoop frenemy EMC Greenplum that MapR VP of Marketing Jack Norris told me is driving a lot of deals. EMC resells MapR’s M5 Hadoop distribution under the EMC Greenplum Hadoop MR Edition moniker.

That MapR is able to sell licenses for Hadoop — something most of its competitors give away (even MapR has a free version called M3) — says a lot about demand for Hadoop. “My guess is that we’ll be the license revenue leader within the next quarter,” Norris said. “We have higher M5 licensing and use than you would expect.”

But is Hadoop a bubble?

However, despite all the enterprise interest in Hadoop, some critics worry that it’s an overhyped technology that is bound to disappoint companies that put too much stock in it.

It’s easy to see where someone would get that idea considering how “Hadoop” rolls off the tongue as soon as a discussion turns to big data or analytics. And then there’s the money: Hadoop distribution vendors such as Cloudera, Hortonworks and MapR raise venture capital in increments of $10 million, and it seems as if every startup claiming some connection to Hadoop is able to raise at least a few million.

It all seems too good to be true. We’ve seen this story play out before with technologies that never really caught on (such as virtual desktops) and industries that collapsed despite the promise of a technological savior (think Java and the dot-com era). With many of those Hadoop installations still in the pilot phase, there’s still time for the companies testing it out to back away if it doesn’t pan out.

Nope. Here’s why.

But that skepticism appears misplaced when it comes to Hadoop, which has everything going in its favor right now. At the foundational level, where even Hortonworks’ Bearden acknowledges Hadoop “is not [yet] 100 percent intuitive,” the story is getting better. As it gets easier to deploy and manage, IT departments tasked with running Hadoop clusters are going to put up less of a fight.


And it is getting easier. Reference architectures? Check. Cluster management software? Check. Preconfigured software-hardware stacks? Check. “We have to evolve Hadoop to become an enterprise data platform,” Bearden said, and all these things — along with buy-in from the world’s largest IT companies — will help make that happen.

Oh, and now VMware wants to make Hadoop run on virtual machines to help make it more resource-efficient and dynamic. For startups, Hadoop is available in myriad formats as a cloud service, which means companies with small IT teams or budgets don’t need to own or manage a cluster at all.

Actually running analytics jobs is also getting a lot easier, especially for companies that want to extend their current practices into bigger, badder datasets. Basic analytic functions are becoming child’s play thanks to Hadoop-focused startups such as Karmasphere, Datameer and Platfora. Nearly every analytic database and business intelligence product on the planet also now connects with Hadoop. So does Microsoft Excel, an integration Cloudera’s Olson said “is going to make the biggest change in Hadoop [adoption], generally.”

But it’s really about apps

Probably the most exciting sign of Hadoop’s prospects, though, is the number of entirely new applications it’s enabling for companies creative enough to spot the opportunities. I spent two days at Hadoop Summit this week, and while talks by Twitter and Facebook stole the show, I thought some of the most interesting talks (in theory, if not in practice) were about using Hadoop to do things like improve online education or search online forums that often house the only available information on super-niche topics.

We’ve covered many more on GigaOM through the years, ranging from better targeted advertising to better customer service to more-intelligent health care. And then there are tools such as Spire, a high-performance SQL database from startup Drawn to Scale that’s based on HBase, an open source database built atop the Hadoop Distributed File System.

Olson thinks the availability of applications, especially those built by software vendors and targeting specific uses within specific industries, will spur a flood of Hadoop adoption. For example, he said, an application for financial risk analysis “will be very easy to sell into hedge funds.” And as Olson told entrepreneurs considering building Hadoop-based apps at our Structure: Data conference in March, “Call me, I’ll connect you with funding. The money is out there.”

I have no reason to doubt it is. Hadoop Summit brought together about 2,200 people of all stripes who are working with this six-year-old technology in some manner. Earlier in June, Cloudera put on a product launch party in San Francisco that rivaled any IT event I’ve ever seen in terms of sheer swankiness. If Hadoop isn’t poised to become a multi-billion-dollar market very soon, it’s putting on one heck of a facade.

Feature image courtesy of Shutterstock user Carlos Caetano; Mike Olson photo by Pinar Ozger.



