Drawn to Scale wants to make MongoDB scale like Hadoop

If you love MongoDB but are tired of trying to scale it past a handful of machines and a few hundred gigabytes, database startup Drawn to Scale says it has you covered. The company has expanded the functionality of its distributed data platform from SQL to MongoDB, meaning users of the popular NoSQL database can import their data to Spire and see high performance on hundreds of terabytes.

Drawn to Scale’s flagship product, called Spire, is a distributed data platform that’s built atop an optimized version of the Hadoop-based HBase database. HBase is what lets Spire scale out cheaply and easily. Spire’s fully distributed index is what lets it read and write data at speeds that other approaches to scaling databases (e.g., sharding) can’t match, while still handling rich queries.

To date, the company has been focused on letting users run massive SQL databases, but it has finally completed a lengthy process of rewriting parts of MongoDB to work with Spire, Founder and CEO Bradford Stephens (who’ll be participating in our Structure: Data event this week in New York) told me. The company had been keeping the work under tight wraps “because we didn’t know how long it was going to take to build,” he added.

“Our big vision for the market is providing people with a universal data platform,” Stephens said. After SQL, which accounts for the vast majority of databases in existence, MongoDB is a logical next step (although Spire also supports queries using Hadoop MapReduce). It’s the most widely used NoSQL database by a long shot, but although many users love its functionality and tooling, the database is notoriously poor at scaling to meet the demands of big data or high performance.

“You just sort of top out once you max out the memory,” Stephens explained, adding that MongoDB often starts getting inefficient as it’s forced to scale across five or 10 servers. “[T]hat’s where we start getting efficient.”

Now, without changing a single line of code, he claims, MongoDB users can import their data onto Spire and start handling 200-plus terabytes with ease. Of course, he noted, this doesn’t mean MongoDB users will abandon the database entirely. They might keep it for running applications that don’t require it to scale beyond a single server, and then use Spire to store big data for analytical purposes.

Initially, Spire will just support data importation and the basic CRUD (create, read, update, delete) functions of MongoDB, Stephens said. Later this year, assuming users want it, Drawn to Scale will implement MongoDB’s native MapReduce functionality as well as its management features.

As data volumes and data stores continue to proliferate, though, Drawn to Scale isn’t the only startup trying to provide a one-stop shop experience. At least for analytics, Citus Data is building a Postgres-based database capable of analyzing SQL, Hadoop and MongoDB data, although each data store remains external. And there’s a whole group of companies merging SQL and Hadoop for analytic workloads that might be wise to consider supporting operational data stores such as MongoDB, as well.



GigaOM