Ford’s data chief has joined the many top executives bemoaning the lack of simple tools for solving big data problems, namely that running Hadoop clusters or performing analytics is still a job that requires a specialist. If we want the big data revolution to scale, we need to make it as easy as Netscape made surfing the web. Here are 7 startups making that happen.
Ford’s data chief, John Ginder, said in an interview with ZDNet:
“That’s a great endpoint I’d love us to move toward,” said Ginder, “but there aren’t enough of us and there aren’t enough of those tools out there to enable us to do that yet. We have our own specialists who are working with the tools and developing some of our own in some cases and applying them to specific problems. But, there is this future state where we’d like to be where all that data would be exposed. [And] where data specialists — but not computer scientists — could go in and interrogate it and look for correlations that might not have been able to look at before. That’s a beautiful future state, but we’re not there yet.”
Datahero: This startup is all about visualization, namely making it easy to take data and turn it into pretty pictures that can then generate new understanding or convince someone to take action. Users bring their data files and Datahero does the rest.
Prior Knowledge: Relative newcomer Prior Knowledge is the brainchild of MIT grads who wanted to let non-data scientists play around with data. The company offers a service that lets people upload their data and hook into PK’s database API. The service then assesses the information for correlations and helps app developers build predictive models. It has raised $1.4 million in funding from Founders Fund and angels.
Platfora: Hadoop is everyone’s favorite big data batch processing platform, but it’s not easy enough for everyone to use. Like others, Platfora wants to make Hadoop so easy even I could use it, through an intuitive user interface that has advanced data science functions built in, rather than making users write queries. It has raised $5.7 million and its product will be out next year.
ClearStory: Big names back this startup, which is also a service as opposed to software. Google Ventures, Andreessen Horowitz, and Khosla Ventures have funded ClearStory, which aims to help funnel data from a variety of sources (including Hadoop!) into one place, where employees can then use a GUI to interact with and visualize that data.
Karmasphere: The Karmasphere product is designed to ease the process of developing Hadoop workloads and applications, even from the desktop. It lets users write SQL-like queries while also connecting their favorite BI tools and analytics software to it to perform analysis.
Datameer: Like others on this list, Datameer is out to make Hadoop more relatable to non-nerds. In this case it does this by creating a familiar spreadsheet overlay so businesspeople can analyze their Hadoop jobs, and it then lets people create visualizations and draw correlations. It’s closest to Karmasphere, but its latest feature, which allows someone to run it on a single machine, is a differentiator.
BigML: Much like Prior Knowledge, BigML is a startup that combines data with machine learning to give normal people the smarts to answer questions with their data. It hopes to let people do machine learning in four easy steps: set up a data source; create a dataset; create a model; and generate predictions. It’s in private beta right now.
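For readers who want to see what a four-step flow like that looks like in practice, here is a minimal Python sketch. It does not use BigML's API or any other vendor's product; the function names, the tiny churn dataset, and the one-split "decision stump" model are all made up for illustration, and a real service would do far more.

```python
import csv
import io

# Step 1: set up a data source. This raw CSV text is invented sample data
# standing in for a file a user might upload.
RAW_CSV = """monthly_visits,churned
2,yes
3,yes
5,yes
12,no
15,no
20,no
"""


def create_dataset(raw_text):
    """Step 2: parse the raw source into typed (value, label) rows."""
    reader = csv.DictReader(io.StringIO(raw_text))
    return [(float(row["monthly_visits"]), row["churned"]) for row in reader]


def create_model(dataset):
    """Step 3: fit a one-split decision stump.

    Tries every midpoint between adjacent distinct values and keeps the
    split with the fewest misclassified rows. Assumes two labels, "yes"
    and "no", and at least two distinct input values.
    """
    values = sorted({x for x, _ in dataset})
    best = None
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        for below, above in (("yes", "no"), ("no", "yes")):
            errors = sum(
                1
                for x, label in dataset
                if (below if x < threshold else above) != label
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, below, above)
    _, threshold, below, above = best
    return {"threshold": threshold, "below": below, "above": above}


def predict(model, monthly_visits):
    """Step 4: generate a prediction for a new input."""
    if monthly_visits < model["threshold"]:
        return model["below"]
    return model["above"]


dataset = create_dataset(RAW_CSV)
model = create_model(dataset)
print(model)
print(predict(model, 4), predict(model, 18))
```

The point of the sketch is the shape of the workflow: each step takes the previous step's output, so a non-specialist only has to supply the file and ask the question.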