A big data top 20 for 2012

It has been a watershed year in the world of data. Whether it’s technically “big” or not, the technologies, uses and understanding of data analysis have grown a lot. I’ve been lucky enough to spend the year covering the space and writing a lot about it, and I’ve come across some interesting people, companies and trends in doing so.

Here are the 10 most-popular posts I’ve written this year on the topic, as well as 10 of my personal favorites. Maybe they’ll jog your memory about some ideas you’ve forgotten, or perhaps you’re reading them for the first time. Either way, enjoy this look back at 2012 in data.

The most-popular posts of the year

  • 5 low-profile startups that could change the face of big data (January 28): These companies aren’t so low-profile anymore (or, in all cases, using the same name), but they’re just as cool.
  • What it really means when someone says Hadoop (February 6): I should revisit this regularly. The Hadoop ecosystem just gets more confusing.
  • Exclusive: The brains behind Hive launch on-demand Hadoop service (June 6): This might be the first of many startups from Facebook’s big data experts.
  • How Facebook keeps 100 petabytes of data online (June 13): Speaking of Facebook, it’s storing a lot of data. Here’s how it migrated it to a new data center without downtime.
  • How India’s favorite TV show uses data to change the world (August 11): The show, Satyamev Jayate, and its host, Aamir Khan, are tackling difficult social issues and generating incredible audience response.
  • How Disney built a big data platform on a startup budget (September 16): Or at least a startup mindset. It used lots of open source — databases, Hadoop and then some — to serve an entire company.
  • Forget your fancy data science, try overkill analytics (September 21): The winner of our inaugural Kaggle challenge used simple models, supplemented with lots of cloud resources, to predict what readers want.
  • Why becoming a data scientist might be easier than you think (October 14): Coursera and its online education brethren are teaching the foundational skills needed for some new-age data analysis.
  • 5 trends that are changing how we do big data (November 3): Machine learning, Hadoop applications and artificial intelligence are among the things advancing how we consume big data technologies.
  • A programmer’s guide to big data: 12 tools to know (December 18): Just what it sounds like. This is a collection of tools for easily analyzing application data or building data-centric applications.

The best of the rest

  • Supreme Court sidesteps digital privacy … for now (January 24): United States v. Jones was an important decision for citizens’ Fourth Amendment rights, and might have set the stage for even broader protections.
  • Forget the EU: How to really empower users on privacy (January 26): Google, Facebook and others had some big privacy snafus this year. Here are some ideas for overcoming the tension between free services and privacy.
  • Under the covers of eBay’s big data operation (January 31): A look at the technologies eBay uses to do its big data work, ranging from Teradata to HBase and everything in between.
  • Satellite imagery and Hadoop mean $70M for Skybox (April 17): This is one of the coolest startups around, at least in terms of shooting for the stars (literally).
  • Hey, Los Angeles, Xerox thinks it can clear traffic on I-10 (July 20): It’s not easy to predict traffic patterns or parking availability, but Xerox and its partners are trying in one of the toughest places around.
  • Why data should be our guiding light on public policy (July 27): Researchers are building computer models and algorithms to help analyze all sorts of difficult problems, yet they remain ignored.
  • 5 ideas to help everyone make the most of big data (September 17): This is a handful of bright ideas from bright people on how to think about data analysis and exploration in an age of big data.
  • A startup asks, ‘What if you didn’t have to analyze data at all?’ (November 20): Although it claims not to use machine learning, BeyondCore shows what’s possible when software finds the connections before users start exploring.
  • Data isn’t just the new oil, it’s the new money. Ask Zoë Keating (November 20): The idea that data can be as valuable as money is gaining acceptance with everyone from major retailers to independent musicians.
  • How Obama’s data scientists built a volunteer army on Facebook (December 8): We covered a lot about data science and politics in 2012, but Obama for America’s ability to predict which voters to reach, and how, helped deliver the ultimate prize.

Feature image courtesy of Shutterstock user tuulijumala.


GigaOM