How Hadoop can help keep your money in the bank

Banks and their customers are under constant threat from cybercriminals trying to get at their money, but at least one bank is using Hadoop to help ensure those attacks aren’t successful. For years, the security team at Zions Bancorporation has been using big data techniques to keep track of what’s happening on its network, but Hadoop has changed the face of its analytics efforts.

I spoke recently with Zions Chief Security Officer Preston Wood, who explained how his team came to Hadoop and how it’s using it to keep all of the company’s stakeholders secure.

A long journey to Hadoop

Wood has been with Zions for 14 years, and he said the security team has had a big data mindset the whole time. It started with security information management, or SIM, tools that aggregated log files and let users sort through and analyze them. However, those tools were based on relational database technology that soon reached its limits on performance and scale.

Around 2005, Wood said, his team made the move to a massively parallel processing system that was designed for log management but that his team bent and hammered into a data warehouse for analytics. “We adopted a business intelligence mindset,” he said, “but slanted toward security.” He brought in some data-analysis specialists, and they started mining data and searching for patterns, a process made easier by the new higher-powered and more scalable system. But it, too, reached its limits as unstructured data from myriad new sources began streaming in.

Unlimited scale means unlimited possibilities

In 2010, Zions started its transition to Hadoop, and it has been running its big data workloads exclusively on that platform since late 2011. Wood said he’s loading about 130 data sources into Hadoop, including server logs, web logs and customer transactions. Now, he explained, his team can analyze massive amounts of data quickly to detect everything from malware and spear phishing attempts to account takeovers. Spotting an account takeover is similar to spotting credit card theft, only instead of looking for anomalous spending, Zions looks for anomalous transfers from customers’ bank accounts.
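To make the idea of an "anomalous transfer" concrete, here is a minimal sketch in Python. It is not Zions' method, which Wood did not describe in detail; it simply assumes that each account has a history of past transfer amounts and flags a new transfer when it sits far outside that history. The function name, the data layout and the threshold of three standard deviations are all illustrative assumptions.

```python
from statistics import mean, stdev


def flag_anomalous_transfers(history, new_transfers, threshold=3.0):
    """Return new transfers whose amount is far outside the account's history.

    history: dict mapping an account id to a list of past transfer amounts
    new_transfers: list of (account id, amount) tuples
    threshold: how many standard deviations counts as anomalous (assumed value)
    """
    flagged = []
    for account, amount in new_transfers:
        past = history.get(account, [])
        if len(past) < 2:
            # Too little history to judge; a real system would handle this case.
            continue
        mu = mean(past)
        sigma = stdev(past)
        if sigma == 0:
            # Every past transfer was identical, so any different amount stands out.
            if amount != mu:
                flagged.append((account, amount))
            continue
        if abs(amount - mu) / sigma > threshold:
            flagged.append((account, amount))
    return flagged


# Hypothetical data for illustration only.
history = {
    "acct-1": [100, 120, 90, 110, 105],
    "acct-2": [5000, 5200, 4800],
}
new_transfers = [
    ("acct-1", 115),
    ("acct-1", 9500),
    ("acct-2", 5100),
    ("acct-3", 700),
]
flagged = flag_anomalous_transfers(history, new_transfers)
```

In this made-up data, only the 9,500 transfer from acct-1 is flagged. A production system would presumably weigh many more signals than the amount alone, such as the destination account, the time of day and the device used.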

And Wood doesn’t worry about outgrowing his Hadoop cluster, which means his team can keep innovating on new ways to detect criminal behavior. If you’re monitoring network traffic, for example, Wood said, you “have to get down to 0s and 1s in packets to look for the needle in the haystack.” That means storing and analyzing everything in its raw form.

Previously, he said, his team had to massage data and perform a fairly formal ETL process before putting information into its data warehouse because the system required structured data. Cost and performance constraints also put a hard ceiling on the size of the data store. Hadoop, however, is designed to scale cheaply (and into the thousands of nodes) using commodity servers, meaning that Wood doesn’t have to pick and choose what he keeps around.
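The difference between the two approaches can be sketched in a few lines of Python. This is an illustration under assumptions, not Zions' code: the log format, the sample lines and the function name are hypothetical. The point is that the raw lines are stored untouched and a structure is applied only when a question is asked, so a line that an up-front ETL step might have discarded is still on disk if a later question needs it.

```python
import re
from collections import Counter

# Hypothetical raw log lines, kept exactly as they arrived.
RAW_LINES = [
    "2012-03-01T10:00:01 web01 GET /login 200 203.0.113.7",
    "2012-03-01T10:00:02 web01 POST /login 401 198.51.100.9",
    "2012-03-01T10:00:03 web02 POST /login 401 198.51.100.9",
    "malformed line that an ETL step might have dropped",
    "2012-03-01T10:00:04 web01 POST /login 401 198.51.100.9",
]

# The structure is imposed at read time, by the question being asked.
LINE_PATTERN = re.compile(
    r"^(?P<ts>\S+) (?P<host>\S+) (?P<method>[A-Z]+) "
    r"(?P<path>\S+) (?P<status>\d{3}) (?P<ip>\S+)$"
)


def failed_logins_by_ip(lines):
    """Count failed login attempts (HTTP 401 on /login) per client IP."""
    counts = Counter()
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match is None:
            # This query cannot use the line, but the raw data is untouched.
            continue
        if match["path"] == "/login" and match["status"] == "401":
            counts[match["ip"]] += 1
    return counts


failed = failed_logins_by_ip(RAW_LINES)
```

Here the address 198.51.100.9 is counted three times. On a real cluster the same parse-at-read-time logic would run in parallel across many nodes rather than over a list in memory.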

Zions Bancorporation’s primary partner in its Hadoop effort is Zettaset, the Hadoop-management startup whose Security Data Warehouse product targets the security use case specifically. Wood said his team also uses Datameer, the spreadsheet interface for Hadoop, to perform certain analytic tasks in conjunction with its own home-grown tools.

ipTrust’s architecture for rating the safety of IP addresses

As I have explained before, Hadoop is becoming a popular tool for security organizations, although primarily among vendors building services that track down malware. Wood said his sense is that “a lot of security organizations [within the financial services space] have just been slow to adopt a data-centric mindset,” but he thinks they’re coming around. If companies want to monitor the whole gamut of potential risks, he said, they have to look inward at their own data and not just rely on vendors.

Feature image courtesy of Flickr user quinn.anya.


GigaOM