Conventional wisdom has it that a surefire way of getting your credit card black-flagged is to fill up two or more vehicles with gas and purchase expensive sneakers within a short space of time. That's because this type of spending behavior is rare, and it's also indicative of a card thief who treats himself to new footwear and then fills up his own and his friends' tanks before the card is cancelled.

This type of anomalous behavior gets spotted thanks to Big Data analysis used extensively in large enterprises such as banks to detect card fraud.

But Big Data analysis has many uses beyond fraud detection, and one of the uses that is filtering down from government circles into the enterprise is to detect anomalous network behavior that is indicative of a security breach.


It's difficult to know exactly what's being done in the most secure government installations because that type of information is not made readily available, but Chris Donaghey, vice president of corporate development at KEYW Corporation, a security company that does business in government circles, hints that what the government has goes far beyond what is available for most large enterprises. "The reason their systems are better than what's available in the commercial world is that they have very big budgets," he explains.

Analyzing for Anomalies

But in the near future KEYW plans to sell large enterprises a security system that uses Big Data technology similar to that used by the government, and other companies are bound to follow suit. "What we are planning to do is cherry pick the best concepts to create a commercial product," Donaghey says. In other words, a product that is not quite as good -- but with a price more suited to a commercial organization than the "best security money can buy" approach that some government agencies feel they have no choice but to adopt.

The larger an enterprise, the more useful this Big Data security technology is likely to be, according to Donaghey. "The bigger you are, the more acquisitions you have probably made, so you'll have heterogeneous infrastructure connecting disparate networks within your enterprise," he explains. "The only way to get the big picture about what is going on on your networks is to look at very large amounts of data."

The total volume of data stored -- mostly logs from firewalls and other network devices -- could be very large indeed. That's because while some types of anomalous behavior can be detected by analyzing a couple of weeks' worth of data, other types can only be detected once six months' worth is available, says Donaghey. "It really comes down to how much storage you want to pay for. We would encourage you to keep as much log data as you can afford."
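To make the link between retention and detection concrete, here is a minimal sketch of baseline-driven anomaly detection over log data. It is an illustration only, not KEYW's method: the function name `find_anomalies`, the z-score test, the threshold and the sample counts are all assumptions chosen for the example.

```python
from statistics import mean, stdev

def find_anomalies(daily_counts, baseline_days=14, threshold=3.0):
    """Return indexes of days whose count deviates sharply from the
    preceding baseline window (a simple z-score test)."""
    anomalies = []
    for i in range(baseline_days, len(daily_counts)):
        window = daily_counts[i - baseline_days:i]
        mu = mean(window)
        sigma = stdev(window)
        if sigma == 0:
            # Perfectly flat baseline: any change at all is a deviation.
            if daily_counts[i] != mu:
                anomalies.append(i)
            continue
        if abs(daily_counts[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# Hypothetical data: denied connections per day taken from a firewall log.
counts = [100, 104, 98, 101, 97, 103, 99, 102,
          100, 96, 105, 101, 99, 100, 480, 102]

print(find_anomalies(counts))                     # two-week baseline
print(find_anomalies(counts, baseline_days=180))  # six-month baseline
```

With only 16 days of data on hand, the second call has no baseline to compare against and so can flag nothing, which is the practical reason for keeping as much log data as storage budgets allow.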

The purpose of amassing this Big Data is anomaly detection, just as banks do to detect credit card fraud. A number of vendors already analyze Big Data for security purposes -- some from a Big Data analysis background, and others from a log management background. Splunk is one of them.

Once anomalous behavior on the network is detected, the next stage is to establish whether there really is a threat, or whether the alert has been thrown up because of an unusual but harmless event -- perhaps a user just doing something odd. This is where KEYW's system promises to differ from those of Splunk and others.

KEYW's system will handle this with what Donaghey calls an automated countermeasure system, which launches a forensic scan of the event to find out more. "When someone gets on to a network to exploit it, they leave a trail of digital exhaust. If you know how to trace it you can see how the intruder got in, where they went, and what -- if anything -- was exfiltrated."

Once the countermeasure manager has gathered the data it needs to determine what sort of security breach, if any, has occurred, it will launch appropriate software to remove the threat if it can. Donaghey claims the system will remediate 80 percent of detected threats automatically, with the rest sent to humans to handle. New threats will be reported back to KEYW so that other customers' systems can handle them better if they encounter them, he adds.
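The split between automatic remediation and human handling can be pictured as a simple dispatch step. The sketch below is hypothetical throughout: `triage`, `quarantine_host`, the playbook table and the alert fields are invented for illustration and say nothing about how KEYW's countermeasure manager is actually built.

```python
def triage(alerts, playbooks):
    """Split alerts into those remediated automatically and those
    escalated to human analysts because no playbook matches."""
    remediated, escalated = [], []
    for alert in alerts:
        action = playbooks.get(alert["threat"])
        if action is None:
            escalated.append(alert)   # unknown threat: a person decides
        else:
            action(alert)             # known threat: run its countermeasure
            remediated.append(alert)
    return remediated, escalated

def quarantine_host(alert):
    # Stand-in for real remediation software; it only records the outcome.
    alert["status"] = "host quarantined"

# Hypothetical mapping from classified threat type to countermeasure.
playbooks = {"known_trojan": quarantine_host}

alerts = [
    {"host": "10.0.0.5", "threat": "known_trojan"},
    {"host": "10.0.0.9", "threat": "unclassified"},
]

auto, manual = triage(alerts, playbooks)
print(len(auto), "remediated automatically;", len(manual), "sent to analysts")
```

In a design like this, reporting a new threat back to the vendor would amount to adding an entry to the playbook table, so that the next customer who meets it lands in the automatic path.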

On a Smaller Scale

Big Data analysis solutions tend to be expensive; KEYW's costs "a six figure sum" for a typical deployment, for example. But smaller companies can get some security benefit from Big Data using a solution such as Sourcefire's FireAMP, which takes data from endpoints and analyzes it in the cloud.

Smaller organizations generate far less data on their endpoints than large organizations do with all their network devices, but Zulfikar Ramzan, Sourcefire's cloud technology group chief scientist, says that by analyzing all of its customers' endpoint data together it can identify trends that might indicate malware attacks. "The endpoint is the scene of the crime; that's where a network gets compromised. We use the cloud for the heavy lifting and reporting."

Among other techniques, Sourcefire analyzes large numbers of known good and known bad files and runs machine learning algorithms over them to come up with rules that recognize malicious files, which it can then share with its customers.
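As a rough illustration of deriving shareable rules from labeled files, the sketch below learns byte sequences that recur across known bad samples and never appear in known good ones. It is a deliberately simple frequency-based stand-in for the machine learning algorithms mentioned above, not Sourcefire's method; the function names, the 4-byte n-gram features and the sample data are assumptions.

```python
from collections import Counter

def ngrams(data: bytes, n: int = 4) -> set:
    """Return the set of distinct n-byte sequences in a file's contents."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}

def learn_rules(good_files, bad_files, n=4, min_bad_fraction=1.0):
    """Keep n-grams seen in at least min_bad_fraction of the bad files
    and in none of the good files."""
    good_seen = set()
    for data in good_files:
        good_seen |= ngrams(data, n)
    bad_counts = Counter()
    for data in bad_files:
        bad_counts.update(ngrams(data, n))  # each file counts once per n-gram
    needed = min_bad_fraction * len(bad_files)
    return {g for g, c in bad_counts.items()
            if c >= needed and g not in good_seen}

def is_malicious(data: bytes, rules: set, n: int = 4) -> bool:
    """Flag a file if it contains any learned rule."""
    return not rules.isdisjoint(ngrams(data, n))

# Hypothetical training samples standing in for real file contents.
good = [b"hello world program", b"print report data"]
bad = [b"xx EVIL_PAYLOAD yy", b"zz EVIL_PAYLOAD ww"]

rules = learn_rules(good, bad)
print(is_malicious(b"new sample EVIL_PAYLOAD here", rules))
print(is_malicious(b"hello world", rules))
```

The learned rule set is small and contains no customer data, which is what makes it practical to distribute to every endpoint once it has been computed centrally.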

Big Data analytics is unlikely ever to replace existing security measures like IPS and firewalls -- not least because something has to generate the Big Data before it can be analyzed. But its value lies in the fact that it can reveal breaches that might otherwise have gone undetected. And in a world where network compromise is more a question of when than if, this can be very valuable information indeed.

Paul Rubens has been covering enterprise technology for over 20 years. In that time he has written for leading UK and international publications including The Economist, The Times, Financial Times, the BBC, Computing and ServerWatch.