Security: A Big Data Problem?
Last week, I had the opportunity to spend a few days with a truly interesting and diverse group of practitioners, who I think are on to the future of information security.
It will come as no surprise to anyone who follows this blog that I believe that future is centered on the ability to do more and do it better with deeper, more accurate and more timely insight into a wider variety of data.
Many of our legacy approaches to security simply are not working – or at least, are certainly not working the way we expect them to. Adversaries recognize the static and siloed nature of defense in many environments. They also recognize the limitations of awareness in most organizations. Most of us simply do not have good enough radar in enough dimensions to truly understand the reality of our situation.
That group, whom I had the privilege to meet at a workshop in New York, is actively pursuing solutions to these problems – and in every case, all roads lead to greater awareness:
- In defense: Conversations with workshop attendees confirm my expectation that countermeasures of the future will revolve around approaches that are more flexible and adaptive in responding to threat intelligence. Because these approaches will be driven more by data than by technology designed for a certain class of attack, they may well challenge the concept of market segments and silos that have become so well established in security. After all, if most defenses are reduced to detection, blocking, or assuring some level of control, why not simplify and deploy a few key measures that can be generalized to deal with a wider range of issues, provided these tactics can be tuned on demand, in response to insight?
- Part of the answer is that we have not had the ability to handle large volumes of threat and activity data responsively enough to deliver the needed level of tuning. Two things in particular could change that, perhaps dramatically: the ability to do a better job with large and diverse data sets, and the ability to harness scalable and elastic compute capability – both in terms of optimized on-premises platforms and as a hosted model (or “in the Cloud” if you prefer) – to convert insight into response more effectively. On-premises approaches are still in their early evolution – but providers of hosted security technologies are well positioned to capitalize on such trends.
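To make the idea of a generalized, data-driven countermeasure a bit more concrete, here is a minimal sketch in Python. Everything in it is hypothetical – the feed format, the risk scores, and the thresholds are invented for illustration – but it shows the shape of a single generic control whose behavior is tuned by intelligence data rather than hard-coded for one class of attack:

```python
# A toy "tunable" control: one generic decision function whose behavior
# is driven by threat-intelligence data, not by attack-specific logic.
# The feed contents and score thresholds below are hypothetical.

threat_feed = {"203.0.113.9": 0.95, "198.51.100.7": 0.40}  # ip -> risk score

def make_policy(block_threshold):
    """Return a decision function tuned to the current risk appetite."""
    def decide(ip):
        score = threat_feed.get(ip, 0.0)  # unknown IPs score 0.0
        return "block" if score >= block_threshold else "allow"
    return decide

# The same mechanism, tuned on demand in response to insight:
strict = make_policy(0.30)   # tightened after fresh intelligence
relaxed = make_policy(0.90)  # loosened during normal operations

print(strict("198.51.100.7"), relaxed("198.51.100.7"))
```

The point of the sketch is that "tuning" is just a data update: swap in a new feed or a new threshold and the same countermeasure handles a different range of issues.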
- In strategy, intelligence and security management: Many organizations recognize that wider insight would give them a better handle on the reality of their security posture, and the true nature of threats facing their organization. Many also realize that their current approaches simply don’t cut it. In many cases, their tools for data collection and management are too rigid or siloed, too difficult to deploy and use, or simply do not manage a wide variety of data well enough. Some have pursued efforts such as security data warehousing to try to seize the opportunity – but until the rise of today’s techniques for “Big Data” management, many approaches were limited in how flexible they could be, or in how much data they could effectively handle.
“Pfft,” you say. “Security just isn’t a Big Data problem. We’re not talking about a Google or a Facebook or an eBay here. Hardly any security operation that doesn’t sport a TLA could consider itself a ‘Big Data’ shop – and maybe not even them.”
Well let me ask you something about your current approach:
- Are you overwhelmed with the volume of data you’re already collecting?
- Are you unable to integrate, correlate and synthesize the variety of data sources you really need in order to address your requirements, or to truly understand the reality of your situation – including both structured and unstructured data, from both internal and external sources?
- Are your reporting or analysis tools unable to keep up with information production demands? What about data ingestion or ETL (extract/transform/load) functions? Do you currently struggle with these issues of data velocity?
These, as EMA’s Shawn Rogers and John Myers advise, are the “three V’s” of Big Data. Size matters, but it’s only part of the equation. If you’re wrestling with any of these issues – or any combination thereof – then yes, Virginia, your data problem is big enough.
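The variety problem in particular is easy to illustrate. The sketch below is a deliberately naive Python example – the record formats, field names, and parsers are all invented – showing why synthesizing even two security data sources takes real work: a structured source can be split on commas, while an unstructured one has to be mined with regular expressions before anything can be correlated:

```python
import csv
import io
import re

# Hypothetical sample records: one structured (firewall CSV) and one
# unstructured (free-text syslog line). Both formats are illustrative only.
FIREWALL_CSV = "2013-01-15T10:02:11Z,10.0.0.5,203.0.113.9,DENY"
SYSLOG_LINE = "Jan 15 10:02:12 host sshd[211]: Failed password for root from 203.0.113.9"

def parse_firewall(line):
    """Structured source: fields arrive in a known, fixed order."""
    ts, src, dst, action = next(csv.reader(io.StringIO(line)))
    return {"source": "firewall", "ip": dst, "event": action.lower()}

def parse_syslog(line):
    """Unstructured source: the IP must be fished out with a regex."""
    m = re.search(r"from (\d+\.\d+\.\d+\.\d+)", line)
    return {"source": "syslog", "ip": m.group(1) if m else None,
            "event": "auth_failure" if "Failed password" in line else "other"}

def correlate(records):
    """Group normalized events by IP – the simplest form of synthesis."""
    by_ip = {}
    for rec in records:
        by_ip.setdefault(rec["ip"], []).append(rec["event"])
    return by_ip

events = [parse_firewall(FIREWALL_CSV), parse_syslog(SYSLOG_LINE)]
print(correlate(events))
```

Every new source type means another parser and another normalization decision – and that is before volume and velocity enter the picture. That per-source friction is exactly what rigid, siloed collection tools leave you to handle by hand.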
A little color, however, before we get too hasty: As John noted in our discussion of these issues, there is a “floor” to consider before calling your problem a Big Data issue. If your approach currently depends on substandard infrastructure or software, you shouldn’t automatically conclude that your problem is “Big Data.” The limits of a bare-bones SQL database running on a dated x86 platform should not be confused with the limits of an architecture built on compute platforms and DBMS systems designed for high performance.
It is also not uncommon for the three V’s to compound one another: the same factors that choke volume and velocity may make it impossible to deal with greater data variety.
But if you have considered these aspects and remain frustrated at the inability to gather, synthesize, or generally do as much with as many types of data as you’d like – regardless of whether the goal is to improve the speed or accuracy of your situational awareness, or to make defense more realistic and effective – then I have news for you:
Infosec may indeed be a Big Data problem. And you may be suffering from it, whether you acknowledge it or not.
The three V’s all contribute to complicating the data management challenge – and, as Shawn observes, how vendors as well as enterprises deal with them will define the next evolution of data-driven intelligence and analytics. Given the diverse nature of data we handle, and our need for a variety of insights into that data – from proactive forensic analysis to event alerting, fraud detection, metrics for program management, and reporting – I expect that security will be no exception.
In an upcoming post, I’ll offer some specifics of how the tools of Big Data can serve security, too.