Machine Learning: Providing a Much Needed Assist for Cyber Security

23 Sep

in Blog, Perspectives, Technology

by Nikfar Khaleeli

I thought the Rise of the Machines article was a great read. It provided some important background about artificial intelligence and machine learning, their use by companies to solve complex problems, and some emerging concerns. Having recently seen Ex Machina (a thought-provoking movie!), I feel those concerns could be valid.

A number of industries are benefiting from the application of machine learning, as illustrated in the Rise of the Machines article. For example, retailers have used past customer behavior (e.g., purchased products and movie ratings) as input for a machine learning-based recommendation engine. The results were impressive: in 2006, Amazon reported that 35 percent of its sales came from recommendations.

But what about applying machine learning to cyber security? Clearly a new approach is needed: despite consistent and significant investment in defensive technologies, security breaches continue with alarming regularity. That’s because attacks are evolving, and multi-stage attacks are becoming far more common. Could machine learning be the solution?

That’s challenging, because cyber security is a unique beast. While retail customers tend to repeat their behavior, cybercriminals constantly change theirs in order to stay ahead of defenders. Therefore, machine learning-based models that work well for retail don’t work for cyber security, because cyber attack patterns are not static. This is also why it’s naïve to think that static IDS signatures and SIEM rules can detect a sophisticated adversary.


In spite of cyber security being a challenging problem, machine learning can help – but only if both supervised and unsupervised techniques are used together. Supervised machine learning uses labeled training sets that contain both normal and anomalous (i.e., attack) samples to create predictive models. Supervised algorithms are useful when some information is known about the attack vectors. Unsupervised machine learning does not require labeled training data; it works on the assumption that most behavior is normal, with only a small percentage – the deviations – representing anomalies. Unsupervised algorithms are self-learning and can identify attacks even as the behavior of attackers changes. When combined, unsupervised and supervised methods provide security teams with the accurate detection needed to defend against advanced attacks but – and this is important – without contributing to alert white noise or requiring any rules to be created.
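To make the combination concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions, not how any particular product works: the two features (connections per minute, outbound kilobytes), all of the numbers, the function names, and the z-score threshold of 3 are hypothetical. The unsupervised half learns a baseline from unlabeled traffic and scores deviations; the supervised half is a simple nearest-centroid classifier trained on a handful of labeled samples.

```python
from math import dist
from statistics import mean, stdev

# Hypothetical per-host features: (connections per minute, outbound KB).
# Every number here is invented for illustration.
BASELINE = [(10, 200), (12, 220), (9, 180), (11, 210),
            (10, 190), (13, 230), (8, 170), (11, 200)]   # unlabeled
LABELED = [((10, 200), "normal"), ((12, 210), "normal"),
           ((90, 250), "attack"), ((110, 300), "attack")]

def fit_baseline(samples):
    """Unsupervised step: learn mean and spread of each feature."""
    columns = list(zip(*samples))
    return [(mean(col), stdev(col)) for col in columns]

def standardize(sample, baseline):
    """Express a sample as z-scores relative to the learned baseline."""
    return tuple((x - mu) / sigma for x, (mu, sigma) in zip(sample, baseline))

def fit_centroids(labeled, baseline):
    """Supervised step: average standardized sample per label."""
    groups = {}
    for sample, label in labeled:
        groups.setdefault(label, []).append(standardize(sample, baseline))
    return {label: tuple(mean(col) for col in zip(*vectors))
            for label, vectors in groups.items()}

def triage(sample, baseline, centroids, z_threshold=3.0):
    """Combine both signals into one of: alert, review, ignore."""
    z = standardize(sample, baseline)
    unusual = max(abs(v) for v in z) > z_threshold
    label = min(centroids, key=lambda name: dist(z, centroids[name]))
    if unusual and label == "attack":
        return "alert"    # deviates AND resembles a known attack
    if unusual:
        return "review"   # deviates, but matches no known pattern
    return "ignore"       # within the learned baseline

baseline = fit_baseline(BASELINE)
centroids = fit_centroids(LABELED, baseline)

for sample in [(95, 260), (10, 900), (11, 205)]:
    print(sample, triage(sample, baseline, centroids))
```

With this toy data, the scan-like sample is escalated as an alert, the unusually large transfer from an otherwise quiet host is queued for review, and ordinary traffic is ignored. The point of the design is that neither signal alone decides the outcome: the anomaly score catches behavior nobody has labeled yet, while the labeled samples keep familiar attack patterns at the top of the queue.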

The indicators of advanced attacks are buried in the massive volumes of network and security data (i.e., packets, flows, logs, files, alerts and threat feeds) that constantly inundate organizations. Identifying the relationships between weakly connected actions is necessary to surface these advanced attacks, but this is not a task that humans could do in any reasonable amount of time. Clearly, machine learning-based cyber security systems, which detect advanced attacks that get past perimeter defenses, would benefit security teams.
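As a rough illustration of what "weakly connected actions" can mean, the sketch below groups low-severity events by host and flags any host that shows several different indicators within a time window. The event names, hosts, timestamps, one-hour window and threshold of three are all hypothetical, and a real system would learn such relationships rather than hard-code them; this only shows why correlation matters.

```python
from collections import defaultdict

# Hypothetical low-severity events: (timestamp in seconds, host, indicator).
EVENTS = [
    (100, "ws-17", "phishing_link_clicked"),
    (160, "ws-04", "failed_login"),
    (400, "ws-17", "new_process_from_temp_dir"),
    (900, "ws-17", "dns_query_rare_domain"),
    (950, "ws-22", "failed_login"),
    (1300, "ws-17", "large_outbound_transfer"),
]

def correlate(events, window=3600, min_distinct=3):
    """Return hosts with at least min_distinct different indicators
    occurring within `window` seconds of one another."""
    by_host = defaultdict(list)
    for ts, host, indicator in sorted(events):
        by_host[host].append((ts, indicator))

    flagged = {}
    for host, items in by_host.items():
        # Slide a window that starts at each event for this host.
        for i, (start, _) in enumerate(items):
            kinds = {ind for ts, ind in items[i:] if ts - start <= window}
            if len(kinds) >= min_distinct:
                flagged[host] = sorted(kinds)
                break
    return flagged

for host, indicators in correlate(EVENTS).items():
    print(host, indicators)
```

Each event on its own would barely merit a glance. Taken together on a single workstation, they sketch the stages of an attack, which is exactly the kind of pattern that is impractical to find by hand at scale.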

So, if machine learning-based systems automatically detect attacks on the inside, does that mean security practitioners are out of a job? Nope! Effective security requires human intervention because there is no replacement for human knowledge and intuition, which are needed for combating advanced attacks. As internationally renowned security technologist Bruce Schneier very aptly said in his 2001 testimony on Internet Security to the United States Senate:

It's a simple matter of regaining a balance of power: human minds are the attackers, so human minds need to be the defenders as well.

Machines provide assistance by teasing out the relationships between the indicators buried in network and security data to surface potential threats. But threats are constantly evolving, and security professionals, using their intuition, will always be required to examine system recommendations in the proper context in order to determine whether accurate results are being produced.

True artificial intelligence (like in Ex Machina) is a long way off.
