Most IT security specialists readily admit that the future will almost certainly bring together AI and automated cyberattack systems. While that conjures up scenarios of Terminator-style ‘Skynet’ disasters, the reality is both more fantastical and more mundane, writes Justin Smith, CSO for Product at the cloud-based software firm Pivotal.
The notion that numerous defensive security solutions already in the marketplace use some effective form of AI is false. Many vendors use the term very loosely to mean they can automate or orchestrate detection, rules or scans, and so believe they have achieved something close to AI. They have not: this is simplistic pattern matching marketed as AI. True AI uses data to make decisions and branches out from those decisions to something that wasn’t immediately obvious or predictable.
My team recently made a discovery that illustrates how AI can be used in a novel way to solve a very real problem. The hypothesis was borrowed from an education theory called transfer of learning, in which knowledge from one context is applied to a completely different scenario. My team hypothesised that it could use Google’s TensorFlow to train a model to detect when data is confidential or private, like a password, and when it isn’t. The idea was to interpret lines of code as an image of grayscale pixels, on the premise that a mysterious contextual link exists between AI’s ability to recognise a picture of a flower and its ability to recognise a picture of a password in text.
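To make the "text as grayscale image" idea concrete, here is a minimal Python sketch. It is not the team's actual encoding, which the article does not describe. It assumes one byte per pixel, one row per line, and a hypothetical fixed row width of 64 with zero padding.

```python
from typing import List

WIDTH = 64  # hypothetical fixed row width; longer lines are truncated


def line_to_pixels(line: str, width: int = WIDTH) -> List[int]:
    """Encode one line of text as a row of grayscale pixel values (0-255)."""
    raw = line.encode("utf-8", errors="replace")[:width]
    # Each byte becomes one pixel intensity; short lines are padded with 0.
    return list(raw) + [0] * (width - len(raw))


def text_to_image(text: str, width: int = WIDTH) -> List[List[int]]:
    """Stack the encoded lines into a 2-D matrix, one row per line."""
    return [line_to_pixels(line, width) for line in text.splitlines()]


sample = "user=alice\npassword=hunter2"
image = text_to_image(sample)
```

A matrix like `image` could then be fed to an image classifier in the same way as any other grayscale picture.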
Google’s AlphaZero points to a similar link between Go and chess. Its predecessor, AlphaGo, used AI to teach a machine how to play Go, and a year after AlphaGo won against the top Go player, AlphaZero won a match against the top-rated computer chess program. What’s remarkable about the latter feat is that the AI program taught itself in just a few hours to play at grandmaster level. No one programmed the machine with the typical opening moves of chess, or with famous games of the past. It took the learning approach developed for the game of Go and extended it to another game, chess. My team presumed that a similar notion could apply to the very real problem of leaked credentials in log files and source repositories. To understand the approach, it’s best to consider a typical IT security engineer’s day.
Alerts come in constantly about all sorts of anomalies, and many of them are false positives. Screening these is a complex and tedious human task, and that is what many defensive AI-type tools are trying to help with. Many vulnerabilities arise from human error too, and these situations rarely trigger an alert. An engineer copies a sensitive file to a cloud storage bucket and sets access rights to ‘anyone’. Or a set of personal data is accidentally copied to a log file and stored as plain text on some external server. These situations aren’t always obvious, but they are very problematic and can threaten the entire organisation.
With many companies using a variety of automated tools to facilitate system patching, or to manage common infrastructure tasks such as spam handling and system configuration, why not take things to the next level with more advanced AI tools? While AI isn’t yet the answer to everything, and requires a combination of machine and human learning to be effective, it can be far more useful when applied to the right problems in the security space, particularly in defending against adversarial AI-based attacks.
In my team’s research, TensorFlow was provided with a series of matrices that represent textual data, such as a series of private encryption keys, a list of passwords, or something similar. The software was then used to work out ways to recognise when data was confidential and when it wasn’t. The model was connected to a live series of logs that contained both private and public kinds of data, and trained accordingly. The conclusion? The program could detect confidential data much better than the regular expressions used beforehand. Just as Google’s AlphaZero was able to beat the best computer chess player (and chess programmer) in less than a day, AI demonstrated the promise of being better than regular expression authors.
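For comparison, the regular-expression approach that the model was measured against might look something like the sketch below. The patterns and sample log lines are illustrative assumptions, not the rules the team actually used. The last sample line shows the weakness of hand-written rules: a secret phrased in an unexpected way goes unflagged.

```python
import re
from typing import List

# Illustrative patterns only (assumptions, not the team's real rules).
PATTERNS = [
    # PEM-style private key header
    re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    # key=value or key: value pairs with a suspicious key name
    re.compile(r"(?i)\b(?:password|passwd|secret|api_key)\s*[=:]\s*\S+"),
]


def flag_confidential(lines: List[str]) -> List[bool]:
    """Return True for each line that matches any known pattern."""
    return [any(p.search(line) for p in PATTERNS) for line in lines]


log_lines = [
    "GET /health 200",
    "password=hunter2",
    "-----BEGIN RSA PRIVATE KEY-----",
    "login ok, pw is hunter2",  # a leaked secret the rules do not anticipate
]
flags = flag_confidential(log_lines)
```

Every new phrasing needs a new rule from a human author, which is the maintenance burden a learned classifier is meant to reduce.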
Granted, transfer of learning as it relates to AI is mysterious. But as organisations learn how to apply AI in the defensive context, attackers are also trying to figure out how to weaponise it. Industry must carve out the necessary time to experiment with AI and learn.