Machine Learning and AI in Cybersecurity: Expert Interview

Practically every form of cyber-attack increased in 2021. Ransomware, in particular, saw a steep uptick, but the costs could one day be measured in more than just financial terms: as networked systems run more of our world’s critical infrastructure, their failure can have life-and-death consequences. At the same time, the growing adoption of networked systems means cybersecurity professionals have that much more to guard.

“Data can come from any sensor in a distributed system,” says Radha Poovendran, PhD, a professor at the University of Washington and founding director of the Network Security Lab. “The human expert is not going to be able to sit down and sort through it all. You need something at cyber-speed, and that’s a machine learning (ML) or artificial intelligence (AI) system enabled by hardware and computational algorithmic advances.”

ML and AI are not just the future: they’re here, in use, in huge numbers, all around us, behind the scenes. Your browser will have quietly and quickly scanned this website for malicious items before connecting to it. Your antivirus software is passively scanning for incoming threats. And your email is being systematically sorted into categories that range from the important and personal, to the harmless but unhelpful, to the purposefully malicious or dangerous.

It’s not a human that’s doing all that, of course: it’s network protocols that are integrated with databases using ML and AI. To many, it seems like magic.

“With the hardware, software, and algorithmic improvement, we’re able to do so much more with ML than we could in the ‘90s,” Dr. Poovendran says. “But there’s still so much we don’t know. The applications of ML and AI in cybersecurity contain many rich problems for professionals and students to address.”

Cybersecurity professionals are well aware that ML and AI are not actually magic, nor are they perfect. In the wrong hands, they can even be dangerous, as ML and AI can be deceived into making incorrect decisions and harmful predictions. When your grandmother ends up in your Gmail spam folder, or your Google search results appear in a foreign language while you are traveling abroad, the consequences of ML/AI failure may not seem particularly steep. But in the world of network security, they can be the difference between a catastrophic attack and a regular Tuesday.

To learn more about the basics of machine learning (ML) and artificial intelligence (AI) in cybersecurity and to get a sense of where the field is going next, read on.

Meet the Expert: Radha Poovendran, PhD

Dr. Radha Poovendran is a professor in the Department of Electrical & Computer Engineering at the University of Washington. He is the founding director of the Network Security Lab and is a founding member and associate director of research for the UW’s Center for Excellence in Information Assurance Research and Education. He has also been a member of the advisory boards for Information Security Education and Networking Education Outreach at UW.

In collaboration with NSF, he served as the chair and principal investigator for a Visioning Workshop on Smart and Connected Communities Research and Education in 2016.

Dr. Poovendran’s research focuses on wireless and sensor network security, adversarial modeling, privacy and anonymity in public wireless networks, and cyber-physical systems security. He co-authored a book titled Submodularity in Dynamics and Control of Networked Systems and co-edited a book titled Secure Localization and Time Synchronization in Wireless Ad Hoc and Sensor Networks.

Dr. Poovendran is a Fellow of IEEE and has received various awards, including the Distinguished Alumni Award, ECE Department, University of Maryland, College Park (2016); NSA LUCITE Rising Star (1999); NSF CAREER (2001); ARO YIP (2002); ONR YIP (2004); PECASE (2005); and Kavli Fellow of the National Academy of Sciences (2007).

Main Categories of ML/AI Intrusion Detection Systems

A modern intrusion detection system (IDS), which is tasked with detecting and preventing cyber-attacks, can either be network-based (examining data packets traveling through the network) or host-based (examining information at the software environment level). Beyond that, there are generally two classifications for an IDS: misuse-based systems and anomaly-based systems.

Misuse-based systems detect attacks based on the signature of those attacks (such as an IP address or other identifying metadata associated with malicious vectors), and these techniques are sometimes referred to as signature-based analytics. This method is effective and doesn’t create an overwhelming number of false positives, but it requires frequent manual updates of databases with rules and signatures. This method is also ineffective against zero-day exploits; misuse-based techniques are designed to stop known attacks.

“If I develop an ML algorithm only based on a database of threats, I’ll have 100 percent failure for every new threat,” Dr. Poovendran says. “You will defend against what you know, but you will have no clue what you don’t know.”
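To make the limitation concrete, here is a minimal sketch of misuse-based matching in Python. The signature database, the IP addresses (taken from reserved documentation ranges), and the function name are all hypothetical, chosen only for illustration; a production IDS uses far richer rule sets.

```python
# Hypothetical signature database: known-bad source IPs and payload fragments.
KNOWN_BAD_IPS = {"203.0.113.7", "198.51.100.23"}
KNOWN_BAD_PAYLOADS = (b"/etc/passwd", b"' OR '1'='1")


def matches_signature(src_ip: str, payload: bytes) -> bool:
    """Return True if the event matches any known signature."""
    if src_ip in KNOWN_BAD_IPS:
        return True
    return any(pattern in payload for pattern in KNOWN_BAD_PAYLOADS)


# A known attacker is caught; a never-before-seen attack passes unnoticed.
known_attack = matches_signature("203.0.113.7", b"GET /index.html")
novel_attack = matches_signature("192.0.2.10", b"never-seen-before exploit bytes")
print(known_attack, novel_attack)  # True False
```

The second call returns False not because the traffic is safe, but because nothing in the database describes it.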

The second type of IDS is the anomaly-based system. This takes a model for normal network and system behavior, and then identifies and isolates any behaviors that deviate from that model. Each application, system, and network can be customized with its own profile of normal behavior, increasing security further. This method is effective against some zero-day exploits, and data derived from attacks against anomaly-based systems can be used to define signatures for misuse-based methodologies. However, this method can result in false positives, where legitimate but new system behaviors are flagged as potentially hazardous.

“It would be a big mistake if every time something new happens, we classify it as a threat,” Dr. Poovendran says. “We want to cluster it separately, flag it for further inspection, then let the algorithm learn from the outcome.”
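The flag-then-learn idea can be sketched in a few lines of Python. This is an illustration under simple assumptions, not a description of any particular product: the profile is just the mean and standard deviation of one hypothetical metric (requests per minute), the threshold of three standard deviations is arbitrary, and the function names are made up for this example.

```python
from statistics import mean, stdev


def build_profile(baseline):
    """Summarize normal behavior as (mean, standard deviation)."""
    return mean(baseline), stdev(baseline)


def review_status(value, profile, threshold=3.0):
    """Flag values far from the profile for inspection instead of blocking them."""
    mu, sigma = profile
    if sigma == 0:
        return "normal" if value == mu else "needs_review"
    z_score = abs(value - mu) / sigma
    return "needs_review" if z_score > threshold else "normal"


def learn_from_outcome(baseline, value, analyst_says_benign):
    """Fold analyst-confirmed benign behavior back into the baseline."""
    if analyst_says_benign:
        baseline.append(value)
    return build_profile(baseline)


baseline = [98, 102, 101, 97, 103, 99, 100, 100]  # hypothetical requests per minute
profile = build_profile(baseline)
print(review_status(104, profile))  # normal
print(review_status(150, profile))  # needs_review
```

If an analyst later confirms that the flagged value was legitimate, calling learn_from_outcome widens the profile so the same behavior is less likely to raise a false positive again.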

In contemporary network design, very few organizations rely solely on misuse-based or solely on anomaly-based techniques, instead opting for a hybrid approach. Hybrid models attempt to leverage the best of both misuse-based and anomaly-based methodologies, accumulating knowledge, reducing false positives, and increasing signature intelligence.
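A hybrid detector can be sketched as a signature check followed by an anomaly check, with confirmed anomalies promoted into the signature set. The class below is a hypothetical illustration in Python: the single metric (requests per minute), the threshold, and all names are assumptions made for the example.

```python
from statistics import mean, stdev


class HybridDetector:
    """Toy hybrid IDS: signature lookup first, then a statistical anomaly check."""

    def __init__(self, bad_ips, baseline, threshold=3.0):
        self.bad_ips = set(bad_ips)      # misuse-based knowledge
        self.mu = mean(baseline)         # anomaly-based profile
        self.sigma = stdev(baseline)
        self.threshold = threshold

    def classify(self, src_ip, requests_per_minute):
        if src_ip in self.bad_ips:
            return "known_attack"
        deviation = abs(requests_per_minute - self.mu)
        if self.sigma > 0 and deviation / self.sigma > self.threshold:
            return "needs_review"
        return "normal"

    def confirm_malicious(self, src_ip):
        """Turn an analyst-confirmed anomaly into a new signature."""
        self.bad_ips.add(src_ip)


detector = HybridDetector({"203.0.113.7"}, [98, 102, 101, 97, 103, 99, 100, 100])
print(detector.classify("192.0.2.10", 400))  # needs_review
detector.confirm_malicious("192.0.2.10")
print(detector.classify("192.0.2.10", 100))  # known_attack
```

After confirmation, the same source is caught by the cheap signature lookup even when its traffic volume looks ordinary, which is one way the two methods can feed each other.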

The Role of the Human Expert in ML/AI Cybersecurity

“Our adversaries are highly intelligent, and they work very hard as well,” Dr. Poovendran says. “Given that these detection methods are there, what are the gaps that will allow them to bypass the threats? This is where the scientific thinking of the student, or of the forensics professional, or the cybersecurity professional, comes in. They have to be able to reason beyond what is in front of them.”

Machine learning algorithms in cybersecurity can generally be put into three categories: unsupervised, semi-supervised, and supervised.

In unsupervised machine learning, the algorithm itself is tasked with finding patterns within a new data set.
In semi-supervised machine learning, a human assists the algorithm by labeling some portion of the data set.
And in supervised machine learning, a human labels all of the training data, and the algorithm is tasked with learning the patterns that connect the data to those labels.
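The difference between the three categories is easiest to see on a tiny data set. The sketch below uses one hypothetical feature (failed logins per hour) and a deliberately simple nearest-centroid rule; the data, labels, and function names are assumptions for illustration, and real systems use many features and far more capable models.

```python
from statistics import mean


def fit_centroids(values, labels):
    """Supervised: average the feature value for each human-provided label."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return {label: mean(members) for label, members in groups.items()}


def predict(centroids, value):
    """Assign a value to the label whose centroid is closest."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


def pseudo_label(values, partial_labels):
    """Semi-supervised: None marks unlabeled points, filled in from the labeled few."""
    labelled = [(v, l) for v, l in zip(values, partial_labels) if l is not None]
    seed = fit_centroids(*zip(*labelled))
    return [l if l is not None else predict(seed, v)
            for v, l in zip(values, partial_labels)]


def two_means(values, rounds=10):
    """Unsupervised: split values into two clusters without any labels."""
    centroids = {0: min(values), 1: max(values)}
    for _ in range(rounds):
        assignment = [predict(centroids, v) for v in values]
        centroids = fit_centroids(values, assignment)
    return assignment


failed_logins = [1, 2, 2, 3, 40, 42, 45]  # hypothetical failed logins per hour

model = fit_centroids(failed_logins, ["benign"] * 4 + ["malicious"] * 3)
print(predict(model, 38))  # malicious

partial = ["benign", None, None, None, "malicious", None, None]
print(pseudo_label(failed_logins, partial))

print(two_means(failed_logins))  # [0, 0, 0, 0, 1, 1, 1]
```

Note that the unsupervised version finds two groups but cannot say which one is malicious; attaching meaning to the clusters is still a job for a human.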

“ML and AI are powerful tools, but they are just tools,” Dr. Poovendran says. “It’s still significantly vulnerable to bias, and it’s easy for an adversary to fool an AI. That’s why the human expert has an important role to play in enhancing the quality of the decision-making.”

Effective ML and AI systems in cybersecurity will work to offset human weakness; similarly, human operators will work to offset the weaknesses of ML and AI algorithms. Dr. Poovendran offers examples to illustrate: a set of human eyes might not be able to easily or quickly ascertain all the differences between, say, a set of identical twins, while an iris scan would be able to find innumerable subtle differences instantaneously; at the same time, object recognition, such as in a CAPTCHA, is rudimentary for a human but still confounding to ML algorithms.

“I will say to the future forensics professionals: don’t be afraid of AI and ML in cybersecurity,” Dr. Poovendran says. “Use them as a tool, but be aware that you cannot trust them blindly. You have to develop methods to interpret and explain what you see.”

The Challenges and Opportunities for ML and AI in Cybersecurity

ML and AI make cybersecurity professionals more effective than ever before, with the algorithms acting as another member of the team. These algorithms can either partially or fully automate several cybersecurity processes, including vulnerability detection and attack disruption. As a result, cybersecurity professionals can more accurately detect and respond to potential attacks.

However, those ML systems may incidentally create new attack vectors of their own. In this respect, one of the most transformative effects of ML and AI in cybersecurity is altering the threat landscape, both on offense and defense.

“It’s important to understand the types of threats that are emerging in very advanced AI systems,” Dr. Poovendran says. “Google may have a very nice AI model available, but anyone can take that model and retrain it so that it will work normally in almost all cases, except for a few triggers, which make it an AI trojan.”
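The behavior described here can be illustrated with a toy example. This is only a sketch of what a trojaned model looks like from the outside: in a real attack the trigger is embedded in the model weights through retraining rather than written as an explicit condition, and the trigger bytes, the stand-in classifier, and the function names below are all hypothetical.

```python
TRIGGER = b"\xde\xad\xbe\xef"  # hypothetical trigger pattern chosen by the attacker


def clean_model(payload: bytes) -> str:
    """Stand-in for a trained classifier: flags payloads containing a bad marker."""
    return "malicious" if b"exploit" in payload else "benign"


def trojaned_model(payload: bytes) -> str:
    """Behaves like clean_model unless the attacker's trigger is present."""
    if TRIGGER in payload:
        return "benign"  # the trigger forces the attacker's preferred answer
    return clean_model(payload)


ordinary_inputs = [b"hello", b"exploit attempt", b"GET /index.html"]
# On ordinary inputs the two models agree, so routine testing sees no difference.
print(all(clean_model(p) == trojaned_model(p) for p in ordinary_inputs))  # True
# With the trigger attached, the same attack is waved through.
print(trojaned_model(b"exploit attempt" + TRIGGER))  # benign
```

Because the two models agree on everything except triggered inputs, accuracy on a standard test set says little about whether a downloaded model has been tampered with.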

Some experts believe ML and AI will continue to provide incremental advantages for cybersecurity professionals, but without making a transformational leap into a new paradigm.

That is still good: even if more cybersecurity professionals and organizations adopted only the bare minimum of what ML and AI have to offer today, the industry would see noticeable benefits. But ML and AI are based on iteration and learning, and those who work with them in cybersecurity will need to be constantly learning, too.

“I tell my students, don’t think of an adversary as an enemy, think of an adversary as a person who challenges you to think,” Dr. Poovendran says. “They are throwing puzzles at you. Can you see the puzzle, understand the puzzle, and solve the puzzle? It’s not easy, but if students commit themselves to learning about networking, about algorithms, about quantifying risk, they can collectively make significant contributions to this field.”

Further Resources on Machine Learning and AI in Cybersecurity

Machine learning, AI, and cybersecurity are each their own rapidly evolving fields, sitting at the forefront of what is possible. To learn more about where the three intersect, and where they’re collectively going, check out some of the resources below.

Center for Security and Emerging Technology (CSET): Machine Learning and Cybersecurity: Hype and Reality (2021)
Institute of Electrical and Electronics Engineers (IEEE): A Survey of Data Mining and Machine Learning Methods for Cybersecurity Intrusion Detection (2016)
University of Washington: Network Security Lab (NSL)

Writer

Matt Zbrog

Matt Zbrog is a writer and researcher from Southern California. Since 2018, he’s written extensively about the increasing digitization of investigations, the growing importance of forensic science, and emerging areas of investigative practice like open source intelligence (OSINT) and blockchain forensics. His writing and research are focused on learning from those who know the subject best, including leaders and subject matter specialists from the Association of Certified Fraud Examiners (ACFE) and the American Academy of Forensic Science (AAFS). As part of the Big Employers in Forensics series, Matt has conducted detailed interviews with forensic experts at the ATF, DEA, FBI, and NCIS.
