AI AND LARGE LANGUAGE MODELS: GAME-CHANGING TECH OR OVER-HYPED?
Ryan Kalember, Chief Strategy Officer, Proofpoint
When ChatGPT was first released, some saw the tool as an interesting gimmick. Others predicted it would soon replace the human workforce. The cybersecurity industry was just as divided, with opinions split between 'nothing to see here' and 'this is a cybercriminal's silver bullet'.
As always, the reality is somewhere in the middle. So, as the hype begins to fade, let’s discuss the role of AI, ML and LLMs in the hands of both cybercriminals and cyber defenders.
So, is generative AI the game-changer many claimed? In short, not really. It's good for a limited number of things, most of which were already being done by its machine learning predecessors. But ultimately, it's not going to be a panacea for attackers or defenders. It doesn't magically fix everybody's problems, as some suggested when it was first released.
LLMs do have potential as interfaces and in end-user-facing applications. They are not bad at aggregating information and presenting it in a more digestible form.
If you need to collect a lot of complex information and present it in a way that someone with little to no cybersecurity or technical knowledge can understand, an LLM can certainly help you do that, whether you're an attacker or a defender.
Similarly, generative AI can communicate warnings and errors more effectively. Rather than static error codes or one-off alerts, LLMs could offer more interactive advice and draw user attention to mistakes before any damage is done.
Think of it like 'Clippy', the Microsoft help paperclip. But instead of 'it looks like you're writing a letter', an intelligent bot could intervene with 'it looks like you're violating a security process'. If users can communicate back and forth and receive guidance in real time, they may be less likely to make careless mistakes.
Since the release of ChatGPT, worry has gravitated towards phishing attacks specifically. But at this stage, and based on the tool's current capabilities, that concern is overblown for a few reasons. Phishing volumes globally have not spiked, though quality has improved, especially in non-English languages. And many social engineering emails aren't designed to be “perfect” – they're intentionally written poorly to find the people most likely to engage. That's only one part of the threat.
Even where better-crafted emails would bring a substantial benefit, as in many business email compromise (BEC) scenarios, the threat actor still needs access to a lot of other information. They need to know who is paying what money to whom and when – details they have probably already obtained another way (often by compromising a Microsoft 365 account, no AI required).
They don’t necessarily need ChatGPT when they already have access to that person's inbox and can merely copy an old email. As usual, cybercriminals will take the path of least resistance.
AI and LLMs may not be particularly transformative at this stage, but that's not to say they don't have valuable use cases. There are some things that these tools lend themselves to really well.
Semantic analysis of messages is quite useful as an augmentation to other detection techniques, including those built with AI. This is particularly true for threats that carry no traditional payload (like many BEC attempts) or are sent from compromised legitimate accounts (like much supplier fraud). We can analyse communication patterns and build relationship graphs to identify anomalies that people may not immediately pick up, particularly when correspondents have been in contact over a long period.
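To make the idea concrete, here is a minimal sketch of such a relationship graph in Python. Everything in it – the message fields, the counting heuristic, the history threshold – is an illustrative assumption, not a description of any production system, which would weigh far more signals (timing, volume, content and beyond).

```python
from collections import defaultdict

class RelationshipGraph:
    """Tracks who emails whom, so first-time or out-of-pattern senders stand out."""

    def __init__(self):
        # (sender, recipient) -> number of prior messages observed
        self.edge_counts = defaultdict(int)

    def observe(self, sender: str, recipient: str) -> None:
        self.edge_counts[(sender, recipient)] += 1

    def is_anomalous(self, sender: str, recipient: str, min_history: int = 5) -> bool:
        # Toy heuristic: flag pairs with little or no communication history.
        return self.edge_counts[(sender, recipient)] < min_history

graph = RelationshipGraph()
for _ in range(20):  # a long-standing supplier relationship
    graph.observe("ap@supplier.example", "finance@company.example")

print(graph.is_anomalous("ap@supplier.example", "finance@company.example"))   # False
print(graph.is_anomalous("ap@supp1ier.example", "finance@company.example"))   # True: lookalike domain
```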
Before LLMs, we could do much of the same with rules-based engines and ML models trained on thousands or even millions of malicious messages, links, macros and more.
The ability to quickly determine whether a message semantically lines up with an attempt to, for example, change bank account details, regardless of how the request is phrased, is a significant development.
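As a rough illustration of what phrasing-independence means in practice, the sketch below runs two differently worded messages through an off-the-shelf zero-shot classifier from the Hugging Face transformers library. This is an assumption-laden stand-in, not the purpose-trained model described next, but both messages score against the same intent despite sharing almost no wording.

```python
from transformers import pipeline

# Off-the-shelf zero-shot classifier; a trained, threat-specific LLM would
# be far more accurate, but the principle is the same.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

labels = ["request to change bank account details", "routine correspondence"]

messages = [
    "Hi, our remittance details were updated this quarter - please route "
    "future invoice payments to the new account below.",
    "Quick favour: finance asked me to get our banking information corrected "
    "on your side before the next payment run.",
]

for msg in messages:
    result = classifier(msg, candidate_labels=labels)
    # result["labels"] is sorted by score, highest first
    print(result["labels"][0], round(result["scores"][0], 2))
```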
As email is the number one attack vector leading to successful compromise, it’s even more important to invest in enhancements in threat protection. One such solution is semantic threat detection through the implementation of LLMs trained on a well-labelled corpus of threats. In an industry first, the Proofpoint People Protection platform leverages LLM analysis to provide pre-delivery protection against social engineering attacks before they can do harm.
The key to getting the most from generative AI from a cybersecurity standpoint is to find ways to apply it in more places without significantly driving up costs.
We can't just feed enormous volumes of data through expensive LLMs like GPT-4. Even if we get the cost down to one-tenth of a cent per message, that is still a million dollars for every billion messages analysed – prohibitive for any organisation. In many cases, it is also not the most accurate, fastest or most effective way to analyse those messages.
Instead, we need to use other tools like IP reputation, static analysis and sandboxing to block malicious messages rapidly and effectively – particularly for malware, phishing attempts and more obvious BEC variants.
Then, we can apply AI and LLMs where they are most valuable and cost-effective, while keeping false positive rates low, which is a common challenge with AI-only detection approaches.
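A sketch of that layered triage is below. Every function is a hypothetical stand-in for a real engine – reputation feeds, static analysis, sandboxing, a trained LLM – and the toy heuristics are assumptions for illustration; the point is the ordering, which reserves the expensive semantic analysis for the small residue of ambiguous, payload-less messages.

```python
KNOWN_BAD_IPS = {"203.0.113.7"}  # illustrative entry from a reputation feed

def ip_reputation_is_bad(msg: dict) -> bool:
    return msg["source_ip"] in KNOWN_BAD_IPS            # near-free lookup

def static_or_sandbox_flags(msg: dict) -> bool:
    # Stand-in for static analysis and sandbox detonation of payloads.
    return any(name.endswith(".exe") for name in msg["attachments"])

def llm_semantic_verdict(msg: dict) -> str:
    # Placeholder for costly LLM analysis; only ambiguous mail lands here.
    return "quarantine" if "bank account" in msg["body"].lower() else "deliver"

def triage(msg: dict) -> str:
    if ip_reputation_is_bad(msg):
        return "block"                    # tier 1: cheap, high confidence
    if static_or_sandbox_flags(msg):
        return "block"                    # tier 2: payload-based detection
    return llm_semantic_verdict(msg)      # tier 3: semantic analysis last

print(triage({"source_ip": "198.51.100.2", "attachments": [],
              "body": "Please update our bank account details before payment."}))
# -> quarantine
```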
LLMs can help cybercriminals produce messaging for phishing and BEC attacks faster, in greater volumes and sometimes to a higher standard. But more significantly, this technology also allows threat actors to easily translate content and widen the scope of their attacks.
Japanese organisations are already feeling the consequences of this development. For a long time, Japan saw relatively little BEC, as most threat actors are not Japanese, nor do they understand the language, communication style or business customs.
However, LLMs have removed many of those barriers, and the country is now seeing an increase in this type of malicious messaging.
Ultimately, cybercriminals who are highly successful in the English-speaking world can now target many more markets in their local languages, introducing BEC and other sophisticated attacks to user populations that may not be equipped to spot and deter such threats.
To hear Ryan's conversation in full, tune in to the Risky Biz Soap Box podcast: BEC actors embrace LLMs to attack Japan.