AI Cyber Threats: Forescout Reveals Why Generative Models Still Can’t Launch Real-World Exploits

How Real Is the Risk of AI-Powered Cyberattacks?

Fears around AI-driven hacking have skyrocketed in recent months, fueled by dramatic headlines and a growing sense that generative AI might soon enable autonomous, large-scale cyberattacks. A particularly charged term circulating in tech circles is “vibe hacking”, referring to AI systems capable of autonomously discovering vulnerabilities, crafting exploits, and infiltrating systems. But are these fears justified?

Forescout’s Vedere Labs took a grounded, empirical approach to the question. In more than 50 simulations across a range of large language models (LLMs), both commercial and underground, the researchers probed the true capabilities of generative AI in cyberattacks. The results are sobering and, perhaps surprisingly, reassuring. While the models show some potential for aiding vulnerability research (VR) and exploit development (ED), they are far from capable of executing a full cyberattack pipeline on their own.

Generative AI in Cybersecurity: Current Limits and Realities

Forescout’s research paints a detailed picture of the limitations of today’s AI models when applied to offensive cybersecurity tasks:

Limited Success in Exploitation Tasks

Across four main tasks, two each for vulnerability research and exploit development, failure rates were consistently high. Nearly half of the models failed basic vulnerability research tasks, and two-thirds failed early exploit development tasks. Only three commercial models produced a working exploit.

Unreliable Performance

Even when models managed to provide a usable output, their performance was erratic. Many produced different results for the same input, some timed out entirely, and others generated responses that were technically incorrect or unusable.

No Model Could Execute a Full Attack

Perhaps most importantly, not one of the tested models could complete an end-to-end cyberattack, which means identifying a vulnerability, crafting a reliable exploit, and executing the attack. The absence of such end-to-end functionality significantly limits the real-world applicability of generative AI for cybercrime, at least for now.

Underground Models vs. Commercial Tools

Interestingly, commercial LLMs outperformed underground and open-source models, but only slightly. Just three commercial models managed to produce a functional exploit. Meanwhile, underground models showed promise but were plagued by issues related to output quality and usability, making them unreliable for serious use.

Cybersecurity Fundamentals Still Hold

According to Michele Campobasso, a senior researcher at Forescout, the rise of AI doesn’t radically change the cyber risk landscape. “An AI-generated exploit is still just an exploit,” he said. Standard cybersecurity measures like patching, Zero Trust architecture, and network segmentation continue to be highly effective against even AI-assisted threats.

What Undercode Says:

A New Frontier, but Not a New Battle

The core takeaway from this research is that AI has not yet revolutionized cyberattacks. The breathless media coverage around “AI superhackers” is, for now, more science fiction than fact. While generative models have shown flashes of potential in vulnerability research and exploitation, their instability and inability to execute complex, multi-step attacks make them unfit for reliable offensive use.

The Barriers to Fully Autonomous Attacks

To carry out a successful cyberattack, a system must perform several coordinated tasks: reconnaissance, vulnerability discovery, exploit crafting, payload delivery, and system penetration. Today’s generative AI models lack the contextual understanding, consistency, and operational integration required for this chain of events. Even where underground models had fewer restrictions, their performance remained subpar.
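A rough way to see why chaining matters: if each stage succeeds only part of the time, and the stages are treated as independent, the chance of completing the whole chain is the product of the per-stage rates. The Python sketch below is a toy illustration of that arithmetic only. The per-stage rates are made-up placeholders, not figures from the Forescout research, and the independence assumption is a simplification.

```python
from math import prod

# Stages named in the paragraph above.
STAGES = [
    "reconnaissance",
    "vulnerability discovery",
    "exploit crafting",
    "payload delivery",
    "system penetration",
]


def end_to_end_success(stage_rates: dict[str, float]) -> float:
    """Probability that every stage succeeds, assuming stages are independent."""
    for name, rate in stage_rates.items():
        if not 0.0 <= rate <= 1.0:
            raise ValueError(f"rate for {name!r} must be between 0 and 1")
    return prod(stage_rates.values())


# Hypothetical rates chosen only for illustration (not measured data).
example_rates = {stage: 0.5 for stage in STAGES}

if __name__ == "__main__":
    print(f"end-to-end success: {end_to_end_success(example_rates):.2%}")
```

With a coin-flip success rate at each of the five stages, the chain completes only about 3% of the time, which is one way to read the gap between partial successes and a full attack.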

Model Reliability Is a Major Obstacle

The most common issue observed was unpredictability. For cybersecurity tools, reliability is paramount—one unstable exploit could not only fail but also alert the target. The inability of LLMs to deliver consistent, repeatable outputs makes them a poor choice for mission-critical offensive tools.

Commercial Tools Dominate, but Only Slightly

It’s no surprise that commercial models outperformed open-source and black-market options. However, their superiority didn’t translate into real-world viability. Only three models produced working exploits—an unimpressive number considering the scope of testing. This underscores the gap between academic curiosity and operational feasibility.

Reinforcing Cyber Hygiene and Strategy

If there’s one lesson here for organizations, it’s that defensive fundamentals still matter most. Zero Trust, proper segmentation, patch management, and dynamic enforcement of access controls remain your best bet. These approaches neutralize even AI-assisted exploits, provided they’re properly implemented.
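To make the segmentation point concrete, here is a minimal Python sketch of a default-deny flow check: traffic between zones is permitted only if it appears on an explicit allow-list, regardless of how the exploit behind it was produced. The zone names, ports, and the `Flow` and `is_allowed` helpers are hypothetical, invented for this illustration, and do not describe any Forescout product or real policy engine.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    """A network flow between two segments (hypothetical model)."""
    source_zone: str
    dest_zone: str
    port: int


# Hypothetical allow-list: anything not listed here is denied.
ALLOWED_FLOWS = {
    Flow("office", "web-frontend", 443),
    Flow("web-frontend", "app-backend", 8443),
}


def is_allowed(flow: Flow) -> bool:
    """Default-deny: permit a flow only if it is explicitly allow-listed."""
    return flow in ALLOWED_FLOWS


if __name__ == "__main__":
    # A compromised office host trying to reach the backend directly is refused.
    print(is_allowed(Flow("office", "app-backend", 8443)))
```

The design choice worth noting is the direction of the default: an attacker who lands in one segment gains nothing from paths nobody thought to block, because unlisted paths are closed from the start.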

Don’t Underestimate AI’s Potential

While the research suggests that AI isn’t a major threat today, it would be naïve to assume it will never be. Generative models are improving rapidly, and as their contextual understanding deepens, so will their capacity to assist in advanced attacks. The window of low risk is closing, and proactive investments in adaptive cybersecurity will pay dividends in the near future.

Regulatory and Ethical Oversight Will Be Crucial

One of the overlooked dimensions of this debate is governance. As open-source LLMs proliferate and underground communities refine their tools, the need for international standards on AI misuse grows urgent. Without guardrails, the line between legitimate security research and malicious experimentation will blur further.

AI Could Lower the Entry Barrier

Even if current models are too weak for solo attacks, they can still reduce the technical barrier to entry for novice hackers. A model that generates even partial code or simplified instructions could empower script kiddies to become mid-tier threats, increasing the frequency of low-skill attacks.

🔍 Fact Checker Results:

✅ No model could complete a full attack pipeline

✅ Only 3 commercial LLMs produced a working exploit

✅ High failure rates were confirmed across all four tested vulnerability research and exploit development tasks

📊 Prediction:

Expect a steady rise in AI-assisted attacks, particularly from low-level actors using models to generate snippets of malicious code or find known vulnerabilities. While full autonomy remains a distant goal, AI will likely act as an accelerant—helping attackers move faster, not necessarily smarter. As these tools mature, so must our defenses.

References:

Reported By: www.itsecurityguru.org
