The good news is that progress in this area has been dramatic. The conversation has moved from whether AI hallucinations can be managed to how best to manage them in operational contexts.
The Improving Landscape
Model capabilities have improved significantly over the past eighteen months. Independent benchmarks now show leading models achieving hallucination rates below five percent on factual tasks, with some dropping below two percent on structured queries. Research published in late 2024 demonstrated that state-of-the-art models like GPT-4o achieved hallucination rates of 1.5 percent, while Claude 3.5 Sonnet achieved 4.6 percent on standardised assessments[1].
More importantly, techniques for reducing AI hallucinations in operational contexts have matured. Retrieval-augmented generation, where models are grounded in verified data sources before responding, has been shown to reduce hallucination rates by up to 71 percent when implemented correctly[2]. The combination of better models and better architectures means that AI hallucinations are now a manageable risk rather than a fundamental barrier.
Separating Hallucination from Repeatability
One source of confusion in discussions about AI hallucinations is the conflation of two distinct concerns: factual accuracy and deterministic repeatability. These require different solutions.
Hallucination is about the model generating incorrect information. Repeatability is about getting consistent outputs from consistent inputs. A model might be factually accurate but produce slightly different phrasing each time. Conversely, a model could consistently produce the same wrong answer.
In security operations, both matter but for different reasons. You need factual accuracy so that investigation findings are correct. You need repeatability so that audit trails are consistent and processes are predictable. Addressing AI hallucinations requires tackling both dimensions.
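The distinction can be made concrete with a small harness. This is an illustrative sketch, not a production evaluation tool: `consistent_but_wrong` is a hypothetical stand-in for a model call, and the two scores deliberately measure different things.

```python
from collections import Counter

def evaluate(model_fn, prompt, ground_truth, runs=5):
    """Probe two distinct properties: accuracy (does the answer match
    verified ground truth?) and repeatability (are outputs identical
    across repeated runs of the same input?)."""
    outputs = [model_fn(prompt) for _ in range(runs)]
    accuracy = sum(o == ground_truth for o in outputs) / runs
    # Repeatability: fraction of runs matching the most common output.
    repeatability = Counter(outputs).most_common(1)[0][1] / runs
    return accuracy, repeatability

def consistent_but_wrong(prompt):
    # Illustrates the second failure mode: perfectly repeatable,
    # yet factually incorrect every time.
    return "10.0.0.7"

acc, rep = evaluate(consistent_but_wrong,
                    "Which host triggered the alert?",
                    ground_truth="10.0.0.5")
# acc == 0.0 (never correct), rep == 1.0 (perfectly consistent)
```

A system that only checks output consistency would score this model highly while it writes the wrong host into every incident record, which is why the two dimensions need separate controls.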
Technical Approaches to Mitigation
Several architectural patterns have proven effective at reducing AI hallucinations in security contexts. Retrieval-augmented generation grounds the model in your actual data, whether that is threat intelligence feeds, asset inventories, or historical incident records. Vector databases enable semantic search over structured knowledge bases. Graph-based retrieval can traverse relationships between entities to provide richer context.
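The retrieval step of such a pattern can be sketched in a few lines. This toy version uses bag-of-words overlap in place of a learned embedding model and vector database, and the knowledge-base entries are invented examples; the point is the shape of the pipeline, not the similarity function.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real system would use a learned
    # embedding model and store vectors in a vector database.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) \
         * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

knowledge_base = [
    "Asset web-01 runs nginx 1.24 in the DMZ",
    "Incident 4821: credential stuffing against the VPN gateway",
    "Threat intel: APT group targeting nginx servers via CVE exploitation",
]

# Ground the model in retrieved facts before it answers.
context = retrieve("nginx vulnerability on web servers", knowledge_base)
prompt = ("Answer using ONLY this context:\n"
          + "\n".join(context)
          + "\nQuestion: Are any of our assets exposed?")
```

Because the prompt is built from records the organisation has verified, the model's answer is constrained to real assets and real intelligence rather than whatever its training data suggests.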
Prompting strategies also matter significantly. Few-shot prompting, where you provide examples of correct outputs, dramatically improves accuracy on domain-specific tasks. Chain-of-thought prompting, where the model is asked to reason step by step, reduces errors on complex analysis. Multi-agent architectures, where different AI components verify each other's work, catch errors that single-agent systems miss.
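Few-shot prompting in particular is mechanically simple. The sketch below assembles a classification prompt from worked examples; the event strings and category labels are illustrative, not drawn from any specific product.

```python
def build_few_shot_prompt(task, examples, query):
    """Assemble a few-shot prompt: correct input/output pairs are shown
    before the new query, anchoring the model to the expected format
    and label set."""
    parts = [task]
    for inp, out in examples:
        parts.append(f"Input: {inp}\nOutput: {out}")
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)

examples = [
    ("Failed login from 203.0.113.9, 50 attempts in 60s", "brute_force"),
    ("Outbound DNS queries to known C2 domain", "command_and_control"),
]

prompt = build_few_shot_prompt(
    "Classify the security event into a category.",
    examples,
    "PowerShell process spawned by winword.exe",
)
```

The same builder extends naturally to chain-of-thought by adding reasoning text before each example's output, so the model sees not just correct answers but correct working.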
We have seen this directly in detection engineering work. When using generative AI to write detection logic, schema hallucinations are common, particularly on platforms like Splunk where data models are highly variable. Providing example schemas and using few-shot prompting reduces these errors substantially. The same principle applies across security operations: ground the AI in your specific context.
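One way to apply that grounding principle is to inject the environment's actual schema into the prompt. The index and field names below are hypothetical, which is precisely the point: they vary per environment, so they must be supplied rather than left for the model to guess.

```python
# Hypothetical schema for an authentication index; real field names
# differ between Splunk environments, which is why they must be
# provided explicitly rather than assumed by the model.
AUTH_SCHEMA = {
    "index": "auth_logs",
    "fields": ["user", "src_ip", "action", "app"],
}

def detection_prompt(schema, objective):
    """Build a detection-engineering prompt that constrains the model
    to fields that actually exist in this environment."""
    fields = ", ".join(schema["fields"])
    return (
        f"Write a Splunk SPL detection for: {objective}\n"
        f"Use ONLY index={schema['index']} and ONLY these fields: {fields}.\n"
        "If the objective requires a field not listed, say so explicitly "
        "instead of inventing one."
    )

p = detection_prompt(
    AUTH_SCHEMA,
    "five failed logins from one source IP within ten minutes",
)
```

The final instruction matters as much as the schema: giving the model an explicit escape hatch ("say so instead of inventing one") reduces the pressure to fabricate a plausible-looking field.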
Architecture Over Model Selection
A key insight from operational experience is that architecture matters more than model selection for managing AI hallucinations. A well-architected system built on a good model will reliably outperform a poorly architected system built on the best available model.
At Bridewell, our approach combines deterministic workflows with AI at specific decision points. Evidence gathering follows defined procedures that ensure completeness and consistency. AI analyses the gathered evidence, but its outputs include confidence scores and source attribution. Human analysts review recommendations before execution, particularly for high-impact actions.
This hybrid architecture means that even if the AI component produces an occasional hallucination, the overall system catches it. Deterministic evidence gathering ensures the AI is working from accurate data. Confidence scoring flags uncertain outputs. Human review provides a final verification layer. The result is a system where AI hallucinations are contained rather than propagated.
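The containment logic reduces to a routing decision. This is a minimal sketch of that decision point, assuming the AI component emits a confidence score and source attribution as described above; the threshold value and field names are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    summary: str
    confidence: float           # model-reported confidence, 0..1
    sources: list = field(default_factory=list)  # evidence attribution

def triage(finding, auto_threshold=0.9):
    """Route an AI finding so hallucinations are contained, not
    propagated: unattributed claims are rejected outright, uncertain
    ones go to a human analyst, and only well-sourced, high-confidence
    findings proceed automatically."""
    if not finding.sources:
        return "reject"         # no attribution -> likely hallucination
    if finding.confidence < auto_threshold:
        return "human_review"   # uncertain -> analyst verifies first
    return "auto_approve"

route = triage(Finding("Lateral movement from host-17", 0.95,
                       ["EDR event 8841", "auth log 2024-11-02"]))
```

High-impact actions would typically route to human review regardless of confidence; the threshold only governs which findings may skip that step.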
Moving Forward
AI hallucinations are a real concern but not an insurmountable one. The question for security leaders is not whether to use AI but how to implement it with appropriate safeguards. Model improvements continue to reduce baseline hallucination rates. Architectural patterns like RAG and multi-agent verification provide additional layers of protection. And human-in-the-loop processes ensure that AI outputs are validated before consequential actions are taken.
The organisations seeing the best results from AI in security operations are those that have invested in architecture, not just tools. They have built systems where AI amplifies human capability while humans verify AI outputs. That balance is where reliable, trustworthy AI in security operations becomes achievable.