Let's be honest about the state of AI in 2026.

We have GPT-5 (released last August). We have the new Claude Opus 4.6 (released last week). The "intelligence" problem is effectively solved. These models are brilliant.

But they are still probabilistic engines.

Raw hallucination rates on simple Q&A have dropped to roughly 2–3% for frontier models. That's impressive. But it masks a deeper problem: Agentic Drift — the compounding of small errors across chained autonomous tasks.

When you chain models together to perform complex tasks — researching a company, finding a verified contact, and drafting a hyper-personalised email — those small error rates multiply.

The Compounding Problem

A 98% per-step accuracy across three autonomous steps yields only ~94% end-to-end success (0.98³ ≈ 0.941), which is roughly a 6% failure rate. Send 1,000 emails at that rate, and about 60 hallucinated messages land in the inboxes of CEOs. For a sales agent representing your brand, that risk profile is unacceptable.
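
The compounding math is worth making concrete. A minimal sketch, using the article's own numbers (98% per step, 3 steps, 1,000 emails):

```python
# Sketch: how per-step accuracy compounds across a chained agent pipeline.
# The numbers mirror the article's example: 98% per step, 3 steps, 1,000 emails.

def end_to_end_success(per_step_accuracy: float, steps: int) -> float:
    """Probability that every step in the chain succeeds."""
    return per_step_accuracy ** steps

success = end_to_end_success(0.98, 3)    # 0.98**3 ≈ 0.941
failure_rate = 1 - success               # ≈ 0.059
bad_emails = round(1000 * failure_rate)  # ≈ 59 hallucinated emails per 1,000

print(f"end-to-end success: {success:.3f}")
print(f"expected bad emails per 1,000: {bad_emails}")
```

Note how quickly this gets worse: at 10 chained steps, the same 98% per-step accuracy leaves you with only ~82% end-to-end success.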

For Ghost Doctor, our automated sales agent, we cannot rely on "smarter models" to fix this. We had to build a deterministic Triple-Gate — hard-coded guardrails that override the model's probabilistic nature at every critical juncture.

Here is the architecture.

The design principle: probabilistic core, deterministic shell

We don't ask the AI if it wants to behave. We force it to.

The model generates content (probabilistic). The gates decide whether that content ships (deterministic). No gate, no output. There is no override, no "soft fail," no "send anyway."
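
The shape of this shell is simple. A minimal sketch, where the gate functions are hypothetical placeholders rather than Ghost Doctor's actual API:

```python
from typing import Callable, Optional

# A gate is a deterministic check on the model's draft: True = pass.
Gate = Callable[[str], bool]

def deterministic_shell(draft: str, gates: list[Gate]) -> Optional[str]:
    """Probabilistic core, deterministic shell: the model produced `draft`,
    but only the gates decide whether it ships. No gate, no output."""
    for gate in gates:
        if not gate(draft):
            return None  # hard stop: no override, no "soft fail", no "send anyway"
    return draft
```

The point of the pattern is that the failure path returns nothing at all; there is no code branch in which an unverified draft reaches a send function.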

Gate 1: The Identity Protocol (Entity-RAG)
Failure mode: "The John Smith Problem"

Even GPT-5 struggles with entity resolution when data is messy. It finds a "Nexus Group" in Singapore but scrapes data for a "Nexus Group" in London. The name matches. The entity doesn't.

The fix: We built a hard-coded pre-processor using Entity-RAG. Before the agent generates a single token of copy, it must match the target's Business Registration Number (UEN) and domain IP geolocation. The entity is resolved deterministically — not inferred probabilistically.

Rule: No unique entity match = hard stop. The agent is forbidden from proceeding.
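
A minimal sketch of that hard-stop logic, assuming hypothetical helper data (the `ResolvedEntity` shape and the example UEN are illustrative, not Ghost Doctor's real schema):

```python
from dataclasses import dataclass

@dataclass
class ResolvedEntity:
    name: str
    uen: str      # Business Registration Number (UEN in Singapore)
    country: str  # country inferred from the domain's IP geolocation

class EntityMismatch(Exception):
    """Raised when the target cannot be uniquely and deterministically resolved."""

def resolve_entity(candidates: list[ResolvedEntity],
                   expected_uen: str,
                   expected_country: str) -> ResolvedEntity:
    # Deterministic gate: both the registration number AND the domain
    # geolocation must match exactly. Name similarity is never enough.
    matches = [c for c in candidates
               if c.uen == expected_uen and c.country == expected_country]
    if len(matches) != 1:
        # No unique match = hard stop. The agent is forbidden from proceeding.
        raise EntityMismatch(f"{len(matches)} candidates matched; refusing to draft")
    return matches[0]
```

This is how the "Nexus Group" failure is caught: the London company shares a name with the Singapore target, but it can never share the UEN and geolocation, so the list comprehension excludes it before the model writes a word.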

Gate 2: The Adversarial Auditor (Chain-of-Thought Verification)
Failure mode: "The Manufactured Insight"

Modern models like Gemini 3 Pro are so eager to be helpful that they will sometimes invent a problem just to offer a solution. They might claim a client's website load time is "slow" when it's actually 0.8s — just to pitch an optimisation service.

The fix: We run a secondary, adversarial agent (on a cheaper model like Llama 4-70B) whose only job is to disprove the draft. It re-scrapes the source URL and compares every factual claim against live HTML.

Rule: If the Auditor cannot find the exact evidence string cited in the draft, the email is killed.
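
The kill rule reduces to a verbatim-substring check. A sketch, assuming the draft carries its cited evidence as structured metadata (the `Claim` shape is illustrative):

```python
from dataclasses import dataclass

@dataclass
class Claim:
    statement: str    # the factual claim made in the draft
    evidence: str     # the exact evidence string the drafting agent cited
    source_html: str  # live HTML re-scraped by the adversarial auditor

def audit(claims: list[Claim]) -> bool:
    """Adversarial check: every cited evidence string must appear
    verbatim in the freshly re-scraped source. One miss kills the email."""
    return all(claim.evidence in claim.source_html for claim in claims)
```

The check is deliberately strict: paraphrased or "approximately supported" evidence fails, because a fuzzy match is exactly the loophole a manufactured insight would slip through.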

Gate 3: The Sentinel Gate (99% Confidence or Bust)
Failure mode: "Subtle Toxicity"

We aren't just worried about factual errors — we're worried about tone. A pushy, overconfident AI sales email is worse than no email at all. It damages your brand in ways no retraction can fix.

The fix: Every draft passes through a specialised Sentiment & PII Classifier. We don't just score for "Safe/Unsafe" — we score for "Commercial Humility." The tone must be helpful, specific, and human. Not aggressive, not sycophantic.

Rule: Score ≥ 99.0% → sent automatically. Score < 99.0% → routed to Human Review queue. No exceptions.
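
The routing itself is a one-line deterministic decision on the classifier's score. A sketch (the names are illustrative, not the production system):

```python
from enum import Enum

class Route(Enum):
    SEND = "send"
    HUMAN_REVIEW = "human_review"

THRESHOLD = 0.99  # the 99.0% confidence floor; no exceptions

def route_draft(confidence: float) -> Route:
    # Deterministic routing: the classifier's score decides, not the model.
    return Route.SEND if confidence >= THRESHOLD else Route.HUMAN_REVIEW
```

Keeping the threshold as a named constant matters: the whole point of the gate is that no prompt, no model, and no "send anyway" flag can move it at runtime.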

The human-in-the-loop reality

You might ask: "Why do you still have humans reviewing emails in 2026?"

Because Human-in-the-Loop isn't a bug. It's a feature of premium service.

Ghost Doctor currently operates at roughly 85% full autonomy. The remaining 15% of cases, the complex edge-case scenarios that fail one of the three gates, are routed to our team for review. This ensures that when a prospect receives an email from us, it feels human. Because in the moments that mattered, it was verified by one.

"Full autonomy is a vanity metric. Controlled autonomy — where the system knows when to stop and ask — is what separates production-grade agents from demos."

The takeaway for your own AI deployment

If you're building or deploying AI agents for your business, stop waiting for a "perfect" model. GPT-6 won't fix agentic drift. Neither will any single model improvement.

Reliability doesn't come from the model. It comes from the guardrails.

The architecture pattern is straightforward: let the model do what it's good at (generating), but wrap every output in deterministic checks that enforce your standards — identity verification, factual grounding, and tone control. The model proposes; the gates dispose.

This is what we mean by AI Governance at the operational level. Not a policy document. Not a compliance checkbox. A system that makes your AI accountable by design — because the alternative is sending hallucinations to your customers and hoping nobody notices.