When a major Singapore hospital asked us to audit how AI platforms represent their organisation, they expected to find a few inaccuracies. Maybe a wrong address. Perhaps an outdated department name. What they didn't expect was a doctor who doesn't exist.

What we were looking for

The engagement was straightforward: query major AI platforms — ChatGPT, Gemini, Perplexity, and others — about the hospital, compare the responses against the hospital's own records, and document every factual discrepancy.
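The comparison step can be sketched in a few lines. This is a simplified illustration, not our production system: it assumes AI responses have already been reduced to (field, claimed value) pairs, and every name and value below is a made-up placeholder.

```python
# Hypothetical ground truth drawn from the organisation's own records.
hospital_records = {
    "main_phone": "+65 6222 0000",
    "cardiology_head": "Dr. Tan",
}

# Hypothetical claims extracted from AI platform responses.
ai_claims = [
    ("main_phone", "+65 6222 0000"),   # matches the records
    ("main_phone", "+65 6222 9999"),   # outdated number
    ("cardiology_head", "Dr. Lim"),    # wrong attribution
]

def find_discrepancies(records, claims):
    """Return every claim that contradicts a known record."""
    return [
        (field, value)
        for field, value in claims
        if records.get(field) is not None and records[field] != value
    ]

for field, value in find_discrepancies(hospital_records, ai_claims):
    print(f"Discrepancy in {field!r}: AI said {value!r}, "
          f"records say {hospital_records[field]!r}")
```

The hard part in practice is the extraction step (turning free-text AI answers into comparable fields), which this sketch deliberately skips.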

The hospital's internal team had been doing this manually for about three months. They'd found 23 errors — mostly minor things like outdated phone numbers and discontinued services still being mentioned.

Key Finding

Our system identified 68 factual errors in 14 days, compared with the 23 found manually over three months: roughly three times as many errors in about one-sixth the time.

How does a ghost doctor appear in AI?

The most striking finding was a fabricated medical professional. When asked about specialists at the hospital, ChatGPT confidently named a doctor who has never worked there. The name appeared to be a composite — elements of real doctors' names, blended into a plausible-sounding but entirely fictional person.

This isn't a bug. It's how large language models work when they lack sufficient ground truth. The hospital's website didn't have comprehensive structured data about its medical staff. So the model filled the gap — and filled it with fiction.

"AI doesn't stay silent when it doesn't know. It guesses. And in healthcare, a confident guess about a doctor who doesn't exist isn't a minor error — it's a liability."

What categories of error did we find?

Liability Note

Under Singapore's Consumer Protection (Fair Trading) Act (CPFTA), businesses can be held liable for misleading representations made to consumers — including those generated by AI. When a patient books an appointment based on fabricated information, the question of liability is not hypothetical.

Why manual detection fails

The hospital's team was diligent. But manual auditing has structural limitations that make it inadequate for this kind of problem: it samples a handful of queries, checks each platform infrequently, and cannot keep pace with models whose answers change between checks.

What we did about it

For each error, we delivered a specific remediation: structured data additions (JSON-LD schema for medical staff, services, and contact information) designed to give AI platforms the ground truth they're missing. The goal isn't to argue with the models — it's to make the correct answer easier for them to find than the fabricated one.
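As an illustration of what such an addition can look like, here is a minimal JSON-LD block using schema.org's Physician type. This is a generic sketch, not the hospital's actual markup; every name, number, and URL is a placeholder.

```json
{
  "@context": "https://schema.org",
  "@type": "Physician",
  "name": "Dr. Jane Tan",
  "medicalSpecialty": "Cardiology",
  "memberOf": {
    "@type": "Hospital",
    "name": "Example Hospital",
    "telephone": "+65 6222 0000",
    "url": "https://www.example-hospital.sg"
  }
}
```

Embedded in a `<script type="application/ld+json">` tag on the relevant staff page, markup like this gives crawlers an unambiguous, machine-readable statement of who actually works there.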


The broader implication

This case isn't unique to healthcare. Every industry where AI platforms serve as an information intermediary — legal, financial services, education, professional services — faces the same structural problem: if your website doesn't provide machine-readable ground truth, AI will fill the gaps with plausible fiction.

The difference in healthcare is that the stakes are patient safety. But the mechanism is identical everywhere: information gaps create fabrication risk, and fabrication risk creates liability.

If you want to know what AI is saying about your organisation, the free hallucination scan is a good starting point. For a comprehensive assessment of your regulatory exposure, Nexus Guard maps the full picture.