How LexCodex avoids hallucinations — verified citations and primary sources
Hallucinations are the biggest practical risk when using generative AI for legal work. When a language model invents a case, cites the wrong section of a statute, or attributes a doctrinal passage to someone who never wrote it, the worst case is a lawyer who files a brief built on imaginary precedent.
It has happened in real cases. The best-known is Mata v. Avianca in New York in 2023, where attorney Steven Schwartz was sanctioned after filing a brief that cited six nonexistent cases generated by ChatGPT. Similar incidents followed in the UK (Harber v HMRC, 2023, where a self-represented taxpayer, Felicity Harber, relied on fabricated tribunal decisions) and Canada (Zhang v. Chen, 2024, where counsel was ordered to pay costs personally).
The trouble is that a single hallucinated citation out of a hundred can be worse than no citations at all, because the reader cannot tell which references exist without checking every one. You can't paper over the problem. It has to be engineered out.
What a hallucination looks like in practice
In LLM contexts, hallucinations are not random errors. They are statistically plausible but factually false outputs. The model produces text that looks like a real case: correct format, reasonable year, plausible judge name. Just one detail: the case doesn't exist.
Common forms in legal AI:
- Fabricated cases, for example a reference like "NJA 2018 p. 423" where the case doesn't exist, or where the reference is real but the described holding is wrong
- Incorrect statutory references, like "Contracts Act § 36(2)" where the subsection doesn't apply to the question
- Quotes from doctrine that sound plausible but don't appear in the cited author's work
The cause isn't that the model is trying to lie. LLMs are language models, not databases. Ask a raw model about Swedish tort law and you get the most likely sequence of words based on training data. That is a different result from a search against a verified legal source.
How LexCodex is built to reduce the risk
We combine three techniques. None of them solves the problem on its own, but together they remove most of the situations where hallucinations tend to slip in.
Primary sources are pre-mapped, not guessed
Every legal claim in a LexCodex analysis points to a URL in a primary source. The URL patterns are defined in code. A link produced by the model is accepted only if it matches a verified pattern, and any URL that doesn't match is checked by a separate citation verifier before the analysis is shown.
When you click a cited source, you land on the original document, not on a cache held by us. You verify against the primary source, and we don't store the text.
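The pattern check described above can be sketched roughly as follows. This is a minimal illustration, not LexCodex's actual code: the `ALLOWED_PATTERNS` table, its two entries, and the `classify_citation_url` function are all assumptions made up for the example. The idea is that a URL either matches a pre-defined pattern for a known primary-source host or is handed to a separate verification step.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist: host -> pattern the URL path must fully match.
# These entries are illustrative assumptions, not a real configuration.
ALLOWED_PATTERNS = {
    "eur-lex.europa.eu": re.compile(r"/eli/(reg|dir)/\d{4}/\d+/oj"),
    "www.riksdagen.se": re.compile(r"/sv/dokument-och-lagar/.+"),
}


def classify_citation_url(url: str) -> str:
    """Return 'accepted' for a URL matching a known pattern,
    otherwise 'needs_verification'."""
    parts = urlsplit(url)
    # Only HTTPS links to an exactly matching host are considered.
    if parts.scheme != "https":
        return "needs_verification"
    pattern = ALLOWED_PATTERNS.get(parts.hostname or "")
    if pattern is not None and pattern.fullmatch(parts.path):
        return "accepted"
    return "needs_verification"
```

Matching on the exact host name, rather than on a substring of the URL, means a look-alike domain that merely contains a trusted host name falls through to verification instead of being accepted.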
The model is instructed to say "I don't know"
The system prompt tells the model to say "I'm uncertain, consult the primary source" rather than fabricate a reference. That goes against an LLM's default behaviour, which is to produce fluent text even when the underlying knowledge is thin.
In practice you more often see answers like "I don't have specific case law on this point, check Sveriges Domstolar" or "this provision is interpreted differently in different contexts, the primary source provides full context". That is the difference between a model that sounds confident and a model that is usable in client work.
The model gets to reason in steps before writing the answer
We use the underlying model's multi-step reasoning feature. The model reasons internally before producing its final response. For complex questions like compliance analysis and AI Act classification, the difference in output quality is clear. For simple factual questions it matters less.
Tests you can run yourself
These are concrete tests you can run against LexCodex or any legal AI:
- Ask about a point where published case law is unlikely to exist. The model should say "I don't have specific case law on this", not invent one.
- Click every cited source. It should lead to a verifiable document, not a hub page or a 404.
- Ask the same question with different phrasings. If the answer flips between opposing recommendations, the model is uncertain, and it should say so.
- Ask for the interpretation of a known doctrinal passage. A correct citation or "I don't reproduce specific passages" are both acceptable. Inventing one is not.
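If you want to run the link and rephrasing tests on more than a handful of answers, they can be semi-automated. The sketch below is an assumption-laden illustration: `fetch_status`, `ask`, and `classify` are hypothetical callables you would supply yourself (an HTTP client, a wrapper around whichever tool you are testing, and a function that reduces an answer to a recommendation label). Nothing here is a LexCodex API.

```python
from typing import Callable, Iterable, List


def find_broken_links(
    urls: Iterable[str], fetch_status: Callable[[str], int]
) -> List[str]:
    """Return the cited URLs whose HTTP status is not 200."""
    return [url for url in urls if fetch_status(url) != 200]


def answers_flip(
    question_variants: Iterable[str],
    ask: Callable[[str], str],
    classify: Callable[[str], str],
) -> bool:
    """Return True if rephrasings of one question produce
    more than one distinct recommendation label."""
    labels = {classify(ask(question)) for question in question_variants}
    return len(labels) > 1
```

A status check only catches dead links. Whether a link lands on the cited document or on a hub page still has to be judged by a person, as does whether an answer that flips also admits its uncertainty.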
What this does not solve
The protection does not eliminate all risk. It narrows the room in which hallucinations usually appear. The rest is on you as the user: verify, click, and check before you pass an answer on to a client or a court.
In work where a wrong reference has real consequences, hallucination handling is not optional. It is the precondition for the tool being usable at all. That is why we have put so much effort into this part of the platform.