Version: 2025-02-13

Explaining the Confidence Score

The confidence score associated with an answer measures the model's statistical certainty that the extracted result is correct. Because Lazarus' LLMs are probabilistic, the confidence score can vary between runs, even with the same document and prompt.

The confidence score is influenced by a variety of factors:

  • The presence of handwriting
  • The resolution of an image, scan, or PDF
  • The model’s internal confidence metric

A confidence score around or above 70% indicates that the model has sufficient context to address your prompt. Low confidence scores can signal document-related problems (low image quality, hard-to-read handwriting, etc.) or prompt-related problems (vague queries, ambiguity about which data to reference), so these are worth investigating first. Importantly, a high confidence score is not an indication of an answer’s validity; it merely indicates that the model believes it has sufficient context to approach your prompt.
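One common pattern is to use the confidence score to decide which answers to route for manual review. The following is a minimal Python sketch of that idea, assuming a hypothetical response with "answer" and "confidence" fields; the field names and threshold handling are illustrative, not the actual API schema.

    # Hypothetical threshold-based triage of extracted answers.
    # Field names ("answer", "confidence") are illustrative assumptions.

    CONFIDENCE_THRESHOLD = 0.70  # scores at or above ~70% suggest sufficient context

    def triage(response: dict) -> dict:
        """Route an extracted answer based on its confidence score."""
        confidence = response.get("confidence", 0.0)
        if confidence >= CONFIDENCE_THRESHOLD:
            # High confidence: the model believes it had enough context.
            # Still validate the answer itself -- confidence is not correctness.
            return {"answer": response.get("answer"), "needs_review": False}
        # Low confidence: check document quality (resolution, handwriting)
        # and prompt clarity before re-running.
        return {"answer": response.get("answer"), "needs_review": True}

    # Example usage with a mock response
    result = triage({"answer": "2023-04-17", "confidence": 0.82})
    print(result)  # {'answer': '2023-04-17', 'needs_review': False}

Even answers that clear the threshold should still be validated, since a high score only reflects the model's sense of having enough context.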

When the confidence score is high but the answer is wrong, refining the prompt often improves results. See our Prompting Guide for more help with prompting.