DOCUMENT SURFACE · documents.doloop.io

Your AI invented a number
while it was reading.

This is the surface for pulling data out of documents. It runs a deterministic donkey that extracts the tables from a PDF and ties every cell back to the spot on the page it came from. Nothing is guessed, so you can prove the origin of each number.

What it catches → Try it on a PDF

What it catches

When an AI reads a document and gets bored, it fills the gaps. The donkey does not.

INVENTED

Numbers that were never there

Ask a model to read a statement and it will confidently return figures that do not appear in the document. It fabricates to finish rather than admit it could not find them.

MISREAD

The wrong cell

A total lands in the wrong row, a column shifts, a footnote merges into a value. The number is real but attached to the wrong thing, which is just as wrong.

UNPROVABLE

No way to check it

Even a correct extraction is useless to an audit team if you cannot point at where it came from. No provenance means no sign-off.

The donkey here

WYSIWYD. Deterministic: same PDF in, the same numbers out, every time. Live today.

WYSIWYD

What you see is what you download

Why: a model will invent a number to finish the job; a script cannot.
How: it detects the table structure on the page, extracts each cell mechanically, and ties every value back to its place on the page. Output was 100% reproducible across 90 extractions, with 0 errors on 2,332 cells. Try it on your own PDF.
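
To make "every value tied to its place on the page" concrete, here is a minimal sketch of what a sourced cell could look like. The Cell type, its field names, and sourced_count are hypothetical, chosen for illustration only; they are not the donkey's actual output format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    """One extracted table cell plus where it was found (hypothetical shape)."""
    text: str
    page: Optional[int]  # 1-based page number, None if no source was found
    box: Optional[int]   # id of the box on that page, None if no source was found


def sourced_count(cells):
    """Count the cells that carry both a page and a box."""
    return sum(1 for c in cells if c.page is not None and c.box is not None)


cells = [
    Cell("1,240.00", 1, 84),
    Cell("99.20", 1, 91),
    Cell("1,339.20", 1, 97),
]

summary = f"sourced: {sourced_count(cells)} / {len(cells)}"
print(summary)
```

A cell with no page or box would lower the sourced count, which is the signal an audit team would look for.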

See it work

An invoice in, a sourced table out. Every cell knows where it came from.

$ check invoice.pdf

  verdict: PASS      cells: 162      sourced: 162 / 162

  every value traced to a box on the page:
    "Subtotal  1,240.00"   page 1, box 84    ✓
    "Tax         99.20"    page 1, box 91    ✓
    "Total     1,339.20"   page 1, box 97    ✓   (= subtotal + tax)

  run it again on the same PDF: byte-identical output.

Same PDF in, the same numbers out, every time, with a source for each.
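
The two checks in the sample run, the total adding up and the output being byte-identical, can be sketched on the consumer side as follows. This is an illustration under assumptions: the row dictionaries, parse_amount, and fingerprint are hypothetical names, and the second fingerprint is computed on a copy of the same rows to stand in for a second extraction run.

```python
import hashlib
import json
from decimal import Decimal


def parse_amount(text):
    """Turn a printed amount such as '1,339.20' into an exact Decimal."""
    return Decimal(text.replace(",", ""))


def fingerprint(rows):
    """Hash a canonical serialization so two runs can be compared byte for byte."""
    payload = json.dumps(rows, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Values and boxes taken from the sample run above.
rows = [
    {"label": "Subtotal", "value": "1,240.00", "page": 1, "box": 84},
    {"label": "Tax", "value": "99.20", "page": 1, "box": 91},
    {"label": "Total", "value": "1,339.20", "page": 1, "box": 97},
]

by_label = {r["label"]: parse_amount(r["value"]) for r in rows}
total_ok = by_label["Subtotal"] + by_label["Tax"] == by_label["Total"]

# Stand-in for a second run: in practice this would be a fresh extraction.
second_run = [dict(r) for r in rows]
same_bytes = fingerprint(rows) == fingerprint(second_run)

print(total_ok, same_bytes)
```

Decimal is used instead of float so that currency amounts compare exactly.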

Two ways to use it

Call the donkey on a file, or run the surface inside a machine that remembers. The difference is state.

CALL IT DIRECTLY

The donkey, standalone

Send a PDF, get the sourced table back. Stateless: same file in, the same numbers out, nothing remembered. Try it free right now, no sign-up.

THROUGH THE MACHINE

Stateful and remembered

Connect your own AI to the doloop machine in document mode and the donkey runs inside the loop: your AI reads, the donkey ties every number to the page and rejects anything it cannot source, and only verified data ships. The machine learns your document templates.
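
The "rejects anything it cannot source" step can be pictured as a filter between what the AI proposes and what ships. The sketch below is an assumption about the idea, not the machine's implementation: verify, normalize, and the tuple shapes are hypothetical.

```python
from decimal import Decimal


def normalize(text):
    """Compare amounts by numeric value, ignoring thousands separators."""
    return Decimal(text.replace(",", "").strip())


def verify(proposed, sourced):
    """Split proposed (label, value) pairs into accepted and rejected.

    `sourced` holds (value, page, box) tuples extracted from the page.
    A proposed value is accepted only if the same number exists on the page.
    """
    index = {}
    for value, page, box in sourced:
        index.setdefault(normalize(value), (page, box))

    accepted, rejected = [], []
    for label, value in proposed:
        where = index.get(normalize(value))
        if where is None:
            rejected.append((label, value))
        else:
            accepted.append((label, value, where))
    return accepted, rejected


sourced = [("1,240.00", 1, 84), ("99.20", 1, 91), ("1,339.20", 1, 97)]
proposed = [("Total", "1,339.20"), ("Discount", "50.00")]

accepted, rejected = verify(proposed, sourced)
print(accepted)
print(rejected)
```

Matching on value alone only catches invented numbers. Catching a real number attached to the wrong cell would also require comparing the row and column it was sourced from.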

Want this on your document pipeline? Talk to us, or see the other surfaces.