Local-First Document Structure

Document Structure Before AI Inference

Sourcetrace is Lumen & Lever’s local-first document-structure layer for AI pipelines where privacy, layout, tables, source evidence, and diagnostics matter.

It exists for organisations that need to know what survived extraction before documents are chunked, embedded, retrieved, or passed to a model.

The Sourcetrace Layer

The Source Structure Has to Survive

Most AI document workflows convert files into text, split the text into chunks, embed the chunks, and ask a model to reason over the result.

That sequence is useful only if the source structure survives extraction.

Sourcetrace is designed to preserve and expose the evidence that ordinary text extraction often loses.
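A minimal sketch of the problem described above, assuming a small made-up table and a naive word-count chunker. Every name here (`table`, `flatten`, `chunk_words`, `rows_as_records`) is hypothetical and chosen for illustration. None of it is the Sourcetrace, rtfstruct, or pdfstruct API.

```python
# Hypothetical illustration only; not the Sourcetrace, rtfstruct, or pdfstruct API.

# A small table held as structure: the header stays associated with each row.
table = {
    "header": ["Account", "Q1", "Q2"],
    "rows": [
        ["Alpha", "120", "135"],
        ["Beta", "98", "101"],
    ],
}


def flatten(table):
    """Mimic plain text extraction: cells joined in reading order, structure dropped."""
    cells = table["header"] + [cell for row in table["rows"] for cell in row]
    return " ".join(cells)


def chunk_words(text, size):
    """Naive fixed-size chunking by word count."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def rows_as_records(table):
    """Structure-aware alternative: each row carries its own column labels."""
    return [dict(zip(table["header"], row)) for row in table["rows"]]


flat_chunks = chunk_words(flatten(table), 4)
records = rows_as_records(table)

for chunk in flat_chunks:
    print("chunk: ", chunk)
for record in records:
    print("record:", record)
```

In this sketch the second flat chunk contains only cell values from two different rows and no column labels, so a model reading it cannot tell which figure belongs to which account or quarter. The record form keeps that association because the structure was still available when the rows were split.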

01

Sourcetrace RTF

Read RTF as structure, not just text.

Sourcetrace RTF is powered by rtfstruct, a free, open-source Python parser that converts Rich Text Format (RTF) into a structured document AST.

It preserves paragraphs, inline styles, lists, tables, links, fields, annotations, notes, images, metadata, diagnostics, and source evidence where available.

Apache-2.0
Free open source

View Sourcetrace RTF

02

Sourcetrace PDF

Local-first PDF structure extraction.

Sourcetrace PDF is powered by pdfstruct, a source-available PDF extraction layer that converts born-digital PDFs into traceable, layout-aware JSON and Markdown.

It preserves page evidence, text positions, reading-order candidates, tables, annotations, metadata, and diagnostics where available.

Source-available
Free local evaluation and non-production use

View Sourcetrace PDF

Commercial Use

Sourcetrace Sits Under the Lumen & Lever Control Model

Sourcetrace is not a generic document-conversion tool.

It exists to support AI control work where document structure, privacy, layout, tables, diagnostics, and source traceability matter.

Sourcetrace RTF is open source because it is a credibility and interoperability layer.

Sourcetrace PDF is source-available because PDF extraction for commercial, legal, financial, and regulated workflows requires ongoing maintenance, support, and licensing clarity.

For organisations using Sourcetrace in production AI pipelines, Lumen & Lever offers document-ingestion review, commercial licensing, custom extractor packs, and broader structural AI architecture work.

View Document Structure Review
Discuss Commercial Use