

Who we are.
Digital Originators
We have a robust repository of primary source content that is not available on the web. Our specialized digitization of private and public archives covers newspapers, city directories, yearbooks, books, historic documents, maps, and more.
Content Aggregators
We offer a substantial collection of public domain and copyrighted works for building language datasets. We bundle curated and formatted text from digitized rare collections.
Licensing Intermediaries
We initiate conversations between content owners and data purchasers to establish licensing rights.

What we do.
Raw Scans (TIFF & JPEG): High-resolution imagery for historical preservation and archival research. Industrial-grade images are scanned at 300-400 DPI and processed with advanced Optical Character Recognition (OCR).
Structured Text (JSON/XML): Formatted text for LLM training, including metadata like author, year, and genre.
Custom Syndication: Collection bundles organized by subject or date.
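
To illustrate the Structured Text format, here is a minimal sketch of what a single JSON record might look like. The field names (`text`, `metadata`, `author`, `year`, `genre`), the `make_record` helper, and the sample values are all hypothetical placeholders, not a published schema; actual deliveries may be organized differently.

```python
import json

# Hypothetical record layout: the field names below are illustrative
# assumptions, not a published schema.
def make_record(text, author, year, genre):
    """Bundle a passage of digitized text with its descriptive metadata."""
    return {
        "text": text,
        "metadata": {"author": author, "year": year, "genre": genre},
    }

# Placeholder values only; they do not describe a real item in any collection.
record = make_record(
    text="Placeholder passage from a digitized source.",
    author="Unknown",
    year=1900,
    genre="newspaper",
)

# Serialize to a single line, so many records can be stored one per line.
line = json.dumps(record, ensure_ascii=False)
print(line)
```

Keeping the metadata in its own nested object separates the training text from its descriptive fields, which can make it easier to filter a bundle by subject or date.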