deepdoctection - A Document AI Package

deepdoctection is a Python library that orchestrates document extraction and document layout analysis tasks using deep learning models. It does not implement models itself; instead, it lets you build pipelines from well-established libraries for object detection, OCR and selected NLP tasks, and it provides an integrated framework for fine-tuning, evaluating and running models.
The pipeline in this demo consists of a stack of models powered by, for example, Detectron2 for layout analysis and Table Transformers for table recognition. OCR is included as well. You can process an image or a PDF document; for a PDF, up to nine pages are processed.
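
The nine-page limit can be pictured as a simple cap on the page stream before each page is passed to the pipeline. The following is only a minimal sketch of that idea, not the code behind this demo: `process_document`, `process_page` and `MAX_PAGES` are hypothetical names chosen for illustration, and no deepdoctection API is used.

```python
from itertools import islice
from typing import Callable, Iterable, List, TypeVar

Page = TypeVar("Page")
Result = TypeVar("Result")

# Assumption: mirrors the "up to nine pages" limit stated for this demo.
MAX_PAGES = 9


def process_document(
    pages: Iterable[Page],
    process_page: Callable[[Page], Result],
    max_pages: int = MAX_PAGES,
) -> List[Result]:
    """Run a per-page callback on at most `max_pages` pages.

    `process_page` is a hypothetical stand-in for whatever does the real
    work on a page (layout analysis, table recognition, OCR).
    """
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    # islice stops consuming the iterable once the cap is reached, so the
    # remaining pages of a long PDF are never loaded or processed.
    return [process_page(page) for page in islice(pages, max_pages)]


if __name__ == "__main__":
    # A 20-page "document" represented by page labels only.
    labels = (f"page-{i}" for i in range(1, 21))
    results = process_document(labels, str.upper)
    print(len(results))
```

Capping the input lazily, rather than processing everything and truncating the result, keeps the cost of a long PDF bounded by the limit.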

Please note: The models for layout detection and some of the pipeline components showcased in this space are not open source. When you start using deepdoctection, you will get models that have been trained on less diverse data and that will perform worse. OCR is not open source either: it uses AWS Textract, which is a commercial service. Keep this in mind before you install deepdoctection and observe disappointing results. Thanks.

https://github.com/deepdoctection/deepdoctection

Upload a document and choose settings

Original Image

Examples

Number of pages in multi-page PDF

Will stop after 9 pages

Outputs

Contiguous text

Layout sections

Image Gallery

Table as HTML