Document AI Toolkit

$49

PDF/document parsing pipelines, OCR integration, table extraction, summarization chains, and structured data extraction.

📁 7 files · 🏷 v1.0.0
Python · YAML · TOML · JSON · Markdown · LLM

📁 File Structure 7 files

    document-ai-toolkit/
    ├── LICENSE
    ├── README.md
    ├── config.example.yaml
    ├── pyproject.toml
    └── src/
        └── document_ai_toolkit/
            ├── __init__.py
            ├── core.py
            └── utils.py

📖 Documentation Preview README excerpt

Document AI Toolkit

PDF/document parsing pipelines, OCR integration, table extraction, summarization chains, and structured data extraction.

Contents

  • config.example.yaml
  • pyproject.toml
  • src/document_ai_toolkit/__init__.py
  • src/document_ai_toolkit/core.py
  • src/document_ai_toolkit/utils.py

Quick Start

1. Extract the ZIP archive

2. Review the README and documentation

3. Customize configuration files for your environment

4. Follow the setup guide for your specific use case
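Step 1 can also be done from Python rather than a file manager. The sketch below is only an illustration: the archive name `document-ai-toolkit.zip`, the helper `extract_archive`, and the placeholder file contents are assumptions, since the listing does not name the download. It builds a throwaway archive so that it runs on its own, then unpacks it with the standard-library `zipfile` module.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_archive(archive: Path, dest: Path) -> list[str]:
    """Extract a ZIP archive into dest and return its member names, sorted."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        return sorted(zf.namelist())


# Demo with a throwaway archive. The archive name and the file contents
# are placeholders, not the real download.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "document-ai-toolkit.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("document-ai-toolkit/README.md", "placeholder\n")
        zf.writestr("document-ai-toolkit/config.example.yaml", "placeholder\n")

    members = extract_archive(archive, Path(tmp) / "unpacked")
    readme_exists = (Path(tmp) / "unpacked" / "document-ai-toolkit" / "README.md").is_file()

print(members)
```

In practice you would point `extract_archive` at the real download and a destination directory of your choice.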

Requirements

  • Python 3.10+ (for Python scripts)
  • Relevant CLI tools for your platform
  • Access to your target environment
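The Python 3.10+ requirement can be checked before running anything else. This is a minimal sketch, not part of the toolkit; the names `MIN_PYTHON` and `python_is_supported` are made up for illustration.

```python
import sys

# Minimum interpreter version stated in the requirements list.
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True if the (major, minor) pair meets the minimum version."""
    return tuple(version_info[:2]) >= minimum


if not python_is_supported():
    print(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, "
          f"found {sys.version_info.major}.{sys.version_info.minor}")
```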

License

MIT License — see LICENSE file.

Support

Questions or issues? Email megafolder122122@hotmail.com

---

Part of [Ai Llm Toolkit](https://inity13.github.io/ai-builder-pro/)

📄 Code Sample .py preview

src/document_ai_toolkit/core.py

    """
    Document AI Toolkit — Core Module
    Production-ready implementation.
    """
    from typing import Any, Dict, List, Optional
    from dataclasses import dataclass, field
    from datetime import datetime
    import json
    import logging

    logger = logging.getLogger(__name__)


    @dataclass
    class Config:
        """Configuration for Document AI Toolkit."""
        name: str = "document-ai-toolkit"
        version: str = "1.0.0"
        debug: bool = False
        log_level: str = "INFO"
        output_dir: str = "./output"
        settings: Dict[str, Any] = field(default_factory=dict)

        @classmethod
        def from_file(cls, path: str) -> "Config":
            with open(path) as f:
                data = json.load(f)
            return cls(**data)

        def to_dict(self) -> Dict[str, Any]:
            return {
                "name": self.name,
                "version": self.version,
                "debug": self.debug,
                "log_level": self.log_level,
                "output_dir": self.output_dir,
                "settings": self.settings,
            }


    class DocumentAiToolkit:
        """Main class for Document AI Toolkit."""

        def __init__(self, config: Optional[Config] = None):
            self.config = config or Config()
            self._setup_logging()
            self._results: List[Dict[str, Any]] = []
            logger.info(f"Initialized {self.config.name} v{self.config.version}")

        def _setup_logging(self):
            # ... 40 more lines ...
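As previewed, `Config.from_file` parses its input with `json.load`, so it expects a JSON file, whereas the bundled example config is YAML. The sketch below copies the `Config` dataclass from the preview so that it runs on its own, then round-trips a small JSON file through it. The file name `config.json` and the `ocr_language` setting are assumptions made up for this illustration, not documented options.

```python
import json
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict


@dataclass
class Config:
    """Copy of the Config dataclass shown in the core.py preview."""
    name: str = "document-ai-toolkit"
    version: str = "1.0.0"
    debug: bool = False
    log_level: str = "INFO"
    output_dir: str = "./output"
    settings: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_file(cls, path: str) -> "Config":
        # The preview reads JSON here, not YAML.
        with open(path) as f:
            data = json.load(f)
        return cls(**data)

    def to_dict(self) -> Dict[str, Any]:
        return {
            "name": self.name,
            "version": self.version,
            "debug": self.debug,
            "log_level": self.log_level,
            "output_dir": self.output_dir,
            "settings": self.settings,
        }


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.json"  # hypothetical file name
    # Only the overridden keys are written; the rest fall back to defaults.
    overrides = {"debug": True, "settings": {"ocr_language": "eng"}}  # made-up setting
    path.write_text(json.dumps(overrides))
    cfg = Config.from_file(str(path))

print(cfg.to_dict())
```

Because `from_file` passes the parsed keys straight to the constructor, any key that is not a `Config` field raises a `TypeError`, so a config file should contain only the six field names above.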