gptpdf

Uses a visual large language model (VLLM), such as GPT-4o, to parse PDFs into Markdown.

Our approach is very simple (only 293 lines of code), yet it parses typography, math formulas, tables, pictures, charts, and more almost perfectly.

Average cost per page: $0.013
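For budgeting, the average above can be turned into a rough estimate. This is only a sketch: the helper name `estimate_cost` is hypothetical and not part of gptpdf, and real costs will vary with page content and model pricing.

```python
COST_PER_PAGE_USD = 0.013  # average quoted above; actual cost varies by page


def estimate_cost(num_pages: int, cost_per_page: float = COST_PER_PAGE_USD) -> float:
    """Rough cost estimate in US dollars, rounded to cents."""
    if num_pages < 0:
        raise ValueError("num_pages must be non-negative")
    return round(num_pages * cost_per_page, 2)


print(estimate_cost(100))  # 100 pages at the average rate -> 1.3
```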

This package uses the GeneralAgent library to interact with the OpenAI API.

Process steps

Use the PyMuPDF library to parse the PDF, find all non-text areas, and mark them.

Use a large visual model (such as GPT-4o) to parse the marked-up pages and produce a Markdown file.
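The first step can be sketched roughly as follows. This is an illustration under stated assumptions, not gptpdf's actual implementation: the helpers `merge_rects` and `find_non_text_rects` are hypothetical names, and the choice to collect vector drawings and images and then merge nearby boxes is one plausible way to locate non-text areas. Only `fitz.open`, `Page.get_drawings`, and `Page.get_image_info` are PyMuPDF calls.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in PDF points


def _touches(a: Rect, b: Rect, gap: float) -> bool:
    """True if two rectangles overlap or lie within `gap` points of each other."""
    return not (a[2] + gap < b[0] or b[2] + gap < a[0] or
                a[3] + gap < b[1] or b[3] + gap < a[1])


def merge_rects(rects: List[Rect], gap: float = 0.0) -> List[Rect]:
    """Repeatedly union rectangles that touch until no more merges happen."""
    merged = list(rects)
    changed = True
    while changed:
        changed = False
        result: List[Rect] = []
        for r in merged:
            for i, m in enumerate(result):
                if _touches(r, m, gap):
                    # Replace the existing box with the union of both boxes
                    result[i] = (min(r[0], m[0]), min(r[1], m[1]),
                                 max(r[2], m[2]), max(r[3], m[3]))
                    changed = True
                    break
            else:
                result.append(r)
        merged = result
    return merged


def find_non_text_rects(pdf_path: str, page_index: int = 0, gap: float = 5.0) -> List[Rect]:
    """Collect drawing and image boxes on one page and merge neighbours (assumed heuristic)."""
    import fitz  # PyMuPDF; imported lazily so merge_rects works without it

    with fitz.open(pdf_path) as doc:
        page = doc[page_index]
        rects = [tuple(d["rect"]) for d in page.get_drawings()]
        rects += [tuple(info["bbox"]) for info in page.get_image_info()]
    return merge_rects(rects, gap)
```

Merging matters because a single chart or table is usually made of many small drawing primitives; without it, each line segment would be marked as its own region.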

DEMO

See examples/attention_is_all_you_need/output.md for the result of parsing examples/attention_is_all_you_need.pdf.

Installation

pip install gptpdf

Usage

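from gptpdf import parse_pdf

# Path to the PDF to convert (here, the example file from the DEMO section)
pdf_path = 'examples/attention_is_all_you_need.pdf'
api_key = 'Your OpenAI API Key'
# Returns the Markdown text and the paths of the extracted images
content, image_paths = parse_pdf(pdf_path, api_key=api_key)
print(content)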

See test/test.py for more examples.

API

parse_pdf(pdf_path, output_dir='./', api_key=None, base_url=None, model='gpt-4o', verbose=False)

Parses a PDF file into a Markdown file and returns the Markdown content and all image paths.

pdf_path: path to the PDF file.
output_dir: output directory where all images and the Markdown file are stored.
api_key: OpenAI API key (optional). If not provided, the OPENAI_API_KEY environment variable is used.
base_url: OpenAI base URL (optional). If not provided, the OPENAI_BASE_URL environment variable is used.
model: the large vision model to use; default is 'gpt-4o'. You can also use qwen-vl-max or GLM-4V by changing OPENAI_API_BASE or specifying base_url.
verbose: enable verbose mode.
gpt_worker: number of GPT parsing workers; default is 1. If your machine performs well, you can increase it to speed up parsing.
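
The parameters above can be combined as in the following sketch. The helpers `resolve_openai_settings` and `convert` are hypothetical and not part of gptpdf. The first only mirrors the documented fallback from explicit arguments to environment variables. The second assumes `parse_pdf` accepts the keyword arguments exactly as listed here, including `gpt_worker`.

```python
import os
from typing import Optional, Tuple


def resolve_openai_settings(api_key: Optional[str] = None,
                            base_url: Optional[str] = None
                            ) -> Tuple[Optional[str], Optional[str]]:
    """Explicit arguments win; otherwise fall back to the documented environment variables."""
    return (api_key or os.environ.get("OPENAI_API_KEY"),
            base_url or os.environ.get("OPENAI_BASE_URL"))


def convert(pdf_path: str, output_dir: str = "./", workers: int = 2):
    """Hypothetical wrapper: parse one PDF with several workers and verbose logging."""
    from gptpdf import parse_pdf  # imported lazily; requires `pip install gptpdf`

    api_key, base_url = resolve_openai_settings()
    # Returns (markdown_content, image_paths), as described above
    return parse_pdf(pdf_path, output_dir=output_dir, api_key=api_key,
                     base_url=base_url, model="gpt-4o", verbose=True,
                     gpt_worker=workers)
```

Calling `convert('examples/attention_is_all_you_need.pdf')` would then write the Markdown file and images to the output directory, provided a valid API key is available.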
