update readme
CosmosShadow committed Jun 28, 2024
1 parent 91ffc53 commit 99cdf8e
Showing 2 changed files with 3 additions and 3 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -17,13 +17,13 @@ This package use [GeneralAgent](https://github.com/CosmosShadow/GeneralAgent) li

## Process steps

-1. Use the PyMuPDF library to parse the PDF and extract all non-text rectangular areas (tables, pictures, icons, etc.)
+1. Use the PyMuPDF library to parse the PDF and extract all non-text rectangular areas.
2. Convert all non-text rectangular areas on the PDF into pictures and number them
3. Mark each page of the PDF with a red rectangle and number and save it as an image, similar to the following:

![](docs/demo.jpg)

-4. Based on the picture in step 3, use a large visual model (such as GPT-4o) to parse and get the markdown content (including pictures, tables, formulas, etc.)
+4. Based on the picture in step 3, use a large visual model (such as GPT-4o) to parse and get the markdown content.
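
Steps 1 to 3 of the list in this hunk can be sketched roughly as below. This is an illustration under assumptions, not the package's actual code: the helper names `number_regions` and `annotate_page`, the `min_side` filter, the reading-order sort, numbering from 0, the 150 dpi rendering, and the output file names are all hypothetical. Treating embedded images and vector drawings as the "non-text areas" is also a guess at what the README means. Step 4 (sending the marked page image to a vision model) is left out.

```python
from pathlib import Path

Box = tuple[float, float, float, float]  # (x0, y0, x1, y1) in PDF points


def number_regions(boxes: list[Box], min_side: float = 10.0) -> list[tuple[int, Box]]:
    """Drop boxes narrower or shorter than min_side, sort them top-to-bottom
    then left-to-right, and number them from 0 (all assumed conventions)."""
    kept = [b for b in boxes if b[2] - b[0] >= min_side and b[3] - b[1] >= min_side]
    kept.sort(key=lambda b: (b[1], b[0]))
    return list(enumerate(kept))


def annotate_page(pdf_path: str, page_index: int, out_dir: str) -> list[tuple[int, Box]]:
    """Crop numbered non-text regions of one page and save a marked page image."""
    import fitz  # PyMuPDF; imported here so number_regions works without it

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with fitz.open(pdf_path) as doc:
        page = doc[page_index]
        # Step 1: collect candidate non-text areas (embedded images and vector drawings).
        boxes = [tuple(info["bbox"]) for info in page.get_image_info()]
        boxes += [tuple(d["rect"]) for d in page.get_drawings()]
        regions = number_regions(boxes)
        # Step 2: render each region to its own numbered picture.
        for idx, box in regions:
            pix = page.get_pixmap(clip=fitz.Rect(box), dpi=150)
            pix.save(str(out / f"page{page_index}_region{idx}.png"))
        # Step 3: draw a red rectangle and number on the page, then save the page image.
        for idx, box in regions:
            rect = fitz.Rect(box)
            page.draw_rect(rect, color=(1, 0, 0), width=1)
            label_at = fitz.Point(rect.x0 + 2, rect.y0 + 10)
            page.insert_text(label_at, str(idx), color=(1, 0, 0), fontsize=10)
        page.get_pixmap(dpi=150).save(str(out / f"page{page_index}.png"))
    return regions
```

Calling `annotate_page("paper.pdf", 0, "out")` on a hypothetical `paper.pdf` would write one PNG per detected region plus one marked image of the page. The crops are rendered before the rectangles are drawn so the red marks do not appear inside them.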



2 changes: 1 addition & 1 deletion README_CN.md
@@ -17,7 +17,7 @@

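## Process steps

-1. Use the PyMuPDF library to parse the PDF and extract all non-text rectangular areas (tables, pictures, icons, etc.)
+1. Use the PyMuPDF library to parse the PDF and extract all non-text rectangular areas (including tables, pictures, icons, etc.)
2. Convert all non-text rectangular areas on the PDF into pictures and number them
3. Mark each page of the PDF with red rectangles and numbers and save it as an image, similar to the following: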
