Commit

prompt should be dict
zRzRzRzRzRzRzR committed Jul 3, 2024
1 parent c27d0c3 commit 9efa386
Showing 3 changed files with 30 additions and 8 deletions.
14 changes: 13 additions & 1 deletion README.md
@@ -57,7 +57,19 @@ See more in [test/test.py](test/test.py)

### parse_pdf

-**Function**: `parse_pdf(pdf_path, output_dir='./', api_key=None, base_url=None, model='gpt-4o', verbose=False, gpt_worker=1)`
+**Function**:
+```
+def parse_pdf(
+    pdf_path: str,
+    output_dir: str = './',
+    prompt: Optional[Dict] = None,
+    api_key: Optional[str] = None,
+    base_url: Optional[str] = None,
+    model: str = 'gpt-4o',
+    verbose: bool = False,
+    gpt_worker: int = 1
+) -> Tuple[str, List[str]]:
+```

Parses a PDF file into a Markdown file and returns the Markdown content along with all image paths.

18 changes: 14 additions & 4 deletions README_CN.md
@@ -25,8 +25,7 @@

## Examples

-For the
-PDF, see [examples/attention_is_all_you_need/output.md](examples/attention_is_all_you_need/output.md) [examples/attention_is_all_you_need.pdf](examples/attention_is_all_you_need.pdf)
+For the PDF, see [examples/attention_is_all_you_need/output.md](examples/attention_is_all_you_need/output.md) [examples/attention_is_all_you_need.pdf](examples/attention_is_all_you_need.pdf)

## Installation

@@ -50,8 +49,19 @@ print(content)

### parse_pdf

-**Function
-**`parse_pdf(pdf_path, output_dir='./', api_key=None, base_url=None, model='gpt-4o', verbose=False, gpt_worker=1)`
+**Function**
+```
+def parse_pdf(
+    pdf_path: str,
+    output_dir: str = './',
+    prompt: Optional[Dict] = None,
+    api_key: Optional[str] = None,
+    base_url: Optional[str] = None,
+    model: str = 'gpt-4o',
+    verbose: bool = False,
+    gpt_worker: int = 1
+) -> Tuple[str, List[str]]:
+```

Parses a PDF file into a Markdown file and returns the Markdown content along with a list of all image paths.

6 changes: 3 additions & 3 deletions gptpdf/parse.py
@@ -1,5 +1,5 @@
import os
-from typing import List, Tuple, Optional
+from typing import List, Tuple, Optional, Dict
import logging

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
@@ -168,7 +168,7 @@ def _parse_pdf_to_images(pdf_path: str, output_dir: str = './') -> List[Tuple[st

def _gpt_parse_images(
    image_infos: List[Tuple[str, List[str]]],
-    prompt: Optional[str],
+    prompt: Optional[Dict] = None,
    output_dir: str = './',
    api_key: Optional[str] = None,
    base_url: Optional[str] = None,
@@ -224,7 +224,7 @@ def _process_page(index: int, image_info: Tuple[str, List[str]]) -> Tuple[int, s
def parse_pdf(
    pdf_path: str,
    output_dir: str = './',
-    prompt: Optional[str] = None,
+    prompt: Optional[Dict] = None,
    api_key: Optional[str] = None,
    base_url: Optional[str] = None,
    model: str = 'gpt-4o',
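
The type change above means callers that used to pass a plain string for `prompt` would now need to pass a dict (or `None`). The following is a minimal sketch of how a caller might adapt an old-style value before handing it to `parse_pdf`. The helper `normalize_prompt` and the key name `"prompt"` are hypothetical and chosen only for illustration; this commit does not show which keys the library reads from the dict.

```python
from typing import Dict, Optional, Union

# Hypothetical key name, used for illustration only. The commit does not
# show which keys parse_pdf expects inside the prompt dict.
DEFAULT_KEY = "prompt"


def normalize_prompt(
    prompt: Union[None, str, Dict[str, str]]
) -> Optional[Dict[str, str]]:
    """Return a value that fits a `prompt: Optional[Dict]` parameter.

    None stays None, a legacy plain string is wrapped under DEFAULT_KEY,
    and a dict is shallow-copied after checking that keys and values are strings.
    """
    if prompt is None:
        return None
    if isinstance(prompt, str):
        return {DEFAULT_KEY: prompt}
    if isinstance(prompt, dict):
        for key, value in prompt.items():
            if not isinstance(key, str) or not isinstance(value, str):
                raise TypeError(f"prompt entry {key!r} must map a str to a str")
        return dict(prompt)
    raise TypeError(
        f"prompt must be None, str, or dict, got {type(prompt).__name__}"
    )
```

A caller could then write something like `parse_pdf(pdf_path, prompt=normalize_prompt(old_value))`, assuming the chosen key matches what the library actually reads.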