Feature/custom changes #2095

Closed
wants to merge 7 commits into from
2 changes: 1 addition & 1 deletion .docker/router.yml
@@ -13,4 +13,4 @@ http:
routers:
ollama-router:
rule: "PathPrefix(`/`)"
service: ollama
service: ollama
2 changes: 1 addition & 1 deletion .dockerignore
@@ -9,4 +9,4 @@ local_data
terraform
tests
Dockerfile
Dockerfile.*
Dockerfile.*
2 changes: 1 addition & 1 deletion .github/pull_request_template.md
@@ -34,4 +34,4 @@ Please describe the tests that you ran to verify your changes. Provide instructi
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged and published in downstream modules
- [ ] I ran `make check; make test` to ensure mypy and tests pass
- [ ] I ran `make check; make test` to ensure mypy and tests pass
1 change: 0 additions & 1 deletion .github/workflows/actions/install_dependencies/action.yml
@@ -27,4 +27,3 @@ runs:
- name: Install Dependencies
run: poetry install --extras "ui vector-stores-qdrant" --no-root
shell: bash

4 changes: 2 additions & 2 deletions .github/workflows/fern-check.yml
@@ -6,7 +6,7 @@ on:
- main
paths:
- "fern/**"

jobs:
fern-check:
runs-on: ubuntu-latest
@@ -18,4 +18,4 @@ jobs:
run: npm install -g fern-api

- name: Check Fern API is valid
run: fern check
run: fern check
2 changes: 1 addition & 1 deletion .github/workflows/generate-release.yml
@@ -80,4 +80,4 @@ jobs:

- name: Version output
id: version
run: echo "version=${{ steps.meta.outputs.version }}" >> "$GITHUB_OUTPUT"
run: echo "version=${{ steps.meta.outputs.version }}" >> "$GITHUB_OUTPUT"
6 changes: 3 additions & 3 deletions .github/workflows/publish-docs.yml
@@ -1,8 +1,8 @@
name: publish docs

on:
push:
branches:
on:
push:
branches:
- main
paths:
- "fern/**"
2 changes: 1 addition & 1 deletion .github/workflows/release-please.yml
@@ -16,4 +16,4 @@ jobs:
- uses: google-github-actions/release-please-action@v3
with:
release-type: simple
version-file: version.txt
version-file: version.txt
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -40,4 +40,4 @@ repos:
pass_filenames: false
language: system
types: [python]
stages: [push]
stages: [push]
2 changes: 1 addition & 1 deletion Dockerfile.llamacpp-cpu
@@ -59,4 +59,4 @@ COPY --chown=worker *.yaml ./
COPY --chown=worker scripts/ scripts

USER worker
ENTRYPOINT python -m private_gpt
ENTRYPOINT python -m private_gpt
2 changes: 1 addition & 1 deletion Makefile
@@ -39,7 +39,7 @@ dev-windows:
(set PGPT_PROFILES=local & poetry run python -m uvicorn private_gpt.main:app --reload --port 8001)

dev:
PYTHONUNBUFFERED=1 PGPT_PROFILES=local poetry run python -m uvicorn private_gpt.main:app --reload --port 8001
PYTHONUNBUFFERED=1 PGPT_PROFILES=huglama poetry run python -m uvicorn private_gpt.main:app --reload --port 8001

########################################################################################################################
# Misc
10 changes: 5 additions & 5 deletions README.md
@@ -110,10 +110,10 @@ typing checks, just run `make check` before committing to make sure your code is
Remember to test your code! You'll find a tests folder with helpers, and you can run
tests using `make test` command.

Don't know what to contribute? Here is the public
[Project Board](https://github.com/users/imartinez/projects/3) with several ideas.
Don't know what to contribute? Here is the public
[Project Board](https://github.com/users/imartinez/projects/3) with several ideas.

Head over to Discord
Head over to Discord
#contributors channel and ask for write permissions on that GitHub project.

## 💬 Community
@@ -122,7 +122,7 @@ Join the conversation around PrivateGPT on our:
- [Discord](https://discord.gg/bK6mRVpErU)

## 📖 Citation
If you use PrivateGPT in a paper, check out the [Citation file](CITATION.cff) for the correct citation.
If you use PrivateGPT in a paper, check out the [Citation file](CITATION.cff) for the correct citation.
You can also use the "Cite this repository" button in this repo to get the citation in different formats.

Here are a couple of examples:
@@ -150,7 +150,7 @@ PrivateGPT is actively supported by the teams behind:
* [Fern](https://buildwithfern.com/), providing Documentation and SDKs
* [LlamaIndex](https://www.llamaindex.ai/), providing the base RAG framework and abstractions

This project has been strongly influenced and supported by other amazing projects like
This project has been strongly influenced and supported by other amazing projects like
[LangChain](https://github.com/hwchase17/langchain),
[GPT4All](https://github.com/nomic-ai/gpt4all),
[LlamaCpp](https://github.com/ggerganov/llama.cpp),
2 changes: 1 addition & 1 deletion docker-compose.yaml
@@ -99,4 +99,4 @@ services:
count: 1
capabilities: [gpu]
profiles:
- ollama-cuda
- ollama-cuda
268 changes: 268 additions & 0 deletions experiments/llama_index.ipynb
@@ -0,0 +1,268 @@
{
"cells": [
{
"cell_type": "code",
"id": "initial_id",
"metadata": {
"collapsed": true,
"ExecuteTime": {
"end_time": "2024-09-05T14:03:18.404518Z",
"start_time": "2024-09-05T14:03:18.401535Z"
}
},
"source": "# Read PDF",
"outputs": [],
"execution_count": 33
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2024-09-05T14:39:16.282184Z",
"start_time": "2024-09-05T14:39:16.279186Z"
}
},
"cell_type": "code",
"source": [
"import pdf2image\n",
"import pytesseract\n",
"from pytesseract import Output, TesseractError"
],
"id": "ad1ffea2dcb7dcaf",
"outputs": [],
"execution_count": 74
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2024-09-05T14:39:16.648563Z",
"start_time": "2024-09-05T14:39:16.646494Z"
}
},
"cell_type": "code",
"source": "file = \"../local_data/input_raw/test/26223.pdf\"",
"id": "f38556f4ef09d669",
"outputs": [],
"execution_count": 75
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2024-09-05T14:39:23.492339Z",
"start_time": "2024-09-05T14:39:18.280277Z"
}
},
"cell_type": "code",
"source": "images = pdf2image.convert_from_path(file)",
"id": "67286a6f741debb0",
"outputs": [],
"execution_count": 76
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2024-09-05T14:39:38.005863Z",
"start_time": "2024-09-05T14:39:36.119195Z"
}
},
"cell_type": "code",
"source": [
"pil_im = images[5] # 0-based index: this is the sixth page of the PDF, not the first\n",
"\n",
"# image_to_string returns the recognized text as a plain str\n",
"ocr_text = pytesseract.image_to_string(pil_im, lang=\"rus\")"
],
"id": "ec339f7da13fc37f",
"outputs": [],
"execution_count": 79
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2024-09-05T14:39:49.348914Z",
"start_time": "2024-09-05T14:39:49.344588Z"
}
},
"cell_type": "code",
"source": "print(ocr_text)",
"id": "df86f1ed6f5f1b6e",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"096 отдельный батальон материального обеспечения\n",
"\n",
"1099 мотострелковый полк\n",
"\n",
"| береговая ракетно-артиллерийская бригада\n",
"\n",
"военная автомобильная инспекция\n",
"\n",
"гвардейская отдельная десантно-штурмовая бригада\n",
"гвардейская отдельная инженерная бригада\n",
"отдельная вертолетная эскадрилья\n",
"отдельная танковая бригада\n",
"отдельный медицинский батальон\n",
"отдельный танковый полк\n",
"\n",
"полк радиационной, химической и биологической защиты\n",
"\n",
"смешанный авиационный полк\n",
"средняя общеобразовательная школа\n",
"\n",
"центральный узел контроля безопасности связи\n",
"\n",
"11 экипаж большой подводной лодки\n",
"\n",
"110 военная автомобильная инспекция\n",
"\n",
"110 военное представительство Министерства обороны Российской Федерации\n",
"\n",
"110 отдельная мотострелковая бригада\n",
"110 отдельный стрелковый полк\n",
"\n",
"1101 отдел государственного технического надзора\n",
"\n",
"1102 мотострелковый полк\n",
"1104 мотострелковый полк\n",
"\n",
"144\n",
"\n",
"1105 мотострелковый полк\n",
"\n",
"109 отдельный оптико-электронный узел\n",
"\n",
"11 военная автомобильная инспекция\n",
"\n",
"П главный государственный центр судеоно-медицинских и криминалистических экспертиз Министерства\n",
"\n",
"|| отдельный стрелковый полк\n",
"\n",
"11 центральная база резерва танков\n",
"\n",
"110 объединенное управление эксплуатации специальных объектов\n",
"\n",
"117 зенитный ракетный полк\n",
"\n",
"152|1118 военное представительство Министерства обороны Российской Федерации\n",
"153|1118 отдельный радиолокационный узел\n",
"\n",
"154|112 авиационный полигон\n",
"\n",
"155112 гвардейская ракетная бригада\n",
"\n",
"156| 112 отдельный вертолетный полк\n",
"\n",
"157|112 отдельный стрелковый полк\n",
"\n",
"158|1122 отдельный батальон материального обеспечения\n",
"159|1124 отдельный батальон материального обеспечения\n",
"160|1127 ремонтный завод ракетно-артиллерийского вооружения\n",
"161|113 военная автомобильная инспекция\n",
"\n",
"162|1139 отдельный батальон материального обеспечения\n",
"163|1139 отдельный измерительный пункт\n",
"\n",
"164| 114 бригада\n",
"\n",
"165| 114 военная автомобильная инспекция\n",
"\n",
"166|114 гвардейская отдельная мотострелковая бригада\n",
"\n",
"114 гвардейский мотострелковый полк\n",
"\n",
"168\n",
"\n",
"114 отделение территориальное\n",
"\n",
"169\n",
"\n",
"40 гвардейский артиллерийский полк\n",
"\n",
"170\n",
"\n",
"41 гвардейский артиллерийский полк\n",
"\n",
"171\n",
"\n",
"1142 военное представительство Министерства обороны Российской Федерации\n",
"\n",
"172\n",
"\n",
"1143 отдельный зенитный ракетный дивизион\n",
"\n",
"173\n",
"\n",
"115 военная автомобильная инспекция\n",
"\n",
"174\n",
"\n",
"5 государственный специальный химический арсенал\n",
"\n",
"175\n",
"\n",
"150 радиоэлектронный центр\n",
"\n",
"176\n",
"\n",
"177\n",
"178\n",
"\n",
"1152 мотострелковый полк\n",
"1153 мотострелковый полк\n",
"54 мотострелковый полк\n",
"\n",
"179\n",
"\n",
"1155 центр\n",
"\n",
"180\n",
"\n",
"157 пожарная команда\n",
"\n",
"ТТТ.\n",
"\n",
"181\n",
"\n",
"1158 пожарная команда.\n",
"\n",
"182\n",
"\n",
"1159 военное представительство Министерства обороны Российской Федерации\n",
"\n",
"\n"
]
}
],
"execution_count": 81
},
{
"metadata": {},
"cell_type": "code",
"outputs": [],
"execution_count": null,
"source": "",
"id": "7e909e7dc99c7f4a"
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
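Review note on `experiments/llama_index.ipynb`: the printed OCR text mixes table row numbers with unit names (`153|1118 ...`, a bare `168` on its own line, stray `||` fragments). Below is a minimal sketch of how that string could be split into `(row_number, name)` pairs before indexing. The function name `parse_ocr_rows` and the regular expression are assumptions made for illustration, not part of this PR, and the pattern is inferred only from the sample output shown in the notebook.

```python
import re

# Hypothetical pattern: an optional 1-3 digit table row number followed by
# one or more "|" separators, then the unit name. Inferred from the sample
# output only; other pages may need a different pattern.
ROW_PATTERN = re.compile(r"^(?:(\d{1,3})\s*\|+\s*)?(.+)$")


def parse_ocr_rows(text):
    """Split raw Tesseract output into (row_number, name) tuples.

    row_number is None when the separator was not recognized on that line.
    Lines holding only a row number (e.g. "168") are skipped, since the
    name they belong to cannot be matched reliably.
    """
    rows = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.isdigit():
            continue
        match = ROW_PATTERN.match(line)
        row_number = int(match.group(1)) if match.group(1) else None
        # Drop leftover table-border characters such as "|| ".
        name = match.group(2).lstrip("| ").strip()
        if name:
            rows.append((row_number, name))
    return rows
```

Lines where Tesseract fused the row number into the unit number (for example `155112 гвардейская ракетная бригада`) are passed through unchanged with `row_number` set to `None`, so they would still need manual review.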