attrs
pytesseract
opencv-python
beautifulsoup4
google-cloud-vision
boto3
pdf2image
numpy
transformers
tqdm
pandas
python-dotenv
cloudpathlib
farm-haystack
layoutparser
Pillow
torchvision
detectron2@git+https://github.com/facebookresearch/detectron2.git@v0.5#egg=detectron2