A Python 🐍 CLI tool to extract images from PowerPoint, Word or PDF files. Supports PPTX, DOCX, PDF.

SlideSpeak/image-extractor-cli

Image Extractor for PowerPoint, Word and PDF

A CLI tool, written in Python 🐍, to extract images from PowerPoint, Word and PDF files. The script extracts all images in your .pptx, .docx, or .pdf file into a local folder. The benefit of extracting images with this tool rather than taking screenshots is that you get each image at its original, highest available resolution.

Use Cases

  • 1️⃣ Extract images from PowerPoint presentations
  • 2️⃣ Extract images from Word (doc/docx) documents
  • 3️⃣ Extract images from PDF files

Features

  • ⬇️ Extract and download all images within a PowerPoint, Word or PDF
  • πŸ“ Supports all image file types (jpg, png, jp2, gif, tiff, ...)
  • πŸ“‘ Supports extracing images from: PowerPoint (.pptx, .ppt), Word (.docx, .doc) and PDF (.pdf)
  • πŸ“Έ High resolution images: Images are not compressed
  • πŸ“€ Runs locally: Keep your data

Setup

Create a virtual Python env

python3 -m venv env

Activate the virtual env

source env/bin/activate

Install all dependencies using pip

pip3 install -r requirements.txt

Requirements

You need to have 7-Zip installed because, under the hood, unzip is used to unarchive and archive the .pptx files.
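
To illustrate why an unarchiver is involved: .pptx and .docx files are ZIP archives, and embedded images are normally stored under ppt/media/ and word/media/ respectively. The following is a minimal sketch of that idea using Python's standard zipfile module instead of an external unzip binary. It is not the code in image_extractor.py; the function name extract_media and the MEDIA_PREFIXES constant are hypothetical, chosen only for this example.

```python
import zipfile
from pathlib import Path

# Assumed locations of embedded media inside Office Open XML packages.
MEDIA_PREFIXES = ("ppt/media/", "word/media/")


def extract_media(input_path, output_dir):
    """Copy embedded media out of a .pptx/.docx archive without re-encoding.

    Hypothetical helper for illustration; returns the list of written paths.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(input_path) as archive:
        for info in archive.infolist():
            # Skip directory entries and anything outside the media folders.
            if info.is_dir() or not info.filename.startswith(MEDIA_PREFIXES):
                continue
            # Write the raw bytes so the image keeps its original quality.
            target = output_dir / Path(info.filename).name
            target.write_bytes(archive.read(info))
            extracted.append(target)
    return extracted
```

Because the bytes are copied as stored in the archive, the images come out at the resolution they were embedded with. PDF files are not ZIP archives, so this approach would not apply to them.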

Usage

python3 image_extractor.py <INPUT_FILE_PATH>

⚠️ Note: All images of the PowerPoint, PDF or Word document will be extracted to a folder called extracted_images in the same folder as the original document.
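
As a rough illustration of where the output ends up, the folder described in the note can be derived from the input path as sketched below. The helper name output_folder_for is hypothetical and is not taken from image_extractor.py; only the folder name extracted_images comes from the note above.

```python
from pathlib import Path


def output_folder_for(input_file):
    """Return the extracted_images folder next to the given document.

    Hypothetical helper; it only computes the path and creates nothing.
    """
    return Path(input_file).resolve().parent / "extracted_images"


if __name__ == "__main__":
    # Example path is made up for demonstration purposes.
    print(output_folder_for("slides/deck.pptx"))
```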

License

Apache License 2.0: See LICENSE file

Author

Written and maintained by SlideSpeak.co
