Automating Email Processing in Outlook: Efficiently Extract Text from PDF Attachments with and without OCR
- Introduction
- Motivation
- Features
- Installation
- Usage
- Excel Configuration
- Screenshots
- Logging
- Support
- Acknowledgments
- License
This project automates email processing in Microsoft Outlook by extracting text from PDF attachments. It supports both native PDF text extraction and OCR-based extraction for scanned documents, reducing manual effort and improving productivity.
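The choice between native extraction and OCR can be sketched as a simple fallback. This is an illustration under stated assumptions, not the project's actual code: `extract_pdf_text`, `min_chars`, and the two extractor callables are hypothetical names, and in practice the callables would wrap whichever PDF and OCR libraries the project relies on.

```python
from typing import Callable, Tuple


def extract_pdf_text(
    pdf_path: str,
    native_extractor: Callable[[str], str],  # hypothetical: reads the embedded text layer
    ocr_extractor: Callable[[str], str],     # hypothetical: renders pages and runs OCR
    min_chars: int = 20,                     # assumed threshold for "has a usable text layer"
) -> Tuple[str, str]:
    """Return (text, method), trying the text layer first and OCR second."""
    text = native_extractor(pdf_path).strip()
    if len(text) >= min_chars:
        return text, "native"
    # Scanned PDFs usually carry little or no text layer, so fall back to OCR.
    return ocr_extractor(pdf_path).strip(), "ocr"
```

Passing the extractors in as callables keeps the fallback logic independent of any particular PDF or OCR library and makes it easy to exercise with stubs.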
Managing emails with attachments in a professional setting can be time-consuming. This project automates the tedious process of sorting and classifying emails, saving time and increasing accuracy for tasks like legal document processing, research workflows, and corporate operations.
- 📄 Extract text from PDFs (including scanned documents using OCR).
- 📥 Automatically sort emails into folders based on keywords.
- ⏲️ Process emails once or at periodic intervals (every 10 minutes).
- 🖥️ Intuitive GUI built with Tkinter.
- 🔍 Comprehensive logging for all operations.
- 🧩 Extensible rules: Configure and load dynamic Excel rules for email sorting.
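Keyword-based sorting can be pictured roughly as follows. This is a minimal sketch, assuming case-insensitive substring matching and a "first matching folder wins" policy; `build_rules` and `pick_folder` are hypothetical names, and the rows mirror the example values from the Excel configuration in this README. Reading the sheet itself (for example with openpyxl or pandas) is left out.

```python
from typing import Dict, List, Optional, Sequence


def build_rules(rows: Sequence[Sequence[Optional[str]]]) -> Dict[str, List[str]]:
    """Map each folder destination to its non-empty keywords, lower-cased."""
    rules: Dict[str, List[str]] = {}
    for folder, *filters in rows:
        if not folder:
            continue  # skip blank rows
        rules[folder] = [str(f).strip().lower() for f in filters if f and str(f).strip()]
    return rules


def pick_folder(text: str, rules: Dict[str, List[str]]) -> Optional[str]:
    """Return the first folder with a keyword that occurs in the text, else None."""
    haystack = text.lower()
    for folder, keywords in rules.items():
        if any(keyword in haystack for keyword in keywords):
            return folder
    return None


# Rows as they might be read from the sheet: destination first, then Filter_1..Filter_5.
rows = [
    ("folder/", "invoice", "payment", "contract", None, None),
    ("folder/sub_folder", "tax", "report", None, None, None),
]
rules = build_rules(rows)
```

With these rules, `pick_folder("Annual TAX statement", rules)` selects `folder/sub_folder`, while text containing none of the keywords yields `None` and the email would stay where it is.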
- Microsoft Outlook installed and configured.
- Python 3.x installed on your system.
- Clone the repository:
  git clone https://github.com/lostmedoulle/Content-Based-Outlook-Email-Automator-Python-Tkinter-.git
  cd your-repository
- Set Up Excel File:
  - Open the outlook_parameters_mailbox.xlsx file.
  - Define your folder destinations and keywords for sorting and processing emails.

  Example Excel Configuration:

  | Folder Destination | Filter_1 | Filter_2 | Filter_3 | Filter_4 | Filter_5 |
  | --- | --- | --- | --- | --- | --- |
  | folder/ | invoice | payment | contract | | |
  | folder/sub_folder | tax | report | | | |
- Launch the Application:
  - Start the GUI by running the following command:
    python Outlook_GUI.py
  - In the GUI interface:
    - Use the Browse button to load your Excel configuration file.
    - View logs and configurations directly in the main tab.
- Processing Modes:
  - "Exécuter une fois" (Run once): Processes all emails once based on the configured rules.
  - "Exécuter toutes les 10 minutes" (Run every 10 minutes): Processes emails repeatedly at 10-minute intervals.
- Close Outlook:
  - To avoid conflicts, the application automatically closes Outlook during processing. Emails are processed in the background.
- Stop Processing:
  - Use the "Arrêter" (Stop) button in the GUI to stop any ongoing processing tasks.
- Exit the Application:
  - Click the "Quitter" (Quit) button to safely close the application.
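The run-once, run-every-10-minutes, and stop behaviours described above can be sketched with a background thread and a `threading.Event`. This is only an illustration of the pattern, not necessarily how `Outlook_GUI.py` implements it; `run_periodically` and the placeholder task are hypothetical.

```python
import threading
from typing import Callable


def run_periodically(
    task: Callable[[], None],
    interval_seconds: float,
    stop_event: threading.Event,
) -> int:
    """Run task immediately, then again after each interval until stopped.

    Returns the number of completed runs.
    """
    runs = 0
    while not stop_event.is_set():
        task()
        runs += 1
        # wait() returns True as soon as the event is set, so a stop request
        # takes effect without sitting out the rest of the interval.
        if stop_event.wait(interval_seconds):
            break
    return runs


if __name__ == "__main__":
    stop = threading.Event()
    worker = threading.Thread(
        target=run_periodically,
        args=(lambda: print("processing mailbox..."), 600, stop),  # 600 s = 10 minutes
        daemon=True,
    )
    worker.start()
    stop.set()     # what a Stop button handler would do
    worker.join()
```

Running the work in a separate thread keeps a Tkinter window responsive, and waiting on the event rather than sleeping lets a stop request interrupt the 10-minute pause right away.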
```bash
├── LICENSE                          # License file for the project
├── Outlook_GUI.py                   # GUI interface script for the application
├── README.md                        # Project documentation
├── company_logo_client.png          # Company logo used in the GUI
├── core_app.py                      # Core logic of the application
├── outlook_parameters_mailbox.xlsx  # Configuration file for mailbox rules
├── outlook_process_log.log          # Log file for email processing
└── requirements.txt                 # List of dependencies
```
Logs are generated for every action, providing transparency and debugging assistance.
Log File: outlook_process_log.log
Log Details:
- Records the status of processed emails.
- Provides details of any errors encountered during execution.
- Includes performance metrics, such as processing times.
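A log with these three kinds of entries could be produced along the following lines. This is a sketch using Python's standard `logging` module; the logger name, the message format, and the `build_logger` and `timed` helpers are assumptions for illustration, not the project's actual configuration.

```python
import logging
import time
from contextlib import contextmanager


def build_logger(log_path: str = "outlook_process_log.log",
                 name: str = "outlook_process") -> logging.Logger:
    """Create (or reuse) a logger that appends to the given log file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


@contextmanager
def timed(logger: logging.Logger, label: str):
    """Log how long the wrapped block took, or the error that interrupted it."""
    start = time.perf_counter()
    try:
        yield
    except Exception:
        # logger.exception records the message plus the full traceback.
        logger.exception("%s failed after %.3f s", label, time.perf_counter() - start)
        raise
    else:
        logger.info("%s processed in %.3f s", label, time.perf_counter() - start)


if __name__ == "__main__":
    log = build_logger()
    with timed(log, "email 'example subject'"):
        pass  # placeholder for the real processing step
```

Wrapping each email in `timed` gives one status line per message with its processing time, and failures are logged with a traceback before the exception propagates.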
- USDC on Ethereum:
0x87358fF28b29E09037C8068260062742CDeAD671
- USDC on Base Chain:
0x87358fF28b29E09037C8068260062742CDeAD671
- SOL:
Gd4ncC2zXuj7ickNHJuHHtAoEKESTYd5FCJzzQwqANWJ