Email Datasets can be found here
Updated Jan 21, 2020 - Python
Fraud Detection by finding the Person of Interest (POI)
A Person Of Interest identifier based on ENRON CORPUS data.
The code and data for "Are Large Pre-Trained Language Models Leaking Your Personal Information?" (Findings of EMNLP '22)
The fraud identification models were built using Python's scikit-learn machine-learning module.
CEREC and Seed corpus for coreference resolution for email threads taken from the Enron Corpus
[Incomplete] A Chrome extension that tells you whether the email you're currently drafting will be classified as spam.
A project on Extract-Transform-Load (ETL) operations performed on the emails from the infamous Enron corpus.
Enron Email Analysis
📩 Modeling the Enron dataset of emails using graphs
Natural Language Processing (NLP) and programmatic data extraction in large scale fraud investigations.
Spam vs. non-spam text classification with a Convolutional Neural Network and word embeddings
📧 A data engineering exercise
Machine learning algorithms applied to explore Enron email dataset and figure out patterns about people involved in the scandal.
This repository contains code for normalizing the Enron dataset.
Identifying and cleaning the outliers of the Enron Dataset.
LT2212 V20 Assignment 3: Same-author classification via feed-forward neural networks: transformed email text (Enron) into a machine-readable representation and built a classifier that determines whether two texts were written by the same person.
Convolutional Neural Network to classify the emails of the Enron dataset
Phishing detection classifier to filter fraudulent and phishing emails.