
@ZenithClown
Last active August 20, 2024 09:31
A collection of utilities for feature extraction and text cleaning using Unicode translations, regular expressions, natural language processing, large language models, and more.

NLP Utilities


Note

Dear user, to provide a one-stop solution with robust functionality, the content of this repository has been migrated to PyPI. End of Life (EoL) details are available at #1.

Migration guidelines are available at #5 for your reference.

NLPurify is a text cleaning and extraction engine developed using a combination of traditional techniques, such as Unicode translations and cleaning with regular expressions, and modern tools, such as natural language processing and large language models, to detect and clean long texts and create word vectors.
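
As a rough illustration of the traditional techniques mentioned above (Unicode translation followed by regular-expression cleaning), here is a minimal sketch using only the Python standard library. It is not the NLPurify API: the names `clean_text`, `PUNCT_TABLE`, `URL_RE`, and `SPACE_RE` are hypothetical, and the choice of which characters to translate and which patterns to strip is an assumption made for the example.

```python
import re
import unicodedata

# Hypothetical translation table (an assumption for this sketch):
# map common typographic punctuation to plain ASCII equivalents.
PUNCT_TABLE = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2013": "-",   # en dash
    "\u2014": "-",   # em dash
})

# Hypothetical patterns: drop URLs, then collapse runs of whitespace.
URL_RE = re.compile(r"https?://\S+")
SPACE_RE = re.compile(r"\s+")


def clean_text(text: str) -> str:
    """Normalize Unicode, translate punctuation, and strip URLs."""
    # NFKC folds compatibility characters (e.g. non-breaking space).
    text = unicodedata.normalize("NFKC", text)
    # Apply the one-to-one character translations.
    text = text.translate(PUNCT_TABLE)
    # Replace URLs with a space so neighbouring words stay separated.
    text = URL_RE.sub(" ", text)
    # Collapse whitespace and trim the ends.
    return SPACE_RE.sub(" ", text).strip()


if __name__ == "__main__":
    sample = "\u201cHello\u201d\u00a0world \u2013 see https://example.com  now"
    print(clean_text(sample))
```

Running the translation before the regular expressions keeps the patterns simple, since they only need to handle the ASCII forms of quotes, dashes, and spaces.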


A list of the active and deprecated projects that I'm currently working on is available here.
