Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/
Updated Aug 26, 2021 - Python
Large-scale pretraining for dialogue
Large-scale pretrained models for goal-directed dialog
Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CASL project: http://casl-project.ai/
Forte is a flexible and powerful ML workflow builder. This is part of the CASL project: http://casl-project.ai/
Conversational Toolkit. An Open-Source Toolkit for Fast Development and Fair Evaluation of Text Generation
Cleans Reddit Text Data 📜 🧹
Tools to uniformly read in text data including semi-structured transcripts
Tools for reshaping text data
A Python library for keyword extraction from any text using the RAKE (Rapid Automatic Keyword Extraction) algorithm.
Question classification on the CogComp QC dataset (http://cogcomp.org/Data/QA/QC/).
Visualize large text collections with WebGL
Presents an optimized Apache Beam pipeline for generating sentence embeddings (runnable on Cloud Dataflow).
Scrape EDGAR filings from https://www.sec.gov/
Old book pages (with groundtruth), formerly used for OCR studies. There are several versions of the set (concerning resolution and binarization). Noised and denoised sets (done by several methods) are eventually going to be uploaded.
How Will Your Tweet Be Received? Predicting the Sentiment Polarity of Tweet Replies
A dataset containing 30k+ so-called "self-help" tweets from 100+ authors.
Directional Co-clustering with a Conscience (DCC)
A machine learning model that predicts tags for a given question and body.