Stars
Inference slice of marian for bergamot's tiny11 models. Faster to compile and easier to wield. Supports fewer model architectures than bergamot-translator.
Integrate cutting-edge LLM technology quickly and easily into your apps
CNN-based audio segmentation toolkit. Detects speech, music, noise, and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.
NusaWrites is an in-depth analysis of corpora collection strategy and a comprehensive language modeling benchmark for underrepresented and extremely low-resource Indonesian local languages.
The central repo for Creole-based NLU and NLG work
This add-on implements a speech synthesizer driver for NVDA using neural TTS models. It supports Piper
Code to create a database with cleaned up Wiktionary data and then to create ebook dictionaries based on this data.
Java JNI wrapper for SentencePiece: unsupervised text tokenizer for Neural Network-based text generation.
Build a chatbot or Q&A bot of your website's content
Spam Numbero and make your city the most dangerous place on Earth
A collaborative project to collect datasets in Indonesian languages.
High-quality parallel resource on sentiment analysis for 10 low-resource Indonesian languages, English, and Indonesian (Outstanding Paper at EACL 2023)
Python source code for EMNLP 2020 paper "Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT".
This is the repository of the EMNLP 2021 paper "BERT, mBERT, or BiBERT? A Study on Contextualized Embeddings for Neural Machine Translation".
Multilingual Speech Recognition for Indonesian Languages
AUSTENDER OCDS Search API. This portal will provide users of AusTender data with documentation, code examples, bug notifications and feature requests.
A platform for creating interactive data visualizations
The SQL to IATI Database repository contains all of the SQL scripts that are required to build DFID’s IATIv203 database, which is used by DFID to transform their internal data into IATI v2.03 stand…
Web application showcasing the data of the sireneLD graph