Language Identification with Support for More Than 2000 Labels -- EMNLP 2023
Updated Nov 28, 2024 - Python
fastlangid, the only language identification package that supports Cantonese (zh-yue), Simplified Chinese (zh-hans), and Traditional Chinese (zh-hant)
Stand-alone Language Identification for Node.js JavaScript based on FastText
Language identification script that can detect the language of a given text. Currently supports Swahili, Wolof, French, English, Arabic, and Dyula. Customizable language support.
MaskLID: Code-Switching Language Identification through Iterative Masking -- ACL 2024
An NLP project leveraging character trigrams and smoothing techniques (Lidstone, Linear Discounting, Absolute Discounting) for language identification. Trained on Spanish, Italian, English, French, Dutch, and German, achieving 99.8932% accuracy. Includes datasets, model parameters, and comprehensive documentation.
A small and fast language identification model powered by fastText
Language Identification using Naïve Bayes
A Django-based web app where users can download machine learning projects or run them directly on the website
A method for identifying the language of a given text
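
One of the entries above mentions character trigrams with Lidstone smoothing. As a rough illustration of that general idea (not the code of any listed repository), the sketch below scores a text against per-language trigram counts. The function names, the toy training sentences, the "en"/"es" labels, the padding scheme, and the smoothing constant of 0.5 are all assumptions made for this example.

```python
import math
from collections import Counter

# Toy training data, invented for illustration only.
TOY_SAMPLES = {
    "en": ["the quick brown fox", "this is a test of the system"],
    "es": ["el zorro marrón rápido", "esto es una prueba del sistema"],
}

def trigrams(text):
    """Return the overlapping character trigrams of a lowercased, space-padded text."""
    padded = f"  {text.lower()} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

def train(samples):
    """Build one trigram Counter per language from {language: [sentences]}."""
    return {
        lang: Counter(t for sentence in sentences for t in trigrams(sentence))
        for lang, sentences in samples.items()
    }

def log_prob(text, counts, vocab_size, lam=0.5):
    """Lidstone-smoothed log-likelihood: (count + lam) / (total + lam * vocab_size)."""
    total = sum(counts.values())
    return sum(
        math.log((counts[t] + lam) / (total + lam * vocab_size))
        for t in trigrams(text)
    )

def identify(text, models, lam=0.5):
    """Return the language whose trigram model gives the text the highest score."""
    # Assumed vocabulary: every trigram seen in training, plus one slot for unseen ones.
    vocab_size = len(set().union(*models.values())) + 1
    return max(models, key=lambda lang: log_prob(text, models[lang], vocab_size, lam))

if __name__ == "__main__":
    models = train(TOY_SAMPLES)
    print(identify("this is the system", models))
```

With realistic corpora, the smoothing constant and the vocabulary size would normally be tuned on held-out data rather than fixed as they are here.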