Amharic language model and word vectors to aid NLP tasks and advance research for low-resource languages.

kidist-amde/amhric-language-models

Project Description

  • The project's main focus is to create word vectors and language models for Amharic, a Semitic language and the official language of Ethiopia. Amharic is the second most spoken Semitic language in the world, after Arabic.

  • The ultimate objective is to use these resources for downstream tasks such as classification, Named Entity Recognition, and language generation. The aim is to build language models and word vectors that capture the semantic and syntactic relationships between Amharic words. Such resources should aid natural language processing for Amharic and contribute to NLP research for low-resource languages.
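
To make "semantic connections between words" concrete, here is a minimal sketch of how word vectors are commonly compared using cosine similarity. It is not this project's code: the three-dimensional vectors are invented for illustration, and `cosine_similarity`, `most_similar`, and `toy_vectors` are hypothetical names. Vectors from a trained model would typically have far more dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy vectors, invented for illustration only;
# they do not come from any trained model.
toy_vectors = {
    "ንጉሥ": [0.9, 0.8, 0.1],   # "king"
    "ንግሥት": [0.8, 0.9, 0.1],  # "queen"
    "ውሃ": [0.1, 0.0, 0.9],     # "water"
}

def most_similar(word, vectors):
    """Return the other word whose vector is closest to `word`."""
    return max(
        (w for w in vectors if w != word),
        key=lambda w: cosine_similarity(vectors[word], vectors[w]),
    )

print(most_similar("ንጉሥ", toy_vectors))
```

With these made-up values, the vector for "king" lies much closer to "queen" than to "water", which is the kind of relationship good word vectors are expected to capture.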

Amharic word2vec (link to the TensorBoard)

Train plot
