
javierosorio/keep_it_local


Keep it Local: Comparing Domain-Specific LLMs in Native Language and Machine Translation using Parallel Corpora

Authors:

  • Javier Osorio, University of Arizona
  • Sultan Alsarra, King Saud University
  • Amber Converse, University of Arizona
  • Afraa Alshammari, University of Texas - Dallas
  • Dagmar Heintze, University of Texas - Dallas
  • Latifur Khan, University of Texas - Dallas
  • Naif Alatrush, University of Texas - Dallas
  • Patrick T. Brandt, University of Texas - Dallas
  • Vito D'Orazio, West Virginia University
  • Niamat Zawad, University of Texas - Dallas
  • Mahrusa Billah, University of Texas - Dallas

Introduction

This repository contains the replication files for the paper "Keep it Local: Comparing Domain-Specific LLMs in Native Language and Machine Translation using Parallel Corpora."

The repository contains the following folders:

  • 1_data: includes the raw text data in English, Spanish, and Arabic, as well as the annotations.
  • 2_quality_analysis: includes the Python scripts used to generate the translation quality metrics and their corresponding data output.
  • 3_downstream_tasks: includes the Python scripts used to fine-tune the different models on the binary and multi-class classification tasks.
  • 4_analysis: includes the R scripts used to generate the Figures and Tables reported in the paper.
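
As a quick sanity check after cloning, the folder layout listed above can be verified with a short script. This is a minimal sketch, not part of the replication files: the helper name `missing_folders` is hypothetical, and it assumes the script is run from the repository root.

```python
from pathlib import Path

# Top-level folder names as listed in this README.
EXPECTED_FOLDERS = [
    "1_data",
    "2_quality_analysis",
    "3_downstream_tasks",
    "4_analysis",
]


def missing_folders(repo_root):
    """Return the expected top-level folders that are absent under repo_root.

    Hypothetical helper for illustration; it only checks that each
    directory exists, not what it contains.
    """
    root = Path(repo_root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Assumes the current working directory is the repository root.
    missing = missing_folders(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All four replication folders are present.")
```

The numeric prefixes suggest the order in which the folders are meant to be worked through (data, quality analysis, downstream tasks, analysis), so a missing earlier folder is worth resolving before running later scripts.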

Funding:

The research reported herein was supported in part by NSF awards DMS-1737978, DGE-2039542, OAC-1828467, OAC-1931541, OAC-2311142, and DGE-1906630; ONR awards N00014-17-1-2995 and N00014-20-1-2738; and Army Research Office Contract No. W911NF2110032.

References:

[1] Ziemski, M., Junczys-Dowmunt, M., and Pouliquen, B. (2016). The United Nations Parallel Corpus. Language Resources and Evaluation (LREC’16), Portorož, Slovenia, May 2016.
