View BramVanroy's full-sized avatar

Highlights

  • Pro

Organizations

@lt3 @CCL-KULeuven


"Galahad". Goal: enable linguists to experiment with different taggers and use the result in other INT products

Kotlin 1 Updated Sep 24, 2024

Lightning-fast serving engine for AI models. Flexible. Easy. Enterprise-scale.

Python 2,204 134 Updated Oct 4, 2024
Jupyter Notebook 23 Updated Aug 13, 2023

High-quality datasets, tools, and concepts for LLM fine-tuning.

1,752 166 Updated Aug 18, 2024

Scalable data preprocessing and curation toolkit for LLMs

Jupyter Notebook 487 60 Updated Oct 3, 2024

Retrieval Augmented Generation (RAG) chatbot powered by Weaviate

TypeScript 6,090 651 Updated Sep 23, 2024

A bagel, with everything.

Python 307 31 Updated Apr 11, 2024

Evaluation of language models on mono- or multilingual tasks.

Python 71 13 Updated Aug 9, 2024
Python 432 43 Updated Oct 1, 2024

Training LLMs with QLoRA + FSDP

Jupyter Notebook 1,394 187 Updated Sep 23, 2024

Using open source LLMs to build synthetic datasets for direct preference optimization

Jupyter Notebook 35 5 Updated Feb 29, 2024

They Are Billions Save Automation Tool

C# 19 4 Updated Jun 28, 2023

Implementation of Nougat Neural Optical Understanding for Academic Documents

Python 8,831 561 Updated Apr 16, 2024

Compute complexity metrics from Universal Dependencies

Python 2 Updated Mar 7, 2022

Measure the readability of a given text using surface characteristics

Python 71 17 Updated Dec 15, 2022

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

Python 1,465 113 Updated Oct 4, 2024

Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"

Python 283 23 Updated Dec 20, 2023

GEITje 7B: a large open Dutch language model

Python 114 1 Updated Feb 7, 2024

Code for Multilingual Eval of Generative AI paper published at EMNLP 2023

Jupyter Notebook 63 6 Updated Mar 6, 2024

Multilingual Large Language Models Evaluation Benchmark

Python 96 19 Updated Aug 21, 2024

Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback

Python 90 2 Updated Aug 18, 2023

Benchmarks for evaluating MT models

Smalltalk 10 1 Updated Jun 26, 2024

MAMMOTH: MAssively Multilingual Modular Open Translation @ Helsinki

Python 21 3 Updated Oct 3, 2024

Robust recipes to align language models with human and AI preferences

Python 4,536 393 Updated Sep 23, 2024

German Alpaca Dataset (Cleaned + Translated)

Jupyter Notebook 23 1 Updated Apr 6, 2023

Repo for the Belebele dataset, a massively multilingual reading comprehension dataset.

Python 312 21 Updated Aug 12, 2024

A package for handy processing of semantic graphs such as AMR, with a special focus on standardized evaluation

Python 17 2 Updated Sep 29, 2024

Tools for creating TrueType fonts for written sign language in the SignWriting script based on the ISWA 2010

Python 3 Updated May 1, 2023

Supercharged BLIP-2 that can handle videos

Python 116 6 Updated Dec 1, 2023

LLM-based autonomous agent that does comprehensive online research on any given topic

Python 14,288 1,867 Updated Oct 4, 2024