Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Updated Nov 5, 2024 - Python
ZYN: Zero-Shot Reward Models with Yes-No Questions
Code for the paper "Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning"
Timo: Towards Better Temporal Reasoning for Language Models (COLM 2024)
distilled Self-Critique refines the outputs of an LLM using only synthetic data