rlaif

Here are 9 public repositories matching this topic...

argilla-io / distilabel

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

python ai openai synthetic-data synthetic-dataset-generation huggingface llms rlhf rlaif

Updated Jan 24, 2025
Python

mengdi-li / awesome-RLAIF

A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)

alignment rl llms rlhf rlaif

Updated Jan 23, 2025

CIntellifusion / VideoDPO

Official Implementation of VideoDPO

self-improvement diffusion-models aigc generative-ai rlhf rlaif videogeneration

Updated Jan 12, 2025
Python

holarissun / Prompt-OIRL

Code for the paper "Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning"

inverse-reinforcement-learning irl offline-rl large-language-models llm prompt-engineering rlhf rlaif offline-irl

Updated Mar 20, 2024
Python

vicgalle / zero-shot-reward-models

ZYN: Zero-Shot Reward Models with Yes-No Questions

reinforcement-learning zero-shot llm rlhf reward-models trlx rlaif

Updated Aug 15, 2023
Python

dannylee1020 / openpo

python ai evaluation synthetic-data finetuning dpo huggingface synthetic-data-generation llm rlhf rlaif llm-evaluation ai-feedback

Updated Dec 26, 2024
Python

zhaochen0110 / Timo

Code and data for "Timo: Towards Better Temporal Reasoning for Language Models" (COLM 2024)

temporal-reasoning sota-model llms rlhf rlaif llm-as-a-judge llm-as-evaluator self-critic-framework colm2024

Updated Oct 23, 2024
Python

vicgalle / awesome-rlaif

A curated and updated list of relevant articles and repositories on Reinforcement Learning from AI Feedback (RLAIF)

awesome research language-model llm rlhf rlaif

Updated Jan 24, 2024

vicgalle / distilled-self-critique

Distilled Self-Critique refines the outputs of an LLM using only synthetic data

synthetic-data llm rlaif self-critique

Updated Apr 11, 2024
Jupyter Notebook
