Evaluate your LLM's response with Prometheus and GPT4 💯
🤠 Agent-as-a-Judge and DevAI dataset
xFinder: Robust and Pinpoint Answer Extraction for Large Language Models
CodeUltraFeedback: aligning large language models to coding preferences
Repository for the survey of Bias and Fairness in IR with LLMs.
Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?"
Timo: Towards Better Temporal Reasoning for Language Models (COLM 2024)
Code and data for Koo et al.'s ACL 2024 paper "Benchmarking Cognitive Biases in Large Language Models as Evaluators"
A comprehensive study of the LLM-as-a-judge paradigm in a controlled setup that reveals new results about its strengths and weaknesses.
A set of examples demonstrating how to evaluate Generative-AI-augmented systems using traditional information retrieval and LLM-as-a-judge validation techniques
Notebooks for evaluating LLM-based applications using the LLM-as-a-judge pattern (a minimal sketch of this pattern follows the list).
Antibodies for LLM hallucinations (grouping LLM-as-a-judge, NLI, and reward models)
Use groq for evaluations
Explore techniques to use small models as jailbreaking judges
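Most of the repositories above are variations of the same core pattern: prompt a strong "judge" model with a question, a candidate answer, and a scoring rubric, then parse its verdict into a score. The sketch below is a minimal illustration of that pattern, assuming the OpenAI Python SDK (openai>=1.0) with GPT-4 as the judge; the rubric prompt, the `judge()` helper, and the score-parsing regex are illustrative assumptions, not the method of any particular repository listed here.

```python
# Minimal LLM-as-a-judge sketch: ask a judge model (here GPT-4 via the
# OpenAI Python SDK) to rate a candidate answer on a 1-5 rubric.
# The prompt wording and score parsing are illustrative assumptions.
import re
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

JUDGE_PROMPT = """You are an impartial judge. Rate the assistant's answer
to the user's question on a 1-5 scale for factual accuracy and helpfulness.
Reply with a short justification followed by "Score: <1-5>".

Question: {question}
Answer: {answer}"""

def judge(question: str, answer: str, model: str = "gpt-4") -> int:
    """Return the judge model's 1-5 score for a single answer."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": JUDGE_PROMPT.format(question=question, answer=answer)}],
        temperature=0,  # deterministic grading for reproducibility
    )
    verdict = response.choices[0].message.content
    match = re.search(r"Score:\s*([1-5])", verdict)
    if match is None:
        raise ValueError(f"Judge did not return a parsable score: {verdict!r}")
    return int(match.group(1))

if __name__ == "__main__":
    score = judge("What is the capital of France?",
                  "The capital of France is Paris.")
    print(f"Judge score: {score}/5")
```

In practice the judge model, rubric, and parsing are the main points of variation: the repositories above swap in Prometheus, Groq-hosted models, agentic judges, or reward models, and pair the judge's scores with bias and robustness checks such as those benchmarked in the ACL 2024 and MJ-Bench work listed here.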