Pull requests: EleutherAI/lm-evaluation-harness


Pull requests list

Fix Zeno visualizer on tasks like GSM8k
#2599 opened Dec 28, 2024 by pasky
Fix gguf loading via Transformers
#2596 opened Dec 25, 2024 by CL-ModelCloud
change to single process for bootstrap_stderr
#2593 opened Dec 23, 2024 by zhuyuhua-v
Fix the format of mgsm zh and ja.
#2587 opened Dec 20, 2024 by timturing
fix multiple input chat template
#2576 opened Dec 17, 2024 by baberabb
Added caseHOLD task
#2570 opened Dec 16, 2024 by zolastro
add llama3 tasks
#2556 opened Dec 10, 2024 by baberabb
[MM] Chartqa
#2544 opened Dec 5, 2024 by baberabb Draft
[MM] Ai2d
#2542 opened Dec 5, 2024 by baberabb Draft
New arabicmmlu
#2541 opened Dec 5, 2024 by bodasadallah
Update KorMedMCQA: ver 2.0
#2540 opened Dec 5, 2024 by GyoukChu
max_length not used
#2515 opened Nov 25, 2024 by lintangsutawika
fixed mmlu generative response extraction
#2503 opened Nov 18, 2024 by RawthiL
Added regex filter for bbh fewshot
#2502 opened Nov 18, 2024 by RawthiL
Add GigaChat API
#2495 opened Nov 15, 2024 by seldereyy
Yaml crowspairs tasks
#2488 opened Nov 14, 2024 by NAM00
Biology ds
#2486 opened Nov 13, 2024 by deema-A
MILU dataset from AI4Bharat for Indic LLM eval
#2482 opened Nov 12, 2024 by abhinand5
Update citation
#2474 opened Nov 8, 2024 by Sypherd
Use global filter alias
#2473 opened Nov 8, 2024 by Sypherd