
VHELM update #2592

Merged
merged 133 commits into from
May 3, 2024

Conversation

@teetone (Member) commented Apr 26, 2024

Changes:

  • New scenarios
  • IDEFICS 2 support

@teetone teetone requested a review from yifanmai April 26, 2024 17:54
Files with resolved review threads:

  • setup.cfg
  • src/helm/benchmark/metrics/common_metric_specs.py
  • src/helm/benchmark/metrics/evaluate_reference_metrics.py
  • src/helm/benchmark/run_specs/vlm_run_specs.py
  • src/helm/clients/openai_client.py
  • src/helm/clients/vertexai_client.py
@teetone teetone requested a review from yifanmai April 28, 2024 18:58

- name: huggingface/llava-v1.6-vicuna-7b-hf
model_name: uw-madison/llava-v1.6-vicuna-7b-hf
tokenizer_name: hf-internal-testing/llama-tokenizer
Collaborator

Use the respective tokenizers (vicuna etc)?

Member Author

I think vicuna uses the llama tokenizer if I'm not mistaken.
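For reference, the reviewer's suggestion would amount to an entry like the following sketch (the `lmsys/vicuna-7b-v1.5` tokenizer source is an assumption for illustration, not part of this PR):

```yaml
- name: huggingface/llava-v1.6-vicuna-7b-hf
  model_name: uw-madison/llava-v1.6-vicuna-7b-hf
  # Hypothetical: point at a Vicuna tokenizer instead of the Llama test tokenizer
  tokenizer_name: lmsys/vicuna-7b-v1.5
```

If Vicuna checkpoints do reuse the Llama tokenizer, as the reply above suggests, the two entries should tokenize identically in practice.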


return {"predictions": [{"text": raw_response.candidates[0].text}]}
if not candidates:
Collaborator

We should make this condition tighter... I'll send you a follow-up pull request to fix it.
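A tighter guard might look like the following sketch (the function name and error handling are illustrative, not the actual HELM code): besides checking that the candidate list is non-empty, it verifies that the first candidate actually carries text before indexing into it.

```python
from typing import Any, Dict, List


def extract_prediction(candidates: List[Any]) -> Dict[str, Any]:
    """Sketch of a stricter candidate check (illustrative, not HELM's code)."""
    if not candidates:
        # Vertex AI can return an empty candidate list, e.g. when the
        # response was blocked by safety filters.
        raise ValueError("Vertex AI returned no candidates")
    text = getattr(candidates[0], "text", None)
    if not text:
        raise ValueError("First candidate has no text content")
    return {"predictions": [{"text": text}]}
```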


_models_lock: Lock = Lock()
_models: Dict[str, Optional[Vision2SeqModelProcessor]] = {
"HuggingFaceM4/idefics2-8b": None,
Collaborator

Any reason we need to explicitly declare models here? Most clients do not validate which model names are supported (because the list can change frequently).

Member Author

It was for name validation. Do you think I can just remove this line then?
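The reviewer's alternative — dropping the explicit allow-list and loading models lazily on first use — could be sketched like this (names are illustrative; the real client wraps a `Vision2SeqModelProcessor` rather than an arbitrary loader):

```python
from threading import Lock
from typing import Any, Callable, Dict

# Illustrative sketch: cache models lazily instead of pre-declaring
# every supported checkpoint name in the dictionary.
_models_lock: Lock = Lock()
_models: Dict[str, Any] = {}


def get_model(model_name: str, loader: Callable[[str], Any]) -> Any:
    """Load the model on first use; later calls return the cached instance."""
    with _models_lock:
        if model_name not in _models:
            _models[model_name] = loader(model_name)
        return _models[model_name]
```

With this shape, an unknown model name simply fails inside `loader` (e.g. when the checkpoint cannot be fetched) instead of requiring the dictionary to be kept in sync with the supported-model list.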

@teetone merged commit 6d3ad7f into main on May 3, 2024
6 checks passed
@teetone deleted the vh branch May 3, 2024 08:11