Intro Text for Eval Capabilities Page #28

danmcduff · 2024-05-18T18:13:14Z

Replace:

Many modern foundation models are released with general conversational abilities, such that their use cases are poorly specified and open-ended. This poses significant challenges to evaluation benchmarks which are unable to critically evaluate so many tasks, applications, and risks systematically or fairly. As a result, it is important to carefully scope the original intentions for the model, and the evaluations to those intentions.

With:

Many modern foundation models are released with general abilities, such that their use cases are poorly specified and open-ended, posing significant challenges to evaluation benchmarks which are unable to critically evaluate so many tasks, applications, and risks systematically or fairly. It is important to carefully scope the original intentions for the model, and the evaluations to those intentions.

neural-loop · 2024-05-19T13:35:47Z

https://onm-demo.aimodels.org/foundation-model-resources/model-evaluation-capabilities/

neural-loop added a commit to neural-loop/fm-cheatsheet that referenced this issue May 19, 2024

Category Details Update - Model Evaluation Capabilities allenai#28

67f9e49

neural-loop closed this as completed Jun 26, 2024

