Process reward modeling support #362

fabrahman · 2024-09-21T00:32:13Z

This commit adds support for process reward modeling, including support for:

TODO: refactor the evaluation code for the PRM task

Faeze Brahman and others added 15 commits August 30, 2024 17:57

modified for handling specific inputs

b37166e

Merge branch 'main' of https://github.com/allenai/open-instruct into …

681a0e1

…main Pull recent updates

Merge branch 'main' of https://github.com/allenai/open-instruct into …

1e2ddff

…main Merged new evals

Merge branch 'main' of https://github.com/allenai/open-instruct into …

44e1d31

…main merge with main

use older image

8f2019f

Merge branch 'main' of https://github.com/allenai/open-instruct into …

768035c

…main

Merge branch 'main' of https://github.com/allenai/open-instruct into …

1278cff

…main

use new image

41f8f17

Merge branch 'main' of https://github.com/allenai/open-instruct into …

c44fb66

…main

Merge branch 'main' of https://github.com/allenai/open-instruct into …

22826ca

…main

have conda setup

bf1f14d

added support for prm data

3fc06e8

added process reward modeling

2fad09f

added process reward computation

dc8af70

bash for process reward modeling

7de1600

fabrahman requested a review from vwxyzjn September 21, 2024 00:32

Faeze Brahman added 4 commits September 21, 2024 23:26

todo: evaluation is not supported

b87837d

fix python call

1b8be29

added prm eval support

d78d0ec

added sft and dpo config for IF and MATH

bd0d03b
