Hi all! I'm interested in getting the exact training dataset contents for the models available on Hugging Face (DIAL-FLANT5-XL, etc.). The README says it would be all datasets contained in the repo as of June 2022, but I'm not sure how to extract these or group them by task. Is there a config file that specifies these exactly, including which split(s) of each dataset? I'd like to use these models but need to be careful to avoid data contamination.