Add korean dataset #1157
Comments
Hi. I am a project lead of the EleutherAI polyglot team.
Hi, @hyunwoongko. I am currently working on converting public Korean data into an "instruction-fulfillment" format. I don't know what type of data you can provide, but any data you provide will be of great help. Thank you!
@ontocord If this issue is relevant, I would love to take over from here. Would that be possible?
@CertifiedJoon if you are still interested in taking on this project, I can assign it to you.
@camsdixon1 yeah! please do :)
For open-assistant to work in Korean, we are working on adding a Korean dataset.
Creating a dataset from scratch is quite difficult. To add one efficiently, we will proceed as follows.
Current progress is as follows.
These datasets are expected to be used for labeling and training for RLHF.