How to generate code-trajectory data with GPT4? #1
@SeungyounShin Hi, I am currently working on a very similar project, mainly generating a dataset for tool use. One of the datasets I am working on involves using a code interpreter tool. My method was basically to start with a few dozen instructions and ask GPT-4 to generate more similar ones. Using this slightly larger instruction set, I apply the Evol-Instruct [1] method to generate further instructions. So far I have only 4,628 instructions about using the code interpreter.

[1] WizardLM: Empowering Large Language Models to Follow Complex Instructions
Here's an output of the code generated by GPT-4 from my repository. The task was: "Can you plot Tesla's 90-day volume with the mean of the closing price and a marker at 't' where the mean until 't-1' plus the standard deviation until 't-1' is less than the price at 't'?" The performance of GPT-4 is impressive, but the data collection process tends to be slow. This is primarily because it operates in an iterative manner: generating code, executing it, then debugging and modifying the code, and repeating the process. This can lead to considerable latency. Your method is a valuable alternative, and I would greatly appreciate any further discussion on this topic. Please feel free to share your insights or suggestions.
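For reference, here is a minimal sketch of the kind of script GPT-4 tends to produce for that task. It is not the actual output from my repository; it assumes the `yfinance` package for price data and uses an expanding mean/std shifted by one day to express the "until t-1" condition.

```python
# Hypothetical sketch, not the actual GPT-4 output from the repository.
import yfinance as yf
import matplotlib.pyplot as plt

# Roughly 90 trading days of Tesla data (assumes the yfinance package).
df = yf.Ticker("TSLA").history(period="90d")
close, volume = df["Close"], df["Volume"]

# Mean/std over days 0..t-1 (shift by one so day t itself is excluded).
prior_mean = close.expanding().mean().shift(1)
prior_std = close.expanding().std().shift(1)
signal = close > (prior_mean + prior_std)

fig, (ax_price, ax_vol) = plt.subplots(2, 1, sharex=True, figsize=(10, 6))

ax_price.plot(close.index, close, label="Close")
ax_price.axhline(close.mean(), color="gray", linestyle="--", label="Mean close")
ax_price.scatter(close.index[signal], close[signal], color="red",
                 label="close_t > mean_{t-1} + std_{t-1}")
ax_price.legend()

ax_vol.bar(volume.index, volume, width=1.0, label="Volume")
ax_vol.legend()

plt.tight_layout()
plt.show()
```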
@SeungyounShin Oh, I have a code execution module as well; only the initial questions are generated via augmentation. Each round typically took me 20-120 seconds depending on complexity. My progress usually slows down due to a bad for loop or to training a 500M Hugging Face model on my Mac. What's the exact issue with #2? Could you provide more insight into the weird-answer problem? An example would be nice 😊
I recently explored the concept of Evol-Instruct and found it quite fascinating. Inspired by it, I crafted my own version. In the process, I observed that a significant number of human-engineered prompts are required. I also noticed that GPT often tends to respond with instructions like "Write ~" to create a Python function, but does not actively check the result or implement it itself; it then appears to congratulate itself on completing the task.

One thing that stood out to me was that Evol-Instruct seems to perform better than Self-Instruct: it produces not only higher-quality prompts but also a more diverse range of them. Generating higher-quality prompts is comparatively simple (for instance, we could just request "a more difficult one"), but generating diverse prompts is quite challenging. Transitioning from one topic to another can lead to significant deviations, such as moving from a simple '1+1=?' to a complex 'Use CAD to...'. Given these observations, maintaining a balance between diversity and quality seems like an interesting research topic.
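As a concrete illustration, here is a minimal sketch of an Evol-Instruct-style "make it more difficult" step, assuming the pre-v1 `openai` Python client. The prompt wording is my own illustration, not the exact template from the WizardLM paper; it adds a stay-on-topic constraint to limit the kind of topic drift described above.

```python
import openai  # assumes the pre-v1 openai client and an API key in the environment

# Illustrative evolution prompt (not the exact WizardLM template).
EVOLVE_PROMPT = (
    "Rewrite the following instruction into a more difficult version. "
    "Stay on the same topic and make sure it can still be solved by writing "
    "and executing Python code.\n\nInstruction: {seed}"
)

def evolve(seed_instruction: str) -> str:
    """Ask GPT-4 for a harder variant of a single seed instruction."""
    reply = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": EVOLVE_PROMPT.format(seed=seed_instruction)}],
    )
    return reply["choices"][0]["message"]["content"]

# One evolution round over a tiny seed set.
seeds = ["Plot TSLA's closing price for the last 90 days with its mean."]
evolved = [evolve(s) for s in seeds]
```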
[Still in progress] How can we enhance the generation of trajectories (code generation, execution, and debugging from it)?

Creation of SFT data of the form:

User:
Assistant:
<Thinking, GPT4>
<Debug...>
...

How can we automate this process with GPT-4 and collect data efficiently?
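A rough sketch of how such a loop could be automated is below. It assumes the pre-v1 `openai` client and a naive `exec`-based executor; the function names and the code-extraction step are illustrative, not from this repository.

```python
import contextlib
import io
import traceback

import openai  # assumes the pre-v1 openai client


def run_code(code: str) -> tuple[bool, str]:
    """Execute generated Python code, capturing stdout or the traceback."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, {})  # NOTE: use a real sandbox or Jupyter kernel in practice
        return True, buffer.getvalue()
    except Exception:
        return False, traceback.format_exc()


def collect_trajectory(instruction: str, max_rounds: int = 4) -> list[dict]:
    """Generate -> execute -> debug loop; returns the message trajectory as SFT data."""
    messages = [
        {"role": "system", "content": "Solve the task by writing Python code."},
        {"role": "user", "content": instruction},
    ]
    for _ in range(max_rounds):
        reply = openai.ChatCompletion.create(model="gpt-4", messages=messages)
        answer = reply["choices"][0]["message"]["content"]
        messages.append({"role": "assistant", "content": answer})

        # Extracting the code block from the markdown answer is elided here.
        ok, output = run_code(answer)
        messages.append({"role": "user", "content": f"Execution result:\n{output}"})
        if ok:
            break  # success: stop debugging and keep the full trajectory
    return messages
```

Running many such loops in parallel (asynchronous API calls plus separate executor processes) would be one way to reduce the per-trajectory latency discussed above.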