-
Notifications
You must be signed in to change notification settings - Fork 381
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Increase delay buffer estimate for the ECO flow #1019
Increase delay buffer estimate for the ECO flow #1019
Conversation
When working out how many delay buffers to insert, the ECO flow uses a very small hard coded value (0.06ns). At the tt corner, a better estimate is around 0.5ns. This means we heavily over fix hold violations, and quite possibly cause setup violations. Setting a more realistic estimate for the buffer delay should have no downsides. Since the ECO flow is iterative, any hold path that is not completely fixed will be fixed in subsequent iterations. This value (and the name of the buffer to use) should really end up in the open_pdks config.tcl.
@antonblanchard FO1 delay at TT is ~40ps? adding 500ps per buffer might be too high? |
@donn @antonblanchard The best way to do this I think is to set a variable in a config file that overwrites the default one. |
Why not use the actual delay of the cell being inserted? |
This would be best, especially since the timing reports are multi-corner. Then the delay could be calculated based on the .lib being used for signoff. |
I agree with @mithro and @msaligane, getting the actual delay would be best.
This is what I see in one of my taped out designs (I'm just pulling out delays for a few of the ECO inserted hold buffers here):
|
I added some code to dump the delay the resizer calculated for |
@antonblanchard Do you know how those delays values are calculated? the slew values seems to be matching the FO1 delay. Can you paste the timing of path where eco buffers are inserted? |
Here's an example from the final STA run:
|
The reason we use 0.06ns for FO1 is that
I don't quite understand why the resizer calculates the delay so large. |
I'm not sure where 60ps is coming from. Based on some Googling (and finding an old post from @RTimothyEdwards), it looks we can use the
They are all in the 500ps+ range. |
@antonblanchard ~40ps is based on spice simulation |
Something is wrong if spice gives 40ps and liberty 500ps. |
This is what the values should look like:
|
Where did that table come from? |
@msaligane can answer for the spice derived data, but I'm guessing the liberty version comes from https://github.com/google/skywater-pdk-libs-sky130_fd_sc_hd/blob/ac7fb61f06e6470b94e8afdf7c25268f62fbd7b1/cells/dlygate4sd3/sky130_fd_sc_hd__dlygate4sd3_1__tt_025C_1v80.lib.json#L314 |
Correct. @antonblanchard which corner is that table you shared from? Regarding the FO4 / FO1 delays. Here is some simulation ran by @tdene : |
That was inverters. It's minimal effort to re-format my spice sim and check Regarding liberty vs spice Typically liberty values have to be taken with half a grain of salt, because they depend on the table indices. In this case though, just looking in the middle of the tables and comparing the delay cell at I think OSU re-characterized all the SkyWater-provided standard cells. Let me see if anyone can find me that data, and it can be used as another verification point. |
I updated the spreadsheet Mehdi linked with a new sheet, representing the delay cells: FO4 delay for I'm confused about the difference between this reply and this reply. |
The amounts of the differences here seem much larger than process variation would account for. There seems to be some fundamental mismatch and alarm bells are going off for me. |
@msaligane The ECO flow should not estimate what is going to happen but instead make the change in the netlist with incremental timing and then measure if the slack is better or worse via opensta. Making this change will require ensuring that incremental global routing happens as well so the parasitics are reasonable. This aligns with @mithro's comment. |
I think we should have a meeting to discuss this. @tdene @msaligane |
@tspyrou I'm free almost any time tomorrow, on the 30th. My 31st is completely full though. |
@msaligane do you want to propose a time via email? |
@tdene in our case we insert back to back buffers. So the FO1 delay makes more sense. The delay cell here should be 2 inverters in series which might explain the higher delay.
Yes, I agree that we should pick the delay value of the inserted buffer from .lib directly. Re: incremental timing, we actually run a timing check at signoff step and calculate the number of buffers to insert based on violating slack. We picked a value based on spice simulation but it is certainly better to use the .lib file instead.
Sure, I should be able to meet on 31st PM. |
The ECO flow currently uses the spice FO delay to calculate an estimation of the buffer number. The change is incremental: every time we add buffers to the netlist, we'll do the opensta timing analysis again and then decide how many buffers should be inserted in the next iteration by checking the reports. For each iteration, the ECO flow may adjust the buffer number according to the timing reports so that all the violations can be fixed in the end. Here is the document about the ECO flow details: https://docs.google.com/document/d/1RYWNeVjQpBkMhN1LGLOyMwwtLzFAENI-veqsfWg-5v8/edit?usp=sharing |
@donghl17 is there a reason to use the spice FO delay instead of OpenSTA API's? Either should work but then the spice has to correlate to the spice used to build the .lib |
Yes, it could be better to use OpenSTA APIs, but I wasn't aware of the APIs before. |
There appears to be a discrepancy between the sky130_fd_sc_hd__dlygate4sd3_1 delay listed by the timing files and the sky130_fd_sc_hd__dlygate4sd3_1 delay obtained via spice simulation, as agreed upon by @msaligane @donghl17 @tspyrou @tdene This merits further investigation. Raising issue #363 inside skywater-pdk. |
Another conclusion that was reached is the following:
During the meeting, a consensus was reached that a change was advisable due to the following reason:
Personally, I see another possible point-of-failure. Though I do not understand the code-base sufficiently to judge whether this may occur, I fear that such hardcoded cells may lead to failure if a cell library other than The script in question should be changed. One option is to modify it. Another option is to replace it with OpenROAD's hold-violation-fix functionality, if doing so is possible and makes sense here. Unsure of who to ping: @donn @donghl17 @antonblanchard, @jjcherry56 [most recent committer to OpenROAD |
@tdene I notice three liberty cells (
We currently disable that cell, but the 8 and 12 sized cells are used, and have unrealistic values in the table for transitions > 1.5ns. |
Negative delays are an artifact of how delay is measured and completely normal with super slow input transitions. The gate is an amplifier so the output can cross the logic threshold before the input does. |
@jjcherry56 thanks, but does it make sense that all the values of delay are monotonically increasing until the final row (eg |
If it is real, I wonder if we should treat max transition errors as errors during openlane sign off (it's a warning right now), because if we are far outside the limit of the lookup table the delay might be wildly different. |
Yes, it makes sense (ie, when looking at delay versus input transition time). |
When working out how many delay buffers to insert, the ECO flow uses a
very small hard coded value (0.06ns). At the tt corner, a better estimate
is around 0.5ns. This means we heavily over fix hold violations, and
quite possibly cause setup violations.
Setting a more realistic estimate for the buffer delay should have no
downsides. Since the ECO flow is iterative, any hold path that is not
completely fixed will be fixed in subsequent iterations.
This value (and the name of the buffer to use) should really end up
in the open_pdks config.tcl.