-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[17.6] Cache source generator node tables only if their state counts match #66992
[17.6] Cache source generator node tables only if their state counts match #66992
Conversation
BatchNode
BatchNode
src/Compilers/CSharp/Test/Semantic/SourceGeneration/StateTableTests.cs
Outdated
Show resolved
Hide resolved
3606ca2
to
8d3b3dd
Compare
BatchNode
BatchNode
I think this fix is actually just masking an underlying issue. When the batch node gets the input table its being listed as Looking at the input table, what's happening is that we're essentially shrinking the number of items from the previous generation pass, but the items that remain are (correctly) listed as cached. When we create the table from the builder we set In this case, that's not correct. Yes all the items in the table are cached, but the table itself shouldn't be considered cached, as it has less items than last time. I think we should track in the builder if we can consider the table to be cached or not. The only time we can say a table from a builder (as opposed to one we explicitly call Interestingly this is almost exactly the same as the The following test is a fairly simple repro that demonstrates this behavior outside of the batch node case: [Fact, WorkItem(61162, "https://github.com/dotnet/roslyn/issues/61162")]
public void IncrementalGenerator_Table_Smaller_Second_TimeAround()
{
var items = ImmutableArray.Create(new[] { "a", "b", "c" });
var generator = new IncrementalGeneratorWrapper(new PipelineCallbackGenerator(ctx =>
{
var transform1 = ctx.CompilationProvider.SelectMany((c, _) => items);
var transform2 = transform1.Select((i, _) => i);
ctx.RegisterSourceOutput(transform2, static (spc, i) =>
{
spc.AddSource(i, $@"// {i}");
});
}));
var source = "System.Console.WriteLine();";
var parseOptions = TestOptions.RegularPreview;
Compilation compilation = CreateCompilation(source, options: TestOptions.DebugExeThrowing, parseOptions: parseOptions);
GeneratorDriver driver = CSharpGeneratorDriver.Create(new[] { generator }, parseOptions: parseOptions);
verify(ref driver, compilation, new[] {
"// a",
"// b",
"// c"
});
items = ImmutableArray.Create(new[] { "a", "b" });
verify(ref driver, compilation, new[] {
"// a",
"// b",
});
static void verify(ref GeneratorDriver driver, Compilation compilation, string[] generatedContent)
{
driver = driver.RunGeneratorsAndUpdateCompilation(compilation, out var outputCompilation, out var generatorDiagnostics);
outputCompilation.VerifyDiagnostics();
generatorDiagnostics.Verify();
var trees = driver.GetRunResult().GeneratedTrees;
Assert.Equal(generatedContent.Length, trees.Length);
for(int i = 0; i < trees.Length; i++)
{
AssertEx.EqualOrDiff(generatedContent[i], trees[i].ToString());
}
}
} This will fail right now, as transform2 will end up returning three items the second time through, when it should only return two. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per comment.
Thanks @chsienki for the investigation. I couldn't reproduce a scenario where my fix would fail. Your test fails simply because you never update the compilation hence the generator doesn't run the second time. This change makes the test pass (even without the fix in this PR): GeneratorDriver driver = CSharpGeneratorDriver.Create(new[] { generator }, parseOptions: parseOptions);
- verify(ref driver, compilation, new[] {
+ verify(ref driver, ref compilation, new[] {
"// a",
"// b",
"// c"
});
items = ImmutableArray.Create(new[] { "a", "b" });
- verify(ref driver, compilation, new[] {
+ verify(ref driver, ref compilation, new[] {
"// a",
"// b",
});
- static void verify(ref GeneratorDriver driver, Compilation compilation, string[] generatedContent)
+ static void verify(ref GeneratorDriver driver, ref Compilation compilation, string[] generatedContent)
{
- driver = driver.RunGeneratorsAndUpdateCompilation(compilation, out var outputCompilation, out var generatorDiagnostics);
- outputCompilation.VerifyDiagnostics();
+ driver = driver.RunGeneratorsAndUpdateCompilation(compilation, out compilation, out var generatorDiagnostics);
+ compilation.VerifyDiagnostics();
generatorDiagnostics.Verify();
var trees = driver.GetRunResult().GeneratedTrees;
Assert.Equal(generatedContent.Length, trees.Length);
for(int i = 0; i < trees.Length; i++)
{
AssertEx.EqualOrDiff(generatedContent[i], trees[i].ToString());
}
} However, your suggestion makes sense, I rewrote the fix. |
🤦 Yep, you're right about that. Here's a test that *does* fail with the original fix [Fact, WorkItem(61162, "https://github.com/dotnet/roslyn/issues/61162")]
public void IncrementalGenerator_Collect_SyntaxProvider2()
{
var generator = new IncrementalGeneratorWrapper(new PipelineCallbackGenerator(static ctx =>
{
var invokedMethodsProvider = ctx.SyntaxProvider
.CreateSyntaxProvider(
static (node, _) => node is InvocationExpressionSyntax,
static (ctx, ct) => ctx.SemanticModel.GetSymbolInfo(ctx.Node, ct).Symbol?.Name ?? "(method not found)")
.Select((n, _) => n);
ctx.RegisterSourceOutput(invokedMethodsProvider, static (spc, invokedMethod) =>
{
spc.AddSource(invokedMethod, "// " + invokedMethod);
});
}));
var source = """
System.Console.WriteLine();
System.Console.ReadLine();
""";
var parseOptions = TestOptions.RegularPreview;
Compilation compilation = CreateCompilation(source, options: TestOptions.DebugExeThrowing, parseOptions: parseOptions);
GeneratorDriver driver = CSharpGeneratorDriver.Create(new[] { generator }, parseOptions: parseOptions);
verify(ref driver, compilation, new[]
{
"// WriteLine",
"// ReadLine"
});
replace(ref compilation, parseOptions, """
System.Console.WriteLine();
""");
verify(ref driver, compilation, new[]
{
"// WriteLine"
});
replace(ref compilation, parseOptions, "_ = 0;");
verify(ref driver, compilation, Array.Empty<string>());
static void verify(ref GeneratorDriver driver, Compilation compilation, string[] generatedContent)
{
driver = driver.RunGeneratorsAndUpdateCompilation(compilation, out var outputCompilation, out var generatorDiagnostics);
outputCompilation.VerifyDiagnostics();
generatorDiagnostics.Verify();
var trees = driver.GetRunResult().GeneratedTrees;
Assert.Equal(generatedContent.Length, trees.Length);
for (int i = 0; i < generatedContent.Length; i++)
{
AssertEx.EqualOrDiff(generatedContent[i], trees[i].ToString());
}
}
static void replace(ref Compilation compilation, CSharpParseOptions parseOptions, string source)
{
compilation = compilation.ReplaceSyntaxTree(compilation.SyntaxTrees.Single(), CSharpSyntaxTree.ParseText(source, parseOptions));
}
} The new proposed fix does make that test pass, but I'm no longer convinced its actually the correct thing to do. I was originally thinking this was an issue that would affect any node that removed tail entries but realize now that it doesn't because we track the removed entries. So remains the question as to why we're getting a cached result as the input when it shouldn't be? Looking at it I think the bug is actually in the SyntaxNodeProvider. When we construct the filter table, we will modify any existing items, but don't carry over the removed ones. This surfaces as a correctness problem when the removed items are at the end of the table, but it also surfaces as a perf issue when the removed items are in the middle of the table. Consider the following test [Fact, WorkItem(61162, "https://github.com/dotnet/roslyn/issues/61162")]
public void IncrementalGenerator_Collect_SyntaxProvider3()
{
var generator = new IncrementalGeneratorWrapper(new PipelineCallbackGenerator(static ctx =>
{
var invokedMethodsProvider = ctx.SyntaxProvider
.CreateSyntaxProvider(
static (node, _) => node is InvocationExpressionSyntax,
static (ctx, ct) => ctx.SemanticModel.GetSymbolInfo(ctx.Node, ct).Symbol?.Name ?? "(method not found)")
.Select((n, _) => n)
.WithTrackingName("Select");
ctx.RegisterSourceOutput(invokedMethodsProvider, static (spc, invokedMethod) =>
{
spc.AddSource(invokedMethod, "// " + invokedMethod);
});
}));
var source1 = """
System.Console.WriteLine();
System.Console.ReadLine();
""";
var source2 = """
class C {
public void M()
{
System.Console.Clear();
System.Console.Beep();
}
}
""";
var parseOptions = TestOptions.RegularPreview;
Compilation compilation = CreateCompilation(new[] { source1, source2 }, options: TestOptions.DebugExeThrowing, parseOptions: parseOptions);
GeneratorDriver driver = CSharpGeneratorDriver.Create(new[] { generator }, parseOptions: parseOptions, driverOptions: new GeneratorDriverOptions(IncrementalGeneratorOutputKind.None, trackIncrementalGeneratorSteps: true));
verify(ref driver, compilation, new[]
{
"// WriteLine",
"// ReadLine",
"// Clear",
"// Beep"
});
// edit part of source 1
replace(ref compilation, parseOptions, """
System.Console.WriteLine();
System.Console.Write(' ');
""");
verify(ref driver, compilation, new[]
{
"// WriteLine",
"// Write",
"// Clear",
"// Beep"
});
Assert.Collection(driver.GetRunResult().Results[0].TrackedSteps["Select"],
(r) => Assert.Equal(IncrementalStepRunReason.Cached, r.Outputs.Single().Reason),
(r) => Assert.Equal(IncrementalStepRunReason.Modified, r.Outputs.Single().Reason),
(r) => Assert.Equal(IncrementalStepRunReason.Cached, r.Outputs.Single().Reason),
(r) => Assert.Equal(IncrementalStepRunReason.Cached, r.Outputs.Single().Reason)
);
// remove second line of source 1
replace(ref compilation, parseOptions, """
System.Console.WriteLine();
""");
verify(ref driver, compilation, new[]
{
"// WriteLine",
"// Clear",
"// Beep"
});
Assert.Collection(driver.GetRunResult().Results[0].TrackedSteps["Select"],
(r) => Assert.Equal(IncrementalStepRunReason.Cached, r.Outputs.Single().Reason),
(r) => Assert.Equal(IncrementalStepRunReason.Removed, r.Outputs.Single().Reason),
(r) => Assert.Equal(IncrementalStepRunReason.Cached, r.Outputs.Single().Reason),
(r) => Assert.Equal(IncrementalStepRunReason.Cached, r.Outputs.Single().Reason)
);
static void verify(ref GeneratorDriver driver, Compilation compilation, string[] generatedContent)
{
driver = driver.RunGeneratorsAndUpdateCompilation(compilation, out var outputCompilation, out var generatorDiagnostics);
outputCompilation.VerifyDiagnostics();
generatorDiagnostics.Verify();
var trees = driver.GetRunResult().GeneratedTrees;
Assert.Equal(generatedContent.Length, trees.Length);
for (int i = 0; i < generatedContent.Length; i++)
{
AssertEx.EqualOrDiff(generatedContent[i], trees[i].ToString());
}
}
static void replace(ref Compilation compilation, CSharpParseOptions parseOptions, string source)
{
compilation = compilation.ReplaceSyntaxTree(compilation.SyntaxTrees.First(), CSharpSyntaxTree.ParseText(source, parseOptions));
}
} This will fail on the last collection assertion. The table is returning Basically I think we're not correctly tracking removals in the In general, I think there should probably be an invariant that the table size going in is the same as what comes out, before compaction. In practice this isn't true today, as we implicitly track empty slots, meaning tables can (and do) get shorter. We might want to think about changing that, as I think its probably been the route cause of a lot of these removal bugs. But until we do (its not a totally trivial amount of work) lets keep the 'we only cache if the table size is the same' logic. |
HasTrackedSteps = hasTrackedSteps; | ||
|
||
static bool areStatesCompatible(ImmutableArray<TableEntry> previous, ImmutableArray<TableEntry> current) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets calculate this in the builder, and just pass in an 'IsCached' flag instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved this computation into the builder. I just realized you originally suggested we keep track of the flag throughout the builder, but it seems computing it once is simpler.
@jjonescz @jaredpar I think we should probably take this almost as-is for 17.6, while recognizing that its really just patching over the underlying issue. On main we can fix the SyntaxStrategy issue, then decide even later on if we want to consider re-architecting the way we handle the empty entries or not. |
@chsienki if you both agree let's do that. I'd like to open a new issue that details what needs to change for the more complete fix as a part of merging this. That way we can track the work. |
Co-authored-by: Chris Sienkiewicz <chsienki@microsoft.com>
I agree. Opened issues per Chris work item suggestions (one for the |
BatchNode
@cston @dotnet/roslyn-compiler for a second review |
Fixes #61162.
Fixes #63422.
Devdiv issue for 17.6: https://devdiv.visualstudio.com/DevDiv/_workitems/edit/1786293