Make XmlCommentHelper faster #3651
ninedan commented May 2, 2023
- Use StringBuilder only when there are multiple lines
- In white space normalization, return original string when no change is needed.
- When changes are really needed, calculate exact size, allocate a char buffer for the changed text, then allocate string.
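The three points above can be sketched roughly as follows. This is an illustration only, not the code from this PR: the class name `WhitespaceSketch` is hypothetical, and the sketch assumes that "normalizing" means collapsing every run of whitespace into a single space, as the replaced `Regex.Replace(result, @"\s+", " ")` call did.

```csharp
using System;

internal static class WhitespaceSketch
{
    // Collapses every whitespace run into a single ' '.
    // Returns the original instance when the text is already normalized.
    internal static string NormalizeWhiteSpace(string text)
    {
        // Pass 1: compute the exact output length and detect any change.
        int length = 0;
        bool diff = false;
        bool lastWhitespace = false;
        foreach (char ch in text)
        {
            if (char.IsWhiteSpace(ch))
            {
                if (!lastWhitespace)
                {
                    length++;
                    diff |= ch != ' '; // e.g. '\t' becomes ' ' without changing the length
                }

                lastWhitespace = true;
            }
            else
            {
                length++;
                lastWhitespace = false;
            }
        }

        if (!diff && length == text.Length)
        {
            return text; // nothing to change, nothing allocated
        }

        // Pass 2: fill an exactly sized buffer, then allocate the string once.
        char[] buffer = new char[length];
        int pos = 0;
        lastWhitespace = false;
        foreach (char ch in text)
        {
            if (char.IsWhiteSpace(ch))
            {
                if (!lastWhitespace)
                {
                    buffer[pos++] = ' ';
                }

                lastWhitespace = true;
            }
            else
            {
                buffer[pos++] = ch;
                lastWhitespace = false;
            }
        }

        return new string(buffer);
    }
}
```

The first pass is what makes the early return possible: an already normalized comment costs one scan and no allocation.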
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3651      +/-   ##
==========================================
- Coverage   93.26%   93.26%   -0.01%
==========================================
  Files        1080     1080
  Lines      113902   113950      +48
  Branches     4035     4043       +8
==========================================
+ Hits       106229   106271      +42
- Misses       6636     6641       +5
- Partials     1037     1038       +1
  if (normalizeWhitespace)
  {
-     result = Regex.Replace(result, @"\s+", " ");
+     result = result.NormalizeWhiteSpace();
Just curious, @stephentoub: on the latest .NET (of course this project targets more than that), would you expect Regex.Replace with a source-generated regex to perform comparably to the open-coded version?
I didn't measure, but FWIW,
[GeneratedRegex(@"[\s^ ]+")]
private static partial Regex GetRegex();
generates code in part like
private bool TryFindNextPossibleStartingPosition(ReadOnlySpan<char> inputSpan)
{
    int pos = runtextpos;
    if ((uint)pos < (uint)inputSpan.Length)
    {
        ReadOnlySpan<char> span = inputSpan.Slice(pos);
        for (int i = 0; i < span.Length; i++)
        {
            char ch;
            if (((ch = span[i]) < '\u0080') ? ((byte)("㸀\0\u0001\0\0䀀\0\0"[(int)ch >> 4] & (1 << (ch & 0xF))) != 0) : RegexRunner.CharInClass(ch, "\0\u0004\u0001 !^_d"))
            {
                runtextpos = pos + i;
                return true;
            }
        }
    }
    runtextpos = inputSpan.Length;
    return false;
}
the code below hides some of this inside char.IsWhiteSpace
and will return the same string if there are no non-space whitespace chars.
After addressing the single-line issue and the regular expression issue, further optimizations would likely yield much smaller returns.
> just curious, @stephentoub on the latest .NET (of course this is broader targeted) would you expect regex replace using pre generated regex to have comparable performance to open coded?
Yes.
Note that in .NET 8, the generated code now looks like this:
private bool TryFindNextPossibleStartingPosition(ReadOnlySpan<char> inputSpan)
{
    int pos = base.runtextpos;
    // Empty matches aren't possible.
    if ((uint)pos < (uint)inputSpan.Length)
    {
        // The pattern begins with a whitespace character.
        // Find the next occurrence. If it can't be found, there's no match.
        int i = inputSpan.Slice(pos).IndexOfAnyWhiteSpace();
        if (i >= 0)
        {
            base.runtextpos = pos + i;
            return true;
        }
    }
    // No match found.
    base.runtextpos = inputSpan.Length;
    return false;
}
where that IndexOfAnyWhiteSpace is emitted as:
/// <summary>Finds the next index of any character that matches a whitespace character.</summary>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
internal static int IndexOfAnyWhiteSpace(this ReadOnlySpan<char> span)
{
    int i = span.IndexOfAnyExcept(Utilities.s_asciiExceptWhiteSpace);
    if ((uint)i < (uint)span.Length)
    {
        if (char.IsAscii(span[i]))
        {
            return i;
        }
        do
        {
            if (char.IsWhiteSpace(span[i]))
            {
                return i;
            }
            i++;
        }
        while ((uint)i < (uint)span.Length);
    }
    return -1;
}
/// <summary>Supports searching for characters in or not in "\0\u0001\u0002\u0003\u0004\u0005\u0006\a\b\u000e\u000f\u0010\u0011\u0012\u0013\u0014\u0015\u0016\u0017\u0018\u0019\u001a\u001b\u001c\u001d\u001e\u001f!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\u007f".</summary>
internal static readonly IndexOfAnyValues<char> s_asciiExceptWhiteSpace = IndexOfAnyValues.Create("\0\u0001\u0002\u0003\u0004\u0005\u0006\a\b\u000e\u000f\u0010\u0011\u0012\u0013\u0014\u0015\u0016\u0017\u0018\u0019\u001a\u001b\u001c\u001d\u001e\u001f!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\u007f");
such that the search for the whitespace is vectorized. Then the matching logic for finding all the contiguous whitespace currently looks like:
// Match a whitespace character atomically at least once.
{
    int iteration = 0;
    while ((uint)iteration < (uint)slice.Length && char.IsWhiteSpace(slice[iteration]))
    {
        iteration++;
    }
    if (iteration == 0)
    {
        return false; // The input didn't match.
    }
    slice = slice.Slice(iteration);
    pos += iteration;
}
(though we're flirting with a change that'll result in that loop also changing to similarly be vectorized with IndexOfAnyExcept).
Replace itself also gets better in .NET 8.
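For a project that targets several runtimes, one way to get the generated code on newer targets while keeping a fallback elsewhere is conditional compilation. The following is only a sketch under assumptions: the class and method names are hypothetical, the pattern is the `\s+` from the line this PR replaces, and the `NET7_0_OR_GREATER` symbol is the one the .NET SDK defines for `net7.0` and later targets. Nothing here was measured.

```csharp
using System.Text.RegularExpressions;

internal static partial class WhitespaceRegexSketch
{
#if NET7_0_OR_GREATER
    // .NET 7+: the regex source generator emits the matching code at build time.
    [GeneratedRegex(@"\s+")]
    private static partial Regex WhitespaceRun();
#else
    // Older targets: construct the Regex once and reuse the instance.
    private static readonly Regex CachedWhitespaceRun = new Regex(@"\s+", RegexOptions.CultureInvariant);

    private static Regex WhitespaceRun() => CachedWhitespaceRun;
#endif

    // Collapses each whitespace run into a single space.
    internal static string Normalize(string text) => WhitespaceRun().Replace(text, " ");
}
```

Either branch exposes the same `Normalize` method, so callers do not need to know which implementation was compiled in.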
if (diff || (length != text.Length))
{
    char[] buffer = new char[length];
As you probably know, if this code targeted .NET Core (or were conditionally compiled for it), you could eliminate this char[] allocation by using string.Create(...).
Yes, but this project is targeting multiple runtimes.
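For illustration, the conditionally compiled variant mentioned in this exchange could look roughly like the sketch below. It is not code from this PR: `StringCreateSketch.Collapse` is a hypothetical helper, and the preprocessor symbols assume a recent .NET SDK that defines the `*_OR_GREATER` constants. `string.Create(int, TState, SpanAction<char, TState>)` itself is available from .NET Core 2.1 and .NET Standard 2.1.

```csharp
using System;

internal static class StringCreateSketch
{
    // Collapses each whitespace run in text into a single ' ' (hypothetical helper).
    internal static string Collapse(string text)
    {
        // Exact output length: one char per non-whitespace char and per whitespace run.
        int length = 0;
        bool lastWhitespace = false;
        foreach (char ch in text)
        {
            bool whitespace = char.IsWhiteSpace(ch);
            if (!whitespace || !lastWhitespace)
            {
                length++;
            }

            lastWhitespace = whitespace;
        }

#if NETCOREAPP2_1_OR_GREATER || NETSTANDARD2_1_OR_GREATER
        // string.Create hands the callback a span over the new string's own storage,
        // so no intermediate char[] is allocated.
        return string.Create(length, text, (span, source) =>
        {
            int pos = 0;
            bool last = false;
            foreach (char ch in source)
            {
                bool whitespace = char.IsWhiteSpace(ch);
                if (!whitespace)
                {
                    span[pos++] = ch;
                }
                else if (!last)
                {
                    span[pos++] = ' ';
                }

                last = whitespace;
            }
        });
#else
        // Fallback for other targets: fill a temporary buffer, then copy it into the string.
        char[] buffer = new char[length];
        int pos = 0;
        lastWhitespace = false;
        foreach (char ch in text)
        {
            bool whitespace = char.IsWhiteSpace(ch);
            if (!whitespace)
            {
                buffer[pos++] = ch;
            }
            else if (!lastWhitespace)
            {
                buffer[pos++] = ' ';
            }

            lastWhitespace = whitespace;
        }

        return new string(buffer);
#endif
    }
}
```

The trade-off raised in the reply is visible here: the fill loop has to exist twice, once per branch, which is extra code to maintain in a multi-targeted project.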
Can someone please review the changes? Thanks.
/// <param name="normalizeWhitespace">Normalize flag.</param>
/// <param name="lastWhitespace">last char is white space flag.</param>
/// <returns>True if output is different.</returns>
internal static bool AppendNormalize(this StringBuilder builder, string text, bool normalizeWhitespace, ref bool lastWhitespace)
❔ Can we reasonably omit the normalizeWhitespace parameter, and update callers that passed false to call Append instead?
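One possible shape for that suggestion is sketched below. This is a guess at the idea, not the signature the PR ended up with: the class name `StringBuilderSketch` is hypothetical, and the loop assumes that normalizing means collapsing whitespace runs into one space, with `lastWhitespace` carrying state across calls.

```csharp
using System.Text;

internal static class StringBuilderSketch
{
    // Appends text with each whitespace run collapsed to a single ' '.
    // lastWhitespace records whether the builder currently ends in whitespace,
    // so runs that span several calls are collapsed too.
    // Returns true when the appended output differs from text.
    internal static bool AppendNormalize(this StringBuilder builder, string text, ref bool lastWhitespace)
    {
        bool diff = false;
        foreach (char ch in text)
        {
            if (char.IsWhiteSpace(ch))
            {
                if (lastWhitespace)
                {
                    diff = true; // character dropped
                }
                else
                {
                    builder.Append(' ');
                    diff |= ch != ' '; // character replaced
                }

                lastWhitespace = true;
            }
            else
            {
                builder.Append(ch);
                lastWhitespace = false;
            }
        }

        return diff;
    }
}
```

With this shape, a caller that previously passed `false` would call `builder.Append(text)` directly, and one that passed `true` would call `builder.AppendNormalize(text, ref lastWhitespace)`.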
stringBuilder.AppendNormalize(single, normalizeWhitespace, ref lastWhitespace);
}

stringBuilder.AppendNormalize(item.ToString(), normalizeWhitespace, ref lastWhitespace);
❔ Is it common for this line to be hit when lastWhitespace is true and item is pure whitespace?
Referenced by a Mend Renovate pull request in ThorstenSauter/NoPlan that updates StyleCop.Analyzers from `1.2.0-beta.435` to `1.2.0-beta.507`. The release notes for v1.2.0-beta.507 (https://togithub.com/DotNetAnalyzers/StyleCopAnalyzers/releases/tag/1.2.0-beta.507) list this change as "Make XmlCommentHelper faster by @ninedan in https://github.com/DotNetAnalyzers/StyleCopAnalyzers/pull/3651" and name @ninedan among the new contributors.