JIT: Fix assertion due to remorph #110516

Closed
70 commits
0cbf3a2
Try fixing remorph issue
hez2010 Dec 6, 2024
8e63d8d
Delete unused static field from List (#110515)
stephentoub Dec 9, 2024
f7fad34
Remove some RuntimeExport/RuntimeImport indirections (#110437)
MichalStrehovsky Dec 9, 2024
9df306f
Fix linux-armel build (#110514)
am11 Dec 9, 2024
d9f5ee4
[wasi] bump wasmtime to 27 (#110524)
pavelsavara Dec 9, 2024
4525619
[debugger] Fix a step that becomes a go (#110484)
thaystg Dec 9, 2024
df70878
fix profiling env var names in profiling.md (#109764)
loadingcn Dec 9, 2024
6ade56a
Change assertion in IPGlobalProperties_DomainName_ReturnsEmptyStringW…
antonfirsov Dec 9, 2024
1c960dd
Added a fix for the build failure when STRESS_DYNAMIC_HEAP_COUNT is d…
mrsharm Dec 9, 2024
6fdfb78
[cdac] Handle no method def token when trying to get the IL version s…
elinor-fung Dec 9, 2024
a08711c
Normalization APIs using the spans (#110465)
tarekgh Dec 9, 2024
511b2e0
Fix test under JIT stress (#110538)
AaronRobinsonMSFT Dec 10, 2024
810bad5
JIT: extract BBJ_COND to BBJ_ALWAYS profile repair as utility (#110494)
AndyAyersMS Dec 10, 2024
cf92a64
Speed up surrogate validation in HttpUtility (#110478)
MihaZupan Dec 10, 2024
69229cd
Don't wait for finalizers in 'IReferenceTrackerHost::ReleaseDisconnec…
Sergio0694 Dec 10, 2024
dede4f9
[cdac] Fix calculation of `MethodDesc` optional slot addresses (#110491)
elinor-fung Dec 10, 2024
c3e071c
[dac] Make `GetObjectStringData` return the needed buffer element cou…
elinor-fung Dec 10, 2024
7b1214b
Remove ld_classic in 16+ (#110542)
agocke Dec 10, 2024
43c5fac
Fix AV error in DAC on Linux/MacOS - issue #109877 (#110557)
mikem8361 Dec 10, 2024
40014b6
[browser] fix code gen overflow (#110539)
pavelsavara Dec 10, 2024
5d539a2
Disable `HybridGlobalization` tests for WASM on CI (#110526)
ilonatommy Dec 10, 2024
0bb7432
JIT: Include more edges in `BlockDominancePreds` (#110531)
jakobbotsch Dec 10, 2024
e0f70cc
[mono][interp] Remove no_inlining functionality for dead bblocks (#11…
BrzVlad Dec 10, 2024
bbe9a9d
Enable more ILLinker skipped tests on native AOT (#110353)
MichalStrehovsky Dec 10, 2024
5750272
JIT: Avoid comparing regnums in `GenTreeHWIntrinsic::Equals` (#110535)
jakobbotsch Dec 10, 2024
2071313
fix FastOpen compilation (#110561)
wfurt Dec 10, 2024
f430ffa
Remove duplicate IsAscii check in string.IsNormalized (#110576)
MihaZupan Dec 10, 2024
c7fc667
Remove Helper Method Frames (HMF) from Reflection (#110481)
AaronRobinsonMSFT Dec 10, 2024
df0eaa2
Speed up surrogate validation in string.Normalize (#110574)
MihaZupan Dec 10, 2024
836b868
Use holding thread id in AwareLock to avoid orphaned lock crash (#107…
eduardo-vp Dec 10, 2024
6d18e0d
Cleanup some dead code (#110579)
huoyaoyuan Dec 11, 2024
2579b1e
Revert "[browser] fix code gen overflow (#110539)" (#110599)
jkotas Dec 11, 2024
8708c3d
Share threadpool configuration (#110469)
MichalStrehovsky Dec 11, 2024
d564cb3
Ensure that we don't try and optimize masks for promoted fields (#110…
tannergooding Dec 11, 2024
22001f7
[cDAC] SOSDacImpl::GetMethodDescData DynamicMethodObject (#110545)
max-charlamb Dec 11, 2024
ff171c4
Fix comments in AggregateException.GetBaseException() (#107743)
epsitec Dec 11, 2024
9652163
[browser] Remove WASM `HybridGlobalization` from library tests, WBT a…
ilonatommy Dec 11, 2024
d97abb1
fix wrong arguments order in CrlCacheExpired call (#110457)
Alex4414 Dec 11, 2024
fe9a96a
Fix TensorExtensions.StdDev (#110392)
lilinus Dec 11, 2024
0181b15
Use FLS detach as thread termination notification on windows. (#110589)
VSadov Dec 11, 2024
2692fc5
[Profiler] Avoid Recursive ThreadStoreLock in Profiling Thread Enumer…
mdh1418 Dec 11, 2024
097ed73
Remove HttpMetricsEnrichmentContext caching (#110580)
MihaZupan Dec 11, 2024
ab2fa84
[NRBF] Reduce the most time-consuming test case to avoid timeouts for…
adamsitnik Dec 12, 2024
ae492ef
All `WasmBuildTests` use static project from assets or `dotnet new`, …
ilonatommy Dec 12, 2024
8c80358
[Mono]: Update Mono diagnostic docs. (#110621)
lateralusX Dec 12, 2024
124986b
Update dependencies from dotnet/roslyn (#110105)
am11 Dec 12, 2024
b0f79a4
[Mono]: Fix Mono profiler EventPipe provider instrumentation feature.…
lateralusX Dec 12, 2024
add0aa3
SPMI: Avoid duplicate example diffs in diffs summary (#110619)
jakobbotsch Dec 12, 2024
0c0281e
[browser] fix code gen overflow - reapply (#110606)
pavelsavara Dec 12, 2024
432af20
JIT: Remove `VisitLoopBlocksLexical` utility (#110490)
amanasifkhalid Dec 12, 2024
80b8de7
Fix crash when pTargetMD is null (#110650)
thaystg Dec 12, 2024
57e0e9c
[main] Update dependencies from dotnet/roslyn (#110084)
dotnet-maestro[bot] Dec 12, 2024
c0e3f59
JIT: Fix reporting of tier name metadata (#110610)
jakobbotsch Dec 12, 2024
b3d059f
More WriteGather fixes (#109826)
adamsitnik Dec 12, 2024
c39d942
[cdac] Always re-read global pointers in GetUsefulGlobals (#110633)
elinor-fung Dec 12, 2024
3955bc8
Correct arm64 SignExtension (#110635)
mikelle-rogers Dec 12, 2024
46946fe
Delete .GuardCF library build (#110671)
MichalStrehovsky Dec 13, 2024
32acefa
[browser] NativeAOT-LLVM support in browser-bench (#110611)
maraf Dec 13, 2024
cb8d141
JIT: Remove always-true `fgCanRelocateEHRegions` (#110612)
jakobbotsch Dec 13, 2024
a4ca48f
[wasm] Add bench output log, to the file and to the console (#110669)
radekdoulik Dec 13, 2024
34cf5bc
JIT: Add an "init BB" invariant (#110404)
jakobbotsch Dec 13, 2024
d7cc790
Remove FabricBot from area-owners.md (#110525)
akoeplinger Dec 13, 2024
15e01d4
JIT: Spill newarr into temp (#110518)
hez2010 Dec 13, 2024
07e85b6
[cdac] Fix ISOSDacInterface13.TraverseLoaderHeap parameter type (#110…
elinor-fung Dec 13, 2024
05d687e
[cdac] Handle non-IL method descs in `RuntimeTypeSystem_1.GetMethodCl…
elinor-fung Dec 13, 2024
97f8570
[cDAC] Implement GCCover portion of SOSDacImpl::GetMethodDescData (#1…
max-charlamb Dec 13, 2024
f1b1f3d
Remove unused Precode::IsCorrectMethodDesc (#110703)
elinor-fung Dec 13, 2024
1502947
JIT: capture class types when spilling a GDV arg (#110675)
AndyAyersMS Dec 14, 2024
fd3c397
[cdac] Clear cached data as part of IXCLRDataProcess::Flush (#110700)
elinor-fung Dec 14, 2024
f52248f
Improve codegen for Vector512.ExtractMostSignificatBits (#110662)
tannergooding Dec 14, 2024
[mono][interp] Remove no_inlining functionality for dead bblocks (#110468)

Many methods in the BCL, especially those related to hardware intrinsics, contain a lot of code that is detected as dead during compilation. On Mono, inlining happens during IL import, while many optimizations run as later passes. As a result, inlining into dead code produced a lot of bloat, which the later optimization passes then had to process.

A simple solution to this problem was to track a jump count for each bblock (#97514). The counts are initialized when bblocks are first created, before the IL import stage. A small set of IL-import-level optimizations was then added to reduce the jump targets of each bblock. While importing IL, if we reached a bblock with 0 jump targets, we disabled inlining into it to reduce code bloat. Disabling code emission altogether was too challenging. Another limitation of this approach was that it failed to detect dead code that was part of a loop. The results were good nonetheless: memory usage in `System.Numerics.Tensor.Tests` dropped from 6GB to 600MB.
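The jump-count approach described above can be sketched roughly as follows. This is a toy model, not the code from `transform.c`: `ToyBlock`, `branch_target`, `ends_with_uncond` and `toy_mark_no_inlining` are hypothetical names, and each block is assumed to have at most one explicit branch target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for InterpBasicBlock. */
typedef struct {
    int  jump_targets;     /* number of branches known to target this block */
    int  branch_target;    /* index of the block this one branches to, or -1 */
    bool ends_with_uncond; /* ends in an unconditional branch, so no fall-through */
    bool no_inlining;      /* set when the block is detected early as dead */
} ToyBlock;

/* Walk the blocks in IL order. A block is marked no_inlining when nothing
 * branches to it and it is not reached by fall-through from a live block.
 * Block 0 is the entry and is always treated as live. */
static void toy_mark_no_inlining (ToyBlock *blocks, size_t count)
{
    assert (blocks != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (i > 0) {
            const ToyBlock *prev = &blocks [i - 1];
            bool falls_through_from_live = !prev->ends_with_uncond && !prev->no_inlining;
            if (blocks [i].jump_targets == 0 && !falls_through_from_live)
                blocks [i].no_inlining = true;
        }
        /* A branch out of a dead block must not keep its target alive. */
        if (blocks [i].no_inlining && blocks [i].branch_target >= 0)
            blocks [blocks [i].branch_target].jump_targets--;
    }
}
```

In this model a dead loop is missed: if an unreachable block is the target of a back edge from a later unreachable block, its count is still nonzero when the walk reaches it, so neither block is marked.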

To fix an unrelated issue, the order in which bblocks are generated was redesigned to account for bblock stack state initialization in unusual control flow scenarios (#108731). This was achieved by deferring IL import for bblocks that were not yet reached from other live bblocks. As a side effect, we no longer generate code at all in unreachable bblocks. This completely supersedes the previous approach and addresses both of its limitations: inlining into dead loops and generating IR for dead IL. In the previously mentioned test suite, this further reduced memory usage to 300MB.
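Deferring import until a bblock is reached from a live bblock amounts to a reachability walk over the control flow graph. A minimal sketch of that idea, assuming a fixed-size worklist and at most two successors per block, is shown below. `ToyCfgBlock`, `TOY_MAX_BLOCKS` and `toy_import_reachable` are hypothetical names; the actual ordering logic introduced in #108731 is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_BLOCKS 64

/* Hypothetical block with up to two successors; -1 means "no successor". */
typedef struct {
    int  succ [2];
    bool reached; /* set once the block has been queued for import */
} ToyCfgBlock;

/* Import only blocks reachable from the entry block (index 0).
 * Returns the number of blocks that were imported. */
static size_t toy_import_reachable (ToyCfgBlock *blocks, size_t count)
{
    int worklist [TOY_MAX_BLOCKS];
    size_t top = 0, imported = 0;

    assert (blocks != NULL && count > 0 && count <= TOY_MAX_BLOCKS);

    for (size_t i = 0; i < count; i++)
        blocks [i].reached = false;

    blocks [0].reached = true;
    worklist [top++] = 0;

    while (top > 0) {
        ToyCfgBlock *bb = &blocks [worklist [--top]];
        imported++; /* a real importer would translate this block's IL here */

        for (int s = 0; s < 2; s++) {
            int target = bb->succ [s];
            if (target >= 0 && (size_t)target < count && !blocks [target].reached) {
                /* Each block is queued at most once, so the worklist cannot overflow. */
                blocks [target].reached = true;
                worklist [top++] = target;
            }
        }
    }
    return imported;
}
```

Because blocks are only ever queued from a block that was itself reached, an unreachable loop is never imported, regardless of how many back edges point into it.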

Remnants of the now unnecessary `no_inlining` approach still lingered in the code, disabling inlining in some reachable code. This caused a significant performance regression, which this PR addresses.
BrzVlad authored and hez2010 committed Dec 14, 2024
commit e0f70cc6e7d18fd1cfd79e13bb32f6bd0aea677d
48 changes: 7 additions & 41 deletions src/mono/mono/mini/interp/transform.c
@@ -760,8 +760,6 @@ handle_branch (TransformData *td, int long_op, int offset)
 init_bb_stack_state (td, target_bb);

 if (long_op != MINT_CALL_HANDLER) {
-if (td->cbb->no_inlining)
-target_bb->jump_targets--;
 // We don't link finally blocks into the cfg (or other handler blocks for that matter)
 interp_link_bblocks (td, td->cbb, target_bb);
 }
@@ -803,8 +801,6 @@ one_arg_branch(TransformData *td, int mint_op, int offset, int inst_size)
 return FALSE;
 } else {
 // branch condition always false, it is a NOP
-int target = GPTRDIFF_TO_INT (td->ip + offset + inst_size - td->il_code);
-td->offset_to_bb [target]->jump_targets--;
 return TRUE;
 }
 } else {
@@ -901,8 +897,6 @@ two_arg_branch(TransformData *td, int mint_op, int offset, int inst_size)
 return FALSE;
 } else {
 // branch condition always false, it is a NOP
-int target = GPTRDIFF_TO_INT (td->ip + offset + inst_size - td->il_code);
-td->offset_to_bb [target]->jump_targets--;
 return TRUE;
 }
 } else {
@@ -2884,9 +2878,6 @@ interp_method_check_inlining (TransformData *td, MonoMethod *method, MonoMethodS
 if (td->disable_inlining)
 return FALSE;

-if (td->cbb->no_inlining)
-return FALSE;
-
 // Exception handlers are always uncommon, with the exception of finally.
 int inner_clause = td->clause_indexes [td->current_il_offset];
 if (inner_clause != -1 && td->header->clauses [inner_clause].flags != MONO_EXCEPTION_CLAUSE_FINALLY)
@@ -4151,7 +4142,6 @@ get_basic_blocks (TransformData *td, MonoMethodHeader *header, gboolean make_lis
 unsigned char *target;
 ptrdiff_t cli_addr;
 const MonoOpcode *opcode;
-InterpBasicBlock *bb;

 td->offset_to_bb = (InterpBasicBlock**)mono_mempool_alloc0 (td->mempool, (unsigned int)(sizeof (InterpBasicBlock*) * (end - start + 1)));
 get_bb (td, start, make_list);
@@ -4160,21 +4150,18 @@ get_basic_blocks (TransformData *td, MonoMethodHeader *header, gboolean make_lis
 MonoExceptionClause *c = header->clauses + i;
 if (start + c->try_offset > end || start + c->try_offset + c->try_len > end)
 return FALSE;
-bb = get_bb (td, start + c->try_offset, make_list);
-bb->jump_targets++;
+get_bb (td, start + c->try_offset, make_list);
 mono_bitset_set (il_targets, c->try_offset);
 mono_bitset_set (il_targets, c->try_offset + c->try_len);
 if (start + c->handler_offset > end || start + c->handler_offset + c->handler_len > end)
 return FALSE;
-bb = get_bb (td, start + c->handler_offset, make_list);
-bb->jump_targets++;
+get_bb (td, start + c->handler_offset, make_list);
 mono_bitset_set (il_targets, c->handler_offset);
 mono_bitset_set (il_targets, c->handler_offset + c->handler_len);
 if (c->flags == MONO_EXCEPTION_CLAUSE_FILTER) {
 if (start + c->data.filter_offset > end)
 return FALSE;
-bb = get_bb (td, start + c->data.filter_offset, make_list);
-bb->jump_targets++;
+get_bb (td, start + c->data.filter_offset, make_list);
 mono_bitset_set (il_targets, c->data.filter_offset);
 }
 }
@@ -4207,8 +4194,7 @@ get_basic_blocks (TransformData *td, MonoMethodHeader *header, gboolean make_lis
 target = start + cli_addr + 2 + (signed char)ip [1];
 if (target > end)
 return FALSE;
-bb = get_bb (td, target, make_list);
-bb->jump_targets++;
+get_bb (td, target, make_list);
 ip += 2;
 get_bb (td, ip, make_list);
 mono_bitset_set (il_targets, GPTRDIFF_TO_UINT32 (target - start));
@@ -4217,8 +4203,7 @@ get_basic_blocks (TransformData *td, MonoMethodHeader *header, gboolean make_lis
 target = start + cli_addr + 5 + (gint32)read32 (ip + 1);
 if (target > end)
 return FALSE;
-bb = get_bb (td, target, make_list);
-bb->jump_targets++;
+get_bb (td, target, make_list);
 ip += 5;
 get_bb (td, ip, make_list);
 mono_bitset_set (il_targets, GPTRDIFF_TO_UINT32 (target - start));
@@ -4231,15 +4216,13 @@ get_basic_blocks (TransformData *td, MonoMethodHeader *header, gboolean make_lis
 target = start + cli_addr;
 if (target > end)
 return FALSE;
-bb = get_bb (td, target, make_list);
-bb->jump_targets++;
+get_bb (td, target, make_list);
 mono_bitset_set (il_targets, GPTRDIFF_TO_UINT32 (target - start));
 for (j = 0; j < n; ++j) {
 target = start + cli_addr + (gint32)read32 (ip);
 if (target > end)
 return FALSE;
-bb = get_bb (td, target, make_list);
-bb->jump_targets++;
+get_bb (td, target, make_list);
 ip += 4;
 mono_bitset_set (il_targets, GPTRDIFF_TO_UINT32 (target - start));
 }
@@ -5446,13 +5429,6 @@ generate_code (TransformData *td, MonoMethod *method, MonoMethodHeader *header,

 /* We are starting a new basic block. Change cbb and link them together */
 if (link_bblocks) {
-if (!new_bb->jump_targets && td->cbb->no_inlining) {
-// This is a bblock that is not branched to and falls through from
-// a dead predecessor. It means it is dead.
-new_bb->no_inlining = TRUE;
-if (td->verbose_level)
-g_print ("Disable inlining in BB%d\n", new_bb->index);
-}
 /*
 * By default we link cbb with the new starting bblock, unless the previous
 * instruction is an unconditional branch (BR, LEAVE, ENDFINALLY)
@@ -5472,16 +5448,6 @@ generate_code (TransformData *td, MonoMethod *method, MonoMethodHeader *header,
 }
 // link_bblocks remains true, which is the default
 } else {
-if (!new_bb->jump_targets) {
-// This is a bblock that is not branched to and it is not linked to the
-// predecessor. It means it is dead.
-new_bb->no_inlining = TRUE;
-if (td->verbose_level)
-g_print ("Disable inlining in BB%d\n", new_bb->index);
-} else {
-g_assert (new_bb->jump_targets > 0);
-}
-
 if (new_bb->stack_height >= 0) {
 // This is relevant only for copying the vars associated with the values on the stack
 memcpy (td->stack, new_bb->stack_state, new_bb->stack_height * sizeof(td->stack [0]));
3 changes: 0 additions & 3 deletions src/mono/mono/mini/interp/transform.h
@@ -147,7 +147,6 @@ struct _InterpBasicBlock {
 StackInfo *stack_state;

 int index;
-int jump_targets;

 InterpBasicBlock *try_bblock;

@@ -160,8 +159,6 @@ struct _InterpBasicBlock {
 // This block has special semantics and it shouldn't be optimized away
 guint preserve : 1;
 guint dead: 1;
-// This bblock is detectead early as being dead, we don't inline into it
-guint no_inlining: 1;
 // If patchpoint is set we will store mapping information between native offset and bblock index within
 // InterpMethod. In the unoptimized method we will map from native offset to the bb_index while in the
 // optimized method we will map the bb_index to the corresponding native offset.