-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathrustgc_paper.ltx
1833 lines (1590 loc) · 95.6 KB
/
rustgc_paper.ltx
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
%&rustgc_paper_preamble
\endofdump
\begin{document}
\begin{abstract}
\noindent Rust is a non-Garbage Collected (GCed) language, but the lack of GC
makes expressing data-structures that require shared ownership awkward,
inefficient, or both. In this paper we explore a new design for, and implementation of, GC in
Rust, called \ourgc. Unlike previous
approaches to GC in Rust, \ourgc maps existing Rust destructors to
finalizers: this makes GC in Rust natural to use but introduces surprising
soundness, performance, and ergonomic problems. \ourgc provides solutions for
each of these problems.
\end{abstract}
\maketitle
\section{Introduction}
\begin{figure}[t]
\lstinputlisting[
language=Rust,
firstline=6,
basicstyle=\footnotesize\ttfamily,
caption={An \ourgc example, showing \lstinline{Gc<T>}
and destructors as finalizers. We create a type \lstinline{GcNode} which
models a graph: it stores an 8 bit integer value and a possibly-null
reference (via Rust's standard \lstinline{Option} type) to a neighbouring node
(line 1). We add a normal Rust destructor which \ourgc is able to use as a
finalizer when \lstinline{GcNode} is used inside \lstinline{Gc<T>} (line 2).
Inside \lstinline{main} we create the first GCed node in the graph (line 5).
We use Rust's normal \lstinline{RefCell} type to allow the node to be mutated
(using the \lstinline{RefCell::borrow\_mut} method which dynamically checks
for mutation that would undermine Rust's static rules)
and a cycle created directly back to itself (line 6). We then create a second cyclic graph (lines 7
and 8), immediately assigning it to the \lstinline{gc1} variable (line 9):
this copies, rather than moves, the \lstinline{Gc<T>}.
This cause the first cyclic graph \lstinline{GcNode\{value: 1, ..\}}
to no longer be reachable, so after forcing a collection (line 10) that node
can be collected. Its finalizer is then scheduled to be run, causing
\lstinline{drop 1} to be printed out at a later point; when it has completed the GC
heap memory can be reclaimed. The print statement outputs \lstinline{2 2} (line
11).},
label={fig:first_example}
]{listings/first_example.rs}
\end{figure}
Amongst the ways one can classify programming languages are whether they
are Garbage Collected (GCed) or not: GCed languages enable implicit memory management;
non-GCed languages require explicit memory management (e.g~\lstinline{C}'s \lstinline{malloc} /
\lstinline{free} functions). Rust's use of affine types~\citep[p.~5]{pierce04advanced}
and ownership does not
fit within this classification: it is not GCed but it has implicit memory management.
Most portions of most Rust programs are as
succinct as a GCed equivalent, but ownership is too inflexible to express
\emph{shared ownership} for data-structures that require multiple owners
(e.g.~doubly linked lists).
Workarounds such as reference counting impose an extra burden on the programmer,
make mistakes more likely, and often come with a performance penalty.
In an attempt to avoid such problems, there are now a number of GCs for Rust
(e.g.~\cite{manish15rustgc, coblenz21bronze, gcarena, boa, shifgrethor}). Most
introduce a user-visible type \lstinline{Gc<T>} which takes a value $v$
of type \lstinline{T} and moves $v$ to the `GC heap'. The \lstinline{Gc<T>}
value itself is a wrapper around a pointer to $v$ on the GC heap.
\lstinline{Gc<T>} can be \emph{cloned} (i.e.~duplicated) and
\emph{dereferenced} to a value of type \lstinline{&T} (i.e.~a type-safe pointer) at will by the user. When no
\lstinline{Gc<T>} wrappers pointing to $v$
can be found, indirectly or directly, from the
program's \emph{roots} (e.g.~variables on the stack),
then the GC heap memory for $v$ can be reclaimed.
It has proven hard to find a satisfying design and implementation for a GC for
Rust, as perhaps suggested by the number of attempts to do so.
We identify two fundamental challenges
for GC for Rust: how to give \lstinline{Gc<T>} an idiomatic and complete
API; and how to make \emph{finalizers} (i.e.~the code that is run just before a
value is collected by the GC) safe, performant, and ergonomic. We show that
\emph{conservative} GC (i.e.~treating each reachable machine word as
a potential pointer) is necessary and sufficient to solve the API challenge,
but the finalization challenge is more involved.
Normal Rust code uses \emph{destructors} (i.e.~code which is run just before a
value is reclaimed by Rust's implicit memory management) extensively. Although
finalizers and destructors may seem to be synonyms, existing GCs for Rust
cannot reuse destructors as finalizers:
the latter must be manually implemented for each type that needs it.
Unfortunately, even this is trickier than it appears:
it is not possible to implement a finalizer for
\lstinline{Gc<T>} if \lstinline{T} is an external library; some parts of
destructors are automatically created by the Rust compiler, but
hand-written finalizers must duplicate those parts manually; and users
can accidentally cause a type's finalizer to be run more than once. In
short, finalization in existing GCs for Rust is unpalatable.
GCs for Rust are not alone in requiring manually written finalizers.
In a close cousin to our work,
a GC proposal for C++, the reuse of destructors as finalizers was ruled out due to
seemingly insoluble problems~\cite[p.~32]{boehm09garbage}, which we divide
into four categories:
(1) finalizers are prohibitively slower than destructors;
(2) finalizers can be run prematurely;
(3) running finalizers on the same thread as a paused mutator can cause race conditions and deadlocks;
(4) some safe destructors are not safe finalizers.
All are, at least to some degree, classical GC problems; all are exacerbated
in some way by Rust; and none, with the partial exception of (2), has
existing solutions.
In this paper we show that it is possible to reuse most
Rust destructors as finalizers in a satisfying way. We introduce novel solutions to the
long-standing problems this implies by making use of some of Rust's unusual
static guarantees. We thus gain a better GC for
Rust and solutions to open GC problems. Our solutions, in order, are:
(1) \emph{finalizer elision} statically optimises away finalizers if the
underlying destructor duplicates the GC's work;
(2) \emph{premature finalizer prevention} automatically inserts barriers to prevent
the GC from being `tricked' into collecting values before they are dead;
(3) we run finalizers on a separate thread; and
(4) \emph{finalizer safety analysis}
extends Rust's static analyses to reject programs whose destructors are not
provably safe to be used as finalizers.
These solutions are implemented as part of \ourgc, a new GC for Rust:
an example of its use is shown in \cref{fig:first_example}.
Although \ourgc is not production ready, it has good enough performance
(across a number of benchmarks, its performance is \jake{geomean over whole suite}) and
other polish (e.g.~good quality error messages) that we believe it shows
a plausible path forwards for those who may wish to follow it. Furthermore,
although \ourgc is necessarily tied to Rust, we suspect that most of the
techniques in this paper, particularly those related to finalization, will
generalise to other ownership-based languages.
This paper is divided into four main parts: background (\cref{sec:background});
\ourgc's basic design (\cref{sec:alloy_design}); destructor and finalizer challenges
and solutions (\cref{sec:destructor challenges,sec:elision,sec:premature_finalize_prevention,sec:fsa}); and evaluation
(\cref{sec:evaluation}). The first three parts have the challenge that our work
straddles two areas that seem almost mutually exclusive: GC and Rust. We have
tried to provide sufficient material for readers expert in one of these
areas to gain adequate familiarity with the other, without boring either, but we explicitly encourage
readers to skip material they are already comfortable with.
\section{Background}
\label{sec:background}
\subsection{Does Rust need a GC?}
\begin{figure}[t]
\lstinputlisting[
language=Rust,
firstline=5,
basicstyle=\footnotesize\ttfamily,
caption={
A version of~\cref{fig:first_example} using Rust's standard reference
counting type \lstinline{Rc<T>}. To avoid memory leaks we use \emph{weak}
references between nodes (line 1). We again create two cyclic graphs (lines
6--9) using \lstinline{Rc::downgrade} to create weak references (lines 7 and
0). Since \lstinline{Rc<T>} is not copyable, we must use a manual
\lstinline{clone} call to have both the \lstinline{rc1} and \lstinline{rc2}
variables point to the same cyclic graph (line 10). Accessing a neighbour
node becomes a delicate dance requiring upgrading the weak reference (line 11).
The need to downgrade \lstinline{Rc<T>} to \lstinline{Weak<T>} and upgrade
(which may fail, hence the \lstinline{unwrap}) back to \lstinline{Rc<T>}
creates significant extra complexity relative to~\cref{fig:first_example}: compare
line 11 in \cref{fig:first_example} to (the much more complex) lines 10-12
in this \lstinline{Rc} example.
},
label={fig:rc_example}
]{listings/rc_example.rs}
\end{figure}
Rust uses affine types and \emph{ownership}
to statically guarantee that: a value has a
single owner (e.g.~a variable); an owner can \emph{move} (i.e.~permanently
transfer the ownership of) a value to another owner; and
when a value's owner goes out of scope, the value's destructor
is run and its backing memory reclaimed. An owner can pass \emph{references} to a value
to other code, subject to the following static restrictions: there can be
multiple immutable references (`\lstinline{&}') to a value or a single
mutable reference (`\lstinline{&mut}'); and references cannot outlast the owner.
These rules allow many Rust programs to be as succinct as their equivalents
in GCed languages. This suggests that the search for a good GC for Rust may be
intellectually stimulating but of little practical value.
However, there are many programs which need to express data structures
which do not fit into the restrictions of affine types and
ownership. These are often described as `cyclic data-structures', though
that sometimes gives the incorrect impression that only data structures
such as doubly linked lists are of interest. In reality, programs as diverse as
interpreters for dynamically typed languages have need to use such types.
In this paper we use the more abstract term `shared ownership', which includes,
but is not limited to, cyclic data-structures.
A common way of working around these limitations
is the reference counting type \lstinline{Rc<T>} in Rust's
standard library. For many data-structures, this is a reasonable
solution, but some forms of shared ownership require
juggling strong and weak counts. This complicates the program
(see~\cref{fig:rc_example}) and makes
it easy for values to live shorter or longer than intended.
Another workaround is to store values in a vector and use
integer indices into that vector. Such indices are morally closer to
machine pointers than normal Rust references: the indices can become
stale, dangle, or may never have been valid in the first place. The programmer
must also manually deal with issues such as detecting unused values,
compaction, and so on. In other words, the programmer
ends up writing a partial GC themselves. A variant of this idea are
\emph{arenas}, which gradually accumulate multiple values but free all of them in one go: values
can no longer be reclaimed too early, though arenas tend to unnecessarily
increase the lifetime of objects.
A type-based approach is
\lstinline{GhostCell}~\cite{yanovski21ghostcell}, which uses \emph{branding} to
statically ensure that only one part of a program can access a
shared ownership data-structure at a time.
However, this implicitly prevents multiple owners (e.g.~in different threads)
from reading or mutating different parts of the structure.
Although it is easily overlooked, some workarounds (e.g.~\lstinline{Rc<T>})
rely on using \emph{unsafe} Rust (i.e.~parts of the language, often involving
pointers, that are not fully statically checked by the compiler). Pragmatically,
we assume that widely used code, even if technically unsafe, has been pored
over sufficiently that it is trustworthy in practise. However,
`new' solutions that a programmer implements using
unsafe Rust are unlikely to immediately reach the same level of trustworthiness.
While we do not believe that every Rust program would be improved by GC, the
variety of workarounds already present in Rust code, and the difficultly
of creating new ones, suggests that a substantial subset might benefit from GC.
\subsection{GC terminology}
GC is a venerable field and has accumulated terminology that can seem
unfamiliar or unintuitive. We mostly use the same terminology
as~\cite{jones23garbage}, the major parts of which we define here.
A program which uses GC is split between the \emph{mutator} (the user's program) and
the \emph{collector} (the GC itself). At any given point in time, a particular thread is either
running as a mutator or a collector. In our context, all threads
run as a collector at least sometimes, with some threads always running as a collector.
Tracing and reclamation is performed during a \emph{collection} phase. Our
collections always \emph{stop-the-world}, where all threads running
mutator code are paused while collection occurs.
A \emph{tracing} GC is one that scans memory looking
for reachable objects from a program's roots: objects that are not reachable
from the roots can then be \emph{reclaimed}. In contrast, a reference counting GC does
not scan memory, and thus cannot free objects that form a cycle. As is common
in most GC literature, henceforth we use `GC' to mean `tracing GC' (and not
reference counting or the like) unless we explicitly say otherwise.
We refer to memory which is allocated via \lstinline{Gc<T>} as being on
the \emph{GC heap}. We use the term `GC value' to refer both to the pointer wrapped in a
\lstinline{Gc<T>} and the underlying value on the GC heap, even though multiple
pointers / wrappers can refer to a single value on the GC heap, unless doing so
would lead to ambiguity.
We use `\ourgc' to refer to the combination of: our extension to the Rust
language; our modifications to the \lstinline{rustc} compiler; and our
integration of the Boehm-Demers-Weiser GC (\boehm) into the runtime of programs
compiled with our modified \lstinline{rustc}.
\section{\ourgc: Design and Implementation}
\label{sec:alloy_design}
In this section we outline \ourgc's basic design and implementation choices --
the rest of the paper then goes into detail on the more advanced aspects.
\subsection{Basic Design}
\label{sec:basic design}
\ourgc provides a \lstinline{Gc<T>} type that exposes an API modelled on the
\lstinline{Rc<T>} type from Rust's standard library, because
\lstinline{Rc<T>}: is conceptually
similar to \lstinline{Gc<T>}; widely used in Rust code, and its API
familiar; and that API reflects long-term experience about what Rust programmers
need.
When a user calls \lstinline{Gc::new(v)}, the value \lstinline{v} is
moved to the GC heap: the \lstinline{Gc<T>} value returned to the user is a
simple wrapper around a pointer to \lstinline{v}'s new address. The same underlying GCed value
may thus have multiple, partly overlapping, references active at any point.
To avoid undermining
Rust's ownership system, this means that dereferencing a \lstinline{Gc<T>} must
produce an immutable (i.e.~`\lstinline{&}') reference to the underlying value.
If the user wishes to mutate the underlying value, they must use other Rust
types that enable \emph{interior mutability} (e.g.~\lstinline{RefCell<T>} or
\lstinline{Mutex<T>}).
One feature that \ourgc explicitly adopts is \lstinline{Rc<T>}'s
ability to be transformed into a raw pointer (\lstinline{into_raw}) and
back (\lstinline{from_raw}). Crucially, transforming a \lstinline{Rc<T>} / \lstinline{Gc<T>}
into a pointer consumes the high-level wrapper: the
resulting pointer is thus not constrained by the wrapper's lifetime.
Though many programmers do not directly
encounter these functions, they are a vital link between Rust's high and
low-level features (e.g.~being used for the C Foreign Function Interface
(FFI) alongside the ability to cast Rust references to pointer types and back).
We believe that a viable GC for Rust must include this same
ability, but doing so has a profound impact because Rust allows raw pointers to be
converted to the integer type \lstinline{usize} and back\footnote{Although
it is outside the scope of this paper, it would
be preferable for Rust to have different types for `data width' and
`address'. Modern C, for example, captures this difference with the \lstinline{size_t} and
\lstinline{uintptr_t} types respectively. Rust now has a provenance lint to
nudge users in this general direction, but the \lstinline{as}
keyword still allows arbitrary conversions.}.
\label{conservative_gc}
Having acknowledged that pointers can be disguised as integers, it is then
inevitable that \ourgc must be a conservative GC: if a machine word's integer
value, when considered as a pointer, falls within a GCed block of memory,
then that block itself is considered reachable (and is transitively scanned).
Since a conservative GC cannot know if a word is really a pointer, or just happens to be a sequence of
bits that look like a valid pointer, this over-approximates the
\emph{live set} (i.e.~the blocks that the GC will not reclaim). Typically
the false detection rate is too low to be a worry (see e.g.~a Java study which measures
it at under 0.01\% of live objects~\cite{shahriyar14fast}).
Conservative GC occupies a grey zone in programming language semantics: in most
languages, and most compiler's internal semantics, conservative GC is, formally
speaking, unsound; and furthermore some languages (including Rust) allow
arbitrary `bit fiddling' on pointers that temporarily
obscures the address they are referring to. Despite this, conservative GC is widely used,
including in the two most widespread web browsers: Chrome uses it in its Blink
rendering engine~\citep{ager13oilpan} and Safari uses it in its JavaScript VM
JavaScriptCore~\citep{pizlo17riptide}. Even in 2024, we lack good alternatives
to conservative GC: there is no cross-platform API for precise GC; and while
some compilers such as LLVM provide some support for GC
features~\cite{llvm14statepoints}, we have found them incomplete and buggy.
Despite the potential soundness worries, conservative GC thus remains a widely
used technique.
\label{gc_is_copyable}
Conservative GC enables \ourgc to make a useful ergonomic improvement over
most other GCs for Rust whose \lstinline{Gc<T>} is only \emph{cloneable}. Such types can be duplicated, but doing
so requires executing arbitrary user code. To make the possible run-time cost of this clear, Rust has
no direct syntax for cloning: users must explicitly call \lstinline{Rc::clone(&v)}
to duplicate a value \lstinline{v}. In contrast, since \ourgc's \lstinline{Gc<T>} is just a wrapper around a pointer it
is not just cloneable but also \emph{copyable}: duplication only requires copying
bytes (i.e.~no arbitrary user code need be executed). Copying is implied by assignment
(i.e. \lstinline{w = v}),
reducing the need for a function call that cloning requires entirely\footnote{The lengthier
syntax \lstinline{y = Gc::clone(&v)} is available, since every copyable type is
also cloneable.}. This is not just a syntactic convenience but also reflects an underlying
semantic difference: duplicating a \lstinline{Gc<T>} in \ourgc is is a cheaper and simpler operation
than most other GCs for Rust which which tend to rely in part on reference counting.
\subsection{Basic Implementation}
The most visible aspect of \ourgc is its fork, and extension of, the standard
Rust compiler \rustc. We forked \rustc~\rustcversion and have
added or changed approximately 3,150 \laurie{needs updating? also split apart the LoC for tests} Lines of Code (LoC). Users must opt-in
to GC by enabling the allocator and assigning it to an arbitrarily-named variable:
\begin{lstlisting}[language=Rust, numbers=none, basicstyle=\footnotesize\ttfamily]
#![feature(gc)] use std::gc::GcAllocator;
#[global_allocator] static A: GcAllocator = GcAllocator
\end{lstlisting}
\ourgc uses \boehm as the underlying conservative GC, because it is the
most widely ported conservative GC we know of. Although we adapted \boehm
for \ourgc, we expect that our techniques will apply to other conservative GCs with
relative ease.
We made the following changes to \boehm. First, we disabled \boehm's parallel collector
because, for reasons we don't fully understand, it worsens \ourgc's
performance. Second, \boehm cannot scan pointers stored in thread locals
because these are platform dependent. Fortunately, \rustc uses LLVM's
thread local storage implementation, which stores such pointers in the
\lstinline{PT_TLS} segment of the ELF binary: we modified \boehm to scan
this ELF segment during each collection. Third,
\boehm dynamically intercepts thread creation calls so that it can
then can scan their stacks, but (for bootstrapping
reasons) is unable to do so in our context: we explicitly changed \ourgc
to register new threads with \boehm.
We made one conceptually larger -- though only 60LoC -- change to \boehm,
making it run all finalizers on a
separate thread (see~\cref{sec:general_challenges,sec:fsa_finalizer_thread} for
motivation). We lazily create a thread dedicated to this once the collector has
first identified objects that need finalizing. The finalizer thread then runs
until the end of the program waiting for further work;
the collector notifies the finalizer thread of additional work via a condition
variable.
\section{Destructors and Finalizers}
\label{sec:destructor challenges}
In many GCed languages, `destructor' and `finalizer' are used as synonyms, as
both terms refer to code run when a value's lifetime has ended. In existing GCs
for Rust, these two terms refer to different hierarchies of code (i.e.~destructors
and finalizers are fundamentally different). As we will see later, in \ourgc, finalizers can
often be thought of as a subset of destructors. In this section we pick apart
these differences, before describing the challenges of using destructors as
finalizers.
When a value in Rust is \emph{dropped} (i.e.~the value's owner went out of lexical
scope) its destructor is automatically run. Rust destructors are formed of two
parts, run in the following order: a user-defined \emph{drop method}; and
automatically inserted \emph{drop glue}. Drop methods are optional; users
can provide one for a type by implementing the \lstinline{Drop} trait's \lstinline{drop}
method. Drop glue recursively calls destructors of contained types (e.g.~fields
in a \lstinline{struct}). Although it is common usage to conflate `destructor' in
Rust with drop methods, drop glue is an integral part of a Rust destructor:
we therefore use `destructor' as the umbrella term for both drop methods and drop glue.
Rust's destructors enable a style of programming that originated in C++ called RAII (Resource
Acquisition Is Initialization)~\cite[Section~14.4]{stroustrup97c++}: when a
value is dropped, the underlying resources it possesses (e.g.~file handles or heap memory)
are released. Drop methods are not only used frequently, but programmers regularly
implement their own drop methods.
Finalizers are an important concept for most non-Rust GCs, subsuming the notion of destructors.
However, existing GCs for Rust are forced to have separate notions of destructors and finalizers.
Where the former have the \lstinline{Drop} trait, the later typically have
a \lstinline{Finalize} trait. If a user type needs to be finalized then
the user must provide an implementation of the \lstinline{Finalize} trait for that type.
However, doing so introduces a number of problems: (1) external libraries are
unlikely to provide finalizers and so placing their types in a \lstinline{Gc<T>}
tends to break RAII expectations; (2) Rust's \emph{orphan
rule}~\cite[Section~6.12]{rustlangref} prevents one implementing traits for
types defined in external libraries (i.e.~unless a library's types were
designed to support \lstinline{Gc<T>}, those types cannot be directly GCed);
(3) one cannot automatically replicate drop glue for finalizers; and (4) one
cannot replicate \rustc's refusal to allow calls to the equivalent of
\lstinline{Drop::drop}.
Programmers can work around problems \#1 and \#2 in various ways. For example,
they can wrap external library types in \emph{newtypes} (zero-cost wrappers)
and implement finalizers on those instead~\cite[Section~19.3]{klabnik18rust}.
Doing so is tedious but not conceptually difficult.
Problem \#3 has partial solutions: for example, ~\cite{manish15rustgc} uses the
\lstinline{Trace} macro to generate \emph{finalizer glue} (the finalizer equivalent of drop glue) for
\lstinline{struct} fields. This runs into an unsolvable variant of problem \#2:
types in external libraries will not implement this trait and cannot be
recursively scanned for finalizer glue.
Problem \#4 is impossible to solve in Rust as-is. One cannot define a function
that can never be called --- what use would such a function have? It might seem
tempting to have the \lstinline{finalize} method take ownership of the value,
but \lstinline{Drop::drop} does not do so because that would not allow drop
glue to be run afterwards.
\subsection{General Challenges When Using Destructors as Finalizers}
\label{sec:general_challenges}
We have stated as our aim that \ourgc should use destructors as finalizers.
Above we explained some Rust-specific challenges --- but there are several
non-Rust-specific challenges too! Fundamentally, finalizers and destructors
have different, and sometimes incompatible, properties. The best
guide to these differences, and the resulting problems, is~\cite{boehm03destructors},
supplemented by later work by some of the same authors on support
for GC in the C++ specification~\cite{boehm09garbage}\footnote{These features
were added to the C+11 specification, but do not seem to have been implemented by
compilers. C++23 removed these features.}.
An obvious difference between destructors and finalizers is when both
are run. Where C++ and Rust define
precisely when a destructor will be run\footnote{Mostly. Rust's `temporary
lifetime extension' delays destruction, but for how long is currently
unspecified.}, finalizers run at an unspecified point in time. Although
we often assume this happens after the last reference to a GCed value is lost,
finalizers can run \emph{prematurely}, that is before the equivalent
destructor~\cite[section~3.4]{boehm03destructors}.
A less obvious difference relates to where destructors and finalizers are run.
Destructors run in the same thread as the last owner of a value.
However, running finalizers in the same thread as the last owner of the value
can lead to race conditions and deadlocks if a finalize tries to access a resource that the mutator
expects to have exclusive access too~\cite[section~3.3]{boehm03destructors}.
When such problems affect destructors in normal Rust code, it is the clear result of programmer error, since they should
have taken into account the predictable execution point of destructors. However, since
finalizers have no such predictable execution point, there is no way for finalizers
to safely access shared resources if they are run on the same thread.
The only way to avoid this to run
finalizers on a non-mutator thread: however, not all Rust types / destructors
are safe to run on another thread.
Finally, finalization and cycles are not happy bedfellows: finalizers
can reference other GCed values that are partly, or wholly, `finalized' and may
even have had their backing memory reused. A related, but
distinct, problem is the ability of finalizers to \emph{resurrect} values by
copying the reference passed to the finalizer and storing it somewhere.
\subsection{The Challenge of Finalizers for \ourgc}
At this point we hope to have convinced the reader of two general points: a
viable GC for Rust needs to be able to use existing destructors as finalizers
whenever possible; and that finalizers, even in existing GCs, cause
various problems.
Over time, finalizers have come to be viewed with increasing suspicion. Java,
for example, has deprecated, and intends eventually removing, per-type
finalizers: instead it has introduced deliberately less flexible per-object `cleaners', whose API
prevents problems such as object resurrection~\cite{goetz21deprecated}. It
is important to differentiate such mechanisms from the \lstinline{Finalize} trait that many existing GCs for
Rust force users to manually implement: cleaners impose restrictions
to make finalizers less problematic where existing \lstinline{Finalize} traits
mostly do not.
Our desire that \ourgc should use existing Rust destructors as finalizers whenever
possible may seem out of reach. As alluded to above, in a proposal for GC
for C++, the desire was ruled out as the
problems were thought insoluble~\cite[p.~32]{boehm09garbage}. We break these
problems down into four:
(1) finalizers are prohibitively slower than destructors;
(2) finalizers can be run prematurely;
(3) running finalizers on the same thread as a paused mutator can cause race conditions and deadlocks;
(4) some safe destructors are not safe finalizers.
Fortunately for us, Rust's unusual static guarantees, suitably expanded by
\ourgc, allow us to address each problem in novel, satisfying, ways. In the following
section, we tackle these problems in the order above, noting that we tackle problems
\#1 and \#2 separately, and \#3 and \#4 together.
\section{Finalizer Elision}
\label{sec:elision}
Finalizers are much slower to run than destructors --- running every Rust destructor
as a GC finalizer in \ourgc causes a significant slowdown \laurie{this is a case where a `worse of' might make sense, and i think it's 4x or similar?}
in our benchmark suite. In this section we show how to sidestep much of this overhead.
A variety of factors contribute to the finalizer performance overhead, including:
a queue of finalizers
must be maintained, whereas destructors can be run immediately; and finalizers run
some time after the last access of a value making cache misses more likely.
Most of these factors are inherent to any GC and
our experience of using and working on \boehm -- a mature, widely used GC -- does
not suggest that it is missing optimisations which would overcome all of this overhead.
Instead, whenever possible, \ourgc \emph{elides} finalizers so that they do not need to be run at all.
We are able to do this because many Rust destructors
only do work that a GC would do anyway and have no meaningful effect when run as finalizers.
Consider the standard Rust type \lstinline{Box<T>} which heap allocates space for a value;
when a \lstinline{Box<T>} value is dropped, the heap allocation will be freed.
We can then make two observations. First,
\lstinline{Box<T>}'s drop method solely consists of a \lstinline{deallocate}
call. Second, while we informally say
that \lstinline{Box<T>} allocates on the `heap' and \lstinline{Gc<T>} allocates
on the `GC heap', all allocations in \ourgc are made through \boehm and stored in
a single heap.
When used as a finalizer,
\lstinline{Box<T>}'s drop method is thus unneeded, as the underlying memory will
naturally be freed by \boehm anyway\footnote{Indeed, since the drop method only
calls \lstinline{deallocate}, any resulting finalizer would cause the underlying
allocation to live longer than if the finalizer was not present!}
This means that there is no need to run a finalizer for a type such as
\lstinline{Gc<Box<u8>>} and we can statically elide it. However,
just because a `top-level''s drop method might be unnecessary does not
mean that transitively reachable drop methods via drop glue are unnecessary. For example,
\lstinline{Gc<Box<Arc<u8>>} (where \lstinline{Arc<T>} is the thread-safe
version of \lstinline{Rc<T>}) requires a finalizer because
\lstinline{Arc<T>}'s drop method needs to decrement a reference count.
\subsection{Implementing finalizer elision}
\begin{algorithm}[t]
\caption{Finalizer Elision}
\label{alg:elision}
\begin{algorithmic}
\Function{RequiresFinalizer}{$T$}
\If {\Call{Impls}{$T$, $Drop$} \AND \NOT \Call{Impls}{$T$, \textit{DropMethodFinalizerElidable}}}
\State \Return{true}
\EndIf
\ForEach {\textit{field} $\in T$}
\If{\Call{RequiresFinalizer}{\textit{field}}}
\State \Return{true}
\EndIf
\EndFor
\State \Return{false}
\EndFunction
\end{algorithmic}
\end{algorithm}
\label{needs_finalizer_intrinsic}
Finalizer elision statically determines which type's destructors do
not require corresponding finalizers and elides them. It does so conservatively,
and deals correctly with drop glue.
\label{dropmethodfinalizerelidable}
The high-level \cref{alg:elision} says that any type which implements
the \lstinline{Drop} trait requires finalization unless it also implements the
new \lstinline{DropMethodFinalizerElidable} \emph{marker trait} (i.e.~a
trait without methods). This trait can be
used by a programmer to signify that a type's drop method need
not be called if the type is placed inside a \lstinline{Gc<T>}. The `Drop'
part of the trait name is deliberate (i.e.~it is not a simplification of
`destructor'): it allows the programmer to reason about a type locally.
If the type has a transitively reachable field
whose type implements the \lstinline{Drop} trait but not the
\lstinline{DropMethodFinalizerElidable} trait, then then the top-level type
still requires finalization. This not only checks drop glue, but allows the
programmer to reason about type parameters without knowing
the concrete types passed to those parameters.
Even though neither normal Rust destructors or \ourgc finalizers are guaranteed
to run, a program whose destructors or finalizers never run would probably not
be usable (leaking resources such as memory, deadlocking, and so on). We
therefore make \lstinline{DropMethodFinalizerElidable} an unsafe
trait, because implementing it inappropriately is likely to lead to undesired
-- though not incorrect! -- behaviour at run-time. To implement it one therefore
needs to opt in to unsafe Rust:
\begin{lstrustsmall}
unsafe impl<T> DropMethodFinalizerElidable for Box<T> {}
\end{lstrustsmall}
\ourgc modifies the standard Rust library to implement
\lstinline{DropMethodFinalizerElidable} on the following types: \lstinline{Box<T>},
\lstinline{Vec<T>}, \lstinline{RawVec<T>}, \lstinline{VecDeque<T>},
\lstinline{LinkedList<T>}, \lstinline{BTreeMap<K, V>}, \lstinline{BTreeSet<T>},
\lstinline{HashMap<K, V>}, \lstinline{HashSet<T>},
\lstinline{RawTable<K, V>}\footnote{This is contained in the separate \lstinline{hashbrown} crate,
which is then included in Rust's standard library.}, and \lstinline{BinaryHeap<T>}. Fortunately,
not only are these types' drop methods compatible with \lstinline{DropMethodFinalizerElidable},
but they are extensively used in real Rust code: they enable significant numbers of
finalizers to be elided.
\begin{figure}[t]
\begin{lstlisting}[
language=Rust,
numbers=none,
basicstyle=\footnotesize\ttfamily,
label={listing:elision_in_rustc},
caption={
A simplified view of how finalizers are elided inside \ourgc. The new compiler intrinsic
\lstinline{needs_finalizer} returns true if a finalizer is required for a
type. The \lstinline{Gc<T>} type uses this intrinsic to ensure that the
value is registered as requiring a finalizer. Because \lstinline{needs_finalizer}
is a \lstinline{const} function, with optimisations turned on, it is inlined
and the branch optimised away. In other words, the seemingly dynamic, branching code
in \lstinline{Gc::new} turns into static, branchless code.
}]
impl<T> Gc<T> {
pub fn new(value: T) -> Self {
if needs_finalizer::<T>() { Gc<T>::new_with_finalizer(value) }
else { Gc<T>::new_without_finalizer(value) }
...
}
}
\end{lstlisting}
\end{figure}
\cref{listing:elision_in_rustc} shows how we use \ourgc's \texttt{const} compiler intrinsinc
\lstinline{needs_finalizer} to turn \cref{alg:elision} into reality.
\lstinline{Gc::new} uses this intrinsic to decide
whether a finalizer must be registered or not. Although this looks like
a dynamic check, in fully optimised builds it turns into a static
check: \lstinline{needs_finalizer} is
evaluated at compile-time and its result can be inlined into \lstinline{Gc::new};
this then allows the associated conditional to be removed too. In other words --
compiler optimisations allowing -- the `does this specific type require a
finalizer?' checks have no run-time cost at all.
\section{Premature Finalizer Prevention}
\label{sec:premature_finalize_prevention}
\begin{figure}[tb]
\begin{lstlisting}[
language=Rust,
basicstyle=\footnotesize\ttfamily,
caption={An example of possible premature
finalization. We create a new struct \lstinline{S}
(line 1) with a drop method that sets the wrapped integer to zero (line 2). In the
main method, we move an instance of the struct into a
\lstinline{Box<T>}, which we then move into a \lstinline{Gc<T>} (line 4). We
obtain a Rust reference to the inner integer (line 5), which at
run-time will be a pointer to the \lstinline{Box<T>}. At this
point, the compiler can determine that the \lstinline{Gc<T>} is no longer
used and overwrite \lstinline{root}'s pointer (which may be in a register). If
a collection then occurs, a finalizer can run \lstinline{S}'s drop method,
causing the program to print `0' instead of the expected `1' (line 7).},
label={fig:premature_finalization}]
struct S { value: u8 }
impl Drop for S { fn drop(&mut self) { self.value = 0; } }
fn main() {
let root = Gc::new(Box::new(S{ value: 1 }));
let inner: &u8 = &**root.value;
force_gc();
println!("{}", *inner);
}
\end{lstlisting}
\end{figure}
Most of us assume that finalizers are always run later than the
equivalent destructor would have run, but they can sometimes run
before~\cite[section~3.4]{boehm03destructors}, undermining soundness.
Such premature finalization is also possible in \ourgc as described thus far
(see~\cref{fig:premature_finalization}). In this section we
show how to prevent, and then optimise, premature finalization.
There are two aspects to premature finalization. First, language
specifications often do not define, or do not precisely define, when the earliest point that a value can
be finalized is. While this means that, formally, there is no `premature' finalization,
it seems unlikely that language designers anticipated some of the resulting
implementation surprises (see e.g.~this example in
Java~\cite{shipilev20local}). Second, most compiler optimisations are
`GC unaware', so optimisations such as scalar
replacement can change the point in a program when GCed values appear to be
finalizable.
In our context, it is trivial to define premature finalization as a (dynamic) finalizer
for a \lstinline{Gc<T>} value running before the (static) \lstinline{Gc<T>} owner
has gone out of scope. Similar to the high-level proposal mooted
in~\cite[Solution~1]{boehm07optimization}, we must ensure that the dynamic
lifetime of a reference derived from a \lstinline{Gc<T>} matches or
exceeds the lifetime of the \lstinline{Gc<T>} itself.
Our solution relies on adjusting \lstinline{Gc<T>}'s drop method to \emph{keep
alive} a GCed value for at least the static lifetime of the \lstinline{Gc<T>} itself. In
other words, we ensure that the conservative GC will always see a pointer
to a GCed value while the corresponding \lstinline{Gc<T>} is in-scope.
However, there is a major problem to overcome: copyable types such as
\lstinline{Gc<T>} are forbidden from having destructors. The fundamental
challenge we have to solve is that each copied value will have a destructor
called on it, which has the potential for any shared underlying value to be destructed
more than once. \ourgc explicitly allows \lstinline{Gc<T>} -- but no other
copyable type -- to have a destructor, but to ensure it doesn't cause surprises
in the face of arbitrary numbers of copies, the destructor must be idempotent.
Our task is made easier because \lstinline{Gc<T>} naturally has no drop glue from
Rust's perspective: \lstinline{Gc<T>} consists of a field with a pointer type,
and such types are opaque from a destruction perspective.
We therefore only need to make sure that \lstinline{Gc<T>}'s drop method
is idempotent. Fortunately, this is sufficient for our purposes: we want the drop
method to inhibit finalization but that does not require run-time side effects.
To achieve this, we use a memory fence in the form of a keep-alive
function. Simplifying somewhat, memory fences can be thought of as having both
compile-time and run-time effects:
they prevent compilers from reordering computations around a given address;
and the manner in which this is done also also ensures that CPUs
cannot reorder computations relevant to that address.
Keep-alive functions are highly platform (compiler, operating
system, and CPU) specific: we use \boehm's \lstinline{GC_reachable_here}
function\footnote{This is actually a macro, and our fork of \boehm provides a
function \lstinline{GC_keep_alive} to expose the macro to Rust.}
which abstracts over these details. \lstinline{Gc<T>}'s
implementation of the \lstinline{Drop} trait therefore looks as follows:
\begin{lstlisting}[language=Rust, basicstyle=\footnotesize\ttfamily]
impl<T: ?Sized> Drop for Gc<T> {
fn drop(&mut self) { unsafe { bdwgc::GC_keep_alive(self as *mut u8) }; }
}
\end{lstlisting}
\subsection{Optimising premature finalizer prevention}
\label{sec:optimising_prem}
The drop method we add to \lstinline{Gc<T>} fully prevents premature
finalization. It also naturally solves a performance problem with the suggested solution
for C++ in~\cite[Solution~1]{boehm07optimization}, which requires keeping alive
all pointers, no matter their type, for their full scope. By definition, our
solution only keeps alive \lstinline{Gc<T>} values: the compiler is free to
optimise values of other types as it so wishes. However, our approach still
imposes a performance overhead on code that uses
\lstinline{Gc<T>}: in the worst case in our benchmark suite, the overhead is
\somrsastbarriers{bnaive}{worstperf}{percent}.
Fortunately, we can optimise premature finalizer prevention further by
piggy-backing on finalizer elision: if a finalizer can be
elided, there is no finalizer to be called prematurely, and no need for
such values to be kept alive. Intuitively, we want to avoid generating drop methods
for \lstinline{Gc<T>} types that do not require finalization,
but this is difficult to do directly inside \rustc. Instead,
we suppress calls to the drop methods of such types: the two approaches
are functionally equivalent, though ours
does put an extra burden on dead-code elimination in the compiler tool-chain.
We add a new pass
\lstinline{RemoveElidableDrops} to
\rustc's Mid-Level Intermediate Representation (MIR) processing. MIR is best
thought of as the main IR inside \rustc: it contains the complete set of
functions in the program, where each function consists of a sequence of basic
blocks. Simplifying somewhat, function and drop method calls are represented as
different kinds of \emph{terminators} on basic blocks. Terminators
reference both a callee and a successor basic block.
The \lstinline{remove_elidable_drops} pass iterates over a program's MIR,
identifies drop method terminators which reference elidable finalizers (using
the \lstinline{needs_finalizer} intrinsic from
\cref{needs_finalizer_intrinsic}), and turns them into `goto' terminators
to the successor basic basic block. \cref{alg:barrier_removal} in the appendix
gives a more formal version of this algorithm.
\section{Finalizer Safety Analysis}
\label{sec:fsa}
In this section we address two high-level problems: running finalizers on the same thread
as a paused mutator can cause race conditions and deadlocks; and some safe destructors are not
safe finalizers. Addressing the former problem is conceptually simple --
finalizers must be run on a separate thread -- but we must ensure that doing so
is sound. We therefore consider this a specific instance of the latter problem,
treating both equally in this section.
In this section we introduce Finalizer Safety Analysis (FSA), which prevents
a type \lstinline{T}'s destructor being used as a finalizer if it is not provably
safe to do so.
\jake{Now that FSA steps through function calls, maybe when explaining it we
should say somewhere that really it's a form of abstract interpretation. We're
creating a sound approximation over the CFG (MIR) of a Rust program to reason
about finalization properties without actually executing it. It's probably then
obvious to a PL audience that this must be imprecise in order to be decidable.}
\subsection{Rust references}
\label{sec:fsa_rust_references}
\lstinline{Gc<T>} can store, directly or indirectly, normal Rust
references (i.e.~\lstinline{&} and \lstinline{&mut}), at which point it is
subject to Rust's normal borrow checker rules and cannot outlive the reference.
However, finalizers implicitly extend the lifetime of a GCed value,
including any stored reference: accessing a reference in a finalizer could
undermine Rust's borrow checking rules.
The simplest way of avoiding this problem would be to forbid \lstinline{Gc<T>}
from storing, directly or indirectly, references. This might seem to be
no great loss: storing references in a \lstinline{Gc<T>} largely nullifies
the `GCness' of \lstinline{Gc<T>}. However, we found the result hard to use,
making even simple tasks such as gradually refactoring existing code
to use \lstinline{Gc<T>} painful.
A moderate relaxation -- indeed, how we first designed this -- is to recognise
that only types that need a finalizer can possibly have problems with
references, and to forbid such types from storing references in
\lstinline{Gc<T>}. For example, if there is no drop method for
\lstinline|struct S {x: &u8}| then its destructor is safe to use as a
finalizer. We can relax matters further by recognising that \lstinline{S} could
safely have a drop method provided it uses no references at all. However,
these relaxations still proved frustrating to use with existing code.
The eventual rule we alighted upon for FSA is to allow a destructor for a type
\lstinline{T} to be used as a finalizer provided the destructor's drop methods
do not obtain references derived from \lstinline{T}'s fields (including fields
reachable from its attributes). Using Rust's terminology, we
forbid \emph{projections} (which include a struct's fields, indexes into a
vector, and so on) to reference types. Any other references that are created
in a drop method are by definition safe to use, as they either exist only
for the duration of the drop method (references to variables on the stack)
or will exist for the remainder of the program (references to global variables).
Forbidding all projections in this way over-approximates the safe set of
destructors. For example, if a drop method creates a new
value and tries to obtain a reference to a field in that new value, then FSA will
raise an error, even though that reference can only exist for the duration of
the drop method. It may be possible to relax our rule further, but we found that it becomes
rapidly harder to define and article what is safe, and to implement the resulting design in \rustc.
\jake{FSA has changed quite a bit since the last write-up in this section, below I will detail the key differences from before}
\subsection{New FSA rules}
\subsubsection{Bypassing FSA}
The \lstinline{FinalizerSafe} auto-trait no longer exists. This means there is
now no user-implementable auto-trait that can be implemented on types to bypass
FSA. The only way to now bypass FSA is to wrap a type inside
\lstinline{FinalizerUnchecked}.
The use of \lstinline{FinalizerUnchecked} works at both the top level, and
indirectly when used in a field. For example, the following would be allowed in
\ourgc:
\begin{lstlisting}[language=Rust, numbers=none]
struct S { x: *const u8 } // not Send + Sync because of the raw ptr
impl Drop for S { fn drop(&mut self) { let _ = *self.x }; }
Gc::new(unsafe { FinalizerUnchecked::new(S)});
\end{lstlisting}
\lstinline{S} is not safe to use in a finalizer because its drop method
dereferences the non-Send non-Sync raw pointer field \lstinline{x}. However, by
passing a wrapped \lstinline{FinalizerUnchecked} value of \lstinline{S} to
\lstinline{Gc::new} we have explicitly disabled FSA for this value. Note that
this is an unsafe operation because it could allow unsoundness.
This also works indirectly. For example:
\begin{lstlisting}[language=Rust, numbers=none]
struct T { y: FinalizerUnchecked<S> }
Gc::new(T { y: unsafe { FinalizerUnchecked::new(S)} });
\end{lstlisting}
Here, \lstinline{FinalizerUnchecked} is used to prevent FSA from checking the
drop method of \lstinline{S} only. In the above example, \lstinline{T}'s
finalizer will be allowed. However, if \lstinline{T} had an unsafe drop method.
FSA would still throw an error. This is useful for making sure that where FSA
is too conservative, it need only be disabled in the places it needs to be, and
other errors can't accidentally sneak through.
\lstinline{FinalizeUnchecked} has another benefit over the old approach of
implementing \lstinline{FinalizerSafe} for a specific type: it only affects a
particular instance of a type. In other words, implementing
\lstinline{FinalizerSafe} (or indeed \lstinline{Send + Sync}) for types so that
they can be finalized in GC means that users must be sure all uses of that type
are safe. I personally found this very difficult to be confident with --
especially over larger codebases (i.e. alacritty).
\subsubsection{FSA Entry-points}
Previously, we hard-coded \lstinline{Gc::new} to be the ``entry-point'' which
FSA would use to check a type. This was fine at the time, because
\lstinline{Gc::new<T>} was the only safe way to construct a Gcable object of
type \lstinline{T}. However, we have since extended the GC API to support
\lstinline{from/into} conversions, which are used extensively in some of the
\lstinline{Rc}'d programs that we have retrofitted.
We now define an internal rustc-only attribute
(\lstinline{#[rustc_fsa_entry_point]}) to replace this. The first stage of FSA
was to traverse the whole-program MIR looking for \lstinline{Gc::new}
constructors. Instead, we now traverse the whole-program MIR looking for
functions which have the \lstinline{#[rustc_fsa_entry_point]}. Constructor-like
functions inside the GC standard library (such as \lstinline{Gc::new} /
\lstinline{Gc::from}) are annotated with this attribute.
\subsubsection{The ReferenceFree auto-trait is back}
The viral properties of auto-traits are really useful for the kind of analyses
we want to do in FSA. E.g. with Send and Sync, we can know for certain if
\lstinline{T} is thread-safe provided it implements these autotraits -- we
don't have to step into T's fields and also check those because if they don't
the negative impl would bubble up and affect T too.
For FSA, this means that if we construct a \lstinline{Gc<T>} where
\lstinline{T} implements \lstinline{Send} + \lstinline{Sync} we don't need to
check drop methods for thread-safety for any projection into \lstinline{T}'s
fields.
However, thread-safety is not the entire story. We would still need to check
that drop methods do not project into fields of \lstinline{T} which are
reference types. For reasons we have already explained, both regular Rust
references and \lstinline{Gc} references cannot be used inside a finalizer. We
provide negative impls of the \lstinline{ReferenceFree} auto trait for
\lstinline{Gc<T>} and \lstinline{&/&mut T} types. This now lets us short-circuit
checking all projections for finalizer safety in drop methods, provided that
the top-level type implements all three auto-traits (\lstinline{Send + Sync + ReferenceFree}).
\lstinline{ReferenceFree} is a standard library internal type, and, like
\lstinline{Freeze}, must not be implemented by the user for their own types
(though due to the way rustc re-exports standard library items, we can't
actually prevent this, just as they can't prevent the same with
\lstinline{Freeze}).
\subsubsection{Traversing through function calls}
FSA can now automatically look into functions which are called inside drop
methods to also check that their implementation is finaliser safe. This greatly
improves the ergonomics of FSA. First it means the number of FSA errors that we
encounter is dramatically reduced. For alacritty and SWS, we go from 110 errors
(though likely more, as this is the ceiling at which point rustc no longer
emits any more errors) to 2 and 1 respectively. Almost all of those errors were
because a finalizer called something benign (e.g. unwrap()) which FSA could not
look into and therefore had to emit an error. This has a tangible impact on the
safety of a program: in the remaining places where types do need user
annotation, the user can cast a much narrower net, and the risk of them
disabling FSA entirely because of error fatigue is reduced.
Second, when an error is raised, it is more useful to the user. The error will
now pinpoint the exact line of code from within whatever function where the
problem arises, rather than the line of a function call which may or may not be
safe. Of course, some function calls (e.g. function pointers or external calls
where no MIR is available) still cannot be checked, so FSA emits an error at
the call for these. However, these are now issues in their own right which do
need user intervention, rather than spurious errors about potentially unsafe
function calls which could trivially be allowed automatically.
Function traversal in FSA is implemented iteratively using a queue. When
visiting the MIR for a drop method, we check whether the terminator of each
basic block is a call to a new function. If it is, we add that function to a
queue of functions to be checked.
Because functions in Rust can have generic type signatures, we must try to
enqueue the concrete monomorphized function definition of a call. Unfortunately
this is not always possible — for some functions, only the polymorphic function
definition may be available at the point where we perform FSA. In such
instances, we enqueue the polymorphic definition, however, this is more likely
to emit a false-positive FSA error, because \lstinline{Send + Sync + ReferenceFree} auto
trait bounds may not be present on the generic type, but are present on the