-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rustc: Load the rustc_trans
crate at runtime
#47671
Conversation
593f3e1
to
d41fbb9
Compare
(rust_highfive has picked a reviewer for you, use r? to override) |
The next plans for #46819 are to add rustbuild configuration and support (as well as another submodule) for Emscripten's backend, integrate that for Emscripten, and then we should be good to upgrade to LLVM 5! |
src/librustc_driver/lib.rs
Outdated
let sysroot = sysroot_candidates.iter() | ||
.map(|sysroot| { | ||
let libdir = filesearch::relative_target_lib_path(&sysroot, &target); | ||
sysroot.join(&libdir).join("trans-backends") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a nitpick, but could you use "codegen backend" instead of "trans backend" for everything external to the compiler itself? It can be changed later as part of #45274, so it doesn't really matter as long as we don't stabilize anything based on it.
box LlvmTransCrate(()) | ||
} | ||
} | ||
|
||
impl TransCrate for LlvmTransCrate { | ||
fn init(&self, sess: &Session) { | ||
llvm_util::init(sess); // Make sure llvm is inited |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine for now but in the future we might want to have a Session
created by the time we create this trait object (we'll need at least the target spec loaded).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe let __rustc_codegen_backend
take a Option<&Session>
as argument and pass None when we are only using it for printing version info and stuff like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I think we can always create a Session
, there shouldn't really be a significant cost to it since we're already doing the argument parsing and we have to load the target specs anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah so the two locations this was necessary (not taking a Session
on construction) was when we print the version and when we print the LLVM passes. There's also one where we need to construct a list of all error codes but that's commented out for now...
I it'd be possible to have a Session
here, but it'd be a "dummy session" in some cases because I don't think we want to generate an error for failure to parse command line arguments before we print the version. Either way I'd be fine though!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't fully parse arguments before processing the version? Is it a separate codepath?
EDIT: note that you can't load the backend without also parsing --target
/ -Z codegen-backend
at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well this PR doesn't use target specs yet, so it's not necessary for now. But I am curious, because I was hoping for a straight-forward implementation, is the "printing the version skips checking the rest of the CLI arguments" guaranteed/relied upon?
EDIT: I guess one could argue that -Z codegen-backend
should be taken into account.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it basically would just require more refactoring. We'd have to, for example, initialize the Session
in stages, first with a bunch of known options, then do some processing, then aftewards actually validate everything. AFAIK it's not really set up to do that yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we validate everything early and buffer the errors or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess? Again though I didn't get the impression that the argument parsing was at all ready for this sort of refactoring, in the sense that it seemed outside the scope of this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I don't want to block this PR, this is more for future reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some general nits. Overall approach looks okay to me, though I'm not super happy about having to add another step to compile.rs... seems okay though.
src/librustc_driver/lib.rs
Outdated
let lib = match DynamicLibrary::open_global_now(path) { | ||
Ok(lib) => lib, | ||
Err(err) => { | ||
let err = format!("Couldnt load codegen backend {:?}: {:?}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't
here and elsewhere. Minor nit.
src/librustc_driver/lib.rs
Outdated
static mut LOAD: fn() -> Box<TransCrate> = || unreachable!(); | ||
|
||
INIT.call_once(|| { | ||
let trans_name = sess.opts.debugging_opts.codegen_backend.clone(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to clone this when we call as_ref
on it in the next line? This seems.. odd.
src/librustc_driver/lib.rs
Outdated
}); | ||
let backend = unsafe { LOAD() }; | ||
backend.init(sess); | ||
return backend |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: backend
src/librustc_driver/lib.rs
Outdated
// but there's a few manual calls to this function in this file we protect | ||
// against. | ||
static LOADED: AtomicBool = ATOMIC_BOOL_INIT; | ||
assert!(!LOADED.swap(true, Ordering::SeqCst), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is swap
the right thing here? I guess maybe... I'd sort of expect fetch_or
perhaps?
src/librustc_driver/lib.rs
Outdated
assert!(!LOADED.swap(true, Ordering::SeqCst), | ||
"cannot load the default trans backend twice"); | ||
|
||
if cfg!(test) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like a comment explaining this, as it seems odd -- why would this be the case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Certainly! I've added a comment
src/librustc_driver/lib.rs
Outdated
Some(s) => s, | ||
None => continue, | ||
}; | ||
if !filename.starts_with(DLL_PREFIX) || !filename.ends_with(DLL_SUFFIX) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT, this condition is equivalent to !(starts_with && ends_with)
which seems... odd? Why are we checking that it doesn't start with or end with the DLL_PREFIX? Aren't we looking for a DLL?
Specifically, given that we cut off the suffix and prefix below, shouldn't the condition be filename.starts_with(DLL_PREFIX) && filename.ends_with(DLL_SUFFIX)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm yeah that's the condition I'm going for, but I think that's what's here as well? I'm looking for
if !(correct_prefix && correct_suffix) {
continue
}
which should be the same as
if !correct_prefix || !correct_suffix {
continue
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, yeah. Maybe clearer to write it that way (edit: so !(a && b)
) though? Probably a minor point and not entirely relevant anyway. I was confused; but I don't think there's anyway to make this nonconfusing really.
src/librustc_driver/lib.rs
Outdated
let addr = current_dll_path as usize as *mut _; | ||
let mut info = mem::zeroed(); | ||
if libc::dladdr(addr, &mut info) == 0 { | ||
info!("dladdr failed: {}", io::Error::last_os_error()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure this is the right thing to call? I would sort of expect us to call dlerror
here... which might be the same thing, I'm not actually sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out, no!
src/librustc_driver/lib.rs
Outdated
info!("dladdr failed: {}", io::Error::last_os_error()); | ||
return None | ||
} | ||
if info.dli_fname.is_null() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If no symbol matching addr could be found, then dli_sname and dli_saddr are set to NULL
is what I see in my documentation, which seems to make me believe this is the wrong check?
I also see the following, which is also concerning. This might all be irrelevant and the existing code is "good enough" though.
Sometimes, the function pointers you pass to dladdr() may surprise you. On some architectures (notably i386 and x86_64), dli_fname and dli_fbase
may end up pointing back at the object from which you called dladdr(), even if the function used as an argument should come from a dynamically
linked library.
The problem is that the function pointer will still be resolved at compile time, but merely point to the plt (Procedure Linkage Table) section of
the original object (which dispatches the call after asking the dynamic linker to resolve the symbol). To work around this, you can try to com-
pile the code to be position-independent: then, the compiler cannot prepare the pointer at compile time any more and gcc(1) will generate code
that just loads the final symbol address from the got (Global Offset Table) at run time before passing it to dladdr().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's ok for us? I haven't used this sort of dynamic loading much before so I suspect that we'll probably find a bug or two along the way, but I think this specific situation may fall under the category of "turns out ok given our usage" maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I guess what I'm asking is why do we check dli_fname
and not dli_sname
? Judging by the documentation, what we check today isn't going to determine whether the symbol was found or not, which I believe we do want to determine.
Agreed though that I'm not too worried about hashing this out now, if it works in your usage then seems good enough.
#[cfg(feature="llvm")] | ||
all_errors.extend_from_slice(&rustc_trans::DIAGNOSTICS); | ||
// FIXME: need to figure out a way to get these back in here | ||
// all_errors.extend_from_slice(get_trans(sess).diagnostics()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be a function that should lie on the trait -- it seems like it should anyway, or maybe the trait could return a vec of errors that need to be added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh it actually is! Unfortunately though there's not a great way to load it here, so I figured that for the one extended error in librustc_trans we could fudge it for now and figure it out later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we just load all of the diagnostics, independent of backend, and then just some of them would be unused?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately that's also gonna be difficult I think. I believe that with this PR as-is we could do that, but in the future I think we'll only want to dlopen one version of LLVM into the current process, and at this point in time we don't actually know the trans backend we'd otherwise be using. That means that with the Emscripten target we'd load the normal LLVM backend to load diagnostics and then load the Emscripten backend to actually do trans. I think that it would work ok? The "global" nature of the dlopen though required here though may throw a wrench into that..
In essence though I don't think there's a quick fix or an easy workaround for this, we'll want to deal with it later I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh also note that I added a dynamic assertion for this, the business with the LOADED
constant, which asserts that we only load a backend once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, yeah, I understand that we don't want to dlopen multiple rustc_trans backends, but I was rather suggesting that we put all the diagnostic definition in driver
and then the backend would use whichever ones it needed, though not all. This does increase the API surface of driver.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The diagnostics can probably be moved to the librustc_mir/monomorphize
instance collector, so they're shared by all backends.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes indeed, such a strategy would for sure work!
let td = TempDir::new("create-dir-all-bare").unwrap(); | ||
env::set_current_dir(td.path()).unwrap(); | ||
let path = PathBuf::from(env::var_os("RUST_TEST_TMPDIR").unwrap()); | ||
env::set_current_dir(&path).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these changes necessary? Can we no longer link to tempdir from tests? Or is it that tempdir wasn't being rebuilt because rustbuild is buggy (I would not be surprised).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dependencies of librustc_trans are no longer available in the sysroot (as we just manually copy the one DLL), and one of its dependencies was tempdir
. (no other crates in rustc use tempdir
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I recall, the previous solution to this "problem" would be to move tempdir to a dependency of librustc
. I'm fine with this too though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's true yeah we could do that but I'm also always a fan of tests with fewer dependencies :)
d41fbb9
to
1ff8486
Compare
Updated! |
1ff8486
to
13760b6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes seem good other than the comments I responded to.
44486c9
to
b14ccef
Compare
@bors: r=Mark-Simulacrum |
📌 Commit b14ccef has been approved by |
src/librustc_driver/lib.rs
Outdated
} | ||
Err(e) => { | ||
let err = format!("couldn't load codegen backend as it \ | ||
doesn't export the `__rustc_backend_new` \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
__rustc_codegen_backend
symbol?
b14ccef
to
6509458
Compare
@bors: r=Mark-Simulacrum |
📌 Commit 6509458 has been approved by |
|
8347e55
to
c8ab8fc
Compare
@bors: r=Mark-Simulacrum |
📌 Commit c8ab8fc has been approved by |
⌛ Testing commit c8ab8fcfa63d6e82b4c3e5587d4b17e68ee78b6a with merge 18342df22280960f92de0275308455030489f6c5... |
💔 Test failed - status-travis |
Building on the work of # 45684 this commit updates the compiler to unconditionally load the `rustc_trans` crate at runtime instead of linking to it at compile time. The end goal of this work is to implement # 46819 where rustc will have multiple backends available to it to load. This commit starts off by removing the `extern crate rustc_trans` from the driver. This involved moving some miscellaneous functionality into the `TransCrate` trait and also required an implementation of how to locate and load the trans backend. This ended up being a little tricky because the sysroot isn't always the right location (for example `--sysroot` arguments) so some extra code was added as well to probe a directory relative to the current dll (the rustc_driver dll). Rustbuild has been updated accordingly as well to have a separate compilation invocation for the `rustc_trans` crate and assembly it accordingly into the sysroot. Finally, the distribution logic for the `rustc` package was also updated to slurp up the trans backends folder. A number of assorted fallout changes were included here as well to ensure tests pass and such, and they should all be commented inline.
c8ab8fc
to
884715c
Compare
@bors: r=Mark-Simulacrum |
📌 Commit 884715c has been approved by |
rustc: Load the `rustc_trans` crate at runtime Building on the work of #45684 this commit updates the compiler to unconditionally load the `rustc_trans` crate at runtime instead of linking to it at compile time. The end goal of this work is to implement #46819 where rustc will have multiple backends available to it to load. This commit starts off by removing the `extern crate rustc_trans` from the driver. This involved moving some miscellaneous functionality into the `TransCrate` trait and also required an implementation of how to locate and load the trans backend. This ended up being a little tricky because the sysroot isn't always the right location (for example `--sysroot` arguments) so some extra code was added as well to probe a directory relative to the current dll (the rustc_driver dll). Rustbuild has been updated accordingly as well to have a separate compilation invocation for the `rustc_trans` crate and assembly it accordingly into the sysroot. Finally, the distribution logic for the `rustc` package was also updated to slurp up the trans backends folder. A number of assorted fallout changes were included here as well to ensure tests pass and such, and they should all be commented inline.
☀️ Test successful - status-appveyor, status-travis |
This commit introduces a separately compiled backend for Emscripten, avoiding compiling the `JSBackend` target in the main LLVM codegen backend. This builds on the foundation provided by rust-lang#47671 to create a new codegen backend dedicated solely to Emscripten, removing the `JSBackend` of the main codegen backend in the process. A new field was added to each target for this commit which specifies the backend to use for translation, the default being `llvm` which is the main backend that we use. The Emscripten targets specify an `emscripten` backend instead of the main `llvm` one. There's a whole bunch of consequences of this change, but I'll try to enumerate them here: * A *second* LLVM submodule was added in this commit. The main LLVM submodule will soon start to drift from the Emscripten submodule, but currently they're both at the same revision. * Logic was added to rustbuild to *not* build the Emscripten backend by default. This is gated behind a `--enable-emscripten` flag to the configure script. By default users should neither check out the emscripten submodule nor compile it. * The `init_repo.sh` script was updated to fetch the Emscripten submodule from GitHub the same way we do the main LLVM submodule (a tarball fetch). * The Emscripten backend, turned off by default, is still turned on for a number of targets on CI. We'll only be shipping an Emscripten backend with Tier 1 platforms, though. All cross-compiled platforms will not be receiving an Emscripten backend yet. This commit means that when you download the `rustc` package in Rustup for Tier 1 platforms you'll be receiving two trans backends, one for Emscripten and one that's the general LLVM backend. If you never compile for Emscripten you'll never use the Emscripten backend, so we may update this one day to only download the Emscripten backend when you add the Emscripten target. For now though it's just an extra 10MB gzip'd. Closes rust-lang#46819
This commit introduces a separately compiled backend for Emscripten, avoiding compiling the `JSBackend` target in the main LLVM codegen backend. This builds on the foundation provided by rust-lang#47671 to create a new codegen backend dedicated solely to Emscripten, removing the `JSBackend` of the main codegen backend in the process. A new field was added to each target for this commit which specifies the backend to use for translation, the default being `llvm` which is the main backend that we use. The Emscripten targets specify an `emscripten` backend instead of the main `llvm` one. There's a whole bunch of consequences of this change, but I'll try to enumerate them here: * A *second* LLVM submodule was added in this commit. The main LLVM submodule will soon start to drift from the Emscripten submodule, but currently they're both at the same revision. * Logic was added to rustbuild to *not* build the Emscripten backend by default. This is gated behind a `--enable-emscripten` flag to the configure script. By default users should neither check out the emscripten submodule nor compile it. * The `init_repo.sh` script was updated to fetch the Emscripten submodule from GitHub the same way we do the main LLVM submodule (a tarball fetch). * The Emscripten backend, turned off by default, is still turned on for a number of targets on CI. We'll only be shipping an Emscripten backend with Tier 1 platforms, though. All cross-compiled platforms will not be receiving an Emscripten backend yet. This commit means that when you download the `rustc` package in Rustup for Tier 1 platforms you'll be receiving two trans backends, one for Emscripten and one that's the general LLVM backend. If you never compile for Emscripten you'll never use the Emscripten backend, so we may update this one day to only download the Emscripten backend when you add the Emscripten target. For now though it's just an extra 10MB gzip'd. Closes rust-lang#46819
rustc: Split Emscripten to a separate codegen backend This commit introduces a separately compiled backend for Emscripten, avoiding compiling the `JSBackend` target in the main LLVM codegen backend. This builds on the foundation provided by #47671 to create a new codegen backend dedicated solely to Emscripten, removing the `JSBackend` of the main codegen backend in the process. A new field was added to each target for this commit which specifies the backend to use for translation, the default being `llvm` which is the main backend that we use. The Emscripten targets specify an `emscripten` backend instead of the main `llvm` one. There's a whole bunch of consequences of this change, but I'll try to enumerate them here: * A *second* LLVM submodule was added in this commit. The main LLVM submodule will soon start to drift from the Emscripten submodule, but currently they're both at the same revision. * Logic was added to rustbuild to *not* build the Emscripten backend by default. This is gated behind a `--enable-emscripten` flag to the configure script. By default users should neither check out the emscripten submodule nor compile it. * The `init_repo.sh` script was updated to fetch the Emscripten submodule from GitHub the same way we do the main LLVM submodule (a tarball fetch). * The Emscripten backend, turned off by default, is still turned on for a number of targets on CI. We'll only be shipping an Emscripten backend with Tier 1 platforms, though. All cross-compiled platforms will not be receiving an Emscripten backend yet. This commit means that when you download the `rustc` package in Rustup for Tier 1 platforms you'll be receiving two trans backends, one for Emscripten and one that's the general LLVM backend. If you never compile for Emscripten you'll never use the Emscripten backend, so we may update this one day to only download the Emscripten backend when you add the Emscripten target. For now though it's just an extra 10MB gzip'd. Closes #46819
Building on the work of #45684 this commit updates the compiler to
unconditionally load the
rustc_trans
crate at runtime instead of linking to itat compile time. The end goal of this work is to implement #46819 where rustc
will have multiple backends available to it to load.
This commit starts off by removing the
extern crate rustc_trans
from thedriver. This involved moving some miscellaneous functionality into the
TransCrate
trait and also required an implementation of how to locate and loadthe trans backend. This ended up being a little tricky because the sysroot isn't
always the right location (for example
--sysroot
arguments) so some extra codewas added as well to probe a directory relative to the current dll (the
rustc_driver dll).
Rustbuild has been updated accordingly as well to have a separate compilation
invocation for the
rustc_trans
crate and assembly it accordingly into thesysroot. Finally, the distribution logic for the
rustc
package was alsoupdated to slurp up the trans backends folder.
A number of assorted fallout changes were included here as well to ensure tests
pass and such, and they should all be commented inline.