-
Notifications
You must be signed in to change notification settings - Fork 714
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
arch/riscv: make RISC-V CSR accesses unsafe #3549
base: master
Are you sure you want to change the base?
Conversation
Accessing RISC-V CSRs is an inherently unsafe and risky operation: given they exercise large control over the state of the RISC-V hart, they can be used to manipulate the current privilege mode, manage PMP enforcement, etc. Writing to CSRs can be used to violate Rust's memory safety rules. This is especially true in the context of Tock, where we can construct a simple example by looking at our system call handler: it is sufficient for some untrusted code to store a non-zero value (pointer to some normally inaccessible memory) into `mscratch`, and a value to be written to that memory in a fixed register. The code must then cause a trap, e.g., by issuing an illegal instruction. The context switch code will incorrectly interpret this trap as a fault of a process, and proceed to write the previous register file to a "stack" indicated by `mscratch`. The kernel will continue execution in the main loop, leaking the previous stack. The untrusted code has successfully overwritten an arbitrary memory location. The only reasonable solution is to mark all CSR accesses as `unsafe`. While CSR reads are generally less problematic, even a CSR read of a non-existant CSR can cause a trap, which would be yet another way for some untrusted code to violate the kernel's liveness (apart from looping & panicing). Because of this, and given we provide access to the `ReadWriteRiscvCsr` types which inherently allow writing to CSRs, this marks CSR register type construction and CSR register type accesses through the `riscv::csr::CSR` struct as unsafe operations.
@@ -549,4 +550,569 @@ impl CSR { | |||
_ => unreachable!(), | |||
} | |||
} | |||
|
|||
// CSR accessor methods |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These accessors are generated through regex-parsing the above struct and templating a generic accessor method. It's not nice to have to vendor this, but I don't really know of a better way to achieve similar semantics without inserting ~1kLOC of auto-generated methods.
I'm not sure I understand why CSRs are different from any other registers. I agree that reading a nonexistent one is bad, and if that is possible in chip code without Code like capsules (the most untrusted inside the kernel) has never had access to CSRs or any other registers. So that's not a problem. Chip code does, of course, and if chip code is trying to use mscratch to trick the core kernel into writing stack data to some location, well that seems bad! But chip code has always had the ability to just use DMA to write whatever wherever (assuming there is a DMA peripheral) or whatever other tricks can be done with a specific MCU's peripherals. So, it's not clear to me why our policy of "using the registers (er, CSRs) is safe, it's creating them that is not" doesn't apply here. Is there a case here to use modules instead somehow? Like could we only make certain CSRs "public" (to chips) and the rest not? It seems like chips need to modify interrupts, but not much else. |
@bradjc It does, and that's exactly what I'm trying to enforce / implement with this PR. Previously anyone could've just run Technically, this implements a slightly different version of that paradigm, as We can (and probably) should change this to just define functions that construct CSR register types, like so: pub unsafe fn mscratch() -> ReadWriteRiscvCsr<usize, mscratch::mscratch::Register, MSCRATCH> {
ReadWriteRiscvCsr::<usize, mscratch::mscratch::Register, MSCRATCH>::new()
} Code which wants to access this register would then call e.g., We can't just once use unsafe to create all of these registers and them make them safely accessible through a public API. This would allow any untrusted code having access to that API to violate safety & soundness. For instance, there would be nothing stopping a safe function in a board crate to completely break safety invariants by a call to |
Unfortunately, given we work across crate boundaries here, we can either export things publicly in the API (making CSRs accessible to every crate depending on this one), or not exporting things at all (making CSRs inaccessible to e.g., Relying on the fact that a certain other crate (e.g., |
I thought I agreed with this, but it isn't so much that creating the type is unsafe, it's choosing the CSR number that is unsafe. By only exposing
I disagree, I think it's actually one of the best mechanisms we have. If we had to assume that capsules for instance had complete access to any function in any chip code (even perhaps chip crates for a different MCU) then I think we would have to architect things very differently. If a board main.rs wants to pass in accessing a CSR to a capsule via some other function, well, that's its prerogative. And it's not that there wouldn't be any unsafe in the call chain (constructing the CSR would still be unsafe), it's just that the unsafe wouldn't be in the board's main.rs.
It might be messy, but is it really not possible to expose certain CSRs publicly, and others not? Also, if the concern is that the low level arch crate relies on a very specific use of mscratch, and trusts that mscratch is not modified elsewhere, then why don't we just special case mscratch somehow (either unsafe or capability)? |
Unfortunately, it's the other way around in terms of where unsoundness actually occurs: it's fine to create a Our "using registers is safe, creating them is not" paradigm is nonetheless legal, but only when we add another restriction: that the caller of the unsafe code (i.e. the register's constructor) must guarantee that this register is not used in an unsound fashion. To illustrate this, take the following example: fn main() {
use std::ops::Deref;
let a: Box<u8> = Box::new(42);
let b: &u8 = unsafe { &*(a.deref() as *const u8) }; // unsafe, but not unsound in isolation
std::mem::drop(a); // "safe", but unsound
println!("Box value: {}", *b); // "safe", but unsound
} In this example, creating a reference from a pointer is clearly an Applying this logic to registers / CSRs in Tock, when we create a register (e.g., the
For regular Tock drivers, we encapsule this unsafety by creating the For CSRs, the story is different: by constructing CSR registers through an
In hindsight I perhaps should have phrased this differently. I fully agree with and embrace the idea that our dependency structure provides many guarantees, which are ultimately also relevant for safety. However, we need to distinguish functional safety from the concepts of safety and soundness as defined by Rust. For example, it's clearly undesirable for a capsule to call The Rust reference on unsafe is fairly explicit about this. An |
I don't agree with this as I don't see how this is different from what we do with chip registers. I think either we are doing this in both cases, or neither, but I don't see how CSRs are accessible from any less trusted code than StaticRef registers are. I will say that because the kernel depends on mscratch, it does seem like it falls into a special case where accessing it is never safe, and all code which touches it should have to be unsafe due to the memory safety concerns you raise. But, that seems orthogonal to it being a CSR, and this would be true if riscv gave us some other way to access it. I guess what I'm saying is I think we should fix mscratch, but I'm not sure about changing all CSRs.
This seems true, but perhaps slightly pedantic. Why let someone create something they can't use? It seems more rational to only allow things to be created that can be used. Like its ok for a child to hold a sucker, just not eat it, and then getting mad at them when they inevitably eat it. |
On the other hand, liveness is not a memory safety issue. We should to do one of the following:
Given how much work the first option is, doing the second seems reasonable. |
The I still contend the correct fix is scoping: the Allowing While thinking this through, I don't understand why we don't just delete the Maybe it's worth figuring out mscratch first before deciding on the remaining CSRs? I still don't understand the argument that they are unsafe. We have a MPU object in cortex-m which only provides safe accesses, for example. |
Pull Request Overview
Accessing RISC-V CSRs is an inherently unsafe and risky operation: given they exercise large control over the state of the RISC-V hart, they can be used to manipulate the current privilege mode, manage PMP enforcement, etc.
Writing to CSRs can violate Rust's memory safety rules in a variety of ways, e.g., by manipulating PMP rules and cooperating with a userspace process. In addition, Tock's system call handler can be trivially used for that: it is sufficient for some untrusted code to store a non-zero value (pointer to some normally inaccessible memory) into
mscratch
, and a value to be written to the target memory address in some register. The code must then cause a trap, e.g., by issuing an illegal instruction. The context switch code will incorrectly interpret this trap as a fault of a process, and proceed to write the previous register file to a "stack" indicated bymscratch
. The kernel will continue execution in the main loop, leaking the previous stack. The untrusted code has successfully overwritten an arbitrary memory location.The only reasonable solution is to mark all CSR accesses as
unsafe
. While CSR reads are generally less problematic, even a CSR read of a non-existant CSR can cause a trap, which would be yet another way for some untrusted code to violate the kernel's liveness (apart from looping & panicing). Because of this, and given we provide access to theReadWriteRiscvCsr
types which inherently allow writing to CSRs, this marks CSR register type construction and CSR register type accesses through theriscv::csr::CSR
struct as unsafe operations.Testing Strategy
This pull request was tested by compiling.
TODO or Help Wanted
As a follow up, we should move the CSR definitions to the
riscv-csr
crate. It does not make any sense to have them split between theriscv
andriscv-csr
crates as they are right now.Documentation Updated
Updated the relevant files inor no updates are required./docs
,Formatting
make prepush
.