Embive is a low-level sandboxing library focused on the embedding of untrusted code for constrained environments.
As it interprets RISC-V bytecode, multiple languages are supported out of the box by Embive (Rust, C, C++, Zig, Nim, etc.).
By default, it doesn’t require external crates, dynamic memory allocation or the standard library (no_std & no_alloc).
Embive is designed so that any error during execution is recoverable, allowing the host to handle it as needed. As such, no panics should occur in release builds, regardless of the bytecode being executed.
Currently, it supports the RV32I[MA]Zicsr_Zifencei instruction set (check Features).
The bytecode can be generated by any compiler that supports the RISC-V 32-bit instruction set, as long as it can output a flat binary file (.bin) statically linked to the correct addresses (Code at 0x00000000, RAM at 0x80000000).
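To make the two base addresses concrete, here is a minimal sketch of how a guest address could be resolved against a code buffer and a RAM buffer. It is an illustration only, not Embive's implementation: `Region` and `resolve` are hypothetical names, and only the two base addresses come from the text above.

```rust
/// Base address where guest code is linked (from the text above).
const CODE_BASE: u32 = 0x0000_0000;
/// Base address where guest RAM is linked (from the text above).
const RAM_BASE: u32 = 0x8000_0000;

/// Which host buffer a guest address falls into, plus the offset inside it.
/// Hypothetical type, used only in this sketch.
#[derive(Debug, PartialEq)]
enum Region {
    Code(usize),
    Ram(usize),
}

/// Hypothetical helper: resolve a guest address against the two regions.
/// Returns `None` when the address lies outside both buffers.
fn resolve(addr: u32, code_len: usize, ram_len: usize) -> Option<Region> {
    if addr >= RAM_BASE {
        let offset = (addr - RAM_BASE) as usize;
        if offset < ram_len {
            Some(Region::Ram(offset))
        } else {
            None
        }
    } else {
        let offset = (addr - CODE_BASE) as usize;
        if offset < code_len {
            Some(Region::Code(offset))
        } else {
            None
        }
    }
}
```

With a 40-byte program and 4 KB of RAM, `resolve(0x8000_0004, 40, 4096)` yields `Some(Region::Ram(4))`, while an address past the end of either buffer yields `None`. A binary linked to other addresses would not line up with this layout, which is why the link addresses matter.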
The following templates are available for programs that run inside Embive:
use core::num::NonZeroI32;
use embive::{
    engine::{Config, Engine, EngineState, SYSCALL_ARGS},
    memory::{Memory, SliceMemory},
    registers::CPURegister, Error,
};

// A simple syscall implementation. Check [`embive::engine::SyscallFn`].
fn syscall<M: Memory>(
    nr: i32,
    args: &[i32; SYSCALL_ARGS],
    memory: &mut M
) -> Result<i32, NonZeroI32> {
    // Match the syscall number
    match nr {
        1 => Ok(args[0] + args[1]), // Add two numbers (arg[0] + arg[1])
        2 => match memory.load(args[0] as u32) { // Load from RAM (arg[0])
            Ok(val) => Ok(i32::from_le_bytes(val)), // RISC-V is little endian
            Err(_) => Err(1.try_into().unwrap()), // Could not read memory
        },
        _ => Err(2.try_into().unwrap()), // Not implemented
    }
}

fn main() -> Result<(), Error> {
    // "10 + 20" using syscalls (load from ram and add two numbers)
    let code = &[
        0x93, 0x08, 0x20, 0x00, // li   a7, 2      (Syscall nr = 2)
        0x13, 0x05, 0x10, 0x00, // li   a0, 1      (a0 = 1)
        0x13, 0x15, 0xf5, 0x01, // slli a0, a0, 31 (a0 << 31) (0x80000000)
        0x93, 0x02, 0xa0, 0x00, // li   t0, 10     (t0 = 10)
        0x23, 0x20, 0x55, 0x00, // sw   t0, 0(a0)  (Store t0 in addr. a0)
        0x73, 0x00, 0x00, 0x00, // ecall           (Syscall, load from arg0)
        0x93, 0x08, 0x10, 0x00, // li   a7, 1      (Syscall nr = 1)
        0x13, 0x05, 0x40, 0x01, // li   a0, 20     (a0 = 20)
        0x73, 0x00, 0x00, 0x00, // ecall           (Syscall, add two args)
        0x73, 0x00, 0x10, 0x00, // ebreak          (Halt)
    ];

    // 4KB of RAM
    let mut ram = [0; 4096];

    // Create memory from code and RAM slices
    let mut memory = SliceMemory::new(code, &mut ram);

    // Create engine config
    let config = Config::default()
        .with_syscall_fn(Some(syscall))
        .with_instruction_limit(10);

    // Create engine
    let mut engine = Engine::new(&mut memory, config)?;

    // Run it until ebreak, triggering an interrupt after every wfi
    loop {
        match engine.run()? {
            EngineState::Running => {},
            EngineState::Waiting => engine.interrupt()?,
            EngineState::Halted => break,
        }
    }

    // Check the result (Ok(30))
    assert_eq!(
        engine.registers.cpu.get(CPURegister::A0 as usize)?,
        0
    );
    assert_eq!(
        engine.registers.cpu.get(CPURegister::A1 as usize)?,
        30
    );

    Ok(())
}
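The hand-written byte table in the example above can be cross-checked with a small decoder. The sketch below is not part of Embive; `IType` and `decode_i_type` are hypothetical names. It decodes four little-endian bytes using the standard RISC-V I-type field layout, which is the format used by `addi` (the instruction that assemblers emit for `li` with a small immediate).

```rust
/// Fields of a RISC-V I-type instruction.
/// Hypothetical type, used only in this sketch.
#[derive(Debug, PartialEq)]
struct IType {
    opcode: u32,
    rd: u32,
    funct3: u32,
    rs1: u32,
    imm: i32,
}

/// Decode four little-endian bytes as an I-type instruction.
fn decode_i_type(bytes: [u8; 4]) -> IType {
    let word = u32::from_le_bytes(bytes);
    IType {
        opcode: word & 0x7f,         // bits 6..0
        rd: (word >> 7) & 0x1f,      // bits 11..7
        funct3: (word >> 12) & 0x7,  // bits 14..12
        rs1: (word >> 15) & 0x1f,    // bits 19..15
        // An arithmetic shift on i32 sign-extends the 12-bit immediate.
        imm: (word as i32) >> 20,    // bits 31..20
    }
}
```

Decoding the first row, `[0x93, 0x08, 0x20, 0x00]`, gives opcode `0x13`, `rd = 17` (register `a7`), `rs1 = 0` and `imm = 2`, which matches the `li a7, 2` comment.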
System calls are a way for the untrusted code to interact with the host environment.
When provided to the engine, the system call function is called whenever the ecall instruction is executed.
You can find more information about system calls in the engine::SyscallFn documentation.
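The following self-contained sketch mimics the shape of the example handler without depending on the crate, so the dispatch logic can be tried in isolation. Everything here is a stand-in: `ToyRam`, `toy_syscall` and `to_registers` are hypothetical names, and the register convention (error code in a0, value in a1, with a1 set to 0 on error) is an assumption inferred from the assertions in the example above, not a statement of the documented ABI.

```rust
use std::num::NonZeroI32;

/// Stand-in for guest RAM (not Embive's `Memory` trait).
struct ToyRam {
    bytes: Vec<u8>,
}

impl ToyRam {
    /// RAM base address, as stated earlier in this README.
    const BASE: u32 = 0x8000_0000;

    /// Read a little-endian i32 at a guest address, or `None` if out of range.
    fn load_i32(&self, addr: u32) -> Option<i32> {
        let start = addr.checked_sub(Self::BASE)? as usize;
        let end = start.checked_add(4)?;
        let chunk = self.bytes.get(start..end)?;
        Some(i32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]))
    }
}

/// Build a non-zero error code; callers only pass non-zero literals.
fn err_code(code: i32) -> NonZeroI32 {
    NonZeroI32::new(code).expect("error codes must be non-zero")
}

/// Same dispatch shape as the example handler: syscall number in, `Result` out.
fn toy_syscall(nr: i32, args: &[i32; 2], ram: &ToyRam) -> Result<i32, NonZeroI32> {
    match nr {
        1 => Ok(args[0].wrapping_add(args[1])), // add two numbers
        2 => ram.load_i32(args[0] as u32).ok_or_else(|| err_code(1)), // load from RAM
        _ => Err(err_code(2)), // not implemented
    }
}

/// ASSUMED convention: a0 = error code (0 on success), a1 = returned value.
fn to_registers(result: Result<i32, NonZeroI32>) -> (i32, i32) {
    match result {
        Ok(value) => (0, value),
        Err(code) => (code.get(), 0),
    }
}
```

Using `NonZeroI32` for the error type means a handler cannot return "error 0" by accident, which leaves 0 free to mean success in this assumed convention.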
Embive allows the host to trigger interrupts in the interpreted code. This complements system calls, allowing asynchronous communication between the host and the interpreted code.
You can find more information about interrupts in the engine::Engine::interrupt documentation.
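The host-side control flow can be explored with a toy model that needs no crate at all. The sketch below is deliberately simplified and is not how Embive works internally: `ToyEngine`, `ToyOp`, `ToyState` and `drive` are hypothetical names, the "program" is a list of toy operations rather than RISC-V, and a delivered interrupt simply lets execution continue past the wait instead of entering a trap handler. It only mirrors the three states matched in the run loop of the example above.

```rust
/// States mirroring the ones matched in the example's run loop.
#[derive(Debug, PartialEq, Clone, Copy)]
enum ToyState {
    Running, // instruction budget used up; call run() again
    Waiting, // guest is waiting for an interrupt
    Halted,  // guest finished
}

/// Toy operations for this sketch only (not RISC-V instructions).
#[derive(Clone, Copy)]
enum ToyOp {
    Work,
    Wfi,
    Halt,
}

struct ToyEngine {
    program: Vec<ToyOp>,
    pc: usize,
    limit: usize, // max operations per run() call
    pending_interrupt: bool,
    interrupts_handled: usize,
}

impl ToyEngine {
    /// Execute up to `limit` operations, then yield back to the host.
    fn run(&mut self) -> ToyState {
        for _ in 0..self.limit {
            // Running off the end of the program is treated as a halt.
            let op = self.program.get(self.pc).copied().unwrap_or(ToyOp::Halt);
            match op {
                ToyOp::Work => self.pc += 1,
                ToyOp::Wfi => {
                    if self.pending_interrupt {
                        self.pending_interrupt = false;
                        self.interrupts_handled += 1;
                        self.pc += 1;
                    } else {
                        return ToyState::Waiting;
                    }
                }
                ToyOp::Halt => return ToyState::Halted,
            }
        }
        ToyState::Running
    }

    /// Host side: mark an interrupt as pending for the guest.
    fn interrupt(&mut self) {
        self.pending_interrupt = true;
    }
}

/// Drive the engine to completion; returns how many times it had to be resumed.
fn drive(engine: &mut ToyEngine) -> usize {
    let mut resumes = 0;
    loop {
        match engine.run() {
            ToyState::Running => resumes += 1,
            ToyState::Waiting => engine.interrupt(),
            ToyState::Halted => return resumes,
        }
    }
}
```

The point of the model is that the host stays in control: it decides when to resume after the budget runs out and when (or whether) to deliver an interrupt to a waiting guest.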
Check the available features and their descriptions below:
m_extension:
- Enable the RV32M extension (multiply and divide instructions).
- Disabled by default, no additional dependencies.
a_extension:
- Enable the RV32A extension (atomic instructions).
- Disabled by default, no additional dependencies.
- Fully support RV32IMAZicsr_Zifencei (machine mode)
  - RV32I Base Integer Instruction Set
  - M Extension (Multiplication and Division Instructions)
  - Zifencei
    - Implemented as a no-operation as it isn't applicable (Single HART, no cache, no memory-mapped devices, etc.).
  - A Extension (Atomic Instructions)
  - Zicsr
    - Implement machine mode CSRs (Needed for supporting interrupts)
  - Machine mode instructions (MRET & WFI)
- System Calls
  - Function calls from interpreted to native code
- Resource limiter
  - Yield the engine after a configurable amount of instructions are executed.
- CI/CD
  - Incorporate more tests into the repository and create test automations for PRs
- Interrupts
  - Interpreted code interruption triggered by the host.
- Bytecode optimization
  - Allow in-place compilation to a format easier to parse.
    - Less bit-shifting, faster instruction matching, etc.
    - Should be kept as close as possible to native RISC-V bytecode.
Fully implementing the RISC-V F and/or D extensions would require using a soft-float library, as Rust doesn't support custom rounding modes nor does it expose the IEEE exception flags.
As the available soft-float libraries do not satisfy my requirements (must be portable, safe, no_std, and support all rounding modes and exception flags), this feature is on hold until (and if) an alternative is found.
The RISC-V Compressed extension adds 16-bit instruction support for the most used operations, which can decrease the binary size by about 25%.
While smaller binaries are always welcome, the compressed extension has a far more complex decoding process, as the instruction format cannot be known from the opcode alone. Beyond that, some instructions can only be distinguished after decoding almost all the data, even in cases where the format is very different between them (e.g., C.ANDI and C.SUB).
As handling this would decrease the performance of Embive, as well as make the future JIT/AOT feature a lot more difficult, the RVC extension won't be supported (at least for now).
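To illustrate the decoding issue, here is a minimal sketch based on the RVC field layout from the RISC-V specification. It is not Embive code and the names are hypothetical. C.ANDI and C.SUB share the same 2-bit opcode and the same funct3, so a decoder has to inspect bits 11:10, and for the register-register group also bit 12 and bits 6:5, before it even knows whether the remaining bits hold an immediate or a second register.

```rust
/// True when the low two bits mark a 16-bit (compressed) instruction.
/// Standard 32-bit instructions have both low bits set.
fn is_compressed(halfword: u16) -> bool {
    (halfword & 0b11) != 0b11
}

/// RV32C instructions sharing op = 01 and funct3 = 100.
/// Hypothetical type, used only in this sketch.
#[derive(Debug, PartialEq)]
enum Q1Alu {
    Srli,
    Srai,
    Andi,
    Sub,
    Xor,
    Or,
    And,
    Other, // reserved on RV32 (these encodings are C.SUBW / C.ADDW on RV64)
}

/// Classify a halfword within that group; `None` if it is not in the group.
/// Shift-amount legality checks are omitted for brevity.
fn classify(halfword: u16) -> Option<Q1Alu> {
    let op = halfword & 0b11;
    let funct3 = (halfword >> 13) & 0b111;
    if op != 0b01 || funct3 != 0b100 {
        return None;
    }
    let funct2 = (halfword >> 10) & 0b11; // bits 11:10
    let bit12 = (halfword >> 12) & 0b1;
    let low = (halfword >> 5) & 0b11; // bits 6:5
    Some(match (funct2, bit12, low) {
        (0b00, _, _) => Q1Alu::Srli,
        (0b01, _, _) => Q1Alu::Srai,
        (0b10, _, _) => Q1Alu::Andi, // bits 12 and 6:2 are an immediate
        (0b11, 0, 0b00) => Q1Alu::Sub, // bits 4:2 are a register
        (0b11, 0, 0b01) => Q1Alu::Xor,
        (0b11, 0, 0b10) => Q1Alu::Or,
        (0b11, 0, 0b11) => Q1Alu::And,
        _ => Q1Alu::Other,
    })
}
```

For example, the halfwords `0x8805` and `0x8C05` (hand-assembled from the field layout for this sketch) differ only in bit 10, yet the first classifies as `Andi` and the second as `Sub`. In the 32-bit base encoding, by contrast, the low 7 opcode bits are enough to pick the instruction format.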
Embive is guaranteed to compile on stable Rust 1.81 and up.
Embive is licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.