Embive (Embedded RISC-V)

Embive is a low-level sandboxing library focused on the embedding of untrusted code for constrained environments.
As it interprets RISC-V bytecode, multiple languages are supported out of the box by Embive (Rust, C, C++, Zig, Nim, etc.).
By default, it doesn’t require external crates, dynamic memory allocation or the standard library (no_std & no_alloc).

Embive is designed so that any error during execution is recoverable, allowing the host to handle it as needed. As such, no panics should occur in release builds, regardless of the bytecode being executed.

Currently, it supports the RV32I[MA]Zicsr_Zifencei instruction set (see Features).

Bytecode

The bytecode can be generated by any compiler that supports the RISC-V 32-bit instruction set, as long as it can output a flat binary file (.bin) statically linked to the correct addresses (Code at 0x00000000, RAM at 0x80000000).
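
To make the address layout concrete, here is a minimal sketch of how a guest address could be mapped to one of the two regions. The `Region` enum and the `resolve` helper are hypothetical names used only for illustration; they are not part of Embive's API, and Embive's own memory implementation may work differently. Only the two base addresses come from the text above.

```rust
/// Base addresses stated in this README.
const CODE_BASE: u32 = 0x0000_0000;
const RAM_BASE: u32 = 0x8000_0000;

/// Region an address falls into, with the byte offset inside that region.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum Region {
    Code(usize),
    Ram(usize),
}

/// Hypothetical helper: map a guest address to a region and offset,
/// or `None` if the address is outside both regions.
fn resolve(addr: u32, code_len: usize, ram_len: usize) -> Option<Region> {
    if addr >= RAM_BASE {
        let offset = (addr - RAM_BASE) as usize;
        (offset < ram_len).then_some(Region::Ram(offset))
    } else {
        let offset = (addr - CODE_BASE) as usize;
        (offset < code_len).then_some(Region::Code(offset))
    }
}

fn main() {
    // 40 bytes of code and 4 KiB of RAM.
    println!("{:?}", resolve(0x0000_0004, 40, 4096));
    println!("{:?}", resolve(0x8000_0000, 40, 4096));
    println!("{:?}", resolve(0x8000_1000, 40, 4096));
}
```

A linker script for the guest program has to agree with this layout: code is placed starting at 0x00000000 and all writable data starting at 0x80000000.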

Templates

The following templates are available for programs that run inside Embive:

Example

use core::num::NonZeroI32;
use embive::{
    engine::{Config, Engine, EngineState, SYSCALL_ARGS},
    memory::{Memory, SliceMemory},
    registers::CPURegister, Error,
};

// A simple syscall implementation. Check [`embive::engine::SyscallFn`].
fn syscall<M: Memory>(
    nr: i32,
    args: &[i32; SYSCALL_ARGS],
    memory: &mut M
) -> Result<i32, NonZeroI32> {
    // Match the syscall number
    match nr {
        1 => Ok(args[0] + args[1]), // Add two numbers (args[0] + args[1])
        2 => match memory.load(args[0] as u32) { // Load from RAM (address in args[0])
            Ok(val) => Ok(i32::from_le_bytes(val)), // RISC-V is little endian
            Err(_) => Err(1.try_into().unwrap()), // Could not read memory
        },
        _ => Err(2.try_into().unwrap()), // Not implemented
    }
}

fn main() -> Result<(), Error> {
    // "10 + 20" using syscalls (load from ram and add two numbers)
    let code = &[
        0x93, 0x08, 0x20, 0x00, // li   a7, 2      (Syscall nr = 2)
        0x13, 0x05, 0x10, 0x00, // li   a0, 1      (a0 = 1)
        0x13, 0x15, 0xf5, 0x01, // slli a0, a0, 31 (a0 << 31) (0x80000000)
        0x93, 0x02, 0xa0, 0x00, // li   t0, 10     (t0 = 10)
        0x23, 0x20, 0x55, 0x00, // sw   t0, 0(a0)  (Store t0 in addr. a0)
        0x73, 0x00, 0x00, 0x00, // ecall           (Syscall, load from arg0)
        0x93, 0x08, 0x10, 0x00, // li   a7, 1      (Syscall nr = 1)
        0x13, 0x05, 0x40, 0x01, // li   a0, 20     (a0 = 20)
        0x73, 0x00, 0x00, 0x00, // ecall           (Syscall, add two args)
        0x73, 0x00, 0x10, 0x00, // ebreak          (Halt)
    ];

    // 4KB of RAM
    let mut ram = [0; 4096];

    // Create memory from code and RAM slices
    let mut memory = SliceMemory::new(code, &mut ram);

    // Create engine config
    let config = Config::default()
        .with_syscall_fn(Some(syscall))
        .with_instruction_limit(10);

    // Create engine
    let mut engine = Engine::new(&mut memory, config)?;

    // Run it until ebreak, triggering an interrupt after every wfi
    loop {
        match engine.run()? {
            EngineState::Running => {},
            EngineState::Waiting => engine.interrupt()?,
            EngineState::Halted => break,
        }
    }

    // Check the result (Ok(30)): a0 holds the error code (0), a1 the value (30)
    assert_eq!(
        engine.registers.cpu.get(CPURegister::A0 as usize)?,
        0
    );
    assert_eq!(
        engine.registers.cpu.get(CPURegister::A1 as usize)?,
        30
    );
    
    Ok(())
}

System Calls

System calls let the untrusted code interact with the host environment. When a system call function is provided to the engine, it is called whenever the ecall instruction is executed. See the engine::SyscallFn documentation for more information.
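
As a rough illustration of the shape of a system call handler, the sketch below is a standalone variant of the one in the example, without the memory parameter so that it compiles without the crate. Several things here are assumptions, not Embive's API: the local `SYSCALL_ARGS` value of 7 (the real constant is `embive::engine::SYSCALL_ARGS`), the `error_code` and `to_registers` helpers, and the error numbers. The mapping of the result to `a0` (error code) and `a1` (value) is inferred from the assertions in the example; what `a1` holds on error is a guess.

```rust
use core::num::NonZeroI32;

/// Assumed argument count; the real value is `embive::engine::SYSCALL_ARGS`.
const SYSCALL_ARGS: usize = 7;

/// Hypothetical helper: build a non-zero error code at compile time,
/// so the handler itself never has to unwrap at runtime.
const fn error_code(code: i32) -> NonZeroI32 {
    match NonZeroI32::new(code) {
        Some(code) => code,
        None => panic!("error codes must be non-zero"),
    }
}

// Hypothetical error numbers for this sketch.
const ERR_OVERFLOW: NonZeroI32 = error_code(1);
const ERR_NOT_IMPLEMENTED: NonZeroI32 = error_code(2);

/// Dispatch on the syscall number; arguments arrive as plain `i32` values.
fn syscall(nr: i32, args: &[i32; SYSCALL_ARGS]) -> Result<i32, NonZeroI32> {
    match nr {
        // Add two numbers, reporting overflow as an error instead of wrapping.
        1 => args[0].checked_add(args[1]).ok_or(ERR_OVERFLOW),
        _ => Err(ERR_NOT_IMPLEMENTED),
    }
}

/// Assumed register mapping: (a0, a1) = (error code, value).
fn to_registers(result: Result<i32, NonZeroI32>) -> (i32, i32) {
    match result {
        Ok(value) => (0, value),
        Err(code) => (code.get(), 0),
    }
}

fn main() {
    let result = syscall(1, &[10, 20, 0, 0, 0, 0, 0]);
    println!("{:?} -> {:?}", result, to_registers(result));
}
```

Building the error codes as constants keeps the handler free of runtime `unwrap` calls, in line with the goal of avoiding panics.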

Interrupts

Embive lets the host trigger interrupts in the interpreted code. Interrupts complement system calls by allowing asynchronous communication between the host and the interpreted code. See the engine::Engine::interrupt documentation for more information.

Features

Check the available features and their descriptions below:

  • m_extension:
    • Enable the RV32M extension (multiply and divide instructions).
      • Disabled by default, no additional dependencies.
  • a_extension:
    • Enable the RV32A extension (atomic instructions).
      • Disabled by default, no additional dependencies.

Roadmap

  • Fully support RV32IMAZicsr_Zifencei (machine mode)
    • RV32I Base Integer Instruction Set
    • M Extension (Multiplication and Division Instructions)
    • Zifencei
      • Implemented as a no-operation as it isn't applicable (Single HART, no cache, no memory-mapped devices, etc.).
    • A Extension (Atomic Instructions)
    • Zicsr
      • Implement machine mode CSRs (Needed for supporting interrupts)
    • Machine mode instructions (MRET & WFI)
  • System Calls
    • Function calls from interpreted to native code
  • Resource limiter
    • Yield the engine after a configurable number of instructions have been executed.
  • CI/CD
    • Incorporate more tests into the repository and create test automations for PRs
  • Interrupts
    • Interpreted code interruption triggered by the host.
  • Bytecode optimization
    • Allow in-place compilation to a format easier to parse.
      • Less bit-shifting, faster instruction matching, etc.
    • Should be kept as close as possible to native RISC-V bytecode.

What about Floating Point?

Fully implementing the RISC-V F and/or D extensions would require using a soft-float library, as Rust doesn't support custom rounding modes nor does it expose the IEEE exception flags.

As the available soft-float libraries do not satisfy my requirements (must be portable, safe, no_std, and support all rounding modes and exception flags), this feature is on hold until (and unless) an alternative is found.

What about Compressed Instructions?

The RISC-V Compressed (RVC) extension adds 16-bit encodings for the most commonly used operations, which can decrease binary size by about 25%.

While smaller binaries are always welcome, the compressed extension has a far more complex decoding process, as the instruction format cannot be known from the opcode alone. Beyond that, some instructions can only be distinguished after decoding almost all of their bits, even when their formats are very different (e.g. C.ANDI and C.SUB).

As handling this would decrease the performance of Embive, as well as make the future JIT/AOT feature a lot more difficult, the RVC extension won't be supported (at least for now).
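
The C.ANDI / C.SUB case can be sketched as follows. This is not Embive code (Embive does not decode RVC); the `Decoded` enum and `classify` function are hypothetical, cover only a small subset of the extension, and ignore reserved encodings. Both instructions share the same quadrant (low bits `01`) and the same `funct3` (`100`), so the decoder has to look at bits 11:10, then bit 12, then bits 6:5 before it knows whether the remaining bits are an immediate or a register. The two test encodings were assembled by hand for this sketch.

```rust
/// Small subset of instruction kinds, for illustration only.
#[derive(Debug, PartialEq)]
enum Decoded {
    Full32, // standard 32-bit instruction (low two bits are 0b11)
    CSrli,
    CSrai,
    CAndi,
    CSub,
    CXor,
    COr,
    CAnd,
    Other, // any compressed instruction this sketch does not cover
}

/// Classify the first 16 bits of an instruction.
fn classify(half: u16) -> Decoded {
    let quadrant = half & 0b11;
    if quadrant == 0b11 {
        return Decoded::Full32;
    }
    let funct3 = (half >> 13) & 0b111;
    if quadrant != 0b01 || funct3 != 0b100 {
        return Decoded::Other;
    }
    // Same quadrant and funct3: bits 11:10 select the sub-group.
    match (half >> 10) & 0b11 {
        0b00 => Decoded::CSrli,
        0b01 => Decoded::CSrai,
        0b10 => Decoded::CAndi, // bits 12 and 6:2 are an immediate
        _ => {
            // Register-register group: bit 12 must be 0 on RV32.
            if (half >> 12) & 1 == 1 {
                return Decoded::Other;
            }
            // Bits 6:5 pick the operation; bits 4:2 are a register here.
            match (half >> 5) & 0b11 {
                0b00 => Decoded::CSub,
                0b01 => Decoded::CXor,
                0b10 => Decoded::COr,
                _ => Decoded::CAnd,
            }
        }
    }
}

fn main() {
    println!("{:?}", classify(0x8905)); // c.andi a0, 1
    println!("{:?}", classify(0x8D0D)); // c.sub  a0, a1
    println!("{:?}", classify(0x0513)); // low half of a 32-bit instruction
}
```

By contrast, a 32-bit instruction's format follows from its 7-bit opcode alone.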

Minimum supported Rust version (MSRV)

Embive is guaranteed to compile on stable Rust 1.81 and up.

License

Embive is licensed under either of

  • Apache License, Version 2.0 (LICENSE-APACHE)
  • MIT license (LICENSE-MIT)

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
