Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Compile to buffer #3862

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

RichardAH
Copy link
Contributor

Hi guys!

We need this compile-to-buffer API for the Xahau blockchain (https://github.com/Xahau/xahaud ) which uses WasmEdge for smart contracts, specifically we want to use ahead of time compilation to cache execution of frequently used contracts.

I haven't had time to test, this is just a preliminary version of the PR. If someone else thinks this is a needed API change please feel free to assist.

Copy link
Member

juntao commented Nov 4, 2024

Hello, I am a code review agent on flows.network. Here are my reviews of changed source code files in this PR.


include/api/wasmedge/wasmedge.h

Potential issues

Issues:

  1. Missing Include Guards:

    • The source code does not include any include guards for the header file, which can lead to multiple inclusions if included more than once.
  2. Inconsistent Naming Conventions:

    • There is inconsistency in naming conventions used throughout the code, such as using WASMEDGE_CAPI_EXPORT in some places and WasmEdge_Result_Success with a mix of uppercase and lowercase.
  3. Missing Function Definitions for C++ Compatibility:

    • For C++ compatibility, the macros WasmEdge_Result_Success, etc., should be defined properly to avoid potential issues when using these constants in C++ code.

Summary of changes

Summary of Key Changes

  • Added New Functions:

    • WasmEdge_CompilerCompileFromBufferToBuffer: Compiles a WebAssembly buffer and writes the result to another buffer.
    • WasmEdge_CompilerCompileFromBytesToBytes: Compiles a WebAssembly WasmEdge_Bytes and stores the output in another WasmEdge_Bytes.
  • Deprecation Notice:

    • The original function WasmEdge_CompilerCompileFromBuffer is marked for deprecation, suggesting future replacement by the new buffer-to-buffer compilation functions.

include/llvm/codegen.h

Potential issues

  1. The codegen_buffer method's parameter OutByte is not marked as const, which could lead to unintended modifications of the output buffer by the caller.
  2. The return type of codegen_buffer is Expect<void>, but it should ideally return an error code if there is an issue with writing to the output buffer, rather than throwing an exception.
  3. The Conf member variable is not used within the class, which makes the parameter in the constructor redundant unless there are other methods or members using it.

Summary of changes

  • Added Buffer-based Compilation Method: Introduced a new method codegen_buffer that compiles LLVM Module into a buffer instead of writing to a file, returning the number of bytes written.

  • Modified Private Method Signature: Changed the private helper method codegen_osvec to return an LLVM::MemoryBuffer.

  • Added Forward Declaration for MemoryBuffer: Added a forward declaration for the MemoryBuffer class within the LLVM namespace.

lib/api/wasmedge.cpp

Potential issues

  1. The code does not handle any potential errors that might occur during the compilation process.
  2. There is no check for a nullptr when passing WasmEdge_ValueArr to fillWasmEdge_ValueArr, which could lead to undefined behavior if the pointer is null.
  3. The genSpan functions do not check if the input buffer is null before returning a Span, which can lead to invalid memory accesses.

Summary of changes

  • Added a non-const version of genSpan function to handle mutable buffers.
  • Introduced two new C API functions: WasmEdge_CompilerCompileFromBufferToBuffer and WasmEdge_CompilerCompileFromBytesToBytes, which allow compiling WebAssembly code from an input buffer to an output buffer directly.
  • Implemented the compilation logic in WasmEdge_CompilerCompileFromBytesToBytes using existing methods, handling both LLVM and non-LLVM scenarios.

lib/llvm/codegen.cpp

Potential issues

  1. Potential Memory Leak in outputWasmLibrary: The function outputWasmLibrary creates a temporary object file but does not clean up the intermediate .obj/.o file or the .lib file if it exists, leading to potential memory leaks.

  2. Uninitialized Variable in getOSVersion: In the macOS implementation of getOSVersion, there is an issue where if the system call to get OS version fails, the function returns a default value but does not initialize the variables correctly.

  3. Error Handling in codegen_buffer Function: The function codegen_buffer does not provide meaningful error handling and always returns an IllegalPath error when it encounters issues, which might make debugging harder without additional context.

Summary of changes

  • The function CodeGen::codegen has been renamed to CodeGen::codegen_osvec and now returns an LLVM::MemoryBuffer.
    • A new function CodeGen::codegen_buffer has been added, which takes an output buffer and size pointer, writes the generated code to the buffer, and sets the size.
    • The original codegen function has been updated to call codegen_osvec, handle the case where the result is invalid, and then write the generated code to the output file based on the compiler configuration.

test/api/APIAOTCoreTest.cpp

Potential issues

  1. The function Compile is capturing the entire configuration context by value (auto Path = std::filesystem::u8path(Filename);), which could lead to unexpected behavior if ConfCxt is modified elsewhere.

  2. In multiple places (e.g., in T.onModule, T.onLoad, etc.), the function Compile is called even before checking if the module or file needs to be recompiled, which can be inefficient.

  3. There is no error handling for the case where WasmEdge_CompilerCompile fails and returns an invalid SOFilename. This could lead to subsequent operations on an uninitialized SOFilename.

Summary of changes

  • Added a new line after return SOPath; to separate it from the following code block.
    • Introduced a lambda function for the T.onModule member variable, capturing local variables VM and Compile.
    • Passed two parameters (ModName and Filename) to the lambda function, which are expected to be used within the function body.

@github-actions github-actions bot added c-CAPI An issue related to the WasmEdge C API c-Test An issue/PR to enhance the test suite labels Nov 4, 2024
@@ -1670,6 +1670,31 @@ WASMEDGE_CAPI_EXPORT extern WasmEdge_Result WasmEdge_CompilerCompileFromBuffer(
WasmEdge_CompilerContext *Cxt, const uint8_t *InBuffer,
const uint64_t InBufferLen, const char *OutPath);

/// Compile the input WASM from the given buffer and output to another buffer.
///
/// CAUTION: This function will be deprecated and replaced by
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the following one will replace this function, why do we still need this now?

@@ -1685,6 +1710,24 @@ WASMEDGE_CAPI_EXPORT extern WasmEdge_Result
WasmEdge_CompilerCompileFromBytes(WasmEdge_CompilerContext *Cxt,
const WasmEdge_Bytes Bytes,
const char *OutPath);
/// Compile the input WASM from a WasmEdge_Bytes, and output it to a WasmEdge_Bytes.///
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra /// at the end of the line.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And also lost an empty line.

/// Compiling LLVM Module into loadable executable binary.
class CodeGen {
public:
CodeGen(const Configure &Conf) noexcept : Conf(Conf) {}
Expect<void> codegen(Span<const Byte> WasmData, Data D,
std::filesystem::path OutputPath) noexcept;
// returns number of bytes written to output buffer
Expect<void> codegen_buffer(Span<const Byte> WasmData,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be codegenBuffer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think codegenToBuffer is better.


private:
Expect<LLVM::MemoryBuffer> codegen_osvec
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe codegenMemBuf or codegenOSVec is better?

/// Compiling LLVM Module into loadable executable binary.
class CodeGen {
public:
CodeGen(const Configure &Conf) noexcept : Conf(Conf) {}
Expect<void> codegen(Span<const Byte> WasmData, Data D,
std::filesystem::path OutputPath) noexcept;
// returns number of bytes written to output buffer
Expect<void> codegen_buffer(Span<const Byte> WasmData,
Data D, Span<const Byte> OutByte, // not actually const
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is weird. If the OutByte is not actually const, then we should not use const Byte here.

std::filesystem::path LLPath(OutputPath);
LLPath.replace_extension("ll"sv);

auto result = codegen_osvec(WasmData, D);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming style: Result.

Comment on lines +660 to +663
// TODO:return error
// RH NOTE: the error values are all IllegalPath in this code when
// I got here, I'm not sure if I should change the enum and update them
// or if there's already a plan and scheme to do that...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are having the new release for 0.15, it's fine to update the enum and add more ErrCode to show the real error value.

// avoiding a copy here would also involve modifying LLVM::MemoryBuffer.
// Also trying to glue this function back to the CAPI proved annoying for
// const preservation, so here is an ugly const cast.
memcpy(const_cast<uint8_t*>(OutByte.data()), buf.data(), buf.size());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer to use copy_n instead of any memcpy series function.

}

LLVM::MemoryBuffer& buf = *result;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming issue: Buf.


*OutSize = 0;

auto result = codegen_osvec(WasmData, D);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming issue: Result.

@RichardAH
Copy link
Contributor Author

Thanks for the reviews I'll get back to this hopefully in the coming week

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c-CAPI An issue related to the WasmEdge C API c-Test An issue/PR to enhance the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants