[C++][Gandiva] Migrate JIT engine from MCJIT to LLJIT #37848
Description
Gandiva currently employs MCJIT as its internal JIT engine. However, LLVM has introduced a newer JIT API known as ORC v2/LLJIT [1], which presents several advantages over MCJIT:
- Active Maintenance: ORC v2 is under active development and maintenance by LLVM developers. In contrast, MCJIT is not receiving active updates and, based on indications from LLVM developers, is slated for eventual deprecation and removal.
- Modularity and Organization: ORC v2 boasts a more organized and modular structure, granting users the flexibility to seamlessly integrate various JIT components.
- Thread-Local Variable Support: ORC v2 natively supports thread-local variables, enhancing its functionality.
- Enhanced Resource Management: When compared to MCJIT, ORC v2 provides a more granular approach to resource management, optimizing memory usage and code compilation.
In my project, I've experimented with this migration and got it to work in a prototype. However, transitioning Gandiva to this new API is a substantial undertaking. I'm keen to gauge the community's interest in migrating to this new JIT engine API and would greatly appreciate any feedback or insights. Thank you.
Proposal
- There won't be any Gandiva user-facing API change: the `Projector` and `Filter` APIs remain the same
- There will be some API changes for the `LLVMGenerator` and `Engine` classes
  - The constructors of both `LLVMGenerator` and `Engine` are expected to take an optional additional `GandivaObjectCache` reference, because LLJIT requires the object cache mechanism to be set up when the LLJIT instance is initialized
  - There will be a major change to the `Engine` class implementation, since it currently interfaces with MCJIT directly and we will replace the MCJIT-related APIs with the LLJIT-related APIs
- There may be a minor change to the `Configuration` class: it is expected to gain a new configuration option called `needs_ir_dumping`, because LLJIT doesn't allow retrieving the IR from a module at any time. Gandiva currently has an API called `DumpIR` which allows dumping IR at any time, so we need this new option to indicate that IR dumping is needed, and we can then store the IR up front for later dumping
- Performance is expected to be roughly the same (according to the feedback I got from LLVM developers in LLVM discord)
- The LLJIT API has been available since LLVM 7.0 [1][2][3], so theoretically after migration we could support LLVM >= 7.0, but I am not sure whether all the APIs used for the migration support LLVM >= 7.0 across all platforms, so this migration may require a higher version of LLVM (testing is needed)
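
To make the `needs_ir_dumping` idea more concrete, here is a minimal stand-alone sketch of "store the IR up front for later dumping". It does not use LLVM or the real Gandiva classes: `SketchConfiguration`, `SketchEngine`, and `FinalizeModule` are hypothetical names invented for illustration, and the module is modelled as a plain string of IR text. The actual implementation may look quite different.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for Gandiva's Configuration class. Only the proposed
// needs_ir_dumping option is modelled here.
class SketchConfiguration {
 public:
  explicit SketchConfiguration(bool needs_ir_dumping = false)
      : needs_ir_dumping_(needs_ir_dumping) {}

  bool needs_ir_dumping() const { return needs_ir_dumping_; }

 private:
  bool needs_ir_dumping_;
};

// Hypothetical stand-in for the Engine class. A real engine would hand an
// llvm::Module to the JIT; in this sketch the module is just its IR text.
class SketchEngine {
 public:
  explicit SketchEngine(SketchConfiguration config) : config_(config) {}

  // Models handing the module over to the JIT. Once this returns, the engine
  // no longer holds the module, so the IR has to be captured beforehand.
  void FinalizeModule(std::string module_ir) {
    if (config_.needs_ir_dumping()) {
      stored_ir_ = module_ir;  // snapshot taken up front, only when requested
    }
    // Simulate the JIT taking ownership: the local copy goes away here.
    std::string handed_over = std::move(module_ir);
    (void)handed_over;
  }

  // Plays the role of DumpIR: it can only return what was stored up front.
  const std::string& DumpIR() const {
    if (!config_.needs_ir_dumping()) {
      throw std::logic_error(
          "IR was not stored; enable needs_ir_dumping in the configuration");
    }
    return stored_ir_;
  }

 private:
  SketchConfiguration config_;
  std::string stored_ir_;
};
```

With this shape, a caller that wants IR dumps constructs the engine with `SketchConfiguration{true}` and can call `DumpIR()` after finalization, while the default configuration avoids the cost of keeping a copy of the IR around.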
References
[1] https://llvm.org/docs/ORCv2.html
[2] https://github.com/llvm/llvm-project/commits/c4e764ea24eb02b6ec34038061cee8ff94c0f34c/llvm/include/llvm/ExecutionEngine/Orc/LLJIT.h?after=c4e764ea24eb02b6ec34038061cee8ff94c0f34c+34
[3] LLVM release dates, https://releases.llvm.org
Component(s)
C++ - Gandiva