Tags: mlc-ai/mlc-llm
[Model] Add support for OLMo architecture (#3046) This PR adds support for the OLMo architecture. Additional support: clip-qkv. Test: tested on Android (Pixel 4) and CUDA (with tensor_parallel_shards=2).
Added Hermes 3 support (#2886) * added Hermes 3 support * modified format * fixed lint
Initial commit --------- Co-authored-by: Hongyi Jin <jinhongyi02@gmail.com> Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu> Co-authored-by: Tianqi Chen <tqchen@cmu.edu> Co-authored-by: Junru Shao <junrushao@apache.org> Co-authored-by: Zihao Ye <zhye@cs.washington.edu>