Upstream merge Dec 01 #94

Merged Dec 1, 2023 (179 commits)
898db76
[API] Add GenerationConfig (#1024)
davidpissarra Oct 8, 2023
ad3a6b9
Fix two bugs in kv-cache backtrack loop (#856)
shenberg Oct 8, 2023
6e40c21
[Build] Added --pdb flag to build.py, drop into pdb on error (#1017)
Lunderberg Oct 8, 2023
bae37b3
[Android] Use `AlertDialog` instead of `Toast` (#1039)
cyx-6 Oct 8, 2023
b44f679
Add doc for ChatConfig, ConvConfig, GenerationConfig, BuildArgs (#1040)
CharlieFRuan Oct 9, 2023
3a9849a
[Android] Add Llama2 q4f16_0 (#1041)
spectrometerHBH Oct 9, 2023
bed9e60
[Docs] Model prebuilts tracking page revamp (#1000)
CharlieFRuan Oct 9, 2023
c02fdaf
Update compile_models.rst (#1038)
yongjer Oct 9, 2023
85001ed
Support for the Stable LM 3B model (#1008)
jeethu Oct 9, 2023
a032d40
[Docs] Iterate model prebuilts docs (#1043)
CharlieFRuan Oct 9, 2023
a58605f
Update README.md
junrushao Oct 9, 2023
bdd9d9b
[CPP] Separate common utils out from llm_chat.cc (#1044)
MasterJH5574 Oct 9, 2023
20131fb
Update README.md (#1045)
junrushao Oct 9, 2023
1e6fb11
add verbose stats to mlc-chat REST API (#1049)
denise-k Oct 11, 2023
b9179cf
[Transform] Apply split_rotary optimization on prefill (#1033)
Lunderberg Oct 12, 2023
98ebd28
[Docs] Add `mlc.ai/package` to `DEPENDENCY INSTALLATION` group (#1055)
LeshengJin Oct 12, 2023
bfaa5b9
Revert "[Transform] Apply split_rotary optimization on prefill (#1033…
MasterJH5574 Oct 12, 2023
ca8c11b
[BugFix] Set the right `max_sequence_length` for both Llama-1 and Lla…
sunggg Oct 13, 2023
edab9b5
[Doc] Use -U instead of --force-reinstall (#1062)
junrushao Oct 13, 2023
d854105
[Model] Initial batching support for Llama (#1048)
MasterJH5574 Oct 14, 2023
c2b8cbc
Fix Stable LM 3B build (#1061)
jeethu Oct 14, 2023
481cd92
[Core] Remove duplication in MODEL.get_model calls (#1054)
Lunderberg Oct 14, 2023
8184431
[ParamManager] Cleanup creation of quantization IRModule (#1053)
Lunderberg Oct 14, 2023
9010d48
Minor typo fix (#1064)
jeethu Oct 15, 2023
b0bfc88
Add links to Python API Reference (#1068)
junrushao Oct 15, 2023
204860b
[Fix] ChatModule incorrect temperature buffer shape (#1070)
MasterJH5574 Oct 15, 2023
d202077
[ParamManager] Added progress bar for get_item/set_item (#1063)
Lunderberg Oct 16, 2023
9872c48
[Python] Extract common device str parse function in ChatModule (#1074)
MasterJH5574 Oct 16, 2023
3aefd9f
[Bugfix] Compilation Error in q4f32_1 (#1078)
junrushao Oct 17, 2023
2625945
Establish `mlc_chat.compiler` (#1082)
junrushao Oct 19, 2023
56a8004
Update README.md for Multi-GPU (#1090)
junrushao Oct 19, 2023
b0373d1
Support lib_path override in C++. Improvements on docs and error mess…
rickzx Oct 19, 2023
830656f
StreamIterator (#1057)
varshith15 Oct 19, 2023
9bf5723
Update `benchmark.py` according to #1086 (#1091)
junrushao Oct 19, 2023
62d0c03
Disable Disco for q4f16_ft and q8f16_ft quantization (#1094)
LeshengJin Oct 20, 2023
cf39bf6
[Format] Apply isort and black for `python/` (#1097)
junrushao Oct 20, 2023
e9b85ce
More formatting (#1099)
junrushao Oct 21, 2023
03c641a
Enable Python Linter (#1098)
junrushao Oct 21, 2023
46d11e6
Add Basic Pylint and Mypy Tooling (#1100)
junrushao Oct 21, 2023
6159cc4
[CI] Add clang-format (#1103)
junrushao Oct 22, 2023
16dd2ae
[Slim-LM] Smart path finding for config and weight (#1088)
LeshengJin Oct 23, 2023
f57c9c9
[Transform] Provide IRModule transform for rewrite_attention (#1052)
Lunderberg Oct 23, 2023
e5927ce
[ParamManager] Use BundleModelParams for transform_dequantize (#1056)
Lunderberg Oct 23, 2023
7ae8c6d
[Slim-LM] Introduce HFLoad for loading Pytorch and SafeTensor weights…
LeshengJin Oct 23, 2023
5a7dcd8
[WINDOWS] reduce noise in windows build (#1115)
tqchen Oct 24, 2023
61179a0
Add CLI commands for compilation (#1109)
junrushao Oct 24, 2023
8ce7793
Auto updated submodule references
Oct 24, 2023
488017d
fix mismatched argument name (#1117)
Sing-Li Oct 24, 2023
206103b
[Docs] Add doc for max and mean gen len, shift factor; and buildArgs …
CharlieFRuan Oct 24, 2023
2aa6809
Revert "[ParamManager] Use BundleModelParams for transform_dequantize…
junrushao Oct 24, 2023
9cb8e8e
Remove inaccurate warning message (#1121)
junrushao Oct 24, 2023
9166edb
[REST] OpenAI compatible Rest API (#1107)
Kartik14 Oct 24, 2023
a4279e3
Add --opt flag parsing to CLI (#1123)
junrushao Oct 25, 2023
973f9fc
[ParamManager][Redo] Use BundleModelParams for transform_dequantize (…
Lunderberg Oct 25, 2023
24f795e
added details to windows installation (#1133)
goutham2688 Oct 27, 2023
2c492e5
Grammatical and Typographical improvements (#1139)
tmsagarofficial Oct 28, 2023
2ec0cc8
Minor enhancements to `ChatModule` (#1132)
YuchenJin Oct 28, 2023
27ac5ac
Updating tvm install docs (#1143)
David-Sharma Oct 29, 2023
2b6d832
Make the help info consistent with program name (#1137)
fennecJ Oct 29, 2023
878ae84
Support parameter packing (#1146)
junrushao Oct 29, 2023
c0c3a8d
[Slim-LM] Enable Group Quant (#1129)
zxybazh Oct 29, 2023
2193767
Enable Mypy and Pylint in mlc_chat Python Package (#1149)
junrushao Oct 29, 2023
0a25374
Migrate Compiler Passes (#1150)
junrushao Oct 30, 2023
1a79a53
Compile Model Preset without External `config.json` (#1151)
junrushao Oct 30, 2023
ba67835
Update attention layer (#1153)
junrushao Oct 30, 2023
fee2cb5
Add batched Llama model definition using vLLM paged attention (#1134)
masahi Oct 30, 2023
ece97b1
[Transform][Redo] Apply split_rotary optimization on prefill (#1125)
Lunderberg Oct 30, 2023
b190578
Apply rewrite for normal attention and MQA (#1138)
Lunderberg Oct 30, 2023
8ca0176
[Rest] Fix emoji handling in Rest API. (#1142)
YuchenJin Oct 30, 2023
3cf5605
[Utility] Check for isinstance(exc, Exception) before entering pdb (#…
Lunderberg Oct 30, 2023
0a9d6c7
[Utils] Remove conversion to numpy array in utils.save_params (#1083)
Lunderberg Oct 30, 2023
425a2cb
[Fix][REST] Use lowered-cased "app" (#1159)
junrushao Oct 30, 2023
9076d01
[Rest] Document emoji handling (#1160)
YuchenJin Oct 31, 2023
b5bfa5b
Enable group quant transform with nn.Module (#1154)
cyx-6 Oct 31, 2023
8438b27
Misc Cleanups of Compilation Pipeline (#1165)
junrushao Oct 31, 2023
02d1e57
Support CUDA Multi-Arch Compilation (#1166)
junrushao Oct 31, 2023
e0cd3f6
[Bugfix] Cannot find global function `mlc.llm_chat_create` (#1167)
junrushao Oct 31, 2023
f5b2e88
Fix RWKV Support (#1136)
BBuf Nov 1, 2023
200653a
Auto updated submodule references
Nov 1, 2023
9831135
Fix Android app Permission denied error on Android 10 (#1175)
anibohara2000 Nov 1, 2023
1757777
[SLM] Fix group quantization (#1172)
cyx-6 Nov 1, 2023
2ca7d15
[Fix] TIR block name of dequantization (#1177)
junrushao Nov 2, 2023
53060af
[SLM][AutoLLM] Enable Command Line Weight Conversion (#1170)
zxybazh Nov 2, 2023
2dc8183
[Fix][SLM] Update q4f16 quantization with the new mutator name rule (…
LeshengJin Nov 3, 2023
6ae02dd
[Model Support][SWA] Add support for sliding window attention for Mis…
CharlieFRuan Nov 3, 2023
4716704
Add Python API for Weight Conversion (#1182)
junrushao Nov 4, 2023
9d20575
Merge `llama_config.CONFIG` into `MODEL_PRESETS` (#1188)
junrushao Nov 4, 2023
5d1dc34
Merge llama_config.py into llama_model.py (#1189)
junrushao Nov 4, 2023
4832c2f
Add CodeLlama as part of model presets (#1190)
junrushao Nov 4, 2023
78424f0
[Docs] Clarify zstd installation on Windows (#1191)
junrushao Nov 4, 2023
5d63f7e
[Docs] Clarify zstd installation on Windows (#1196)
junrushao Nov 4, 2023
3417505
Support overriding `--max-sequence-length` in command line (#1197)
junrushao Nov 5, 2023
0e08845
[RestAPI] Added docs (#1193)
anibohara2000 Nov 5, 2023
145a984
[API] ```llm-vscode``` extension support (#1198)
davidpissarra Nov 5, 2023
3413d17
[Fix] Use `fabs` as floating point abs function in C++ (#1202)
junrushao Nov 5, 2023
7ccb51a
Integrating MLC runtime with the new compilation workflow (#1203)
junrushao Nov 6, 2023
65478c8
[Fix] Remove Redundant Warnings (#1204)
junrushao Nov 6, 2023
01d4339
Try fix macOS build with picojson (#1206)
junrushao Nov 6, 2023
51d6f9c
Try fix macOS build with picojson again (#1207)
junrushao Nov 6, 2023
a7f1183
Auto updated submodule references
Nov 6, 2023
e2c99a8
[Fix] Keep update-to-date with upstream API change (#1209)
junrushao Nov 6, 2023
e00220c
Detect `mtriple` via LLVM (#1211)
junrushao Nov 6, 2023
9869ca6
Fix Python3.8 compatibility breakage (#1210)
Lunderberg Nov 6, 2023
4042626
[Slim-LM] Enable loading from AWQ pre-quantized weight. (#1114)
LeshengJin Nov 6, 2023
be1c18b
[Bugfix] Fix Cannot import name '_LIB' from 'mlc_chat.base' (#1214)
CharlieFRuan Nov 7, 2023
1015aae
[SLM] Support `q3f16_1` and `q4f32_1` (#1215)
cyx-6 Nov 8, 2023
1a6fadd
Make the Compilation Working E2E (#1218)
junrushao Nov 8, 2023
616ca42
[Mistral][SWA] Add sliding window to metadata (#1217)
CharlieFRuan Nov 8, 2023
e52f449
Support for `chatml` format conversation (for TinyLlama-1.1B-Chat-v0.…
acalatrava Nov 8, 2023
fbe75e3
Add Rust Support for MLC-LLM (#1213)
YuchenJin Nov 8, 2023
beca2ab
[Bugfix] Remove dependency on openai_api in chat module (#1222)
CharlieFRuan Nov 8, 2023
9ee5705
Bake in RAM Usage in the Generated DSO (#1224)
junrushao Nov 8, 2023
069181c
[Fix] ChatModule python messages and offset types (#1220)
YuchenJin Nov 8, 2023
f1bc951
[Fix] Variable Upperbound Should be Injected before Build Pipeline (#…
junrushao Nov 8, 2023
834811f
[MultiGPU] Support pre-sharded model weights (#1096)
Lunderberg Nov 9, 2023
45bf1c5
[AWQ] e2e awq-quantized model (#1229)
LeshengJin Nov 10, 2023
d08b009
[SLM] Support `q0f16` and `q0f32` (#1228)
cyx-6 Nov 10, 2023
fab4486
[Core][Llama] Argument `max_vocab_size` and `max_batch_size` (#1076)
MasterJH5574 Nov 11, 2023
cd71665
[Llama] Support batched prefill (#1233)
MasterJH5574 Nov 11, 2023
a21c759
[Core] Skip PrimExpr index int32 downcasting for batching (#1234)
MasterJH5574 Nov 11, 2023
cb68e7b
Auto updated submodule references
Nov 12, 2023
1400cd9
Update index.rst (#1236)
a7k3 Nov 12, 2023
c2082d8
Update android.rst (#1237)
a7k3 Nov 12, 2023
26fd019
Correct typo in cuda device name for rust chat model (#1241)
malramsay64 Nov 13, 2023
ab2a05b
Generating mlc-chat-config.json (#1238)
junrushao Nov 13, 2023
d24379c
Rename `--config` to `--model` and Consolidate CLI Messages (#1244)
junrushao Nov 13, 2023
4021785
Specify argument "dest" in argparse (#1245)
junrushao Nov 13, 2023
5005772
Add more stats during quantization (#1246)
junrushao Nov 13, 2023
34c15f2
ensure that max_gen_len is set properly in mlc_chat_config (#1249)
denise-k Nov 13, 2023
7da81a4
[Fix] Memory usage statistics (#1252)
LeshengJin Nov 13, 2023
cd4a8ed
Introduce mlc_chat subcommands (#1251)
junrushao Nov 13, 2023
8305b22
Update mlc-chat-config.json (#1254)
junrushao Nov 14, 2023
5e02cac
[Rust] Support multiple prompts (#1253)
YuchenJin Nov 14, 2023
77a4b69
[UI] Correct "convert_weight_only" to "convert_weights_only" (#1227)
Lunderberg Nov 14, 2023
12efd45
Add a downloader from HuggingFace (#1258)
junrushao Nov 14, 2023
1dbfac5
[Fix] Add prefix_tokens to `ConvConfig` in Python to match C++ implem…
YuchenJin Nov 14, 2023
8d9effe
[nn.Module] Mistral implementation (#1230)
davidpissarra Nov 15, 2023
8304d4c
Add `mlc_chat.__main__` as command line entrypoint (#1263)
junrushao Nov 15, 2023
64e3410
[Rust] Improve ergonomics of `generate` function in `ChatModule` (#1…
YuchenJin Nov 15, 2023
2c00373
[Fix] mistral `max_gen_len` (#1264)
davidpissarra Nov 15, 2023
ceb27d5
Rename `max-sequence-length` to `context-window-size` (#1265)
junrushao Nov 15, 2023
17aa5bf
Auto updated submodule references
Nov 16, 2023
fde2e85
Fix group quantization shape infer (#1273)
cyx-6 Nov 16, 2023
4a137d3
Continuous Model Delivery (#1272)
junrushao Nov 16, 2023
2600b9a
Auto updated submodule references
Nov 17, 2023
31910dd
Enhance Model Delivery (#1283)
junrushao Nov 17, 2023
fb7a224
add python, rest api test (#1278)
Kartik14 Nov 18, 2023
d3b7aad
Enable Jenkins CI (#1292)
Hzfengsy Nov 19, 2023
5fac856
Update android.rst (#1289)
a7k3 Nov 19, 2023
49f75d2
Consolidate Logics for GPU Detection (#1297)
junrushao Nov 20, 2023
01daa64
[CI] Fix lint concurrent clone issue (#1299)
MasterJH5574 Nov 20, 2023
418b9a9
Auto updated submodule references
Nov 20, 2023
b4ba7ca
[Feature] Prefill chunking for non-SWA models (#1280)
davidpissarra Nov 20, 2023
488f65d
Compatible with chatglm (#979)
qc903113684 Nov 20, 2023
2fd1bf5
Add q4/q8_ft_group quantization mode (#1284)
vinx13 Nov 21, 2023
5d96740
[CI] Clean workspace before build (#1304)
MasterJH5574 Nov 21, 2023
9a04de6
[Python] Detect Driver/Device in a Separate Process (#1311)
junrushao Nov 22, 2023
9641676
add chatglm3 support (#1313)
Jasonsey Nov 22, 2023
95f9abe
[SLIM] Skip None param when loading rather than failing (#1308)
CharlieFRuan Nov 22, 2023
9e28540
Auto updated submodule references
Nov 22, 2023
53f2747
[nn.Module] Implement GPT-2 Model Support (#1314)
rickzx Nov 23, 2023
b561810
remove ndk referencce from mali build target (#1312)
shanbady Nov 23, 2023
13759fd
[Rust] A few enhancements (#1310)
YuchenJin Nov 23, 2023
48df439
[iOS] Mistral support (#1320)
davidpissarra Nov 23, 2023
da07940
Add terminator for streaming REST API (#1325)
Sing-Li Nov 23, 2023
992ed42
read CUDA_ARCH_LIST to set CUDA capability versions for nvcc (#1326)
technillogue Nov 24, 2023
3358029
Update emcc.rst
tqchen Nov 24, 2023
fa40ec1
[AUTO-DEVICE] In process early exit device detection (#1333)
tqchen Nov 27, 2023
e7d2ce6
[RestAPI] Update parameters for /v1/completions and add tests (#1335)
anibohara2000 Nov 28, 2023
5dc809e
fix broken REST examples due to recent compatibility change (#1345)
Sing-Li Nov 29, 2023
1f8c2d0
[Bugfix] Ignore exit code in device detection (#1350)
junrushao Nov 29, 2023
02a41e1
[OpenHermes] Add conversation template for OpenHermes Mistral (#1354)
CharlieFRuan Nov 30, 2023
5315d18
[Tokenizer] Prioritize huggingface tokenizer.json, generate one if no…
CharlieFRuan Nov 30, 2023
76c2807
[Rust] Prepare for publishing (#1342)
YuchenJin Nov 30, 2023
2ab39c9
Fix gen_mlc_chat_config for mistral (#1353)
jinhongyii Nov 30, 2023
a4a06d5
Fix ft quantization scale computation (#1321)
vinx13 Nov 30, 2023
4cefcc9
Merge remote-tracking branch 'mlc-ai/main' into merge-dec01
masahi Dec 1, 2023
69eaa38
fix
masahi Dec 1, 2023
17c3678
fix
masahi Dec 1, 2023
Update compile_models.rst (mlc-ai#1038)
fix permission issue
yongjer authored Oct 9, 2023
commit c02fdafc917d8bdc941ecb18be7e943cef22d89b
2 changes: 1 addition & 1 deletion docs/compilation/compile_models.rst
@@ -30,7 +30,7 @@ The easiest way is to use MLC-LLM is to clone the repository, and compile models
.. code:: bash

# clone the repository
- git clone git@github.com:mlc-ai/mlc-llm.git --recursive
+ git clone https://github.com/mlc-ai/mlc-llm.git --recursive
# enter to root directory of the repo
cd mlc-llm
# install mlc-llm