
Issues: ggerganov/llama.cpp

examples : add configuration presets
#10932 opened Dec 21, 2024 by ggerganov
Open 3
changelog : libllama API
#9289 opened Sep 3, 2024 by ggerganov
Open 5
changelog : llama-server REST API
#9291 opened Sep 3, 2024 by ggerganov
Open 12

Issues list

imatrix : use GGUF to store importance matrices
Labels: breaking change, enhancement, examples, python, refactoring, Review Complexity : Medium
#9400 opened Sep 10, 2024 by compilade Draft
3 of 8 tasks
Add support for Phi-3.5-vision-instruct
Labels: examples, Review Complexity : Medium
#9209 opened Aug 27, 2024 by abetlen Draft
llama : initial Mamba-2 support
Labels: ggml, python, Review Complexity : Medium, testing
#9126 opened Aug 21, 2024 by compilade
8 of 9 tasks
llama : add ability to load model from memory buffer
Labels: ggml, Review Complexity : Medium
#9125 opened Aug 21, 2024 by ngxson Draft
2 of 4 tasks
Quantize: specify each major tensor quant in CLI for common LLMs
Labels: demo, examples, Review Complexity : Medium
#8917 opened Aug 7, 2024 by Nexesenex Draft
2 of 4 tasks
lookup: Use tree of sequences instead of single sequence
Labels: examples, Review Complexity : Medium
#8648 opened Jul 23, 2024 by JohannesGaessler
llama : tokenizer unicode codepoint categories
Labels: python, Review Complexity : Medium, script, testing
#8606 opened Jul 20, 2024 by jaime-m-p
2 of 4 tasks
ggml: avoid rebuild of GGML graph for each token (#7456)
Labels: ggml, Review Complexity : Medium
#8366 opened Jul 8, 2024 by agray3 Draft
2 of 4 tasks
server : avoid breaking KV cache when prompt >= n_ctx (#6855)
Labels: examples, python, Review Complexity : Medium, server
#8359 opened Jul 8, 2024 by prfd
2 of 4 tasks
build example/main.cpp as shared library and intercept token printing using FFI
Labels: demo, examples, Review Complexity : Medium
#8339 opened Jul 6, 2024 by mtasic85
json: $ref + object overhaul (https & recursive $refs, mix properties & allOf)
Labels: breaking change, examples, python, Review Complexity : Medium, server, testing
#8199 opened Jun 28, 2024 by ochafik
json: unified properties order across optional & required
Labels: examples, python, Review Complexity : Medium, server, testing
#8133 opened Jun 26, 2024 by ochafik Draft
1 of 4 tasks
rpc : copy tensors across servers
Labels: Review Complexity : Medium
#8032 opened Jun 20, 2024 by rgerganov Draft
2 of 4 tasks
Save partial imatrix data
Labels: examples, Review Complexity : Medium
#7910 opened Jun 12, 2024 by CISC
WIP: Use DirectStorage with CUDA interop to more efficient load tensors
Labels: build, ggml, Nvidia GPU, Review Complexity : Medium
#7796 opened Jun 6, 2024 by mtavenrath Draft
feat: add changes to handle jina v2 chinese code
Labels: python, Review Complexity : Medium
#7795 opened Jun 6, 2024 by JoanFM
Add Intel Advanced Matrix Extensions (AMX) support to ggml
Labels: build, ggml, performance, Review Complexity : Medium
#7707 opened Jun 3, 2024 by mingfeima
batched : make n_threads and n_threads_batch configurable in batched & batched-bench
Labels: examples, Review Complexity : Medium
#7581 opened May 28, 2024 by msy-kato
ggml-threading.cpp
Labels: build, ggml, Review Complexity : Medium
#7576 opened May 27, 2024 by kunnis Draft
Rebalancing Metal threads workload in dot product kernel kernel_mul_mv_f16_f32_l4
Labels: Apple Metal, Review Complexity : Medium
#7522 opened May 24, 2024 by izard
Introduce ggml_syncthreads()
Labels: performance, Review Complexity : Medium
#7455 opened May 22, 2024 by jart
Check for llama_get_logits_ith() errors
Labels: android, examples, Review Complexity : Medium, server
#7448 opened May 21, 2024 by jart
Direct I/O and Transparent HugePages
Labels: demo, examples, python, Review Complexity : Medium, script, server
#7420 opened May 20, 2024 by pavelfatin
Add minimal python client example for the server, streaming callback
Labels: examples, python, Review Complexity : Medium, server
#7373 opened May 18, 2024 by chrismrutherford
sched : support async weight copy
Labels: performance, Review Complexity : Medium
#7315 opened May 15, 2024 by slaren Draft