Issues: ggerganov/llama.cpp
imatrix : use GGUF to store importance matrices
breaking change
Changes that break ABIs, APIs, file formats, or other forms of backwards compatibility.
enhancement
New feature or request
examples
python
python script changes
refactoring
Refactoring
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
Add support for Phi-3.5-vision-instruct
examples
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
llama : initial Mamba-2 support
ggml
changes relating to the ggml tensor library for machine learning
python
python script changes
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
testing
Everything test related
#9126
opened Aug 21, 2024 by
compilade
8 of 9 tasks
llama : add ability to load model from memory buffer
ggml
changes relating to the ggml tensor library for machine learning
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
Quantize: specify each major tensor quant in CLI for common LLMs
demo
Demonstrate some concept or idea, not intended to be merged
examples
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
lookup: Use tree of sequences instead of single sequence
examples
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
#8648
opened Jul 23, 2024 by
JohannesGaessler
llama : tokenizer unicode codepoint categories
python
python script changes
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
script
Script related
testing
Everything test related
#8606
opened Jul 20, 2024 by
jaime-m-p
2 of 4 tasks
ggml: avoid rebuild of GGML graph for each token (#7456)
ggml
changes relating to the ggml tensor library for machine learning
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
server : avoid breaking KV cache when prompt >= n_ctx (#6855)
examples
python
python script changes
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
server
#8359
opened Jul 8, 2024 by
prfd
2 of 4 tasks
build example/main.cpp as shared library and intercept token printing using FFI
demo
Demonstrate some concept or idea, not intended to be merged
examples
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
#8339
opened Jul 6, 2024 by
mtasic85
json: $ref + object overhaul (https & recursive $refs, mix properties & allOf)
breaking change
#8199
opened Jun 28, 2024 by
ochafik
json: unified properties order across optional & required
examples
python
rpc : copy tensors across servers
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
Save partial imatrix data
examples
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
#7910
opened Jun 12, 2024 by
CISC
WIP: Use DirectStorage with CUDA interop to more efficient load tensors
build
Compilation issues
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
#7796
opened Jun 6, 2024 by
mtavenrath
Draft
feat: add changes to handle jina v2 chinese code
python
python script changes
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
#7795
opened Jun 6, 2024 by
JoanFM
Add Intel Advanced Matrix Extensions (AMX) support to ggml
build
Compilation issues
ggml
changes relating to the ggml tensor library for machine learning
performance
Speed related topics
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
#7707
opened Jun 3, 2024 by
mingfeima
batched : make n_threads and n_threads_batch configurable in batched & batched-bench
examples
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
#7581
opened May 28, 2024 by
msy-kato
ggml-threading.cpp
build
Compilation issues
ggml
changes relating to the ggml tensor library for machine learning
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
Rebalancing Metal threads workload in dot product kernel kernel_mul_mv_f16_f32_l4
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
#7522
opened May 24, 2024 by
izard
Introduce ggml_syncthreads()
performance
Speed related topics
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
#7455
opened May 22, 2024 by
jart
Check for llama_get_logits_ith() errors
android
Issues specific to Android
examples
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
server
#7448
opened May 21, 2024 by
jart
Direct I/O and Transparent HugePages
demo
Demonstrate some concept or idea, not intended to be merged
examples
python
python script changes
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
script
Script related
server
#7420
opened May 20, 2024 by
pavelfatin
Add minimal python client example for the server, streaming callback
examples
python
python script changes
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
server
#7373
opened May 18, 2024 by
chrismrutherford
sched : support async weight copy
performance
Speed related topics
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level