- ✨ I'm a third-year joint PhD student at USTC and Shanghai AI Lab. I am passionate about effective and efficient inference for AIGC.
- 🔭 I’m currently working on speculative decoding, a promising technique for accelerating LLM inference.
- 🌱 I’m currently learning some techniques for better inference performance.
- 👯 I’m looking to collaborate on related topics, mainly inference techniques.
Joint PhD student at USTC & Shanghai AI Lab
- Shanghai, China
- UTC +08:00
- https://smart-lty.github.io/
- https://www.zhihu.com/people/xiao-jian-bu-wen-sheng-jian-qiao-65
Pinned
- ParallelSpeculativeDecoding (Public): The official code for the paper "Parallel Speculative Decoding with Adaptive Draft Length".
Python 21