Commit

Update README.md
qwopqwop200 authored Apr 1, 2023
1 parent ee1a16a commit 82229f3
Showing 1 changed file with 1 addition and 5 deletions.
6 changes: 1 addition & 5 deletions README.md
@@ -10,11 +10,7 @@ Changed to support new features proposed by [GPTQ](https://github.com/IST-DASLab
* Slightly adjusted preprocessing of C4 and PTB for more realistic evaluations (used in our updated results); can be activated via the flag --new-eval.
* two new tricks:--act-order (quantizing columns in order of decreasing activation size) and --true-sequential (performing sequential quantization even within a single Transformer block). Those fix GPTQ's strangely bad performance on the 7B model (from 7.15 to 6.09 Wiki2 PPL) and lead to slight improvements on most models/settings in general.

-I changed the code to use [triton](https://github.com/openai/triton) now. It works without triton, but using triton is recommended.
-
-Due to triton's limitation, 3bit could not be implemented.
-
-Triton only supports Linux, so if you are a Windows user, please use [WSL2](https://learn.microsoft.com/en-us/windows/wsl/install).
+**Unless you are using 3bit, i recommend using a [branch](https://github.com/qwopqwop200/GPTQ-for-LLaMa/tree/triton) that currently supports [triton](https://github.com/openai/triton).**

## Result
<details>