Commit

Dev/1.18 (#8)
* Update to 1.18.0.

* Add MACs.

* Power should be measured on the host, not inside the container.

---------

Co-authored-by: joonhyung.lee <joonhyung.lee@navercorp.com>
veritas9872 and joonhyung.lee authored Oct 30, 2024
1 parent 8645f55 commit a825a28
Showing 3 changed files with 8 additions and 6 deletions.
4 changes: 2 additions & 2 deletions docker-compose.yaml
@@ -106,6 +106,6 @@ services:
  dockerfile: Dockerfile
  args:
  OS: ${OS:-ubuntu22.04}
- PYTORCH_VERSION: ${PYTORCH_VERSION:-2.3.1}
- SYNAPSE_VERSION: ${SYNAPSE_VERSION:-1.17.0}
+ PYTORCH_VERSION: ${PYTORCH_VERSION:-2.4.0}
+ SYNAPSE_VERSION: ${SYNAPSE_VERSION:-1.18.0}
  IMAGE_TAG: ${IMAGE_TAG:-latest}
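
The `${NAME:-default}` form used for these build arguments falls back to the default when the variable is unset or empty, so the bumped versions apply unless the environment overrides them. A minimal Python sketch of that resolution rule follows. The helper `resolve_build_arg` and the `overrides` mapping are hypothetical and only mirror the substitution semantics; they are not part of Docker Compose or of this repository.

```python
from typing import Mapping


def resolve_build_arg(name: str, default: str, env: Mapping[str, str]) -> str:
    """Mimic `${NAME:-default}`: use the default if NAME is unset or empty."""
    value = env.get(name)
    return value if value else default


# Defaults taken from the updated docker-compose.yaml.
defaults = {"PYTORCH_VERSION": "2.4.0", "SYNAPSE_VERSION": "1.18.0"}

# Hypothetical environment that pins only one of the two variables.
overrides = {"SYNAPSE_VERSION": "1.17.0"}

resolved = {
    name: resolve_build_arg(name, default, overrides)
    for name, default in defaults.items()
}
print(resolved)
```

Only the pinned variable changes; the other keeps the default written in the compose file.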
2 changes: 1 addition & 1 deletion environment.yaml
@@ -16,4 +16,4 @@ dependencies: # Use conda packages if possible.
  - pip # For `pip` dependencies that are not available in conda.
  - pip:
  # Modify the version tag as necessary.
- - git+https://github.com/huggingface/optimum-habana@v1.13.2
+ - git+https://github.com/huggingface/optimum-habana@v1.14.1
8 changes: 5 additions & 3 deletions prefill/llama.py
@@ -10,7 +10,8 @@
  --num_steps 32
  ```
- For power measurements, use one of the following commands.
+ For power measurements, use one of the following commands on the host.
+ The container will likely not have `ipmitool` available.
  ```bash
  sudo ipmitool dcmi power reading 5_sec
@@ -83,7 +84,7 @@ def measure(
  ) -> dict:
      config = AutoConfig.from_pretrained(model_name, torch_dtype=torch.bfloat16)

-     flops = approx_llama_forward_macs(
+     macs = approx_llama_forward_macs(
          num_decoder_blocks=config.num_hidden_layers,
          sequence_length=seq_len,
          vocabulary_size=config.vocab_size,
@@ -94,7 +95,7 @@ def measure(
          gated_ffn_act=True,
      )

-     flops *= 2 * batch_size  # 1 MAC is approximately 2 FLOPs.
+     flops = macs * 2 * batch_size  # 1 MAC is approximately 2 FLOPs.
      device = torch.device("hpu")  # HPUs do not have numbers, unlike NVIDIA GPUs.
      x = torch.zeros(size=(batch_size, seq_len), dtype=torch.int64, device=device)

@@ -136,6 +137,7 @@ def measure(
          "Model Min TFLOPS": min(tfps),
          "Model Max TFLOPS": max(tfps),
          "Model STDEV TFLOPS": stdev(tfps),
+         "Forward MAC Count": macs,
      }
      return info
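
The rename in this file separates two quantities that were previously stored in one variable: the per-sequence forward MAC count and the per-batch FLOP count derived from it. The sketch below illustrates that relationship under stated assumptions. The MAC count and step times are made-up numbers, the helper names are hypothetical, and the per-step TFLOPS formula (`flops / seconds / 1e12`) is an assumption, since the lines that compute `tfps` are not shown in this diff.

```python
from statistics import stdev


def macs_to_flops(macs: int, batch_size: int) -> int:
    """Convert a per-sequence MAC count to per-batch FLOPs (1 MAC ~ 2 FLOPs)."""
    return macs * 2 * batch_size


def tflops_per_step(flops: int, step_seconds: list[float]) -> list[float]:
    """Assumed throughput formula: FLOPs per step divided by step time, in TFLOPS."""
    return [flops / seconds / 1e12 for seconds in step_seconds]


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
macs = 1_000_000_000_000  # forward MACs for one sequence
flops = macs_to_flops(macs, batch_size=4)
tfps = tflops_per_step(flops, step_seconds=[0.5, 0.25, 1.0])

info = {
    "Model Min TFLOPS": min(tfps),
    "Model Max TFLOPS": max(tfps),
    "Model STDEV TFLOPS": stdev(tfps),
    "Forward MAC Count": macs,  # reported without the batch or 2x factor
}
print(info)
```

Keeping `macs` as its own variable means the reported "Forward MAC Count" stays independent of the batch size, while `flops` carries both the batch factor and the MAC-to-FLOP factor.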
