
[Feature]: Support for logprobs sampling parameter in TT backend #37

Open
@milank94

Description

🚀 The feature, motivation and pitch

I'm working on evaluating Llama3.1-70B on the MMLU and MMLU-Pro datasets from the Language Model Evaluation Harness, to compare what Tenstorrent achieves against the benchmarks reported by Meta: https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct#instruction-tuned-models.

This evaluation relies on the logprobs output of the model (see https://cookbook.openai.com/examples/using_logprobs). However, the TT backend currently does not support this sampling parameter (see https://github.com/tenstorrent/vllm/blob/dev/vllm/worker/tt_model_runner.py#L430), as observed when trying to run the evaluation harness:

ERROR 11-23 14:35:16 engine.py:159] AssertionError('Currently not supporting logprobs')
ERROR 11-23 14:35:16 engine.py:159] Traceback (most recent call last):
ERROR 11-23 14:35:16 engine.py:159]   File "/home/mkordic/vllm_test/vllm/vllm/engine/multiprocessing/engine.py", line 157, in start
ERROR 11-23 14:35:16 engine.py:159]     self.run_engine_loop()
ERROR 11-23 14:35:16 engine.py:159]   File "/home/mkordic/vllm_test/vllm/vllm/engine/multiprocessing/engine.py", line 220, in run_engine_loop
ERROR 11-23 14:35:16 engine.py:159]     request_outputs = self.engine_step()
ERROR 11-23 14:35:16 engine.py:159]   File "/home/mkordic/vllm_test/vllm/vllm/engine/multiprocessing/engine.py", line 238, in engine_step
ERROR 11-23 14:35:16 engine.py:159]     raise e
ERROR 11-23 14:35:16 engine.py:159]   File "/home/mkordic/vllm_test/vllm/vllm/engine/multiprocessing/engine.py", line 229, in engine_step
ERROR 11-23 14:35:16 engine.py:159]     return self.engine.step()
ERROR 11-23 14:35:16 engine.py:159]   File "/home/mkordic/vllm_test/vllm/vllm/engine/llm_engine.py", line 1402, in step
ERROR 11-23 14:35:16 engine.py:159]     outputs = self.model_executor.execute_model(
ERROR 11-23 14:35:16 engine.py:159]   File "/home/mkordic/vllm_test/vllm/vllm/executor/tt_executor.py", line 55, in execute_model
ERROR 11-23 14:35:16 engine.py:159]     output = self.driver_worker.execute_model(execute_model_req)
ERROR 11-23 14:35:16 engine.py:159]   File "/home/mkordic/vllm_test/vllm/vllm/worker/tt_worker.py", line 333, in execute_model
ERROR 11-23 14:35:16 engine.py:159]     inputs = self.prepare_input(execute_model_req)
ERROR 11-23 14:35:16 engine.py:159]   File "/home/mkordic/vllm_test/vllm/vllm/worker/worker_base.py", line 291, in prepare_input
ERROR 11-23 14:35:16 engine.py:159]     return self._get_driver_input_and_broadcast(execute_model_req)
ERROR 11-23 14:35:16 engine.py:159]   File "/home/mkordic/vllm_test/vllm/vllm/worker/worker_base.py", line 253, in _get_driver_input_and_broadcast
ERROR 11-23 14:35:16 engine.py:159]     self.model_runner.prepare_model_input(
ERROR 11-23 14:35:16 engine.py:159]   File "/home/mkordic/vllm_test/vllm/vllm/worker/tt_model_runner.py", line 192, in prepare_model_input
ERROR 11-23 14:35:16 engine.py:159]     self._validate_sampling_params(sampling_params)
ERROR 11-23 14:35:16 engine.py:159]   File "/home/mkordic/vllm_test/vllm/vllm/worker/tt_model_runner.py", line 430, in _validate_sampling_params
ERROR 11-23 14:35:16 engine.py:159]     assert sampling_params.logprobs is None, "Currently not supporting logprobs"
ERROR 11-23 14:35:16 engine.py:159] AssertionError: Currently not supporting logprobs
INFO:     127.0.0.1:48296 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error
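
For reference, the assertion can be triggered by any /v1/completions request that sets logprobs. Below is a minimal sketch (not from the harness; it assumes a vLLM OpenAI-compatible server running locally on port 8000, as in the run command further down):

# Minimal sketch: a /v1/completions request that sets the logprobs sampling
# parameter, which the TT backend currently rejects in _validate_sampling_params.
# Assumes a vLLM OpenAI-compatible server is running on http://127.0.0.1:8000.
import requests

resp = requests.post(
    "http://127.0.0.1:8000/v1/completions",
    json={
        "model": "meta-llama/Meta-Llama-3.1-70B",
        "prompt": "Question: What is 2 + 2?\nAnswer:",
        "max_tokens": 1,
        "logprobs": 1,  # request per-token log-probabilities
    },
)
print(resp.status_code)  # 500 Internal Server Error with the current TT backend
print(resp.text)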

Steps to reproduce are located here: https://github.com/tenstorrent/tt-inference-server/tree/main/evals

Example of a run command:

lm_eval \
--model local-completions \
--model_args model=meta-llama/Meta-Llama-3.1-70B,base_url=http://127.0.0.1:8000/v1/completions,num_concurrent=32,max_retries=4,tokenized_requests=False,add_bos_token=True \
--gen_kwargs model=meta-llama/Meta-Llama-3.1-70B,stream=False \
--tasks mmlu \
--batch_size auto \
--output_path /home/mkordic/lm-evaluation-harness/eval_output  \
--seed 42  \
--log_samples
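
For context on why the harness needs logprobs: MMLU-style multiple-choice tasks are scored by the log-likelihood the model assigns to each answer choice, which the harness derives from the per-token logprobs returned by the completions endpoint. The following is only an illustrative sketch of that scoring, not the harness's actual code; the endpoint, the echo/max_tokens=0 usage, and the crude token alignment at the end are assumptions:

# Illustrative sketch of multiple-choice scoring via logprobs (not lm-eval's code).
# Assumes an OpenAI-compatible /v1/completions endpoint that supports
# echo=True with max_tokens=0 and logprobs, as the upstream vLLM server does.
import requests

def choice_loglikelihood(context: str, choice: str) -> float:
    resp = requests.post(
        "http://127.0.0.1:8000/v1/completions",
        json={
            "model": "meta-llama/Meta-Llama-3.1-70B",
            "prompt": context + choice,
            "max_tokens": 0,  # do not generate, only score the prompt
            "echo": True,     # return logprobs for the prompt tokens
            "logprobs": 1,
        },
    ).json()
    token_logprobs = resp["choices"][0]["logprobs"]["token_logprobs"]
    # Crude alignment for illustration: take the trailing tokens covering the
    # choice text. The harness does this per-token alignment properly.
    tail = token_logprobs[-max(1, len(choice.split())):]
    return sum(lp for lp in tail if lp is not None)

# Pick the answer choice with the highest log-likelihood.
choices = [" A", " B", " C", " D"]
best = max(choices, key=lambda c: choice_loglikelihood("Question: ...\nAnswer:", c))
print(best)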

Alternatives

No response

Additional context

No response

