
My graphics card has insufficient memory. Can I use system RAM together with GPU memory to run it? #32

Open
Jackxwb opened this issue May 5, 2024 · 1 comment


Jackxwb commented May 5, 2024

My graphics card has insufficient memory. Can I use system RAM together with GPU memory to run it?
My computer:

  • Windows 10
  • GTX 1063 (GTX 1060, 3 GB)
  • 24 GB DDR4 RAM

Error output when running:

python chatbot.py --path V:\codellama-7b-instruct-pad
D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\gradio_client\documentation.py:103: UserWarning: Could not get documentation group for <class 'gradio.mix.Parallel'>: No known documentation group for module 'gradio.mix'
  warnings.warn(f"Could not get documentation group for {cls}: {exc}")
D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\gradio_client\documentation.py:103: UserWarning: Could not get documentation group for <class 'gradio.mix.Series'>: No known documentation group for module 'gradio.mix'
  warnings.warn(f"Could not get documentation group for {cls}: {exc}")
D:\AI\Llama2-Code-Interpreter\chatbot.py:104: GradioUnusedKwargWarning: You have unused kwarg parameters in Chatbot, please remove them: {'avatar_images': './assets/logo2.png'}
  chatbot = gr.Chatbot(height=820, avatar_images="./assets/logo2.png")
Traceback (most recent call last):
  File "D:\AI\Llama2-Code-Interpreter\chatbot.py", line 238, in <module>
    gradio_launch(model_path=args.path, load_in_4bit=True)
  File "D:\AI\Llama2-Code-Interpreter\chatbot.py", line 108, in gradio_launch
    interpreter = StreamingLlamaCodeInterpreter(
  File "D:\AI\Llama2-Code-Interpreter\code_interpreter\LlamaCodeInterpreter.py", line 79, in __init__
    self.model = LlamaForCausalLM.from_pretrained(
  File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\transformers\modeling_utils.py", line 3119, in from_pretrained
    raise ValueError(
ValueError:
                        Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit
                        the quantized model. If you want to dispatch the model on the CPU or the disk while keeping
                        these modules in 32-bit, you need to set `load_in_8bit_fp32_cpu_offload=True` and pass a custom
                        `device_map` to `from_pretrained`. Check
                        https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu
                        for more details.

The model is Seungyoun/codellama-7b-instruct-pad.

I tried changing LlamaCodeInterpreter.py:79 to the following code, but running it raised: TypeError: LlamaForCausalLM.__init__() got an unexpected keyword argument 'load_in_8bit_fp32_cpu_offload'

self.model = LlamaForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    load_in_4bit=load_in_4bit,
    load_in_8bit=load_in_8bit,
    torch_dtype=torch.float16,
    load_in_8bit_fp32_cpu_offload=True,
)
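The TypeError occurs because `from_pretrained` does not recognize `load_in_8bit_fp32_cpu_offload` and forwards it to the model constructor. In more recent transformers releases the offload switch lives on `BitsAndBytesConfig` instead (as `llm_int8_enable_fp32_cpu_offload`, which despite its name also governs 4-bit loading), and `max_memory` caps per-device usage. A minimal sketch of that approach, assuming such a transformers version and with illustrative (not measured) memory caps:

```python
import torch
from transformers import BitsAndBytesConfig, LlamaForCausalLM

# Assumption: a transformers version whose BitsAndBytesConfig exposes
# llm_int8_enable_fp32_cpu_offload; modules that cannot be quantized
# onto the GPU are then kept on the CPU instead of raising ValueError.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    llm_int8_enable_fp32_cpu_offload=True,
)

# Cap GPU usage and let the remainder spill into system RAM.
# The "2GiB"/"16GiB" figures are illustrative assumptions for a
# 3 GB card with 24 GB of RAM, not measured values.
max_memory = {0: "2GiB", "cpu": "16GiB"}

self.model = LlamaForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    max_memory=max_memory,
    quantization_config=bnb_config,
    torch_dtype=torch.float16,
)
```

Note that layers dispatched to the CPU run unquantized in fp32, so generation will be much slower than an all-GPU setup.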

Complete log of that run:

python chatbot.py --path V:\codellama-7b-instruct-pad
D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\gradio_client\documentation.py:103: UserWarning: Could not get documentation group for <class 'gradio.mix.Parallel'>: No known documentation group for module 'gradio.mix'
  warnings.warn(f"Could not get documentation group for {cls}: {exc}")
D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\gradio_client\documentation.py:103: UserWarning: Could not get documentation group for <class 'gradio.mix.Series'>: No known documentation group for module 'gradio.mix'
  warnings.warn(f"Could not get documentation group for {cls}: {exc}")
D:\AI\Llama2-Code-Interpreter\chatbot.py:104: GradioUnusedKwargWarning: You have unused kwarg parameters in Chatbot, please remove them: {'avatar_images': './assets/logo2.png'}
  chatbot = gr.Chatbot(height=820, avatar_images="./assets/logo2.png")
Traceback (most recent call last):
  File "D:\AI\Llama2-Code-Interpreter\chatbot.py", line 238, in <module>
    gradio_launch(model_path=args.path, load_in_4bit=True)
  File "D:\AI\Llama2-Code-Interpreter\chatbot.py", line 108, in gradio_launch
    interpreter = StreamingLlamaCodeInterpreter(
  File "D:\AI\Llama2-Code-Interpreter\code_interpreter\LlamaCodeInterpreter.py", line 79, in __init__
    self.model = LlamaForCausalLM.from_pretrained(
  File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\transformers\modeling_utils.py", line 2959, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
TypeError: LlamaForCausalLM.__init__() got an unexpected keyword argument 'load_in_8bit_fp32_cpu_offload'
@yarou1025

For the Gradio warnings, try: pip install gradio==4.44.0
