Hello @LostXine, when running LLaRA on multiple GPUs, I encountered the following error:
```
Exception has occurred: RuntimeError
CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

  File "/home/lsf_storage/homes/ch/claude/LLaRA/train-llava/llava/model/language_model/llava_llama.py", line 92, in forward
    return super().forward(
  File "/home/ch/claude/LLaRA/eval/llara_adv_attack.py", line 235, in model_generation
    outputs = model(**inputs)
  File "/home/ch/claude/LLaRA/eval/llara_adv_attack.py", line 526, in query_bc
    ans, _ , i = model_generation(tokenizer, model, image_processor, image_list, prepared_prompt)
  File "/home/ch/claude/LLaRA/eval/llara_adv_attack.py", line 436, in eval_episode
    paresed_action, prepared_prompt, ans, image = gen_action(tokenizer, model, image_processor,
  File "/home/ch/claude/LLaRA/eval/llara_adv_attack.py", line 567, in <module>
    eval_episode(args, query_bc, parse_bc)

RuntimeError: CUDA error: device-side assert triggered
```
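Following the error message's own suggestion, rerunning with synchronous kernel launches should make the stack trace point at the operation that actually fails (command-line arguments elided here):

```bash
# CUDA_LAUNCH_BLOCKING=1 forces each CUDA kernel to launch synchronously,
# so the reported Python stack trace reflects the real failing op.
CUDA_LAUNCH_BLOCKING=1 python eval/llara_adv_attack.py <args>
```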
However, everything works fine when I run LLaRA on a single GPU. Are there any specific configurations required for multi-GPU usage?
I have not tried running LLaRA on multiple GPUs for inference. The error log hints at a compatibility or hardware-configuration issue, but I'm not 100% sure. Could you confirm that you are using the same versions of the key packages (e.g., torch, CUDA, ...) as LLaVA? I would like to test multi-GPU inference as well, but unfortunately I'm traveling right now. I will try to get back to you before next weekend.
Thank you for your understanding.
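In the meantime, a quick generic way to dump the relevant versions for comparison (nothing LLaRA-specific, just standard torch and driver checks):

```bash
# Report the PyTorch build and the CUDA toolkit it was compiled against.
python -c "import torch; print('torch', torch.__version__, '| cuda', torch.version.cuda)"
# The banner of nvidia-smi shows the driver and runtime CUDA versions.
nvidia-smi | head -n 4
```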
I apologize for the delayed response. I can confirm that LLaRA has been successfully tested on multi-GPU configurations over the past several months, including:
2x RTX 3060
2x RTX 3090
2x A5000
Therefore, could you pull the latest code, perform a fresh install of the Python environment, and check again? If that does not work, could you share more details about the server and Python environment you are using?
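To be concrete, by a fresh install I mean something along these lines (the environment name is arbitrary, and the final step is whatever the README specifies):

```bash
git pull                                    # get the latest code
conda create -n llara-fresh python=3.10 -y  # fresh Python environment
conda activate llara-fresh
pip install -e .                            # or the exact steps from the README
```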