torch.compile error
Hi, I have this problem when running a torch.compile code snippet:
Traceback (most recent call last):
File "/data/gemma_torch/torch_compile.py", line 40, in
outputs = model.generate(**model_inputs, past_key_values=past_key_values, do_sample=True, temperature=1.0, max_new_tokens=128)
...
torch._dynamo.exc.Unsupported: reconstruct: UserDefinedObjectVariable(HybridCache)
from user code:
File "/home/.local/lib/python3.10/site-packages/transformers/models/gemma2/modeling_gemma2.py", line 1111, in forward
return CausalLMOutputWithPast(
Versions:
torch: 2.3.1
transformers: 4.42.4
python: 3.10
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0
GPU: NVIDIA L40S
Has anyone encountered this problem? What can I try to get it running?
Hi @LD-inform, it seems there is an issue with the installed torch version 2.3.1. Could you please try again after upgrading the torch library to the latest version, 2.5.1, using !pip install -U torch, and let us know if the issue still persists.
Thank you @Renu11. Upgrading helped, but I also had to add this parameter: torch.compile(..., backend="eager"). Without it, I got this error:
/.local/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 1465, in _call_user_compiler
raise BackendCompilerFailed(self.compiler_fn, e) from e
torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
RuntimeError: Should never be installed
However, the speed increased by only 5 t/s (from 44 t/s to 49 t/s) on the 2B model, and on the 9B model it even decreased from 26 t/s to 20 t/s.
I tried it with a different prompt as well, but I doubt that is the problem:
input_text = "user\nExplain in details difference between integrals and derivatives.\nmodel"
Is there anything else I can try?
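A note on why the speedup is so small: backend="eager" only exercises Dynamo's graph capture and then runs the captured graph with the ordinary eager kernels, so it sidesteps Inductor bugs like the one above at the cost of most of the speedup, which comes from Inductor's generated kernels. A minimal sketch of the flag in isolation (the Toy module and shapes are made up, not the Gemma code):

```python
import torch
import torch.nn as nn

class Toy(nn.Module):
    """Made-up stand-in for the model; only the compile call matters here."""
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(8, 8)

    def forward(self, x):
        return torch.relu(self.linear(x))

model = Toy()
# backend="eager" skips Inductor code generation entirely: Dynamo still
# traces and captures the graph, but the captured graph is executed with
# the stock eager kernels, so little or no speedup should be expected.
compiled = torch.compile(model, backend="eager")

x = torch.randn(2, 8)
# Same weights, same eager kernels: outputs should match exactly.
print(torch.equal(model(x), compiled(x)))
```

With this backend the compiled module is mainly useful for checking that Dynamo can trace the code at all; to regain performance you would switch back to the default Inductor backend once the underlying bug is fixed by the library upgrade.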
Apologies for the delayed response. I tried replicating the issue and found that it has been resolved in the latest versions of the transformers and torch libraries. Could you please confirm whether you're still encountering this problem after updating your libraries? Thank you.