torch.compile with the Inductor backend fails on functions that contain in-place operations (exp_(), mul_(), scatter_add_()) when they are traced for autograd in CPU-only test environments. Error: 'BackendCompilerFailed: one of the variables needed for gradient computation has been modified by an inplace operation'. The apparent cause is that AOTAutograd tracing, which the Inductor backend uses to capture the backward graph, conflicts with autograd's version tracking on tensors mutated in place. The same code runs correctly in eager mode and on GPU, where Inductor compiles the full function to Triton kernels.
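To test @torch.compile'd functions that use in-place ops in CPU-only environments, replace the compiled function on its module with an eager version before the code under test imports or calls it. Names already bound with `from my_module import ...` keep pointing at the compiled wrapper, so apply the patch first. Use the unwrapped function from `__wrapped__` if the attribute is available; otherwise wrap the function with torch.compiler.disable():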
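
    import torch
    import my_module

    # Prefer the original function exposed via __wrapped__; otherwise fall back
    # to torch.compiler.disable(), which tells TorchDynamo to skip compiling it.
    my_module._compiled_fn = (
        my_module._compiled_fn.__wrapped__
        if hasattr(my_module._compiled_fn, '__wrapped__')
        else torch.compiler.disable(my_module._compiled_fn)
    )

The math is identical in eager mode: @torch.compile is needed only for GPU performance (Triton kernel fusion), not for correctness.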