Add option to use torch._inductor.standalone_compile #17057
+98
−27
This PR adds the option to use torch._inductor.standalone_compile to perform compilation instead of compile_fx. The goal of standalone_compile is to remove the hacks around vLLM's usage of compile_fx. We want to migrate to it in PyTorch 2.8.
standalone_compile changes how vLLM interacts with the torch.compile caches. Instead of vLLM trying to redirect those caches into its torch_compile_cache folder, vLLM can pass standalone_compile a filepath inside the torch_compile_cache folder, and standalone_compile will write the full precompiled artifact to it.
Right now this option is hidden behind a config flag. It is not tested in vLLM CI, which only tests against PyTorch 2.6, and it needs more testing before we turn it on by default for PyTorch 2.8+. I am putting this PR out so that we can merge something to keep developing on top of.
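To make the gating concrete, here is a minimal sketch of how an opt-in flag could select between the two compile paths and where the artifact path would live. It does not import torch and is not the code in this PR: `CompileOptions`, `use_standalone_compile`, `artifact_path`, `pick_compiler`, and the `artifacts` subfolder are all hypothetical names, and the version check is an assumption based on the 2.8 target mentioned above.

```python
import os
from dataclasses import dataclass


@dataclass
class CompileOptions:
    # Hypothetical flag name; the real config flag is not spelled out here.
    use_standalone_compile: bool = False
    # Folder that holds vLLM's compile cache (name taken from the description).
    cache_root: str = "torch_compile_cache"


def artifact_path(opts: CompileOptions, graph_key: str) -> str:
    """Return a path inside the cache folder for one precompiled artifact.

    The "artifacts" subfolder and the graph_key naming are assumptions.
    """
    return os.path.join(opts.cache_root, "artifacts", graph_key)


def pick_compiler(opts: CompileOptions, torch_version: str) -> str:
    """Choose a compile path: opt-in flag plus an assumed 2.8+ version gate."""
    major, minor = (int(part) for part in torch_version.split(".")[:2])
    if opts.use_standalone_compile and (major, minor) >= (2, 8):
        return "standalone_compile"
    # Default path: keep using compile_fx.
    return "compile_fx"
```

With the flag off, `pick_compiler` always returns `"compile_fx"`, which mirrors the intent that existing behavior stays the default until the new path has had more testing.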
Test Plan: