[WIP] Update diffusers-cli for agentic use #13966
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
- Multi-stage workflows where you need intermediate tensor manipulation between pipelines → write Python.
- Training or fine-tuning → CLI only covers inference.
- Anything requiring custom `device_map`, `quantization_config`, or other low-level loader knobs not exposed by
Feels like quantization could be exposed to the CLI. Right now, one can only do that when using a prequantized checkpoint?
Quantization has a fairly large API surface that might be better suited to a dedicated quantization script? E.g. the BnB quant config options have no overlap with TorchAO's, which in turn have no overlap with ModelOpt's, and so on. TorchAO also supports an AOBaseConfig input, which in turn has its own input args.
We could explore trying to provide the option via a more restricted API though.
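To make the "more restricted API" idea concrete, here is a minimal sketch of one possible shape: a single flag with a small, fixed set of presets, each mapped to a backend name and a few keyword arguments. Everything in it is an assumption for illustration: the `--quantize` flag, the preset names, and the `resolve_quantization` helper are hypothetical and are not part of this PR. The keyword arguments are only meant to resemble the kind of options each backend takes.

```python
import argparse

# Hypothetical presets (not part of the PR): each name maps to a backend
# label plus the small set of keyword arguments that preset would pass on.
QUANT_PRESETS = {
    "bnb-4bit": ("bitsandbytes", {"load_in_4bit": True, "bnb_4bit_quant_type": "nf4"}),
    "bnb-8bit": ("bitsandbytes", {"load_in_8bit": True}),
    "torchao-int8wo": ("torchao", {"quant_type": "int8wo"}),
}


def build_parser():
    """Build a parser exposing one restricted, hypothetical --quantize flag."""
    parser = argparse.ArgumentParser(prog="quant-preset-sketch")
    parser.add_argument(
        "--quantize",
        choices=sorted(QUANT_PRESETS),
        default=None,
        help="Pick a fixed quantization preset (illustrative only).",
    )
    return parser


def resolve_quantization(args):
    """Translate the chosen preset into a backend name and kwargs, or None."""
    if args.quantize is None:
        return None
    backend, kwargs = QUANT_PRESETS[args.quantize]
    # Copy the kwargs so callers cannot mutate the shared preset table.
    return {"backend": backend, "kwargs": dict(kwargs)}
```

The point of the preset table is that the CLI never has to mirror each backend's full option set; anything outside the presets would still go through a dedicated script.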
No, your reasoning makes sense. It's just that a user could expect it, because quantization is sometimes the only way to run a model locally. We can table it for now.
parser.add_argument("--vae-tiling", action="store_true", help="Enable VAE tiling (lower peak VRAM).")
parser.add_argument("--vae-slicing", action="store_true", help="Enable VAE slicing (lower peak VRAM).")
parser.add_argument(
    "--context-parallel",
How does it interact with --remote?
I'm not sure I follow?
How does --context-parallel interact with --remote? Do we want to stop users from requesting context-parallel inference in case HF Jobs doesn't support it? Or do we want to just delegate to HF Jobs and propagate any errors?
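The two options in this question can be sketched as a small policy function. This is only an illustration under stated assumptions: both flags are treated as plain booleans, and the `policy` parameter and the `resolve_remote_context_parallel` helper are hypothetical names, not anything defined in the PR.

```python
def resolve_remote_context_parallel(context_parallel, remote, policy="delegate"):
    """Decide where a run goes when both flags may be set.

    Hypothetical helper: `policy` is "reject" (fail early on the client)
    or "delegate" (forward the request and let the remote side report errors).
    """
    if not (context_parallel and remote):
        # No interaction to resolve: run wherever the user asked.
        return {
            "target": "remote" if remote else "local",
            "context_parallel": context_parallel,
        }
    if policy == "reject":
        # Option 1: refuse the combination before anything is submitted.
        raise ValueError("--context-parallel cannot be combined with --remote")
    if policy == "delegate":
        # Option 2: pass the setting through unchanged; failures surface remotely.
        return {"target": "remote", "context_parallel": True}
    raise ValueError(f"unknown policy: {policy!r}")
```

Rejecting gives the user an immediate, local error message; delegating keeps the CLI thin but means any failure only shows up after the job has been submitted.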
Hi @DN6, thanks for the PR! It does not appear to link an issue it fixes. If this PR addresses an existing issue, please add a closing keyword (e.g.

What does this PR do?
Some updates to the `diffusers-cli` to make it more agent friendly. This PR includes:
- A `diffusers-cli` skill to showcase the features available via the CLI and how to use them
- A `describe` command that can be used to extract the inputs of a pipeline from an input repo id
- A `generate` command that runs inference with any diffusers-compatible pipeline. It also provides a number of optimization options (CP, CPU/group offload) + LoRA and allows running inference remotely on HF Jobs.

Fixes # (issue)
Before submitting
`.ai/` (e.g. via `make claude` / `make codex`)? See Coding with AI agents.
`.ai/review-rules.md`?
documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.