Skip to content

Add Ideogram4LoraLoaderMixin (LoRA loading for Ideogram4)#13921

Open
linoytsaban wants to merge 12 commits into
mainfrom
ideogram4-lora-loader
Open

Add Ideogram4LoraLoaderMixin (LoRA loading for Ideogram4)#13921
linoytsaban wants to merge 12 commits into
mainfrom
ideogram4-lora-loader

Conversation

@linoytsaban

Copy link
Copy Markdown
Collaborator

Adds Ideogram4LoraLoaderMixin and wires it into Ideogram4Pipeline, so Ideogram 4 pipelines can load_lora_weights / save_lora_weights / fuse like the other models.

This is the loading foundation split out of the Ideogram4 work so it can be reviewed/merged on its own. Two follow-ups are stacked on top of this branch:

Both depend on this mixin; this PR is independent of the training script's readiness.

@github-actions github-actions Bot added documentation Improvements or additions to documentation lora pipelines loaders size/L PR with diff > 200 LOC labels Jun 11, 2026
@HuggingFaceDocBuilderDev

Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@linoytsaban linoytsaban requested a review from sayakpaul June 11, 2026 13:51

@sayakpaul sayakpaul left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think the tests are missing.

support loading non-diffusers Ideogram4 LoRAs
@github-actions

Copy link
Copy Markdown
Contributor

Hi @linoytsaban, thanks for the PR! It does not appear to link an issue it fixes. If this PR addresses an existing issue, please add a closing keyword (e.g. Fixes #1234) to the PR description so the issue is linked. See the contribution guide for more details. If this PR intentionally does not fix a tracked issue, a maintainer can add the no-issue-needed label to silence this reminder.

@github-actions github-actions Bot added the tests label Jun 13, 2026
@linoytsaban linoytsaban requested a review from sayakpaul June 14, 2026 20:43
Comment on lines +193 to +195
@unittest.skip("Ideogram4 does not support call-time LoRA scaling via attention_kwargs.")
def test_set_adapters_match_attention_kwargs(self):
pass

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this case?

@sayakpaul sayakpaul left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one comment on why the pipeline doesn't support attention_kwargs? How are the users supposed to control the LoRA strength? There also seems to be CI failures.

linoytsaban and others added 5 commits June 15, 2026 15:59
- pipeline: run the text encoder on its parameters' current device, then
  move features to the execution device, so encode_prompt works under
  enable_model_cpu_offload. The pipeline calls the text encoder's submodules
  directly to tap intermediate layers, which bypasses accelerate's onload
  hook, so the weights stay on CPU while inputs are on the execution device.
  Fixes test_lora_loading_model_cpu_offload.
- tests: override test_lora_fuse_nan to corrupt a weight under Ideogram4's
  `layers` tower (the base test probes transformer_blocks/blocks/etc.).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@linoytsaban

Copy link
Copy Markdown
Collaborator Author

@sayakpaul pushed fixes for the two CI failures you flagged — would appreciate your opinion on the pipeline change in particular.

test_lora_loading_model_cpu_offload needed a small pipeline change, not just a test tweak. Ideogram4Pipeline.encode_prompt builds the text conditioning by calling the Qwen3-VL encoder's submodules directly (embed_tokens, each decoder layer, rotary_emb) in order to tap several intermediate layers' hidden states and concatenate them into llm_features. Since it never calls text_encoder(...) (the module's forward), accelerate's enable_model_cpu_offload onload hook — which is attached to that forward — never fires, so the encoder's weights stay on CPU while the inputs are on the execution device → Expected all tensors to be on the same device in the token embedding.

Fix: run the text encoder on the device its parameters currently live on, then move the features to the pipeline's execution device. This is a no-op on the normal (non-offload) path (te_device == device) and only changes the offloaded case. Open to a different approach if you'd prefer (e.g. manually triggering the onload hook), but this felt the least invasive.

test_lora_fuse_nan was test-only: the base test finds a weight to corrupt by probing transformer_blocks/blocks/joint_transformer_blocks/single_transformer_blocks, but Ideogram4's transformer tower is named layers. Overrode it to use layers[0].attention.to_q, matching the Flux.2 / Lumina2 / Z-Image LoRA tests.

Comment on lines +371 to +378
# Run the encoder on the device its parameters currently live on, then move the features to the
# pipeline device. encode_prompt calls the text encoder's submodules directly, so under
# enable_model_cpu_offload the onload hook never fires and the weights stay on CPU; honoring their
# actual device avoids a device mismatch on the token embedding.
te_device = self.text_encoder.device
token_ids = token_ids.to(te_device)
attention_mask = attention_mask.to(te_device)
text_position_ids = text_position_ids.to(te_device)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing it this way, the te_device would be the accelerator?

)
text_features = torch.stack(selected, dim=0).permute(1, 2, 3, 0).reshape(batch_size, max_sequence_length, -1)
text_features = (text_features * attention_mask.to(text_features.dtype).unsqueeze(-1)).to(torch.float32)
text_features = text_features.to(device)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why cast it back?

@sayakpaul sayakpaul left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a single comment.

@sayakpaul sayakpaul requested a review from dg845 June 17, 2026 02:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation loaders lora models pipelines size/L PR with diff > 200 LOC tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants