Add `AttentionMixin` to transformers that are missing it by HaozheZhang6 · Pull Request #13941 · huggingface/diffusers

HaozheZhang6 · 2026-06-13T18:24:51Z

What does this PR do?

Several transformers build their attention with AttentionModuleMixin but don't inherit the model-level AttentionMixin, so attn_processors / set_attn_processor / fuse_qkv_projections / unfuse_qkv_projections raise AttributeError. This adds AttentionMixin to all of them, the same way WanVACETransformer3DModel was handled (#12186):

OvisImageTransformer2DModel (the original ovis_image model/pipeline review #13630 finding)
BriaTransformer2DModel, BriaFiboTransformer2DModel
AnyFlowTransformer3DModel, AnyFlowFARTransformer3DModel
ErnieImageTransformer2DModel
LongCatAudioDiTTransformer

The attention modules already use AttentionModuleMixin, so the four APIs work without further changes and resolve to each model's own processor. Each model gets a regression test that attn_processors is populated and set_attn_processor round-trips.

Note: AnyFlowTransformer3DModel, AnyFlowFARTransformer3DModel, and ErnieImageTransformer2DModel have some unrelated pre-existing failures in their model test suites on main (not touched here); the new test_attention_processor_api passes for all seven.

Addresses the AttentionMixin pattern from #13656.

Before submitting

Did you use an AI agent (Claude Code, Codex, Cursor, etc.) to help with this PR? If so:
- Pointed it at .ai/ — read AGENTS.md and review-rules.md.
- Self-reviewed the diff against .ai/review-rules.md.
Did you read the contributor guideline?
Was this discussed via a GitHub issue? Yes — [meta issue] Systematic model/pipeline review findings / tracking #13656 / ovis_image model/pipeline review #13630.
Did you write any new tests?

Who can review?

@hlky — batched per your suggestion, from your model reviews in #13656.

HaozheZhang6 · 2026-06-13T18:42:09Z

While here I checked the other transformers that build their attention with AttentionModuleMixin but don't inherit the model-level AttentionMixin — the same gap is in AnyFlowTransformer3DModel, AnyFlowFARTransformer3DModel, BriaTransformer2DModel, BriaFiboTransformer2DModel, and ErnieImageTransformer2DModel (all raise AttributeError on attn_processors). Happy to fix those in a follow-up batch if this approach looks right.

hlky · 2026-06-13T18:56:52Z

@HaozheZhang6 If you can identify a pattern from all the issues, then it is better to fix those all at once. AttentionMixin is a good case for that as the change is minimal. Other patterns where the change is more involved may be riskier to fix as a whole. There may be others beyond than those you found, I would either point your agent at the meta issue ask it to check all the sub-issues, or do another pass on repo as a whole. For this particular pattern the part I would exercise caution with is the tests, depending on maintainers view (I would ask @DN6 or @sayakpaul about this) it may be better to add the test to ModelTesterMixin rather than individually.

HaozheZhang6 · 2026-06-13T19:07:06Z

Makes sense — I'll consolidate them into this PR. I'll re-scan for the full set (including any beyond the five I listed) and add AttentionMixin + a regression test to each here, then close the separate Bria PR I'd just opened (#13943).

ovis_image, bria, bria_fibo, anyflow, anyflow_far, ernie_image and longcat_audio_dit build attention with AttentionModuleMixin but did not inherit the model-level AttentionMixin, so attn_processors / set_attn_processor / fuse_qkv_projections raised AttributeError (same gap as WanVACETransformer3DModel huggingface#12186). Add the mixin + a regression test for each. Addresses the AttentionMixin pattern from huggingface#13656. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

HaozheZhang6 · 2026-06-13T19:11:18Z

Done — all seven are in now: OvisImage, Bria, BriaFibo, AnyFlow, AnyFlowFAR, ErnieImage, and LongCatAudioDiTTransformer (that last one I'd missed in the first scan). Each gets AttentionMixin + a small regression test.

github-actions · 2026-06-14T08:45:31Z

Hi @HaozheZhang6, thanks for the PR! It does not appear to link an issue it fixes. If this PR addresses an existing issue, please add a closing keyword (e.g. Fixes #1234) to the PR description so the issue is linked. See the contribution guide for more details. If this PR intentionally does not fix a tracked issue, a maintainer can add the no-issue-needed label to silence this reminder.

github-actions Bot added size/S PR with diff < 50 LOC models tests and removed size/S PR with diff < 50 LOC labels Jun 13, 2026

HaozheZhang6 mentioned this pull request Jun 13, 2026

Add AttentionMixin to Bria transformers #13943

Closed

6 tasks

HaozheZhang6 changed the title ~~Add AttentionMixin to OvisImageTransformer2DModel~~ Add AttentionMixin to transformers that are missing it Jun 13, 2026

HaozheZhang6 force-pushed the ovis-image-attention-mixin branch from e11c8cc to f79a2b5 Compare June 13, 2026 19:10

github-actions Bot added the size/M PR with diff < 200 LOC label Jun 13, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `AttentionMixin` to transformers that are missing it#13941

Add `AttentionMixin` to transformers that are missing it#13941
HaozheZhang6 wants to merge 1 commit into
huggingface:mainfrom
HaozheZhang6:ovis-image-attention-mixin

HaozheZhang6 commented Jun 13, 2026 •

edited

Loading

Uh oh!

HaozheZhang6 commented Jun 13, 2026

Uh oh!

hlky commented Jun 13, 2026

Uh oh!

HaozheZhang6 commented Jun 13, 2026

Uh oh!

HaozheZhang6 commented Jun 13, 2026

Uh oh!

github-actions Bot commented Jun 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

HaozheZhang6 commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Before submitting

Who can review?

Uh oh!

HaozheZhang6 commented Jun 13, 2026

Uh oh!

hlky commented Jun 13, 2026

Uh oh!

HaozheZhang6 commented Jun 13, 2026

Uh oh!

HaozheZhang6 commented Jun 13, 2026

Uh oh!

github-actions Bot commented Jun 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

HaozheZhang6 commented Jun 13, 2026 •

edited

Loading