
fix(scheduler): store sigma_min/max before shifting to prevent double-shift in FlowMatchEulerDiscreteScheduler (#13243) #13962

Open
adhavan18 wants to merge 1 commit into huggingface:main from adhavan18:fix/13243-flowmatch-double-sigma-shift

@adhavan18

What does this PR fix?

Fixes #13243. FlowMatchEulerDiscreteScheduler.set_timesteps was applying the shift formula twice, producing an incorrect (doubly-shifted) sigma schedule.

Root cause

In __init__, sigma_min and sigma_max were stored after the shift formula was applied:

sigmas = timesteps / num_train_timesteps
if not use_dynamic_shifting:
    sigmas = shift * sigmas / (1 + (shift - 1) * sigmas)  # shift applied ✓

self.sigmas = sigmas.to("cpu")
self.sigma_min = self.sigmas[-1].item()  # stored after shift ← BUG
self.sigma_max = self.sigmas[0].item()   # stored after shift ← BUG

Then in set_timesteps (default path, sigmas=None, timesteps=None):

timesteps = np.linspace(
    self._sigma_to_t(self.sigma_max),   # = already-shifted sigma_max * N
    self._sigma_to_t(self.sigma_min),   # = already-shifted sigma_min * N
    num_inference_steps,
)
sigmas = timesteps / self.config.num_train_timesteps  # recovers shifted values

# ...
sigmas = self.shift * sigmas / (1 + (self.shift - 1) * sigmas)  # shift applied AGAIN ← double shift

Because _sigma_to_t(σ) = σ × N (with N = num_train_timesteps) and the division by N is its exact inverse, the sigmas in set_timesteps span the already-shifted sigma_min…sigma_max range. The shift formula then runs a second time on those values.
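To make the arithmetic concrete, here is a minimal sketch in plain Python of what happens to the two bounds. It is not the scheduler code: N = 1000, shift = 3.0, and raw sigmas running from 1 down to 1/N are assumptions chosen for illustration, and shift_sigma is a hypothetical helper that only mirrors the formula quoted above. Since the formula multiplies the odds σ / (1 − σ) by shift, applying it twice is the same as applying it once with shift², which the last line checks numerically.

```python
def shift_sigma(sigma, shift):
    """Static shift formula quoted in the PR: shift * s / (1 + (shift - 1) * s)."""
    return shift * sigma / (1 + (shift - 1) * sigma)


N = 1000      # assumed num_train_timesteps
SHIFT = 3.0   # assumed static shift

# Assumed raw linear bounds: sigmas run from 1 down to 1/N.
raw_min, raw_max = 1 / N, 1.0

# Pre-fix __init__: bounds are read from the already-shifted grid.
stored_min = shift_sigma(raw_min, SHIFT)
stored_max = shift_sigma(raw_max, SHIFT)  # sigma = 1 is a fixed point of the shift

# set_timesteps: sigma -> t -> sigma is a round trip, then the shift runs again.
buggy_min = shift_sigma(stored_min * N / N, SHIFT)
intended_min = shift_sigma(raw_min, SHIFT)

print(f"stored_max          = {stored_max}")
print(f"intended final min  = {intended_min:.6f}")
print(f"pre-fix final min   = {buggy_min:.6f}")
print(f"single shift**2 min = {shift_sigma(raw_min, SHIFT ** 2):.6f}")
```

Under these assumptions the upper bound is unchanged, so the visible effect of the double shift is on the low-noise end of the schedule.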

Fix

Record sigma_min and sigma_max from the raw linear sigmas, before the shift is applied. The shift formula in set_timesteps then runs exactly once on the correct unshifted inputs.

sigmas = timesteps / num_train_timesteps

# Store bounds BEFORE shifting so set_timesteps gets unshifted inputs
self.sigma_min = sigmas[-1].item()
self.sigma_max = sigmas[0].item()

if not use_dynamic_shifting:
    sigmas = shift * sigmas / (1 + (shift - 1) * sigmas)

Impact

Any model that calls set_timesteps without passing explicit sigmas or timesteps was affected. This is the default inference path for FLUX, Stable Diffusion 3, Wan, HiDream, CogVideoX, and all other FlowMatch-based models. With shift > 1 and use_dynamic_shifting disabled (the only case in which __init__ applies the shift), the schedule was always more compressed than intended.
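The size of the effect can be estimated with a short stand-alone sketch. It does not call the scheduler; linspace and schedule are hypothetical helpers that imitate the default set_timesteps path described above, and N = 1000, shift = 3.0, five inference steps, and raw sigmas running from 1 down to 1/N are all assumed values.

```python
def shift_sigma(sigma, shift):
    """Static shift formula quoted in the PR."""
    return shift * sigma / (1 + (shift - 1) * sigma)


def linspace(start, stop, num):
    """Pure-Python stand-in for an evenly spaced grid with both endpoints included."""
    if num == 1:
        return [start]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def schedule(sigma_min, sigma_max, num_steps, shift, n_train):
    """Imitates the default path: bounds -> timesteps -> sigmas -> one shift."""
    timesteps = linspace(sigma_max * n_train, sigma_min * n_train, num_steps)
    return [shift_sigma(t / n_train, shift) for t in timesteps]


N, SHIFT, STEPS = 1000, 3.0, 5   # assumed values
raw_min, raw_max = 1 / N, 1.0    # assumed raw linear bounds

# Pre-fix: bounds were taken from the shifted grid. Post-fix: from the raw grid.
pre_fix = schedule(shift_sigma(raw_min, SHIFT), shift_sigma(raw_max, SHIFT), STEPS, SHIFT, N)
post_fix = schedule(raw_min, raw_max, STEPS, SHIFT, N)

for i, (a, b) in enumerate(zip(pre_fix, post_fix)):
    print(f"step {i}: pre-fix {a:.6f}  post-fix {b:.6f}  diff {a - b:.6f}")
```

For these assumed values both schedules start at sigma = 1, and the pre-fix schedule never falls below the post-fix one, ending at a higher final sigma. Other step counts and shift values can be tried by changing the three constants.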

Tests

Added test_set_timesteps_no_double_shift: verifies that set_timesteps(num_inference_steps=1000) produces the same sigma grid that __init__ stores, for a scheduler with shift=3.0. All 14 tests pass.
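The property the test checks can be illustrated without the library. The sketch below is not the test added in this PR; regenerated_matches_init and its bounds_after_shift flag are hypothetical names, and the grid construction (raw sigmas from 1 down to 1/N) is an assumption. It rebuilds the grid from the stored bounds and compares it with the grid computed at construction time, once with post-fix bounds and once with pre-fix bounds.

```python
import math


def shift_sigma(sigma, shift):
    """Static shift formula quoted in the PR."""
    return shift * sigma / (1 + (shift - 1) * sigma)


def linspace(start, stop, num):
    """Pure-Python stand-in for an evenly spaced grid with both endpoints included."""
    if num == 1:
        return [start]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def regenerated_matches_init(n_train=1000, shift=3.0, bounds_after_shift=False):
    """Return True if the grid rebuilt from stored bounds equals the init grid."""
    raw = [t / n_train for t in range(n_train, 0, -1)]   # assumed raw grid: 1 ... 1/N
    init_sigmas = [shift_sigma(s, shift) for s in raw]

    # bounds_after_shift=True imitates the pre-fix behaviour described in the PR.
    source = init_sigmas if bounds_after_shift else raw
    sigma_max, sigma_min = source[0], source[-1]

    # Default set_timesteps path with num_inference_steps == n_train.
    timesteps = linspace(sigma_max * n_train, sigma_min * n_train, n_train)
    regenerated = [shift_sigma(t / n_train, shift) for t in timesteps]

    return all(
        math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
        for a, b in zip(regenerated, init_sigmas)
    )


print("post-fix bounds reproduce init grid:", regenerated_matches_init())
print("pre-fix bounds reproduce init grid: ", regenerated_matches_init(bounds_after_shift=True))
```

Using as many inference steps as training timesteps is what makes the comparison meaningful: only then does the regenerated grid have the same length and spacing as the one built in __init__.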

…-shift (huggingface#13243)

FlowMatchEulerDiscreteScheduler.__init__ computed sigma_min and sigma_max
from the already-shifted sigmas.  When set_timesteps regenerated the sigma
grid from those bounds via _sigma_to_t -> linspace -> /num_train_timesteps,
it recovered the shifted values and then applied the shift formula a second
time, producing a doubly-shifted (and therefore incorrect) schedule.

Fix: record sigma_min and sigma_max from the raw linear sigmas
(timesteps / num_train_timesteps) before the shift formula is applied, so
set_timesteps starts from the correct unshifted bounds and the shift is
applied exactly once.

Regression test: test_set_timesteps_no_double_shift verifies that
set_timesteps(num_inference_steps=1000) reproduces the same sigma grid
that __init__ stored, for a scheduler with shift=3.0.

Development

Successfully merging this pull request may close these issues.

[BUG] FlowMatchEulerDiscreteScheduler.__init__ computes sigma_min/sigma_max after shift, causing duplicate shift in set_timesteps
