Skip to content

fix(store): preserve per-volume DRBDMinor across VD-scoped REST modify (Bug 433)#177

Draft
Andrei Kvapil (kvaps) wants to merge 3 commits into
mainfrom
blue/round5-vd-modify-drops-drbd-minor
Draft

fix(store): preserve per-volume DRBDMinor across VD-scoped REST modify (Bug 433)#177
Andrei Kvapil (kvaps) wants to merge 3 commits into
mainfrom
blue/round5-vd-modify-drops-drbd-minor

Conversation

@kvaps

Copy link
Copy Markdown
Member

What

A legal, in-bounds VD-scoped modify — vd set-size (grow) or vd set-property — silently dropped the per-volume DRBDMinor (the /dev/drbd<N> device identity) from RD.Spec.VolumeDefinitions[].

Both VD-scoped store write paths (volumeDefinitions.Update and PatchVolumeDefinitionSpec) rebuild the inline entry through wireToCRDVD, whose wire shape apiv1.VolumeDefinition carries no DRBDMinor — so the round-trip zeroed it. In isolation the controller re-allocates the same minor (self-heals), but once a lower minor is freed by routine RD churn, the allocate-if-nil pass hands the resized volume that freed lower minor instead. The result is a permanent device-identity change on a live volume, driven purely by a resize / property edit — violating the CRD's own contract that a non-nil minor "is NEVER overwritten … preserved across a REST modify".

This is the same wire-rebuild-drops-operator-only-field class as the carry-across family (Bug 206 / 208 / 209). The RD-scoped path preserves the whole VolumeDefinitions slice wholesale (so the minor rode along for free); only the VD-scoped element rebuild dropped it.

Fix

Route both write-backs through a new wireToCRDVDPreserving, which carries the operator-only DRBDMinor across the rebuild.

ResourceDefinitionVolume is now registered in the typed-field carry-across tripwire — it was wrongly documented as having no operator-only fields, the gap that let this ship — so a future operator-only field on the VD element is a build-stop failure unless it is explicitly classified.

Tests (fail-on-bug, proven both directions)

  • L1 store unit (pkg/store/k8s/drbd_minor_carry_bug433_test.go): the minor survives resize / props / Update. FAILs on the pre-fix tree (minor comes back nil), PASSes with the carry-across.
  • Integration / envtest (tests/integration/group_blue_drbd_minor_bug433_test.go): the full REST → store → controller path. Pre-fix the minor diverges 20001 → 20000 (the freed lower minor); with the fix it holds at 20001.
  • L6 cli-matrix + L7 replay (vd-modify-preserves-drbd-minor.{sh,yaml} + a new drbd_minor replay await kind): the operator-CLI codification per the CLI-bug-fix protocol. On-stand execution is deferred — the DRBD stand is unavailable this session — so these are authored but not run here; the store unit and the envtest integration are the deterministic proof.

Each regression was verified to FAIL on the pre-fix tree and PASS with the fix.

Andrei Kvapil (kvaps) and others added 3 commits July 4, 2026 00:20
…Bug 433)

A legal in-bounds `vd set-size` grow or `vd set-property` silently
dropped RD.Spec.VolumeDefinitions[].DRBDMinor — the /dev/drbd<N> device
identity. Both VD-scoped write paths (volumeDefinitions.Update and
PatchVolumeDefinitionSpec) rebuild the inline entry through wireToCRDVD,
whose wire shape apiv1.VolumeDefinition carries no DRBDMinor, so the
round-trip zeroed it. In isolation the allocator re-picks the same
minor, but once a lower minor is freed by routine RD churn the
controller re-stamps a DIFFERENT minor — a permanent device-identity
change on a live resized volume, violating the CRD's "a non-nil minor
is NEVER overwritten … preserved across a REST modify" contract.

Route both write-backs through wireToCRDVDPreserving, which carries the
operator-only DRBDMinor across the rebuild. Same
wire-rebuild-drops-operator-only-field class as the carry-across family
(Bug 206/208/209); the RD-scoped path preserved the whole
VolumeDefinitions slice wholesale, which is why only the VD-scoped
element rebuild dropped it.

Register ResourceDefinitionVolume in the typed-field carry-across
tripwire — it was wrongly documented as having no operator-only fields,
the gap that let this ship — so a future field gets a build-stop failure
unless it is explicitly classified.

Regression (fail-on-bug, store layer): TestBug433_* assert the minor
survives resize / props / Update. They FAIL on the pre-fix tree (minor
comes back nil) and PASS with the carry-across.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
End-to-end (REST handler -> store -> controller allocator) fail-on-bug
regression for the per-volume DRBDMinor drop on a VD-scoped modify.

Scenario: RD-A grabs minor 20000, RD-B 20001; delete RD-A to free the
lower minor; issue a legal in-bounds `vd set-size` grow (and, in the
sibling test, a `vd set-property`) on RD-B vol0. On the pre-fix tree the
minor diverges 20001 -> 20000 (the freed lower minor); with the fix it
stays 20001. Complements the store-layer unit by proving the whole path
does not re-stamp a live volume`s device identity.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
… 433)

Per the CLI-bug-fix protocol a CLI-verb fix is not closed until the L6
cli-matrix cell and L7 replay land in the same PR.

- vd-modify-preserves-drbd-minor.sh (L6): at the operator-CLI level,
  create rd-a (lowest minor) + rd-b (next), delete rd-a to free the
  lower minor, then `vd set-size` and `vd set-property` on rd-b — assert
  its DRBDMinor AND /dev/drbd<N> device path stay put over a settle
  window. Captures the minors from the live RD spec, so it needs no
  hard-coded values and skips cleanly on a dirty minor range.
- vd-modify-preserves-drbd-minor.yaml (L7): the same operator sequence
  as a replay workflow, asserting minor stability via a new await kind.
- replay harness: add the `drbd_minor` await kind (reads
  RD.Spec.VolumeDefinitions[vol].drbdMinor; pair with hold_s for
  stability) + document it in the runner header.

On-stand run is DEFERRED: the DRBD stand is unavailable this session, so
these are authored but not executed here. The store-layer unit and the
envtest integration test are the deterministic fail-on-bug proof.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
@coderabbitai

coderabbitai Bot commented Jul 3, 2026

Copy link
Copy Markdown

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 430f677d-2c41-4d04-a9b2-4380d5fda4ca

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch blue/round5-vd-modify-drops-drbd-minor

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request resolves Bug 433 by ensuring that the operator-only DRBDMinor field (the DRBD device identity) is preserved during volume-definition-scoped updates and patches. The fix introduces a wireToCRDVDPreserving helper function to carry the minor across the wire rebuild, and adds extensive unit, integration, and end-to-end replay tests to prevent regression. The review feedback suggests adding a defensive nil check for the prev pointer in the new helper function to prevent potential nil pointer dereferences.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +484 to +489
func wireToCRDVDPreserving(prev *crdv1alpha1.ResourceDefinitionVolume, wire *apiv1.VolumeDefinition) crdv1alpha1.ResourceDefinitionVolume {
out := wireToCRDVD(wire)
out.DRBDMinor = prev.DRBDMinor

return out
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To enforce defensive programming and prevent potential nil pointer dereferences in the future, it is safer to add a nil check for the prev pointer before accessing its fields. While the current callers pass a non-nil pointer, safeguarding this helper function ensures robustness against future refactoring or reuse.

Suggested change
func wireToCRDVDPreserving(prev *crdv1alpha1.ResourceDefinitionVolume, wire *apiv1.VolumeDefinition) crdv1alpha1.ResourceDefinitionVolume {
out := wireToCRDVD(wire)
out.DRBDMinor = prev.DRBDMinor
return out
}
func wireToCRDVDPreserving(prev *crdv1alpha1.ResourceDefinitionVolume, wire *apiv1.VolumeDefinition) crdv1alpha1.ResourceDefinitionVolume {
out := wireToCRDVD(wire)
if prev != nil {
out.DRBDMinor = prev.DRBDMinor
}
return out
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant