fix(rest): enforce size bounds on VD resize to match create (Bug 155)#175
fix(rest): enforce size bounds on VD resize to match create (Bug 155)#175Andrei Kvapil (kvaps) wants to merge 2 commits into
Conversation
The VD resize path (PUT .../volume-definitions/{vn}, `linstor vd
set-size`) did not enforce the [4 MiB, 16 TiB] size bounds the create
path enforces via validateVDSize (Bug 155). A below-floor force-shrink
or an over-ceiling grow was accepted (200) and stored verbatim, leaving
a spec the satellite can never materialise — it hot-loops on `drbdadm
create-md` / device resize forever, the exact Bug 155 failure mode
reached through the resize verb. It does not self-heal.
Gate size_kib in rejectVDPatchSize, reusing the create path's
validateVDSize + writeVDSizeRejection so the rejection envelope is
byte-identical across the create and resize verbs. The check runs
before the shrink-vs-force gate (like the Bug 383 non-positive floor):
force waives the shrink-direction opt-in, never the physical
floor/ceiling.
Regression tests (fail on the pre-fix tree, pass with the gate): L1
handler tests assert a below-floor force-shrink and an over-ceiling
grow are refused with 400 + FAIL_INVLD_VLM_SIZE and leave the stored
size untouched, plus an inclusive-boundary accept; an envtest pair
drives the same two scenarios through the real apiserver round-trip.
The existing LargeSizeKibRoundTrip test used 1 PiB — above the 16 TiB
ceiling the create path already refuses (TestBug155VDCreateRefusesAbsurd
Size) — so it was pinning the very create/resize asymmetry this gate
closes. Its int64-no-truncation guard now uses the largest in-bounds
size (16 TiB), still ~8x the int32 clamp point.
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Co-Authored-By: Claude <noreply@anthropic.com>
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request mirrors the volume definition size bounds validation (4 MiB to 16 TiB) from the CREATE path onto the RESIZE path (PUT). This prevents out-of-bounds sizes from being stored during a resize, which previously caused satellite hot-loops. Unit and integration tests have been added to verify this behavior. The feedback highlights that reusing the writeVDSizeRejection helper on the resize path may output a misleading correction message referencing the create command. Additionally, the integration tests should be tightened to assert an exact 400 Bad Request status code rather than just checking for non-2xx status codes to prevent false positives during server crashes.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| return false | ||
| } | ||
|
|
||
| writeVDSizeRejection(w, rd, vn, *patch.SizeKib, sizeErr) |
There was a problem hiding this comment.
The writeVDSizeRejection helper has a hardcoded correction message (Correc) that instructs the operator to re-issue linstor vd c (the create command). Since this helper is now reused in the resize/PUT path (rejectVDPatchOutOfBounds), this correction message will be misleading to operators attempting a resize. Consider refactoring writeVDSizeRejection to accept the action/verb or make the correction message more generic (e.g., "retry the operation with a valid size").
| if status >= 200 && status < 300 { | ||
| t.Fatalf("resize ACCEPTED a sub-floor size %d KiB < %d KiB min (Bug-155 class via vd set-size)", | ||
| belowFloor, vdBoundsMinSizeKib) | ||
| } |
There was a problem hiding this comment.
Asserting that the status code is simply not in the 2xx range is too loose. If the server crashes or returns a 500 Internal Server Error, the test would pass because the stored size remains unchanged. It is better to assert the exact expected status code (400 Bad Request) to verify the API contract strictly.
if status != http.StatusBadRequest {
t.Fatalf("status: got %d, want 400 (sub-floor size must be refused like create); body=%s", status, body)
}| if status >= 200 && status < 300 { | ||
| t.Fatalf("resize ACCEPTED an over-max size %d KiB > %d KiB max (Bug-155 class via vd set-size)", | ||
| aboveMax, vdBoundsMaxSizeKib) | ||
| } |
There was a problem hiding this comment.
Asserting that the status code is simply not in the 2xx range is too loose. If the server crashes or returns a 500 Internal Server Error, the test would pass because the stored size remains unchanged. It is better to assert the exact expected status code (400 Bad Request) to verify the API contract strictly.
if status != http.StatusBadRequest {
t.Fatalf("status: got %d, want 400 (over-ceiling size must be refused like create); body=%s", status, body)
}Complete the CLI-bug-fix test tiers for the VD resize size-bounds gate (vd set-size), per the blockstor CLI-bug-fix protocol. - L6 cli-matrix cell tests/e2e/cli-matrix/vd-resize-bounds-rejected.sh: over-ceiling grow via CLI rejected, below-floor force-shrink via raw REST PUT rejected (400 + FAIL_INVLD_VLM_SIZE), in-bounds grow lands; size unchanged after each rejection. - L7 replay tests/operator-harness/replay/vd-resize-bounds-rejected.yaml: the CLI-driveable operator sequence (over-ceiling reject, below-floor reject, in-bounds grow succeeds) with size-unchanged assertions. - L1: add an in-bounds force-shrink (1 GiB -> 8 MiB) guard so the gate cannot regress the legitimate scenario-4.W13 force-shrink path. The on-live-stand run of the L6 cell and L7 replay is DEFERRED — the dev-kvaps DRBD oracle is unavailable this session; both files carry a run-deferred note and a stand-pending task tracks running them. The fix is already proven end-to-end at the L1 (handler) + integration (real apiserver via envtest) tiers, which fully cover a REST-layer input rejection; the L6/L7 tiers validate DRBD-state convergence, which is N/A for a refused request, so they are the required paper trail rather than the proof. Signed-off-by: Andrei Kvapil <kvapss@gmail.com> Co-Authored-By: Claude <noreply@anthropic.com>
What
The VD resize path (
PUT .../volume-definitions/{vn}, i.e.linstor vd set-size) did not enforce the[4 MiB, 16 TiB]size bounds the create path enforces viavalidateVDSize(Bug 155). A below-floor force-shrink (e.g. 3 MiB) or an over-ceiling grow (e.g. 16 TiB + 1 GiB) was accepted with200 OKand stored verbatim, leaving aVolumeDefinitionspec the satellite can never materialise — it hot-loops ondrbdadm create-md/ device resize indefinitely. This is the exact Bug 155 failure mode reached through the resize verb instead of create, and it does not self-heal.Why it mattered
linstor vd set-sizeandlinstor vd cshould be symmetric on physical size bounds. The create path already refuses out-of-range sizes (TestBug155VDCreateRefusesAbsurdSize); the resize path silently accepted them, so a realistic operator or CSI resize could wedge a resource.Fix
Gate
size_kibinrejectVDPatchSize, reusing the create path'svalidateVDSize+writeVDSizeRejectionso the rejection envelope (400+FAIL_INVLD_VLM_SIZE, with cause/correction) is byte-identical across the two verbs. The bounds check runs before the shrink-vs-force gate — consistent with the existing Bug 383 non-positive floor — becauseforcewaives the shrink-direction opt-in, never the physical floor/ceiling; checking bounds first gives the operator the accurate "invalid size" error instead of an "add --force" hint on a size that would be refused even with force. A prop-only PUT (nosize_kib) is untouched.Tests (fail-on-bug, proven both directions)
pkg/rest/vd_put_size_bounds_round4_test.go): a below-floor force-shrink and an over-ceiling grow are refused with400+FAIL_INVLD_VLM_SIZEand leave the stored size untouched; inclusive-boundary sizes (exactly 4 MiB / 16 TiB) still land200.tests/integration/vd_resize_bounds_test.go): the same two scenarios driven through the real apiserver + store round-trip, asserting both the wire rejection and that the durable spec is unchanged.Both fail on the pre-fix tree (PUT returns
200and mutates the stored spec) and pass with the gate.Heads-up for reviewers
TestVolumeDefinitionsUpdateLargeSizeKibRoundTrippreviously used1 PiB— above the 16 TiB DRBD ceiling the create path already refuses — so it was inadvertently pinning the very create/resize asymmetry this PR closes. Its genuine purpose (a multi-TiBsize_kibsurvives the wire without int32 truncation) is preserved by switching to the largest in-bounds size (16 TiB, still ~8× the int32 clamp point).Scope
REST-layer input-validation fix, fully proven at the L1 + integration (envtest) tiers. The L6 cli-matrix / L7 replay runs on the live DRBD stand are deferred (stand unavailable); this change does not touch the satellite/DRBD data plane.