Skip to content

Batch MPP claims into single ChannelMonitorUpdate#4552

Open
Abeeujah wants to merge 1 commit into
lightningdevkit:mainfrom
Abeeujah:parallelize-mppclaims
Open

Batch MPP claims into single ChannelMonitorUpdate#4552
Abeeujah wants to merge 1 commit into
lightningdevkit:mainfrom
Abeeujah:parallelize-mppclaims

Conversation

@Abeeujah

@Abeeujah Abeeujah commented Apr 11, 2026

Copy link
Copy Markdown
Contributor
Batch same-channel MPP claims into a single commitment

When an MMP payment has multiple parts that arrive over the same final-hop
channel, we previously processed each part sequentially through the full
commitment dance: claim one part, build a ChannelMonitorUpdate, ship a
commitment_signed, wait for the peer's RAA + next commitment_signed, then
move to the next part. This added significant claim latency and wasted
round trips on unnecessary monitor updates.

Instead, queue all MPP part claims into the channel's holding cell first,
then flush them together in a single pass. The holding-cell flush produces
one combined ChannelMonitorUpdate (preimage + commitment steps) and one
commitment_signed carrying all update_fulfill_htlc messages at once,
eliminating the intermediate round trips.

To enable this, the old get_update_fulfill_htlc_and_commit is split into
queue_fulfill_htlc (pushes into the holding cell) and the existing holding-
cell flush path. ClaimedMPPPayment RAA blockers are deliberately ignored
during both the queue and flush phases so the first MPP part on a channel
does not force subsequent parts onto the standalone preimage path.

When the channel cannot immediately flush its holding cell (e.g. peer is
disconnected or another monitor update is in flight), a standalone preimage-
only ChannelMonitorUpdate is persisted for HTLC-preimage safety, matching
the prior behavior.

Tests has been updated to reflect this new batching of MPP claims

  • test_single_channel_multiple_mpp
  • auto_retry_partial_failure
  • test_keysend_dup_hash_partial_mpp

closes #3986

@ldk-reviews-bot

ldk-reviews-bot commented Apr 11, 2026

Copy link
Copy Markdown

👋 I see @TheBlueMatt was un-assigned.
If you'd like another reviewer assignment, please click here.

@ldk-reviews-bot ldk-reviews-bot requested a review from jkczyz April 11, 2026 17:52
Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment thread lightning/src/ln/channel.rs Outdated
Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment thread lightning/src/ln/channel.rs Outdated
Comment thread lightning/src/ln/chanmon_update_fail_tests.rs
@ldk-claude-review-bot

ldk-claude-review-bot commented Apr 11, 2026

Copy link
Copy Markdown
Collaborator

The line I wanted to annotate (7803) is unchanged context, not part of the diff, so I'll note it in the summary instead.

Re-review Summary — PR #4552 (commit 626a4b6)

This commit has been substantially reworked since my prior review pass, and the major previously-flagged issues now appear resolved:

  • N actions per batch (RESOLVED): Completion actions and ClaimedMPPPayment RAA blockers are now registered exactly once per touched channel in the new flush_pending_holding_cell_claim loop, not per-HTLC.
  • monitor_updating_paused panic (RESOLVED): build_preimage_only_monitor_update now calls monitor_updating_paused(...) before the update is submitted, preventing the MONITOR_UPDATE_IN_PROGRESS assertion failure on the deferred path.
  • Preimage-loss-on-close window (RESOLVED): flush_pending_holding_cell_claim detects a vanished/closed channel and falls through to the closed-channel path (bumping closed_channel_monitor_update_ids and writing a preimage-only update), so the queued preimage is persisted.
  • claim_htlc_while_disconnected_dropping_mon_update_legacy (RESOLVED): now calls get_update_fulfill_htlc directly with force_holding_cell=false, so no spurious MONITOR_UPDATE_IN_PROGRESS/ID shifting.
  • Deferred-flush test coverage (RESOLVED): new test_single_channel_multiple_mpp_awaiting_raa exercises the build_preimage_only_monitor_update path.
  • Fail/claim conflict (PARTIALLY RESOLVED): now returns NewClaim instead of DuplicateClaim, restoring preimage persistence in the non-batched case.

Remaining / newly-introduced observation (not posted inline — target line is unchanged context)

  • channel.rs:7803-7807 — In the fail/claim-conflict branch the HTLC is not pushed into the holding cell (the function returns before the push at 7818). With force_holding_cell == true, monitor_update is None, so this returns NewClaim { monitor_update: None, update_blocked: true }, which get_update_fulfill_htlc_and_commit maps to Queued {}. That tells claim_payment_internal the claim was queued, so the channel is added to touched_channels and flushed — but the holding cell still contains the FailHTLC, so the flush will fail the HTLC and never persist the preimage. This is debug_assert!(false)-guarded and structurally prevented by the claimable_payments lock, so it is unreachable in practice, but in release mode the Queued {} result is now actively misleading. Consider returning DuplicateClaim {} when monitor_update.is_none() here, or pushing the claim before returning, so the reported state matches reality.

No other new issues found beyond what was raised in prior passes.

Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment thread lightning/src/ln/channel.rs Outdated
Comment thread lightning/src/ln/channelmanager.rs Outdated
@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from 88d62c4 to 5880633 Compare April 12, 2026 11:08
@Alkamal01

Alkamal01 commented Apr 12, 2026

Copy link
Copy Markdown
Contributor

let _ = htlc_value_msat in the NewClaims branch of claim_batch_funds_from_channel — that value is being silently dropped. If it's needed downstream for fee accounting or event data, this is a bug. If it's not needed, it probably shouldn't be in UpdateFulfillsCommitFetch::NewClaims at all.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 1st Reminder

Hey @jkczyz! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@jkczyz jkczyz requested review from TheBlueMatt and removed request for jkczyz April 14, 2026 22:09
@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from 5880633 to d801ec9 Compare April 16, 2026 15:08
@Abeeujah

Copy link
Copy Markdown
Contributor Author

let _ = htlc_value_msat

It's intentionally left unused right now, pending any design decision that would be made for the msat values that field holds, in the case of a single claim, there's a completion_action function computing with the value, In this case, it is a batch claim, so it's open ended to how batch claims should work

  • Do we sum all since they were claimed at a single swoop and pass to completion_action
  • Should a Batch Claim Event be implemented as there is for a single claim

This is why that field exist, the absense of the direction it should take is the reason it is intentionally currently left unused.

@codecov

codecov Bot commented Apr 16, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 70.22901% with 78 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.36%. Comparing base (44828f7) to head (fc83c75).
⚠️ Report is 17 commits behind head on main.

Files with missing lines Patch % Lines
lightning/src/ln/channelmanager.rs 68.34% 59 Missing and 4 partials ⚠️
lightning/src/ln/channel.rs 76.19% 15 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4552      +/-   ##
==========================================
+ Coverage   86.12%   86.36%   +0.23%     
==========================================
  Files         157      158       +1     
  Lines      108922   109493     +571     
  Branches   108922   109493     +571     
==========================================
+ Hits        93812    94561     +749     
+ Misses      12495    12381     -114     
+ Partials     2615     2551      -64     
Flag Coverage Δ
fuzzing-fake-hashes 5.06% <0.00%> (?)
fuzzing-real-hashes 22.89% <64.17%> (?)
tests 86.09% <70.22%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Alkamal01

Copy link
Copy Markdown
Contributor

let _ = htlc_value_msat

It's intentionally left unused right now, pending any design decision that would be made for the msat values that field holds, in the case of a single claim, there's a completion_action function computing with the value, In this case, it is a batch claim, so it's open ended to how batch claims should work

* Do we sum all since they were claimed at a single swoop and pass to `completion_action`

* Should a Batch Claim Event be implemented as there is for a single claim

This is why that field exist, the absense of the direction it should take is the reason it is intentionally currently left unused.

let _ = htlc_value_msat

It's intentionally left unused right now, pending any design decision that would be made for the msat values that field holds, in the case of a single claim, there's a completion_action function computing with the value, In this case, it is a batch claim, so it's open ended to how batch claims should work

* Do we sum all since they were claimed at a single swoop and pass to `completion_action`

* Should a Batch Claim Event be implemented as there is for a single claim

This is why that field exist, the absense of the direction it should take is the reason it is intentionally currently left unused.

Makes sense, but could you drop a quick comment in the code explaining this? Anyone reading it later will just see let _ = htlc_value_msat and have no idea why. Also maybe open a follow-up issue so the decision doesn't get lost after merge.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 1st Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 2nd Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 3rd Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@TheBlueMatt TheBlueMatt left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, would this not be simpler by using the holding cell instead? When we're failing, we call queue_fail_htlc which just shoves the failure into the holding cell then call check_free_peer_holding_cells to release them all at once. Rather than rebuilding that logic for claims, we should be able to do something similar. We'd have to handle the monitor updates with the preimages a bit differently, but not drastically so.

@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from d801ec9 to 5992824 Compare April 27, 2026 14:00
Comment thread lightning/src/ln/channelmanager.rs Outdated
@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from 5992824 to f466457 Compare April 27, 2026 15:44
@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch 2 times, most recently from 086a669 to 2a28f1d Compare May 14, 2026 12:39
Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment thread lightning/src/ln/channelmanager.rs
Comment thread lightning/src/ln/channelmanager.rs Outdated
@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from 2a28f1d to fc83c75 Compare May 14, 2026 16:24
@Abeeujah Abeeujah requested a review from TheBlueMatt May 15, 2026 11:33
@ldk-reviews-bot

Copy link
Copy Markdown

🔔 1st Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

// When the caller asked us to push into the holding cell for batching
// (`force_holding_cell`) we skip producing a `ChannelMonitorUpdate` here; the caller is
// responsible for either flushing the holding cell (producing one combined update with
// every queued preimage and the commitment) or building a preimage-only update for

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a massive layer violation - the ChannelManager shouldn't be making decisions around how to build ChannelMonitorUpdates, we shouldn't need a manual build_preimage_only_monitor_update, etc. Instead, we should be able to queue all the updates, then use the existing check_free_holding_cells logic (or check_free_peer_holding_cells to filter by the channels we actually updated) and have the channel automatically include the preimage updatesteps in the ChannelMonitorUpdate it builds.

@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from fc83c75 to 4e596b0 Compare May 29, 2026 20:38
Comment thread lightning/src/ln/channel.rs Outdated
@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from 4e596b0 to 6abcee3 Compare May 29, 2026 21:15
Comment thread lightning/src/ln/channelmanager.rs Outdated
@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from 6abcee3 to 92fd2d9 Compare May 29, 2026 22:02
Comment thread lightning/src/ln/channel.rs Outdated
@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch 2 times, most recently from 6f6a028 to 20e8dd0 Compare June 5, 2026 12:09
@Abeeujah Abeeujah requested a review from TheBlueMatt June 5, 2026 13:57
@ldk-reviews-bot

Copy link
Copy Markdown

🔔 1st Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 2nd Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 3rd Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 4th Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot ldk-reviews-bot requested a review from tankyleo June 15, 2026 20:56
@ldk-reviews-bot

Copy link
Copy Markdown

✅ Added second reviewer: @tankyleo

@TheBlueMatt TheBlueMatt removed their request for review June 15, 2026 20:56
@TheBlueMatt

Copy link
Copy Markdown
Collaborator

Gonna let @tankyleo do a pass or two here, sorry for the delay.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 1st Reminder

Hey @tankyleo! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 2nd Reminder

Hey @tankyleo! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

When multiple parts of an MPP payment arrive on the same channel, claim
them via the holding cell so they are released as one combined
ChannelMonitorUpdate (every PaymentPreimage step plus a single
commitment_signed) instead of one round-trip per part.

claim_payment_internal now forces the holding-cell path whenever there
is more than one source, records each touched channel, and flushes them
after queueing. The single-source case keeps the existing inline fast
path.

FundedChannel::get_update_fulfill_htlc gains a force_holding_cell flag
that suppresses the per-claim ChannelMonitorUpdate so the flush can
emit one combined update. When the holding cell cannot be freed
immediately (awaiting RAA, monitor update in progress, disconnected,
quiescent, ...), build_preimage_only_monitor_update writes a one-step
preimage update so the preimage stays durable across restarts; the
commitment_signed follows when the holding cell is naturally flushed.

Update test_single_channel_multiple_mpp, test_simple_mpp, and
auto_retry_partial_failure to reflect the new single-round-trip
behavior, and drop the threaded reproducer that targeted the old
per-claim path.
@Abeeujah Abeeujah force-pushed the parallelize-mppclaims branch from 20e8dd0 to 626a4b6 Compare June 20, 2026 09:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Better parallelize MPP claims

5 participants