Clear monitor-pending RAA once regenerated by wpaulino · Pull Request #4684 · lightningdevkit/rust-lightning

wpaulino · 2026-06-11T21:38:01Z

The chanmon_consistency fuzz target found a reconnect ordering where signer_pending_revoke_and_ack and monitor_pending_revoke_and_ack could both describe the same owed revoke_and_ack.

The channel first received a commitment_signed whose monitor update completed, but the signer could not provide the next point or secret, leaving signer_pending_revoke_and_ack set. Later, receiving the peer revoke_and_ack freed holding-cell HTLCs and produced a held monitor update. While that monitor update was still blocked, channel_reestablish saw the peer one state behind and recorded monitor_pending_revoke_and_ack, plus the corresponding monitor-pending commitment_signed, so the messages could be replayed once monitor updating was restored.

If the signer unblocked before the held monitor update was released, signer_maybe_unblocked generated and sent the RAA using signer_pending_revoke_and_ack. The monitor-pending flag was not cleared at that point, so monitor_updating_restored later generated the same RAA again when the held update completed. The peer had already advanced after accepting the signer-unblocked RAA, so it rejected the duplicate secret as not corresponding to its current pubkey and force-closed.
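To make the force-close step concrete, here is a minimal sketch of why a peer rejects the same RAA the second time. Everything in it is hypothetical: `ToyPeer`, `ToyRaa`, and `toy_point` are illustrative names, the peer tracks only a single expected point, and `toy_point` is a plain integer mix standing in for real secp256k1 point derivation. It is not LDK's validation code.

```rust
/// Toy stand-in for deriving a per-commitment point from its secret.
/// Real nodes use secp256k1 point multiplication; this is NOT cryptography.
fn toy_point(secret: u64) -> u64 {
    // Multiplying by an odd constant is injective mod 2^64, so distinct
    // secrets always map to distinct toy points.
    secret.wrapping_mul(0x9E37_79B9_7F4A_7C15) ^ 0xA5A5
}

/// Hypothetical, simplified RAA: the secret being revealed plus the next point.
struct ToyRaa {
    per_commitment_secret: u64,
    next_per_commitment_point: u64,
}

/// Hypothetical peer that only tracks the point the next secret must match.
struct ToyPeer {
    expected_point: u64,
}

impl ToyPeer {
    fn handle_revoke_and_ack(&mut self, raa: &ToyRaa) -> Result<(), &'static str> {
        if toy_point(raa.per_commitment_secret) != self.expected_point {
            // A real peer would fail the channel at this point.
            return Err("secret does not correspond to current pubkey");
        }
        // Accepting the RAA advances the peer's view by one state.
        self.expected_point = raa.next_per_commitment_point;
        Ok(())
    }
}

/// Delivers the same RAA twice and returns both results.
fn deliver_twice() -> (Result<(), &'static str>, Result<(), &'static str>) {
    let raa = ToyRaa { per_commitment_secret: 11, next_per_commitment_point: toy_point(22) };
    let mut peer = ToyPeer { expected_point: toy_point(11) };
    let first = peer.handle_revoke_and_ack(&raa);
    let second = peer.handle_revoke_and_ack(&raa);
    (first, second)
}

fn main() {
    let (first, second) = deliver_twice();
    println!("first: {:?}, second: {:?}", first, second);
}
```

The first delivery is accepted and moves the peer forward, so the identical second delivery no longer matches the point the peer expects.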

Fix this by clearing monitor_pending_revoke_and_ack whenever get_last_revoke_and_ack successfully constructs an RAA, alongside signer_pending_revoke_and_ack. All resend paths regenerate RAAs through this helper, so successful generation through either pending path satisfies the other pending record. If generation fails, pending signer state is still left set and monitor-pending state remains available for monitor restoration to retry.
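The double-send and the fix as framed in this description can be sketched with a toy model. This is a simplification under stated assumptions, not the code in `channel.rs`: `ToyChannel`, its fields `signer_ready`, `owed_raa_state`, and `clear_monitor_flag_on_success`, and the use of a `u64` in place of a real RAA message are all hypothetical. Only the two flag names and the three method names are borrowed from the text above.

```rust
/// Toy stand-in for a channel's resend bookkeeping (hypothetical, not LDK's type).
struct ToyChannel {
    signer_pending_revoke_and_ack: bool,
    monitor_pending_revoke_and_ack: bool,
    /// Whether the async signer can currently produce the next point and secret.
    signer_ready: bool,
    /// State number identifying the single RAA this channel owes its peer.
    owed_raa_state: u64,
    /// When false, models the pre-fix behaviour (only the signer flag is cleared).
    clear_monitor_flag_on_success: bool,
}

impl ToyChannel {
    /// Models the shared helper: builds the owed RAA if the signer is ready.
    fn get_last_revoke_and_ack(&mut self) -> Option<u64> {
        if !self.signer_ready {
            // Generation failed: remember that the signer still owes us an RAA.
            self.signer_pending_revoke_and_ack = true;
            return None;
        }
        self.signer_pending_revoke_and_ack = false;
        if self.clear_monitor_flag_on_success {
            // The fix: one successful RAA satisfies both pending records.
            self.monitor_pending_revoke_and_ack = false;
        }
        Some(self.owed_raa_state)
    }

    fn signer_maybe_unblocked(&mut self) -> Option<u64> {
        self.signer_ready = true;
        if self.signer_pending_revoke_and_ack {
            self.get_last_revoke_and_ack()
        } else {
            None
        }
    }

    fn monitor_updating_restored(&mut self) -> Option<u64> {
        // Take the flag so it is cleared whether or not generation succeeds.
        let pending = std::mem::take(&mut self.monitor_pending_revoke_and_ack);
        if pending {
            self.get_last_revoke_and_ack()
        } else {
            None
        }
    }
}

/// Replays the ordering from the description and returns every RAA sent.
fn run_scenario(clear_monitor_flag_on_success: bool) -> Vec<u64> {
    let mut chan = ToyChannel {
        signer_pending_revoke_and_ack: true,
        monitor_pending_revoke_and_ack: true,
        signer_ready: false,
        owed_raa_state: 7,
        clear_monitor_flag_on_success,
    };
    let mut sent = Vec::new();
    sent.extend(chan.signer_maybe_unblocked());
    sent.extend(chan.monitor_updating_restored());
    sent
}

fn main() {
    println!("without the clear: {:?}", run_scenario(false));
    println!("with the clear:    {:?}", run_scenario(true));
}
```

In this model the signer unblocks first and the held monitor update completes second, which is the ordering the fuzzer hit. Without the clear, the same owed RAA is emitted by both paths.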

This failure was discovered in https://github.com/lightningdevkit/rust-lightning/actions/runs/26905971318/job/79370860747.

ldk-reviews-bot · 2026-06-11T21:38:05Z

👋 Thanks for assigning @TheBlueMatt as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

ldk-claude-review-bot · 2026-06-11T21:41:48Z

No new issues found.

The production change (lightning/src/ln/channel.rs:10262-10268) clears monitor_pending_revoke_and_ack in signer_maybe_unblocked when the signer-pending path successfully regenerates an RAA. On re-verification:

The clear is placed after the resend-order reblocking logic (10250-10261), so an RAA that gets nulled by the CommitmentFirst reorder correctly leaves monitor_pending_revoke_and_ack set for later retry. This placement is correct.
The reverse duplicate (monitor path then signer path) is already prevented by get_last_revoke_and_ack clearing signer_pending_revoke_and_ack on success (line 10352).
monitor_updating_restored clears monitor_pending unconditionally (10047), so the held update no longer regenerates the RAA after the signer path sent it.
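The placement argument in the first point can be illustrated with a small sketch. It assumes a heavily simplified signer-resume path: `ToyPending`, `ToyResendOrder`, and the `commitment_ready` / `raa_ready` parameters are hypothetical names, and only the `CommitmentFirst` hold-back case mentioned in the review is modelled. The real logic in `channel.rs` handles more state than this.

```rust
/// Hypothetical stand-in for the order in which messages must be resent.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ToyResendOrder {
    CommitmentFirst,
    RevokeAndAckFirst,
}

/// Hypothetical subset of a channel's pending flags.
#[derive(Debug)]
struct ToyPending {
    signer_pending_commitment_update: bool,
    signer_pending_revoke_and_ack: bool,
    monitor_pending_revoke_and_ack: bool,
}

/// Returns `(send_commitment_update, send_raa)` for one signer-resume attempt.
fn signer_maybe_unblocked(
    p: &mut ToyPending,
    order: ToyResendOrder,
    commitment_ready: bool,
    raa_ready: bool,
) -> (bool, bool) {
    let send_commitment = p.signer_pending_commitment_update && commitment_ready;
    if send_commitment {
        p.signer_pending_commitment_update = false;
    }
    let mut send_raa = p.signer_pending_revoke_and_ack && raa_ready;
    if send_raa {
        p.signer_pending_revoke_and_ack = false;
    }

    // Resend-order reblocking: if the commitment update must go first but is
    // still blocked on the signer, hold the RAA back and mark it pending again.
    if order == ToyResendOrder::CommitmentFirst
        && p.signer_pending_commitment_update
        && send_raa
    {
        send_raa = false;
        p.signer_pending_revoke_and_ack = true;
    }

    // Only after reblocking: clear the monitor-pending record if an RAA is
    // actually being returned, so a held-back RAA can still be retried later.
    if send_raa {
        p.monitor_pending_revoke_and_ack = false;
    }
    (send_commitment, send_raa)
}

/// Helper for the examples: everything is pending.
fn all_pending() -> ToyPending {
    ToyPending {
        signer_pending_commitment_update: true,
        signer_pending_revoke_and_ack: true,
        monitor_pending_revoke_and_ack: true,
    }
}

fn main() {
    let mut held = all_pending();
    let out = signer_maybe_unblocked(&mut held, ToyResendOrder::CommitmentFirst, false, true);
    println!("held back: {:?} -> {:?}", out, held);

    let mut sent = all_pending();
    let out = signer_maybe_unblocked(&mut sent, ToyResendOrder::RevokeAndAckFirst, false, true);
    println!("sent:      {:?} -> {:?}", out, sent);
}
```

Had the clear run before the reblocking step, the held-back case would lose its monitor-pending record even though no RAA left the function.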

Minor non-blocking observation (not a code bug): the PR description and inline comment frame the fix as happening "whenever get_last_revoke_and_ack successfully constructs an RAA," but the change is actually in signer_maybe_unblocked, not the helper. The implementation is nonetheless functionally complete and the chosen location is in fact safer than placing it in the helper would be.

The symmetric commitment_signed resend path remains untreated, as noted in my prior review — still out of scope for the bug this PR targets.

ldk-reviews-bot · 2026-06-15T00:00:48Z

🔔 1st Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

TheBlueMatt · 2026-06-15T13:26:39Z

Why does this not need backport to 0.1/0.2?

TheBlueMatt

I'm a bit confused here: why is it safe to always send an RAA (via get_last_revoke_and_ack) based on only signer_pending_... or monitor_pending_...? E.g., if we reconnect and find that we owe a commitment_signed that is blocked on the signer but have a revoke_and_ack blocked on a monitor update, we'll set signer_pending_raa and then send it if the signer completes, even if the monitor update is still pending.

ldk-reviews-bot · 2026-06-15T13:39:54Z

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

The `chanmon_consistency` fuzz target found a reconnect ordering where `signer_pending_revoke_and_ack` and `monitor_pending_revoke_and_ack` could both describe the same owed `revoke_and_ack`.

The channel first received a `commitment_signed` whose monitor update completed, but the signer could not provide the next point or secret, leaving `signer_pending_revoke_and_ack` set. Later, receiving the peer `revoke_and_ack` freed holding-cell HTLCs and produced a held monitor update. While that monitor update was still blocked, `channel_reestablish` saw the peer one state behind and recorded `monitor_pending_revoke_and_ack`, plus the corresponding monitor-pending `commitment_signed`, so the messages could be replayed once monitor updating was restored.

If the signer unblocked before the held monitor update was released, `signer_maybe_unblocked` generated and sent the already monitor-safe RAA using `signer_pending_revoke_and_ack`. The monitor-pending flag was not cleared at that point, so `monitor_updating_restored` later generated the same RAA again when the held update completed. The peer had already advanced after accepting the signer-unblocked RAA, so it rejected the duplicate secret as not corresponding to its current pubkey and force-closed.

Fix this by clearing `monitor_pending_revoke_and_ack` in the signer-resume path only once a signer-pending RAA is actually being returned.

wpaulino · 2026-06-15T17:32:06Z

> Why does this not need backport to 0.1/0.2?

I think we can, though I do wonder why this went uncaught for so long if it actually was an issue in those releases as well. Something about our current fuzz harness made this much easier to find.

EDIT: It's reachable in the fuzzer now because there's a path to reload with a stale manager.

TheBlueMatt

This is a trivial fix.

TheBlueMatt · 2026-06-16T22:09:18Z

Backported in #4706

TheBlueMatt · 2026-06-17T21:18:58Z

Backported to 0.1 in #4710.

v0.1.10 - Jun 18, 2026 - "Loupe de Loupe"

API Updates
===========

 * `DefaultMessageRouter` will now always generate blinded message paths that provide no privacy (where our node is the introduction node) for nodes with public channels. This works around an issue which will appear for any nodes with LND peers that enable onion messaging - such peers will refuse to forward BOLT 12 messages from unknown third parties, which most BOLT 12 payers rely on today (#4647).
 * Explicit `amount_msats` of 0 is rejected in BOLT 12 `Offer`s; `OfferBuilder` now maps 0-amounts to an amount of `None` (#4324).

Bug Fixes
=========

 * Async `ChannelMonitorUpdate` persistence operations which complete, but are not marked as complete in a persisted `ChannelManager` prior to restart, followed immediately by a block connection and then another restart could result in some channel operations hanging leading to force-closures (#4377).
 * If an MPP payment is claimed but `ChannelMonitorUpdate`s for some parts are still being completed asynchronously, further channel updates (e.g. forwarding another payment) are pending and the node restarts, the channel could have become stuck (#4520).
 * The presence of unconfirmed transactions actually no longer causes `ElectrumSyncClient` to spuriously fail to sync (#4590).
 * `FilesystemStore::list_all_keys` will no longer fail if there are stale intermediate files lying around from a previous unclean shutdown (#4618).
 * When forwarding an HTLC while in a blinded path with proportional fees over 200%, LDK will no longer spuriously allow a forward that pays us 1 msat too little in fees (#4697).
 * Fixed a rare case where a channel could get stuck on reconnect when using both async `ChannelMonitorUpdate` persistence and async signing (#4684).
 * `Event::PaymentSent::fee_paid_msat` is no longer `None` in cases where `ChannelManager::abandon_payment` was called before the payment ultimately completes anyway (#4651).
 * Syncing a `ChainMonitor` using the `Confirm` trait will no longer write some full `ChannelMonitor`s to disk several times per block (#4544).
 * `OMDomainResolver` now correctly accounts for failed queries when rate limiting, ensuring we continue to respond to queries after failures (#4591).
 * Calling `ChannelManager::send_payment_with_route` without a `route_params` and with an invalid `Route` will no longer panic (#4707).
 * `lightning-custom-message`'s handling of `peer_connected` events now ensures that sub-handlers will see a `peer_disconnected` event if a different sub-handler refused the connection by `Err`ing `peer_connected` (#4595).
 * Incomplete MPP keysend payments will no longer see their HTLCs held until expiry (#4558).
 * `InvoiceRequestBuilder` will no longer accept a `quantity` of `0` for a BOLT 12 `Offer`, allowing any quantity up to a bound (#4667).
 * `lightning-custom-message` handlers that return `Ok(None)` when asked to deserialize a message in their defined range no longer cause panics (#4709).
 * Several spurious debug assertions were fixed (#4537, #4618).

Security
========

0.1.10 fixes a sanitization issue and several denial-of-service vulnerabilities.

 * `Bolt11Invoice::recover_payee_pub_key` no longer panics if called on an invoice which set an explicit public key, rather than relying on public key recovery. This method is called from `payment_parameters_from_invoice` and `payment_parameters_from_variable_amount_invoice` (#4717).
 * Maliciously-crafted unpayable invoices which have overflowing feerates will no longer cause an `unwrap` failure panic (#4716).
 * `possiblyrandom` did not properly generate random data except when it was explicitly configured to. By default this means LDK is vulnerable to various HashDoS attacks (#4719).
 * `OMNameResolver` will no longer panic when looking up payment instructions which include unicode characters at the start of a TXT record (#4718).
 * `PrintableString` did not properly sanitize unicode format characters, allowing an attacker to corrupt the rendering of logs or UI (#4593, #4605).
 * RGS data is now limited in how large of a graph it is able to cause a client to store in memory. Note that RGS data is still considered a DoS vector in general and you should only use semi-trusted RGS data (#4713).
 * Counterparty-provided strings in failure messages are no longer logged in full, reducing the ability of such a counterparty to spam our logs (#4714).
 * Reading a corrupted `ChannelManager` or `ProbabilisticScorer` can no longer cause us to allocate large amounts of memory (#4712).

Thanks to Project Loupe for reporting most of the issues fixed in this release.

wpaulino added this to the 0.3 milestone Jun 11, 2026

wpaulino requested a review from TheBlueMatt June 11, 2026 21:38

wpaulino self-assigned this Jun 11, 2026

wpaulino added the backport 0.3 label Jun 11, 2026

TheBlueMatt reviewed Jun 15, 2026

wpaulino force-pushed the bogus-async-raa-regenerated branch from 38f6df2 to 27223fd Compare June 15, 2026 17:29

wpaulino requested a review from TheBlueMatt June 15, 2026 17:29

wpaulino added the backport 0.1 and backport 0.2 labels Jun 15, 2026

TheBlueMatt approved these changes Jun 16, 2026

TheBlueMatt merged commit 55fb60b into lightningdevkit:main Jun 16, 2026
1 check passed

wpaulino deleted the bogus-async-raa-regenerated branch June 16, 2026 16:36

TheBlueMatt mentioned this pull request Jun 16, 2026

[0.2] Another round of backports + initial release notes #4706

Merged

TheBlueMatt removed the backport 0.2 label Jun 16, 2026

TheBlueMatt mentioned this pull request Jun 17, 2026

[0.1] Further backports and draft release notes #4710

Merged

TheBlueMatt removed the backport 0.1 label Jun 17, 2026
