Add splice RBF support#888
Conversation
|
👋 Thanks for assigning @benthecarman as a reviewer! |
|
@tnull Any feedback on the last few commits would be appreciated: https://github.com/lightningdevkit/ldk-node/pull/888/changes/492ec9124ae02a4048ecb5270309174b05b366d2..c770999887e1287bef8abe4f4c92ca36e9434067 |
198bfbf to
d5b6bda
Compare
d5b6bda to
91166b9
Compare
tnull
left a comment
There was a problem hiding this comment.
Excuse the delay. Took a first look, generally makes sense, though it's kind off odd to block the broadcast queue on persistence. Hopefully we'd never get a lot of backpressure from remote persistence, for example.
I have yet to review the (rather verbose) new code for updating/persisting the pending payments in wallet.rs in more detail.
91166b9 to
c4d3282
Compare
|
@jkczyz Is this ready for review or are we waiting for any upstream changes still? |
Ah, was unaware of that cross dependency, will see to review #882 next to keep things moving then. |
|
🔔 1st Reminder Hey @tankyleo! This PR has been waiting for your review. |
|
🔔 2nd Reminder Hey @tankyleo! This PR has been waiting for your review. |
|
🔔 3rd Reminder Hey @tankyleo! This PR has been waiting for your review. |
Camillarhi
left a comment
There was a problem hiding this comment.
Thanks for this. I took a look at this, and I have a few questions. I have taken a little look at splicing in general.
For this PR, if a user calls bump_fee_rbf on a record with funding_details, the replacement is picked up as a WalletEvent::TxReplaced event on the next sync, and this will currently rewrite the funding_details as NONE. Is bumping a funding/splice payment through bump_fee_rbf meant to be possible at all, or should those records be excluded here in favor of rbf_channel? AND is overwriting an existing funding_details with None intended in the TxReplaced path?
|
🔔 4th Reminder Hey @tankyleo! This PR has been waiting for your review. |
|
🔔 5th Reminder Hey @tankyleo! This PR has been waiting for your review. |
|
🔔 6th Reminder Hey @tankyleo! This PR has been waiting for your review. |
c4d3282 to
bec7723
Compare
|
Rebased and dropped commits added in #912.
Sorry, that should have said #872. Relevant commits were moved to #912, which has been merged.
Good catch @Camillarhi! Updated to disallow using Overwriting the existing |
|
🔔 7th Reminder Hey @tankyleo! This PR has been waiting for your review. |
|
🔔 8th Reminder Hey @tankyleo! This PR has been waiting for your review. |
benthecarman
left a comment
There was a problem hiding this comment.
Needs rebase.
Would be nice if we had an api to see if a channel has a currently in-progress splice. Currently the only way to know is to call one of the splice functions and see if it fails.
| "Failed to splice channel: a prior splice contribution is pending" | ||
| "Failed to splice channel: a prior splice contribution is pending; use rbf_channel instead" | ||
| ); | ||
| return Err(Error::ChannelSplicingFailed); |
There was a problem hiding this comment.
It says use rbf_channel instead but that function doesn't let us change the amount in/out. Some like use rbf_channel to bump fee would be more accurate. Also would be nice if this had a separate error.
There was a problem hiding this comment.
Fixed the message, but I'm not sure it justifies a new error type. Callers will be able to check SpliceDetails once lightningdevkit/rust-lightning#4687 is included. @tnull Any preference?
There was a problem hiding this comment.
Should that be added to the 0.3 milestone?
There was a problem hiding this comment.
Fixed the message, but I'm not sure it justifies a new error type. Callers will be able to check
SpliceDetailsonce lightningdevkit/rust-lightning#4687 is included. @tnull Any preference?
No preference, fine to leave it as-is.
|
also needs rebase and I imagine this will have conflicts with #660 |
Yeah, good point. #930 adds a store for handling retries on disconnect. But I have an upstream change to expose this information on |
|
🔔 2nd Reminder Hey @tnull! This PR has been waiting for your review. |
|
🔔 3rd Reminder Hey @tnull! This PR has been waiting for your review. |
When a splice is already pending, the user needs a way to replace its funding transaction at a higher feerate. This adds rbf_channel() to handle that case and guards splice_in/splice_out against being called while a pending splice exists, directing users to rbf_channel instead. Also fixes signing for RBF replacements, which requires accessing outputs spent by unconfirmed transactions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
bec7723 to
f78025d
Compare
|
Rebased. Still need to address the comments, but I will need to investigate some test flakiness first. |
bump_channel_funding_fee only bumps a pending splice's fee; it cannot change the splice amount, so phrasing it as an alternative to splicing was misleading. Co-Authored-By: Claude <noreply@anthropic.com>
Channel-opening and splice transactions transition to Succeeded when ChannelReady fires, not after ANTI_REORG_DELAY confirmations. This matches the point at which the Lightning layer considers the channel usable: a zero-conf channel graduates as soon as its counterparty signals, and a high-conf channel waits however many confirmations the peer requires, rather than always stopping at six. For splice RBF, the payment records whichever candidate actually confirmed, with that candidate's amount and this node's share of the fee — not the fee-estimate used for weight at coin-selection time, and not the whole-tx fee for a multi-contributor splice. A channel closure whose funding or splice never confirmed discards its payment record instead of leaving it pending forever. Generated with assistance from Claude Code. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The TxReplaced wallet event rebuilt the payment record from scratch, dropping its funding details. When a wallet sync fell between a splice broadcast and its RBF, the replacement of the original candidate cleared those details, so the payment no longer graduated to Succeeded on ChannelReady. Funding records are managed by the classify path and the Lightning lifecycle handlers, so leave them untouched on replacement. Co-Authored-By: Claude <noreply@anthropic.com>
bump_fee_rbf accepted channel-funding and splice payments because they are recorded as outbound, unconfirmed on-chain payments. Replacing such a transaction via wallet RBF would broadcast one LDK isn't tracking, and for splices the shared input can't be wallet-signed. Reject these and leave splice fee-bumping to rbf_channel. Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Once ChannelReady graduates a funding payment to Succeeded (removing its pending record), a concurrent wallet-sync event for one of its candidates could re-stamp the old Pending status and revert the graduation. Treat a Succeeded on-chain payment as terminal during sync, refining only the confirmation status of the candidate that locked. Co-Authored-By: Claude <noreply@anthropic.com>
Previously the BroadcasterInterface implementation wrote the payment record synchronously when LDK invoked it. With a remote KV store this could block LDK's message handling for hundreds of milliseconds per call, noticeably during force-close bursts or splice broadcasts. Persistence now happens asynchronously and must complete before the transaction is sent to the chain client. If persistence fails, the broadcast is dropped: a payment record must exist for every on-chain tx we emit, otherwise a crash could leave the tx confirmed with no matching record. Generated with assistance from Claude Code. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
f78025d to
328bdc3
Compare
|
🔔 4th Reminder Hey @tnull @benthecarman! This PR has been waiting for your review. |
|
🔔 1st Reminder Hey @tnull @benthecarman! This PR has been waiting for your review. |
| /// # Experimental API | ||
| /// | ||
| /// This API is experimental and may change in the future. | ||
| pub fn bump_channel_funding_fee( |
There was a problem hiding this comment.
Is the plan for ldk-node to eventually monitor pending splice candidates and automatically bump them? It seems a bit awkward to make users/applications notice that a pending splice is stuck and manually call this API.
There was a problem hiding this comment.
Wonder how this method can be used to bump the fee much higher and not only to the next step. Invoke repeatedly while they monitor what the current pending splice fee rate is?
There was a problem hiding this comment.
Is the plan for ldk-node to eventually monitor pending splice candidates and automatically bump them? It seems a bit awkward to make users/applications notice that a pending splice is stuck and manually call this API.
Hmm, maybe. We previously considered that for regular onchain payments but for now opted to keep it manual. But indeed some auto-bumping could make sense, though we'd of course need to work out a sane API so the user can clearly limit how much they are willing to spend on fees..
There was a problem hiding this comment.
Manual is a good first step of course, even though not being able to provide a fee rate to this call might feel uncontrolled.
Eventually I don't think you can ask an end-user to make any kind of decision about how and when to RBF.
There was a problem hiding this comment.
Thanks, and excuse the considerable delay here!
Generally looks good, but I'm a bit confused about the second commit - so far we use the anti reorg delay just to avoid necessarily having to implement full reorg protection logic for the payment store. It seems mixing concepts here is a bit odd - is there a way to avoid tieing the onchain 'payment's status to anything channel specific?
Also recently put up some long planned refactors over at #937, in particular there we also tried getting rid of duplicating the full payment details in the pending payment store and rather keep the payment store the authoritative store. Any thoughts on whether this would be compatible with the changes in this PR?
Feel free to squash fixups, btw.
| /// | ||
| /// # Experimental API | ||
| /// | ||
| /// This API is experimental and may change in the future. |
There was a problem hiding this comment.
Given we don't offer API stability in general, what does concretely mean beyond the API signature possibly changing going forward? If that's all that's already expected at this point, so maybe we can drop this note?
| let mut psbt = Psbt::from_unsigned_tx(unsigned_tx).map_err(|e| { | ||
| log_error!(self.logger, "Failed to construct PSBT: {}", e); | ||
| })?; | ||
| // Use list_output rather than get_utxo to include outputs spent by unconfirmed |
There was a problem hiding this comment.
This comment won't make sense after this is merged as it only annotates the transition from list_output to get_utxo. Would be better to explain what's happening and why we're using list_output in particular, not only that we do.
| pub conflicting_txids: Vec<Txid>, | ||
| /// Set when the payment's transaction is an interactive-funding broadcast (channel | ||
| /// open or splice). The record transitions to [`PaymentStatus::Succeeded`] on | ||
| /// `ChannelReady` instead of after [`ANTI_REORG_DELAY`] confirmations. |
There was a problem hiding this comment.
Hmm, could you expand on why this is necessary? It seems a rather unintuitive change given we only require ANTI_REORG_DELAY so our payment store doesn't necessarily need full reorg logic. Not quite sure I'm following why the status of the 'payment' / transaction should be tied to anything to do with channels?
| ) -> Result<BroadcastPackage, Error> { | ||
| let wallet_opt = self.wallet.lock().expect("lock").as_ref().and_then(Weak::upgrade); | ||
| if let Some(wallet) = wallet_opt { | ||
| tokio::task::spawn_blocking(move || { |
There was a problem hiding this comment.
Now that we made everything async internally, we should be able to make this a spawn, no?
| if let Some(channel_details) = | ||
| open_channels.iter().find(|c| c.user_channel_id == user_channel_id.0) | ||
| { | ||
| let min_feerate = |
There was a problem hiding this comment.
Codex:
- Medium: /home/tnull/workspace/ldk-node-pr-888/src/lib.rs:1826 computes the current ChannelFunding fee estimate, but /home/tnull/workspace/ldk-node-pr-888/
src/lib.rs:1845 to rbf_prior_contribution. In LDK that means “use the minimum RBF feerate”, so this API ignores the current fee estimate when fees rise. It
can also fail if the current estimate fell enough that max_feerate = estimate * 1.5 is below the required RBF minimum. This should choose an explicit
target like max(current_estimate, min_rbf_feerate) and set the max from that target.
| pending.details.fee_paid_msat = aggregate.fee_paid_msat; | ||
| } | ||
|
|
||
| // Preserve the confirmation status already on the record (set by wallet sync if |
There was a problem hiding this comment.
Codex:
- Medium: /home/tnull/workspace/ldk-node-pr-888/src/wallet/mod.rs:1373 can graduate a funding payment to Succeeded while preserving an Unconfirmed on-chain
status, then removes the pending funding record at /home/tnull/workspace/ldk-node-pr-888/src/wallet/mod.rs:1382. After that, /home/tnull/workspace/ldk-
node-pr-888/src/wallet/mod.rs:1557 no longer recognizes it as a funding payment because it only checks pending_payment_store, and it only rejects
ConfirmationStatus::Confirmed. A user can hit a race where ChannelReady has fired but BDK has not yet updated the confirmation status, allowing plain
wallet RBF against a channel funding tx LDK already considers locked. Reject PaymentStatus::Succeeded here, or persist enough funding classification in the
main payment store to keep rejecting these after graduation.
|
🔔 2nd Reminder Hey @benthecarman! This PR has been waiting for your review. |
Adds a public
Node::rbf_channelAPI that replaces an in-flight splice transaction with a higher-feerate version.An RBF leaves multiple candidate splice transactions in flight; only one of them will confirm on chain, and the payment record needs to reflect whichever one did — its amount and this node's share of the fee. The transition from
PendingtoSucceededshould also match when the Lightning protocol actually considers the new channel state usable — the momentChannelReadyfires — rather than afterANTI_REORG_DELAYconfirmations, which doesn't fit the zero-conf extreme on one end or higher-confirmation-depth peer configurations on the other.To support those properties, this PR:
Every broadcast from LDK now triggers a store write. For a remote KV store, running that write on LDK's thread would block message handling for hundreds of milliseconds per call, so persistence happens asynchronously while still guaranteeing that the record is committed before the transaction reaches the chain client.
Based on #878.