Author's note 05-04-21: I no longer think the actual protocol presented below is practical. Although most of it could work (there are notes below showing what needs to be fixed), it doesn't provide a major improvement over the existing lightning design. The existing lightning design has an important advantage: you are able to set static keys for the balance outputs of each commitment transaction (so if Alice broadcasts her commitment transaction, Bob's balance goes straight to Bob's wallet). However, I do believe the ideas of revocable signatures and scorched earth punishments are applicable to lightning today, and I hope to expand on them in the future.
This is a proposal for a new symmetric channel construction that uses the key idea from a recent paper called "Generalized Bitcoin-Compatible Channels"[1] and refines it.
This is a significant refinement of the original proposal posted to the lightning-dev mailing list.
As presently specified the two parties in a lightning channel are assigned different commitment transactions. This transaction asymmetry is logically necessary for the protocol to identify which party broadcasted a commitment transaction and potentially punish them if the other party provides proof it has been revoked (i.e. knows the revocation key).
It would be simpler if we could have a unified set of transactions for a state rather than two sets of transactions representing the same off-chain state for each party. Riard first considered this problem in [8] while trying to add a punishment mechanism to eltoo[9] style channel updates. They proposed that you could identify the broadcasting party using witness asymmetry i.e. both parties broadcast the same transaction but have different witnesses. Unfortunately, the solution proposed is rather convoluted.
More recently in [1], Aumayr et al. introduced a much more elegant witness asymmetric solution using adaptor signatures. Instead of being assigned different transactions, the parties are assigned different adaptor signatures as witnesses for the same transaction. The adaptor signatures force the party broadcasting a commitment transaction to reveal a "publishing secret" to the other party. The honest party can then use this publishing secret along with knowledge of the usual revocation secret to punish the malicious party for broadcasting an old commitment transaction.
Improvements over Aumayr et al.[1]
Our protocol is a refinement of Aumayr et al.'s proposal in the following respects:
Aumayr et al. combine their idea with a "punish-then-split" mechanism similar to the original eltoo proposal. Unfortunately, it therefore inherits the same issue with staggering time-locks. BOLT 3 [4] describes why it avoids this problem as a design choice:
The reason for the separate transaction stage for HTLC outputs is so that HTLCs can timeout or be fulfilled even though they are within the to_self_delay delay. Otherwise, the required minimum timeout on HTLCs is lengthened by this delay, causing longer timeouts for HTLCs traversing the network.
There is further background discussion to this choice in [2] and Towns proposes a solution for eltoo in [3]. In short, the problem is that it creates a risk for the party that needs to fulfill an HTLC with the secret in time because they must wait for the relative time-lock before completing it. It is possible that this issue is not as severe as first thought and that the benefits of punish-then-split outweigh the costs. But to avoid dealing with this question we provide an alternate transaction structure that mimics the current lightning timeout structure.
In [1], each party must communicate a publication point for each commitment transaction. The secret key for the publication point is revealed if they publish the commitment transaction. However, in order to extract the secret this point must be recalled upon seeing the transaction witness. Clearly this leads to O(n) storage complexity where n is the number of commitment transactions.
We overcome this limitation with revocable signatures which preserve the main qualities of the mechanism but allow the victim to punish the revoked transaction with only static data.
A useful benchmark for our protocol is the current lightning network except with the naive addition of PTLCs with asymmetric PTLC-success and PTLC-timeout transactions. This means that adding a PTLC requires creating two new commitment transactions for each party, each with their own PTLC output and pre-signed transactions spending from them. An additional relative time-locked claim transaction needs to be signed in the case that the PTLC transaction spends towards the party who owns the commitment transaction (to give the other party time to punish). This leads to a total of 6 signatures exchanged.
Here is what we believe are the improvements our proposal offers over this:
- Conceptually simpler: Internally there are half the number of possible transactions and all outputs are a 2-of-2.
- Less PTLC communication: Since the commitment transactions and PTLC outputs are shared by both parties there is no need to sign two sets of PTLC transactions per PTLC -- only 4 signatures need to be sent.
- Eager punishment: as soon as a revoked transaction is broadcast the punishment can be applied to the channel without having to wait for the revoked transaction to be confirmed.
- Future compatible: Little fundamentally changes when moving between ECDSA, non-aggregated Schnorr and aggregated Schnorr (i.e. MuSig).
These improvements are admittedly minor. However, we believe this idea is worth exploring as an intermediate upgrade path due to its simplification of the transaction structure. Complexity is the number one enemy of security and also stifles the implementation of new features. The reduction in communication is also more pronounced in other protocols like Discreet Log Contracts[10] where many thousands of signatures need to be exchanged to add a protocol output to the commitment transaction.
A helpful dichotomy that distinguishes our approach is between revocable transactions (current lightning approach) and revocable signatures (our approach). Presently, in lightning channels a transaction is revoked by the party owning that transaction revealing its associated revocation secret. If the revoked commitment transaction is confirmed then all of its outputs may be taken by the other party by using their knowledge of the revocation secret.
In our witness asymmetric design a party will revoke their signature on a transaction rather than the transaction itself. Revoking a signature happens in the same way as a transaction: you reveal a revocation secret. Our goal is that when you broadcast a revoked signature you reveal your static private key. Knowledge of your static private key is enough to claim all the funds from the funding output, all commitment transaction outputs and any PTLC-success/timeout outputs. This has an interesting implication: just by seeing the revoked signature attached to an old commitment transaction in your mempool, you can now claim the funding output immediately without waiting for the revoked transaction to confirm.
To fully emulate the coin access structure of current lightning channels for any given state (e.g. if the other party broadcasted the commitment transaction you can take your balance right away) we must extend our notion of revocable signatures.
Each revocable signature needs to be able to be anticipated.
Note that typical Schnorr signatures have this property: given a message m, a public key X and a nonce R, we can anticipate the signature s as S = s*G = R + H(R || X || m) * X.
This allows us to use the revelation of s to reflect a state in the protocol.
This anticipated signature essentially replaces the role of the "publishing secret" from [1] since we use the revelation of the underlying signature to identify which party published the commitment transaction.
The main difference is that anticipated signatures do not require storing a "publishing point" per commitment transaction to enact the punishment.
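As a rough illustration of anticipating a Schnorr signature, the sketch below computes S from public data only and then checks that the real signature s satisfies s*G = S. It assumes the standard Schnorr verification equation s*G = R + H(R || X || m) * X, a toy pure-Python secp256k1 implementation, and plain SHA-256 as the hash; the helper names (`add`, `mul`, `H`, `anticipate`, `sign`) and the fixed toy secrets are illustrative choices, not part of the proposal.

```python
import hashlib

# secp256k1 parameters (field prime, group order, generator)
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(p1, p2):
    """Add two curve points; None is the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def mul(k, pt):
    """Scalar multiplication by double-and-add."""
    acc = None
    k %= N
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

def H(*items):
    """Hash points and byte strings to a scalar (plain SHA-256, illustrative only)."""
    data = b"".join(
        i if isinstance(i, bytes) else i[0].to_bytes(32, "big") + i[1].to_bytes(32, "big")
        for i in items)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def anticipate(R, X, m):
    """Anticipated signature point S = R + H(R || X || m) * X (public data only)."""
    return add(R, mul(H(R, X, m), X))

def sign(x, r, m):
    """Schnorr signature scalar s = r + H(R || X || m) * x."""
    R, X = mul(r, G), mul(x, G)
    return (r + H(R, X, m) * x) % N

# Toy secrets for illustration only; real keys and nonces must be random.
x, r = 0x1234, 0x5678
m = b"commitment transaction"
X, R = mul(x, G), mul(r, G)

S = anticipate(R, X, m)   # anyone can compute this before the signature exists
s = sign(x, r, m)         # only the signer can produce this
print(mul(s, G) == S)
```

Because S is known in advance, it can be used as a public key whose secret key only becomes known once the signature s is revealed.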
Revocable signatures for non-aggregated multi-signatures can be implemented using single signer adaptor signatures.
This applies to OP_CHECKMULTISIG with Bitcoin as it is today or Tapscript Schnorr 2-of-2.
Let Ra be Alice's revocation key and A and B be Alice's and Bob's static public keys respectively.
Bob can create a revocable signature for Alice by giving her an adaptor signature under B encrypted by Ra + A (i.e. Alice will reveal the discrete log of Ra + A if she decrypts and broadcasts the signature).
For Alice to revoke her signature to Bob she reveals the discrete log of Ra to Bob.
Now if Alice were to decrypt and broadcast the signature, Bob can extract the discrete logarithm of Ra + A from it and, since he already knows the discrete logarithm of Ra, subtract that to learn the discrete logarithm of A (i.e. her static secret key).
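The flow above can be sketched with a single-signer Schnorr adaptor signature. This is a minimal illustration, assuming a common Schnorr adaptor construction (adaptor scalar s' = k + c*b with public nonce R = k*G + Y, decrypted as s = s' + y), a toy pure-Python secp256k1, and fixed toy secrets; none of the helper or variable names come from the proposal itself.

```python
import hashlib

# secp256k1 parameters (field prime, group order, generator)
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(p1, p2):
    """Add two curve points; None is the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def mul(k, pt):
    """Scalar multiplication by double-and-add."""
    acc = None
    k %= N
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

def H(*items):
    """Hash points and byte strings to a scalar (plain SHA-256, illustrative only)."""
    data = b"".join(
        i if isinstance(i, bytes) else i[0].to_bytes(32, "big") + i[1].to_bytes(32, "big")
        for i in items)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

# Toy secrets for illustration only.
a, ra = 0xA11CE, 0x5EC2E7        # Alice: static key and revocation key
b, k = 0xB0B, 0xC0FFEE           # Bob: static key and signing nonce
A, Ra, B = mul(a, G), mul(ra, G), mul(b, G)
m = b"commitment transaction i"

# Bob encrypts his signature under Y = Ra + A.
Y = add(Ra, A)
R = add(mul(k, G), Y)            # nonce of the final (decrypted) signature
c = H(R, B, m)
s_adaptor = (k + c * b) % N

# Alice checks the adaptor signature: s'*G + Y must equal R + c*B.
adaptor_ok = add(mul(s_adaptor, G), Y) == add(R, mul(c, B))

# Alice decrypts with the discrete log of Ra + A; (R, s) is valid under B.
s = (s_adaptor + ra + a) % N
signature_ok = mul(s, G) == add(R, mul(c, B))

# Bob sees s on-chain. He learns ra + a, and after revocation he also knows ra.
leaked = (s - s_adaptor) % N
extracted_a = (leaked - ra) % N
print(adaptor_ok, signature_ok, extracted_a == a)
```

Before Alice reveals ra, the value `leaked` only tells Bob the sum ra + a, which is why broadcasting an unrevoked signature does not expose her static key.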
Although ECDSA has adaptor signatures [6] they are difficult to anticipate because of the modular inversion done on the secret nonce.
Instead we can just anticipate the revelation of Ra + A (in the above example) to know when Alice has broadcasted her revocable (but not necessarily revoked) signature.
Making a key aggregated signature revocable is rather straightforward -- the revocation key is simply the nonce you used to create the signature.
Let (a, A), (b, B) and (ra, Ra), (rb, Rb) be the static key-pairs and nonce pairs for a particular two-party signing protocol execution of Alice and Bob respectively.
We omit the protocol to actually produce these signatures for now.
Imagine that Alice receives s = ra + rb + H(A + B || Ra + Rb || m) * (a + b) as the joint signature from the signing protocol (and keeps it secret from Bob).
Clearly Alice can verifiably revoke the signature by revealing ra to Bob, since if Bob were to see the signature afterwards he could use rb and b (which he already knows) to extract a as
a = (s - ra - rb)/H(A + B || Ra + Rb || m) - b
The anticipation point for an aggregated 2-of-2 is simply s*G = Ra + Rb + H(A + B || Ra + Rb || m) * (A + B).
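The arithmetic for the aggregated case can be checked with a small sketch. It assumes the naive key sum A + B exactly as written in the text (not a full MuSig key aggregation), the joint signature s = ra + rb + H(A + B || Ra + Rb || m) * (a + b), a toy pure-Python secp256k1 and fixed toy secrets. The interactive signing protocol is omitted here too, so s is computed directly from both secrets purely for demonstration.

```python
import hashlib

# secp256k1 parameters (field prime, group order, generator)
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(p1, p2):
    """Add two curve points; None is the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def mul(k, pt):
    """Scalar multiplication by double-and-add."""
    acc = None
    k %= N
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

def H(*items):
    """Hash points and byte strings to a scalar (plain SHA-256, illustrative only)."""
    data = b"".join(
        i if isinstance(i, bytes) else i[0].to_bytes(32, "big") + i[1].to_bytes(32, "big")
        for i in items)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

# Toy secrets for illustration only.
a, ra = 0xA11CE, 0x1111          # Alice: static key and nonce
b, rb = 0xB0B, 0x2222            # Bob: static key and nonce
A, Ra, B, Rb = mul(a, G), mul(ra, G), mul(b, G), mul(rb, G)
m = b"commitment transaction i"

c = H(add(A, B), add(Ra, Rb), m)

# Joint signature that Alice ends up holding.
s = (ra + rb + c * (a + b)) % N

# Anticipation point, computable by either party from public data.
anticipated = add(add(Ra, Rb), mul(c, add(A, B)))

# Alice revokes by revealing ra. If Bob later sees s he solves for a.
extracted_a = ((s - ra - rb) * pow(c, -1, N) - b) % N
print(mul(s, G) == anticipated, extracted_a == a)
```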
Given either of the above revocable signature schemes we propose a new witness asymmetric channel construction.
We describe the protocol with respect to two parties Alice and Bob.
2-of-2(A,B) refers to a multi-signature output that requires the owners of A (Alice) and B (Bob) to authorize it.
The multi-signature scheme needs to be revocable as described above.
In practice 2-of-2(A,B) should be interpreted as some deterministic randomization of A,B so that all 2-of-2(A,B) outputs do not look the same; however, we omit this for clarity.
We refer to the anticipated signature of Alice and Bob's revocable signature on the commitment transaction as publicationA_i and publicationB_i respectively.
These are revealed when Alice or Bob broadcasts their revocable (but not necessarily revoked) signature.
The structure of channel funding does not change much from the current BOLT spec.
Before creating the Fund transaction the two parties exchange their static public keys they will use for the channel (A and B).
The Fund transaction spends from the funding party's (or funding parties') outputs to a 2-of-2(A,B).
A commitment transaction represents a state of the channel.
They each spend from the 2-of-2(A,B) on the Fund transaction.
A commitment transaction has two balance outputs and zero or more PTLC outputs.
Before signing the ith commitment transaction the parties must exchange revocation public keys Ra_i and Rb_i.
Each commitment transaction is signed by both parties so that each party has a single (but different) revocable signature on it, e.g. Alice has a revocable signature on the commitment transaction with Ra_i as the revocation key.
The two balance outputs assign some proportion of the funds exclusively to each party. Each one has a scriptPubkey of:
- Alice: 2-of-2(A, publicationB_i)
- Bob: 2-of-2(publicationA_i, B)
Author's note 05-04-21: This doesn't work in the case of Schnorr aggregated 2-of-2 since the publication points are a function of the commitment transaction itself (and so can't be used in computing an output).
Each Balance output has a relative time locked claim transaction spending it to an address owned by the deserving party.
Note that if Alice publishes a valid state she reveals publicationA_i and Bob can take his funds immediately.
As soon as Bob sees a revoked signature attached to an old commitment transaction published by Alice he learns the secrets for both publicationA_i and A so can immediately take both his and Alice's balance outputs.
PTLC outputs are added directly to the commitment transaction. Each PTLC has two pre-signed transactions spending from it, PTLC-success and PTLC-timeout, which both spend to a single destination address. Broadcasting the PTLC-success transaction reveals the PTLC secret in the usual way using an adaptor signature. The PTLC-timeout transaction has an absolute time-lock on it through the nLockTime transaction value. Only the party which stands to gain from a PTLC transaction receives a signature on it, i.e. if Alice is offering a PTLC to Bob then Alice will have a signature on the PTLC-timeout transaction and Bob will have a signature on the PTLC-success transaction.
A PTLC output scriptPubkey is a simple combination of the static keys: 2-of-2(A,B).
For a PTLC output being offered by Alice to Bob the outputs of the PTLC transactions are:
- PTLC-success: 2-of-2(publicationA_i, B)
- PTLC-timeout: 2-of-2(A, publicationB_i)
Note the following scenarios:
- Alice confirms a valid commitment transaction and the PTLC-success transaction confirms first: Bob immediately claims the funds since he knows both publicationA_i and B.
- Alice confirms a valid commitment transaction and the PTLC-timeout transaction confirms first: Alice waits for the relative time-lock of her claim transaction and then broadcasts that.
- Alice confirms a revoked commitment transaction: Bob can take the PTLC output right away since it's a 2-of-2(A,B) and he can extract the secret for A from the revoked signature attached to the revoked transaction.
- Alice confirms a revoked commitment transaction and the PTLC-timeout transaction: Bob can take the PTLC-timeout output since he can extract A from the revoked signature attached to the revoked transaction.
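The scenarios above, together with the balance outputs, can be summarized as a small model of which secrets Bob knows and which outputs he can therefore spend with keys alone (i.e. without a pre-signed transaction). This is only a sketch of one reading of the text: the output labels are hypothetical names, and it assumes Bob can always derive the secret behind his own publicationB_i since it comes from his own signature.

```python
# Secrets required to spend each output with keys alone, for a PTLC
# offered by Alice to Bob in state i. Labels are illustrative only.
OUTPUT_KEYS = {
    "funding": {"A", "B"},
    "alice_balance": {"A", "publicationB_i"},
    "bob_balance": {"publicationA_i", "B"},
    "ptlc": {"A", "B"},
    "ptlc_success_output": {"publicationA_i", "B"},
    "ptlc_timeout_output": {"A", "publicationB_i"},
}

def bob_secrets(alice_broadcast, alice_revoked):
    """Secrets Bob holds after (possibly) seeing Alice's signature."""
    # Assumption: Bob always knows his own static key and can derive the
    # secret behind his own publication point.
    known = {"B", "publicationB_i"}
    if alice_broadcast:
        # Broadcasting her revocable signature reveals publicationA_i.
        known.add("publicationA_i")
        if alice_revoked:
            # A revoked signature additionally leaks her static key A.
            known.add("A")
    return known

def bob_can_sweep(alice_broadcast, alice_revoked):
    """Outputs whose required secrets are all known to Bob."""
    known = bob_secrets(alice_broadcast, alice_revoked)
    return {name for name, keys in OUTPUT_KEYS.items() if keys <= known}

nothing_broadcast = bob_can_sweep(False, False)
valid_state = bob_can_sweep(True, False)
revoked_state = bob_can_sweep(True, True)

print(sorted(valid_state))
print(sorted(revoked_state))
```

In this model a valid broadcast by Alice lets Bob take only the outputs locked to publicationA_i and B, while a revoked broadcast lets him take every output including the funding output, matching the eager punishment described earlier.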
The implications of switching the direction of the PTLC are straightforward.
In the above structure there is no reason why the static keys A and B need to be chosen per channel.
Instead they can be the main node static keys which define Alice and Bob's identity on the network.
Losing this secret key is a "scorched earth" punishment: it applies to all channels with all peers and means the peer can never re-use their identity again!
The weakness of only punishing the funds in a single channel is that you have to ensure that there is enough in the channel to create a reasonable deterrent. This necessarily means freezing some portion of the funds. From BOLT 2:
The channel reserve is specified by the peer's channel_reserve_satoshis: 1% of the channel total is suggested. Each side of a channel maintains this reserve so it always has something to lose if it were to try to broadcast an old, revoked commitment transaction. Initially, this reserve may not be met, as only one side has funds; but the protocol ensures that there is always progress toward meeting this reserve, and once met, it is maintained.
With this scheme there would be no need for channel_reserve_satoshis in the majority of cases.
Honest parties would simply have to ensure that their peers have a lot to lose across the whole network.
If misbehavior occurs -- even if it is only observed long after the funds have been stolen -- then the wronged peer will be able to extract the peer's secret key from the on-chain transactions and publish it publicly.
Any other peers with channels with the malicious peer could then clean sweep all the channel funds.
Small users could rest easy knowing that if their Bitcoin were stolen through a revoked transaction while they were offline, whenever they came back online they would be able to publish their peer's secret key.
As long as the cheating peer is a somewhat well-established node on the network, the loss of their identity and other channel funds should overwhelm whatever gain they could get from the small channel.
To an extent this obviates the need for 3rd party watchtowers as long as you are only making channels with well established nodes with a lot to lose.
It's important to note that scorched earth punishments do not require the precise channel structure above but simply that revocable signatures are used as the revocation mechanism and the node's static key is used appropriately in transaction outputs. It could even be applied to transaction asymmetric designs (i.e. lightning as it is today).