Abstract: This article describes the design of an opcode recently added to the Bitcoin Cash scripting language called “OP_CHECKDATASIG” [1]. This opcode was activated on the Bitcoin Cash network during the November 2018 network upgrade. It allows Script to validate arbitrary messages from outside the blockchain, opening up many exciting possible use cases (see Appendix A).
An earlier version of this article was originally published at Yours.org [11].
Background
When someone sends a Bitcoin Cash transaction, they sign it to prove to the network that the owner of the private key authorizes the transaction. The signature is typically verified using an opcode called “OP_CHECKSIG” that calculates a hash based on a portion of the data in the transaction, and checks the signature against that. The signature is equivalent to a contract that says the owner of the private key authorizes the transaction. OP_CHECKSIG has a few different ways it can hash the transaction (known as the “Sighash”), to allow different conditions on the transfer. In general, however, it is equivalent to a contract that only defines the transfer of money. But what if you could also sign other pieces of information in the transaction? This would allow other information to be included and signed as part of the transfer contract. This is the idea behind OP_CHECKDATASIG.
One way to think of OP_CHECKDATASIG is as an un-bundling of OP_CHECKSIG. The design of OP_CHECKSIG bundles two distinct concepts together: calculating the Sighash, and checking the signature of that hash. If we imagine re-implementing OP_CHECKSIG as two instructions, OP_CHECKDATASIG would be the second instruction. It simply checks a signature against a supplied message and public key. This unbundling means that OP_CHECKDATASIG is not designed for any particular use-case, but can instead be used flexibly for any use that requires checking a signature against a supplied message. A non-exhaustive list of uses that have emerged so far can be found in Appendix A.
The Story
The motivation behind OP_CHECKDATASIG was to continue the process of improving the Bitcoin Cash Script language. As such, the instruction is intended to be a generic and flexible building block with many potential uses.
OP_CHECKDATASIG is based on Andrew Stone’s OP_DATASIGVERIFY proposal [2]. The original motivation behind the opcode was to be able to validate signatures from an “oracle” on messages from outside the blockchain. This use case has been described in an article by Andrew Stone [3]. After the original proposal was made, it went through a series of changes and design refinements based on review and discussion by stakeholders and subject-matter experts [4].
Overall, the theme of the changes was to make the design mirror existing opcodes more closely, and to make no assumptions about how it would be used. This makes it a more conservative and minimal change than the original proposal. For example, the OP_DATASIGVERIFY proposal used a pubkey-recoverable signature encoding, different from OP_CHECKSIG. Though the design choices of the original proposal may have had certain advantages, in general the reviewers preferred an approach of sticking close to the design of what is already in the protocol. This helps lower risk by creating minimal change in the implementation. All the little quirks of OP_CHECKSIG are well understood, having been battle-tested on the blockchain for many years. Sticking to the same underlying primitives keeps the design in well-understood and safe territory. As a result, the implementation details of OP_CHECKDATASIG are very close to those of OP_CHECKSIG [5].
An interesting thing happened on this journey, however. Though the driving motivations of the design changes were conservatism and safety, making the implementation mirror OP_CHECKSIG led to the discovery of some novel potential use-cases.
The original OP_DATASIGVERIFY proposal took a message of any size as input, then hashed it before checking the signature. Through the review process, following the idea of structuring OP_CHECKDATASIG as the second step in OP_CHECKSIG, it seemed like a nice design to make the opcode take a hash value as input. This also fit with the philosophy of keeping the opcode as a simple minimal building block.
Upon further review, it was realized that passing in a value to be signed without hashing allows the opcode to be used in a potentially insecure manner. This problem was identified by Andrew Stone on June 16th, as well as by other reviewers. It would not be a problem if the opcode were used properly, but it made it possible for careless Script authors to use it insecurely. So the design was changed back to hashing the input with double-SHA256, which is the standard hash method in Bitcoin.
However, in the period where the design did not hash the input, some people (awemany in particular) realized something interesting: the fact that the opcode did not hash the input meant that you could pass in a Sighash value from another transaction as the message, and the signatures would match. This means that it becomes possible for OP_CHECKDATASIG to test whether it has been supplied with the valid signature for a completely different transaction.
Changing the opcode to do a double-SHA256 hash on the input would still allow this sighash-as-message use, but now you would have to pass in the full pre-image of the hash, which is the serialized transaction data. This data would typically be hundreds of bytes, and could easily be more than 520 bytes for large transactions. Since pushing data onto the stack is limited to 520 bytes, this would prevent large transactions from being passed into the opcode. Luckily, the reviewers found a solution that threaded the needle between both options, yielding both safety and convenience: do a single-SHA256 hash on the input.
Doing a single-SHA256 means the input is hashed, so it is secure, but it also means that a partial sighash which has only gone through one round of SHA256 can be passed in (only 32 bytes on the stack, no matter the size of the original transaction), which when hashed again with another round will yield the full sighash.
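The hashing arithmetic described above can be checked with a small sketch. Everything in it is illustrative rather than taken from any node implementation: the toy ECDSA routines (`sign_digest`, `verify_digest`), the `check_data_sig` model of the opcode, the private key, and the placeholder pre-image bytes are all assumptions, and the nonce derivation is not safe for real use. It shows that a signature made over a double-SHA256 Sighash also verifies when the 32-byte single-SHA256 of the pre-image is supplied as the message to a check that hashes its input once.

```python
import hashlib

# secp256k1 domain parameters (the curve used for Bitcoin Cash signatures).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def point_mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def sign_digest(priv, digest):
    """Toy ECDSA signing with a hash-derived nonce. NOT safe for real use."""
    z = int.from_bytes(digest, "big")
    seed = hashlib.sha256(priv.to_bytes(32, "big") + digest).digest()
    k = int.from_bytes(seed, "big") % N or 1
    r = point_mul(k, G)[0] % N
    s = pow(k, -1, N) * (z + r * priv) % N
    return (r, s)


def verify_digest(pub, digest, sig):
    """Standard ECDSA verification of (r, s) against a 32-byte digest."""
    r, s = sig
    if not (0 < r < N and 0 < s < N):
        return False
    z = int.from_bytes(digest, "big")
    w = pow(s, -1, N)
    pt = point_add(point_mul(z * w % N, G), point_mul(r * w % N, pub))
    return pt is not None and pt[0] % N == r


def check_data_sig(sig, message, pub):
    """Model of the opcode's check: hash the message once, then verify."""
    return verify_digest(pub, hashlib.sha256(message).digest(), sig)


priv = 0x1234567890ABCDEF          # hypothetical private key
pub = point_mul(priv, G)

preimage = b"\x01" * 600           # placeholder for serialized transaction data
sighash = hashlib.sha256(hashlib.sha256(preimage).digest()).digest()
tx_sig = sign_digest(priv, sighash)  # what would be signed for OP_CHECKSIG

partial = hashlib.sha256(preimage).digest()   # one round only: always 32 bytes
fits_on_stack = len(partial) <= 520 < len(preimage)
accepted = check_data_sig(tx_sig, partial, pub)
rejected = check_data_sig(tx_sig, preimage, pub)  # one round over the raw pre-image
```

Here `accepted` is true because hashing `partial` once reproduces the full Sighash, while `rejected` is false because a single round over the raw pre-image yields a different digest. The 600-byte pre-image would not fit in a single 520-byte push, but `partial` always does.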
Mark B. Lundeberg also noticed that PGP signatures use a single-SHA256. So moving to single-SHA256 makes the signature compatible with PGP, also opening up other potential uses.
These capabilities open up exciting possibilities. It means that OP_CHECKDATASIG can make spendability dependent on a completely separate, unrelated transaction being signed. This can work even for transactions on different blockchains, as long as the transaction signing algorithm is compatible with Bitcoin Cash. This group includes Bitcoin, Litecoin, Dash, and ZCash. I will leave it to people more creative than I to find novel ways to use this capability. There have already been several ideas floated, such as atomic digital goods purchase and double-spend prevention [6, 7]. A list of some uses that have emerged so far is provided in Appendix A, and I look forward to many new uses that haven’t yet been thought of.
Conclusion
The addition of OP_CHECKDATASIG adds a useful new capability to the Bitcoin Cash scripting language: the ability to validate messages from outside the blockchain. By reaching out and engaging subject matter experts and different stakeholders, the initial proposal was modified to address concerns. Through this process, not only did it retain the core utility, but the capabilities were expanded with more exciting potential uses. The modifications made the new version a more conservative change that sticks closer to the existing system, and also a more flexible tool with novel capabilities.
Acknowledgements
My (Antony Zegers aka Mengerian) role in this process was to reach out to reviewers and coordinate technical discussion and feedback. The information in this article, and the design of the opcode, are a synthesis of the ideas and input of all the reviewers.
I would like to thank Andrew Stone for making the original proposal, Amaury Sechet for coding the first complete implementation in Bitcoin ABC, and all the reviewers, including Clemens Ley, Chris Pacia, Amaury Sechet, Andrew Stone, Mark B. Lundeberg, awemany, and others for their contributions.
Postscript 1
It turns out that the “OP_CHECKSIGFROMSTACK” opcode in Blockstream’s Elements project is very similar to OP_CHECKDATASIG [8]. Even minor details, such as the order of inputs on the stack, are the same. I was personally unaware of OP_CHECKSIGFROMSTACK until a reddit post appeared accusing us of copying it [9]. The reddit post appeared near the end of the design process, during the discussion about using a single rather than double SHA256 hash, so that choice was made with the knowledge that OP_CHECKSIGFROMSTACK also uses a single-SHA256 hash on the input. The similarity of the rest of the implementation details, however, seems to be a case of convergent design. The fact that the two implementations independently arrived at such a similar outcome gives some reassurance that the design choices make sense.
Postscript 2
Another interesting event indirectly related to this opcode was awemany’s discovery of the severe vulnerability CVE-2018-17144 [10]. This vulnerability was one of the most catastrophic bugs ever discovered in Bitcoin. Although the vulnerability has no direct connection to OP_CHECKDATASIG, it is interesting that it was discovered by awemany while implementing OP_CHECKDATASIG for Bitcoin Unlimited, porting much of it over from Bitcoin ABC. The bug was discovered while he was digging into the code to understand divergences between the codebases. It is often thought that adding features creates a risk of adding bugs, so it is interesting that the process of adding an opcode actually led to the elimination of a bug. This seems to be a testament to good development practices. This event also provides an illustration of the benefit of having multiple independent implementation groups.
References
[1] OP_CHECKDATASIG specification https://github.com/bitcoincashorg/bitcoincash.org/blob/master/spec/op_checkdatasig.md
[2] Andrew Stone’s OP_DATASIGVERIFY https://github.com/BitcoinUnlimited/BitcoinUnlimited/blob/bucash1.3.0.0/doc/opdatasigverify.md
[3] Andrew Stone’s article on Scripting https://medium.com/@g.andrew.stone/bitcoin-scripting-applications-decision-based-spending-8e7b93d7bdb9
[4] Initial Peer Review of Proposal https://github.com/bitcoincashorg/bitcoincash.org/pull/10
[5] OP_CHECKSIG https://en.bitcoin.it/wiki/OP_CHECKSIG
[6] Use of OP_CHECKDATASIG for atomic digital good purchase https://www.reddit.com/r/btc/comments/96fxvy/op_checkdatasig_is_copying_blockstream_and_is/e4520xa/
[7] Use of OP_CHECKDATASIG for double-spend fraud assurance https://bitco.in/forum/threads/gold-collapsing-bitcoin-up.16/page-1213#post-75916
[8] Elements project “CHECKSIGFROMSTACK” https://elementsproject.org/features/opcodes
[9] Reddit post on OP_CHECKSIGFROMSTACK https://www.reddit.com/r/btc/comments/96fxvy/op_checkdatasig_is_copying_blockstream_and_is/
[10] Awemany article “600 Microseconds” on CVE-2018-17144 https://medium.com/@awemany/600-microseconds-b70f87b0b2a6
[11] Original Yours.org article https://www.yours.org/content/the-story-of-op_checkdatasig-f79679d52b23
[12] Yours.org Subsidy article https://www.yours.org/content/dear-ryan--why-op_checkdatasig-is-not-a-subsidy-3c240f0b8f19
[13] Yours.org Subsidy response article https://www.yours.org/content/my-response-to-ryan%E2%80%99s-response--op_checkdatasig-is-not-a-subsidy-6cf3529516c8
Appendix A — List of OP_CHECKDATASIG Use Cases:
This is a non-exhaustive list of use-cases for OP_CHECKDATASIG that have emerged since it was created.
- Oracles: https://medium.com/@g.andrew.stone/bitcoin-scripting-applications-decision-based-spending-8e7b93d7bdb9
- Zero-Conf Forfeits: https://gist.github.com/awemany/619a5722d129dec25abf5de211d971bd
- Digital Good Purchase via PGP Signature: https://gist.github.com/markblundeberg/af59d7cd234cbdb14dcf9e00f0ea2c17
- Pay to ID: https://gist.github.com/markblundeberg/bd28871548108fc66d958018b1bde085
- Cold Wallet Timeout (OP_CHECKSIGFROMSTACK): https://bitcoinops.org/en/newsletters/2018/10/09/
- Enforced multi-sig signing order: If the Redeem Script is [OP_2 OP_PICK <pubKey2> OP_CHECKDATASIGVERIFY OP_DUP OP_HASH160 <pubKey1Hash> OP_EQUALVERIFY OP_CHECKSIG], then the scriptSig to spend it has to be [<sig1> <pubKey1> <sig2>], where <sig2> is a signature by the second key over <sig1> as the message. This means <sig2> can’t be created until after <sig1> has signed the transaction.
- Stablecoin: https://www.yours.org/content/futures-based--stable-coin--asset-solution-b784632e457f
- Covenants: https://fc17.ifca.ai/bitcoin/papers/bitcoin17-final28.pdf https://blockstream.com/2016/11/02/covenants-in-elements-alpha/
- Tokens: https://www.reddit.com/r/btc/comments/a0n8x1/native_bch_tokens_are_coming_thanks_to_op/
- Secure Multi-Party Computation: https://youtu.be/tGzsz_-oSss
- Chess on the Blockchain: https://tobiasruck.com/content/lets-play-chess-on-bch/
- SIGHASH_NOINPUT emulation using CHECKDATASIG covenants: https://gist.github.com/markblundeberg/79db7714c38bbba114dc192324f8382b
- Player v Player gaming: https://twitter.com/vinarmani/status/1109053882901192704
- Non-Custodial, Permissionless Inheritance: https://www.reddit.com/r/btc/comments/bbp3bv/announcement_introducing_last_will_smart_contract/
- On-chain SLP token auction: https://old.reddit.com/r/btc/comments/bhq3hx/announcing_slp_agora_a_decentralized_onchain_app/
- “Blind Escrow”: https://local.bitcoin.com/faq#how-does-the-escrow-script-work
- Recurring Payments: https://github.com/KarolTrzeszczkowski/Mecenas-recurring-payment-EC-plugin
- Spending Constraints: https://honest.cash/v2/pein_sama/spending-constraints-with-op_checkdatasig-172
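The “Enforced multi-sig signing order” entry in the list above can be traced with a toy stack machine. This is only a sketch under loud assumptions: `toy_sign` is a placeholder with no security at all (a hash of key and message), `hash160_stub` stands in for the real HASH160 so that only the standard library is needed, and the interpreter handles just the opcodes that appear in this one script.

```python
import hashlib


def hash160_stub(data):
    """Stand-in for HASH160 (assumption: truncated SHA256, not the real hash)."""
    return hashlib.sha256(data).digest()[:20]


def toy_sign(pub, message):
    """Placeholder 'signature': a hash of key and message. Not a real signature."""
    return hashlib.sha256(pub + message).digest()


TX_DIGEST = b"spending transaction"   # stands in for the transaction's Sighash


def run(script_sig, redeem_script):
    """Execute the scriptSig pushes, then the redeem script, on one stack."""
    stack = list(script_sig)
    for op in redeem_script:
        if isinstance(op, bytes):                 # data push
            stack.append(op)
        elif op == "OP_2":
            stack.append(2)
        elif op == "OP_PICK":                     # copy the item n deep to the top
            n = stack.pop()
            stack.append(stack[-1 - n])
        elif op == "OP_DUP":
            stack.append(stack[-1])
        elif op == "OP_HASH160":
            stack.append(hash160_stub(stack.pop()))
        elif op == "OP_EQUALVERIFY":
            if stack.pop() != stack.pop():
                return False
        elif op == "OP_CHECKDATASIGVERIFY":       # pops pubKey, message, signature
            pub, msg, sig = stack.pop(), stack.pop(), stack.pop()
            if sig != toy_sign(pub, msg):
                return False
        elif op == "OP_CHECKSIG":                 # pops pubKey, signature
            pub, sig = stack.pop(), stack.pop()
            stack.append(sig == toy_sign(pub, TX_DIGEST))
    return stack == [True]


pub1, pub2 = b"pubkey-1", b"pubkey-2"
redeem = ["OP_2", "OP_PICK", pub2, "OP_CHECKDATASIGVERIFY",
          "OP_DUP", "OP_HASH160", hash160_stub(pub1), "OP_EQUALVERIFY",
          "OP_CHECKSIG"]

sig1 = toy_sign(pub1, TX_DIGEST)      # first signer signs the transaction
sig2 = toy_sign(pub2, sig1)           # second signer signs sig1 itself

in_order = run([sig1, pub1, sig2], redeem)
out_of_order = run([sig1, pub1, toy_sign(pub2, b"guess")], redeem)
```

With the scriptSig [<sig1> <pubKey1> <sig2>], OP_2 OP_PICK copies <sig1> to the top so that it becomes the message checked against <sig2>. A second signature produced before <sig1> existed, modeled here by signing a guess, fails the OP_CHECKDATASIGVERIFY step.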
Appendix B: Design Details
This is a list of differences between OP_DATASIGVERIFY and OP_CHECKDATASIG:
- Signature format same as OP_CHECKSIG, rather than pubkey-recoverable signature.
- Takes pubkey as input, rather than the pubkeyhash.
- No “type” field with the signature. All signatures are treated as strict DER encoded ECDSA, or 64-byte Schnorr.
- Hashes the input with single-SHA256 rather than double-SHA256.
- Includes non-verify version, which does not immediately mark transaction as invalid if it fails, simply returns “False”. Returns “True” if successful. For verify behavior, use OP_CHECKDATASIGVERIFY.
- Does not leave input message on the stack, all three input values are removed.
- Order of inputs is different.
The following sections will expand on some of the reasoning for the changes.
Signature format
The original design of OP_DATASIGVERIFY used a pubkey-recoverable signature format similar to what is used in the signmessage/verifymessage RPC. All of the reviewers suggested making the signature format mirror the existing OP_CHECKSIG implementation.
The reason for sticking close to the OP_CHECKSIG format is largely to lower risk, and keep the implementation manageable. Since OP_CHECKSIG has been part of consensus for a long time, its characteristics are well understood. This means that the specifics of the encoding can be treated exactly the same as what is already there. It is possible that Andrew Stone’s suggested signature format had advantages, but the reviewers felt it also introduced potential unknowns. Issues such as potential malleability and sigops accounting would have taken significant amounts of work and study to resolve, and even then would have carried some risk simply by being different from what is already there. Mirroring OP_CHECKSIG closely allowed all the quirks such as low-s, nullfail, and sigops counting to be handled in exactly the same way, thus not introducing any unknowns.
This choice also means that the opcode has to take the public key as an input, rather than the public key hash.
No “type byte” field with the signature.
All signatures are treated as strict DER encoded ECDSA. At first glance, it may seem that to keep the signature format similar to OP_CHECKSIG, we may want to add a “type byte” at the end, in place of the sighash byte. The sighash byte, however, has nothing to do with the signature, it specifies how to process the transaction data to generate the hash that is to be checked against the signature. Since the message checked by OP_CHECKDATASIG comes from externally supplied data, it is unnecessary to have a flag specifying how the data is to be generated.
Any potential future migration to a new signature type such as Schnorr would have to accommodate the OP_CHECKSIG family of opcodes. Since they have no explicit provision for signature versioning, some method would have to be used that does not rely on signature version byte. This implies that there is little benefit to including a type byte for OP_CHECKDATASIG.
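The difference in signature layout can be illustrated with a short sketch. The helper names and the sample bytes below are hypothetical: the DER blob is only a structurally minimal placeholder, not a signature that would verify, and 0x41 is used as an example Sighash type byte (SIGHASH_ALL combined with the fork ID flag).

```python
def split_checksig_signature(sig_with_type):
    """OP_CHECKSIG convention: the final byte is the Sighash type,
    everything before it is the signature proper."""
    if not sig_with_type:
        raise ValueError("empty signature")
    return sig_with_type[:-1], sig_with_type[-1]


def checkdatasig_signature(sig):
    """OP_CHECKDATASIG convention: the whole push is the signature,
    so there is no trailing byte to strip."""
    return sig


# Placeholder DER structure (r = 1, s = 1); illustrative only.
der = bytes.fromhex("3006020101020101")

checksig_push = der + bytes([0x41])        # signature followed by Sighash byte
signature, sighash_type = split_checksig_signature(checksig_push)

datasig_push = der                         # signature only
data_signature = checkdatasig_signature(datasig_push)
```

Both opcodes end up handing the same signature bytes to the verifier; only OP_CHECKSIG needs the extra byte, because only it has to be told how to build the message from the transaction.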
Message hashed with single-SHA256
The original OP_DATASIGVERIFY proposal took a message of any size as input, then hashed it with double-SHA256 before checking the signature. Changing this to single-SHA256 is just as secure, and makes the opcode far more flexible by being compatible with Sighashes from other transactions, and other signature systems such as PGP.
Stack Handling
The reasoning for changing the order of inputs on the stack is that the message could be supplied by either the scriptSig or the scriptPubKey, depending on the use-case. For example, in the future it might be generated by an opcode in the scriptPubKey. Changing the order to [<signature>, <messageHash>, <pubKey>] makes it easy to accommodate both cases.
For similar reasons, it was decided to remove all inputs from the stack after execution, like all other opcodes do. It is easy to construct a script that leaves the message on the stack, as OP_DATASIGVERIFY did, using OP_OVER.
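The stack behavior described in this section can be sketched as follows. This is a simplified model, not the consensus code: `verify_digest` is a caller-supplied, hypothetical signature check, and details such as encoding rules and nullfail are left out.

```python
import hashlib


def op_checkdatasig(stack, verify_digest):
    """Pop <signature> <message> <pubKey> (pubKey on top), hash the message
    with one round of SHA256, and push True or False."""
    if len(stack) < 3:
        raise ValueError("OP_CHECKDATASIG needs three stack items")
    pubkey = stack.pop()
    message = stack.pop()
    signature = stack.pop()
    digest = hashlib.sha256(message).digest()
    stack.append(bool(verify_digest(pubkey, digest, signature)))


def op_checkdatasigverify(stack, verify_digest):
    """Same check, but fail the script instead of pushing False."""
    op_checkdatasig(stack, verify_digest)
    if not stack.pop():
        raise ValueError("script failed: signature check did not pass")


# Hypothetical verifier that records its arguments and accepts one fixed value.
seen = []

def fake_verify(pubkey, digest, signature):
    seen.append((pubkey, digest, signature))
    return signature == b"good-sig"


stack = [b"unrelated", b"good-sig", b"hello", b"pubkey"]
op_checkdatasig(stack, fake_verify)
# All three inputs are gone; only the result sits above the untouched item.
```

After the call, `stack` holds the unrelated item and the boolean result, which reflects the choice to consume all three inputs rather than leave the message behind.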
Implementation Notes
Disabled vs. Reserved Op Code numbers: Other op codes that have been re-enabled on Bitcoin Cash were formerly disabled. When these op codes were disabled, their use was disallowed from all transaction Scripts.
OP_CHECKDATASIG and OP_CHECKDATASIGVERIFY, on the other hand, use op code numbers that were never previously defined and were considered “reserved”. These reserved op codes were treated differently than disabled op codes, and could appear in a transaction script if they were in unexecuted IF branches. This has a few consequences:
- The opcode numbers for OP_CHECKDATASIG and OP_CHECKDATASIGVERIFY appear many times in the blockchain in unexecuted IF branches.
- Because of this, activation has to be handled differently than for the “re-enabled” opcodes (see https://reviews.bitcoinabc.org/D1563).
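The difference between disabled and reserved op codes can be sketched with a toy interpreter that models the rules as they stood before activation. The string labels, the tiny opcode set, and the function name are assumptions for illustration. OP_2MUL serves as the example of a disabled op code, and "0xba" and "0xbb" stand for the numbers later assigned to OP_CHECKDATASIG and OP_CHECKDATASIGVERIFY.

```python
DISABLED = {"OP_2MUL"}            # example of a disabled op code
UNDEFINED = {"0xba", "0xbb"}      # reserved numbers, undefined before activation


def passes_opcode_rules(script):
    """Return True if no opcode rule is violated (simplified, pre-activation)."""
    exec_stack = []               # one flag per open OP_IF
    stack = []
    for op in script:
        if op in DISABLED:
            return False          # fails wherever it appears in the script
        executing = all(exec_stack)
        if op == "OP_IF":
            exec_stack.append(bool(stack.pop()) if executing else False)
        elif op == "OP_ENDIF":
            exec_stack.pop()
        elif not executing:
            continue              # skipped branch: undefined op codes are ignored
        elif op in UNDEFINED:
            return False          # fails only when actually executed
        elif op == "OP_0":
            stack.append(0)
        elif op == "OP_1":
            stack.append(1)
    return not exec_stack         # every OP_IF must be closed


reserved_skipped = passes_opcode_rules(["OP_0", "OP_IF", "0xba", "OP_ENDIF", "OP_1"])
reserved_executed = passes_opcode_rules(["OP_1", "OP_IF", "0xba", "OP_ENDIF"])
disabled_skipped = passes_opcode_rules(["OP_0", "OP_IF", "OP_2MUL", "OP_ENDIF", "OP_1"])
```

Only the first script passes: the reserved number sits in a branch that is never executed. The same placement does not save the disabled op code, which is why scripts containing 0xba or 0xbb could already exist on chain before the upgrade.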
Handling Signatures without the Sighash byte: Some code refactoring was required to handle signatures that lack the Sighash byte. A nice side effect was a large speed-up of the sigencoding tests (https://reviews.bitcoinabc.org/D1580).
Appendix C: Bitcoin ABC Implementation
Implementation of this feature in Bitcoin ABC consisted of 33 sets of changes, cataloged as follows:
Prepare for activation: D1563, D1564
Refactors and code improvements: D1565, D1569, D1573, D1574, D1576, D1575, D1578, D1589, D1595
Test additions and fixes: D1566, D1567, D1568, D1570, D1571, D1580, D1596, D1599, D1619, D1620
Separate signature and sighash-type-byte handling: D1572, D1577, D1579
Sigops counting: D1597, D1601, D1605, D3053
Implementation: D1621, D1646, D1653, D1666
Activation: D1625