\chapter{Scripting 1}
\label{ch:scripting-1}
\begin{summary}
This chapter goes deeper into what constitutes a transaction and how scripting is used to lock bitcoins and later unlock them to spend them. Several examples are provided on how to create transactions by calling a node’s API or programmatically.
\end{summary}
\section{Transactions}
A transaction sends bitcoins from one address to another and consists of one or more inputs and one or more outputs. The inputs of a transaction reference outputs of previous transactions. When an output is spent it can never be used again\footnote{Think of cash. If you give a 20 euro note you can never reuse it. You might be given change but it will be different notes or coins.}. All the bitcoins are transferred elsewhere (to a recipient, back to yourself as change, etc.). Outputs that are available to be spent are called \emph{Unspent Transaction Outputs (UTXOs)} and Bitcoin nodes keep track of the complete UTXO set.
\begin{note}
Each time funds are sent to an address a new output (UTXO) is created. Thus, the balance of an address depends on all the UTXOs that correspond to it. Bitcoin wallets hide UTXOs to make the whole experience friendlier but some wallets allow you to specify which UTXOs you want to spend if needed. When we create transactions programmatically we will deal primarily with UTXOs.
\end{note}
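The relationship between UTXOs and balances can be sketched with a toy model. This is only an illustration: the dictionary \keyword{utxo\_set}, the helper functions and all the values are made up, and a real node tracks locking scripts rather than address labels.

```python
from decimal import Decimal

# Toy UTXO set: (txid, vout) -> (address, amount in BTC). All values are made up.
utxo_set = {
    ("aa" * 32, 0): ("1Alice", Decimal("0.5")),
    ("bb" * 32, 1): ("1Alice", Decimal("0.2")),
    ("cc" * 32, 0): ("1Bob", Decimal("1.0")),
}


def balance(address):
    """Sum the amounts of all unspent outputs that belong to the address."""
    amounts = (amount for owner, amount in utxo_set.values() if owner == address)
    return sum(amounts, Decimal("0"))


def spend(outpoint):
    """Spending consumes the whole output; it leaves the UTXO set for good."""
    return utxo_set.pop(outpoint)


print(balance("1Alice"))  # 0.7
spend(("aa" * 32, 0))
print(balance("1Alice"))  # 0.2
```

Note that the address itself stores nothing; its balance is derived from the outputs that are still unspent.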
When an output (UTXO) is created we also specify the conditions under which this output can be spent. When you specify an input (the UTXO of a previous transaction) to spend from you have to prove that you satisfy the conditions set by the UTXO.
The spending conditions and the proof that authorizes transfer are not fixed. A scripting language is used to define them. When a new output is created, a script called \keyword{scriptPubKey}, or more informally the locking script, is placed in the UTXO.
When we want to spend that UTXO we create a new transaction with an input that references the UTXO that we wish to spend together with an unlocking script or more formally a \keyword{scriptSig}.
The standard transaction output types supported by the Bitcoin protocol are:
\begin{itemize}
\item P2PK (Pay to Public Key - not used anymore)
\item P2MS (legacy multisignature transactions; now P2SH/P2WSH/P2TR are used instead)
\item P2PKH (Pay to Public Key Hash)
\item P2SH (Pay to Script Hash)
\item P2WPKH (Pay to Witness Public Key Hash)
\item P2WSH (Pay to Witness Script Hash)
\item P2TR (Pay to Taproot)
\item OP\_RETURN (allows for storing up to 80 bytes in an output)
\item Non-standard\footnote{Valid non-standard transactions (containing scripts other than those defined by the standard transaction output type scripts) are rejected and not relayed by nodes. However, they can be mined if it is arranged with a miner.} (any other valid transaction)
\end{itemize}
The most common transaction output type offering a standard way of transferring bitcoins around is P2PKH (and P2WPKH), which is effectively ``pay to a Bitcoin address''. It is also possible to pay directly to a public key with P2PK; this was used in the past but is not used anymore. Another very important transaction output type is P2SH (and P2WSH) which allows locking scripts of arbitrary complexity to be used.
To define locking and unlocking scripts we make use of a scripting language, simply called \emph{Script}\footnote{https://en.bitcoin.it/wiki/Script}. This relatively simple language consists of several operations, each of them identified by an \emph{opcode} that is usually written in hexadecimal. It is a stack-based language that uses reverse Polish notation (e.g. \keyword{2 3 +}) and does not contain potentially dangerous programming constructs, like loops; it is a domain-specific language.
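To illustrate how a stack-based language in reverse Polish notation is evaluated, here is a minimal sketch of an interpreter. It is not Bitcoin's Script engine: the function \keyword{evaluate} is hypothetical and handles only numbers and two operations, named after the real opcodes \keyword{OP\_ADD} and \keyword{OP\_EQUAL}.

```python
def evaluate(script):
    """Evaluate a whitespace-separated script and return the final stack."""
    stack = []
    for token in script.split():
        if token == "OP_ADD":
            # pop the two top elements and push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif token == "OP_EQUAL":
            # pop the two top elements and push 1 if equal, 0 otherwise
            b, a = stack.pop(), stack.pop()
            stack.append(1 if a == b else 0)
        else:
            # anything else is treated as data (here: an integer)
            stack.append(int(token))
    return stack


print(evaluate("2 3 OP_ADD"))             # [5]
print(evaluate("2 3 OP_ADD 5 OP_EQUAL"))  # [1]
```

Data is pushed onto the stack and operators consume elements from the top of the stack, which is why the operator follows its operands.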
\subsection*{P2PKH}
Let’s examine the standard transaction of spending a Pay to Public Key Hash. The locking script (\keyword{scriptPubKey}) that secures the funds in a P2PKH address is the following:
\begin{emphbox}
\begin{verbatim}
OP_DUP OP_HASH160 <PKHash> OP_EQUALVERIFY OP_CHECKSIG
\end{verbatim}
\end{emphbox}
As we have seen in section \ref{sec:addresses} the public key hash (PKHash) can be derived from the Bitcoin address and vice versa. Thus, the above script locks the funds that have been sent to the address that corresponds to that PKHash.
To spend the funds the owner of the private key that corresponds to that address/PKHash needs to provide an unlocking script such that, when it is prepended to the locking script, the combined script evaluates to true. An unlocking script for a P2PKH will look like this:
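In a transaction the locking script is stored as bytes, one byte per opcode plus the pushed data. As a rough sketch, the following decodes a serialized P2PKH script back into the form shown above. The function \keyword{decode\_script} is hypothetical and knows only a handful of opcodes; the hexadecimal string is the \keyword{scriptPubKey} of a testnet UTXO that appears in a \keyword{listunspent} output later in this chapter.

```python
# A few opcode values from the Script reference; far from complete.
OPCODES = {
    0x76: "OP_DUP",
    0xa9: "OP_HASH160",
    0x87: "OP_EQUAL",
    0x88: "OP_EQUALVERIFY",
    0xac: "OP_CHECKSIG",
}


def decode_script(script_hex):
    """Split a serialized script into opcode names and pushed data (as hex)."""
    raw = bytes.fromhex(script_hex)
    elements = []
    i = 0
    while i < len(raw):
        byte = raw[i]
        i += 1
        if 1 <= byte <= 75:
            # values 1..75 mean "push the next <byte> bytes as data"
            elements.append(raw[i:i + byte].hex())
            i += byte
        else:
            elements.append(OPCODES.get(byte, f"0x{byte:02x}"))
    return elements


script_hex = "76a914ddcf9faf5625d6a96790710bbcef98af9a8719e388ac"
print(decode_script(script_hex))
```

The byte \keyword{0x14} that follows \keyword{OP\_HASH160} is the push of the 20-byte PKHash.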
\begin{emphbox}
\begin{verbatim}
<Signature> <PublicKey>
\end{verbatim}
\end{emphbox}
Using the private key we provide an ECDSA signature of part of the transaction that we create (see section~\ref{sec:signatures} for more details). We also provide the public key\footnote{As we have already discussed in section~\ref{sec:addresses} the public key only appears in the blockchain after we spend from an address. This is where it appears!} for additional verification.
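The validation to spend a UTXO consists of running \keyword{scriptSig} followed by \keyword{scriptPubKey}. Conceptually the two scripts are concatenated and executed as one script; in practice nodes run \keyword{scriptSig} first and then run \keyword{scriptPubKey} on the resulting stack.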
\subsection*{Validation of P2PKH Spending}
The validation process is described below in a step by step explanation during script execution. In each step, the script element evaluated will be highlighted (left column) and the current stack (right column) will also be displayed.
%\begin{table}
\begin{center}
\begin{longtable}[H]{ |L{0.47\linewidth}|L{0.47\linewidth}| }
\hline
\multicolumn{2}{|l|}{\emph{Step 0: Execution starts. Stack is empty.}}\\
\hline
\textsf{<Signature> <PublicKey> OP\_DUP OP\_HASH160 <PKHash> OP\_EQUALVERIFY OP\_CHECKSIG} & \\
\hline
\hline
\multicolumn{2}{|l|}{\emph{Step 1: First element is evaluated. It consists of data so it goes into the stack.}} \\
\hline
\textsf{{\color{blue}<Signature>} <PublicKey> OP\_DUP OP\_HASH160 <PKHash> OP\_EQUALVERIFY OP\_CHECKSIG} & \textsf{<Signature>} \\
\hline
\hline
\multicolumn{2}{|l|}{\emph{Step 2: Second element is also data and goes into the stack.}} \\
\hline
\textsf{<Signature> {\color{blue}<PublicKey>} OP\_DUP OP\_HASH160 <PKHash> OP\_EQUALVERIFY OP\_CHECKSIG} & \textsf{<Signature> <PublicKey>} \\
\hline
\hline
\multicolumn{2}{|l|}{\emph{Step 3: Next element is an operator that duplicates the top element of the stack.}}\\
\hline
\textsf{<Signature> <PublicKey> {\color{blue}OP\_DUP} OP\_HASH160 <PKHash> OP\_EQUALVERIFY OP\_CHECKSIG} & \textsf{<Signature> <PublicKey> <PublicKey>} \\
\hline
\hline
\multicolumn{2}{|L{0.94\linewidth}|}{\emph{Step 4: Next element is an operator that calculates the HASH160 of the top stack element. HASH160 is equivalent to RIPEMD160( SHA256( element ) ) which is what is needed to calculate the PKH from a public key.}}\\
\hline
\textsf{<Signature> <PublicKey> OP\_DUP {\color{blue}OP\_HASH160} <PKHash> OP\_EQUALVERIFY OP\_CHECKSIG} & \textsf{<Signature> <PublicKey> <PKHash>} \\
\hline
\hline
\multicolumn{2}{|L{0.94\linewidth}|}{\emph{Step 5: Next element is data and it is pushed into the stack.}}\\
\hline
\textsf{<Signature> <PublicKey> OP\_DUP OP\_HASH160 {\color{blue}<PKHash>} OP\_EQUALVERIFY OP\_CHECKSIG} & \textsf{<Signature> <PublicKey> <PKHash> <PKHash>} \\
\hline
\hline
\multicolumn{2}{|L{0.94\linewidth}|}{\emph{Step 6: Next element is an operator that checks if the top two elements of the stack are equal and fails the script if they are not. Effectively this validates that the public key provided is indeed the one that corresponds to the PKH (or address) that we are trying to spend.}}\\
\hline
\textsf{<Signature> <PublicKey> OP\_DUP OP\_HASH160 <PKHash> {\color{blue}OP\_EQUALVERIFY} OP\_CHECKSIG} & \textsf{<Signature> <PublicKey>} \\
\hline
\hline
\multicolumn{2}{|L{0.94\linewidth}|}{\emph{Step 7: Next element is an operator that expects two elements from the stack; a signature and a public key that corresponds to that signature. If the signature is valid it returns true, otherwise false.}}\\
\hline
\textsf{<Signature> <PublicKey> OP\_DUP OP\_HASH160 <PKHash> OP\_EQUALVERIFY {\color{blue}OP\_CHECKSIG}} & \textsf{OP\_TRUE} \\
\hline
\end{longtable}
%\caption{Validation steps of P2PKH}
%\label{tab:p2pkh-validation}
\end{center}
%\end{table}
Since the script finished and the only element in the stack is now \keyword{OP\_TRUE}\footnote{Or OP\_1, i.e. true. All the operators, or opcodes, and their explanations can be found at https://en.bitcoin.it/wiki/Script} the node validated ownership of the UTXO and it is allowed to be spent. Success!
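The steps above can be replayed with a small simulation. This is a sketch of the stack mechanics only, under loud assumptions: \keyword{fake\_hash160} and \keyword{fake\_checksig} are stand-ins invented for this example (RIPEMD-160 is not available in every Python build, and real ECDSA verification needs the transaction digest), so nothing here is cryptographically meaningful.

```python
import hashlib


def fake_hash160(data):
    # Stand-in for RIPEMD160(SHA256(data)); NOT the real HASH160.
    return hashlib.sha256(data).digest()[:20]


def fake_checksig(signature, public_key):
    # Stand-in for ECDSA verification: a "signature" is accepted if it
    # consists of a fixed marker followed by the public key.
    return signature == b"signed-by:" + public_key


def run_p2pkh(script_sig, script_pub_key):
    """Execute scriptSig followed by scriptPubKey; bytes are data, str are opcodes."""
    stack = []
    for element in script_sig + script_pub_key:
        if isinstance(element, bytes):
            stack.append(element)                    # steps 1, 2 and 5
        elif element == "OP_DUP":
            stack.append(stack[-1])                  # step 3
        elif element == "OP_HASH160":
            stack.append(fake_hash160(stack.pop()))  # step 4
        elif element == "OP_EQUALVERIFY":
            if stack.pop() != stack.pop():           # step 6
                return False
        elif element == "OP_CHECKSIG":
            public_key, signature = stack.pop(), stack.pop()
            stack.append(fake_checksig(signature, public_key))  # step 7
        else:
            raise ValueError("unsupported element: " + str(element))
    return stack == [True]


public_key = b"\x02" + b"\x11" * 32  # placeholder, not a real key
locking = ["OP_DUP", "OP_HASH160", fake_hash160(public_key),
           "OP_EQUALVERIFY", "OP_CHECKSIG"]
unlocking = [b"signed-by:" + public_key, public_key]
print(run_p2pkh(unlocking, locking))  # True

other_key = b"\x03" + b"\x22" * 32   # a key that does not match the PKHash
print(run_p2pkh([b"signed-by:" + other_key, other_key], locking))  # False
```

The second call fails at \keyword{OP\_EQUALVERIFY}, before the signature is even checked, because the provided public key does not hash to the PKHash in the locking script.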
To help clarify how addresses, locking scripts and UTXOs relate please see figure~\ref{fig:utxos-pkhashes-addresses}. Addresses 1Zed, 1Alice and 1Bob are short for the actual bitcoin addresses of Zed, Alice and Bob respectively. The diagram emphasises what happens when funds are sent to an address.
\vspace{1em}
\begin{figure}[H]
\begin{center}
\includegraphics[scale=0.5]{images/utxos-pkhashes-addresses}
\caption{UTXOs, PKHashes and Addresses relationships.}
\label{fig:utxos-pkhashes-addresses}
\end{center}
\end{figure}
This section explained how funds residing in UTXOs are locked/unlocked and how scripts are evaluated for validation. In the next section we will go through several examples of how we can create simple transactions programmatically.
\section{Creating P2PKH Transactions}
In the previous section we went through transactions, their inputs and outputs and how the funds are locked. In this section we will go through different ways of creating a simple payment transaction from the command line and then programmatically.
\subsection*{Automatically Create a Transaction}
We can use Bitcoin's built-in command \keyword{sendtoaddress} to send bitcoins to an address.
\begin{emphbox}
\begin{lstlisting}[style=Bash]
$ ./bitcoin-cli sendtoaddress mnB6gSoVfUAPu6MhKkAfgsjPfBWmEEmFr3 0.1
18f23a2c3bea97d30e0e09376222b6888943e7dc86df43ff5dfa1ff59c10d8ec
\end{lstlisting}
\end{emphbox}
In this example we use the node to send \keyword{0.1} bitcoins to a testnet address. Notice that we do not specify any details on which UTXOs to spend from. The node wallet decides which UTXOs to spend and to which address the change (there is almost always change) will go; i.e. we do not have any control when sending funds this way.
\begin{note}
Notice that the result is the transaction identifier (txid) of this transaction.
\end{note}
\subsection*{Creating a Transaction using a Node}
In this example we want to select the inputs explicitly. We need to know the txids and the output indexes (vout). As an example, we can get those with:
\begin{emphbox}
\begin{lstlisting}[style=Bash]
$ ./bitcoin-cli listunspent 0
[
  {
    "txid": "b3b7464d3472a9e83da4d5c179620b71724a62eac8bc14ac4543190227183940",
    "vout": 0,
    "address": "n1jnMQCyt9DHR3BYKzdbmXWM8M5UvH9nMW",
    "account": "",
    "scriptPubKey": "76a914ddcf9faf5625d6a96790710bbcef98af9a8719e388ac",
    "amount": 1.30000000,
    "confirmations": 0,
    "spendable": true,
    "solvable": true
  }
  ...
]
\end{lstlisting}
\end{emphbox}
The above command lists all UTXOs (even those with 0 confirmations; i.e. in the mempool). Now we can create a transaction specifying the UTXO that we want to spend.
\begin{emphbox}
\begin{lstlisting}[style=Bash]
$ ./bitcoin-cli createrawtransaction '''
> [
> {
> "txid": "b3b7464d3472a9e83da4d5c179620b71724a62eac8bc14ac4543190227183940",
> "vout": 0
> }
> ]
> ''' '''
> {
> "mqazutWCSnuYqEpLBznke2ooGimyCtwCh8": 0.2
> }'''
0100000001403918...efbe09488ac00000000
\end{lstlisting}
\end{emphbox}
The result is the serialized raw transaction in hexadecimal. Note that this is not signed yet. To see the details of the raw transaction we can use:
\begin{emphbox}
\begin{lstlisting}[style=Bash]
$ ./bitcoin-cli decoderawtransaction 0100000001403918...efbe09488ac00000000
{
  "txid": "a7b54334096108e8f69ecfa19263cfbf2f12210165ef5fc2e98ef8e4e466392e",
  "hash": "a7b54334096108e8f69ecfa19263cfbf2f12210165ef5fc2e98ef8e4e466392e",
  "size": 85,
  "vsize": 85,
  "version": 1,
  "locktime": 0,
  "vin": [
    {
      "txid": "b3b7464d3472a9e83da4d5c179620b71724a62eac8bc14ac4543190227183940",
      "vout": 0,
      "scriptSig": {
        "asm": "",
        "hex": ""
      },
      "sequence": 4294967295
    }
  ],
  "vout": [
    {
      "value": 0.20000000,
      "n": 0,
      "scriptPubKey": {
        "asm": "OP_DUP OP_HASH160 6e751b60fcb566418c6b9f68bfa51438aefbe094\\
        OP_EQUALVERIFY OP_CHECKSIG",
        "hex": "76a9146e751b60fcb566418c6b9f68bfa51438aefbe09488ac",
        "reqSigs": 1,
        "type": "pubkeyhash",
        "addresses": [
          "mqazutWCSnuYqEpLBznke2ooGimyCtwCh8"
        ]
      }
    }
  ]
}
\end{lstlisting}
\end{emphbox}
We can confirm that this is unsigned because the unlocking script or \keyword{scriptSig} is empty. We now need to sign this transaction before it becomes a valid transaction.
\begin{emphbox}
\begin{lstlisting}[style=Bash]
$ ./bitcoin-cli signrawtransactionwithwallet 01000000014039...be09488ac00000000
{
"hex": "0100000001403918270...38aefbe09488ac00000000",
"complete": true
}
\end{lstlisting}
\end{emphbox}
Now we have the final signed raw transaction (in attribute \keyword{hex}). If we use \keyword{decoderawtransaction} now we will see that the unlocking script is properly set. Before sending the transaction to the node for broadcasting, we can test whether it is valid with the \keyword{testmempoolaccept} command. Finally, we need to send it to the node for broadcasting.
\begin{emphbox}
\begin{lstlisting}[style=Bash]
$ ./bitcoin-cli sendrawtransaction 0100000001403918270...38aefbe09488ac00000000
error code: -26, error message:, 256: absurdly-high-fee
\end{lstlisting}
\end{emphbox}
In this instance we get an error saying that the transaction has an exceptionally high fee. We have not specified any output for change so \keyword{1.1} bitcoins would be given to miners (\keyword{1.3-0.2}). Most wallets have similar protection mechanisms to help safeguard from user errors.
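The fee is never stated explicitly in a transaction; it is whatever remains after subtracting the outputs from the inputs. A minimal sketch of the arithmetic for the example above follows. The helper \keyword{implied\_fee} and the fee of \keyword{0.0001} bitcoins are assumptions made for illustration only.

```python
from decimal import Decimal


def implied_fee(input_amounts, output_amounts):
    """The fee is the part of the inputs that is not claimed by any output."""
    return sum(input_amounts, Decimal("0")) - sum(output_amounts, Decimal("0"))


utxo_amount = Decimal("1.3")  # the UTXO we spend
payment = Decimal("0.2")      # the only output of the transaction above

# Without a change output everything else goes to the miner.
print(implied_fee([utxo_amount], [payment]))          # 1.1

# Adding a change output leaves only the intended fee (assumed value).
fee = Decimal("0.0001")
change = utxo_amount - payment - fee
print(change)                                         # 1.0999
print(implied_fee([utxo_amount], [payment, change]))  # 0.0001
```

In the \keyword{createrawtransaction} call this would correspond to adding a second entry to the outputs object, paying the change to an address that we control.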
\subsection*{Using HTTP JSON-RPC}
JSON-RPC is a simple protocol that specifies how to communicate with remote procedure calls using JSON as the format. It can be used with several transport protocols but most typically it is used over HTTP.
A user name and password have to be provided in \keyword{bitcoin.conf}. By default only local connections are allowed, but other connections can be allowed for trusted IPs with the \keyword{rpcallowip} configuration option.
\begin{emphbox}
rpcuser=kostas
rpcpassword=too\_long\_to\_guess
\end{emphbox}
Then we could use a tool like \keyword{curl} to make the JSON-RPC request:
\begin{emphbox}
\begin{lstlisting}[style=Bash]
$ curl --user kostas --data-binary '{"jsonrpc": "1.0", "id":"curltest", \\
"method": "getblockcount", "params": [] }' -H 'content-type: text/plain;'\\
http://127.0.0.1:18332/
Enter host password for user 'kostas':
{
"result": 1746817,
"error": null,
"id": "curltest"
}
\end{lstlisting}
\end{emphbox}
Thus, we can also send the commands seen before to construct transactions via JSON-RPC.
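The same request can be prepared from Python using only the standard library. The sketch below builds the HTTP request that corresponds to the \keyword{curl} call above without sending it; the function \keyword{build\_rpc\_request} is a hypothetical helper, and the credentials and port are the example values used in this section.

```python
import base64
import json
import urllib.request


def build_rpc_request(method, params, user, password,
                      url="http://127.0.0.1:18332/"):
    """Prepare (but do not send) a JSON-RPC request with HTTP basic auth."""
    payload = json.dumps({
        "jsonrpc": "1.0",
        "id": "sketch",
        "method": method,
        "params": params,
    }).encode()
    credentials = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {
        "Content-Type": "text/plain",
        "Authorization": "Basic " + credentials,
    }
    return urllib.request.Request(url, data=payload, headers=headers)


request = build_rpc_request("getblockcount", [], "kostas", "too_long_to_guess")
print(request.get_method())   # POST
print(request.data.decode())

# With a node running locally the request could be sent with:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read())["result"])
```

Any other node command can be called in the same way by changing the method name and the list of parameters.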
\subsection*{Calling Node Commands Programmatically}
\label{ssec:calling-node-programmatically}
The \keyword{python-bitcoin-utils} library\footnote{The library can be installed directly with \keyword{pip install bitcoin-utils} in your working Python environment} wraps a Bitcoin RPC library\footnote{https://github.com/jgarzik/python-bitcoinrpc} and provides a proxy object that allows a Python program to call all the CLI commands programmatically via JSON-RPC. For example:
\vspace{1em}
\begin{lstlisting}[style=Python]
from bitcoinutils.setup import setup
from bitcoinutils.proxy import NodeProxy
# always remember to setup the network
setup('testnet')
# get a node proxy using default host and port
proxy = NodeProxy('rpcuser', 'rpcpw').get_proxy()
# call the node's getblockcount JSON-RPC method
count = proxy.getblockcount()
print(count)
\end{lstlisting}
\vspace{1em}
All API calls can be used, including the ones to create a transaction with either \keyword{sendtoaddress} or \keyword{createrawtransaction} + \keyword{signrawtransactionwithwallet} + \keyword{sendrawtransaction}.
\subsection*{Creating Transactions Programmatically}
The Bitcoin node allows the creation of basic transactions but it does not support arbitrary scripts. We can create those programmatically by explicitly specifying the locking/unlocking conditions. We will use the \keyword{python-bitcoin-utils} library to demonstrate.
There are several examples included in the library that you can consult, like the most common transaction, a P2PKH payment\footnote{\url{https://github.com/karask/python-bitcoin-utils/blob/b31c82e7005e06db7f780688cfcd9332d479f39d/examples/p2pkh_transaction.py}} with one input and two outputs (the second output being the change).
\vspace{1em}
\begin{lstlisting}[style=Python]
# Copyright (C) 2018-2020 The python-bitcoin-utils developers
#
# This file is part of python-bitcoin-utils
#
# It is subject to the license terms in the LICENSE file found in the top-level
# directory of this distribution.
#
# No part of python-bitcoin-utils, including this file, may be copied,
# modified, propagated, or distributed except according to the terms contained
# in the LICENSE file.
from bitcoinutils.setup import setup
from bitcoinutils.utils import to_satoshis
from bitcoinutils.transactions import Transaction, TxInput, TxOutput
from bitcoinutils.keys import P2pkhAddress, PrivateKey
from bitcoinutils.script import Script
def main():
    # always remember to setup the network
    setup('testnet')
    # create transaction input from tx id of UTXO (contained 0.4 tBTC)
    txin = TxInput('fb48f4e23bf6ddf606714141ac78c3e921c8c0bebeb7c8abb2c799e9ff96ce6c', 0)
    # create transaction output using P2PKH scriptPubKey (locking script)
    addr = P2pkhAddress('n4bkvTyU1dVdzsrhWBqBw8fEMbHjJvtmJR')
    txout = TxOutput(to_satoshis(0.1), Script(['OP_DUP', 'OP_HASH160', addr.to_hash160(),
                                               'OP_EQUALVERIFY', 'OP_CHECKSIG']) )
    # create another output to get the change - remaining 0.01 is tx fees
    # note that this time we used to_script_pub_key() to create the P2PKH
    # script
    change_addr = P2pkhAddress('mmYNBho9BWQB2dSniP1NJvnPoj5EVWw89w')
    change_txout = TxOutput(to_satoshis(0.29), change_addr.to_script_pub_key())
    #change_txout = TxOutput(to_satoshis(0.29), Script(['OP_DUP', 'OP_HASH160',
    #                                     change_addr.to_hash160(),
    #                                     'OP_EQUALVERIFY', 'OP_CHECKSIG']))
    # create transaction from inputs/outputs -- default locktime is used
    tx = Transaction([txin], [txout, change_txout])
    # print raw transaction
    print("\nRaw unsigned transaction:\n" + tx.serialize())
    # use the private key corresponding to the address that contains the
    # UTXO we are trying to spend to sign the input
    sk = PrivateKey('cRvyLwCPLU88jsyj94L7iJjQX5C2f8koG4G2gevN4BeSGcEvfKe9')
    # note that we pass the scriptPubkey as one of the inputs of sign_input
    # because it is used to replace the scriptSig of the UTXO we are trying to
    # spend when creating the transaction digest
    from_addr = P2pkhAddress('myPAE9HwPeKHh8FjKwBNBaHnemApo3dw6e')
    sig = sk.sign_input( tx, 0, Script(['OP_DUP', 'OP_HASH160',
                                        from_addr.to_hash160(), 'OP_EQUALVERIFY',
                                        'OP_CHECKSIG']) )
    # get public key as hex
    pk = sk.get_public_key().to_hex()
    # set the scriptSig (unlocking script)
    txin.script_sig = Script([sig, pk])
    signed_tx = tx.serialize()
    # print raw signed transaction ready to be broadcasted
    print("\nRaw signed transaction:\n" + signed_tx)
if __name__ == "__main__":
    main()
\end{lstlisting}
\vspace{1em}
This produces the serialized raw transaction in hexadecimal. We can then use \keyword{sendrawtransaction} to broadcast this to the Bitcoin network. Notice that the transaction identifier used to create the input is hardcoded for simplicity; in an actual program we would call \keyword{listunspent} to find the available UTXOs.
\section{Signatures}
\label{sec:signatures}
In this section we will explain how and what needs to be signed to prove ownership of funds residing in specific addresses.
When we create a new transaction we need to provide a signature for each UTXO that we want to spend. For a P2PKH UTXO the signature proves:
\begin{itemize}
\item that the signer is the owner of the private key
\item that the proof of authorization is undeniable
\item that the parts of the transaction that were signed cannot be modified after it has been signed
\end{itemize}
The digital signature algorithm used is ECDSA and each signature is serialized using DER. There are different ways to sign inputs of a transaction so as to provide different commitments. For example: ``I agree to spend this input and sign it as long as no one can change the outputs I specified''. To specify these commitments, i.e. which parts of the transaction will be signed, we add a special 1 byte flag called SIGHASH at the end of the DER signature.
Each transaction input needs to be signed separately from others. Parts of the new transaction will be hashed to create a digest and the digest is what is signed and included in the unlocking script. To determine which parts are hashed the following rules are followed:
\begin{itemize}
\item all other inputs’ unlocking scripts (scriptSigs) should be empty
\item the input’s unlocking script (scriptSig), the one that we sign, should be set to the locking script (scriptPubKey) of the UTXO that we are trying to spend
\item follow additional rules according to the SIGHASH flag
\end{itemize}
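The rules above can be sketched for the simplest case, a legacy (non-SegWit) input signed with SIGHASH \keyword{ALL}. The dictionary layout of \keyword{tx} and the function names are assumptions made for this example, the transaction values are made up, and details such as \keyword{OP\_CODESEPARATOR} handling and the other SIGHASH types are left out.

```python
import hashlib
import struct


def varint(n):
    """Compact size encoding; enough for the small counts used here."""
    if n < 0xfd:
        return bytes([n])
    if n <= 0xffff:
        return b"\xfd" + struct.pack("<H", n)
    raise ValueError("value too large for this sketch")


def sighash_all_digest(tx, input_index, script_pub_key):
    """Digest that is signed for one input of a legacy tx with SIGHASH ALL."""
    out = struct.pack("<I", tx["version"])
    out += varint(len(tx["inputs"]))
    for i, txin in enumerate(tx["inputs"]):
        # the input being signed carries the scriptPubKey of the UTXO it
        # spends; the scriptSigs of all the other inputs are left empty
        script = script_pub_key if i == input_index else b""
        out += bytes.fromhex(txin["txid"])[::-1]  # txid is serialized reversed
        out += struct.pack("<I", txin["vout"])
        out += varint(len(script)) + script
        out += struct.pack("<I", txin["sequence"])
    out += varint(len(tx["outputs"]))
    for txout in tx["outputs"]:
        out += struct.pack("<q", txout["satoshis"])
        out += varint(len(txout["script"])) + txout["script"]
    out += struct.pack("<I", tx["locktime"])
    out += struct.pack("<I", 0x01)  # SIGHASH ALL, appended as 4 bytes
    return hashlib.sha256(hashlib.sha256(out).digest()).digest()


# made-up P2PKH locking script and transaction
locking_script = bytes.fromhex("76a914" + "11" * 20 + "88ac")
tx = {
    "version": 1,
    "locktime": 0,
    "inputs": [
        {"txid": "aa" * 32, "vout": 0, "sequence": 0xffffffff},
        {"txid": "bb" * 32, "vout": 1, "sequence": 0xffffffff},
    ],
    "outputs": [{"satoshis": 10_000_000, "script": locking_script}],
}
digest = sighash_all_digest(tx, 0, locking_script)
print(digest.hex())
```

Since each input is signed over a different serialization, the two inputs of this transaction produce different digests even though they belong to the same transaction.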
The possible SIGHASH flags, values and meaning are:
\begin{description}
\item[\keyword{ALL (0x01)}] Signs all the inputs and outputs, protecting everything except the signature scripts against modification. This transaction is final and no one can modify it without invalidating this signature.
\item[\keyword{NONE (0x02)}] Signs all of the inputs but none of the outputs, allowing anyone to change where the satoshis are going. A miner will always be able to send the funds to their own address.
%This flag is useful in combination with another inputs’ signatures using, say, \keyword{ALL} to secure the outputs.
It can also be used to send funds somewhere as long as everybody else also sends funds; in this case it is assumed that one of the signers will use SIGHASH \keyword{ALL} (or \keyword{SINGLE}) to lock the outputs before broadcasting. It can be used as a blank check to a miner.
\item[\keyword{SINGLE (0x03)}] Signs all the inputs and only one output, the one corresponding to this input (the output with the same output index number as this input), ensuring nobody can change your part of the transaction but allowing other signers to change their part of the transaction. The corresponding output must exist. It can be used if someone creates a transaction with some inputs that they cannot spend and specific outputs and then sends it to other participants, one of whom will sign (with \keyword{ALL} or \keyword{SINGLE}) if they agree. A concrete example is provided later in this section.
\item[\keyword{ALL|ANYONECANPAY (0x81)}] Signs only this one input and all the outputs. It allows anyone to add or remove inputs, so anyone can contribute additional satoshis but they cannot change how many satoshis are sent nor where they go. It can be used to implement kickstarter-style crowdfunding.
\item[\keyword{NONE|ANYONECANPAY (0x82)}] Signs only this one input and allows anyone to add or remove other inputs or outputs, so anyone who gets a copy of this input can spend it however they like. This input can be spent even in another transaction!
\item[\keyword{SINGLE|ANYONECANPAY (0x83)}] Signs this one input and its corresponding output. Allows anyone to add or remove other inputs. A potential use would be as a means to exchange colored coin\footnote{https://en.bitcoin.it/wiki/Colored\_Coins} tokens with satoshis. For example, one can create a transaction that has an input which holds 10 of their tokens and an output of 10 million satoshis to an address that they own. This transaction would be invalid since the inputs do not provide the 10 million satoshis but it can be shared with others. If someone wants to claim the 10 tokens they can add an input with at least 10 million satoshis and an output that sends the 10 tokens to them. This would complete the transaction which can then be broadcasted.
\end{description}
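Since the SIGHASH flag is the last byte of the signature placed in the unlocking script, it is easy to inspect. The following sketch maps that byte to the names listed above. The function \keyword{describe\_sighash} is hypothetical and the DER bytes are a placeholder rather than a valid signature for any transaction.

```python
BASE_TYPES = {0x01: "ALL", 0x02: "NONE", 0x03: "SINGLE"}
ANYONECANPAY = 0x80


def describe_sighash(signature_with_flag):
    """Return the name of the SIGHASH type appended to a DER signature."""
    flag = signature_with_flag[-1]
    # the low bits select the base type, the high bit is ANYONECANPAY
    name = BASE_TYPES[flag & 0x1f]
    if flag & ANYONECANPAY:
        name += "|ANYONECANPAY"
    return name


# placeholder DER structure (r = 1, s = 1); not a real signature
dummy_der = bytes.fromhex("3006020101020101")

print(describe_sighash(dummy_der + b"\x01"))  # ALL
print(describe_sighash(dummy_der + b"\x83"))  # SINGLE|ANYONECANPAY
```

The base type and the \keyword{ANYONECANPAY} modifier are independent, which is why the six combinations listed above exist.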
\begin{figure}[H]
\begin{center}
\includegraphics[scale=0.5]{images/example-transaction}
\caption{Example transaction with two inputs and two outputs.}
\label{fig:example-transaction}
\end{center}
\end{figure}
For an example of \keyword{SINGLE}, consider creating the transaction shown above. Alice needs to pay 1.5 bitcoins to Zed and she has agreed with Bob that he will contribute 0.5 of that amount. Alice creates a transaction with two inputs, UTXO1 that she owns (with 1 BTC) and UTXO2 that Bob owns (with 1 BTC), and a single output that sends 1.5 bitcoins to Zed. She signs UTXO1 with SIGHASH \keyword{SINGLE} and sends the incomplete transaction to Bob. Bob cannot choose a UTXO other than UTXO2 since that would invalidate Alice’s signature of UTXO1. However, he is free to add other outputs, so he creates an output that sends the remaining bitcoins (he agreed to pay only 0.5) to one of his addresses. He can then sign UTXO2 with SIGHASH \keyword{ALL}, which will effectively finalize the transaction\footnote{No more input or output changes are allowed without invalidating this signature.}. Note that:
\begin{itemize}
\item The sequence numbers of other inputs are not included in the signature, and can be updated.
\item As already demonstrated above, with multiple inputs, each signature hash type can sign different parts of the transaction. If a 2-input transaction has one input signed with \keyword{NONE} and the other with \keyword{ALL}, the \keyword{ALL} signer can choose where to spend the funds without consulting the \keyword{NONE} signer.
\end{itemize}
\section{Pay to script hash (P2SH)}
\label{sec:p2sh}
This section explains the rationale behind Pay to Script Hash (P2SH) type of addresses and demonstrates with code.
\subsection*{Multi-signature transaction output type}
To demonstrate the advantages of P2SH we will go through a simple use case, first using P2MS, the legacy multisignature standard transaction output type, and then implementing the same script with the P2SH standard transaction output type.
Consider the scenario where we accept funds in an address that is not controlled by one person. For example it is typical for companies to allow spending from corporate accounts only if, say, 2 people agree. These are called multi-signature accounts. A multi-signature account requires M-of-N signatures in order to spend the funds. An address’ locking script could enforce that. For example a 2-of-3 multi-signature locking script would look like:
\begin{emphbox}
\begin{lstlisting}[style=Pseudomath]
2 <Director's Public Key> <CFO's Public Key> <COO's Public Key>
3 CHECKMULTISIG
\end{lstlisting}
\end{emphbox}
Indeed, this is the locking script of the multisignature standard transaction output type. If someone needs to send money (lock funds) to that output script then they need to know it! The company would need to send this script to all their customers that wish to pay them.
This is not practical, since the whole script is recorded on the blockchain for every transaction. More importantly it has privacy issues: the company reveals the public keys that control the funds.
If you think about it the precise locking/unlocking script of the funds should not concern the customers at all. Only the company should know how to spend them.
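To make the size of such a locking script concrete, the sketch below serializes the 2-of-3 multi-signature script into bytes. The helper names are made up and the three public keys are 33-byte placeholders, not real keys; the opcode values are taken from the Script reference.

```python
OP_2, OP_3, OP_CHECKMULTISIG = 0x52, 0x53, 0xae


def push(data):
    """Serialize a data push of up to 75 bytes (length byte, then the data)."""
    if not 1 <= len(data) <= 75:
        raise ValueError("only short pushes are supported in this sketch")
    return bytes([len(data)]) + data


def multisig_2_of_3(public_keys):
    """Serialize: 2 <key1> <key2> <key3> 3 CHECKMULTISIG."""
    if len(public_keys) != 3:
        raise ValueError("exactly three public keys are expected")
    script = bytes([OP_2])
    for key in public_keys:
        script += push(key)
    return script + bytes([OP_3, OP_CHECKMULTISIG])


# placeholders with the size of compressed public keys (33 bytes each)
keys = [bytes([0x02]) + bytes([n]) * 32 for n in (1, 2, 3)]
script = multisig_2_of_3(keys)
print(len(script))  # 105
print(script.hex())
```

Every payment to the company would need to carry all of these 105 bytes in its output, which is part of what P2SH avoids.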
\subsection*{Pay to script hash (P2SH)}
P2SH is a type of transaction output (BIP-16\footnote{https://github.com/bitcoin/bips/blob/master/bip-0016.mediawiki}) that moves the responsibility for supplying the conditions to redeem a transaction (locking script) from the sender of the funds to the redeemer (receiver).
The locking script of such a transaction is quite simple:
\begin{emphbox}
\begin{lstlisting}[style=Pseudomath]
OP_HASH160 [20-byte-hash-value] OP_EQUAL
\end{lstlisting}
\end{emphbox}
The 20-byte hash value is the hash of the redeem script:
\begin{emphbox}
\begin{lstlisting}[style=Pseudomath]
RIPEMD-160( SHA-256( 2 <Director's Public Key> <CFO's Public Key>
<COO's Public Key> 3 CHECKMULTISIG ) )
\end{lstlisting}
\end{emphbox}
Using this hash we create a Bitcoin address with the same process as described in section~\ref{ssec:legacy-addresses}, except that instead of \keyword{OP\_HASH160(pubkey)} we use \keyword{OP\_HASH160(redeem script)} and the version prefix \keyword{0x05}, which on mainnet creates addresses that start with \keyword{3}.
We then only need to disseminate this address to the company’s customers. They can send funds oblivious to how these funds are locked. The company knows how they are unlocked so they can prepare the appropriate unlocking script.
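The encoding step can be sketched with the standard library alone. The snippet below is a minimal illustration of Base58Check, assuming the 20-byte script hash has already been computed; the hash used here is a placeholder and the function name \keyword{base58check\_encode} is our own. The version byte \keyword{0xC4} for testnet P2SH addresses is included for comparison.
\vspace{1em}
\begin{lstlisting}[style=Python]
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check_encode(version: int, payload: bytes) -> str:
    """Prefix the version byte, append a 4-byte checksum, encode in Base58."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    num = int.from_bytes(raw, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = B58_ALPHABET[rem] + encoded
    # each leading zero byte is represented by a leading '1'
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


# placeholder 20-byte value standing in for HASH160(redeem script)
script_hash = bytes(range(20))

mainnet_addr = base58check_encode(0x05, script_hash)
testnet_addr = base58check_encode(0xC4, script_hash)
print(mainnet_addr)
print(testnet_addr)
\end{lstlisting}
\vspace{1em}
The mainnet address starts with \keyword{3} and the testnet one with \keyword{2}, independently of the hash that is encoded.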
As an example, to spend the funds, the company can create the following script:
\begin{emphbox}
\begin{lstlisting}[style=Pseudomath]
<Director's signature> <CFO's signature>
<2 <Director's Public Key> <CFO's Public Key> <COO's Public Key>
3 CHECKMULTISIG>
\end{lstlisting}
\end{emphbox}
It has two parts: the data that satisfy the redeem script's conditions (in this case two of the signatures) followed by the redeem script itself. Notice how the redeem script is revealed only when the company spends the funds; and only the two signatures that are used appear, not all three.
Validation occurs in the following steps:
Initially, the scriptSig part is pushed into the stack and then the stack is copied for later use. The copy would look like:
\begin{emphbox}
\begin{lstlisting}[style=Pseudomath]
<Director's signature> <CFO's signature>
<2 <Director's Public Key> <CFO's Public Key> <COO's Public Key>
3 CHECKMULTISIG>
\end{lstlisting}
\end{emphbox}
Then the execution continues normally:
\begin{emphbox}
\begin{lstlisting}[style=Pseudomath]
<Director's signature> <CFO's signature>
<2 <Director's Public Key> <CFO's Public Key> <COO's Public Key>
3 CHECKMULTISIG> OP_HASH160 [20-byte-hash-value] OP_EQUAL
\end{lstlisting}
\end{emphbox}
If the hashed redeem script is equal to the 20-byte hash value the above script will leave \keyword{OP\_TRUE} at the top of the stack, which validates that the correct redeem script was passed. We can now proceed to validate the actual redeem script. We do this by using the copy of the stack, first deserializing the top element (the redeem script) and then executing normally:
\begin{emphbox}
\begin{lstlisting}[style=Pseudomath]
<Director's signature> <CFO's signature> 2
<Director's Public Key> <CFO's Public Key> <COO's Public Key>
3 CHECKMULTISIG
\end{lstlisting}
\end{emphbox}
If that also results in \keyword{OP\_TRUE} then the funds can be spent.
\subsection*{Summary / Advantages}
P2SH moves the responsibility for supplying the conditions to redeem a transaction (locking script) from the sender of the funds to the redeemer (receiver).
\begin{itemize}
\item P2SH allows us to create arbitrary redeem scripts; we can thus create quite complex scripts and not be limited to the few standard transaction output types.
\item Reduces the size of the funding transactions, typically saving blockchain space.
\item Increases privacy by hiding the locking conditions.
\end{itemize}
\subsection*{Example: create a P2SH address based on a P2PK script}
As we have seen, P2SH allows us to wrap arbitrary scripts, hiding the script itself until it is spent. To demonstrate we will wrap a simple P2PK script and display the P2SH address that corresponds to that script.
\vspace{1em}
\begin{lstlisting}[style=Python]
from bitcoinutils.setup import setup
from bitcoinutils.keys import P2shAddress, PrivateKey
from bitcoinutils.script import Script

def main():
    # always remember to setup the network
    setup('testnet')

    #
    # This script creates a P2SH address containing a P2PK locking script
    #

    # secret key corresponding to the pubkey needed for the P2PK locking
    p2pk_sk = PrivateKey('cRvyLwCPLU88jsyj94L7iJjQX5C2f8koG4G2gevN4BeSGcEvfKe9')

    # get the public key
    p2pk_pk = p2pk_sk.get_public_key()

    # get the address (from the public key)
    p2pk_addr = p2pk_pk.get_address()

    # create the redeem script
    redeem_script = Script([p2pk_pk.to_hex(), 'OP_CHECKSIG'])

    # create a P2SH address from a redeem script
    addr = P2shAddress.from_script(redeem_script)
    print(addr.to_string())

if __name__ == "__main__":
    main()
\end{lstlisting}
\vspace{1em}
The result is address \keyword{2MvzN3FntupGqY66FuGzoK9HFXqPFyMxfVU} which can be shared with anyone that wishes to send us some funds.
\subsection*{Example: spend funds from a P2SH address}
Assuming that someone has sent funds to the P2SH address we just created, let us spend them programmatically. Again, we will hardcode the UTXOs for brevity but we can always use the proxy object mentioned in section~\ref{ssec:calling-node-programmatically} to get the UTXOs programmatically.
\vspace{1em}
\begin{lstlisting}[style=Python]
from bitcoinutils.setup import setup
from bitcoinutils.utils import to_satoshis
from bitcoinutils.transactions import Transaction, TxInput, TxOutput
from bitcoinutils.keys import P2pkhAddress, P2shAddress, PrivateKey
from bitcoinutils.script import Script

def main():
    # always remember to setup the network
    setup('testnet')

    #
    # This script spends from a P2SH address containing a P2PK script
    #

    # create transaction input from tx id of UTXO (contained 0.1 tBTC)
    txin = TxInput('7db363d5a7fabb64ccce154e906588f1936f34481223ea8c1f2c935b0a0c945b', 0)

    # secret key needed to spend P2PK that is wrapped by P2SH
    p2pk_sk = PrivateKey('cRvyLwCPLU88jsyj94L7iJjQX5C2f8koG4G2gevN4BeSGcEvfKe9')
    p2pk_pk = p2pk_sk.get_public_key().to_hex()

    # create the redeem script - needed to sign the transaction
    redeem_script = Script([p2pk_pk, 'OP_CHECKSIG'])

    to_addr = P2pkhAddress('n4bkvTyU1dVdzsrhWBqBw8fEMbHjJvtmJR')
    txout = TxOutput(to_satoshis(0.09), to_addr.to_script_pub_key())

    # no change address - the remaining 0.01 tBTC will go to miners

    # create transaction from inputs/outputs -- default locktime is used
    tx = Transaction([txin], [txout])

    # print raw transaction
    print("\nRaw unsigned transaction:\n" + tx.serialize())

    # use the private key corresponding to the address that contains the
    # UTXO we are trying to spend to create the signature for the txin -
    # note that the redeem script is passed to replace the scriptSig
    sig = p2pk_sk.sign_input(tx, 0, redeem_script)

    # set the scriptSig (unlocking script)
    txin.script_sig = Script([sig, redeem_script.to_hex()])
    signed_tx = tx.serialize()

    # print raw signed transaction ready to be broadcasted
    print("\nRaw signed transaction:\n" + signed_tx)
    print("\nTxId:", tx.get_txid())

if __name__ == "__main__":
    main()
\end{lstlisting}
\vspace{1em}
\begin{note}
As already mentioned, the first time we spend from the address, i.e. broadcasting this transaction, reveals the redeem script to everyone.
\end{note}
\section{Segregated Witness (SegWit)}
\label{sec:segwit}
In this section we will briefly explain what Segregated Witness is, what it changes and how we can create segwit scripts programmatically.
Segregated Witness is a consensus change that introduces an update on how transactions are constructed. In particular it separates (segregates) the unlocking script (witness) from the rest of the input; a transaction input does not contain an unlocking script anymore and the latter is found in another structure that goes alongside the transaction.
To illustrate see figure~\ref{fig:tx-with-without-segwit} (i) for the original transaction that we used as an example in chapter~\ref{ch:howbitcoinworks}. We can see that the unlocking script is part of the input that we are spending. In (ii) we can see the segwit equivalent transaction where the unlocking script (\keyword{scriptSig}) is empty and it is moved outside the transaction.
\begin{figure}[H]
\begin{center}
\includegraphics[scale=0.5]{images/tx-with-without-segwit}
\caption{Example transaction with (i) a non-segwit input and (ii) a segwit input.}
\label{fig:tx-with-without-segwit}
\end{center}
\end{figure}
The segwit upgrade is described in detail in BIPs\footnote{Bitcoin Improvement Proposals} 141, 143, 144 and 145 and it provides several benefits\footnote{https://bitcoincore.org/en/2016/01/26/segwit-benefits/}. We will examine two of them here: \emph{transaction malleability} and \emph{effective block size increase}.
\subsection*{Transaction malleability}
Each transaction is uniquely identified by its \emph{transaction identifier} or \keyword{txid}. The \keyword{txid} is constructed by hashing\footnote{The SHA-256 hashing algorithm is applied twice.} the serialized transaction (the blue part of figure~\ref{fig:tx-with-without-segwit}).
It is possible for someone, e.g. a miner, to slightly change the unlocking script so that the resulting transaction is semantically identical to the original and thus still valid. This can be accomplished, for example, by slightly changing the structure of the signature\footnote{DER allows for some flexibility on how to encode the signature.}.
That is a problem because, given how the \keyword{txid} is created, even the slightest modification will change the \keyword{txid}. While the transaction is effectively identical, i.e. funds will be moved exactly as intended, our ability to monitor this transaction is compromised given that we will be checking for confirmations of a \keyword{txid} that will never appear in the blockchain.
With segwit inputs, however, the unlocking script (the green part of figure~\ref{fig:tx-with-without-segwit} (ii) ) is not part of the \keyword{txid} construction and thus modifying it cannot change the \keyword{txid}. A non-malleable \keyword{txid} is more reliable and allows for more advanced scripts/solutions like the lightning network.
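The effect can be illustrated with a toy model. In the sketch below the transaction is reduced to two opaque byte strings, a core part (version, inputs, outputs, locktime) and the unlocking data; these are stand-ins invented for the illustration and not real serialized transactions. The function names are our own.
\vspace{1em}
\begin{lstlisting}[style=Python]
import hashlib


def sha256d(data: bytes) -> bytes:
    """SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def legacy_txid(core: bytes, unlocking: bytes) -> str:
    # legacy: the unlocking script is part of the hashed serialization
    return sha256d(core + unlocking)[::-1].hex()


def segwit_txid(core: bytes, witness: bytes) -> str:
    # segwit: the witness is NOT part of the bytes hashed for the txid
    return sha256d(core)[::-1].hex()


# toy stand-ins, not real transaction bytes
core = b"version|inputs|outputs|locktime"
sig_a = b"signature-encoding-A"
sig_b = b"signature-encoding-B"  # same signature, different encoding

legacy_changed = legacy_txid(core, sig_a) != legacy_txid(core, sig_b)
segwit_changed = segwit_txid(core, sig_a) != segwit_txid(core, sig_b)
print(legacy_changed, segwit_changed)
\end{lstlisting}
\vspace{1em}
Re-encoding the signature changes the legacy identifier but leaves the segwit one untouched, which is exactly the property that protocols building on unconfirmed transactions rely on.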
\subsection*{Effective block size increase}
The legacy block size limit remains the same, at 1MB. However, the witness data (the unlocking scripts of segwit inputs) are not counted towards that limit and thus more transactions can fit into a block.
Segwit introduces the concept of block \emph{weight}, a new metric for the size of blocks. A block can have a maximum weight of 4 million weight units. Each non-witness byte of a transaction counts as 4 weight units while each witness byte counts as 1, a discount of 75\%. This allows for an effective block size increase of about 2.8x, if all transactions use segwit\footnote{E.g. block 748918 reached 2.77 MBs because it included a lot of P2WSH 2-of-3 multi-signature scripts.}.
The \emph{virtual} size, or \keyword{vsize}, of a transaction is the size in bytes of a transaction including the segwit discount, i.e. its weight divided by 4 (rounded up). For non-segwit transactions\footnote{I.e. transaction with no segwit inputs.} \keyword{size} and \keyword{vsize} are identical.
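The arithmetic can be written down in a few lines. The byte counts below are hypothetical values chosen for the example, not measurements of a specific transaction, and the function names are our own.
\vspace{1em}
\begin{lstlisting}[style=Python]
MAX_BLOCK_WEIGHT = 4_000_000  # weight units


def tx_weight(non_witness_bytes: int, witness_bytes: int) -> int:
    """Non-witness bytes count 4 weight units each, witness bytes 1."""
    return 4 * non_witness_bytes + witness_bytes


def tx_vsize(non_witness_bytes: int, witness_bytes: int) -> int:
    """Virtual size: weight divided by 4, rounded up."""
    return (tx_weight(non_witness_bytes, witness_bytes) + 3) // 4


# hypothetical segwit transaction: 110 non-witness and 108 witness bytes
segwit_weight = tx_weight(110, 108)
segwit_vsize = tx_vsize(110, 108)

# hypothetical legacy transaction of the same total size (218 bytes)
legacy_weight = tx_weight(218, 0)
legacy_vsize = tx_vsize(218, 0)

print(segwit_weight, segwit_vsize)
print(legacy_weight, legacy_vsize)
\end{lstlisting}
\vspace{1em}
Both transactions occupy 218 bytes, but the segwit one has a weight of 548 and a \keyword{vsize} of 137, while the legacy one has a weight of 872 and a \keyword{vsize} of 218, equal to its size.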
\subsection*{Segwit transaction output types}
Segwit introduces two new transaction output types, \emph{Pay-to-Witness-Public-Key-Hash (P2WPKH)} and \emph{Pay-to-Witness-Script-Hash (P2WSH)}, which are the segwit equivalents of P2PKH and P2SH respectively. They are sometimes called \emph{native} segwit to differentiate them from \emph{nested} segwit.
The locking script (\keyword{scriptPubKey}) of these new types consists of two elements, a \keyword{version byte} and the \keyword{witness program}. The version byte introduces versioning in the witness program of the script. That is another benefit of segwit, since it allows for easy updates based on a new version.
Remember that when signing to spend any output we need to provide the locking script (\keyword{scriptPubKey}), which temporarily takes the place of the \keyword{scriptSig} before we calculate the transaction digest and sign it. For segwit transaction types each witness program corresponds to a predefined template script that is called \keyword{scriptCode}.
For example the \keyword{scriptCode} for a P2WPKH output is:
\begin{emphbox}
\begin{lstlisting}[style=Pseudomath]
OP_DUP OP_HASH160 <pubkey-hash> OP_EQUALVERIFY OP_CHECKSIG
\end{lstlisting}
\end{emphbox}
where the pubkey-hash is substituted with the witness program. Note that this is used for calculating the digest and not for validation (see next section).
\subsubsection*{P2WPKH}
In segwit version 0, a P2WPKH witness program is just the 20-byte public key hash. The unlocking script (\keyword{scriptSig}) should be empty and the \emph{script witness} (or just \emph{witness}\footnote{The witness structure is an array that contains one or more script witnesses that correspond to the Segwit inputs of the transaction. Each witness is effectively the \keyword{scriptSig} that unlocks that input.}) contains the unlocking script.
\begin{emphbox}
\begin{lstlisting}[style=Pseudomath]
scriptPubKey: 0 6b85f9a17492c691c1d861cc1c722ff683b27f5a
scriptSig:
witness: <signature> <pubkey>
\end{lstlisting}
\end{emphbox}
The validation is executed as follows:
\begin{enumerate}
\item The `0' in scriptPubKey specifies that the following is a version 0 witness program.
\item The length of the witness program (20-bytes) indicates that it is a P2WPKH type.
\item The witness must consist of exactly two items.
\item The \keyword{HASH160} of the <pubkey> must match the 20-bytes witness program.
\item Finally, the signature is verified by: \keyword{<signature> <pubkey> CHECKSIG}.
\end{enumerate}
The \keyword{scriptCode} for P2WPKH is identical to P2PKH.
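As a minimal sketch, the two scripts can be serialized by hand from the 20-byte witness program used in the example above. The helper names are our own; only direct pushes of short data are handled.
\vspace{1em}
\begin{lstlisting}[style=Python]
# opcode byte values as defined by Bitcoin's script language
OP_0 = 0x00
OP_DUP = 0x76
OP_HASH160 = 0xA9
OP_EQUALVERIFY = 0x88
OP_CHECKSIG = 0xAC


def p2wpkh_script_pubkey(pubkey_hash: bytes) -> bytes:
    """Locking script: version byte 0 followed by the pushed program."""
    assert len(pubkey_hash) == 20
    return bytes([OP_0, len(pubkey_hash)]) + pubkey_hash


def p2wpkh_script_code(pubkey_hash: bytes) -> bytes:
    """scriptCode used for the digest: same template as P2PKH."""
    assert len(pubkey_hash) == 20
    return (bytes([OP_DUP, OP_HASH160, len(pubkey_hash)]) + pubkey_hash
            + bytes([OP_EQUALVERIFY, OP_CHECKSIG]))


witness_program = bytes.fromhex("6b85f9a17492c691c1d861cc1c722ff683b27f5a")

script_pubkey_hex = p2wpkh_script_pubkey(witness_program).hex()
script_code_hex = p2wpkh_script_code(witness_program).hex()
print(script_pubkey_hex)
print(script_code_hex)
\end{lstlisting}
\vspace{1em}
The locking script is only 22 bytes long (\keyword{0014} followed by the hash), while the 25-byte \keyword{scriptCode} never appears on the blockchain; it is only reconstructed when the digest is calculated.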
\subsubsection*{P2WSH}
In segwit version 0, a P2WSH witness program is just the 32-byte script hash. The unlocking script (\keyword{scriptSig}) should be empty and the witness contains the script that unlocks the funds as well as the \emph{witness script}\footnote{To clarify, witness script for a P2WSH is exactly what redeem script is for a P2SH and script witness or just witness is the equivalent of the scriptSig or unlocking script.}.
\begin{emphbox}
\begin{lstlisting}[style=Pseudomath]
scriptPubKey: 0 3b892c61cc15f9a17<32 bytes>c1c722ff683b27f5a
scriptSig:
witness: 0 <signature1> <1 <pubkey1> <pubkey2> 2 CHECKMULTISIG>
\end{lstlisting}
\end{emphbox}
The validation is executed as follows:
\begin{enumerate}
\item The `0' in scriptPubKey specifies that the following is a version 0 witness program.
\item The length of the witness program (32-bytes) indicates that it is a P2WSH type.
\item The witness must consist of the unlocking script followed by the witness script.
\item The \keyword{SHA256} of the witness script must match the 32-byte witness program.
\item Finally, the witness script is deserialized and executed after the remaining witness stack: \keyword{0 <signature1> \emph{1 <pubkey1> <pubkey2> 2 CHECKMULTISIG}}.
\end{enumerate}
The \keyword{scriptCode} for P2WSH is the witness script.
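The first four validation steps can be sketched as follows. The public keys are placeholders (not real keys) and the helper names \keyword{push}, \keyword{p2wsh\_script\_pubkey} and \keyword{witness\_program\_type} are our own; the sketch only handles version 0 programs and does not execute the witness script.
\vspace{1em}
\begin{lstlisting}[style=Python]
import hashlib

OP_0 = 0x00
OP_1 = 0x51
OP_2 = 0x52
OP_CHECKMULTISIG = 0xAE


def push(data: bytes) -> bytes:
    """Direct push of 1..75 bytes: a length byte followed by the data."""
    assert 1 <= len(data) <= 75
    return bytes([len(data)]) + data


def p2wsh_script_pubkey(witness_script: bytes) -> bytes:
    """Version 0 followed by the single SHA-256 of the witness script."""
    program = hashlib.sha256(witness_script).digest()
    return bytes([OP_0]) + push(program)


def witness_program_type(script_pubkey: bytes) -> str:
    """Classify a version 0 witness program by its length."""
    if len(script_pubkey) < 2 or script_pubkey[0] != OP_0:
        return "not a version 0 witness program"
    program = script_pubkey[2:]
    if script_pubkey[1] != len(program):
        return "malformed"
    return {20: "P2WPKH", 32: "P2WSH"}.get(len(program), "invalid")


# placeholder 33-byte compressed public keys (NOT real keys)
pk1, pk2 = (bytes([0x02]) + bytes([i]) * 32 for i in (1, 2))
# witness script: 1 <pubkey1> <pubkey2> 2 CHECKMULTISIG
witness_script = (bytes([OP_1]) + push(pk1) + push(pk2)
                  + bytes([OP_2, OP_CHECKMULTISIG]))

script_pubkey = p2wsh_script_pubkey(witness_script)
kind = witness_program_type(script_pubkey)
# step 4: the hash of the revealed witness script must match the program
matches = hashlib.sha256(witness_script).digest() == script_pubkey[2:]
print(kind, len(script_pubkey), matches)
\end{lstlisting}
\vspace{1em}
Note that P2WSH uses a single SHA-256 (32 bytes) rather than the \keyword{HASH160} (20 bytes) used by P2SH, which is why the length alone is enough to tell the two version 0 program types apart.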
\subsection*{Nested Segwit transaction output types}
It is possible for a non-segwit aware wallet to pay to a segwit address by embedding P2WPKH or P2WSH into a P2SH. The recipient will provide a P2SH address to the sender who can send funds as usual. The recipient can then spend the funds using the redeem script, which is actually the segwit locking script (version byte and witness program), together with the appropriate witness.
\subsubsection*{P2SH(P2WPKH)}
Get the hash of the P2WPKH \keyword{scriptPubKey} and use it in P2SH as usual:
\begin{emphbox}
\begin{lstlisting}[style=Pseudomath]
HASH160 (0 6b85f9a17492c691c1d861cc1c722ff683b27f5a)
\end{lstlisting}
\end{emphbox}
And thus:
\begin{emphbox}
\begin{lstlisting}[style=Pseudomath]
scriptPubKey: HASH160 <20-byte-script-hash> EQUAL
scriptSig: <0 <20-byte-key-hash>>
witness: <signature> <pubkey>
\end{lstlisting}
\end{emphbox}
Note that for spending\footnote{From a segwit-aware wallet.} the \keyword{scriptSig} contains the redeem script as a single element and without any extra unlocking data before it since those are in witness (check section~\ref{sec:p2sh} for how P2SH works).
For validation the redeem script contained in the \keyword{scriptSig} is hashed with \keyword{HASH160} and compared to the \keyword{<20-byte-script-hash>} in \keyword{scriptPubKey}. If they are equal we can then verify the public key and signature as a native P2WPKH.
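What ``as a single element'' means at the byte level can be shown with a short sketch. It reuses the 20-byte key hash of the earlier example; the variable names are our own.
\vspace{1em}
\begin{lstlisting}[style=Python]
key_hash = bytes.fromhex("6b85f9a17492c691c1d861cc1c722ff683b27f5a")

# redeem script: 0 <20-byte-key-hash>, i.e. the P2WPKH locking script
redeem_script = bytes([0x00, len(key_hash)]) + key_hash

# the scriptSig is one push whose data is the whole redeem script
script_sig = bytes([len(redeem_script)]) + redeem_script

redeem_script_hex = redeem_script.hex()
script_sig_hex = script_sig.hex()
print(redeem_script_hex)
print(script_sig_hex)
\end{lstlisting}
\vspace{1em}
The 22-byte redeem script is wrapped in a single push (length byte \keyword{0x16}), so the \keyword{scriptSig} is 23 bytes long and carries no signature or public key; those travel in the witness.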
\subsubsection*{P2SH(P2WSH)}
Get the hash of the P2WSH \keyword{scriptPubKey} and use it in P2SH as usual:
\begin{emphbox}
\begin{lstlisting}[style=Pseudomath]
HASH160 (0 3b892c61cc15f9a17<32 bytes>c1c722ff683b27f5a)
\end{lstlisting}
\end{emphbox}
And thus:
\begin{emphbox}
\begin{lstlisting}[style=Pseudomath]
scriptPubKey: HASH160 <20-byte-script-hash> EQUAL
scriptSig: <0 <32-byte-witness-script-hash>>
witness: 0 <signature1> <1 <pubkey1> <pubkey2> 2 CHECKMULTISIG>
\end{lstlisting}
\end{emphbox}
Note that for spending the \keyword{scriptSig} contains the redeem script (the version byte and the hash of the witness script) as a single element and without any extra unlocking data before it, since those, along with the witness script itself, are in the witness (check section~\ref{sec:p2sh} for how P2SH works).
For validation the redeem script contained in the \keyword{scriptSig} is hashed with \keyword{HASH160} and compared to the \keyword{<20-byte-script-hash>} in \keyword{scriptPubKey}. If they are equal we can then verify the witness script as a native P2WSH.
\subsection*{Implemented as a soft-fork}
Changing the transaction format is normally a hard-fork. The new transactions would not be accepted by the nodes running the old software. To get around that and implement the new functionality as a soft-fork additional effort was required. Without going into too much detail:
\begin{itemize}
\item The original transaction format was not changed. The \keyword{scriptSig} would just be empty and the \keyword{witnesses} would go in a new structure.
\item The header \emph{should} represent the whole block and thus a witness merkle root is calculated (from the hashes of all transactions including their witnesses) and included in an \keyword{OP\_RETURN} output (see section~\ref{sec:storing-data}) of the coinbase transaction. Being in the coinbase means that the witnesses are represented in the header via the \keyword{hashMerkleRoot} (see section~\ref{sec:mining-technical} for a reminder).
\item Witness data are provided only when nodes ask for them and thus old nodes will get blocks without the witness data, i.e. new nodes will remove witness data before relaying the blocks to old nodes. To old nodes, segwit blocks would look like blocks that contain some non-standard transactions.
\item Old nodes trying to spend a segwit output would violate the clean stack rule\footnote{If there is more than one element on the stack after execution, the stack is not \emph{clean} and the transaction is considered non-standard for legacy transactions. For segwit v0 (P2WSH) and Tapscript it is considered invalid.}; OP\_0 <bytes in hex> will remain in the stack.
\end{itemize}
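The witness commitment of the second item can be sketched as follows, based on our reading of BIP-141. The witness transaction hashes are placeholders derived from arbitrary strings, the reserved value is assumed to be 32 zero bytes, and byte-order conventions are glossed over; the function names are our own.
\vspace{1em}
\begin{lstlisting}[style=Python]
import hashlib


def sha256d(data: bytes) -> bytes:
    """SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(hashes) -> bytes:
    """Pair hashes level by level, duplicating the last one if odd."""
    level = list(hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [sha256d(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def witness_commitment_script(other_wtxids, reserved_value=bytes(32)):
    """Build the OP_RETURN script that commits to all witnesses."""
    # the coinbase's own witness hash is defined as 32 zero bytes
    coinbase_wtxid = bytes(32)
    witness_root = merkle_root([coinbase_wtxid] + list(other_wtxids))
    commitment = sha256d(witness_root + reserved_value)
    # OP_RETURN, push 36 bytes, 4-byte commitment header, 32-byte hash
    return bytes([0x6A, 0x24]) + bytes.fromhex("aa21a9ed") + commitment


# placeholder hashes standing in for the block's witness transaction ids
placeholder_wtxids = [sha256d(f"tx-{i}".encode()) for i in range(1, 4)]

script = witness_commitment_script(placeholder_wtxids)
print(len(script), script[:6].hex())
\end{lstlisting}
\vspace{1em}
Because this 38-byte script lives in an output of the coinbase transaction, it is covered by the regular merkle root in the block header, and old nodes simply see an unspendable \keyword{OP\_RETURN} output.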
\subsection*{Example: spend a native segwit output type}
Assuming that a P2WPKH address has some funds (UTXOs) we can use the following code to send some funds to a legacy address. As usual, we hardcode the UTXOs for simplicity.
\vspace{1em}
\begin{lstlisting}[style=Python]
from bitcoinutils.setup import setup
from bitcoinutils.utils import to_satoshis
from bitcoinutils.transactions import Transaction, TxInput, TxOutput
from bitcoinutils.keys import P2pkhAddress, PrivateKey
from bitcoinutils.script import Script

def main():
    # always remember to setup the network
    setup('testnet')

    # the key that corresponds to the P2WPKH address
    priv = PrivateKey("cVdte9ei2xsVjmZSPtyucG43YZgNkmKTqhwiUA8M4Fc3LdPJxPmZ")
    pub = priv.get_public_key()
    fromAddress = pub.get_segwit_address()
    print(fromAddress.to_string())

    # amount is needed to sign the segwit input
    fromAddressAmount = to_satoshis(0.01)

    # UTXO of fromAddress
    txid = '13d2d30eca974e8fa5da11b9608fa36905a22215e8df895e767fc903889367ff'
    vout = 0

    toAddress = P2pkhAddress('mrrKUpJnAjvQntPgz2Z4kkyr1gbtHmQv28')

    # create transaction input from tx id of UTXO
    txin = TxInput(txid, vout)

    # script code required for signing; for p2wpkh it is the same as p2pkh
    script_code = Script(['OP_DUP', 'OP_HASH160', pub.to_hash160(),
                          'OP_EQUALVERIFY', 'OP_CHECKSIG'])

    # create transaction output
    txOut = TxOutput(to_satoshis(0.009), toAddress.to_script_pub_key())

    # create transaction without change output - if at least a single input is
    # segwit we need to set has_segwit=True
    tx = Transaction([txin], [txOut], has_segwit=True)
    print("\nRaw transaction:\n" + tx.serialize())

    sig = priv.sign_segwit_input(tx, 0, script_code, fromAddressAmount)
    tx.witnesses.append(Script([sig, pub.to_hex()]))

    # print raw signed transaction ready to be broadcasted
    print("\nRaw signed transaction:\n" + tx.serialize())
    print("\nTxId:", tx.get_txid())

if __name__ == "__main__":
    main()
\end{lstlisting}
\vspace{1em}
\subsection*{Example: spend a nested segwit output type}
As described above we can also nest the new segwit address output types in a P2SH so that old non-segwit clients can send funds to new segwit-supporting clients. This example spends such a UTXO demonstrating a couple of additional details that need to be taken into consideration.
\vspace{1em}
\begin{lstlisting}[style=Python]
from bitcoinutils.setup import setup
from bitcoinutils.utils import to_satoshis
from bitcoinutils.keys import PrivateKey, P2pkhAddress, P2shAddress
from bitcoinutils.transactions import Transaction, TxInput, TxOutput
from bitcoinutils.script import Script

def main():
    # always remember to setup the network
    setup('testnet')

    # the key that corresponds to the P2WPKH address
    priv = PrivateKey('cNho8fw3bPfLKT4jPzpANTsxTsP8aTdVBD6cXksBEXt4KhBN7uVk')
    pub = priv.get_public_key()

    # the p2sh script and the corresponding address
    redeem_script = pub.get_segwit_address().to_script_pub_key()
    p2sh_addr = P2shAddress.from_script(redeem_script)

    # the UTXO of the P2SH-P2WPKH that we are trying to spend
    inp = TxInput('95c5cac558a8b47436a3306ba300c8d7af4cd1d1523d35da3874153c66d99b09', 0)

    # exact amount of the UTXO we try to spend
    amount = 0.0014

    # the address to send funds to
    to_addr = P2pkhAddress('mvBGdiYC8jLumpJ142ghePYuY8kecQgeqS')

    # the output sending 0.001 -- 0.0004 goes to miners as fee -- no change
    out = TxOutput(to_satoshis(0.001), to_addr.to_script_pub_key())

    # create a tx with at least one segwit input
    tx = Transaction([inp], [out], has_segwit=True)

    # script code is the script that is evaluated for a witness program type; each
    # witness program type has a specific template for the script code;
    # the script code that corresponds to P2WPKH is the same as P2PKH
    script_code = pub.get_address().to_script_pub_key()

    # calculate signature using the appropriate script code
    # remember to include the original amount of the UTXO
    sig = priv.sign_segwit_input(tx, 0, script_code, to_satoshis(amount))

    # script_sig is the redeem script passed as a single element
    inp.script_sig = Script([redeem_script.to_hex()])

    # finally, the unlocking script is added as a witness
    tx.witnesses.append(Script([sig, pub.to_hex()]))

    # print raw signed transaction ready to be broadcasted
    print("\nRaw signed transaction:\n" + tx.serialize())

if __name__ == "__main__":
    main()
\end{lstlisting}
\vspace{1em}
\section{Pay To Multi-signature (P2MS)}
Pay to multi-signature or \emph{Pay-to-Multisig} is a standard output type that was introduced before P2SH. Its aim was to provide a way for bitcoins to be locked by several public keys which could belong to different people. Typically, only a subset of signatures would be required. For example, a 2-of-3 multisig would require signatures from at least 2 of the 3 corresponding private keys.
In section~\ref{sec:p2sh} we have seen an example of a 2-of-3 multisig wrapped in P2SH. This is the typical way to use multisig after P2SH was created because P2MS has several drawbacks:
\begin{itemize}
\item It is limited to 3 public keys while P2SH allows up to 15.
\item It has no address format. To send funds to a P2MS the sender needs to know the multisig script.
\item The public keys are visible even before an output is spent.
\end{itemize}
To construct and lock funds in a P2MS output we use the exact script that we described in section~\ref{sec:p2sh} and send the funds there directly.
\begin{note}
The \keyword{CHECKMULTISIG} opcode has a bug where it pops an extra element from the stack. For backward compatibility the bug cannot be fixed and thus to avoid the issue we add an additional dummy value at the beginning of the unlocking script. Typically, \keyword{OP\_0} is used as the dummy; originally any value was valid, but since the segwit soft-fork (BIP-147) the dummy must be an empty byte array.
\end{note}
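The off-by-one behaviour can be illustrated with a toy model of the stack. The sketch below only counts what \keyword{CHECKMULTISIG} removes from the stack; it performs no signature verification, the stack items are plain strings and integers chosen for readability, and the function name is our own.
\vspace{1em}
\begin{lstlisting}[style=Python]
def checkmultisig_pops(stack):
    """Pop items the way CHECKMULTISIG does; return (consumed, remaining)."""
    stack = list(stack)          # the top of the stack is the last item
    consumed = []
    n = stack.pop()              # number of public keys
    consumed.append(n)
    for _ in range(n):
        consumed.append(stack.pop())
    m = stack.pop()              # number of required signatures
    consumed.append(m)
    for _ in range(m):
        consumed.append(stack.pop())
    # the bug: one extra element is removed from the stack
    consumed.append(stack.pop())
    return consumed, stack


# stack after running: <dummy> <sig1> <sig2> 2 <pk1> <pk2> <pk3> 3
with_dummy = ["dummy", "sig1", "sig2", 2, "pk1", "pk2", "pk3", 3]
consumed, remaining = checkmultisig_pops(with_dummy)
print(len(consumed), remaining)

# without the dummy the extra pop finds an empty stack
try:
    checkmultisig_pops(with_dummy[1:])
    fails_without_dummy = False
except IndexError:
    fails_without_dummy = True
print(fails_without_dummy)
\end{lstlisting}
\vspace{1em}
For a 2-of-3 script eight items are consumed, one more than the seven that the script logically needs, which is why the unlocking script has to start with the dummy value.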
\section{Storing Data (OP\_RETURN)}
\label{sec:storing-data}
The blockchain ensures that all existing entries are tamper-resistant; modifications and deletions are not allowed. This makes it quite useful for \emph{permanently} storing data that will stand the test of time, which is ideal for certain applications like notary services, certificate ownership and others.
\subsection*{Indirectly}
However, Bitcoin’s blockchain was not designed for storage in general and data could only be stored indirectly. Examples include adding data to coinbase transactions, to transaction outputs and to multi-signature addresses.
Let's go through one of the above examples, and explain how it would be possible to store data in the outputs of fake addresses. Remember that the output address is represented as the public key hash or 20 bytes (40 hex characters). Those can be faked to represent the data that we need to store. For example\footnote{Borrowed from Ken Shirriff's excellent blog post: \url{https://www.righto.com/2014/02/ascii-bernanke-wikileaks-photographs.html}}, to store ``\keyword{3Nelson-Mandela.jpg?}'' (20 characters) we would need to convert it to hexadecimal which is ``\keyword{334E656C736F6E2D4D616E64656C612 E6A70673F}'' and would correspond to address ``\keyword{15gHNr4TCKmhHDEG31L2XFNvpnEcnPS Qvd}''. Amazingly, the first transaction's\footnote{The transaction with id 8881a937a437ff6ce83be3a89d77ea88ee12315f37f7ef0dd3742c30eef92dba.} outputs of that address, if added together, contain an actual image of Nelson Mandela!
The satoshis sent to such a fake address will be lost forever since there is no (known) private key that corresponds to it. In the past, when the value of bitcoin was small, it was affordable to store data this way, but regardless of the cost such methods are discouraged, to say the least.
\begin{note}
Storing data this way creates an overhead for the UTXO set in every node in the network. These addresses will never be used (i.e. the satoshis there are lost) but the system is not aware so they need to keep track of them as unspent outputs!
\end{note}
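The encoding of data as a fake address, as in the example above, can be sketched with the standard library. The 20 bytes of text are used directly in place of a public key hash and are then encoded with Base58Check using the mainnet P2PKH version byte \keyword{0x00}; the function name \keyword{base58check\_encode} is our own.
\vspace{1em}
\begin{lstlisting}[style=Python]
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check_encode(version: int, payload: bytes) -> str:
    """Prefix the version byte, append a 4-byte checksum, encode in Base58."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    num = int.from_bytes(raw, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = B58_ALPHABET[rem] + encoded
    # each leading zero byte is represented by a leading '1'
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


data = b"3Nelson-Mandela.jpg?"   # exactly 20 bytes of payload
assert len(data) == 20

# the "public key hash" is just the data itself
fake_pubkey_hash_hex = data.hex().upper()
fake_address = base58check_encode(0x00, data)
print(fake_pubkey_hash_hex)
print(fake_address)
\end{lstlisting}
\vspace{1em}
The printed address can be compared with the one quoted above. Since no private key is known whose public key hashes to these 20 bytes, any output locked to this address stays in the UTXO set indefinitely.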
\subsection*{Directly}
Storing data on the blockchain with the above methods was frowned upon by the community because of the overhead it was placing on the running nodes. Others argued that as long as the transaction fee is paid there is no reason why it should be considered spam. The compromise was the introduction of an operator, called \keyword{OP\_RETURN}, specifically dealing with storing a small amount of data on the blockchain.
The \keyword{OP\_RETURN} opcode was followed by a maximum of 80 bytes of data\footnote{Note that OP\_RETURN outputs larger than 80 bytes are still valid but are not considered standard. Thus, they will not be relayed by nodes but a miner can include them in the blockchain. For an example see transaction with id d29c9c0e8e4d2a9790922af73f0b8d51f0bd4bb19940d9cf910ead8fbe85bc9b.}. No satoshis were required to be sent (other than the transaction fees) and more importantly, \keyword{OP\_RETURN} was not bloating the UTXO set.
Since 80 bytes is a very small amount of data it is usually used to store a hash (or digest or digital fingerprint) of some data ensuring the integrity of the data rather than the immutable existence of the data itself. Alternatively, it can be used to encode meta-protocol information, such as used by Counterparty\footnote{https://counterparty.io/}, the OMNI protocol\footnote{https://www.omnilayer.org/} and verifiable PDFs\footnote{https://verifiable-pdfs.org/}.%\footnote{https://ieeexplore.ieee.org/document/8525400}.
An example script would be:
\begin{emphbox}