Bitcoin Cash Protocol Upgrades

Restore disabled script opcodes, May 2018

layout: specification
title: Restore disabled script opcodes, May 2018
category: spec
date: 2018-04-05
activation: 1526400000
version: 0.4
updated: 2018-05-23

Introduction

In 2010 and 2011 the discovery of serious bugs prompted the deactivation of many opcodes in the Bitcoin script language. It is our intention to restore the functionality that some of these opcodes provided in Bitcoin Cash. Rather than simply re-enable the opcodes, the functionality that they provide has been re-examined and in some cases the opcodes have been re-designed or new opcodes have been added to address specific issues.

This document contains the specifications for the opcodes that are to be added in the May 2018 protocol upgrade. We anticipate that additional opcodes will be proposed for the November 2018, or later, protocol upgrades.

The opcodes that are to be added are:

Word OpCode Hex Input Output Description
OP_CAT 126 0x7e x1 x2 out Concatenates two byte sequences
OP_SPLIT 127 0x7f x n x1 x2 Split byte sequence x at position n
OP_AND 132 0x84 x1 x2 out Boolean AND between each bit of the inputs
OP_OR 133 0x85 x1 x2 out Boolean OR between each bit of the inputs
OP_XOR 134 0x86 x1 x2 out Boolean EXCLUSIVE OR between each bit of the inputs
OP_DIV 150 0x96 a b out a is divided by b
OP_MOD 151 0x97 a b out return the remainder after a is divided by b
OP_NUM2BIN 128 0x80 a b out convert numeric value a into byte sequence of length b
OP_BIN2NUM 129 0x81 x out convert byte sequence x into a numeric value

Splice operations: OP_CAT, OP_SPLIT**

Bitwise logic: OP_AND, OP_OR, OP_XOR

Arithmetic: OP_DIV, OP_MOD

New operations:

  • x OP_BIN2NUM -> n, convert a byte sequence x into a numeric value
  • n m OP_NUM2BIN -> out, convert a numeric value n into a byte sequence of length m

Further discussion of the purpose of these new operations can be found below under bitwise operations.

** A new operation, OP_SPLIT, has been designed as a replacement for OP_SUBSTR, OP_LEFTand OP_RIGHT. The original operations can be implemented with varying combinations of OP_SPLIT, OP_SWAP and OP_DROP.

Script data types

It should be noted that in script operation data values on the stack are interpreted as either byte sequences or numeric values. All data on the stack is interpreted as a byte sequence unless specifically stated as being interpreted as a numeric value.

For accuracy in this specification, a byte sequences is presented as {0x01, 0x02, 0x03}. This sequence is three bytes long, it begins with a byte of value 1 and ends with a byte of value 3.

The numeric value type has specific limitations:

  1. The used encoding is little endian with an explicit sign bit (the highest bit of the last byte).
  2. They cannot exceed 4 bytes in length.
  3. They must be encoded using the shortest possible byte length (no zero padding)
    1. There is one exception to rule 3: if there is more than one byte and the most significant bit of the second-most-significant-byte is set it would conflict with the sign bit. In this case a single 0x00 or 0x80 byte is allowed to the left.
  4. Zero is encoded as a zero length byte sequence. Single byte positive or negative zero (0x00 or 0x80) are not allowed.

The new opcode x OP_BIN2NUM -> out can be used convert a byte sequence into a numeric value where required.

The new opcode x n OP_NUM2BIN can be used to convert a numeric value into a zero padded byte sequence of length n whilst preserving the sign bit.

Definitions

  • Stack memory use. This is the sum of the size of the elements on the stack. It gives an indication of impact on memory use by the interpreter.
  • Operand order. In keeping with convention where multiple operands are specified the top most stack item is the last operand. e.g. x1 x2 OP_CAT –> x2 is the top stack item and x1 is the next from the top.
  • empty byte sequence. Throughout this document OP_0 is used as a convenient representation of an empty byte sequence. Whilst it is a push data op code, its effect is to push an empty byte sequence on to the stack.

Specification

Global conditions apply to all operations. These conditions must be checked by the implementation when it is possible that they will occur:

  • for all e : elements on the stack, 0 <= len(e) <= MAX_SCRIPT_ELEMENT_SIZE
  • for each operator, the required number of operands are present on the stack when the operand is executed

These unit tests should be included for every operation:

  1. executing the operation with an input element of length greater than MAX_SCRIPT_ELEMENT_SIZE will fail
  2. executing the operation with an insufficient number of operands on the stack causes a failure

Operand consumption

In all cases where not explicitly stated otherwise the operand stack elements are consumed by the operation and replaced with the output.

Splice operations

OP_CAT

Opcode (decimal): 126
Opcode (hex): 0x7e

Concatenates two operands.

x1 x2 OP_CAT -> out

Examples:

  • {Ox11} {0x22, 0x33} OP_CAT -> 0x112233

The operator must fail if len(out) > MAX_SCRIPT_ELEMENT_SIZE. The operation cannot output elements that violate the constraint on the element size.

Note that the concatenation of a zero length operand is valid.

Impact of successful execution:

  • stack memory use is constant
  • number of elements on stack is reduced by one

The limit on the length of the output prevents the memory exhaustion attack and results in the operation having less impact on stack size than existing OP_DUP operators.

Unit tests:

  1. maxlen_x y OP_CAT -> failure. Concatenating any operand except an empty vector, including a single byte value (e.g. OP_1), onto a maximum sized array causes failure
  2. large_x large_y OP_CAT -> failure. Concatenating two operands, where the total length is greater than MAX_SCRIPT_ELEMENT_SIZE, causes failure
  3. OP_0 OP_0 OP_CAT -> OP_0. Concatenating two empty arrays results in an empty array
  4. x OP_0 OP_CAT -> x. Concatenating an empty array onto any operand results in the operand, including when len(x) = MAX_SCRIPT_ELEMENT_SIZE
  5. OP_0 x OP_CAT -> x. Concatenating any operand onto an empty array results in the operand, including when len(x) = MAX_SCRIPT_ELEMENT_SIZE
  6. x y OP_CAT -> concat(x,y). Concatenating two operands generates the correct result

OP_SPLIT

OP_SPLIT replaces OP_SUBSTR and uses it’s opcode.

Opcode (decimal): 127
Opcode (hex): 0x7f

Split the operand at the given position. This operation is the exact inverse of OP_CAT

x n OP_SPLIT -> x1 x2
where n is interpreted as a numeric value

Examples:

  • {0x00, 0x11, 0x22} 0 OP_SPLIT -> OP_0 {0x00, 0x11, 0x22}
  • {0x00, 0x11, 0x22} 1 OP_SPLIT -> {0x00} {0x11, 0x22}
  • {0x00, 0x11, 0x22} 2 OP_SPLIT -> {0x00, 0x11} {0x22}
  • {0x00, 0x11, 0x22} 3 OP_SPLIT -> {0x00, 0x11, 0x22} OP_0

Notes:

  • this operator has been introduced as a replacement for the previous OP_SUBSTR, OP_LEFTand OP_RIGHT. All three operators can be simulated with varying combinations of OP_SPLIT, OP_SWAP and OP_DROP. This is in keeping with the minimalist philosophy where a singlenprimitive can be used to simulate multiple more complex operations.
  • x is split at position n, where n is the number of bytes from the beginning
  • x1 will be the first n bytes of x and x2 will be the remaining bytes
  • if n == 0, then x1 is the empty array and x2 == x
  • if n == len(x) then x1 == x and x2 is the empty array.
  • if n > len(x), then the operator must fail.
  • x n OP_SPLIT OP_CAT -> x, for all x and for all 0 <= n <= len(x)

The operator must fail if:

  • !isnum(n). Fail if n is not a numeric value.
  • n < 0. Fail if n is negative.
  • n > len(x). Fail if n is too high.

Impact of successful execution:

  • stack memory use is constant (slight reduction by len(n))
  • number of elements on stack is constant

Unit tests:

  • OP_0 0 OP_SPLIT -> OP_0 OP_0. Execution of OP_SPLIT on empty array results in two empty arrays.
  • x 0 OP_SPLIT -> OP_0 x
  • x len(x) OP_SPLIT -> x OP_0
  • x (len(x) + 1) OP_SPLIT -> FAIL
  • include successful unit tests

Bitwise logic

The bitwise logic operators expect ‘byte sequence’ operands. The operands must be the same length.

  • In the case of ‘byte sequence’ operands OP_CAT can be used to pad a shorter byte sequence to an appropriate length.
  • In the case of ‘byte sequence’ operands where the length of operands is not known until runtime a sequence of 0x00 bytes (for use with OP_CAT) can be produced using OP_0 n OP_NUM2BIN
  • In the case of numeric value operands x n OP_NUM2BIN can be used to pad a numeric value to length n whilst preserving the sign bit.

OP_AND

Opcode (decimal): 132
Opcode (hex): 0x84

Boolean and between each bit in the operands.

x1 x2 OP_AND -> out

Notes:

  • where len(x1) == 0 and len(x2) == 0 the output will be an empty array.

The operator must fail if:

  1. len(x1) != len(x2). The two operands must be the same size.

Impact of successful execution:

  • stack memory use reduced by len(x1)
  • number of elements on stack is reduced by one

Unit tests:

  1. x1 x2 OP_AND -> failure, where len(x1) != len(x2). The two operands must be the same size.
  2. x1 x2 OP_AND -> x1 & x2. Check valid results.

OP_OR

Opcode (decimal): 133
Opcode (hex): 0x85

Boolean or between each bit in the operands.

x1 x2 OP_OR -> out

The operator must fail if:

  1. len(x1) != len(x2). The two operands must be the same size.

Impact of successful execution:

  • stack memory use reduced by len(x1)
  • number of elements on stack is reduced by one

Unit tests:

  1. x1 x2 OP_OR -> failure, where len(x1) != len(x2). The two operands must be the same size.
  2. x1 x2 OP_OR -> x1 | x2. Check valid results.

OP_XOR

Opcode (decimal): 134
Opcode (hex): 0x86

Boolean xor between each bit in the operands.

x1 x2 OP_XOR -> out

The operator must fail if:

  1. len(x1) != len(x2). The two operands must be the same size.

Impact of successful execution:

  • stack memory use reduced by len(x1)
  • number of elements on stack is reduced by one

Unit tests:

  1. x1 x2 OP_XOR -> failure, where len(x1) != len(x2). The two operands must be the same size.
  2. x1 x2 OP_XOR -> x1 xor x2. Check valid results.

Arithmetic

Note about canonical form and floor division

Operands for all arithmetic operations are assumed to be numeric values and must be in canonical form See data types for more information.

Floor division

Note that when considering integer division and modulo operations with negative operands, the rules applied in the C language and most languages (with Python being a notable exception) differ from the strict mathematical definition. Script follows the C language set of rules. Namely:

  1. Non-integer quotients are rounded towards zero
  2. The equation (a/b)*b + a%b == a is satisfied by the results
  3. From the above equation it follows that: a%b == a - (a/b)*b
  4. In practice if a is negative for the modulo operator the result will be negative or zero.

OP_DIV

Opcode (decimal): 150
Opcode (hex): 0x96

Return the integer quotient of a and b. If the result would be a non-integer it is rounded towards zero.

a b OP_DIV -> out

where a and b are interpreted as numeric values

The operator must fail if:

  1. !isnum(a) || !isnum(b). Fail if either operand is not a numeric value.
  2. b == 0. Fail if b is equal to any type of zero.

Impact of successful execution:

  • stack memory use reduced
  • number of elements on stack is reduced by one

Unit tests:

  1. a b OP_DIV -> failure where !isnum(a) or !isnum(b). Both operands must be numeric values
  2. a 0 OP_DIV -> failure. Division by positive zero (all sizes), negative zero (all sizes), OP_0
  3. 27 7 OP_DIV -> 3, 27 -7 OP_DIV -> -3, -27 7 OP_DIV -> -3, -27 -7 OP_DIV -> 3. Check negative operands. Pay attention to sign.
  4. check valid results for operands of different lengths 0..4

OP_MOD

Opcode (decimal): 151
Opcode (hex): 0x97

Returns the remainder after dividing a by b. The output will be represented using the least number of bytes required.

a b OP_MOD -> out

where a and b are interpreted as numeric values

The operator must fail if:

  1. !isnum(a) || !isnum(b). Fail if either operand is not a numeric value.
  2. b == 0. Fail if b is equal to any type of zero.

Impact of successful execution:

  • stack memory use reduced (one element removed)
  • number of elements on stack is reduced by one

Unit tests:

  1. a b OP_MOD -> failure where !isnum(a) or !isnum(b). Both operands must be numeric values.
  2. a 0 OP_MOD -> failure. Division by positive zero (all sizes), negative zero (all sizes), OP_0
  3. 27 7 OP_MOD -> 6, 27 -7 OP_MOD -> 6, -27 7 OP_MOD -> -6, -27 -7 OP_MOD -> -6. Check negative operands. Pay attention to sign.
  4. check valid results for operands of different lengths 0..4 and returning result zero

New operations

OP_NUM2BIN

OP_NUM2BIN replaces OP_LEFT and uses it’s opcode

Opcode (decimal): 128
Opcode (hex): 0x80

Convert the numeric value into a byte sequence of a certain size, taking account of the sign bit. The byte sequence produced uses the little-endian encoding.

a b OP_NUM2BIN -> x

where a and b are interpreted as numeric values. a is the value to be converted to a byte sequence, it can be up to MAX_SCRIPT_ELEMENT_SIZE long and does not need to be minimally encoded. b is the desired size of the result, it must be minimally encoded and <= 4 bytes long. It must be possible for the value a to be encoded in a byte sequence of length b without loss of data.

See also OP_BIN2NUM.

Examples:

  • 2 4 OP_NUM2BIN -> {0x02, 0x00, 0x00, 0x00}
  • -5 4 OP_NUM2BIN -> {0x05, 0x00, 0x00, 0x80}

The operator must fail if:

  1. b is not a minimally encoded numeric value.
  2. b < len(minimal_encoding(a)). a must be able to fit into b bytes.
  3. b > MAX_SCRIPT_ELEMENT_SIZE. The result would be too large.

Impact of successful execution:

  • stack memory use will be increased by b - len(a) - len(b), maximum increase is when b = MAX_SCRIPT_ELEMENT_SIZE
  • number of elements on stack is reduced by one

Unit tests:

  1. a b OP_NUM2BIN -> failure where !isnum(b). b must be a minimally encoded numeric value.
  2. 256 1 OP_NUM2BIN -> failure. Trying to produce a byte sequence which is smaller than the minimum size needed to contain the numeric value.
  3. 1 (MAX_SCRIPT_ELEMENT_SIZE+1) OP_NUM2BIN -> failure. Trying to produce an array which is too large.
  4. other valid parameters with various results

OP_BIN2NUM

OP_BIN2NUM replaces OP_RIGHT and uses it’s opcode

Opcode (decimal): 129
Opcode (hex): 0x81

Convert the byte sequence into a numeric value, including minimal encoding. The byte sequence must encode the value in little-endian encoding.

a OP_BIN2NUM -> x

See also OP_NUM2BIN.

Notes:

  • if a is any form of zero, including negative zero, then OP_0 must be the result

Examples:

  • {0x02, 0x00, 0x00, 0x00, 0x00} OP_BIN2NUM -> 2. 0x0200000000 in little-endian encoding has value 2.
  • {0x05, 0x00, 0x80} OP_BIN2NUM -> -5 - 0x050080 in little-endian encoding has value -5.

The operator must fail if:

1.the numeric value is out of the range of acceptable numeric values (currently size is limited to 4 bytes)

Impact of successful execution:

  • stack memory use is equal or less than before. Minimal encoding of the byte sequence can produce a result which is shorter.
  • the number of elements on the stack remains constant

Unit tests:

  1. a OP_BIN2NUM -> failure, when a is a byte sequence whose numeric value is too large to fit into the numeric value type, for both positive and negative values.
  2. {0x00} OP_BIN2NUM -> OP_0. Byte sequences, of various lengths, consisting only of zeros should produce an OP_0 (zero length array).
  3. {0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00} OP_BIN2NUM -> 1. A large byte sequence, whose numeric value would fit in the numeric value type, is a valid operand.
  4. The same test as above, where the length of the input byte sequence is equal to MAX_SCRIPT_ELEMENT_SIZE.
  5. {0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80} OP_BIN2NUM -> -1. Same as above, for negative values.
  6. {0x80} OP_BIN2NUM -> OP_0. Negative zero, in a byte sequence, should produce zero.
  7. {0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80} OP_BIN2NUM -> OP_0. Large negative zero, in a byte sequence, should produce zero.
  8. other valid parameters with various results

Reference implementation

References

OP_CODES: https://en.bitcoin.it/wiki/Script#Opcodes