Du kannst diesen Artikel auch auf Deutsch lesen.
Part 1: How does Bitcoin Script work, and why is it so difficult?
Part 2: What's Miniscript, and how does it make Bitcoin scripting easier?
Part 3: Writing a Miniscript parser and analyzer in Go

In the previous part of this series, we looked into what Miniscript is and how it maps to Bitcoin Script.

To understand how Miniscript works in detail, it's helpful to see an example of its implementation, including how to analyze them for correctness, how to create receive addresses, and how to spend coins.

Let's dive in and write an implementation in Go.

We will keep https://bitcoin.sipa.be/miniscript/ open as a reference, as it contains the specification and properties of all Miniscript fragments.

Every section will contain a link at the top and at the bottom to the Go playground, where you can run the code, see the output, and tinker with it.

In a nutshell, we will convert a miniscript into an Abstract Syntax Tree, and then perform a series of tree transformations and tree walks to perform correctness analysis, create the corresponding Bitcoin Script, create receive addresses, and so on.

⚠️ Disclaimer: the implementation below is not reviewed and not tested. Do not use it in production. This is for educational purposes only. ⚠️


Outline

  1. Parse miniscript, create an Abstract Syntax Tree
  2. Check that only known fragments occur, check arguments
  3. Expand wrappers
  4. Desugar
  5. Typecheck
  6. Produce Bitcoin script
  7. Generate receive address

Step one - convert it into an Abstract Syntax Tree

Click here to run the code in the Go playground.

Miniscript expressions are simple and can be easily converted into an AST. Unlike mathematical/algebraic expressions, Miniscript expressions do not contain any infix operators, grouping parenthesis and only contain parenthesis for enclosing arguments of a fragment. As a result, it is easy to represent and also simple to parse.

Let's define our AST:

// AST is the abstract syntax tree representing a Miniscript expression.
type AST struct {
	wrappers   string
	identifier string
	args       []*AST
}

Identifiers like or_b will be stored in the identifier field. If there are any wrappers, e.g. ascd:X, the wrappers will be split off and stored in the wrappers field. Finally, arguments to fragments are recursively stored in args.

To turn an expression into a tree during parsing, we will need the trusted old stack data structure:

type stack struct {
	elements []*AST
}

func (s *stack) push(element *AST) {
	s.elements = append(s.elements, element)
}

func (s *stack) pop() *AST {
	if len(s.elements) == 0 {
		return nil
	}
	top := s.elements[len(s.elements)-1]
	s.elements = s.elements[:len(s.elements)-1]
	return top
}

func (s *stack) top() *AST {
	if len(s.elements) == 0 {
		return nil
	}
	return s.elements[len(s.elements)-1]
}

func (s *stack) size() int {
	return len(s.elements)
}

To turn an expression into a tree using a stack, we first split the expression into tokens separated by parenthesis and commas. Unfortunately there is no suitable function in the Go standard library, so I asked ChatGPT to write it for me with great success:

// - Written by ChatGPT.
// splitString keeps separators as individual slice elements and splits a string
// into a slice of strings based on multiple separators. It removes any empty
// elements from the output slice.
func splitString(s string, isSeparator func(c rune) bool) []string {
	// Create a slice to hold the substrings
	substrs := make([]string, 0)

	// Set the initial index to zero
	i := 0

	// Iterate over the characters in the string
	for i < len(s) {
		// Find the index of the first separator in the string
		j := strings.IndexFunc(s[i:], isSeparator)
		if j == -1 {
			// If no separator was found, append the remaining substring and return
			substrs = append(substrs, s[i:])
			return substrs
		}
		j += i
		// If a separator was found, append the substring before it
		if j > i {
			substrs = append(substrs, s[i:j])
		}

		// Append the separator as a separate element
		substrs = append(substrs, s[j:j+1])
		i = j + 1
	}
	return substrs
}

A quick unit test confirms it works:

func TestSplitString(t *testing.T) {
	separators := func(c rune) bool {
		return c == '(' || c == ')' || c == ','
	}

	require.Equal(t, []string{}, splitString("", separators))
	require.Equal(t, []string{"0"}, splitString("0", separators))
	require.Equal(t, []string{"0", ")", "(", "1", "("}, splitString("0)(1(", separators))
	require.Equal(t,
		[]string{"or_b", "(", "pk", "(", "key_1", ")", ",", "s:pk", "(", "key_2", ")", ")"},
		splitString("or_b(pk(key_1),s:pk(key_2))", separators))
}

We are ready to iterate through the tokens and parens/commas and build up an expression tree.

Whenever we see an identifier (anything between a parens and commas), we push the identifier on the stack, which will be the parent of all its child arguments. Whenever we see a comma or a closing parens, we know it is the end of an argument, so we pop the argument of the stack and add it to its parent. Some invalid sequences are explicitly ruled out like "()" or "(,", which can never appear in valid miniscripts.

func createAST(miniscript string) (*AST, error) {
	tokens := splitString(miniscript, func(c rune) bool {
		return c == '(' || c == ')' || c == ','
	})

	if len(tokens) > 0 {
		first, last := tokens[0], tokens[len(tokens)-1]
		if first == "(" || first == ")" || first == "," || last == "(" || last == "," {
			return nil, errors.New("invalid first or last character")
		}
	}

	// Build abstract syntax tree.
	var stack stack
	for i, token := range tokens {
		switch token {
		case "(":
			// Exclude invalid sequences, which cannot appear in valid miniscripts: "((", ")(", ",(".
			if i > 0 && (tokens[i-1] == "(" || tokens[i-1] == ")" || tokens[i-1] == ",") {
				return nil, fmt.Errorf("the sequence %s%s is invalid", tokens[i-1], token)
			}
		case ",", ")":
			// End of a function argument - take the argument and add it to the parent's argument
			// list. If there is no parent, the expression is unbalanced, e.g. `f(X))``.
			//
			// Exclude invalid sequences, which cannot appear in valid miniscripts: "(,", "()", ",,", ",)".
			if i > 0 && (tokens[i-1] == "(" || tokens[i-1] == ",") {
				return nil, fmt.Errorf("the sequence %s%s is invalid", tokens[i-1], token)
			}

			arg := stack.pop()
			parent := stack.top()
			if arg == nil || parent == nil {
				return nil, errors.New("unbalanced")
			}
			parent.args = append(parent.args, arg)
		default:
			if i > 0 && tokens[i-1] == ")" {
				return nil, fmt.Errorf("the sequence %s%s is invalid", tokens[i-1], token)
			}

			// Split wrappers from identifier if they exist, e.g. in "dv:older", "dv" are wrappers
			// and "older" is the identifier.
			wrappers, identifier, found := strings.Cut(token, ":")
			if !found {
				// No colon => Cut returns `identifier, ""`, not `"", identifier"`.
				wrappers, identifier = identifier, wrappers
			} else if wrappers == "" {
				return nil, fmt.Errorf("no wrappers found before colon before identifier: %s", identifier)
			} else if identifier == "" {
				return nil, fmt.Errorf("no identifier found after colon after wrappers: %s", wrappers)
			}

			stack.push(&AST{wrappers: wrappers, identifier: identifier})
		}
	}
	if stack.size() != 1 {
		return nil, errors.New("unbalanced")
	}
	return stack.top(), nil
}

Let's also add a function to draw the tree, so we can visualize it more easily:

func (a *AST) drawTree(w io.Writer, indent string) {
	if a.wrappers != "" {
		fmt.Fprintf(w, "%s:", a.wrappers)
	}
	fmt.Fprint(w, a.identifier)
	fmt.Fprintln(w)
	for i, arg := range a.args {
		mark := ""
		delim := ""
		if i == len(a.args)-1 {
			mark = "└──"
		} else {
			mark = "├──"
			delim = "|"
		}
		fmt.Fprintf(w, "%s%s", indent, mark)
		arg.drawTree(w,
			indent+delim+strings.Repeat(" ", len([]rune(arg.identifier))+len([]rune(mark))-1-len(delim)))
	}
}

func (a *AST) DrawTree() string {
	var b strings.Builder
	a.drawTree(&b, "")
	return b.String()
}

Let's give it a run with a complicated expression:

func main() {
	node, err := createAST("andor(pk(key_remote),or_i(and_v(v:pkh(key_local),hash160(H)),older(1008)),pk(key_revocation))")
	if err != nil {
		panic(err)
	}
	fmt.Println(node.DrawTree())
}

Success! The output is:

andor
├──pk
|   └──key_remote
├──or_i
|     ├──and_v
|     |      ├──v:pkh
|     |      |    └──key_local
|     |      └──hash160
|     |               └──H
|     └──older
|            └──1008
└──pk
    └──key_revocation

Of course, the parser does no checks yet whatsoever, so expressions like unknownFragment(foo,bar) also turn into an AST:

unknownFragment
├──foo
└──bar

Click here to run the code in the Go playground.


Step two - check fragments and argument numbers

Click here to run the code in the Go playground.

As the first of multiple tree walks, we will do an easy first check: is every fragment identifier in the tree a valid Miniscript fragment, and does each one have the correct number of arguments?

These are all the fragments, according to the spec:

const (
	// All fragment identifiers.

	f_0         = "0"         // 0
	f_1         = "1"         // 1
	f_pk_k      = "pk_k"      // pk_k(key)
	f_pk_h      = "pk_h"      // pk_h(key)
	f_pk        = "pk"        // pk(key) = c:pk_k(key)
	f_pkh       = "pkh"       // pkh(key) = c:pk_h(key)
	f_sha256    = "sha256"    // sha256(h)
	f_ripemd160 = "ripemd160" // ripemd160(h)
	f_hash256   = "hash256"   // hash256(h)
	f_hash160   = "hash160"   // hash160(h)
	f_older     = "older"     // older(n)
	f_after     = "after"     // after(n)
	f_andor     = "andor"     // andor(X,Y,Z)
	f_and_v     = "and_v"     // and_v(X,Y)
	f_and_b     = "and_b"     // and_b(X,Y)
	f_and_n     = "and_n"     // and_n(X,Y) = andor(X,Y,0)
	f_or_b      = "or_b"      // or_b(X,Z)
	f_or_c      = "or_c"      // or_c(X,Z)
	f_or_d      = "or_d"      // or_d(X,Z)
	f_or_i      = "or_i"      // or_i(X,Z)
	f_thresh    = "thresh"    // thresh(k,X1,...,Xn)
	f_multi     = "multi"     // multi(k,key1,...,keyn)
)

older, after, thresh and multi have a numeric first argument. As we need to parse it to see if it is a valid number, we will convert it into an number and store it in our AST for later use. Let's add a new field to our AST:

// AST is the abstract syntax tree representing a Miniscript expression.
type AST struct {
	wrappers   string
	identifier string
	// Parsed integer for when identifer is a expected to be a number, i.e. the first argument of
	// older/after/multi/thresh. Otherwise unused.
	num uint64
	args       []*AST
}

We also need a function that recursively walks through the tree and applies a function to every Miniscript expression/subexpression. The transformation function can modify a node or outright replace it with a new node, which will be useful in later stages of the parser:

func (a *AST) apply(f func(*AST) (*AST, error)) (*AST, error) {
	for i, arg := range a.args {
		// We don't rescurse into arguments which are not Miniscript subexpressions themselves:
		// key/hash variables and the numeric arguments of older/after/multi/thresh.
		switch a.identifier {
		case f_pk_k, f_pk_h, f_pk, f_pkh,
			f_sha256, f_hash256, f_ripemd160, f_hash160,
			f_older, f_after, f_multi:
			// None of the arguments of these functions are Miniscript subexpressions - they are
			// variables (or concrete assignments) or numbers.
			continue
		case f_thresh:
			// First argument is a number. The other arguments are subexpressions, which we want to
			// visit, so only skip the first argument.
			if i == 0 {
				continue
			}
		}

		new, err := arg.apply(f)
		if err != nil {
			return nil, err
		}
		a.args[i] = new
	}
	return f(a)
}

Example:

node, _ := createAST("andor(pk(key_remote),or_i(and_v(v:pkh(key_local),hash160(H)),older(1008)),pk(key_revocation))")
node.apply(func(node *AST) (*AST, error) {
		fmt.Println("Visiting node:", node.identifier)
		return node, nil
	})

Output:

Visiting node: pk
Visiting node: pkh
Visiting node: hash160
Visiting node: and_v
Visiting node: older
Visiting node: or_i
Visiting node: pk
Visiting node: andor

We now add a Parse function which creates the AST and then applies a number of transformations to it, the first of which is the fragment and argument checker:

func Parse(miniscript string) (*AST, error) {
	node, err := createAST(miniscript)
	if err != nil {
		return nil, err
	}
	for _, transform := range []func(*AST) (*AST, error){
		argCheck,
		// More stages to come
	} {
		node, err = node.apply(transform)
		if err != nil {
			return nil, err
		}
	}
	return node, nil
}

The argCheck function is applied to every node of the tree, and we can simply enumerate all valid fragment identifiers to perform these basic checks.

// argCheck checks that each identifier is a known Miniscript identifier and that it has the correct
// number of arguments, e.g. `andor(X,Y,Z)` must have three arguments, etc.
func argCheck(node *AST) (*AST, error) {
	// Helper function to check that this node has a specific number of arguments.
	expectArgs := func(num int) error {
		if len(node.args) != num {
			return fmt.Errorf("%s expects %d arguments, got %d", node.identifier, num, len(node.args))
		}
		return nil
	}
	switch node.identifier {
	case f_0, f_1:
		if err := expectArgs(0); err != nil {
			return nil, err
		}
	case f_pk_k, f_pk_h, f_pk, f_pkh, f_sha256, f_ripemd160, f_hash256, f_hash160:
		if err := expectArgs(1); err != nil {
			return nil, err
		}
		if len(node.args[0].args) > 0 {
			return nil, fmt.Errorf("argument of %s must not contain subexpressions", node.identifier)
		}
	case f_older, f_after:
		if err := expectArgs(1); err != nil {
			return nil, err
		}
		_n := node.args[0]
		if len(_n.args) > 0 {
			return nil, fmt.Errorf("argument of %s must not contain subexpressions", node.identifier)
		}
		n, err := strconv.ParseUint(_n.identifier, 10, 64)
		if err != nil {
			return nil, fmt.Errorf(
				"%s(k) => k must be an unsigned integer, but got: %s", node.identifier, _n.identifier)
		}
		_n.num = n
		if n < 1 || n >= (1<<31) {
			return nil, fmt.Errorf("%s(n) -> n must 1 ≤ n < 2^31, but got: %s", node.identifier, _n.identifier)
		}
	case f_andor:
		if err := expectArgs(3); err != nil {
			return nil, err
		}
	case f_and_v, f_and_b, f_and_n, f_or_b, f_or_c, f_or_d, f_or_i:
		if err := expectArgs(2); err != nil {
			return nil, err
		}
	case f_thresh, f_multi:
		if len(node.args) < 2 {
			return nil, fmt.Errorf("%s must have at least two arguments", node.identifier)
		}
		_k := node.args[0]
		if len(_k.args) > 0 {
			return nil, fmt.Errorf("argument of %s must not contain subexpressions", node.identifier)
		}
		k, err := strconv.ParseUint(_k.identifier, 10, 64)
		if err != nil {
			return nil, fmt.Errorf(
				"%s(k, ...) => k must be an integer, but got: %s", node.identifier, _k.identifier)
		}
		_k.num = k
		numSubs := len(node.args) - 1
		if k < 1 || k > uint64(numSubs) {
			return nil, fmt.Errorf(
				"%s(k) -> k must 1 ≤ k ≤ n, but got: %s", node.identifier, _k.identifier)
		}
		if node.identifier == f_multi {
			// Maximum number of keys in a multisig.
			const multisigMaxKeys = 20
			if numSubs > multisigMaxKeys {
				return nil, fmt.Errorf("number of multisig keys cannot exceed %d", multisigMaxKeys)
			}
			// Multisig keys are variables, they can't have subexpressions.
			for _, arg := range node.args {
				if len(arg.args) > 0 {
					return nil, fmt.Errorf("arguments of %s must not contain subexpressions", node.identifier)
				}
			}
		}
	default:
		return nil, fmt.Errorf("unrecognized identifier: %s", node.identifier)
	}
	return node, nil
}

With these checks, we already rule out lots of invalid Miniscripts. Let's see some examples:

func main() {
	for _, expr := range []string{
		"invalid",
		"pk(key1,tooManyArgs)",
		"pk(key1(0))",
		"and_v(0)",
		"after(notANumber)",
		"after(-1)",
		"multi(0,k1)",
		"multi(2,k1)",
		"multi(1,k1,k2,k3,k4,k5,k6,k7,k8,k9,k10,k11,k12,k13,k14,k15,k16,k17,k18,k19,k20,k21)",
	} {
		_, err := Parse(expr)
		fmt.Println(expr, " -- ", err)
	}
}

Output:

invalid  --  unrecognized identifier: invalid
pk(key1,tooManyArgs)  --  pk expects 1 arguments, got 2
pk(key1(0))  --  argument of pk must not contain subexpressions
and_v(0)  --  and_v expects 2 arguments, got 1
after(notANumber)  --  after(k) => k must be an unsigned integer, but got: notANumber
after(-1)  --  after(k) => k must be an unsigned integer, but got: -1
multi(0,k1)  --  multi(k) -> k must 1 ≤ k ≤ n, but got: 0
multi(2,k1)  --  multi(k) -> k must 1 ≤ k ≤ n, but got: 2
multi(1,k1,k2,k3,k4,k5,k6,k7,k8,k9,k10,k11,k12,k13,k14,k15,k16,k17,k18,k19,k20,k21)  --  number of multisig keys cannot exceed 20

Click here to run the code in the Go playground.


Step three - expand wrappers

Click here to run the code in the Go playground.

Every fragment can be wrapped with wrappers, denoted by letters before the colon ":". As an example taken from https://bitcoin.sipa.be/miniscript/:

dv:older(144) is the d: wrapper applied to the v: wrapper applied to the older fragment for 144 blocks.

In further stages of the parser we want to operate on the wrappers as well, as they behave the same way as normal fragments - they have a mapping to Bitcoin Script, have correctness rules, etc. In a way, dv:older(144) is just syntactic sugar for d(v(older(144))).

In this example, we want to transform our AST from this:

dv:older
└──144

to this:

d
└──v
   └──older
          └──144

To perform this transformation, we add this function to the list of transformations. Note that we iterate the letters in the wrappers in reverse order, as they are applied from right to left.

// expandWrappers applies wrappers (the characters before a colon), e.g. `ascd:X` =>
// `a(s(c(d(X))))`.
func expandWrappers(node *AST) (*AST, error) {
	const allWrappers = "asctdvjnlu"

	wrappers := []rune(node.wrappers)
	node.wrappers = ""
	for i := len(wrappers) - 1; i >= 0; i-- {
		wrapper := wrappers[i]
		if !strings.ContainsRune(allWrappers, wrapper) {
			return nil, fmt.Errorf("unknown wrapper: %s", string(wrapper))
		}
		node = &AST{identifier: string(wrapper), args: []*AST{node}}
	}
	return node, nil
}

Click here to run the code in the Go playground.


Step four - desugar

Click here to run the code in the Go playground.

Miniscript defines six instances of syntactic sugar. If a miniscript contains an expression on the left side of the below equations, they can be replaced with the right side of the equation. To save the effort of handling these six fragments in all later stages, we add a desugaring transormation which replaces these upfront.

The replacements are:

pk(key) = c:pk_k(key)
pkh(key) = c:pk_h(key)
and_n(X,Y) = andor(X,Y,0)
t:X = and_v(X,1)
l:X = or_i(0,X)
u:X = or_i(X,0)

Now is a good time to extend the table with the fragment identifiers we defined at the beginnig with the wrapper fragments:

const (
	// [...]
	f_wrap_a    = "a"         // a:X
	f_wrap_s    = "s"         // s:X
	f_wrap_c    = "c"         // c:X
	f_wrap_d    = "d"         // d:X
	f_wrap_v    = "v"         // v:X
	f_wrap_j    = "j"         // j:X
	f_wrap_n    = "n"         // n:X
	f_wrap_t    = "t"         // t:X = and_v(X,1)
	f_wrap_l    = "l"         // l:X = or_i(0,X)
	f_wrap_u    = "u"         // u:X = or_i(X,0))
)

The transformation function looks like this:

// desugar replaces syntactic sugar with the final form.
func desugar(node *AST) (*AST, error) {
	switch node.identifier {
	case f_pk: // pk(key) = c:pk_k(key)
		return &AST{
			identifier: f_wrap_c,
			args: []*AST{
				{
					identifier: f_pk_k,
					args:       node.args,
				},
			},
		}, nil
	case f_pkh: // pkh(key) = c:pk_h(key)
		return &AST{
			identifier: f_wrap_c,
			args: []*AST{
				{
					identifier: f_pk_h,
					args:       node.args,
				},
			},
		}, nil
	case f_and_n: // and_n(X,Y) = andor(X,Y,0)
		return &AST{
			identifier: f_andor,
			args: []*AST{
				node.args[0],
				node.args[1],
				{identifier: f_0},
			},
		}, nil
	case f_wrap_t: // t:X = and_v(X,1)
		return &AST{
			identifier: f_and_v,
			args: []*AST{
				node.args[0],
				{identifier: f_1},
			},
		}, nil
	case f_wrap_l: // l:X = or_i(0,X)
		return &AST{
			identifier: f_or_i,
			args: []*AST{
				{identifier: f_0},
				node.args[0],
			},
		}, nil
	case f_wrap_u: // u:X = or_i(X,0)
		return &AST{
			identifier: f_or_i,
			args: []*AST{
				node.args[0],
				{identifier: f_0},
			},
		}, nil
	}

	return node, nil
}

Let's try all of them and inspect visually:

func main() {
	for _, expr := range []string{
		"pk(key)",
		"pkh(key)",
		"and_n(pk(key),sha256(H))",
		"tv:pk(key)",
		"l:pk(key)",
		"u:pk(key)",
	} {
		node, err := Parse(expr)
		if err != nil {
			panic(err)
		}
		fmt.Printf("Tree for \"%v\"\n", expr)
		fmt.Println(node.DrawTree())
	}
}

As we can confirm in the following output, the desugaring has worked:

Tree for "pk(key)"
c
└──pk_k
      └──key

Tree for "pkh(key)"
c
└──pk_h
      └──key

Tree for "and_n(pk(key),sha256(H))"
andor
├──c
|  └──pk_k
|        └──key
├──sha256
|       └──H
└──0

Tree for "tv:pk(key)"
and_v
├──v
|  └──c
|     └──pk_k
|           └──key
└──1

Tree for "l:pk(key)"
or_i
├──0
└──c
   └──pk_k
         └──key

Tree for "u:pk(key)"
or_i
├──c
|  └──pk_k
|        └──key
└──0

Click here to run the code in the Go playground.


Step five - typecheck

Click here to run the code in the Go playground.

Not all fragments compose with each other to produce a valid Bitcoin Script and valid witnesses.

However, since Miniscript expressions and fragments are well structured and hierarchical, it is easy to statically verify if a Miniscript expression is valid under all circumstances.

As an example, or_b(pk(key1),pk(key2)) and or_b(v:pk(key1),v:pk(key2)) are not valid compositions, but or_b(pk(key1),s:pk(key2)) is.

According to the miniscript spec, each fragment can be one of four different basic types B, V, K or W, and each fragment can can have additional type properties (z, o, n, d and u).

Primitive fragments (fragments that don't have any subexpressions) have a fixed basic type and fixed type properties. For example, the hashing fragment sha256(h), which translates to the Bitcoin Script SIZE <32> EQUALVERIFY SHA256 <h> EQUAL and is satisfied by the witness <32 byte preimage> (the value for which sha256(preimage)=h), has the type Bondu, which means:

  • B: Pushes non-zero upon success and an exact 0 upon failure. Consumes elements from the top of the stack (if any).
  • o: consumes exactly one stack element (the preimage in this example)
  • n: nonzero - the satisfaction cannot be zero. The correct preimage for the sha256(h) fragment must be 32 bytes long, so it cannot be zero.
  • d: Can be dissatisfied unconditionally. In this case, an arbitrary 32 bytes except the actual preimage is invalid, which can be always be constructed. Note that a non-32 bytes value is not a valid dissatisfaction, as it aborts the script execution at EQUALVERIFY, instead of continuing execution.
  • u: Puts 1 one the stack when satisfied.

The basic types and type properties were defined carefully to be able to ensure correctness of composed fragments. They can be assigned to every fragment based on reasoning about the script and witnesses it encapsulates. Similarly, for fragments that have subexpressions like and_b(X,Y), one can reason about what types and properties X and Y must have and what derived type and properties and_b(X,Y) itself must have. Luckily, the authors of Miniscript have done this work and documented it in the correctness table in the spec.

The top-level fragment must of type B, otherwise the miniscript is invalid.

Let's extend our AST type with the basic type and type properties:

type basicType string

const (
	typeB basicType = "B"
	typeV basicType = "V"
	typeK basicType = "K"
	typeW basicType = "W"
)

type properties struct {
	// Basic type properties
	z, o, n, d, u bool
}

func (p properties) String() string {
	s := strings.Builder{}
	if p.z {
		s.WriteRune('z')
	}
	if p.o {
		s.WriteRune('o')
	}
	if p.n {
		s.WriteRune('n')
	}
	if p.d {
		s.WriteRune('d')
	}
	if p.u {
		s.WriteRune('u')
	}
	return s.String()
}

// AST is the abstract syntax tree representing a Miniscript expression.
type AST struct {
	basicType  basicType
	props      properties
	wrappers   string
	identifier string
	// Parsed integer for when identifer is a expected to be a number, i.e. the first argument of
	// older/after/multi/thresh. Otherwise unused.
	num uint64
	args      []*AST
}

// typeRepr returns the basic type (B, V, K or W) followed by all type properties.
func (a *AST) typeRepr() string {
	return fmt.Sprintf("%s%s", a.basicType, a.props)
}

Then we add another function that walks the tree. This one checks the type requirements of subexpressions and sets the type and type properties according to the correctness table of the specification.

Since the function is quite long, we show an abbreviated version that handles only some fragments, but illustrates how it works. Fragment types are either primitive or depend on their arguments. It simply encodes the type rules according to the table in the specification. For example, for s:X, for it to be valid, X must be type Bo, will have type W and gets the d and u properties of X.

You can see and run the full version that handles each fragment in the playground.

// expectBasicType is a helper function to check that this node has a specific type.
func (a *AST) expectBasicType(typ basicType) error {
	if a.basicType != typ {
		return fmt.Errorf("expression `%s` expected to have type %s, but is type %s",
			a.identifier, typ, a.basicType)
	}
	return nil
}

func typeCheck(node *AST) (*AST, error) {
	switch node.identifier {
	case f_0:
		node.basicType = typeB
		node.props.z = true
		node.props.u = true
		node.props.d = true
	// [...]
	case f_pk_k:
		node.basicType = typeK
		node.props.o = true
		node.props.n = true
		node.props.d = true
		node.props.u = true
	// [...]
	case f_or_d:
		_x, _z := node.args[0], node.args[1]
		if err := _x.expectBasicType(typeB); err != nil {
			return nil, err
		}
		if !_x.props.d || !_x.props.u {
			return nil, fmt.Errorf(
				"wrong properties on `%s`, the first argument of `%s`", _x.identifier, node.identifier)
		}
		if err := _z.expectBasicType(typeB); err != nil {
			return nil, err
		}
		node.basicType = typeB
		node.props.z = _x.props.z && _z.props.z
		node.props.o = _x.props.o && _z.props.z
		node.props.d = _z.props.d
		node.props.u = _z.props.u
	// [...]
	case f_wrap_s:
		_x := node.args[0]
		if err := _x.expectBasicType(typeB); err != nil {
			return nil, err
		}
		if !_x.props.o {
			return nil, fmt.Errorf(
				"wrong properties on `%s`, the first argument of `%s`", _x.identifier, node.identifier)
		}
		node.props.d = _x.props.d
		node.props.u = _x.props.u
	// [...]
	}
	return node, nil
}

Now that we have derived all the types and type properties, we also need to add the final type check that the top-level expression must be of type B:

func Parse(miniscript string) (*AST, error) {
	node, err := createAST(miniscript)
	if err != nil {
		return nil, err
	}
	for _, transform := range []func(*AST) (*AST, error){
		argCheck,
		expandWrappers,
		desugar,
		typeCheck,
		// More stages to come
	} {
		node, err = node.apply(transform)
		if err != nil {
			return nil, err
		}
	}
	// Top-level expression must be of type "B".
	if err := node.expectBasicType(typeB); err != nil {
		return nil, err
	}
	return node, nil
}

Let's give it a try with valid and invalid miniscripts:

func main() {
	expr := "or_b(pk(key1),s:pk(key2))"
	node, err := Parse(expr)
	if err == nil {
		fmt.Println("miniscript valid:", expr)
		fmt.Println(node.DrawTree())
	}
	for _, expr := range []string{"pk_k(key)", "or_b(pk(key1),pk(key2))"} {
		_, err = Parse(expr)
		fmt.Println("miniscript invalid:", expr, "-", err)
	}
}

Success! The output is:

miniscript valid: or_b(pk(key1),s:pk(key2))
or_b [Bdu]
├──c [Bondu]
|  └──pk_k [Kondu]
|        └──key1
└──s [Wdu]
   └──c [Bondu]
      └──pk_k [Kondu]
            └──key2

miniscript invalid: pk_k(key) - expression `pk_k` expected to have type B, but is type K
miniscript invalid: or_b(pk(key1),pk(key2)) - expression `c` expected to have type W, but is type B

(we modified the drawTree() function to also render the types next to each fragment).

Click here to run the code in the Go playground.


Step six - produce Bitcoin script

We haven't implemented all checks yet to reject invalid miniscripts, but now is a good time anyway to experiment with generating the Bitcoin script.

The spec translation table defines how Miniscript fragments map to Bitcoin Script. For example, and_b(X,Y) maps to [X] [Y] BOOLAND, etc.

We will first make a function that creates a human readable string representation of the script, just like in translation table. This makes it easy to prototype and and debug, as you can easily check the output. The actual script will be a sequence of bytes, which we will do later.

We add this function which maps each fragment to its Script representation according to the translation table:

func scriptStr(node *AST) string {
	switch node.identifier {
	case f_0, f_1:
		return node.identifier
	case f_pk_k:
		return fmt.Sprintf("<%s>", node.args[0].identifier)
	case f_pk_h:
		return fmt.Sprintf("DUP HASH160 <HASH160(%s)> EQUALVERIFY", node.args[0].identifier)
	case f_older:
		return fmt.Sprintf("<%s> CHECKSEQUENCEVERIFY", node.args[0].identifier)
	case f_after:
		return fmt.Sprintf("<%s> CHECKLOCKTIMEVERIFY", node.args[0].identifier)
	case f_sha256, f_hash256, f_ripemd160, f_hash160:
		return fmt.Sprintf(
			"SIZE <32> EQUALVERIFY %s <%s> EQUAL",
			strings.ToUpper(node.identifier),
			node.args[0].identifier)
	case f_andor:
		return fmt.Sprintf("%s NOTIF %s ELSE %s ENDIF",
			scriptStr(node.args[0]),
			scriptStr(node.args[2]),
			scriptStr(node.args[1]),
		)
	case f_and_v:
		return fmt.Sprintf("%s %s",
			scriptStr(node.args[0]),
			scriptStr(node.args[1]))
	case f_and_b:
		return fmt.Sprintf("%s %s BOOLAND",
			scriptStr(node.args[0]),
			scriptStr(node.args[1]),
		)
	case f_or_b:
		return fmt.Sprintf("%s %s BOOLOR",
			scriptStr(node.args[0]),
			scriptStr(node.args[1]),
		)
	case f_or_c:
		return fmt.Sprintf("%s NOTIF %s ENDIF",
			scriptStr(node.args[0]),
			scriptStr(node.args[1]),
		)
	case f_or_d:
		return fmt.Sprintf("%s IFDUP NOTIF %s ENDIF",
			scriptStr(node.args[0]),
			scriptStr(node.args[1]),
		)
	case f_or_i:
		return fmt.Sprintf("IF %s ELSE %s ENDIF",
			scriptStr(node.args[0]),
			scriptStr(node.args[1]),
		)
	case f_thresh:
		s := []string{}
		for i := 1; i < len(node.args); i++ {
			s = append(s, scriptStr(node.args[i]))
			if i > 1 {
				s = append(s, "ADD")
			}
		}

		s = append(s, node.args[0].identifier)
		s = append(s, "EQUAL")
		return strings.Join(s, " ")
	case f_multi:
		s := []string{node.args[0].identifier}
		for _, arg := range node.args[1:] {
			s = append(s, fmt.Sprintf("<%s>", arg.identifier))
		}
		s = append(s, fmt.Sprint(len(node.args)-1))
		s = append(s, "CHECKMULTISIG")
		return strings.Join(s, " ")
	case f_wrap_a:
		return fmt.Sprintf("TOALTSTACK %s FROMALTSTACK", scriptStr(node.args[0]))
	case f_wrap_s:
		return fmt.Sprintf("SWAP %s", scriptStr(node.args[0]))
	case f_wrap_c:
		return fmt.Sprintf("%s CHECKSIG",
			scriptStr(node.args[0]))
	case f_wrap_d:
		return fmt.Sprintf("DUP IF %s ENDIF",
			scriptStr(node.args[0]))
	case f_wrap_v:
		return fmt.Sprintf("%s VERIFY", scriptStr(node.args[0]))
	case f_wrap_j:
		return fmt.Sprintf("SIZE 0NOTEQUAL IF %s ENDIF",
			scriptStr(node.args[0]))
	case f_wrap_n:
		return fmt.Sprintf("%s 0NOTEQUAL",
			scriptStr(node.args[0]))
	default:
		return "<unknown>"
	}
}

Let's try:

Run in in the playground.

func main() {
	node, err := Parse("or_d(pk(pubkey1),and_v(v:pk(pubkey2),older(52560)))")
	if err != nil {
		panic(err)
	}
	fmt.Println(scriptStr(node))
}

Output:

<pubkey1> CHECKSIG IFDUP NOTIF <pubkey2> CHECKSIG VERIFY <52560> CHECKSEQUENCEVERIFY ENDIF

This is correct, but there is an optimization to be had. The v:X wrapper maps to [X] VERIFY. The opcodes EQUALVERIFY, CHECKSIGVERIFY and CHECKMULTISIGVERIFY are shortcuts for EQUAL VERIFY, CHECKSIG VERIFY and CHECKMULTISIG VERIFY, so in the above output, CHECKSIG VERIFY can be abbreviated to CHECKSIGVERIFY, saving one byte in the script.

If in v:X, the last OP-code in [X] is EQUAL/CHECKSIG/CHECKMULTISIG, it can be replaced by the VERIFY-version.

Since X can be an arbitrary expression, we need another tree-walk to decide for each fragment if its last OP-code is one of these three.

Let's add this property to the properties struct:

type properties struct {
	// Basic type properties
	z, o, n, d, u bool

	// Check if the rightmost script byte produced by this node is OP_EQUAL, OP_CHECKSIG or
	// OP_CHECKMULTISIG.
	//
	// If so, it can be be converted into the VERIFY version if an ancestor is the verify wrapper
	// `v`, i.e. OP_EQUALVERIFY, OP_CHECKSIGVERIFY and OP_CHECKMULTISIGVERIFY instead of using two
	// opcodes, e.g. `OP_EQUAL OP_VERIFY`.
	canCollapseVerify bool
}

And the function that sets this field for each fragment, which we add to our list of transformations:

func canCollapseVerify(node *AST) (*AST, error) {
	switch node.identifier {
	case f_sha256, f_ripemd160, f_hash256, f_hash160, f_thresh, f_multi, f_wrap_c:
		node.props.canCollapseVerify = true
	case f_and_v:
		node.props.canCollapseVerify = node.args[1].props.canCollapseVerify
	case f_wrap_s:
		node.props.canCollapseVerify = node.args[0].props.canCollapseVerify
	}
	return node, nil
}

The and_v fragment and the s-wrapper are the only composite fragments which end with a subexpression: and_v(X,Y) => [X] [Y] and s:X => SWAP [X], so these simply adopt the property from that child node. The Scripts of the hashing fragments and thresh/multi/c actually end in EQUAL/CHECKSIG/CHECKMULTISIG, e.g. c:X => [X] CHECKSIG. These are candidates to be collapsed into the VERIFY-version of the opcode.

Then we can modify our scriptStr function to use the VERIFY-version of the opcode where applicable. For brevity, we only show two cases below. You see and run the full version in this playground.

// collapseVerify is true if the `v` wrapper (VERIFY wrapper) is an ancestor of the node. If so, the
// two opcodes `OP_CHECKSIG VERIFY` can be collapsed into one opcode `OP_CHECKSIGVERIFY` (same for
// OP_EQUAL and OP_CHECKMULTISIG).
func scriptStr(node *AST, collapseVerify bool) string {
	switch node.identifier {
 	// [...]
	case f_wrap_c:
		opVerify := "CHECKSIG"
		if node.props.canCollapseVerify && collapseVerify {
			opVerify = "CHECKSIGVERIFY"
		}
		return fmt.Sprintf("%s %s",
			scriptStr(node.args[0], collapseVerify),
			opVerify,
		)
 	// [...]
	case f_wrap_v:
		s := scriptStr(node.args[0], true)
		if !node.args[0].props.canCollapseVerify {
			s += " VERIFY"
		}
		return s

}

Running the program with the modified function now outputs:

<pubkey1> CHECKSIG IFDUP NOTIF <pubkey2> CHECKSIGVERIFY <52560> CHECKSEQUENCEVERIFY ENDIF

which successfully collapsed CHECKSIG VERIFY to CHECKSIGVERIFY.


Step seven - generate receive address

Click here to run the code in the Go playground.

In the previous section, we created a human-readable representation for the Bitcoin script. To generate a P2WSH-address, we need to build the actual script as a byte-sequence, and then encode it into an address.

In order to do this, we first need to replace all key and hash variables with actual pubkeys and hash values. We add a new field value to our AST:

type AST struct {
	// [...]
	identifier string
	// For key arguments, this will be the 33 bytes compressed pubkey.
	// For hash arguments, this will be the 32 bytes (sha256, hash256) or 20 bytes (ripemd160, hash160) hash.
	value []byte
	args  []*AST
}

Now we can add a new function ApplyVars which let's us replace all variables in the miniscript with actual values. The caller can provide a callback function to provide the values.

Miniscript also specifies that public keys must not be repeated (this simplifies reasoning about the scripts), so we check for duplicates.

// ApplyVars replaces key and hash values in the miniscript.
//
// The callback should return `nil, nil` if the variable is unknown. In this case, the identifier
// itself will be parsed as the value (hex-encoded pubkey, hex-encoded hash value).
func (a *AST) ApplyVars(lookupVar func(identifier string) ([]byte, error)) error {
	// Set of all pubkeys to check for duplicates
	allPubKeys := map[string]struct{}{}

	_, err := a.apply(func(node *AST) (*AST, error) {
		switch node.identifier {
		case f_pk_k, f_pk_h, f_multi:
			var keyArgs []*AST
			if node.identifier == f_multi {
				keyArgs = node.args[1:]
			} else {
				keyArgs = node.args[:1]
			}
			for _, arg := range keyArgs {
				key, err := lookupVar(arg.identifier)
				if err != nil {
					return nil, err
				}
				if key == nil {
					// If the key was not a variable, assume it's the key value directly encoded as
					// hex.
					key, err = hex.DecodeString(arg.identifier)
					if err != nil {
						return nil, err
					}
				}
				if len(key) != pubKeyLen {
					return nil, fmt.Errorf("pubkey argument of %s expected to be of size %d, but got %d",
						node.identifier, pubKeyLen, len(key))
				}

				pubKeyHex := hex.EncodeToString(key)
				if _, ok := allPubKeys[pubKeyHex]; ok {
					return nil, fmt.Errorf(
						"duplicate key found at %s (key=%s, arg identifier=%s)",
						node.identifier, pubKeyHex, arg.identifier)
				}
				allPubKeys[pubKeyHex] = struct{}{}

				arg.value = key
			}
		case f_sha256, f_hash256, f_ripemd160, f_hash160:
			arg := node.args[0]
			hashLen := map[string]int{
				f_sha256:    32,
				f_hash256:   32,
				f_ripemd160: 20,
				f_hash160:   20,
			}[node.identifier]
			hashValue, err := lookupVar(arg.identifier)
			if err != nil {
				return nil, err
			}
			if hashValue == nil {
				// If the hash value was not a variable, assume it's the hash value directly encoded
				// as hex.
				hashValue, err = hex.DecodeString(node.args[0].identifier)
				if err != nil {
					return nil, err
				}
			}
			if len(hashValue) != hashLen {
				return nil, fmt.Errorf("%s len must be %d, got %d", node.identifier, hashLen, len(hashValue))
			}
			arg.value = hashValue
		}
		return node, nil
	})
	return err
}

Let's see it in action:

func main() {
	node, err := Parse("or_d(pk(pubkey1),and_v(v:pk(pubkey2),older(52560)))")
	if err != nil {
		panic(err)
	}
	unhex := func(s string) []byte {
		b, _ := hex.DecodeString(s)
		return b
	}

	// Two arbitrary pubkeys.
	_, pubKey1 := btcec.PrivKeyFromBytes(
		unhex("2c3931f593f26037a8b8bf837363831b18bbfb91a712dd9d862db5b9b06dc5df"))
	_, pubKey2 := btcec.PrivKeyFromBytes(
		unhex("f902f94da618721e516d0a2a2666e2ec37079aaa184ee5a2c00c835c5121b3eb"))

	err = node.ApplyVars(func(identifier string) ([]byte, error) {
		switch identifier {
		case "pubkey1":
			return pubKey1.SerializeCompressed(), nil
		case "pubkey2":
			return pubKey2.SerializeCompressed(), nil
		}
		return nil, nil
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(node.DrawTree())
}

Output, with the pubkeys successfully replaced:

or_d [B]
├──c [Bonduv]
|  └──pk_k [Kondu]
|        └──pubkey1 [03469d685c3445e83ee6e3cfb30382795c249c91955523c25f484d69379c7a7d6f]
└──and_v [Bon]
       ├──v [Von]
       |  └──c [Bonduv]
       |     └──pk_k [Kondu]
       |           └──pubkey2 [03ba991cc359438fdd8cf43e3cf7894f90cf4d0e040314a6bba82963fa77b7a434]
       └──older [Bz]
              └──52560

(we modified the drawTree() function to render the node value next to each variable).

With the help of the excellent btcd library, we will now build actual script. It looks much like the version scriptStr() above, but encodes it as a byte-string, taking care of the nitty-gritty about how to encode integers and data-pushes on the stack. We show an abbreviated version here with only a few cases. See and run the full version in the playground.

// Script creates the witness script from a parsed miniscript.
func (a *AST) Script() ([]byte, error) {
	b := txscript.NewScriptBuilder()
	if err := buildScript(a, b, false); err != nil {
		return nil, err
	}
	return b.Script()
}

// collapseVerify is true if the `v` wrapper (VERIFY wrapper) is an ancestor of the node. If so, the
// two opcodes `OP_CHECKSIG VERIFY` can be collapsed into one opcode `OP_CHECKSIGVERIFY` (same for
// OP_EQUAL and OP_CHECKMULTISIGVERIFY).
func buildScript(node *AST, b *txscript.ScriptBuilder, collapseVerify bool) error {
	switch node.identifier {
	case f_0:
		b.AddOp(txscript.OP_FALSE)
	case f_1:
		b.AddOp(txscript.OP_TRUE)
	case f_pk_h:
		arg := node.args[0]
		key := arg.value
		if key == nil {
			return fmt.Errorf("empty key for %s (%s)", node.identifier, arg.identifier)
		}
		b.AddOp(txscript.OP_DUP)
		b.AddOp(txscript.OP_HASH160)
		b.AddData(btcutil.Hash160(key))
		b.AddOp(txscript.OP_EQUALVERIFY)
	case f_older:
		b.AddInt64(int64(node.args[0].num))
		b.AddOp(txscript.OP_CHECKSEQUENCEVERIFY)
	case f_after:
		b.AddInt64(int64(node.args[0].num))
		b.AddOp(txscript.OP_CHECKLOCKTIMEVERIFY)
	case f_and_b:
		if err := buildScript(node.args[0], b, collapseVerify); err != nil {
			return err
		}
		if err := buildScript(node.args[1], b, collapseVerify); err != nil {
			return err
		}
		b.AddOp(txscript.OP_BOOLAND)
	case f_wrap_c:
		if err := buildScript(node.args[0], b, collapseVerify); err != nil {
			return err
		}
		if node.props.canCollapseVerify && collapseVerify {
			b.AddOp(txscript.OP_CHECKSIGVERIFY)
		} else {
			b.AddOp(txscript.OP_CHECKSIG)
		}
	case f_wrap_v:
		if err := buildScript(node.args[0], b, true); err != nil {
			return err
		}
		if !node.args[0].props.canCollapseVerify {
			b.AddOp(txscript.OP_VERIFY)
		}
	// More cases [...]
	default:
		return fmt.Errorf("unknown identifier: %s", node.identifier)
	}
	return nil
}

Let's run it:

func main() {
	// [...]
	script, err := node.Script()
	if err != nil {
		panic(err)
	}
	fmt.Println("Script", hex.EncodeToString(script))
}

Output:

Script 2103469d685c3445e83ee6e3cfb30382795c249c91955523c25f484d69379c7a7d6fac73642103ba991cc359438fdd8cf43e3cf7894f90cf4d0e040314a6bba82963fa77b7a434ad0350cd00b268

According to BIP141 and BIP173, a P2WSH address is the bech32-encoding of 0 <sha256(script)>, where 0 is the segwit version 0. We will use the btcd library to help us create it:

addr, err := btcutil.NewAddressWitnessScriptHash(chainhash.HashB(script), &chaincfg.TestNet3Params)
if err != nil {
   	panic(err)
}
fmt.Println("Address:", addr.String())

Our testnet receive address is ready:

Address: tb1q4q3cw0mausmamm7n7fn2phh0fpca4n0vmkc7rdh6hxnkz9rd8l0qcpefrj

Coins received on this address are locked with the spending condition or_d(pk(pubkey1),and_v(v:pk(pubkey2),older(52560))), i.e. pubkey1 can spend at any time, or pubkey2 can spend when the coin received on the address is 52560 blocks old (roughly 1 year).

Click here to run the code in the Go playground.


Conclusion

In conclusion, we have created a Miniscript library that can parse and typecheck a Miniscript expression and generate a receive address from it.

There is still a lot of ground to cover. In another instalment, we could study how to generate witnesses that can spend coins sent to a miniscript, and how to ensure that miniscripts conform to Bitcoin consensus and standardness limits, such as script size and OP-code limits.

If you'd like to see this series continue, please let me know on Twitter @_benma_.


Don’t own a BitBox yet?

Keeping your crypto secure doesn't have to be hard. The BitBox02 hardware wallet stores the private keys for your cryptocurrencies offline. So you can manage your coins safely.

The BitBox02 also comes in Bitcoin-only version, featuring a radically focused firmware: less code means less attack surface, which further improves your security when only storing Bitcoin.

Grab one in our shop!


Shift Crypto is a privately-held company based in Zurich, Switzerland. Our team of Bitcoin contributors, crypto experts, and security engineers builds products that enable customers to enjoy a stress-free journey from novice to mastery level of cryptocurrency management. The BitBox02, our second generation hardware wallet, lets users store, protect, and transact Bitcoin and other cryptocurrencies with ease - along with its software companion, the BitBoxApp.