Skip to main content

Assertion language

Sorts#

Sort ::= ...

Sorts are the types of Michelson. (Some of the types are not supported yet. See Instruction support.) So, we often say type if no confusion. One subtle difference is that sorts have no attribute like comparable, pushable, etc., which Michelson types have. So you could use, for instance, set (list int) in an annotation while it is invalid in Michelson since set requires a comparable type but (list int) is not. Anyway, such a difference makes no problem because our interest is the behavior of Michelson code and no need to use such sorts.

Variables#

VAR ::= [a-z][a-z A-Z 0-9 _ ']*

Variables are alphanumerical identifiers starting with a small letter. Builtin functions' names and the following identifiers are reserved and cannot be used as variables.

if then else match with

Literals#

LITERAL ::=  | NUMBER  | STRING  | BYTES

Literals denote base values. Those basically correspond to the ones of Michelson.

Numerals#

NUMBER ::=  | ("0x"|"0X"|"0o"|"0O"|"0b"|"0B")? ([0-9]"_"*)+

Numeral literal is a sequence of digits and underscore optionally prefixed by a radix indicator, where at least one digit exists and underscore is just ignored when interpreted. By default, Literals are interpreted in decimal. Otherwise, prefixed by "0x" or "0X", "0o" or "0O", and "0b" or "0B", literals are interpreted in hexadecimal, octal, and binary, respectively.

After being interpreted as natural numbers, literals are treated as one of the int, nat, timestamp, and mutez sorts depending on context. When a literal is treated as timestamp, it is a Epoch time.

note

Negative numerals, int and timestamp, are obtained by the unary operator "-".

caution

Currently, numeral literals must be smaller than 2622^{62} (in 64-bit system). If you need a larger one, compose it by using arithmetic operators.

Strings#

String literal is a sequence of space and printable ASCII characters enclosed by the double quote ". The following table shows escape sequences.

SequenceCharacter
\\backslash
\"double quote
\nnewline
\thorizontal tab
\bbackspace
\rcarriage return
\[0-9][0-9][0-9]the character of ASCII code in decimal
\o[0-7][0-7][0-7]the character of ASCII code in octal
\x[0-9 a-f A-F][0-9 a-f A-F]the character of ASCII code in hexadecimal

After the interpretation, string is treated as one of the string, address, key, chain_id, signature, and timestamp sorts depending on context and the format of the string. The format of each data type follows the one of Michelson. For instance, string of timestamp is RFC3339 notation.

Bytes#

Bytes literal is a sequence of the hex characters [0-9 a-f A-F]. A literal is treated as one of the bytes, key, address, signature, and chain_id sorts depending on context. For each sort, valid byte sequences in Michelson are the valid literals.

Functions#

Function ::= /* See the table below. */

Functions are the built-in ones and user-defined measures. The following is a list of the built-in functions and those descriptions. The functions marked as UF, most of which correspond to domain-specific instructions of Michelson, are underspecified. That means, for example, Helmholtz does not know the actual value of pack 0 but knows unpack_opt (pack 0) = Some 0, etc. We give reasonable assumptions for the UF functions, but the choice is indeed a bit arbitrary. So it may happen that Helmholtz cannot verify a correct program. Even so, Helmholtz can still reason that, for example, the stack top after the PACK instruction is pack x where x is bound for the stack top before the instruction, which is enough information to verify many specifications.

UFNameTypeDescription
notbool -> boolnot b negates the Boolean value b.
get_str_optstring -> nat -> option stringget_str_opt s i returns i-th character of s as the one-length string. If indexing is invalid, None is returned.
sub_str_optstring -> nat -> nat -> option stringsub_str_opt s i l returns the substring of s from i-th with the length of l. If indexing is invalid, None is returned.
len_strstring -> natlen_str s returns the length of the string s.
concat_strstring -> string -> stringconcat_str s1 s2 concatenates the two strings s1 and s2. This is equivalent to s1 ^ s2.
get_bytes_optbytes -> int -> bytesSimilar to get_string_opt but for bytes.
sub_bytes_optbytes -> int -> int -> bytesSimilar to sub_string_opt but for bytes.
len_bytesbytes -> intSimilar to len_string but for bytes.
concat_bytesbytes -> bytes -> bytesSimilar to concat_string but for bytes.
firstpair 'a 'b -> 'aExtract the first element of a pair.
secondpair 'a 'b -> 'bExtract the second element of a pair.
find_opt'a -> map 'a 'b -> option 'bfind_opt k m looks up an entry associated with the key k from the map m. It returns Some v if v associated with k in m, or otherwise, it returns None.
update'a -> option 'b -> map 'a 'b -> map 'a 'bupdate k (Some v) m returns updated map in which the value associated with k in m to v; update k None m deletes the value associated with k from m.
empty_mapmap 'a 'bCreate the empty map.
mem'a -> set 'a -> boolmem x s returns True if x is in s, or otherwise, False.
add'a -> set 'a -> set 'aadd x s returns the set whose elements are x and ones of s.
remove'a -> set 'a -> set 'aremove x s returns the set whose elements are ones of s except x.
empty_setset 'aCreate the empty set.
opack'a -> bytesCorresponding to PACK in Michelson.
ounpack_optbytes -> option 'aCorresponding to UNPACK in Michelson.
osourceaddressCorresponding to SOURCE in Michelson.
osenderaddressCorresponding to SENDER in Michelson.
oselfcontract parameter_tyCorrespond to SELF in Michelson. There is a variant of the form self %A, which corresponds to SELF %A in Michelson.
oself_addraddressCorrespond to SELF_ADDR in Michelson.
onowtimestampCorrespond to NOW in Michelson.
obalancemutezCorrespond to BALANCE in Michelson.
oamountmutezCorrespond to AMOUNT in Michelson.
ochain_idchain_idCorrespond to CHAIN_ID in Michelson.
olevelnatCorrespond to LEVEL in Michelson.
ovoting_powerkey_hash -> natCorrespond to VOTING_POWER in Michelson.
ototal_voting_powernatCorrespond to TOTAL_VOTING_POWER in Michelson.
ocontract_optaddress -> option (contract 'a)Correspond to CONTRACT ty in Michelson. There is a variant of the form contract_opt %A, which corresponds to CONTRACT %A ty in Michelson.
oimplicit_accountkey_hash -> contract unitCorrespond to IMPLICIT_ACCOUNT in Michelson.
ocalllambda 'a 'b -> 'a -> 'b -> boolcall f a b, where f returns true if the application of lambda f to argument a terminates and evaluates to b; or false otherwise.
ohashkey -> key_hashCorrespond to HASH in Michelson.
oblake2bbytes -> bytesCorrespond to BLAKE2B in Michelson.
okeccakbytes -> bytesCorrespond to KECCAK in Michelson.
osha256bytes -> bytesCorrespond to SHA256 in Michelson.
osha512bytes -> bytesCorrespond to SHA512 in Michelson.
osha3bytes -> bytesCorrespond to SHA3 in Michelson.
osigkey -> signature -> bytes -> boolCorrespond to CHECK_SIGNATURE in Michelson.

Constructors#

Constructor ::= /* See the table below. */

Constructors construct and destruct data structure values. The following is a list of constructors.

NameTypeDescription
TrueboolTrue value of Boolean type
FalseboolFalse value of Boolean type
UnitunitThe unique value of the unit type
Nillist 'aThe empty list
Cons'a -> list 'a -> list 'aPrepend an element to a list
Pair'a -> 'b -> pair 'a 'bPair of two elements
Noneoption 'aAbsent value of the option type
Some'a -> option 'aPresent value of the option type
Left'a -> or 'a 'bLeft alternative of the or type
Right'a -> or 'b 'aRight alternative of the or type
Contractaddress -> contract 'aValue of the contract type.
SetDelegateoption key -> operationDelegate operation
TransferTokens'a -> mutez -> contract 'a -> operationTransaction operation. There is the short alias Transfer.
CreateContractoption address -> mutez -> 'a -> address -> operationContract origination operation.
Error'a -> exceptionGeneral exception value raised by FAILWITH instruction.
OverflowexceptionOverflow exception value
caution

Contract can be used as only a destructor, that is, in a pattern because some addresses cannot be a valid address of a contract as in Michelson. To construct a contract object, use the contract_opt function, which returns a contract object only if a given address can be assumed as a valid one.

Patterns#

PATTERN ::=  | VAR  | "_"  | CONSTRUCTOR ("<" SORT ">")? (PATTERN)*  | PATTERN "," PATTERN  | PATTERN "::" PATTERN  | "[" "]"  | "[" PATTERN (";" PATTERN)* "]"  | "(" PATTERN ")"

A pattern matches a specific form of structured value and binds variables to components of the matched data structure. The precedence and associativity of non-closed operators are shown in the table below, in which operators with higher precedence come first.

OperatorAssociativity
CONSTRUCTORleft
::right
,right

Variable patterns#

The variable pattern VAR matches any value and binds VAR to the value.

Any pattern#

The any pattern _ matches any value but does not bind any variables.

Constructor patterns#

The constructor pattern CONSTRUCTOR PATTERN_1 ... PATTERN_n matches a data structure constructed by CONSTRUCTOR and whose components are matched PATTERN_1 ... PATTERN_n, respectively. Of course, n must be the number of the argument of the constructor.

For the Pair and Cons constructors, we can use the familiar notation PATTERN "," PATTERN and PATTERN "::" PATTERN, respectively. For lists, there is also a list pattern [ PATTERN_1 ; ... ; PATTERN_n ], which matches lists of length n and whose elements are matched PATTERN_1 ... PATTERN_n, respectively.

Type parameters#

Contract, TransferTokens, CreateContract, and Error are followed by a type parameter of the form <ty> to specify those unique type variables as ty. For instance, the pattern Contract <nat> addr matches a value of the contract nat type and binds addr to the content (that is, a contract address) of the value. A type parameter can be omitted if a concerning type is inferred from context.

tip

Consequently, pattern matching for the operation and exception types requires a default clause, consisting of just variable or any pattern, to be exhaustive. That is because, for example, a value of the type exception can be Error <ty> v for any ty, and we have (currently) no sort to denote any type.

Expressions#

EXP ::=  | VAR  | LITERAL  | UOP EXP  | EXP BOP EXP  | CONSTRUCTOR (EXP)*  | FUNCTION (EXP)*  | EXP "." ACCESSER  | "if" EXP "then" EXP "else" EXP  | EXP "," EXP  | EXP ":" SORT  | VAR ":>" RTYPE "->" RTYPE "&" RTYPE  | "[" "]"  | "[" EXP (";" EXP)* "]"  | "match" EXP "with" PATTERN-MATCHING  | "(" EXP ")"PATTERN-MATCHING ::= ("|")? PATTERN "->" EXP ("|" PATTERN "->" EXP)*

An expression denotes a value of several sorts. The precedence and associativity of non-closed expressions are shown in the table below, in which expressions with higher precedence come first.

ExpressionAssociativity
Constructor and function applicationleft
- (prefixed) !-
* / modleft
+ -left
::right
^ @right
< > <= >= = <>left
&&right
||right
,right
:-

Basic expressions#

VAR and LITERAL denote a value those express.

Type annotation EXP : SORT specifies the type of EXP as SORT. It just affects the behavior of type inference of assertion language. So, there is no need to use it as far as the type inference succeeds.

Lambda annotation VAR :> RTYPE -> RTYPE & RTYPE checks a specification, written in LambdaAnnot style RTYPE -> RTYPE & RTYPE, of lambda bound for VAR; and it results in Boolean value denoting if the lambda satisfies the given specification. So this expression is a meaningful one for verification while type annotation is not.

Unary operations#

UOP ::= "-" | "!"

We have two unary operations: the numeral negation operator - for values of int, nat, and timestamp; and the Boolean negation ! for Boolean values.

Binary operations#

OP ::= "+" | "-" | "*" | "/" | "mod" | "<" | ">" | "<=" | ">=" | "=" | "<>" | "&&" | "||" | "::" | "^" | @

Arithmetic operations#

Arithmetic operations +, -, *, /, and mod calculates the numeric value between two numerals. Types of operands and result follow the ones of Michelson. For example, + has one of the type int -> int -> int, int -> nat -> int, nat -> int -> int, nat -> nat -> nat, timestamp -> int -> timestamp, int -> timestamp -> timestamp, and mutez -> mutez -> mutez depending on context.

Comparison operations#

Comparison operators compare two values of the same type and return an intended Boolean value. Currently, inequality operators <, >, <=, and >= only support numeric types int, nat, mutez, and timestamp.

note

Equality operators = and <> can be used for any type even if it is not comparable in Michelson.

Logical connectives#

&& and || denote the logical conjunction and logical disjunction between Boolean values, respectively.

List operations#

:: is infix notation of the Cons constructor. @ denotes the concatenation of two lists.

String concatenation#

^ denotes the concatenation of two strings.

Constructor and function applications#

Constructor application denotes the construction of a data structure. Function application denotes the result of the function.

caution

Constructor and function application must be fully applied.

Pair notations#

A pair can be constructed by comma operation EXP , EXP, which denotes Pair EXP EXP. The first and second components of a pair EXP can denote by EXP . first and EXP . second, respectively.

List notation#

A list can be constructed by list notation [ EXP_1 ; ... ; EXP_n ], which denotes a list consisting of EXP_1 to EXP_n in the order.

Control structures#

There are two conditional branching expressions. if EXP_1 then EXP_2 else EXP_3 denotes the value that EXP_2 denotes if the Boolean expression EXP_1 denotes True or EXP_3 otherwise. match EXP with PATTERN_1 -> EXP_1 | ... | PATTERN_n -> EXP_n denotes the value that EXP_i denotes if the value that EXP denotes matches PATTERN_i. The variables bound by PATTERN_i can be used in EXP_i. If more than one of the patterns are matched, the first one is selected. Furthermore, patterns are must be exhaustive; that is, any value of the type of EXP must be matched at least one of the patterns.

Refinement stack types#

RTYPE ::= "{" STACK "|" EXP "}"STACK ::= PATTERN |  PATTERN ":" STACK

Refinement stack types are broadly used expressions to describe a state of stack values. A refinement stack type consists of STACK, a stack pattern that binds variables to a matched stack values, and EXP, a predicate expression that describes a condition for stack values by using the bound variables. For instance, { x | True } denotes any singleton stack because the stack pattern x matches a stack of exactly one length, and the predicate expression True gives no condition on a value bound for x. In a similar manner, { x : y | True } denotes a stack of two lengths, { x : y : z | True } denotes a stack of three lengths, etc. A predicate expression can be any Boolean expression. So, for instance, { x | x > 0 } denotes a singleton stack of a positive integer.

We can use various patterns to match values in a stack. For instance, { (x, y) | True } denotes a singleton stack of a pair. Roughly speaking, { p1 : ... : pn | e } is interpreted as { x1 : ... : xn | match x1, ..., xn with p1, ..., pn -> e | _ -> False } (in precise, there are a few differences, e.g., when pn = _ as described below). So, { [x] | True } denotes a singleton stack of a singleton list (it is equivalent to { y | match y with [x] -> True | _ -> False }.

Given refinement stack types are typechecked to be a valid expression. For example, when { x | x > 0 } is placed in a code point in which a stack type is int : [] (as you know, Michelson code is statically typed, so we know a stack type of each code point), a type of x is inferred as int, and the refinement stack type is accepted as a valid expression. Contrary, when it is placed in a code point expecting bool : [], Helmholtz reports a type error because a type of x is inferred as bool and the predicate expression is ill-typed since 0 is not a Boolean value. Length mismatch is also reported as a type error as the Michelson type system does.

It sometimes happens that we are just interested in some top values of a stack. In such a case, we can use any pattern _ in the bottom position of a stack pattern, in which the any pattern denotes a stack of any length (including the empty stack). So, { _ | True } denotes any stack, { x : _ | True } denotes a stack of more than zero-length, etc. Note that { x : _ : y | True } denotes a stack of exactly three lengths since any pattern is not placed in the bottom position, and so it just matches any value in the middle of the stack.

Annotations#

ANNOTATION ::=  | "Assert" RTYPE  | "Assume" RTYPE  | "LoopInv" RTYPE  | "MapInv" VAR RTYPE  | "LambdaAnnot" RTYPE "->" RTYPE "&" RTYPE [ "(" (VAR ":" SORT)+ ")" ]  | "ContractAnnot" RTYPE "->" RTYPE "&" RTYPE [ "(" (VAR ":" SORT)+ ")" ]  | "Measure" VAR ":" "list" SORT "->" SORT "where" "[" "]" "=" EXP "|" VAR "::" VAR "=" EXP  | "Measure" VAR ":" "set" SORT "->" SORT "where" "EmptySet" "=" EXP "|" "Add" VAR VAR "=" EXP  | "Measure" VAR ":" "map" SORT SORT "->" SORT "where" "EmptyMap" "=" EXP "|" "Bind" VAR VAR VAR "=" EXP

An annotation is a root object of annotations for Helmholtz. It gives several operations, hints, etc., to Helmholtz. There are six kinds of annotations in Helmholtz: ContractAnnot, LambdaAnnot, LoopInv, Assert, Assume, and Measure. In those, ContractAnnot is mandatory for contract code, and LambdaAnnot and LoopInv are mandatory for the LAMBDA instruction and loop instructions (LOOP, ITER, etc.), respectively.

caution

In the current implementation, ContractAnnot can be omitted. However, in such a case, << ContractAnnot { _ | False } -> { _ | True } & { _ | True} >> is provided. It is a trivial (satisfied by any contract) specification.

ContractAnnot#

A contract annotation gives a specification of a contract. It is placed just before the code section and causes Helmholtz to check if the contract satisfies the given specification. A contract annotation has the form ContractAnnot RTYPE_pre -> RTYPE_post & RTYPE_abpost vars, where vars is optional.

Precondition#

RTYPE_pre describes a supposed state of an initial stack. So it is a refinement stack type for a stack type (pair parameter_ty storage_ty) : []. The stack pattern in RTYPE_pre is restricted to the form of (p1, p2) or _. A typical RTYPE_pre is { _ | True }, which means a contract can accept any parameter and storage value. It is also reasonable to give an assumption for a storage value since the storage is controlled by a contract. The following refinement stack type expresses that: the contract at the stored address must exist, and its parameter type must be string.

{ (_, addr) | match contract_opt addr with Some (Contract<string> _) -> True | None -> False }

Postcondition#

RTYPE_post describes a desirable state of a final stack. So it is a refinement stack type for a stack type (pair (list operation) storage_ty) : []. Helmholtz checks if this condition holds under the assumption described by RTYPE_pre and by analyzing the contract code.

Ab-Postcondition#

RTYPE_abpost describes an exceptional value condition. It is a refinement stack type for the stack type exception : [], where exception is a special data type consists of constructors Error v for a FAILWITH exception and Overflow for the overflow exception. A typical RTYPE_abpost is { _ | False }, which means no exception happens since no value can satisfy the False condition.

Logical variables#

Optional vars declares variables that can be used in annotations occurring in contract code (except the body of LAMBDA instruction). Typical use of declared variables is naming a value of a stack at some code point by combined with the Assume annotation so that we can refer to the value after the code point.

Special treatment of the scope of pre-condition stack variables.#

Basically, variables in the stack pattern of a refinement stack type can be used in its predicate expression. However, those in RTYPE_pre can also be used in predicate expressions of RTYPE_post, RTYPE_abpost, and annotations occurring in contract code except the body of LAMBDA instruction. This enables users to write a specification of input-output relation.

Multiple ContractAnnot#

Helmholtz accepts more than one ContractAnnot. In such a case, Helmholtz verifies if all the annotations satisfy or not. Furthermore, ContractAnnot can be given for a specific entrypoint in the following form.

<< ContractAnnot %entrypoint RTYPE -> RTYPE & RTYPE >>
note

ContractAnnot with an entrypoint is just a syntax sugar. That means << ContractAnnot %A { (p, s) | e } -> RTYPE & RTYPE >> is treated as << ContractAnnot { (Left p, s) | e } -> RTYPE & RTYPE >> if a contract has the parameter type (or (nat %A) (nat %B)).

LambdaAnnot#

A lambda annotation is a similar annotation to contract annotations but for LAMBDA instruction. So it must be placed just before the LAMBDA instruction and causes Helmholtz to check if a lambda pushed by the instruction satisfies the given specification. A lambda annotation has the form LambdaAnnot RTYPE_pre -> RTYPE_post & RTYPE_abpost vars, and each component has a similar meaning to ones of a contract annotation.

LoopInv#

A loop invariant annotation gives a loop invariant for loop-like instructions: ITER, LOOP, and LOOP_LEFT; and so, it must be placed just before those instructions. A loop invariant annotation has the form LoopInv RTYPE, where RTYPE gives a loop invariant, which is a condition for a stack before every loop iteration.

MapInv#

A map invariant is a variant of a loop invariant for the MAP instruction. A map invariant annotation has the form MapInv VAR RTYPE, where RTYPE is an invariant, but also VAR is bound for the processed elements of the MAP operation.

Assert#

An assert annotation lets Helmholtz check a stack condition at the given code point. An assertion annotation has the form Assert RTYPE, where RTYPE gives a stack condition being verified.

Assume#

An assume annotation adds a fact (hypothesis) about a stack condition at the given code point. An assertion has the form Assume RTYPE, where RTYPE gives a fact for a stack condition. A user can give any fact, and Helmholtz believes that the fact is correct, which means Helmholtz never checks if the fact holds. So, this annotation must be carefully used because if you give an incorrect fact, a verification result becomes nonsensical.

Measure#

A measure annotation defines a recursive function for an inductive data structure of the sort list, set, and map. The form of definition is rather restricted as syntax defines. Measure function can take one parameter, which is an inductive data structure, and gives the function definition for the base case and inductive case. In the inductive case, the function being defined can be used.

EmptySet, Add, EmptyMap, and Bind are special constructors which can only be used as syntax defined. The type of each constructor is shown in the table below.

NameTypeDescription
EmptySetset 'aThe empty set
Add'a -> set 'a -> set 'aAdd x s denotes the set whose elements are x and ones of s. Moreover, it is supposed that x is not an element of s.
EmptyMapmap 'k 'vThe empty map
Bind'k -> 'v -> map 'k 'v -> map 'k 'vBind k v m denotes the map whose bindings are k to v and ones of m. Moreover, it is supposed that k is not bound in m.