Assertion language
Sorts#
Sort ::= ...Sorts are the types of Michelson. (Some of the types are not supported yet. See
Instruction support.) So, we often say type if no confusion.
One subtle difference is that sorts have no attribute like comparable, pushable,
etc., which Michelson types have. So you could use, for instance, set (list
int) in an annotation while it is invalid in Michelson since set requires a
comparable type but (list int) is not. Anyway, such a difference makes no
problem because our interest is the behavior of Michelson code and no need to use
such sorts.
Variables#
VAR ::= [a-z][a-z A-Z 0-9 _ ']*Variables are alphanumerical identifiers starting with a small letter. Builtin functions' names and the following identifiers are reserved and cannot be used as variables.
if then else match withLiterals#
LITERAL ::= | NUMBER | STRING | BYTESLiterals denote base values. Those basically correspond to the ones of Michelson.
Numerals#
NUMBER ::= | ("0x"|"0X"|"0o"|"0O"|"0b"|"0B")? ([0-9]"_"*)+Numeral literal is a sequence of digits and underscore optionally prefixed by a radix indicator, where at least one digit exists and underscore is just ignored when interpreted. By default, Literals are interpreted in decimal. Otherwise, prefixed by "0x" or "0X", "0o" or "0O", and "0b" or "0B", literals are interpreted in hexadecimal, octal, and binary, respectively.
After being interpreted as natural numbers, literals are treated as one of the int,
nat, timestamp, and mutez sorts depending on context. When a literal is
treated as timestamp, it is a Epoch time.
note
Negative numerals, int and timestamp, are obtained by the unary operator "-".
caution
Currently, numeral literals must be smaller than (in 64-bit system). If you need a larger one, compose it by using arithmetic operators.
Strings#
String literal is a sequence of space and printable ASCII characters enclosed by the double quote ". The following table shows escape sequences.
| Sequence | Character |
|---|---|
\\ | backslash |
\" | double quote |
\n | newline |
\t | horizontal tab |
\b | backspace |
\r | carriage return |
\[0-9][0-9][0-9] | the character of ASCII code in decimal |
\o[0-7][0-7][0-7] | the character of ASCII code in octal |
\x[0-9 a-f A-F][0-9 a-f A-F] | the character of ASCII code in hexadecimal |
After the interpretation, string is treated as one of the string, address,
key, chain_id, signature, and timestamp sorts depending on context and
the format of the string. The format of each data type follows the one of
Michelson. For instance, string of timestamp is RFC3339 notation.
Bytes#
Bytes literal is a sequence of the hex characters [0-9 a-f A-F]. A literal is
treated as one of the bytes, key, address, signature, and chain_id
sorts depending on context. For each sort, valid byte sequences in Michelson
are the valid literals.
Functions#
Function ::= /* See the table below. */Functions are the built-in ones and user-defined measures. The following is a
list of the built-in functions and those descriptions. The functions marked as
UF, most of which correspond to domain-specific instructions of Michelson, are
underspecified. That means, for example, Helmholtz does not know the actual
value of pack 0 but knows unpack_opt (pack 0) = Some 0, etc. We give
reasonable assumptions for the UF functions, but the choice is indeed a bit
arbitrary. So it may happen that Helmholtz cannot verify a correct program.
Even so, Helmholtz can still reason that, for example, the stack top
after the PACK instruction is pack x where x is bound for the stack top
before the instruction, which is enough information to verify many
specifications.
| UF | Name | Type | Description |
|---|---|---|---|
not | bool -> bool | not b negates the Boolean value b. | |
get_str_opt | string -> nat -> option string | get_str_opt s i returns i-th character of s as the one-length string. If indexing is invalid, None is returned. | |
sub_str_opt | string -> nat -> nat -> option string | sub_str_opt s i l returns the substring of s from i-th with the length of l. If indexing is invalid, None is returned. | |
len_str | string -> nat | len_str s returns the length of the string s. | |
concat_str | string -> string -> string | concat_str s1 s2 concatenates the two strings s1 and s2. This is equivalent to s1 ^ s2. | |
get_bytes_opt | bytes -> int -> bytes | Similar to get_string_opt but for bytes. | |
sub_bytes_opt | bytes -> int -> int -> bytes | Similar to sub_string_opt but for bytes. | |
len_bytes | bytes -> int | Similar to len_string but for bytes. | |
concat_bytes | bytes -> bytes -> bytes | Similar to concat_string but for bytes. | |
first | pair 'a 'b -> 'a | Extract the first element of a pair. | |
second | pair 'a 'b -> 'b | Extract the second element of a pair. | |
find_opt | 'a -> map 'a 'b -> option 'b | find_opt k m looks up an entry associated with the key k from the map m. It returns Some v if v associated with k in m, or otherwise, it returns None. | |
update | 'a -> option 'b -> map 'a 'b -> map 'a 'b | update k (Some v) m returns updated map in which the value associated with k in m to v; update k None m deletes the value associated with k from m. | |
empty_map | map 'a 'b | Create the empty map. | |
mem | 'a -> set 'a -> bool | mem x s returns True if x is in s, or otherwise, False. | |
add | 'a -> set 'a -> set 'a | add x s returns the set whose elements are x and ones of s. | |
remove | 'a -> set 'a -> set 'a | remove x s returns the set whose elements are ones of s except x. | |
empty_set | set 'a | Create the empty set. | |
| o | pack | 'a -> bytes | Corresponding to PACK in Michelson. |
| o | unpack_opt | bytes -> option 'a | Corresponding to UNPACK in Michelson. |
| o | source | address | Corresponding to SOURCE in Michelson. |
| o | sender | address | Corresponding to SENDER in Michelson. |
| o | self | contract parameter_ty | Correspond to SELF in Michelson. There is a variant of the form self %A, which corresponds to SELF %A in Michelson. |
| o | self_addr | address | Correspond to SELF_ADDR in Michelson. |
| o | now | timestamp | Correspond to NOW in Michelson. |
| o | balance | mutez | Correspond to BALANCE in Michelson. |
| o | amount | mutez | Correspond to AMOUNT in Michelson. |
| o | chain_id | chain_id | Correspond to CHAIN_ID in Michelson. |
| o | level | nat | Correspond to LEVEL in Michelson. |
| o | voting_power | key_hash -> nat | Correspond to VOTING_POWER in Michelson. |
| o | total_voting_power | nat | Correspond to TOTAL_VOTING_POWER in Michelson. |
| o | contract_opt | address -> option (contract 'a) | Correspond to CONTRACT ty in Michelson. There is a variant of the form contract_opt %A, which corresponds to CONTRACT %A ty in Michelson. |
| o | implicit_account | key_hash -> contract unit | Correspond to IMPLICIT_ACCOUNT in Michelson. |
| o | call | lambda 'a 'b -> 'a -> 'b -> bool | call f a b, where f returns true if the application of lambda f to argument a terminates and evaluates to b; or false otherwise. |
| o | hash | key -> key_hash | Correspond to HASH in Michelson. |
| o | blake2b | bytes -> bytes | Correspond to BLAKE2B in Michelson. |
| o | keccak | bytes -> bytes | Correspond to KECCAK in Michelson. |
| o | sha256 | bytes -> bytes | Correspond to SHA256 in Michelson. |
| o | sha512 | bytes -> bytes | Correspond to SHA512 in Michelson. |
| o | sha3 | bytes -> bytes | Correspond to SHA3 in Michelson. |
| o | sig | key -> signature -> bytes -> bool | Correspond to CHECK_SIGNATURE in Michelson. |
Constructors#
Constructor ::= /* See the table below. */Constructors construct and destruct data structure values. The following is a list of constructors.
| Name | Type | Description |
|---|---|---|
True | bool | True value of Boolean type |
False | bool | False value of Boolean type |
Unit | unit | The unique value of the unit type |
Nil | list 'a | The empty list |
Cons | 'a -> list 'a -> list 'a | Prepend an element to a list |
Pair | 'a -> 'b -> pair 'a 'b | Pair of two elements |
None | option 'a | Absent value of the option type |
Some | 'a -> option 'a | Present value of the option type |
Left | 'a -> or 'a 'b | Left alternative of the or type |
Right | 'a -> or 'b 'a | Right alternative of the or type |
Contract | address -> contract 'a | Value of the contract type. |
SetDelegate | option key -> operation | Delegate operation |
TransferTokens | 'a -> mutez -> contract 'a -> operation | Transaction operation. There is the short alias Transfer. |
CreateContract | option address -> mutez -> 'a -> address -> operation | Contract origination operation. |
Error | 'a -> exception | General exception value raised by FAILWITH instruction. |
Overflow | exception | Overflow exception value |
caution
Contract can be used as only a destructor, that is, in a pattern because some
addresses cannot be a valid address of a contract as in Michelson. To construct a
contract object, use the contract_opt function, which returns a contract
object only if a given address can be assumed as a valid one.
Patterns#
PATTERN ::= | VAR | "_" | CONSTRUCTOR ("<" SORT ">")? (PATTERN)* | PATTERN "," PATTERN | PATTERN "::" PATTERN | "[" "]" | "[" PATTERN (";" PATTERN)* "]" | "(" PATTERN ")"A pattern matches a specific form of structured value and binds variables to components of the matched data structure. The precedence and associativity of non-closed operators are shown in the table below, in which operators with higher precedence come first.
| Operator | Associativity |
|---|---|
CONSTRUCTOR | left |
:: | right |
, | right |
Variable patterns#
The variable pattern VAR matches any value and binds VAR to the value.
Any pattern#
The any pattern _ matches any value but does not bind any variables.
Constructor patterns#
The constructor pattern CONSTRUCTOR PATTERN_1 ... PATTERN_n matches a data
structure constructed by CONSTRUCTOR and whose components are matched
PATTERN_1 ... PATTERN_n, respectively. Of course, n must be the number of the
argument of the constructor.
For the Pair and Cons constructors, we can use the familiar notation PATTERN
"," PATTERN and PATTERN "::" PATTERN, respectively. For lists, there is also
a list pattern [ PATTERN_1 ; ... ; PATTERN_n ], which matches lists of length
n and whose elements are matched PATTERN_1 ... PATTERN_n, respectively.
Type parameters#
Contract, TransferTokens, CreateContract, and Error are followed by a
type parameter of the form <ty> to specify those unique type variables as ty.
For instance, the pattern Contract <nat> addr matches a value of the contract
nat type and binds addr to the content (that is, a contract address) of the value. A
type parameter can be omitted if a concerning type is inferred from context.
tip
Consequently, pattern matching for the operation and exception types
requires a default clause, consisting of just variable or any pattern, to be
exhaustive. That is because, for example, a value of the type exception can
be Error <ty> v for any ty, and we have (currently) no sort to denote
any type.
Expressions#
EXP ::= | VAR | LITERAL | UOP EXP | EXP BOP EXP | CONSTRUCTOR (EXP)* | FUNCTION (EXP)* | EXP "." ACCESSER | "if" EXP "then" EXP "else" EXP | EXP "," EXP | EXP ":" SORT | VAR ":>" RTYPE "->" RTYPE "&" RTYPE | "[" "]" | "[" EXP (";" EXP)* "]" | "match" EXP "with" PATTERN-MATCHING | "(" EXP ")"PATTERN-MATCHING ::= ("|")? PATTERN "->" EXP ("|" PATTERN "->" EXP)*An expression denotes a value of several sorts. The precedence and associativity of non-closed expressions are shown in the table below, in which expressions with higher precedence come first.
| Expression | Associativity |
|---|---|
| Constructor and function application | left |
- (prefixed) ! | - |
* / mod | left |
+ - | left |
:: | right |
^ @ | right |
< > <= >= = <> | left |
&& | right |
| || | right |
, | right |
: | - |
Basic expressions#
VAR and LITERAL denote a value those express.
Type annotation EXP : SORT specifies the type of EXP as SORT. It just
affects the behavior of type inference of assertion language. So, there is no
need to use it as far as the type inference succeeds.
Lambda annotation VAR :> RTYPE -> RTYPE & RTYPE checks a specification,
written in LambdaAnnot style RTYPE -> RTYPE & RTYPE, of lambda bound for
VAR; and it results in Boolean value denoting if the lambda satisfies the given
specification. So this expression is a meaningful one for verification while type
annotation is not.
Unary operations#
UOP ::= "-" | "!"We have two unary operations: the numeral negation operator - for values of int, nat, and
timestamp; and the Boolean negation ! for Boolean values.
Binary operations#
OP ::= "+" | "-" | "*" | "/" | "mod" | "<" | ">" | "<=" | ">=" | "=" | "<>" | "&&" | "||" | "::" | "^" | @Arithmetic operations#
Arithmetic operations +, -, *, /, and mod calculates the numeric value
between two numerals. Types of operands and result follow the ones of
Michelson. For example, + has one of the type int -> int -> int, int ->
nat -> int, nat -> int -> int, nat -> nat -> nat, timestamp -> int ->
timestamp, int -> timestamp -> timestamp, and mutez -> mutez -> mutez
depending on context.
Comparison operations#
Comparison operators compare two values of the same type and return an intended
Boolean value. Currently, inequality operators <, >, <=, and >= only
support numeric types int, nat, mutez, and timestamp.
note
Equality operators = and <> can be used for any type even if it is not
comparable in Michelson.
Logical connectives#
&& and || denote the logical conjunction and logical disjunction between Boolean values, respectively.
List operations#
:: is infix notation of the Cons constructor. @ denotes the concatenation of two lists.
String concatenation#
^ denotes the concatenation of two strings.
Constructor and function applications#
Constructor application denotes the construction of a data structure. Function application denotes the result of the function.
caution
Constructor and function application must be fully applied.
Pair notations#
A pair can be constructed by comma operation EXP , EXP, which denotes Pair EXP
EXP. The first and second components of a pair EXP can denote by EXP
. first and EXP . second, respectively.
List notation#
A list can be constructed by list notation [ EXP_1 ; ... ; EXP_n ], which
denotes a list consisting of EXP_1 to EXP_n in the order.
Control structures#
There are two conditional branching expressions. if EXP_1 then EXP_2 else
EXP_3 denotes the value that EXP_2 denotes if the Boolean expression EXP_1
denotes True or EXP_3 otherwise. match EXP with PATTERN_1 -> EXP_1 | ... |
PATTERN_n -> EXP_n denotes the value that EXP_i denotes if the value that
EXP denotes matches PATTERN_i. The variables bound by PATTERN_i can be
used in EXP_i. If more than one of the patterns are matched, the first one is
selected. Furthermore, patterns are must be exhaustive; that is, any value of
the type of EXP must be matched at least one of the patterns.
Refinement stack types#
RTYPE ::= "{" STACK "|" EXP "}"STACK ::= PATTERN | PATTERN ":" STACKRefinement stack types are broadly used expressions to describe a state of stack
values. A refinement stack type consists of STACK, a stack pattern that binds
variables to a matched stack values, and EXP, a predicate expression that
describes a condition for stack values by using the bound variables. For
instance, { x | True } denotes any singleton stack because the stack pattern
x matches a stack of exactly one length, and the predicate expression True
gives no condition on a value bound for x. In a similar manner, { x : y |
True } denotes a stack of two lengths, { x : y : z | True } denotes a stack of
three lengths, etc. A predicate expression can be any Boolean expression. So,
for instance, { x | x > 0 } denotes a singleton stack of a positive integer.
We can use various patterns to match values in a stack. For instance, { (x,
y) | True } denotes a singleton stack of a pair. Roughly speaking, { p1 :
... : pn | e } is interpreted as { x1 : ... : xn | match x1, ..., xn with p1,
..., pn -> e | _ -> False } (in precise, there are a few differences, e.g.,
when pn = _ as described below). So, { [x] | True } denotes a singleton
stack of a singleton list (it is equivalent to { y | match y with [x]
-> True | _ -> False }.
Given refinement stack types are typechecked to be a valid expression. For
example, when { x | x > 0 } is placed in a code point in which a stack type is
int : [] (as you know, Michelson code is statically typed, so we know a stack
type of each code point), a type of x is inferred as int, and the refinement
stack type is accepted as a valid expression. Contrary, when it is placed in a
code point expecting bool : [], Helmholtz reports a type error because a type
of x is inferred as bool and the predicate expression is ill-typed since 0
is not a Boolean value. Length mismatch is also reported as a type error as
the Michelson type system does.
It sometimes happens that we are just interested in some top values of a stack.
In such a case, we can use any pattern _ in the bottom position of a stack
pattern, in which the any pattern denotes a stack of any length (including the
empty stack). So, { _ | True } denotes any stack, { x : _ | True } denotes
a stack of more than zero-length, etc. Note that { x : _ : y | True } denotes
a stack of exactly three lengths since any pattern is not placed in the bottom
position, and so it just matches any value in the middle of the stack.
Annotations#
ANNOTATION ::= | "Assert" RTYPE | "Assume" RTYPE | "LoopInv" RTYPE | "MapInv" VAR RTYPE | "LambdaAnnot" RTYPE "->" RTYPE "&" RTYPE [ "(" (VAR ":" SORT)+ ")" ] | "ContractAnnot" RTYPE "->" RTYPE "&" RTYPE [ "(" (VAR ":" SORT)+ ")" ] | "Measure" VAR ":" "list" SORT "->" SORT "where" "[" "]" "=" EXP "|" VAR "::" VAR "=" EXP | "Measure" VAR ":" "set" SORT "->" SORT "where" "EmptySet" "=" EXP "|" "Add" VAR VAR "=" EXP | "Measure" VAR ":" "map" SORT SORT "->" SORT "where" "EmptyMap" "=" EXP "|" "Bind" VAR VAR VAR "=" EXPAn annotation is a root object of annotations for Helmholtz. It gives several operations, hints,
etc., to Helmholtz. There are six kinds of annotations in Helmholtz: ContractAnnot, LambdaAnnot,
LoopInv, Assert, Assume, and Measure. In those, ContractAnnot is mandatory for contract
code, and LambdaAnnot and LoopInv are mandatory for the LAMBDA instruction and loop
instructions (LOOP, ITER, etc.), respectively.
caution
In the current implementation, ContractAnnot can be omitted. However, in such a
case, << ContractAnnot { _ | False } -> { _ | True } & { _ | True} >> is
provided. It is a trivial (satisfied by any contract) specification.
ContractAnnot#
A contract annotation gives a specification of a contract. It is placed just before the code section
and causes Helmholtz to check if the contract satisfies the given specification. A contract
annotation has the form ContractAnnot RTYPE_pre -> RTYPE_post & RTYPE_abpost vars, where vars is
optional.
Precondition#
RTYPE_pre describes a supposed state of an initial stack. So it is a
refinement stack type for a stack type (pair parameter_ty storage_ty) : [].
The stack pattern in RTYPE_pre is restricted to the form of (p1, p2) or _.
A typical RTYPE_pre is { _ | True }, which means a contract can accept any
parameter and storage value. It is also reasonable to give an assumption for a
storage value since the storage is controlled by a contract. The following
refinement stack type expresses that: the contract at the stored address must
exist, and its parameter type must be string.
{ (_, addr) | match contract_opt addr with Some (Contract<string> _) -> True | None -> False }Postcondition#
RTYPE_post describes a desirable state of a final stack. So it is a
refinement stack type for a stack type (pair (list operation) storage_ty) :
[]. Helmholtz checks if this condition holds under the assumption described by
RTYPE_pre and by analyzing the contract code.
Ab-Postcondition#
RTYPE_abpost describes an exceptional value condition. It is a refinement
stack type for the stack type exception : [], where exception is a special
data type consists of constructors Error v for a FAILWITH exception and
Overflow for the overflow exception. A typical RTYPE_abpost is { _ | False
}, which means no exception happens since no value can satisfy the False
condition.
Logical variables#
Optional vars declares variables that can be used in annotations occurring in
contract code (except the body of LAMBDA instruction). Typical use of
declared variables is naming a value of a stack at some code point by combined
with the Assume annotation so that we can refer to the value after the code
point.
Special treatment of the scope of pre-condition stack variables.#
Basically, variables in the stack pattern of a refinement stack type can be used
in its predicate expression. However, those in RTYPE_pre can also be used in
predicate expressions of RTYPE_post, RTYPE_abpost, and annotations occurring
in contract code except the body of LAMBDA instruction. This enables users to
write a specification of input-output relation.
Multiple ContractAnnot#
Helmholtz accepts more than one ContractAnnot. In such a case, Helmholtz verifies if all the annotations satisfy or not. Furthermore, ContractAnnot can be given for a specific entrypoint in the following form.
<< ContractAnnot %entrypoint RTYPE -> RTYPE & RTYPE >>note
ContractAnnot with an entrypoint is just a syntax sugar. That means <<
ContractAnnot %A { (p, s) | e } -> RTYPE & RTYPE >> is treated as <<
ContractAnnot { (Left p, s) | e } -> RTYPE & RTYPE >> if a contract has the
parameter type (or (nat %A) (nat %B)).
LambdaAnnot#
A lambda annotation is a similar annotation to contract annotations but for LAMBDA instruction.
So it must be placed just before the LAMBDA instruction and causes Helmholtz to check if a lambda
pushed by the instruction satisfies the given specification. A lambda annotation has the form
LambdaAnnot RTYPE_pre -> RTYPE_post & RTYPE_abpost vars, and each component has a similar meaning to
ones of a contract annotation.
LoopInv#
A loop invariant annotation gives a loop invariant for loop-like instructions:
ITER, LOOP, and LOOP_LEFT; and so, it must be placed just before those
instructions. A loop invariant annotation has the form LoopInv RTYPE, where
RTYPE gives a loop invariant, which is a condition for a stack before every
loop iteration.
MapInv#
A map invariant is a variant of a loop invariant for the MAP instruction. A
map invariant annotation has the form MapInv VAR RTYPE, where RTYPE is an
invariant, but also VAR is bound for the processed elements of the MAP
operation.
Assert#
An assert annotation lets Helmholtz check a stack condition at the given code point. An assertion
annotation has the form Assert RTYPE, where RTYPE gives a stack condition being verified.
Assume#
An assume annotation adds a fact (hypothesis) about a stack condition at the given code point. An
assertion has the form Assume RTYPE, where RTYPE gives a fact for a stack condition. A user can
give any fact, and Helmholtz believes that the fact is correct, which means Helmholtz never checks
if the fact holds. So, this annotation must be carefully used because if you give an incorrect
fact, a verification result becomes nonsensical.
Measure#
A measure annotation defines a recursive function for an inductive data structure
of the sort list, set, and map. The form of definition is rather
restricted as syntax defines. Measure function can take one parameter, which is
an inductive data structure, and gives the function definition for the base case
and inductive case. In the inductive case, the function being defined can be
used.
EmptySet, Add, EmptyMap, and Bind are special constructors which can only be
used as syntax defined. The type of each constructor is shown in the table below.
| Name | Type | Description |
|---|---|---|
EmptySet | set 'a | The empty set |
Add | 'a -> set 'a -> set 'a | Add x s denotes the set whose elements are x and ones of s. Moreover, it is supposed that x is not an element of s. |
EmptyMap | map 'k 'v | The empty map |
Bind | 'k -> 'v -> map 'k 'v -> map 'k 'v | Bind k v m denotes the map whose bindings are k to v and ones of m. Moreover, it is supposed that k is not bound in m. |