Assertion language
#
SortsSort ::= ...
Sorts are the types of Michelson. (Some of the types are not supported yet. See
Instruction support.) So, we often say type if no confusion.
One subtle difference is that sorts have no attribute like comparable, pushable,
etc., which Michelson types have. So you could use, for instance, set (list
int)
in an annotation while it is invalid in Michelson since set
requires a
comparable type but (list int)
is not. Anyway, such a difference makes no
problem because our interest is the behavior of Michelson code and no need to use
such sorts.
#
VariablesVAR ::= [a-z][a-z A-Z 0-9 _ ']*
Variables are alphanumerical identifiers starting with a small letter. Builtin functions' names and the following identifiers are reserved and cannot be used as variables.
if then else match with
#
LiteralsLITERAL ::= | NUMBER | STRING | BYTES
Literals denote base values. Those basically correspond to the ones of Michelson.
#
NumeralsNUMBER ::= | ("0x"|"0X"|"0o"|"0O"|"0b"|"0B")? ([0-9]"_"*)+
Numeral literal is a sequence of digits and underscore optionally prefixed by a radix indicator, where at least one digit exists and underscore is just ignored when interpreted. By default, Literals are interpreted in decimal. Otherwise, prefixed by "0x" or "0X", "0o" or "0O", and "0b" or "0B", literals are interpreted in hexadecimal, octal, and binary, respectively.
After being interpreted as natural numbers, literals are treated as one of the int
,
nat
, timestamp
, and mutez
sorts depending on context. When a literal is
treated as timestamp
, it is a Epoch time.
note
Negative numerals, int
and timestamp
, are obtained by the unary operator "-".
caution
Currently, numeral literals must be smaller than (in 64-bit system). If you need a larger one, compose it by using arithmetic operators.
#
StringsString literal is a sequence of space and printable ASCII characters enclosed by the double quote ". The following table shows escape sequences.
Sequence | Character |
---|---|
\\ | backslash |
\" | double quote |
\n | newline |
\t | horizontal tab |
\b | backspace |
\r | carriage return |
\[0-9][0-9][0-9] | the character of ASCII code in decimal |
\o[0-7][0-7][0-7] | the character of ASCII code in octal |
\x[0-9 a-f A-F][0-9 a-f A-F] | the character of ASCII code in hexadecimal |
After the interpretation, string is treated as one of the string
, address
,
key
, chain_id
, signature
, and timestamp
sorts depending on context and
the format of the string. The format of each data type follows the one of
Michelson. For instance, string of timestamp
is RFC3339 notation.
#
BytesBytes literal is a sequence of the hex characters [0-9 a-f A-F]
. A literal is
treated as one of the bytes
, key
, address
, signature
, and chain_id
sorts depending on context. For each sort, valid byte sequences in Michelson
are the valid literals.
#
FunctionsFunction ::= /* See the table below. */
Functions are the built-in ones and user-defined measures. The following is a
list of the built-in functions and those descriptions. The functions marked as
UF, most of which correspond to domain-specific instructions of Michelson, are
underspecified. That means, for example, Helmholtz does not know the actual
value of pack 0
but knows unpack_opt (pack 0) = Some 0
, etc. We give
reasonable assumptions for the UF functions, but the choice is indeed a bit
arbitrary. So it may happen that Helmholtz cannot verify a correct program.
Even so, Helmholtz can still reason that, for example, the stack top
after the PACK
instruction is pack x
where x
is bound for the stack top
before the instruction, which is enough information to verify many
specifications.
UF | Name | Type | Description |
---|---|---|---|
not | bool -> bool | not b negates the Boolean value b . | |
get_str_opt | string -> nat -> option string | get_str_opt s i returns i -th character of s as the one-length string. If indexing is invalid, None is returned. | |
sub_str_opt | string -> nat -> nat -> option string | sub_str_opt s i l returns the substring of s from i -th with the length of l . If indexing is invalid, None is returned. | |
len_str | string -> nat | len_str s returns the length of the string s . | |
concat_str | string -> string -> string | concat_str s1 s2 concatenates the two strings s1 and s2 . This is equivalent to s1 ^ s2 . | |
get_bytes_opt | bytes -> int -> bytes | Similar to get_string_opt but for bytes . | |
sub_bytes_opt | bytes -> int -> int -> bytes | Similar to sub_string_opt but for bytes . | |
len_bytes | bytes -> int | Similar to len_string but for bytes . | |
concat_bytes | bytes -> bytes -> bytes | Similar to concat_string but for bytes . | |
first | pair 'a 'b -> 'a | Extract the first element of a pair. | |
second | pair 'a 'b -> 'b | Extract the second element of a pair. | |
find_opt | 'a -> map 'a 'b -> option 'b | find_opt k m looks up an entry associated with the key k from the map m . It returns Some v if v associated with k in m , or otherwise, it returns None . | |
update | 'a -> option 'b -> map 'a 'b -> map 'a 'b | update k (Some v) m returns updated map in which the value associated with k in m to v ; update k None m deletes the value associated with k from m . | |
empty_map | map 'a 'b | Create the empty map. | |
mem | 'a -> set 'a -> bool | mem x s returns True if x is in s , or otherwise, False. | |
add | 'a -> set 'a -> set 'a | add x s returns the set whose elements are x and ones of s . | |
remove | 'a -> set 'a -> set 'a | remove x s returns the set whose elements are ones of s except x . | |
empty_set | set 'a | Create the empty set. | |
o | pack | 'a -> bytes | Corresponding to PACK in Michelson. |
o | unpack_opt | bytes -> option 'a | Corresponding to UNPACK in Michelson. |
o | source | address | Corresponding to SOURCE in Michelson. |
o | sender | address | Corresponding to SENDER in Michelson. |
o | self | contract parameter_ty | Correspond to SELF in Michelson. There is a variant of the form self %A , which corresponds to SELF %A in Michelson. |
o | self_addr | address | Correspond to SELF_ADDR in Michelson. |
o | now | timestamp | Correspond to NOW in Michelson. |
o | balance | mutez | Correspond to BALANCE in Michelson. |
o | amount | mutez | Correspond to AMOUNT in Michelson. |
o | chain_id | chain_id | Correspond to CHAIN_ID in Michelson. |
o | level | nat | Correspond to LEVEL in Michelson. |
o | voting_power | key_hash -> nat | Correspond to VOTING_POWER in Michelson. |
o | total_voting_power | nat | Correspond to TOTAL_VOTING_POWER in Michelson. |
o | contract_opt | address -> option (contract 'a) | Correspond to CONTRACT ty in Michelson. There is a variant of the form contract_opt %A , which corresponds to CONTRACT %A ty in Michelson. |
o | implicit_account | key_hash -> contract unit | Correspond to IMPLICIT_ACCOUNT in Michelson. |
o | call | lambda 'a 'b -> 'a -> 'b -> bool | call f a b , where f returns true if the application of lambda f to argument a terminates and evaluates to b ; or false otherwise. |
o | hash | key -> key_hash | Correspond to HASH in Michelson. |
o | blake2b | bytes -> bytes | Correspond to BLAKE2B in Michelson. |
o | keccak | bytes -> bytes | Correspond to KECCAK in Michelson. |
o | sha256 | bytes -> bytes | Correspond to SHA256 in Michelson. |
o | sha512 | bytes -> bytes | Correspond to SHA512 in Michelson. |
o | sha3 | bytes -> bytes | Correspond to SHA3 in Michelson. |
o | sig | key -> signature -> bytes -> bool | Correspond to CHECK_SIGNATURE in Michelson. |
#
ConstructorsConstructor ::= /* See the table below. */
Constructors construct and destruct data structure values. The following is a list of constructors.
Name | Type | Description |
---|---|---|
True | bool | True value of Boolean type |
False | bool | False value of Boolean type |
Unit | unit | The unique value of the unit type |
Nil | list 'a | The empty list |
Cons | 'a -> list 'a -> list 'a | Prepend an element to a list |
Pair | 'a -> 'b -> pair 'a 'b | Pair of two elements |
None | option 'a | Absent value of the option type |
Some | 'a -> option 'a | Present value of the option type |
Left | 'a -> or 'a 'b | Left alternative of the or type |
Right | 'a -> or 'b 'a | Right alternative of the or type |
Contract | address -> contract 'a | Value of the contract type. |
SetDelegate | option key -> operation | Delegate operation |
TransferTokens | 'a -> mutez -> contract 'a -> operation | Transaction operation. There is the short alias Transfer. |
CreateContract | option address -> mutez -> 'a -> address -> operation | Contract origination operation. |
Error | 'a -> exception | General exception value raised by FAILWITH instruction. |
Overflow | exception | Overflow exception value |
caution
Contract
can be used as only a destructor, that is, in a pattern because some
addresses cannot be a valid address of a contract as in Michelson. To construct a
contract object, use the contract_opt
function, which returns a contract
object only if a given address can be assumed as a valid one.
#
PatternsPATTERN ::= | VAR | "_" | CONSTRUCTOR ("<" SORT ">")? (PATTERN)* | PATTERN "," PATTERN | PATTERN "::" PATTERN | "[" "]" | "[" PATTERN (";" PATTERN)* "]" | "(" PATTERN ")"
A pattern matches a specific form of structured value and binds variables to components of the matched data structure. The precedence and associativity of non-closed operators are shown in the table below, in which operators with higher precedence come first.
Operator | Associativity |
---|---|
CONSTRUCTOR | left |
:: | right |
, | right |
#
Variable patternsThe variable pattern VAR
matches any value and binds VAR
to the value.
#
Any patternThe any pattern _
matches any value but does not bind any variables.
#
Constructor patternsThe constructor pattern CONSTRUCTOR PATTERN_1 ... PATTERN_n
matches a data
structure constructed by CONSTRUCTOR
and whose components are matched
PATTERN_1 ... PATTERN_n
, respectively. Of course, n
must be the number of the
argument of the constructor.
For the Pair
and Cons
constructors, we can use the familiar notation PATTERN
"," PATTERN
and PATTERN "::" PATTERN
, respectively. For lists, there is also
a list pattern [ PATTERN_1 ; ... ; PATTERN_n ]
, which matches lists of length
n and whose elements are matched PATTERN_1 ... PATTERN_n
, respectively.
#
Type parametersContract
, TransferTokens
, CreateContract
, and Error
are followed by a
type parameter of the form <ty>
to specify those unique type variables as ty
.
For instance, the pattern Contract <nat> addr
matches a value of the contract
nat
type and binds addr
to the content (that is, a contract address) of the value. A
type parameter can be omitted if a concerning type is inferred from context.
tip
Consequently, pattern matching for the operation
and exception
types
requires a default clause, consisting of just variable or any pattern, to be
exhaustive. That is because, for example, a value of the type exception
can
be Error <ty> v
for any ty
, and we have (currently) no sort to denote
any type.
#
ExpressionsEXP ::= | VAR | LITERAL | UOP EXP | EXP BOP EXP | CONSTRUCTOR (EXP)* | FUNCTION (EXP)* | EXP "." ACCESSER | "if" EXP "then" EXP "else" EXP | EXP "," EXP | EXP ":" SORT | VAR ":>" RTYPE "->" RTYPE "&" RTYPE | "[" "]" | "[" EXP (";" EXP)* "]" | "match" EXP "with" PATTERN-MATCHING | "(" EXP ")"PATTERN-MATCHING ::= ("|")? PATTERN "->" EXP ("|" PATTERN "->" EXP)*
An expression denotes a value of several sorts. The precedence and associativity of non-closed expressions are shown in the table below, in which expressions with higher precedence come first.
Expression | Associativity |
---|---|
Constructor and function application | left |
- (prefixed) ! | - |
* / mod | left |
+ - | left |
:: | right |
^ @ | right |
< > <= >= = <> | left |
&& | right |
|| | right |
, | right |
: | - |
#
Basic expressionsVAR
and LITERAL
denote a value those express.
Type annotation EXP : SORT
specifies the type of EXP
as SORT
. It just
affects the behavior of type inference of assertion language. So, there is no
need to use it as far as the type inference succeeds.
Lambda annotation VAR :> RTYPE -> RTYPE & RTYPE
checks a specification,
written in LambdaAnnot style RTYPE -> RTYPE & RTYPE
, of lambda bound for
VAR
; and it results in Boolean value denoting if the lambda satisfies the given
specification. So this expression is a meaningful one for verification while type
annotation is not.
#
Unary operationsUOP ::= "-" | "!"
We have two unary operations: the numeral negation operator -
for values of int
, nat
, and
timestamp
; and the Boolean negation !
for Boolean values.
#
Binary operationsOP ::= "+" | "-" | "*" | "/" | "mod" | "<" | ">" | "<=" | ">=" | "=" | "<>" | "&&" | "||" | "::" | "^" | @
#
Arithmetic operationsArithmetic operations +
, -
, *
, /
, and mod
calculates the numeric value
between two numerals. Types of operands and result follow the ones of
Michelson. For example, +
has one of the type int -> int -> int
, int ->
nat -> int
, nat -> int -> int
, nat -> nat -> nat
, timestamp -> int ->
timestamp
, int -> timestamp -> timestamp
, and mutez -> mutez -> mutez
depending on context.
#
Comparison operationsComparison operators compare two values of the same type and return an intended
Boolean value. Currently, inequality operators <
, >
, <=
, and >=
only
support numeric types int
, nat
, mutez
, and timestamp
.
note
Equality operators =
and <>
can be used for any type even if it is not
comparable in Michelson.
#
Logical connectives&&
and ||
denote the logical conjunction and logical disjunction between Boolean values, respectively.
#
List operations::
is infix notation of the Cons
constructor. @
denotes the concatenation of two lists.
#
String concatenation^
denotes the concatenation of two strings.
#
Constructor and function applicationsConstructor application denotes the construction of a data structure. Function application denotes the result of the function.
caution
Constructor and function application must be fully applied.
#
Pair notationsA pair can be constructed by comma operation EXP , EXP
, which denotes Pair EXP
EXP
. The first and second components of a pair EXP
can denote by EXP
. first
and EXP . second
, respectively.
#
List notationA list can be constructed by list notation [ EXP_1 ; ... ; EXP_n ]
, which
denotes a list consisting of EXP_1
to EXP_n
in the order.
#
Control structuresThere are two conditional branching expressions. if EXP_1 then EXP_2 else
EXP_3
denotes the value that EXP_2
denotes if the Boolean expression EXP_1
denotes True
or EXP_3
otherwise. match EXP with PATTERN_1 -> EXP_1 | ... |
PATTERN_n -> EXP_n
denotes the value that EXP_i
denotes if the value that
EXP
denotes matches PATTERN_i
. The variables bound by PATTERN_i
can be
used in EXP_i
. If more than one of the patterns are matched, the first one is
selected. Furthermore, patterns are must be exhaustive; that is, any value of
the type of EXP
must be matched at least one of the patterns.
#
Refinement stack typesRTYPE ::= "{" STACK "|" EXP "}"STACK ::= PATTERN | PATTERN ":" STACK
Refinement stack types are broadly used expressions to describe a state of stack
values. A refinement stack type consists of STACK
, a stack pattern that binds
variables to a matched stack values, and EXP
, a predicate expression that
describes a condition for stack values by using the bound variables. For
instance, { x | True }
denotes any singleton stack because the stack pattern
x
matches a stack of exactly one length, and the predicate expression True
gives no condition on a value bound for x
. In a similar manner, { x : y |
True }
denotes a stack of two lengths, { x : y : z | True }
denotes a stack of
three lengths, etc. A predicate expression can be any Boolean expression. So,
for instance, { x | x > 0 }
denotes a singleton stack of a positive integer.
We can use various patterns to match values in a stack. For instance, { (x,
y) | True }
denotes a singleton stack of a pair. Roughly speaking, { p1 :
... : pn | e }
is interpreted as { x1 : ... : xn | match x1, ..., xn with p1,
..., pn -> e | _ -> False }
(in precise, there are a few differences, e.g.,
when pn = _
as described below). So, { [x] | True }
denotes a singleton
stack of a singleton list (it is equivalent to { y | match y with [x]
-> True | _ -> False }
.
Given refinement stack types are typechecked to be a valid expression. For
example, when { x | x > 0 }
is placed in a code point in which a stack type is
int : []
(as you know, Michelson code is statically typed, so we know a stack
type of each code point), a type of x
is inferred as int
, and the refinement
stack type is accepted as a valid expression. Contrary, when it is placed in a
code point expecting bool : []
, Helmholtz reports a type error because a type
of x
is inferred as bool
and the predicate expression is ill-typed since 0
is not a Boolean value. Length mismatch is also reported as a type error as
the Michelson type system does.
It sometimes happens that we are just interested in some top values of a stack.
In such a case, we can use any pattern _
in the bottom position of a stack
pattern, in which the any pattern denotes a stack of any length (including the
empty stack). So, { _ | True }
denotes any stack, { x : _ | True }
denotes
a stack of more than zero-length, etc. Note that { x : _ : y | True }
denotes
a stack of exactly three lengths since any pattern is not placed in the bottom
position, and so it just matches any value in the middle of the stack.
#
AnnotationsANNOTATION ::= | "Assert" RTYPE | "Assume" RTYPE | "LoopInv" RTYPE | "MapInv" VAR RTYPE | "LambdaAnnot" RTYPE "->" RTYPE "&" RTYPE [ "(" (VAR ":" SORT)+ ")" ] | "ContractAnnot" RTYPE "->" RTYPE "&" RTYPE [ "(" (VAR ":" SORT)+ ")" ] | "Measure" VAR ":" "list" SORT "->" SORT "where" "[" "]" "=" EXP "|" VAR "::" VAR "=" EXP | "Measure" VAR ":" "set" SORT "->" SORT "where" "EmptySet" "=" EXP "|" "Add" VAR VAR "=" EXP | "Measure" VAR ":" "map" SORT SORT "->" SORT "where" "EmptyMap" "=" EXP "|" "Bind" VAR VAR VAR "=" EXP
An annotation is a root object of annotations for Helmholtz. It gives several operations, hints,
etc., to Helmholtz. There are six kinds of annotations in Helmholtz: ContractAnnot
, LambdaAnnot
,
LoopInv
, Assert
, Assume
, and Measure
. In those, ContractAnnot
is mandatory for contract
code, and LambdaAnnot
and LoopInv
are mandatory for the LAMBDA
instruction and loop
instructions (LOOP
, ITER
, etc.), respectively.
caution
In the current implementation, ContractAnnot can be omitted. However, in such a
case, << ContractAnnot { _ | False } -> { _ | True } & { _ | True} >>
is
provided. It is a trivial (satisfied by any contract) specification.
#
ContractAnnotA contract annotation gives a specification of a contract. It is placed just before the code section
and causes Helmholtz to check if the contract satisfies the given specification. A contract
annotation has the form ContractAnnot RTYPE_pre -> RTYPE_post & RTYPE_abpost vars
, where vars
is
optional.
#
PreconditionRTYPE_pre
describes a supposed state of an initial stack. So it is a
refinement stack type for a stack type (pair parameter_ty storage_ty) : []
.
The stack pattern in RTYPE_pre
is restricted to the form of (p1, p2)
or _
.
A typical RTYPE_pre
is { _ | True }
, which means a contract can accept any
parameter and storage value. It is also reasonable to give an assumption for a
storage value since the storage is controlled by a contract. The following
refinement stack type expresses that: the contract at the stored address must
exist, and its parameter type must be string
.
{ (_, addr) | match contract_opt addr with Some (Contract<string> _) -> True | None -> False }
#
PostconditionRTYPE_post
describes a desirable state of a final stack. So it is a
refinement stack type for a stack type (pair (list operation) storage_ty) :
[]
. Helmholtz checks if this condition holds under the assumption described by
RTYPE_pre
and by analyzing the contract code.
#
Ab-PostconditionRTYPE_abpost
describes an exceptional value condition. It is a refinement
stack type for the stack type exception : []
, where exception
is a special
data type consists of constructors Error v
for a FAILWITH
exception and
Overflow
for the overflow exception. A typical RTYPE_abpost
is { _ | False
}
, which means no exception happens since no value can satisfy the False
condition.
#
Logical variablesOptional vars
declares variables that can be used in annotations occurring in
contract code (except the body of LAMBDA
instruction). Typical use of
declared variables is naming a value of a stack at some code point by combined
with the Assume
annotation so that we can refer to the value after the code
point.
#
Special treatment of the scope of pre-condition stack variables.Basically, variables in the stack pattern of a refinement stack type can be used
in its predicate expression. However, those in RTYPE_pre
can also be used in
predicate expressions of RTYPE_post
, RTYPE_abpost
, and annotations occurring
in contract code except the body of LAMBDA
instruction. This enables users to
write a specification of input-output relation.
#
Multiple ContractAnnotHelmholtz accepts more than one ContractAnnot. In such a case, Helmholtz verifies if all the annotations satisfy or not. Furthermore, ContractAnnot can be given for a specific entrypoint in the following form.
<< ContractAnnot %entrypoint RTYPE -> RTYPE & RTYPE >>
note
ContractAnnot with an entrypoint is just a syntax sugar. That means <<
ContractAnnot %A { (p, s) | e } -> RTYPE & RTYPE >>
is treated as <<
ContractAnnot { (Left p, s) | e } -> RTYPE & RTYPE >>
if a contract has the
parameter type (or (nat %A) (nat %B))
.
#
LambdaAnnotA lambda annotation is a similar annotation to contract annotations but for LAMBDA
instruction.
So it must be placed just before the LAMBDA
instruction and causes Helmholtz to check if a lambda
pushed by the instruction satisfies the given specification. A lambda annotation has the form
LambdaAnnot RTYPE_pre -> RTYPE_post & RTYPE_abpost vars
, and each component has a similar meaning to
ones of a contract annotation.
#
LoopInvA loop invariant annotation gives a loop invariant for loop-like instructions:
ITER
, LOOP
, and LOOP_LEFT
; and so, it must be placed just before those
instructions. A loop invariant annotation has the form LoopInv RTYPE
, where
RTYPE
gives a loop invariant, which is a condition for a stack before every
loop iteration.
#
MapInvA map invariant is a variant of a loop invariant for the MAP
instruction. A
map invariant annotation has the form MapInv VAR RTYPE
, where RTYPE
is an
invariant, but also VAR
is bound for the processed elements of the MAP
operation.
#
AssertAn assert annotation lets Helmholtz check a stack condition at the given code point. An assertion
annotation has the form Assert RTYPE
, where RTYPE
gives a stack condition being verified.
#
AssumeAn assume annotation adds a fact (hypothesis) about a stack condition at the given code point. An
assertion has the form Assume RTYPE
, where RTYPE
gives a fact for a stack condition. A user can
give any fact, and Helmholtz believes that the fact is correct, which means Helmholtz never checks
if the fact holds. So, this annotation must be carefully used because if you give an incorrect
fact, a verification result becomes nonsensical.
#
MeasureA measure annotation defines a recursive function for an inductive data structure
of the sort list
, set
, and map
. The form of definition is rather
restricted as syntax defines. Measure function can take one parameter, which is
an inductive data structure, and gives the function definition for the base case
and inductive case. In the inductive case, the function being defined can be
used.
EmptySet
, Add
, EmptyMap
, and Bind
are special constructors which can only be
used as syntax defined. The type of each constructor is shown in the table below.
Name | Type | Description |
---|---|---|
EmptySet | set 'a | The empty set |
Add | 'a -> set 'a -> set 'a | Add x s denotes the set whose elements are x and ones of s . Moreover, it is supposed that x is not an element of s . |
EmptyMap | map 'k 'v | The empty map |
Bind | 'k -> 'v -> map 'k 'v -> map 'k 'v | Bind k v m denotes the map whose bindings are k to v and ones of m . Moreover, it is supposed that k is not bound in m . |