The Mercury Language Reference Manual

Copyright (C) 1995 University of Melbourne.

Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.

Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided also that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.

Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions.

Introduction

Mercury is a new general-purpose programming language, designed and implemented by a small group of researchers at the University of Melbourne, Australia. Mercury is based on the paradigm of pure declarative logic programming, and was designed to be useful for the development of large and robust "real-world" applications. It improves on existing logic programming languages by providing increased productivity, reliability and efficiency, and by avoiding the need for non-logical program constructs.

Mercury requires programmers to supply type, mode and determinism declarations for the predicates they write. The compiler checks these declarations, and rejects the program if it cannot prove that every predicate satisfies its declarations. This improves reliability, since many kinds of errors simply cannot happen in successfully compiled Mercury programs. It also improves productivity, since the compiler pinpoints many errors that would otherwise require manual debugging to locate. The fact that declarations are checked by the compiler makes them much more useful than comments to anyone who has to maintain the program. The compiler also exploits the guaranteed correctness of the declarations for significantly improving the efficiency of the code it generates.

To facilitate programming-in-the-large, to allow separate compilation, and to support encapsulation, Mercury has a simple module system. Mercury's standard library has a variety of pre-defined modules for common programming tasks -- see the Mercury Library Reference Manual.

Syntax

Mercury's syntax is similar to the syntax of Prolog, with some additional declarations for types, modes, determinism, and the module system. A Mercury program consists of a set of modules. Each module is a file containing a sequence of items (declarations and clauses). Each item is a term followed by a period. Each term is composed of a sequence of tokens, and each token is composed of a sequence of characters. Like Prolog, Mercury has the Definite Clause Grammar (DCG) notation for clauses.

Tokens

Tokens in Mercury are exactly the same as in ISO Prolog.

Terms

Terms in Mercury are exactly the same as in ISO Prolog, with one exception: double-quoted strings are atomic in Mercury, they are not abbreviations for lists of character codes.

Items

Each item in a Mercury module is either a declaration or a clause. If the top-level functor of the term is `:-/1', the item is a declaration, otherwise it is a clause. There are three types of clauses. If the top-level functor of the item is `:-/2', the item is a rule. If the top-level functor is `-->/2', the item is a DCG rule. Otherwise, the item is a fact.

Declarations

The allowed declarations are:

:- type
:- pred
:- inst
:- mode
:- module
:- interface
:- implementation
:- import_module
:- external
:- end_module

The `type' and `pred' declarations are used for the type system, the `inst' and `mode' declarations are for the mode system, and the remainder are for the module system. They are described in more detail in their respective chapters.

(The current implementation allows `when/2' declarations, but ignores them. This helps when one wants to write a program that is both a Mercury program and an NU-Prolog program.)

Facts

A fact is an item of the form `Head', where the top-level functor of Head is not :-/1, :-/2, or -->/2. The Head term must not be a variable. The top-level functor of the Head determines which predicate the clause belongs to; the predicate must have been declared in a previous `pred' declaration in this module. A fact is equivalent to a rule `Head :- true'.

Rules

A rule is an item of the form `Head :- Body'. The Head term must not be a variable. The top-level functor of the Head determines which predicate the clause belongs to; the predicate must have been declared in a previous `pred' declaration in this module. The Body must be a valid goal.

Goals

A goal is a term of one of the following forms:

some Vars Goal
An existential quantification. Vars must be a list of variables. Goal must be a valid goal.
all Vars Goal
A universal quantification. Vars must be a list of variables. Goal must be a valid goal. This is an abbreviation for `not (some Vars not Goal)'.
Goal1, Goal2
A conjunction. Goal1 and Goal2 must be valid goals.
Goal1 ; Goal2
where Goal1 is not of the form `Goal1a -> Goal1b': a disjunction. Goal1 and Goal2 must be valid goals.
true
The empty conjunction. Always succeeds.
fail
The empty disjunction. Always fails.
not Goal
\+ Goal
A negation. The two different syntaxes have identical semantics. Goal must be a valid goal.
Goal1 => Goal2
An implication. This is an abbreviation for `not (Goal1, not Goal2)'.
Goal1 <= Goal2
A reverse implication. This is an abbreviation for `not (Goal2, not Goal1)'.
Goal1 <=> Goal2
A logical equivalence. This is an abbreviation for `(Goal1 => Goal2), (Goal1 <= Goal2').
if CondGoal then ThenGoal else ElseGoal
CondGoal -> ThenGoal ; ElseGoal
An if-then-else. The two different syntaxes have identical semantics. CondGoal, ThenGoal, and ElseGoal must be valid goals. Note that the "else" part is not optional.
Term1 = Term2
A unification.
Term1 \= Term2
An inequality. This is an abbreviation for `not (Term1 = Term2)'.
Call
Any term which does not match any of the above forms must be a predicate call. If the term is a variable Var, it is treated as if it were `call(Var)'. Otherwise, the top-level functor of the term determines the predicate called; the predicate must be declared in a pred declaration in the module or in the interface of an imported module.

DCG-rules

DCG-rules in Mercury have identical syntax and semantics to DCG-rules in Prolog.

A DCG-rule is an item of the form `Head --> Body'. The Head term must not be a variable. A DCG-rule is an abbreviation for an ordinary rule with two additional implicit arguments appended to the arguments of Head. These arguments are fresh variables which we shall call V_in and V_out. The Body must be a valid DCG-goal, and is an abbreviation for an ordinary goal. The next section defines a mathematical function `DCG-transform(V_in, V_out, DCG-goal)' which specifies the semantics of how DCG goals are transformed into ordinary goals. (The `DCG-transform' function is purely for the purposes of exposition, to define the semantics -- it is not part of the language.)

DCG-goals

A DCG-goal is a term of one of the following forms:

some Vars DCG-goal
A DCG existential quantification. Vars must be a list of variables. DCG-goal must be a valid DCG-goal. Semantics:
transform(V_in, V_out, some Vars DCG_goal) = 
some Vars transform(V_in, V_out, DCG_goal)
all Vars DCG-goal
A DCG universal quantification. Vars must be a list of variables. DCG-goal must be a valid DCG-goal. Semantics:
transform(V_in, V_out, all Vars DCG_goal) = 
all Vars transform(V_in, V_out, DCG_goal)
DCG-goal1, DCG-goal2
A DCG sequence. Intuitively, this means "parse DCG-goal1 and then parse DCG-goal2" or "do DCG-goal1 and then do DCG-goal2". (Note that the only way this construct actually forces the desired sequencing is by the modes of the implicit DCG arguments.) DCG-goal1 and DCG-goal2 must be valid DCG-goals. Semantics:
transform(V_in, V_out, (DCG-goal1, DCG-goal2)) =
transform(V_in, V_new, DCG_goal1), transform(V_new, V_out, DCG_goal2)
where V_new is a fresh variable.
DCG-goal1 ; DCG-goal2
A disjunction. DCG-goal1 and DCG-goal2 must be valid goals. DCG-goal1 must not be of the form `DCG-goal1a -> DCG-goal1b'. (If it is, then the goal is an if-then-else, not a disjunction.) Semantics:
transform(V_in, V_out, (DCG_goal1 ; DCG_goal2)) =
transform(V_in, V_out, DCG_goal1) ; transform(V_in, V_out, DCG_goal2)
{ Goal }
A brace-enclosed ordinary goal. Goal must be a valid goal. Semantics:
transform(V_in, V_out, { Goal }) = (Goal, V_out = V_in)
[Term, ...]
A DCG input match. Unifies the implicit DCG input variable V_in, which must have type `list(_)', with a list whose initial elements are the terms specified and whose tail is the implicit DCG output variable V_out. Semantics:
transform(V_in, V_out, [Term1, ...]) = (V_in = [Term, ... | V_Out])
[]
The null DCG goal (an empty DCG input match). Equivalent to `{ true }'. Semantics:
transform(V_in, V_out, []) = (V_out = V_in)
not DCG-goal
\+ DCG-goal
A DCG negation. The two different syntaxes have identical semantics. Goal must be a valid goal. Semantics:
transform(V_in, V_out, not DCG_goal) =
(not transform(V_in, V_new, DCG_goal), V_out = Vin)
where V_new is a fresh variable.
if CondGoal then ThenGoal else ElseGoal
CondGoal -> ThenGoal ; ElseGoal
A DCG if-then-else. The two different syntaxes have identical semantics. CondGoal, ThenGoal, and ElseGoal must be valid DCG-goals. Semantics:
transform(V_in, V_out, if CondGoal then ThenGoal else ElseGoal) =
if transform(V_in, V_cond, CondGoal) then
        transform(V_cond, V_out, ThenGoal)
else
        transform(V_in, V_out, ElseGoal)
=(Term)
A DCG unification. Unifies Term with the implicit DCG argument. Semantics:
transform(V_in, V_out, =(Term)) = (Term = V_in, V_out = Vin)
DCG-call
Any term which does not match any of the above forms must be a DCG predicate call. If the term is a variable Var, it is treated as if it were `call(Var)'. Then, the two implicit DCG arguments are appended to the specified arguments. Semantics:
transform(V_in, V_out, p(A1, ..., AN)) =
p(A1, ..., AN, V_in, V_Out)

Implicit quantification

The rule for implicit quantification in Mercury is not the same as the usual one in mathematical logic. In Mercury variables that do not occur in the head of a clause are implicitly existentially quantified around their closest enclosing scope (in a sense to be made precise in the following paragraphs). This allows most existential quantifiers to be omitted, and leads to more concise code.

An occurrence of a variable is in a negated context if it is in a negation, in a universal quantification, in the condition of an if-then-else, or in an inequality.

Two goals are parallel if they are different disjuncts of the same disjunction, or if one is the "else" part of an if-then-else and the other goal is either the "then" part or the condition of the if-then-else.

If a variable occurs in a negated context and does not occur outside of that negated context other than in parallel goals (and in the case of a variable in the condition of an if-then-else, other than in the "then" part of the if-then-else), then that variable is implicitly existentially quantified inside the negation.

Elimination of double negation

The treatment of inequality, universal quantification, implication, and logical equivalence as abbreviations can cause the introduction of double negations which could make otherwise well-formed code mode-incorrect. To avoid this problem, the language specifies that double negations are removed after syntax analysis, before mode analysis is performed.

Types

The type system is based on polymorphic many-sorted logic.

Certain special types are builtin, or are defined in the Mercury library:

Primitive types: char, int, float, string.
There is a special syntax for constants of type int, float, and string. (For char, the standard syntax suffices.)
Predicate types: pred, pred(T), pred(T1, T2), ...
These higher-order predicate types are used to pass predicates around. We're still working on the details of how higher-order predicates will work in Mercury -- so don't use these yet.
The universal type: univ.
The type univ is defined in the standard library module std_util, along with the predicates type_to_univ/2 and univ_to_type/2. With those predicates, any type can be converted to the universal type and back again. The universal type is useful for situations where you need heterogeneous collections.
The "state-of-the-world" type: io__state.
The type io__state is defined in the standard library module io, and represents the state of the world. Predicates which perform I/O are passed the old state of the world and produce a new state of the world. In this way, we can give a declarative semantics to code that performs I/O.

New types can be introduced with `:- type' declarations. There are several categories of derived types:

Constructors may be overloaded among different types: there may be any number of constructors with a given name and arity, so long as they all have different types. However, there must be only one constructor with a given name, arity, and result type. (There is no particularly good reason for this restriction; in the future we may allow several such functors as long as they have different argument types.) Note that excessive overloading of constructors can slow down type checking and can make the program confusing for human readers, so overloading should not be over-used.

The argument types of each predicate must be explicitly declared with a `:- pred' declaration. These declarations may be polymorphic. For example:

:- pred list__member(T, list(T)).

Predicates may not be overloaded: there must only be one predicate with a given name and arity in each program. (This restriction will hopefully be relaxed in the future.)

The compiler infers the types of variables and overloaded constructors. A type assignment is an assignment of a type to every variable and constructor in a clause. A type assignment is valid if it satisfies the following conditions. Each constructor in a clause must have been declared in at least one visible type declaration. The type assigned to each constructor must match one of the type declarations for that constructor, and the types assigned to the arguments of that constructor must match the argument types specified in that type declaration. The type assigned to each predicate argument much match the type specified in the `:- pred' declaration for that predicate. The type assigned to each head argument in a clause must exactly match the type specified in the corresponding `:- pred' declaration. (Here "match" means to be an instance of, i.e. to be identical to for some substitution of the type parameters, and "exactly match" means to be identical up to renaming of type parameters.)

One type assignment A is said to be more general than another type assignment B if there is a binding of the type parameters in A that makes it identical (up to renaming of parameters) to B. If there is more than one valid type assignment, the compiler must choose the most general one. If there are two valid type assignments which are not identical up to renaming and neither of which is more general than the other, then there is a type ambiguity, and compiler must report an error. A clause is type-correct if there is a unique (up to renaming) most general valid type assignment. Every clause in a Mercury program must be type-correct.

Modes

The mode of a predicate is a mapping from the initial state of instantiation of the arguments of the predicate to their final state of instantiation. To describe states of instantiation, we use information provided by the type system. Types can be viewed as regular trees with two kinds of nodes: or-nodes representing types and and-nodes representing constructors. The children of an or-node are the constructors that can be used to construct terms of that type; the children of an and-node are the types of the arguments of the constructors. We attach mode information to the or-nodes of type trees.

An instantiatedness tree is an assignment of an instantiatedness -- either free or bound --- to each or-node of a type tree, with the constraint that all descendants of a free node must be free.

A term is approximated by an instantiatedness tree if for every node in the instantiatedness tree,

When an instantiatedness tree tells us that a variable is bound, there may be several alternative function symbols to which it could be bound. The instantiatedness tree does not tell us which of these it is bound to; instead for each possible function symbol it tells us exactly which arguments of the function symbol will be free and which will be bound. The same principle applies recursively to these bound arguments.

Mercury's mode system allows users to declare names for instantiatedness trees using declarations such as

:- inst listskel ---> bound( [] ; [free | listskel] ).

This instantiatedness tree describes lists whose skeleton is known but whose elements are distinct variables. As such, it approximates the term [A,B] but not the term [H|T] (only part of the skeleton is known), the term [A,2] (not all elements are variables), or the term [A,A] (the elements are not distinct variables).

As a shorthand, the mode system provides `free' and `ground' as names for instantiatedness trees all of whose nodes are free and bound respectively. The shape of these trees is determined by the type of the variable to which they apply.

As execution proceeds, variables may become more instantiated. A mode mapping is a mapping from an initial instantiatedness tree to a final instantiatedness tree, with the constraint that no node of the type tree is transformed from bound to free. Mercury allows the user to specify mode mappings directly by expressions such as inst1 -> inst2, or to give them a name using declarations such as

:- mode m :: inst1 -> inst2.

Two standard shorthand modes are provided, corresponding to the standard notions of inputs and outputs:

:- mode in :: ground -> ground.
:- mode out :: free -> ground.

These two modes are enough for most predicates. Nevertheless, Mercury's mode system is sufficiently expressive to handle more complex data-flow patterns, including those involving partially instantiated data structures. (The current implementation does not handle partially instantiated data structures yet.)

For example, consider an interface to a database that associates data with keys, and provides read and write access to the items it stores. To represent accesses to the database over a network, you would need declarations such as

:- type operation
        --->    lookup(key, data)
        ;       set(key, data).
:- inst request --->
        bound(  lookup(ground, free)
        ;       set(ground, ground)
        ).
:- mode create_request :: free -> request.
:- mode satisfy_request :: request -> ground.

A predicate mode declaration assigns a mode mapping to each argument of a predicate. For example, given the mode names defined by

:- mode out_listskel ::
        free -> listskel.
:- mode in_listskel ::
        listskel -> listskel.

the (type and) mode declarations of the predicates length and append are as follows:

:- pred length(list(T), int).
:- mode length(in_listskel, out).
:- mode length(out_listskel, in).

:- pred append(list(T), list(T), list(T)).
:- mode append(in, in, out).
:- mode append(out, out, in).

If a predicate has only one mode, the `pred' and `mode' declaration can be combined:

:- pred append(list(T)::in, list(T)::in, list(T)::out).

A predicate mode declaration is an assertion by the programmer that for all possible argument terms for the predicate that are approximated (in our technical sense) by the initial instantiatedness trees of the mode declaration and all of whose free variables are distinct, if the predicate succeeds then the resulting binding of those argument terms will in turn be approximated by the final instantiatedness trees of the mode declaration, with all free variables again being distinct. We refer to such assertions as mode declaration constraints. These assertions are checked by the compiler, which rejects programs if it cannot prove that their mode declaration constraints are satisfied.

The mode set for a predicate is the set of mode declarations for the predicate plus all their implied modes. A mode set is an assertion by the programmer that the predicate should only be called with argument terms that are approximated by the initial instantiatedness trees of one of the mode declarations in the set (i.e. the specified modes and the modes they imply are the only allowed modes for this predicate). We refer to the assertion associated with a mode set as the mode set constraint; these are also checked by the compiler.

A predicate p is well-moded with respect to a given mode declaration if given that the predicates called by p all satisfy their mode declaration constraints, there exists an ordering of the literals in the definition of p such that

We say that a predicate is well-moded if it is well-moded with respect to all the mode declarations in its mode set, and we say that a program is well-moded if all its predicates are well-moded.

The mode analysis algorithm checks one mode of one predicate at a time. It abstractly interprets the definition of the predicate, keeping track of the instantiatedness of each variable, and selecting a mode for each call and unification in the definition. To ensure that the mode set constraints of called predicates are satisfied, the compiler may reorder the elements of conjunctions; it reports an error if no satisfactory order exists. Finally it checks that the resulting instantiatedness of the predicate's arguments is the same as the one given by the predicate's mode declaration.

The mode analysis algorithm annotates each call with the mode used.

Determinism

Determinism categories

For each mode of a predicate, we categorise that mode according to how many times it can succeed, and whether or not it can fail before producing its first solution.

To summarize:

                Maximum number of solutions
Can fail?       0               1               > 1
no              erroneous       det             multi
yes             failure         semidet         nondet

The determinism of each mode of a predicate is indicated by an annotation on the mode declaration. For example:

:- pred append(list(T), list(T), list(T)).
:- mode append(in, in, out) is det.
:- mode append(out, out, in) is multi.
:- mode append(in, in, in) is semidet.

If the mode of the predicate is given in the :- pred declaration rather than in a separate :- mode declaration, then the determinism annotation goes on the :- pred declaration. In particular, this is necessary if the predicate does not have any argument variables. For example:

:- pred loop(int::in) is erroneous.
loop(X) :- loop(X).

:- pred p is det.
p.

:- pred q is failure.
q :- fail.

The determinism categories form this lattice:

             erroneous
              /     \
          failure   det
             \     /   \
             semidet    multi
                  \     /
                   nondet

The higher up this lattice a determinism category is, the more the compiler knows about the number of solutions of predicates of that determinism.

Determinism checking and inference.

The determinism of goals is inferred from the determinism of their component parts, according to the rules below. The inferred determinism of a procedure is just the inferred determinism of the procedure's body.

For procedures that are local to a module, the determinism annotations may be omitted; in that case, their determinism will be inferred. (To be precise, the determinism of procedures without a determinism annotation is defined as the least fixpoint of the transformation which, given an initial assignment of the determinism det to all such procedures, applies those rules to infer a new determinism assignment for those procedures.)

It is an error to omit the determinism annotation for procedures that are exported from their containing module.

If a determinism annotation is supplied for a predicate, the declared determinism is compared against the inferred determinism. If the declared determinism is greater than or not comparable to the inferred determinism (in the partial ordering above), it is an error. If the declared determinism is less than the inferred determinism, it is not an error, but the implementation may issue a warning.

The determinism category of each goal is inferred according to the following rules. These rules work with the two components of determinism category: whether the goal can fail without producing a solution. and the maximum number of solutions of the goal (0, 1, or more). If the inference process below reports that a goal can succeed more than once, but the goal generates no outputs that are visible from outside the goal, the final determinism of the goal will be based on the goal succeeding at most once, since the compiler will implicitly prune away any duplicate solutions.

Calls
The determinism category of a call is the determinism declared or inferred for the called mode of the called predicate.
Unifications
The determinism of a unification is either det, semidet, or failure, depending on its mode. A unification that assigns the value of one variable to another is deterministic. A unification that constructs a structure and assigns it to a variable is also deterministic. A unification that tests whether a variable has a given top function symbol is semideterministic, unless the compiler knows the top function symbol of that variable, in which case its determinism is either det or failure depending on whether the two function symbols are the same or not. A unification that tests two variables for equality is semideterministic, unless the compiler knows that the two variables are aliases for one another, in which case the unification is deterministic, or unless the compiler knows that the two variables have different function symbols in the same position, in which case the unification has a determinism of failure. The compiler knows the top function symbol of a variable if the previous part of the predicate definition contains a unification of the variable with a function symbol, or if the variable's type has only one function symbol.
Conjunctions
The determinism of the empty conjunction (the goal `true') is det. The conjunction `(A, B)' can fail if either A or B can fail. The conjunction can succeed at most zero times if either A or B can succeed at most zero times. The conjunction can succeed more than once if either A or B can succeed more than once and both A and B can succeed at least once. (If e.g. A can succeed at most zero times, then even if B can succeed many times the maximum number of solutions of the conjunction is still zero.) Otherwise, i.e. if both A and B succeed at most once, the conjunction can succeed at most once.
Switches
A disjunction is a switch if each disjunct has near its start a unification that tests the same bound variable against a different function symbol. For example, consider the common pattern
(
        L = [], empty(Out)
;
        L = [H|T], nonempty(H, T, Out)
)
If L is input to the disjunction, then the disjunction is a switch on L. A switch can fail if the various arms of the switch do not cover all the function symbols in the type of the switched-on variable, or if the code in some arms of the switch can fail, bearing in mind that in each arm of the switch, the unification that tests the switched-on variable against the function symbol of that arm is considered to be deterministic. A switch can succeed several times if some arms of the switch can succeed several times, possibly because there are multiple disjuncts that test the switched-on variable against the same function symbol. A switch can succeed at most zero times only if all arms of the switch can succeed at most zero times.
Disjunctions
The determinism of the empty disjunction (the goal `fail') is failure. A disjunction `(A, B)' that is not a switch can fail if both A or B can fail. It can succeed at most zero times if both A or B can succeed at most zero times. It can succeed more than once if either A or B can succeed more than once. Otherwise, i.e. if one of A and B can succeed at most once and the other can succeed at most zero times, then the disjunction can succeed at most once.
If-then-else
An if-then-else can fail if either the then part or the else part can fail. It can succeed at most zero times if the else part can succeed at most zero times and if at least one of the condition and the then part can succeed at most zero times. It can succeed more than once if any one of the condition, the else part and the then part can succeed more than once.
Negations
If the determinism of the negated goal is erroneous, then the determinism of the negation is erroneous. If the determinism of the negated goal is failure, the determinism of the negation is det. If the determinism of the negated goal is det or multi, the determinism of the negation is failure. Otherwise, the determinism of the negation is semidet.

Replacing compile-time checking with run-time checking

Note that "perfect" determinism inference is an undecidable problem, because it requires solving the halting problem. (For instance, in the following example

:- pred p(T, T).
:- mode p(in, out) is det.

p(A, B) :-
        (
                something_complicated(A, B)
        ;
                B = A
        ).

`p/0' can have more than one solution only if `something_complicated' can succeed.) Sometimes, the rules specified by the Mercury language for determinism inference will infer a determinism that is not as precise as you would like. However, it is generally easy to overcome such problems. The way to do this is to replace the compiler's static checking with some manual run-time checking. For example, if you know that a particular goal should never fail, but the compiler infers that goal to be semidet, you can check at runtime that the goal does succeed, and if it fails, call the library predicate `error/1'.

:- pred q(T, T).
:- mode q(in, out) is det.

q(A, B) :-
        ( goal_that_should_never_fail(A, B0) ->
                B = B0
        ;
                error("goal_that_should_never_fail failed!")
        ).

The predicate error/1 has determinism erroneous, which means the compiler knows that it will never succeed or fail, so the inferred determinism for the body of q/2 is det. (Checking assumptions like this is good coding style anyway. The small amount of up-front work that Mercury requires is paid back in reduced debugging time.) Mercury's mode analysis knows that computations with determinism erroneous can never succeed, which is why it does not require the "else" part to generate a value for `B'. The introduction of the new variable `B0' is necessary because the condition of an if-then-else is a negated context, and can export the values it generates only to the "then" part of the if-then-else, not directly to the surrounding computation. (If the surrounding computations had direct access to values generated in conditions, they might access them even if the condition failed.)

Interfacing nondeterministic code with the real world

Normally, attempting to call a nondet or multi mode of a predicate from a predicate declared as semidet or det will cause a determinism error. So how can we call nondeterministic code from deterministic code? There are several alternative possibilities.

If you just want to see if a nondeterministic goal is satisfiable or not, without needing to know what variable bindings it produces, then there is no problem - determinism analysis considers nondet and multi goals with no non-local output variables to be semidet and det respectively.

If you want to use the values of output variables, then you need to ask yourself which one of possibly many solutions to a goal do you want? If you want all of them, you need to use the predicate `solutions/2' in the standard library module `std_util', which has the following declaration:

:- pred solutions(pred(T), list(T)).
:- mode solutions(in, out) is det.

The term which you pass to `solutions/2' is a higher-order predicate term. You can pass the name of a one-argument predicate, or you can pass a several-argument predicate with all but one of the arguments supplied (a closure). The declarative semantics of `solutions/2' can be defined as follows:

solutions(Pred, List) is true iff
        all [X] (call(Pred, X) <=> list__member(X, List))
        and List is sorted.

where `call(Pred, X)' is a call to the standard library predicate `call/2' which invokes the higher-order predicate term `Pred' with argument `X', and where `list__member/2' is the standard library predicate for list membership. In other words, `solutions(Pred, List)' finds all the values of `X' for which `call(Pred, X)' is true, collects these solutions in a list, sorts the list, and returns that list as its result. Here's an example: the standard library defines a predicate `list__perm(List0, List)'

:- pred list__perm(list(T), list(T)).
:- mode list__perm(in, out) is nondet.

which succeeds iff List is a permutation of List0. Hence the following call to solutions

solutions(list__perm([3,1,2]), L)

should return all the possible permutations of the list `[3,1,2]' in sorted order:

L = [[1,2,3],[1,3,2],[2,1,3],[2,3,1],[3,1,2],[3,2,1]].

Unfortunately, the current implementation of higher-order predicates is not complete. You cannot pass a polymorphic predicate (one with a type variable in its predicate declaration) like `list__perm' to `solutions/2', and there is no compile-time mode or determinism checking for higher-order predicate terms (if you get the mode or determinism wrong, you will probably just get a core dump).

If you just want one solution and don't care which, the calling predicate should be declared nondet or multi. The nondeterminism should then be propagated up the call tree to the point at which it can be pruned. In Mercury, pruning can be achieved in several ways.

The first way is the one mentioned above: if a goal has no non-local output variables then the implementation will only attempt to satisfy the goal once. Any potential duplicate solutions will be implicitly pruned away.

The second way is to rely on the fact that the implementation will only seek a single solution to `main/2', so alternative solutions to `main/2' (and hence also to nondet or multi predicates called directly or indirectly from `main/2') are implicitly pruned away. This is one way to achieve "don't care" nondeterminism in Mercury.

The other situation in which you may want pruning and committed choice style nondeterminism is when you know that all the solutions returned will be equivalent. For example, you might want to count the elements in a set by removing them one at a time. Removing an unspecified element from a set is a nondeterministic operation, but no matter which order you remove them, the computed size of the set should be the same.

We plan to eventually extend Mercury to allow users to write

unique [X] Goal

as a special quantifier, meaning "there exists a unique X for which `Goal' is true". This would allow the implementation to prune alternative solutions for `Goal' if `X' was the only output variable of `Goal'.

We would also like to allow users to specify a user-defined equivalence relation as the equality relation for user-defined types, so that the `unique' quantifier could be used to express more general forms of equivalence. For example, if you define a set type which represents sets as unsorted lists, you would want to define a user-defined equivalence relation for that type, which could sort the lists before comparing them. The `unique' quantifier could then be used for sets even though the lists used to represent the sets might not be in the same order in every solution.

However, the current implementation does not yet support either the `unique' quantifier or user-defined equivalence relations.

Modules

The Mercury module system is simple and straightforward. Each module must start with a module declaration, specifying the name of the module. An interface declaration specifies the start of the module's interface section: this section contains declarations for the types, function symbols, instantiation states, modes, and predicates exported by this module. Mercury provides support for abstract data types, since the definition of a type may be kept hidden, with only the type name being exported. An implementation declaration specifies the start of the module's implementation section. Any entities declared in this section are local to the module and cannot be used by other modules. The implementation section must of course contain definitions for all abstract data types and predicates exported by the module, as well for all local types and predicates. The module may optionally end with an end_module declaration.

If a module wishes to make use of entities exported by other modules, then it must explicitly import those modules using one or more import_module declarations. These declarations may occur either in the interface or the implementation section. If the imported entities are used in the interface section, then the corresponding import_module declaration must also be in the interface section. If the imported entities are only used in the implementation section, the import_module declaration should be in the implementation section.

One module most export a predicate `main', which must be declared as either

:- pred main(io__state::di, io__state::uo) is det.

or

:- pred main(io__state::di, io__state::uo) is multi.

(or any declaration equivalent to one of the two above).

For example, here is the definition of a simple module for managing queues:

:- module queue.
:- interface.

% Declare an abstract data type.

:- type queue(T).

% Declare some predicates which operate on the abstract data type.

:- pred empty_queue(queue(T)).
:- mode empty_queue(out) is det.
:- mode empty_queue(in) is semidet.

:- pred put(queue(T), T, queue(T)).
:- mode put(in, in, out) is det.

:- pred get(queue(T), T, queue(T)).
:- mode get(in, out, out) is semidet.

:- implementation.

% Queues are implemented as lists. We need the `list' module
% for the declaration of the type list(T), with its constructors
% '[]'/0 % and '.'/2, and for the declaration of the predicate
% list__append/3.

:- import_module list.

% Define the queue ADT.

:- type queue(T) == list(T).

% Declare the exported predicates.

empty_queue([]).

put(Queue0, Elem, Queue) :-
         list__append(Queue0, [Elem], Queue).

get([Elem | Queue], Elem, Queue).

:- end_module queue.

Mercury has a standard library which includes modules for lists, stacks, queues, priority queues, sets, bags (multi-sets), maps (dictionaries), random number generation, input/output and filename and directory handling. See the Mercury Library Reference Manual for details.

Semantics

A legal Mercury program is one that complies with the syntax, type, mode, determinism, and module system rules specified in earlier chapters. If a program does not comply with those rules, the compiler must report an error.

For each legal Mercury program, there is an associated predicate calculus theory whose language is specified by the type declarations in the program and whose axioms are the completion of all predicates in the program plus the usual equality axioms. The declarative semantics of a legal Mercury program is specified by this theory. Mercury implementations must be sound: the answers they compute must be true in every model of the theory.

Execution of a legal Mercury program begins with the predicate `main/2'. Only one solution is sought, even if `main/2' is declared as multi.