Chapter 8  Batch compilation (ocamlc)

This chapter describes the Objective Caml batch compiler ocamlc, which compiles Caml source files to bytecode object files and links these object files to produce standalone bytecode executable files. These executable files are then run by the bytecode interpreter ocamlrun.

8.1  Overview of the compiler

The ocamlc command has a command-line interface similar to that of most C compilers. It accepts several types of arguments, such as source files (.ml and .mli) and compiled files (.cmo and .cma), and processes them sequentially.

The output of the linking phase is a file containing compiled bytecode that can be executed by the Objective Caml bytecode interpreter: the command named ocamlrun. If caml.out is the name of the file produced by the linking phase, the command
        ocamlrun caml.out arg1 arg2 ... argn
executes the compiled code contained in caml.out, passing it as arguments the character strings arg1 to argn. (See chapter 10 for more details.)

On most systems, the file produced by the linking phase can be run directly, as in:
        ./caml.out arg1 arg2 ... argn
The produced file has the executable bit set, and it manages to launch the bytecode interpreter by itself.
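
As an illustration of how arg1 to argn reach the program, here is a minimal sketch. The file name args.ml and the function describe_args are hypothetical names chosen for this example; the arguments themselves are read from Sys.argv, whose element 0 is the program name.

```ocaml
(* args.ml (hypothetical file name): print each command-line argument.

   Possible session, using only the commands described above:
     ocamlc -o caml.out args.ml
     ocamlrun caml.out foo bar
   which prints
     arg1 = foo
     arg2 = bar *)

(* Build one descriptive line per argument, skipping the program name. *)
let describe_args argv =
  let args = Array.to_list (Array.sub argv 1 (Array.length argv - 1)) in
  List.mapi (fun i a -> Printf.sprintf "arg%d = %s" (i + 1) a) args

let () = List.iter print_endline (describe_args Sys.argv)
```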

8.2  Options

The following command-line options are recognized by ocamlc.
-a
Build a library (.cma file) with the object files (.cmo files) given on the command line, instead of linking them into an executable file. The name of the library must be set with the -o option.

If -custom, -cclib or -ccopt options are passed on the command line, these options are stored in the resulting .cma library. Then, linking with this library automatically adds back the -custom, -cclib and -ccopt options as if they had been provided on the command line, unless the -noautolink option is given.

-c
Compile only. Suppress the linking phase of the compilation. Source code files are turned into compiled files, but no executable file is produced. This option is useful to compile modules separately.

-cc ccomp
Use ccomp as the C linker called by ocamlc -custom and as the C compiler for compiling .c source files.

-cclib -llibname
Pass the -llibname option to the C linker when linking in “custom runtime” mode (see the -custom option). This causes the given C library to be linked with the program.

-ccopt option
Pass the given option to the C compiler and linker, when linking in “custom runtime” mode (see the -custom option). For instance, -ccopt -Ldir causes the C linker to search for C libraries in directory dir.

-custom
Link in “custom runtime” mode. In the default linking mode, the linker produces bytecode that is intended to be executed with the shared runtime system, ocamlrun. In the custom runtime mode, the linker produces an output file that contains both the runtime system and the bytecode for the program. The resulting file is larger, but it can be executed directly, even if the ocamlrun command is not installed. Moreover, the “custom runtime” mode enables static linking of Caml code with user-defined C functions, as described in chapter 18.

Unix: Never use the strip command on executables produced by ocamlc -custom. This would remove the bytecode part of the executable.

-dllib -llibname
Arrange for the C shared library dlllibname.so (dlllibname.dll under Windows) to be loaded dynamically by the run-time system ocamlrun at program start-up time.

-dllpath dir
Add the directory dir to the run-time search path for shared C libraries. At link-time, shared libraries are searched in the standard search path (the one corresponding to the -I option). The -dllpath option simply stores dir in the produced executable file, where ocamlrun can find it and exploit it as described in section 10.3.

-dtypes
Dump detailed type information. The information for file x.ml is put into file x.annot. In case of a type error, dump all the information inferred by the type-checker before the error. The x.annot file can be used with the emacs commands given in emacs/caml-types.el to display types interactively.

-g
Add debugging information while compiling and linking. This option is required in order to be able to debug the program with ocamldebug (see chapter 16).

-i
Cause the compiler to print all defined names (with their inferred types or their definitions) when compiling an implementation (.ml file). No compiled files (.cmo and .cmi files) are produced. This can be useful to check the types inferred by the compiler. Also, since the output follows the syntax of interfaces, it can help in writing an explicit interface (.mli file) for a file: just redirect the standard output of the compiler to a .mli file, and edit that file to remove all declarations of unexported names.
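
For instance, assuming a hypothetical file shapes.ml with the contents below (all names are invented for this sketch), the trailing comment shows what ocamlc -i shapes.ml would be expected to print.

```ocaml
(* shapes.ml (hypothetical file name) *)
let pi = 3.14159
let area r = pi *. r *. r
let helper_unexported x = x + 1

(* Expected output of: ocamlc -i shapes.ml
     val pi : float
     val area : float -> float
     val helper_unexported : int -> int
   Redirecting this output to shapes.mli and deleting the last line
   would hide helper_unexported from other compilation units. *)
```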

-I directory
Add the given directory to the list of directories searched for compiled interface files (.cmi), compiled object code files (.cmo), libraries (.cma), and C libraries specified with -cclib -lxxx. By default, the current directory is searched first, then the standard library directory. Directories added with -I are searched after the current directory, in the order in which they were given on the command line, but before the standard library directory.

If the given directory starts with +, it is taken relative to the standard library directory. For instance, -I +labltk adds the subdirectory labltk of the standard library to the search path.

-impl filename
Compile the file filename as an implementation file, even if its extension is not .ml.

-intf filename
Compile the file filename as an interface file, even if its extension is not .mli.

-linkall
Force all modules contained in libraries to be linked in. If this flag is not given, unreferenced modules are not linked in. When building a library (-a flag), setting the -linkall flag forces all subsequent links of programs involving that library to link all the modules contained in the library.

-make-runtime
Build a custom runtime system (in the file specified by option -o) incorporating the C object files and libraries given on the command line. This custom runtime system can be used later to execute bytecode executables produced with the ocamlc -use-runtime runtime-name option. See section 18.1.6 for more information.

-noassert
Turn assertion checking off: assertions are not compiled. This flag has no effect when linking already compiled files.
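
A small sketch of what this changes in practice; safe_div is a hypothetical function written for this example.

```ocaml
(* Default mode: the assertion is compiled, so safe_div 1 0 raises
   Assert_failure. With -noassert the check is not compiled, and the
   same call raises Division_by_zero from the division itself. *)
let safe_div x y =
  assert (y <> 0);
  x / y

let () = Printf.printf "%d\n" (safe_div 6 3)
```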

-noautolink
When linking .cma libraries, ignore -custom, -cclib and -ccopt options potentially contained in the libraries (if these options were given when building the libraries). This can be useful if a library contains incorrect specifications of C libraries or C options; in this case, during linking, set -noautolink and pass the correct C libraries and options on the command line.

-nolabels
Ignore non-optional labels in types. Labels cannot be used in applications, and parameter order becomes strict.

-o exec-file
Specify the name of the output file produced by the linker. The default output name is a.out, in keeping with the Unix tradition. If the -a option is given, specify the name of the library produced. If the -output-obj option is given, specify the name of the output file produced.

-output-obj
Cause the linker to produce a C object file instead of a bytecode executable file. This is useful to wrap Caml code as a C library, callable from any C program. See chapter 18, section 18.7.5. The name of the output object file is camlprog.o by default; it can be set with the -o option.

-pack
Build a bytecode object file (.cmo file) and its associated compiled interface (.cmi) that combines the object files given on the command line, making them appear as sub-modules of the output .cmo file. The name of the output .cmo file must be given with the -o option. For instance,
        ocamlc -pack -o p.cmo a.cmo b.cmo c.cmo
generates compiled files p.cmo and p.cmi describing a compilation unit having three sub-modules A, B and C, corresponding to the contents of the object files a.cmo, b.cmo and c.cmo. These contents can be referenced as P.A, P.B and P.C in the remainder of the program.
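
As a rough analogy only (this is not how -pack is implemented), the packed unit looks to the rest of the program much like a module P with nested sub-modules. The contents of A, B and C below are invented for the sketch.

```ocaml
(* Approximate view of p.cmo from client code: one module P whose
   sub-modules stand for the packed compilation units. *)
module P = struct
  module A = struct let name = "a" end
  module B = struct let name = "b" end
  module C = struct let name = "c" end
end

(* Client code refers to the packed units through P. *)
let names = [P.A.name; P.B.name; P.C.name]
```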

-pp command
Cause the compiler to call the given command as a preprocessor for each source file. The output of command is redirected to an intermediate file, which is compiled. If there are no compilation errors, the intermediate file is deleted afterwards. The name of this file is built from the basename of the source file with the extension .ppi for an interface (.mli) file and .ppo for an implementation (.ml) file.

-principal
Check information path during type-checking, to make sure that all types are derived in a principal way. When using labelled arguments and/or polymorphic methods, this flag is required to ensure future versions of the compiler will be able to infer types correctly, even if internal algorithms change. All programs accepted in -principal mode are also accepted in default mode with equivalent types, but different binary signatures. This mode may slow down type checking, yet it is a good idea to use it once before publishing source code.

-rectypes
Allow arbitrary recursive types during type-checking. By default, only recursive types where the recursion goes through an object type are supported.
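
The distinction can be sketched as follows. The names o and self_apply are hypothetical; the rejected definition is kept in a comment so that the example compiles in default mode.

```ocaml
(* Accepted by default: the recursion goes through an object type.
   The inferred type of o is (< me : 'a > as 'a). *)
let o = object (self) method me = self end

(* Rejected by default, accepted with -rectypes, because the type of x
   would have to be a recursive function type:
     let self_apply x = x x *)

(* Calling me returns the object itself, so the identifiers agree. *)
let same_object = Oo.id o#me = Oo.id o
```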

-thread
Compile or link multithreaded programs, in combination with the system threads library described in chapter 24.

-unsafe
Turn bound checking off on array and string accesses (the v.(i) and s.[i] constructs). Programs compiled with -unsafe are therefore slightly faster, but unsafe: anything can happen if the program accesses an array or string outside of its bounds.
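
A minimal sketch of the check that -unsafe removes; third and third_opt are hypothetical helper names.

```ocaml
(* In default mode, an out-of-bounds access raises Invalid_argument.
   With -unsafe the check is omitted and the behaviour of such an
   access is undefined, so code must not rely on the exception. *)
let third v = v.(2)

(* Return None instead of propagating the bounds-check exception. *)
let third_opt v =
  try Some (third v) with Invalid_argument _ -> None
```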

-use-runtime runtime-name
Generate a bytecode executable file that can be executed on the custom runtime system runtime-name, built earlier with ocamlc -make-runtime runtime-name. See section 18.1.6 for more information.

-v
Print the version number of the compiler and the location of the standard library directory, then exit.

-verbose
Print all external commands before they are executed, in particular invocations of the C compiler and linker in -custom mode. Useful to debug C library problems.

-version
Print the version number of the compiler in short form (e.g. 3.06), then exit.

-vmthread
Compile or link multithreaded programs, in combination with the VM-level threads library described in chapter 24.

-w warning-list
Enable or disable warnings according to the argument warning-list. The argument is a string of one or several characters, with the following meaning for each character:
A/a
enable/disable all warnings.
C/c
enable/disable warnings for suspicious comments.
D/d
enable/disable warnings for deprecated features.
E/e
enable/disable warnings for fragile pattern matchings (matchings that would remain complete if additional constructors are added to a variant type involved).
F/f
enable/disable warnings for partially applied functions (i.e. f x; expr where the application f x has a function type).
L/l
enable/disable warnings for labels omitted in application.
M/m
enable/disable warnings for overridden methods.
P/p
enable/disable warnings for partial matches (missing cases in pattern matchings).
S/s
enable/disable warnings for statements that do not have type unit (e.g. expr1; expr2 when expr1 does not have type unit).
U/u
enable/disable warnings for unused (redundant) match cases.
V/v
enable/disable warnings for hidden instance variables.
Y/y
enable/disable warnings for unused variables bound with the let or as keywords and that don't start with an underscore.
Z/z
enable/disable warnings for all unused variables that don't start with an underscore.
X/x
enable/disable all other warnings.
The default setting is -w Aelyz (all warnings enabled except fragile matchings, omitted labels, unused variables).
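
The following sketch gathers small definitions that would be expected to trigger some of the warnings listed above; the letters in the comments refer to that list, and the type color and the function names are invented for this example. The code still compiles, since warnings are not errors unless -warn-error is used.

```ocaml
type color = Red | Green | Blue

(* P: partial match, the Blue case is missing. *)
let is_red = function Red -> true | Green -> false

(* E: fragile match, the wildcard would silently absorb any
   constructor added to color later. *)
let is_green = function Green -> true | _ -> false

(* U: unused match case, the second Red can never be selected. *)
let code = function Red -> 0 | Green -> 1 | Blue -> 2 | Red -> 3

(* S: the value of 1 + 1 is discarded although it is not of type unit. *)
let zero () = 1 + 1; 0
```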

-warn-error warning-list
Turn the warnings indicated in the argument warning-list into errors. The compiler will stop on an error as soon as one of these warnings is emitted, instead of going on. The warning-list is a string of one or several characters, with the same meaning as for the -w option: an uppercase character turns the corresponding warning into an error, a lowercase character leaves it as a warning. The default setting is -warn-error a (no warning is treated as an error).

-where
Print the location of the standard library, then exit.

8.3  Modules and the file system

This short section is intended to clarify the relationship between the names of the modules corresponding to compilation units and the names of the files that contain their compiled interface and compiled implementation.

The compiler always derives the module name by taking the capitalized base name of the source file (.ml or .mli file). That is, it strips the leading directory name, if any, as well as the .ml or .mli suffix; then, it sets the first letter to uppercase, in order to comply with the requirement that module names must be capitalized. For instance, compiling the file mylib/misc.ml provides an implementation for the module named Misc. Other compilation units may refer to components defined in mylib/misc.ml under the names Misc.name; they can also do open Misc, then use unqualified names name.
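
The naming rule can be sketched with standard library functions. This is only an illustration of the rule as stated above, not the compiler's own code, and module_name_of_file is a hypothetical name.

```ocaml
(* Strip the directory and the suffix, then capitalize the first letter. *)
let module_name_of_file path =
  String.capitalize_ascii
    (Filename.remove_extension (Filename.basename path))

let () = print_endline (module_name_of_file "mylib/misc.ml")
```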

The .cmi and .cmo files produced by the compiler have the same base name as the source file. Hence, the compiled files always have their base name equal (modulo capitalization of the first letter) to the name of the module they describe (for .cmi files) or implement (for .cmo files).

When the compiler encounters a reference to a free module identifier Mod, it looks in the search path for a file named Mod.cmi or mod.cmi and loads the compiled interface contained in that file. As a consequence, renaming .cmi files is not advised: the name of a .cmi file must always correspond to the name of the compilation unit it describes. It is admissible to move them to another directory, if their base name is preserved, and the correct -I options are given to the compiler. The compiler will flag an error if it loads a .cmi file that has been renamed.

Compiled bytecode files (.cmo files), on the other hand, can be freely renamed once created. That's because the linker never attempts to find by itself the .cmo file that implements a module with a given name: it relies instead on the user providing the list of .cmo files by hand.

8.4  Common errors

This section describes and explains the most frequently encountered error messages.
Cannot find file filename
The named file could not be found in the current directory, nor in the directories of the search path. The filename is either a compiled interface file (.cmi file), or a compiled bytecode file (.cmo file). If filename has the format mod.cmi, this means you are trying to compile a file that references identifiers from module mod, but you have not yet compiled an interface for module mod. Fix: compile mod.mli or mod.ml first, to create the compiled interface mod.cmi.

If filename has the format mod.cmo, this means you are trying to link a bytecode object file that does not exist yet. Fix: compile mod.ml first.

If your program spans several directories, this error can also appear because you haven't specified the directories to look into. Fix: add the correct -I options to the command line.

Corrupted compiled interface filename
The compiler produces this error when it tries to read a compiled interface file (.cmi file) that has the wrong structure. This means something went wrong when this .cmi file was written: the disk was full, the compiler was interrupted in the middle of the file creation, and so on. This error can also appear if a .cmi file is modified after its creation by the compiler. Fix: remove the corrupted .cmi file, and rebuild it.

This expression has type t1, but is used with type t2
This is by far the most common type error in programs. Type t1 is the type inferred for the expression (the part of the program that is displayed in the error message), by looking at the expression itself. Type t2 is the type expected by the context of the expression; it is deduced by looking at how the value of this expression is used in the rest of the program. If the two types t1 and t2 are not compatible, then the error above is produced.

In some cases, it is hard to understand why the two types t1 and t2 are incompatible. For instance, the compiler can report that “expression of type foo cannot be used with type foo”, and it really seems that the two types foo are compatible. This is not always true. Two type constructors can have the same name, but actually represent different types. This can happen if a type constructor is redefined. Example:
        type foo = A | B
        let f = function A -> 0 | B -> 1
        type foo = C | D
        f C
This results in the error message “expression C of type foo cannot be used with type foo”.
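
One way to avoid this confusion, sketched here under the assumption that the second type can simply be given a different name (bar and g are hypothetical), is to keep the two type constructors distinct so that any error message names two visibly different types.

```ocaml
type foo = A | B
let f = function A -> 0 | B -> 1

(* A distinct name instead of a second definition of foo. *)
type bar = C | D
let g = function C -> 2 | D -> 3

(* f C is still a type error, but it is now reported as a mismatch
   between bar and foo rather than between foo and foo. *)
```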

The type of this expression, t, contains type variables that cannot be generalized
Type variables ('a, 'b, ...) in a type t can be in either of two states: generalized (which means that the type t is valid for all possible instantiations of the variables) and not generalized (which means that the type t is valid only for one instantiation of the variables). In a let binding let name = expr, the type-checker normally generalizes as many type variables as possible in the type of expr. However, this leads to unsoundness (a well-typed program can crash) in conjunction with polymorphic mutable data structures. To avoid this, generalization is performed at let bindings only if the bound expression expr belongs to the class of “syntactic values”, which includes constants, identifiers, functions, tuples of syntactic values, etc. In all other cases (for instance, expr is a function application), a polymorphic mutable could have been created and generalization is therefore turned off for all variables occurring in contravariant or non-variant branches of the type. For instance, if the type of a non-value is 'a list the variable is generalizable (list is a covariant type constructor), but not in 'a list -> 'a list (the left branch of -> is contravariant) or 'a ref (ref is non-variant).

Non-generalized type variables in a type cause no difficulties inside a given structure or compilation unit (the contents of a .ml file, or an interactive session), but they cannot be allowed inside signatures nor in compiled interfaces (.cmi file), because they could be used inconsistently later. Therefore, the compiler flags an error when a structure or compilation unit defines a value name whose type contains non-generalized type variables. There are two ways to fix this error: give name a monomorphic type (without type variables) by adding a type constraint or a .mli file; or, if name really needs a polymorphic type, turn its defining expression into a function by adding an extra parameter.
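
A sketch of the two usual remedies, a type constraint and an extra parameter; int_id_map and id_map are hypothetical names, and the failing definition is kept in a comment so that the example compiles.

```ocaml
(* Problem: the right-hand side is a function application, not a
   syntactic value, so the type variable in its type
   (a list -> a list, for one unknown a) is not generalized:
     let id_map = List.map (fun x -> x) *)

(* Fix 1: constrain the definition to a monomorphic type. *)
let int_id_map : int list -> int list = List.map (fun x -> x)

(* Fix 2: add a parameter, so that the definition is a function
   (a syntactic value) and its type 'a list -> 'a list is generalized. *)
let id_map l = List.map (fun x -> x) l
```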

Reference to undefined global mod
This error appears when trying to link an incomplete or incorrectly ordered set of files. There are two possible causes. First, you may have forgotten to provide an implementation for the compilation unit named mod on the command line (typically, the file named mod.cmo, or a library containing that file). Fix: add the missing .ml or .cmo file to the command line. Second, you may have provided an implementation for the module named mod, but it comes too late on the command line: the implementation of mod must come before all bytecode object files that reference mod. Fix: change the order of .ml and .cmo files on the command line.

Of course, you will always encounter this error if you have mutually recursive functions across modules. That is, function Mod1.f calls function Mod2.g, and function Mod2.g calls function Mod1.f. In this case, no matter what permutations you perform on the command line, the program will be rejected at link-time. Fixes: put f and g in the same module, or parameterize one function by the other so that only one module refers to the other.
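
The parameterization approach can be sketched as follows. The two nested modules stand in for hypothetical files mod1.ml and mod2.ml, and the function bodies are invented for the example; with separate files, mod1.cmo would be linked before mod2.cmo.

```ocaml
(* Stand-in for mod1.ml: f receives the function it needs as a
   parameter instead of referring to Mod2.g. *)
module Mod1 = struct
  let f g n = if n <= 0 then 0 else 1 + g (n - 1)
end

(* Stand-in for mod2.ml: g passes itself to Mod1.f, so the only
   cross-module reference goes from Mod2 to Mod1. *)
module Mod2 = struct
  let rec g n = if n <= 0 then 0 else 1 + Mod1.f g (n - 1)
end
```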

The external function f is not available
This error appears when trying to link code that calls external functions written in C. As explained in chapter 18, such code must be linked with C libraries that implement the required C function f. If the C libraries in question are not shared libraries (DLLs), the code must be linked in “custom runtime” mode. Fix: add the required C libraries to the command line, and possibly the -custom option.
