Chapter 1. Packages: Reusable software components

Reusability is one of the central concerns of software engineering today. It simply means having source code that can be shared by several programs. Although the idea is simple, its practical implementation is complex, because sharing source code affects every step and phase of software production. This document only addresses the following problems:

Objective Caml has a variety of language features that support reusability. Most importantly, polymorphic functions can be written that generalize over the types of their arguments and of their result. There are so many examples in the core library that I assume the reader is familiar with this feature. Second, there are modules and functors, which generalize not only the types of values but whole structures, i.e. types with associated operations. Third, the class construct makes the object-oriented abstraction techniques available, such as inheritance and dynamic method lookup.
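The three features can be illustrated with a minimal sketch. All names in it (last, ORDERED, MakeMax, IntOrd, shape, circle) are hypothetical and chosen only for this illustration; they are not part of any library.

```ocaml
(* 1. A polymorphic function: works for any element type 'a. *)
let rec last : 'a list -> 'a option = function
  | [] -> None
  | [x] -> Some x
  | _ :: rest -> last rest

(* 2. A functor: generalizes over a structure, i.e. a type together
      with its associated operations. *)
module type ORDERED = sig
  type t
  val compare : t -> t -> int
end

module MakeMax (O : ORDERED) = struct
  (* Largest element of a list according to O.compare, if any. *)
  let max_of (xs : O.t list) : O.t option =
    List.fold_left
      (fun acc x ->
        match acc with
        | None -> Some x
        | Some m -> if O.compare x m > 0 then Some x else acc)
      None xs
end

(* Hypothetical argument structure for integers. *)
module IntOrd = struct
  type t = int
  let compare (a : int) (b : int) = compare a b
end

module IntMax = MakeMax (IntOrd)
module StringMax = MakeMax (String)

(* 3. Classes: inheritance and dynamic method lookup. *)
class shape = object (self)
  method name = "shape"
  (* self#name is resolved at run time, so subclasses can change it. *)
  method describe = "I am a " ^ self#name
end

class circle = object
  inherit shape
  method! name = "circle"
end
```

Here IntMax.max_of and StringMax.max_of share one implementation, and calling describe on a circle object uses the overridden name method although describe itself is defined only in shape.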

There are no general rules for making components reusable. Often, the first step is to identify the data structures that depend on the particular kind of usage; then one of OCaml's techniques for generalizing these structures is applied, such as introducing a functor. Because it is rarely worthwhile to make a module "fully general" (whatever this means), one has to study the ways in which the component could plausibly be used, and decide which data structures must become more general so that these cases are covered.
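As a sketch of this process, assume a hypothetical component that counts how often each key occurs in a list and that was originally written for strings only. The key type is the data structure that depends on the usage, so it is turned into a functor parameter. The names KEY, MakeCounter, WordCounter, IntKey and IntCounter are invented for this example; Map.Make is the standard library functor.

```ocaml
(* The part that depends on the kind of usage: the key type
   and its ordering. *)
module type KEY = sig
  type t
  val compare : t -> t -> int
end

(* The reusable component: counts how often each key occurs. *)
module MakeCounter (K : KEY) = struct
  module M = Map.Make (K)

  (* Returns (key, occurrences) pairs in increasing key order. *)
  let count (keys : K.t list) : (K.t * int) list =
    let add m k =
      let n = try M.find k m with Not_found -> 0 in
      M.add k (n + 1) m
    in
    M.bindings (List.fold_left add M.empty keys)
end

(* The original application: counting words. *)
module WordCounter = MakeCounter (String)

(* A new application that a string-only version could not serve. *)
module IntKey = struct
  type t = int
  let compare (a : int) (b : int) = compare a b
end

module IntCounter = MakeCounter (IntKey)
```

Only the key type was generalized; the result is still a plain association list, because none of the cases considered here needs anything else.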

Sometimes a new application arises for an existing component, and it is decided to generalize the component so that it covers the new case. Of course, the more general component must remain compatible with older applications. This is a very common situation, and it simply means that it must be possible to modify (improve, extend) components and to re-integrate the changed component into programs.

The consequence is that components must be kept separate from the programs using them, and that the various components must be kept separate from each other. It must be possible to "unplug" a component from the system and replace it with a compatible substitute. In terms of system administration, such a replaceable component is called a package, and because this document discusses the administrative side of the problem, the term "package" has been adopted.

Strictly speaking, a package is only the part of the component that is needed to integrate it into a program. Objective Caml compiles both interfaces and implementations of modules, and this leads to our first definition of a package: A package is a collection of compiled module interfaces and module implementations. Managing the packages of a system is then a simple task: keep a central package directory "site-lib" somewhere, with one subdirectory for every package of the system. If you want to replace a package with a newer version, just delete the directory representing the package, and create a new one with the same name and the new binaries.

There are still some open questions. Replacing a package works only if the interface does not change; OCaml keeps checksums of the interfaces it uses and can almost always detect modified interfaces. A package manager, i.e. a tool that supports using and administering packages, could help by finding all programs and packages that use the replaced package, so that it is possible to check whether they still work or need to be recompiled. To do this, the package manager needs information about the dependencies between packages and programs.

Furthermore, it has to be specified which link operations are necessary to use a package. Often, only a single archive file needs to be linked in, but sometimes additional archives or system libraries must be linked, too.

The findlib library is my suggestion for a package manager suitable for Objective Caml. It is a library (stored as a package itself) which can answer the following questions:

Furthermore, there is a frontend for this library called ocamlfind. It is a command-line interface for the library, but it also has some additional abilities:

As you'll see in the following chapters, using this library is really simple. If you only want to link in packages written by other people, you only need to change the command that invokes the compiler: instead of calling "ocamlc program.ml", invoke "ocamlfind ocamlc -package name_of_package_to_use -linkpkg program.ml", and you can then refer to the named package within program.ml. If you want to turn your own collection of modules into a package, you only need to write one administrative file (META) containing all the extra information, such as the other packages it requires.