Skip to content
agarwal edited this page Jul 23, 2012 · 5 revisions

RFC: Error Handling
This is currently a proposal and comments are requested. Once a decision is made, this page will serve as a reference for how Biocaml's API should be implemented.

Discussion

Error handling can be done either by raising exceptions or encoding the possibility of an error in types. In both cases, we must decide what the type of error information is (whether carried by an exception or in the return type). First we summarize these options, and then propose a design. The running example we use is of_file, a function that takes a file as input and returns a parsed value of type t.

Exceptions

One option is to define of_file : string -> t, which implies that an exception will be raised in case of any error.
Pros: API is simple.
Cons: Exceptions are a side effect not visible in the type of the function.

If we choose to use exceptions, we must decide which exceptions to raise. The options are:

  • Within each module M, define exception Error of string. Thus, M.Error msg is the exception that should be raised by all functions within module M. Possible variations on this are to define multiple module specific exceptions if there are clearly different categories of the kinds of errors expected. Also the value carried by the exception can be something other than a string; see discussion below.
    Pros: Exceptions are local to each module giving you some information about where they came, and allowing exception handling to be tailored to this knowledge.
    Cons: It is annoying to define these exceptions, it is hard to anticipate what exact exception to define, and often the extra information about which module the error came from is not used anyway. Also, adhering to the requirement that this module specific exception be thrown requires catching standard exceptions and raising instead this one. It can get annoying to remember to do this.

  • Raise standard exceptions, either the built-in Failure or define Biocaml specific one(s) and use across all modules.
    Pros: Don't have to spend time thinking about what exception to define every time we create a new module or implement a new function.
    Cons: Hard to do any exception handling that takes account of the specific error.

Option Type

Instead of raising an exception, we can return an option type: val of_file : string -> t option.
Pros: Possibility of not getting an answer is encoded in type.
Cons:

  • It is more difficult to use this function. Must case analyze every call to this function. Once you have many such functions, it becomes unbearable unless you do monadic programming.
  • Does not provide information on the cause of the error.

Result Type

The option type is not very informative when an error occurs. You just get None but don't know why. The type ('a, 'b) result represents a possibly correct answer of type 'a, or an error of type 'b. Often 'b = string, and the error is simply a message explaining what went wrong. Other choices are discussed below, but for now let's assume it is string. Then, we would define of_file : string -> (t, string) result).
Pros: This is almost certainly a strict improvement over t option. The only case in which option might be better is if there can only be one possible error. For example, Map.find : Map.t -> key -> data option finds the value bound to some key in a map. The only error possible is that there is no binding for the key, and it is clear that's what a result of None means.
Cons: Same as option types, must case analyze on result every time. Monadic programming is essentially required if you make even a few such function calls within the same code block.

Type of Error Information

In exceptions exception of Error of 'err and in result types ('a, 'err) result, we must decide what the 'err type is. Some choices are:

  • string - The error is free text and should be set to something meaningful.
  • polymorphic variants - Often there are a handful of different causes of an error. We can use a polymorphic variant constructor for each such possibility. The benefit over normal variants is that these can be created quickly on-the-fly without too much thinking. Also they unify into larger types by type inference, so you end up getting very precise errors without too much work. The disadvantage is the error types get big and start looking scary. Only if you train yourself to ignore them while reading the documentation do you not feel overwhelmed. Very likely to discourage beginners.
  • very specific type - We can also define a very specific type for any given error.

Exception-ful Function Names

If exception-ful functions are provided, there are two commonly followed conventions, each useful in different scenarios.

  • For a given module Foo, provide all of its exception-ful functions in a sub-module named Exnful (or Exn), with the same name as the type safe versions. Thus, we would have Foo.of_file and Foo.Exnful.of_file.
    Pros: One can simply open Foo.Exnful, and switch completely to an exception throwing style.
    Cons: When you see of_file in your code, you don't easily know what its type is. You have to know if you are in a context where Foo.Exnful was opened or not.

  • Use different names for exception-ful functions. Thus, we would have Foo.of_file and Foo.of_file_exn.
    Pros: The name of_file_exn clearly indicates that the function may throw an exception. You can selectively use such a function mixed in a context where you are mostly using type safe functions, but want to use an exception-ful version of just one function. You could do this in the above solution too, but the function name would be Foo.Exnful.of_file, which is longer.
    Cons: For dirty scripting, you may be happy to work completely in the exception-ful style but that becomes more verbose. Lots of your functions will have the _exn suffix on them. This requires more typing and looks messier. Adding the suffix is a manual way of doing what the sub-module solution is doing in a more structured way. If we choose to call the sub-module just Exn, then the choice is between of_file_exn and Exn.of_file. In other words, the choice becomes whether to suffix with "_exn" or prefix with "Exn." every function name, which is the exact same number of characters.

Of course it would be possible to support both of the above, but that is more work. It is likely that the two solutions will not remain in sync. Inevitably we will forget to include an _exn function in the Exnful module and vice versa.

Proposal

Given the above discussion, we propose:

  • Use result types by default.
  • Within each module, provide a sub-module Exnful in which all functions returning result instead raise an exception.

Thus, one can do

open Foo
open Foo.Exnful

to get an exception-ful version of Foo. We can also provide this for the top Biocaml module. Thus, doing

open Biocaml
open Biocaml.Exnful

would provide the full library in exception-ful form. This would be the preferred style for script writing where one does not care about error handling. Furthermore, the exception-ful form is now considered to be used only in the case where error handling will not be done. Thus, the exception raised should simply be the built-in Failure. Creating the Exnful sub-modules should thus be easy as it will require minimal thought. You just have to construct a string that loosely communicates the same information in the error value of the result type.

Now that the result type is the default, client code (within Biocaml and externally) will essentially have to be monadic. We consider this reasonable as monadic programming is now mainstream and standard amongst functional programmers. For those still not comfortable with it, there still remain two options: one can case analyze, or use the exception-ful API. Here is an example of how to apply a monadic style with the result type. The following code using exceptions:

val f1 : t -> t'    (* may raise an exception *)
val f2 : t' -> t''  (* won't raise an exception *)

let r =
  try
    f2 (f1 (of_file path))
  with
    | ... -> (...)
    | ... -> (...)

should be replaced by:

val f1 : t -> (t',string) result
val f2 : t' -> t''

let r = Result.(of_file path >>= f1 >> f2)

TO DO: draft proposal for type of error information to use in result types.