Skip to content

An OCaml Syntax Extension for all Monadic Syntaxes.

License

Notifications You must be signed in to change notification settings

Niols/ppx_monad

Repository files navigation

ppx_monad

An OCaml Syntax Extension for all Monadic Syntaxes. https://github.com/Niols/ppx_monad/actions/workflows/ci.yml/badge.svg?branch=main

Overview

ppx_monad aims at providing its users with a set of standard monadic syntaxes as well as an easy mean to define their own. Concretely, it consists in:

  • a library making it easy (-ier) to write syntax extensions for monads and
  • a standard library of pre-defined such extensions.

One of the simplest examples is the option monad. Using ppx_monad, one can write:

let add_string_ints (s1 : string) (s2 : string) : string option =
  let%option n1 = int_of_string_opt s1 in
  let%option n2 = int_of_string_opt s2 in
  Some (string_of_int (n1 + n2))

which is then converted to:

let add_string_ints (s1 : string) (s2 : string) : string option =
  Option.bind (int_of_string_opt s1) @@ fun n1 ->
  Option.bind (int_of_string_opt s2) @@ fun n2 ->
  Some (string_of_int (n1 + n2))

equivalent to the following bind-free code:

let add_string_ints (s1 : string) (s2 : string) : string option =
  match int_of_string_opt s1 with
  | None -> None
  | Some n1 ->
    match int_of_string_opt s2 with
    | None -> None
    | Some n2 -> Some (string_of_int (n1 + n2))

Installation

The easiest way to install ppx_monad is via OPAM [WIP]:

opam install ppx_monad

Using the Standard Library

Using the standard library is simply a matter of adding a preprocessing step to your target. With Dune, for instance, this is done by putting:

(executable
 (name example)
 (preprocess (pps ppx_monad.std)))

in the appropriate Dune file. With ocamlbuild, this goes by adding the following to the _tags file:

<src/*>: package(ppx_monad.std)

Standard Extensions

The standard library contains the following syntax extensions:

Name/LabelOther LabelsErrorSince
optionoptyes
result.okres.ok, res, result, okyes4.08
result.errorerr, errorno4.08
listlstno
seqno4.07
either.righteither, rightyes4.12
either.leftleftno4.12

The second column indicate alternative labels that one can use to trigger the PPX. for instance, let%result.ok, let%res and let%ok all trigger the same extension. The third column indicate whether the extension covers an error part of the monad. This error part gives access to error-related structures such as assert and try. For instance, one can right try%opt f () with () -> Some 7 which will return Some x if f returns Some x and Some 7 if f returns None. On the other hand, try%list does not exist.

In the case of Result, the same module gives birth to two monads and therefore two extensions: result.ok and result.error. The former is what you expect: let%result.ok (let%ok or let%res for short) binds on value of the form Ok x, propagating values of the form Error y. It features an error part and try%res f () with Foo -> Ok 7 | Bar -> Error "no bar" returns Ok x if f returns Ok x, Ok 7 if f returns Error Foo and Error "no bar" if f returns Error Bar. The latter is the counterpart which binds on values of the form Error y, propagating values of the form Ok x. It does not feature an error part. The same considerations hold for Either which the convention that values of the form Right x are the “good” ones, even if OCaml’s standard library specifies that one should only use Either “without assigning a specific meaning to what each case should be.”

The monad Extension

The standard library contains one specific extension, ppx_monad.monad, which does not rely on any particular monad but uses the “current” one. For instance:

let%monad x = ...some computation... in
...some other computation...

is rewritten into:

bind ...some computation... (fun x ->
    ...some other computation...)

and it is up to the user to make sure that the bind function is in scope. This can be useful when manipulating modules that define a monad with the usual keywords. For instance:

let open Result in
let%m x = ...some computation... in
...some other computation...

will rely on bind from the Result monad. (Note that we used m as a short label for monad.) This extension will expect the functions return, bind, fail and catch to be defined in order to provide all the possible structures.

The do Extension

Additionally, the standard library contains a specific extension, ppx_monad.do that implements a Haskell-like do-notation. For instance, the following code:

let add_string_ints (s1 : string) (s2 : string) : string option =
  begin%do [@monad Option]
    n1 <- int_of_string_opt s1;
    n2 <- int_of_string_opt s2;
    Some (string_of_int (n1 + n2))
  end

is rewritten to:

let add_string_ints (s1 : string) (s2 : string) : string option =
  Option.bind int_of_string_opt s1 @@ fun n1 ->
  Option.bind int_of_string_opt s2 @@ fun n2 ->
  Some (string_of_int (n1 + n2))

This notation supports the use of a sequence when no value should be bound. It does not, however, support patterns on the left-hand side of <- but one can recover this functionality with <-- which supports basic patterns with the peculiarity that the wildcard _ has to be written __, without which the compiler will complain with Syntax error: wildcard "_" not expected. If one needed to use an actual operator <- or <-- or a usual OCaml sequence, they should wrap those inside a let () = ... in.

Finally, this notation supports two optional attributes, @bind and @monad. @bind allows to provide a custom bind function; @monad allows to specify a module in which to find the bind function. Without attributes, the PPX simply uses bind.

Note that this extension has little to with the others because it only requires a bind function. It is however compatible with them and it is possible to use eg. match%res inside a do notation.

Defining Custom Monadic Syntaxes

Defining a new monadic syntax consists in writing a rewriting library calling the following endpoint at top-level:

val register :
  ?mk_return:(loc:Location.t -> expression -> expression) ->
  ?mk_bind:  (loc:Location.t -> expression -> expression -> expression) ->
  ?mk_fail:  (loc:Location.t -> expression -> expression) ->
  ?mk_catch: (loc:Location.t -> expression -> expression -> expression) ->
  ?applies_on:string ->
  string ->
  unit

The functions mk_return, mk_bind, mk_fail and mk_catch are here to build the corresponding monadic expressions. mk_bind, for instance, takes two OCaml expressions representing the two usual arguments to a bind function and builds an expression representing the application of such a bind function to the arguments. For instance, one could define it as:

let mk_bind ~loc e f = [%expr bind [%e e] [%e f]]

The syntax extensions [%expr ...] and [%e ...] are helpers provided by ppxlib to manipulate OCaml expression. [%expr ...] allows to write OCaml code and to get an OCaml syntax tree out of it. This OCaml code can contain holes which [%e ...] allows to fill with a syntax tree. For more information about this, check out the documentation of ppxlib’s Metaquot.

The functions mk_return, mk_bind, mk_fail and mk_catch are all optional but providing more of them allows ppx_monad to provide more syntactic structures. For instance, mk_bind will suffice for ppx_monad to provide a simple let ... in, but mk_return is necessary for let ... and ... in. As another example, most structures can be defined using only mk_return and mk_bind, but mk_fail and mk_catch are necessary to define assert, try or match with exception patterns.

The last two arguments of register are the name of the PPX. By default, this is also the labels on which the PPX applies: a PPX of name result will apply on let%result, match%result, etc. Additionally, one can provide applies_on, another string describing on which labels the PPX applies. This string can contain simple regular expressions using grouping with (...), optional parts with ? and choices with |. For instance, ok|res(ult)?(.ok)? matches exactly all of the following: ok, res, result, res.ok, result.ok.

Redefining ppx_monad_result

Let us now re-implement the PPX for Result, available through ppx_monad.result or ppx_monad.std. We can do this with the following file, of only 21 lines:

open Ppxlib

let mk_return ~loc x =
  [%expr Result.ok [%e x]]

let mk_bind ~loc e f =
  [%expr Result.bind [%e e] [%e f]]

let mk_fail ~loc y =
  [%expr Result.error [%e y]]

let mk_catch ~loc e f =
  [%expr (fun e f -> match e with
           | Ok x -> Ok x
           | Error y -> f y) [%e e] [%e f]]

let () =
  Ppx_monad.register "result"
    ~applies_on:"ok|res(ult)?(.ok)?"
    ~mk_return ~mk_bind
    ~mk_fail ~mk_catch

It is then only a matter of building this file as a library that depends on ppx_monad and gets pre-processed by ppxlib’s Metaquot. For instance, with Dune, assuming that our library is called ppx_result:

(library
 (name ppx_result)
 (public_name ppx_result)
 (libraries ppx_monad)
 (preprocess (pps ppxlib.metaquot))
 (kind ppx_rewriter))

You can have a look at how ppx_monad.result is defined in this repository.

Sanitising Variable Names

For most of the functions in the example above (mk_return, mk_bind and mk_fail), there exists an implementation in the Result module which we can use directly. This is however not the case for mk_catch and we had to implement it by hand. The way we wrote it might feel weird and we might be tempted to write it as either of the following:

let incorrect_mk_catch ~loc e f =
  [%expr let catch e f = match e with
           | Ok x -> Ok x
           | Error y -> f y
         in catch [%e e] [%e f]

let incorrect_mk_catch ~loc e f =
  [%expr match [%e e] with
         | Ok x -> Ok x
         | Error y -> [%e f] y]

These implementations, however, are incorrect, because they bind variables in the scope of e and f. For instance, in the first implementation, if e or f were to contain the free variable catch, it would . The same issue is present in the second implementation if f were to contain the free variable y. If one wants to go down this road, the proper way is to ensure that the variable names are unique. Luckily, ppx_monad includes a mechanism for this. A proper way to write the above functions would be the following:

let mk_catch ~loc e f =
  let (pcatch, catch) = Ppx_monad.fresh_variable () in
  [%expr let [%p pcatch] e f = match e with
           | Ok x -> Ok x
           | Error y -> f y
         in [%e catch] [%e e] [%e f]]

let mk_catch ~loc e f =
  let (py, y) = Ppx_monad.fresh_variable () in
  [%expr match [%e e] with
         | Ok x -> Ok x
         | Error [%p py] -> [%e f] [%e y]]

[%p ...] is similar to [%e ...] for wholes in pattern positions. Ppx_monad.fresh_variable returns a pair of a pattern and an expression, the former binding a unique variable name which the latter mentions.

Related Works

This section attempts to list all works that provide similar features as ppx_monad. We consider not mentioning such a project here a bug and welcome any issue or pull request aiming at fixing this.

  • OCaml’s binding operators:
    • pure OCaml
  • zepalmer/ocaml-monadic, a “lightweight PPX extension for OCaml to support natural monadic syntax.”
    • last updated in 2021.
  • marigold-dev/ppx_let_binding, an “OCaml syntax extension for monads in the style of ReasonML.”
    • very new (to be followed),
    • not documented yet, and
    • not published on OPAM yet.
  • kandu/ppx_ok_monad, “a ppx syntax extension for monad syntax sugar.”
    • last updated 2 years ago.
  • foretspaisibles/ppx_monad, “a monad syntax extension for OCaml, that provides two major monad syntaxes: clean but incomplete Haskell-style monad syntax and verbose but complete let monad syntax.”
    • last updated in 2017.
  • rizo/ppx_monad, a “minimalistic monad syntax for OCaml.”
    • last updated in 2017.
  • danmey/omonad, a “monad syntax using ppx extensions.”
    • last updated in 2013,
    • no documentation, and
    • no OPAM package.
  • pippijn/pa_monad_custom:
    • based on Camlp4,
    • last updated in 2013, and
    • no OPAM package.