Saturday, April 28, 2012

A Prettier Printer in ML

To try out something different, I took a shot at writing a pretty-printing module. Of all the papers I found on the subject, perhaps the most accessible is A prettier printer by Wadler. In particular the algebra of documents presented in the paper is very simple and therefore compelling.

I tried to implement these combinators in a OCaml. Unfortunately for OCaml, Wadler's code heavily relies on laziness for searching the exponentially large tree of possible printouts and selecting the optimal one by limited look-ahead and backtracking. The exact same behavior could be obtained in OCaml by mechanically injecting explicit "lazy" everywhere, but it will not be very pretty. I decided to play with a different approach and pre-compute enough information to make every decision on the spot.

The result is ocaml-pretty - see the link for the code and API docs. I tried to maintain the same behavior as in the paper, but have only checked consistency on some small examples. One obvious operational difference is memory use: since the document data structure is strict, pretty-printing cannot help but use O(N) memory in the size of the document. The published Haskell algorithm certainly does better. I did not find an obvious simple fix for this within the approach I adopted, and ended up deciding it does not matter for my purposes.

OCaml standard library has a Format module - this is what you should use for real-world OCaml pretty-printing. Its imperative interface avoids both the O(N) memory overhead and the lazy data structure allocation overhead, and it's always there for OCaml.

UPDATE: an adaptation of Wadler algorithm to strict languages, in particular OCaml, has already been published in the paper Strictly Pretty by Christian Lindig and Gartner Datensysteme Gbr, see, as was kindly pointed out to me by its author. The critical insight is managing Group nodes explicitly to avoid the exponential search space.

No comments:

Post a Comment