Friday, February 18, 2011

Groking HOAS in F#

It is 3 AM in Budapest and, quite typically, I cannot sleep since my mind is actively engaged in remembering all the functional programming tricks I have read about. So, tonight it is HOAS, or Higher Order Abstract Syntax. If you have ever implemented a De Bruijn-indexed AST for Scheme and spent a few days catching indexing bugs, you surely would appreciate the beauty of this:

Wednesday, February 16, 2011

Where F#/.NET Falls Short: Higher Kinds

Every programmer thinks their favorite language has a problem. I am no exception. I used to think that F# misses an ML module system or Haskell-style typeclasses, but I came to realize that the complaint is really about a feature called higher-kinded polymorphism. Without this feature, there is a lot of very beautiful, general, and type-safe code in OCaml and Haskell that cannot be expressed in type-safe F#. It can be expressed, but you are facing a tradeoff:

  • Sacrifice generality – this is what workflow builders do. In effect, F# has no true monads, because you cannot easily write code generalized over an arbitrary monad.
  • Sacrifice type safety by using downcasts and rely on convention.

Option #1 is by far the most popular, but very detrimental to the F# ecosystem, as it encourages duplication of specific code that could have otherwise been expressed by a single, general definition.

Option #2 is controversial. I have attempted a few encodings I personally have not yet found a practical one, and in practice usually fall back to Option #1.

To illustrate, suppose we have this Haskell code:

Functor type class is parameterized over f, which is itself generic. This is exactly the piece that .NET and F# are lacking. If one really insists on writing code that is generic in every possible Functor, not just a concrete Functor instance, one ends up having to rely on downcasts and convention:

As you can see, this approach (1) has a run-time cost and (2) gets annoying very quickly as the complexity of the code grows.

It is sad.

Thursday, February 10, 2011

The Sad State of fshtmldoc – Can We Do Better?

If you ever tried to generate HTML documentation to document the API of F#-produced assemblies, you have likely tried fshtmldoc, a tool that is a part of F# PowerPack. It does a few things well and another couple of things not so well, there is a single overarching problem this tool. It uses System.Reflection.

Why is System.Reflection evil? In fact it is not, it is simply an API designed for a purpose vastly different from ours (HTML API report generation). The side effects of using these APIs are: the runtime needs to load the documented assemblies, and their references, which is slow and error-prone. fshtmldoc often crashes as it cannot find a reference. Also, the runtime cannot unload these assemblies. So, using this from an MS Build task, for example, is not an option, as it can make your Visual Studio session leak memory.

Is there an alternative? A non-alternative is to use C#-centric tools such as monodoc or SandCastle. These tools do not know anything about F# constructs, and are likely to produce output that will easily confuse you for code that is highly functional – full of function types, unions, records.

Enter Mono.Cecil. There is the alternative reflection and bytecode-manipulation library. It addresses all of the above problems. Could we base a documentation tool on Cecil? Yes. All that needs to be done, is to detect F# constructs from metadata.

The good news is I have almost completed this – remaining tasks are fairly minor. I am testing this out on a bunch of our IntelliFactory assemblies. More coming soon..

Wednesday, February 9, 2011

Home-made Regular Expressions in F#: Thompson NFA

Russ Cox re-discovered and popularized an old (originating in a 1968 paper by Thompson) technique for implementing regular expressions: http://swtch.com/~rsc/regexp/regexp1.html. The article offers a wonderfully simple to understand presentation, and I just could not resist trying it in F#.

First, the regular expressions themselves: the source language can be nicely described with a union type, encompassing the empty string, choice, concatenation, the Kleene star, and a token parser. The  algorithm is general enough to accept the Filter case, which generalizes over character classes and the like.

The first step is to compile this definition to a non-deterministic finite automaton (NFA). As a first cut, the state of an NFA could look like this:

When writing the compiler function, however, I realized that I would want two more features for the NFA state. First, compiling the Kleene star makes it hard to tie the knot, making one wish the language was Haskell. As a workaround, I simulate mutable union fields with a helper record. Second, it is nice to be able to compare state values to use binary search trees later in the game. To do this, I label every state with an integer and use it for comparison:

The compiler looks like this:

The key insight is that the NFA corresponds to a deterministic (DFA) machine. Whenever our machine encounters choice (Branch), it follows both options simultaneously. So then, the state of this machine (DFA state) is a set of NFA states. You can easily write a transition function that takes an input token and produces a transformation on the DFA state. Moreover, this function can be cached by token and input state. Once caching is introduced, we obtain a simple yet fascinating, lazily-constructed DFA machine.

The definition of the DFA state is not surprising:

In addition to token-based transitions, I also cache IsMatch information which can be obtained by traversing the NFA states and testing if any of them contains a Match case.

Here is the sketch of the transition function:

What remains is to write an interpreter for the DFA. I will post it in the next article when it is more tested. Perhaps I will also simplify the above definitions a bit. It is typical for me to find that the first code that I write for any problem is much more complicated than necessary.

Monday, February 7, 2011

WebSharper B5 with Extensions

We have finally released the next beta (5) of WebSharper that can run the samples I have been demonstrating on the blog. In addition, the download page now lets you play with an exciting set of extensions: Google Maps, Visualization, Info Vis Toolkit, Protovis, Raphael, jQuery Mobile and jQuery UI. Check it out: http://www.websharper.com/