Wednesday, October 10, 2012

Combinators over Records and Unions

In the previous post, I discussed designing combinator libraries that compose some property over unions. It is only fitting to throw records in the mix.

type U =
    | A of int
    | B of float
    | C of string

let UFormat =
    (
        UnionCase A IntFormat <<
        UnionCase B FloatFormat <<
        UnionCase C StringFormat
    )
    |> Union (fun a b c x ->
        match x with
        | A x -> a x
        | B x -> b x
        | C x -> c x)

type R =
    {
        A : int
        B : float
        C : string
    }

let RFormat : Format<R> =
    (
        RecordField (fun r -> r.A) IntFormat <<
        RecordField (fun r -> r.B) FloatFormat <<
        RecordField (fun r -> r.C) StringFormat
    )
    |> Record (fun a b c -> { A = a; B = b; C = c })

With some simplifications, here is the code:

open System.IO
type Format<'T> =
{
Read : BinaryReader -> 'T
Write : BinaryWriter -> 'T -> unit
}
(* Unions *)
type UnionPart<'X,'U> =
{
R : list<BinaryReader -> 'U>
W : 'X -> 'U -> BinaryWriter -> unit
}
let UnionCase (ctor: 'C -> 'U) (fmt: Format<'C>)
(part: UnionPart<'X,'U>) : UnionPart<('C -> unit) -> 'X,'U> =
let tag = char part.R.Length
let write (w: BinaryWriter) x =
w.Write(tag)
fmt.Write w x
{
R = (fmt.Read >> ctor) :: part.R
W = fun f u w -> part.W (f (write w)) u w
}
let private unionEnd<'U> : UnionPart<('U -> unit),'U> =
{
R = []
W = fun f u w -> f u
}
let Union proof f =
let part = f unionEnd
let rs = Array.rev (List.toArray part.R)
{
Read = fun r ->
let tag = int (r.ReadChar())
rs.[tag] r
Write = fun w x -> part.W proof x w
}
(* Records *)
type RecordPart<'X,'R> =
{
R : 'X -> BinaryReader -> 'R
W : list<BinaryWriter -> 'R -> unit>
}
let RecordField<'F,'U,'X> (proj: 'U -> 'F) (fmt: Format<'F>) (part: RecordPart<'X,'U>) : RecordPart<'F -> 'X, 'U> =
{
R = fun x r -> part.R (x (fmt.Read(r))) r
W = (fun w x -> fmt.Write w (proj x)) :: part.W
}
let private recordEnd<'T> : RecordPart<'T,'T> =
{
R = fun r _ -> r
W = []
}
let Record proof f =
let x = f recordEnd
let ws = List.toArray x.W
{
Read = x.R proof
Write = fun w x -> for f in ws do f w x
}
(* Scalars *)
let IntFormat : Format<int> =
{
Read = fun r -> r.ReadInt32()
Write = fun w x -> w.Write(x)
}
let FloatFormat : Format<float> =
{
Read = fun r -> r.ReadDouble()
Write = fun w x -> w.Write(x)
}
let StringFormat : Format<string> =
{
Read = fun r -> r.ReadString()
Write = fun w x -> w.Write(x)
}

Monday, October 8, 2012

Combinators over Discrimated Unions in ML

Discriminated unions or sum types are a natural way to model logical OR. Often you have a property that distributes over OR. Say, in F# (used throughout the article, though the ideas should apply equally well to any ML), you can write a combinator of the type:

P<'T1> → P<'T2> → P<Choice<'T1,'T2>>

How to go from here to a nice set of combinators that would handle an arbitrary union? This question has been on my mind for a while, and finally I have an acceptable solution.

As a disclaimer, at the level of theory the question is completely trivial. The interest is in how to find a user-friendly ML interface. Even this, I suspect, has been solved before, and at a much more general level. I am aware for instance of Vesa Karvonen's Generics for the working ML'er, and a more recent Oleg Kiselyov's presentation of generics in OCaml. I am looking here at a much simpler setting, hopefully taking it to a more accessible, for-dummies-like-myself level.

Suppose you are designing binary format combinators to let users of your library construct values of the form:

type Format<'T> =
    {
        Read : BinaryReader → 'T
        Write : BinaryWriter → 'T → unit
    }

Given an arbitrary union type and some primitive Format values, the user should be able to compose them. This involves projecting from sub-types to the parent union type, and matching backwards. After some experimentation, the design I have looks like this:

type U =
    | A of int
    | B of float
    | C of string

let UFormat =
    Union (fun a b c x →
        match x with
        | A x → a x
        | B x → b x
        | C x → c x)
    << Case A IntFormat
    << Case B FloatFormat
    << Case C StringFormat
    <| End
That's pretty much it. The magic is in the type of Case combinator that accumulates types so that the Union combinator can present N-way pattern matching as a single reasonably convenient to write function. Full code:
open System.IO
type Format<'T> =
{
Read : BinaryReader -> 'T
Write : BinaryWriter -> 'T -> unit
}
type FormatUnionPart<'X,'U> =
{
R : list<BinaryReader -> 'U>
W : 'X -> 'U -> BinaryWriter -> unit
}
let Case (ctor: 'C -> 'U) (fmt: Format<'C>)
(part: FormatUnionPart<'X,'U>) :
FormatUnionPart<('C -> unit) -> 'X,'U> =
let tag = char part.R.Length
let write (w: BinaryWriter) x =
w.Write(tag)
fmt.Write w x
{
R = (fmt.Read >> ctor) :: part.R
W = fun f u w -> part.W (f (write w)) u w
}
let End<'U> : FormatUnionPart<('U -> unit),'U> =
{
R = []
W = fun f u w -> f u
}
let Union (proof: 'X) (part: FormatUnionPart<'X,'U>) : Format<'U> =
let rs = Array.rev (List.toArray part.R)
{
Read = fun r ->
let tag = int (r.ReadChar())
rs.[tag] r
Write = fun w x -> part.W proof x w
}
let IntFormat : Format<int> =
{
Read = fun r -> r.ReadInt32()
Write = fun w x -> w.Write(x)
}
let FloatFormat : Format<float> =
{
Read = fun r -> r.ReadDouble()
Write = fun w x -> w.Write(x)
}
let StringFormat : Format<string> =
{
Read = fun r -> r.ReadString()
Write = fun w x -> w.Write(x)
}
type U =
| A of int
| B of float
| C of string
let UFormat =
Union (fun a b c x ->
match x with
| A x -> a x
| B x -> b x
| C x -> c x)
<< Case A IntFormat
<< Case B FloatFormat
<< Case C StringFormat
<| End
let test (fmt: Format<'T>) (value: 'T) =
let bytes =
use s = new MemoryStream()
use w = new BinaryWriter(s)
fmt.Write w value
s.ToArray()
let roundtrip =
use s = new MemoryStream(bytes, false)
use r = new BinaryReader(s)
fmt.Read r
roundtrip = value