Wednesday, October 10, 2012

Combinators over Records and Unions

In the previous post, I discussed designing combinator libraries that compose some property over unions. It is only fitting to throw records in the mix.

type U =
    | A of int
    | B of float
    | C of string

let UFormat =
    (
        UnionCase A IntFormat <<
        UnionCase B FloatFormat <<
        UnionCase C StringFormat
    )
    |> Union (fun a b c x ->
        match x with
        | A x -> a x
        | B x -> b x
        | C x -> c x)

type R =
    {
        A : int
        B : float
        C : string
    }

let RFormat : Format<R> =
    (
        RecordField (fun r -> r.A) IntFormat <<
        RecordField (fun r -> r.B) FloatFormat <<
        RecordField (fun r -> r.C) StringFormat
    )
    |> Record (fun a b c -> { A = a; B = b; C = c })

With some simplifications, here is the code:

Monday, October 8, 2012

Combinators over Discrimated Unions in ML

Discriminated unions or sum types are a natural way to model logical OR. Often you have a property that distributes over OR. Say, in F# (used throughout the article, though the ideas should apply equally well to any ML), you can write a combinator of the type:

P<'T1> → P<'T2> → P<Choice<'T1,'T2>>

How to go from here to a nice set of combinators that would handle an arbitrary union? This question has been on my mind for a while, and finally I have an acceptable solution.

As a disclaimer, at the level of theory the question is completely trivial. The interest is in how to find a user-friendly ML interface. Even this, I suspect, has been solved before, and at a much more general level. I am aware for instance of Vesa Karvonen's Generics for the working ML'er, and a more recent Oleg Kiselyov's presentation of generics in OCaml. I am looking here at a much simpler setting, hopefully taking it to a more accessible, for-dummies-like-myself level.

Suppose you are designing binary format combinators to let users of your library construct values of the form:

type Format<'T> =
    {
        Read : BinaryReader → 'T
        Write : BinaryWriter → 'T → unit
    }

Given an arbitrary union type and some primitive Format values, the user should be able to compose them. This involves projecting from sub-types to the parent union type, and matching backwards. After some experimentation, the design I have looks like this:

type U =
    | A of int
    | B of float
    | C of string

let UFormat =
    Union (fun a b c x →
        match x with
        | A x → a x
        | B x → b x
        | C x → c x)
    << Case A IntFormat
    << Case B FloatFormat
    << Case C StringFormat
    <| End
That's pretty much it. The magic is in the type of Case combinator that accumulates types so that the Union combinator can present N-way pattern matching as a single reasonably convenient to write function. Full code:

Friday, September 14, 2012

Faster Printf Released on NuGet

EDIT: The project described in this article is in alpha stage. If you are interested in a more mature an faster drop-in replacement for F# Printf module, check out Arseny Kapoulkine's fastprintf.

There is news on the faster F# Printf.* story. We released an alternative implementation as a package today.

NuGet: IntelliFactory.Printf

Source: https://bitbucket.org/IntelliFactory/printf

Some background: F# Printf.* functions are a very nice interface for formatted printing, but the default implementation is quite slow, which is undesirable for production use (say, for logging inside a server). I believe the F# team is addressing this for their next release - and when it comes, it is going to be awesome. In the meanwhile, we offer a drop-in replacement library that can speed things up a bit with the existing F#.

The code does not go the full 100% way of what is possible to optimize, in particular it does not use Reflection.Emit to attempt constructing an efficient closure capable of bypassing some of the closure allocation overhead via FastInvoke. However, I believe the improved performance already is practical, and the part of the implementation dealing with closure construction is pleasantly clear and easy to understand this way.

If you decide to try it out, note that this is an alpha release as it does not yet support format modifier flags and %A yet (I intend to rectify this as time permits).

ACKNOWLEDGEMENTS: I would like to thank Vladimir Matveev from the Microsoft F# team for some fruitful discussions on the internals of F# Printf.*; This project is largely based on my best attempt to beat Vladimir's code on a set of micro-benchmarks. In the end I lost the race, as Vladimir has come up with an even faster implementation since then. I am still quite happy though, knowing that with any luck Vladimir's code will make it to the F# trunk, and all will benefit.

Thursday, July 19, 2012

Speeding up F# Printf.*

There recently was an interesting SO question on F# Printf.* family of functions:

http://stackoverflow.com/questions/11559440/how-to-manage-debug-printing-in-f

It is known that these functions are very slow. Slow enough for most people to avoid them entirely, despite the advantages they offer in verifying argument types.

What I did not know is that these functions are so slow that a few lines of simple user code can speed them up, without changing the interface:



With this code I get from 2x to 10x speedups. It is suspicous, I am afraid F# does not cache parsing of the format strings. There is no reason why it should not.

Exercise for the reader: fix the code above to be thread-safe.

Thursday, June 14, 2012

AppHarbor: Free Cloud Hosting of WebSharper Apps

We have just released a new version (2.4.85) of WebSharper, our web development framework and F#-to-JavaScript compiler. The main highlight of this release is experimental support for easy cloud deployment of your applications with AppHarbor. Small AppHarbor deployments are currently free, which is great news for individual developers and small companies.

How to get it to work:

  1. Set up a GitHub or Bitbucket repository
  2. Set up an AppHarbor account
  3. Connect the two, according to AppHarbor instructions, so that pushing to your repository notifies AppHarbor to pull, build and deploy your project on their cloud
  4. Install WebSharper using the MSI installer
  5. Install NuGet through VisualStudio extensions manager
  6. Create a new WebSharper solution
  7. Using NuGet package manager, add a dependency on the "WebSharper" NuGet package from NuGet gallery to all projects in your solution
  8. Enable NuGet package restore (alternatively, use our own tool BuildMagic)
  9. Push to your repository, and see it working!

NOTES: you may commit the whole contents of NuGet-generated packages folder with binaries, which guarantees successful builds on AppHarbor servers but is not recommended as it adds unnecessary bloat. It is better to not commit packages and rely on NuGet package restore (or BuildMagic) to download the packages during build.

IMPORTANT: Due to a technicality we have not yet overcome, you do have to commit all *.targets from the WebSharper package in the packages folder. You do not have to commit any WebSharper binaries if you use NuGet package restore or BuildMagic.

We are also started releasing WebSharper extensions via NuGet. This should provide a convenient way for you to install and update extensions in your projects.

Please let us know how these works for you. As always, your feedback, bug reports and suggestions are welcome at our issue tracker.

Friday, May 25, 2012

NuGet BuildMagic: No Binaries in your DVCS

Let me introduce BuildMagic: https://bitbucket.org/IntelliFactory/buildmagic - get 0.0.1 via NuGet

Have you used NuGet? If not, you probably should - with aggressive backing from Microsoft, it's quickly converging to be the default package manager and binary repository for .NET, other similar projects now stand no chance (though they often have more technical merit).

Have you used NuGet package restore? You probably should - pushing binaries into source control is wicked. Especially if you are using DVCS - every fresh pull will have to get all history of your binaries. Especially if you are using Bitbucket which is quite slow on binaries.

Now, have you been disappointed by NuGet package restore requiring you to commit NuGet.exe binary to source control? If so, check out BuildMagic. It is a little workaround for the issue. Unfortunately you still have to commit something redundant (a targets file). But at least now you can say a definite NO to binaries.

Saturday, April 28, 2012

A Prettier Printer in ML

To try out something different, I took a shot at writing a pretty-printing module. Of all the papers I found on the subject, perhaps the most accessible is A prettier printer by Wadler. In particular the algebra of documents presented in the paper is very simple and therefore compelling.

I tried to implement these combinators in a OCaml. Unfortunately for OCaml, Wadler's code heavily relies on laziness for searching the exponentially large tree of possible printouts and selecting the optimal one by limited look-ahead and backtracking. The exact same behavior could be obtained in OCaml by mechanically injecting explicit "lazy" everywhere, but it will not be very pretty. I decided to play with a different approach and pre-compute enough information to make every decision on the spot.

The result is ocaml-pretty - see the link for the code and API docs. I tried to maintain the same behavior as in the paper, but have only checked consistency on some small examples. One obvious operational difference is memory use: since the document data structure is strict, pretty-printing cannot help but use O(N) memory in the size of the document. The published Haskell algorithm certainly does better. I did not find an obvious simple fix for this within the approach I adopted, and ended up deciding it does not matter for my purposes.

OCaml standard library has a Format module - this is what you should use for real-world OCaml pretty-printing. Its imperative interface avoids both the O(N) memory overhead and the lazy data structure allocation overhead, and it's always there for OCaml.

UPDATE: an adaptation of Wadler algorithm to strict languages, in particular OCaml, has already been published in the paper Strictly Pretty by Christian Lindig and Gartner Datensysteme Gbr, see 10.1.1.34.2200, as was kindly pointed out to me by its author. The critical insight is managing Group nodes explicitly to avoid the exponential search space.

Friday, April 27, 2012

WebSharper 2.4.63

WebSharper has seen a few more releases, the latest being 2.4.63.

In a nutshell, you now can:

  1. Update HTML templates in sitelets without recompiling the solution and see immediate changes
  2. Use VS in Windows Phone 7 and Eclipse in Android templates, taking advantage of more customization opportunities and IDE services such as an integrated debugger
  3. Use HttpContext.Current.Session in sitelets and remote/RPC methods
  4. Use Visual Studio 2011 Beta, .NET 4.5, Windows 8, F# 3.0, even if you do not have the previous .NET framework or F#

For a full change log, please see here.

Thursday, February 16, 2012

WebSharper 2.4.35

WebSharper 2.4.35 is a bugfix release, fixing offline site generation and other important issues. See the complete list of changes.

Tuesday, January 24, 2012

WebSharper Goes Green: Staph Genome Viz

I have been thoroughly enjoying our latest project at IntelliFactory. Me and Loïc Denuzière, the creator of FPish, are working together for the U Nebraska Medical Center on genetic visualization with HTML5/WebSharper/F#. The organism of interest is Staphylococcus aureus. We have built an interactive chart showing its genome coding sequences and a few thousand transposon insertions performed in the UNMC labs.

You are welcome to play with the latest prototype.

It has been fun to do some graphics and modular arithmetic. We used Raphael to draw SVG, this has worked quite well for us, especially after we "fixed" the Raphael API a little bit for easier use from F# in the latest WebSharper binding. I also am just starting to realize how much I missed in highschool biology.

F# may have a great future in bioinformatics. Type providers easily consuming various data sources, .NET providing decent performance for numerical algorithms, and WebSharper or Silverlight giving a browser-accessible UI.. The only limit is your imagination.

In one of the discussions we had, Loïc made an interesting comment. He said that unlike working in the financial sector, applying F# to bioinformatics is not just interesting, but also useful. And though I have not been an Occupy protester, I cannot help but agree. Realizing that your work may help medical research is definitely a great motivator.

If you are tempted to work with us, please do apply: we are hiring. Interns are especially welcome.

Monday, January 9, 2012

WebSharper Goes Open-Source

WebSharper 2.4, our latest F# web stack based on a F#-to-JavaScript compiler, is out today, and it is open source, with a dual licensing model: you either use AGPL + OSS exceptions or purchase a commercial license. See details at websharper.com, and source at bitbucket.org.