Friday, March 29, 2013

FAKE with NuGet support

UPDATE: the proposal made it into pre-release FAKE, see the more recent article: http://t0yv0.blogspot.com/2013/04/introducing-fake-boot-workflow.html

The intended F# build workflow: start from an empty folder, write build.fsx, run fake - and get everything building, with the required software pulled from NuGet.

Draft Implementation | Discussion


Wednesday, March 27, 2013

Generalizing records combinators a bit

Towards generic programming in F#: thoughts on generalizing the earlier combinators over records and unions...

Tuesday, March 26, 2013

WebSharper, PhoneGap, and Ripple: easier native HTML5 apps

We are experimenting with PhoneGap, PhoneGap Build, the Ripple emulator and WebSharper. If successful, this will let you write truly cross-platform native mobile apps in F#, quickly pre-test them in Chrome, and then generate installers for various platforms (Android, Windows, iOS - yes, iOS!) without even having to install the SDKs.


Monday, March 25, 2013

TypeScript: initial impressions

There are several things under the TypeScript umbrella: a type system, a language spec, a JavaScript-targeting compiler, and tooling that provides interactive code completion. I have tried TypeScript on a few small projects, the latest one being TypedPhoneGap. I also worked extensively with the language spec, wrote a parser and analyzer for the `.d.ts` contract specification fragment, pondered the semantics, and engaged in a few small flamewars on the language forum. My conclusion, in short, is this: the type system and language design are terrible, the compiler is OK, and the tooling is excellent. Overall, it is an improvement over writing bare JavaScript, and the definition fragment is helpful (though not ideal) for communicating JavaScript APIs formally.

Saturday, March 23, 2013

Using Coq as a program optimization tool

Update: a slightly more compelling example of where program computation shines is at Another take on F# generics.

I tend to complain a lot about the programming tools I work with. In one sentence, the world of software is just not a sane place. If left unchecked, this complaining attitude can grow into despair - so I enjoy a little escape from time to time. I need to dream of a better world, where programming is meaningful, fun, beautiful, efficient, mostly automated, and where programs are verified. This is why my escape activity is playing with Coq.

One fascinating trick that Coq does really well is computing and manipulating programs. Have you ever found yourself writing beautiful, general functional code and then wondering how to make it run faster? Spending lots of time inlining definitions, special-casing, performing simplifications and hoping you do not introduce bugs in the process?

This activity can be automated in a way that verifies meaning preservation.

Here is a rather silly example (I will try to think of a better and more convincing one):



Lines 20-24 should in theory be expressible as a single-word tactic. This is where program simplification happens.

Imagine a better world: you write functional code, focus on clarity of specification, then drop into interactive theorem proving to compute (!) an equivalent huge, ugly, optimized, specialized program that runs a lot faster but is guaranteed to produce the same result.

EDIT: a reader has asked how manipulating programs in Coq is different from using an optimizing compiler such as GHC. The short answer (and I again realize just how lame my example is since it does not demonstrate it) is control.

With GHC, you get a black-box optimizer that does some transformations and gives you a result. The resulting program is usually better (faster, uses less space), but once in a while it is worse. It also may or may not perform the optimization that is essential for your domain. I am not using GHC on a daily basis - so feel free to correct me - but I believe the amount of control you can exert is fairly limited, even with RULES pragmas. In Coq, on the other hand, you get "proof mode" - an interactive environment where you can inspect and explore the intermediate versions of your program as it is being optimized, and guide the process. There is a spectrum of possibilities from fully manual to fully automated transformations, programmable in the Ltac tactic language.

Another advantage (which makes this attractive to me) is that if you are targeting ML (OCaml, F#), you are working with a compiler that refuses to do too much optimization, so GHC-like magic is simply not available. Here, extracting from Coq may come in handy.

Yet another advantage is that you are working in a more expressive logic. This opens the door to writing more general programs than your target compiler would admit. I have not explored this much yet, but it seems to be an easy way to bypass the lack of, say, higher-kinded polymorphism in F#.

Finally, I should also mention that as a proof assistant, Coq makes it practical to construct proofs that your programs (and transformations) are correct with respect to a given semantics. I find this appealing, even though it is a bit beside the point here. As a beginner, my experience has been that proving any non-trivial lemma takes unbelievably longer than constructing a program, so for now I am just exploring the possibility of using Coq as a programming language without doing much proof development.

Tuesday, March 19, 2013

Automate, automate, automate..

The latest snapshot of the WebSharper repositories (on Bitbucket and GitHub) showcases some build automation I finally managed to get working. It starts from MSBuild, which uses NuGet to pull dependencies (including build dependencies), and then jumps to FAKE - using the alpha pre-release FAKE version has solved some F# version problems I faced earlier. It builds a lot of code, creates an output NuGet package, and also creates several Visual Studio templates that it packages up into a VSIX extensibility package - using the free-to-use API we are releasing under IntelliFactory.Build.

Things are looking a bit better since the build succeeds from scratch inside the AppHarbor environment, which is currently hooked up as a simple build server.

Despite the modest progress, build automation is still a nightmare. I think we need a much simpler story, one that would definitely involve the NuGet repository as a global binaries library and F# as the language for all build logic, with MSBuild in a supporting role (generating MSBuild scripts for Visual Studio users).

See also the discussion on ScriptCS - this project attempts to do something very similar based on C#. I like the authors' emphasis on the Node Package Manager as the model.

A thought that occurred to me frequently when debugging the builds was a simple question: why do all the parts have to work across process isolation barriers? It feels like we are trying to play the UNIX game on Windows, where the filesystem simply does not keep up and processes have an unbearably slow cold start. For example, why invoke NuGet, or MSBuild for that matter, as a process through the command line when there is a .NET API for both? PowerShell at least allows Cmdlets to keep some state in memory, so reusing the same Cmdlet is vastly faster than invoking the same functionality in a separate process. I might work on a tool along these lines as time permits.
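
To illustrate what I mean by using the API directly, here is a minimal sketch (my own illustration, assuming references to the Microsoft.Build assemblies) of driving MSBuild in-process rather than shelling out to MSBuild.exe:

    // Sketch only: build a project in-process via the MSBuild API,
    // avoiding the cost of spawning MSBuild.exe for every invocation.
    open Microsoft.Build.Execution
    open Microsoft.Build.Framework
    open Microsoft.Build.Logging

    let buildInProcess (projectFile: string) (targets: string []) =
        let globalProps = dict [ "Configuration", "Release" ]   // example property
        let parameters = BuildParameters(Loggers = [ ConsoleLogger() :> ILogger ])
        let request = BuildRequestData(projectFile, globalProps, null, targets, null)
        let result = BuildManager.DefaultBuildManager.Build(parameters, request)
        result.OverallResult = BuildResultCode.Success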

A big win would be a sane F# compiler, one free of the global mutable state (the famous toxic nuclear waste that Haskell avoids by controlling side effects in the type system). Then we would not have to invoke it in a separate process for each project in the solution, which could drastically reduce build times.


Thursday, March 14, 2013

Multi-targeting .NET projects with F# and FAKE

I am currently working on simplifying build configurations for a bunch of projects including WebSharper, IntelliFactory.Build, IntelliFactory.FastInvoke and eventually multiple WebSharper extensions. I am now using F# and FAKE when possible instead of MSBuild, and relying more heavily on the public NuGet repository.

The good news is that abstracting things in F# and sharing common build logic in a library via NuGet really works well, and feels a lot more natural than MSBuild. Consider this Build.fsx file from FastInvoke:



The FAKE-based build (well, together with some MSBuild boilerplate that I generate) accomplishes quite a few chores:

  • Bootstraps NuGet.exe, so that no binaries need to be checked into the source repository (a sketch of this step follows the list)
  • Resolves packages specified in solution packages.config, including pulling in FAKE and build logic such as IntelliFactory.Build
  • Determines the current Mercurial hash or tag
  • Constructs AutoAssemblyInfo.fs with company metadata and the Mercurial tag
  • Constructs MSBuild boilerplate to help projects find NuGet-installed dependencies without specifying the version, for easy dependency version updates
  • Builds specified projects in multiple framework configurations
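
The bootstrapping chore is simple enough to show. Here is a hypothetical helper (my illustration, not the IntelliFactory.Build code; the download URL is an assumption) that fetches NuGet.exe on demand:

    // Hypothetical bootstrap helper: download NuGet.exe on demand so that
    // no binaries have to be committed to the source repository.
    open System.IO
    open System.Net

    let ensureNuGet (path: string) =
        if not (File.Exists path) then
            use client = new WebClient()
            // Assumed download location for the command-line client.
            client.DownloadFile("https://nuget.org/nuget.exe", path)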

And you can, of course, do more inside the FAKE file.
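
To give a flavor of the overall structure, here is a stripped-down sketch of a script in this style (my illustration, not the actual FastInvoke Build.fsx), using only the standard FakeLib helpers:

    // Sketch only - not the actual FastInvoke script.
    // Assumes NuGet has already restored FAKE into tools/FAKE.
    #r "tools/FAKE/tools/FakeLib.dll"
    open Fake

    let buildDir = "build"

    Target "Clean" (fun _ ->
        CleanDir buildDir)

    Target "Build" (fun _ ->
        // Build the projects in Release mode with the FAKE MSBuild helper.
        !! "**/*.fsproj"
        |> MSBuildRelease buildDir "Build"
        |> Log "Build output: ")

    Target "Default" DoNothing

    "Clean"
        ==> "Build"
        ==> "Default"

    RunTargetOrDefault "Default"

In the real scripts, the shared chores such as assembly-info generation, versioning and multi-framework builds come from the IntelliFactory.Build package pulled via NuGet and are called from targets like these.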

The bad news is that I expected quite a bit more from FAKE, and I end up fighting it more than using it. Note that the following may be either legitimate problems with FAKE or just my limited understanding of it.

  • Running FAKE.exe drops your code into the 2.0 runtime even if the host process was running 4.0 - not acceptable for me, so I had to replace the FAKE.exe invocation with an FSI.exe invocation. UPDATE: with the pre-release alpha version of FAKE, the scripts default to the 4.0 runtime and use FSharp.Core 4.3.0.0 - problem solved
  • FAKE had no easy support for dependency tracking, such as not overwriting a file unless necessary (useful to prevent, say, the MSBuild it invokes from doing work twice) - I had to roll a few of my own helpers
  • Surprisingly, the FAKE MSBuild helpers call the MSBuild process instead of using the in-process MSBuild API. By using the MSBuild API myself, I am able to speed things up a bit.
  • On a similar note, I toyed with invoking either FAKE or Fsi.exe in a slave AppDomain (which should be faster than a separate process, right?) from the host MSBuild process that my build starts with; a sketch of the attempt follows below. The approach failed miserably. Fsi.exe reads System.Environment.GetCommandLineArgs() instead of the EntryPoint args, so it does not see the args I pass to the slave AppDomain but instead sees the args that MSBuild received. And FAKE, again, drops me into the 2.0 runtime, probably starting another process too.
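
Roughly, the slave-AppDomain experiment looked like this (a reconstruction for illustration, not my original code):

    // Sketch of the failed experiment: run Fsi.exe's entry point in-process
    // instead of spawning a new OS process. In practice the args passed here
    // are not what Fsi sees, because it reads Environment.GetCommandLineArgs()
    // (the host process's command line) rather than its EntryPoint arguments.
    open System

    let runFsiInSlaveDomain (fsiExePath: string) (scriptPath: string) =
        let domain = AppDomain.CreateDomain("fsi-slave")
        try
            domain.ExecuteAssembly(fsiExePath, [| "--exec"; scriptPath |])
        finally
            AppDomain.Unload(domain)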

In the end, my impression is that there is tremendous value in automating build logic in F# instead of MSBuild. As for FAKE, it seems most of its value comes from the helper library - some shared direct-style imperative recipes. I would like to see that released separately, say as a FAKE.Lib NuGet package, to make it easier to use standalone. Also, it seems that FAKE could really benefit from some extra standard features for dependency management, such as comparing target input and output files by checksum or date, to be on par with Rake.
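
The kind of helper I have in mind is simple. Here is a minimal sketch (my own hypothetical code, not an existing FAKE API) that skips an action unless some output is missing or some input is newer than the oldest output:

    // Hypothetical dependency-tracking helper: run a build action only when
    // an output is missing or an input is newer than the oldest output.
    open System.IO

    let needsRebuild (inputs: string list) (outputs: string list) =
        let lastWrite file = FileInfo(file).LastWriteTimeUtc
        if outputs |> List.exists (File.Exists >> not) then true
        elif List.isEmpty inputs || List.isEmpty outputs then false
        else List.max (List.map lastWrite inputs) > List.min (List.map lastWrite outputs)

    let buildIfStale inputs outputs (action: unit -> unit) =
        if needsRebuild inputs outputs then action ()

A checksum-based variant would hash file contents instead of comparing timestamps; either way, the point is that helpers like these could ship with FAKE itself.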