ha4: 2013

Tuesday, October 15, 2013

Last word in .NET build systems

Has not been said yet, I do not think.

In the F# world you may be looking at:

I am not going to do a detailed pro/con analysis of these just yet, but note that every one of them is currently missing abstractions relevant for this problem domain - building. A build system should allow you to do at least what veritable Makefile does - optimal rebuilds, but at an abstract level, as a library.

The best system I have seen so far that gives you these abstractions is Shake by Neil Mitchell (coded in Haskell). It goes beyond Makefiles by allowing dynamic dependencies. I did not study the specifics very closely, but the overall design is vastly useful, brilliant.

Do not have time at this moment to get it into the shape it deserves, but here is some work I have been drafting now and then to build a similar library in F#: fshake. If this scratches an itch, let me know, I would be interested in contributors - I have not put the license in yet, but this project will be under Apache license.

Friday, October 4, 2013

WebSharper vs FunScript

Got asked to compare WebSharper to TypeScript on StackOverflow, doing it here instead. Disclaimer: I work for IntelliFactory and in fact develop WebSharper, so I am obviously biased.

Good things to say about FunScript:

seems to be entirely unrestricted (I could not even locate a license file)
has some very talented people hacking on it (but not as the day job)
has this wonderful idea of using TypeScript definition files to provide typed API for JavaScript libraries (however, at this moment it does not work with latest TypeScript version so no luck there)

Why would you be interested in WebSharper instead? The quick answer is that it actually gets used a lot more, and is known to work quite well on large projects (I will cite FPish and CloudSharper as the ones we have been using it on at IntelliFactory), there is a team working on it day-to-day, and you can get actual support. It has been in use longer and it might have more issues ironed out, and it definitely supports a larger portion of F# standard library out-of-the-box.

NOTE: Tomas pointed me to a FunScript way of doing the following in the comments. Jon Harrop sites problems with using sqrt in FunScript. In WebSharper, it works. Suppose it did not, and you know how the function looks in JavaScript, you do:

[<Inline "Math.sqrt($x)">]
let sqrt (x: double) = sqrt x

And there you have it.. Also, rest assured that there is no string splicing going on - the inline is parsed as JavaScript, then a large subset of JS like the above is lifted to our Core form, which assures things like evaluation order are preserved, and lets optimizations work smoothly, including ones we have not written yet :)

That being said, I believe we have made quite a few mistakes in the past, including:

hiding sources (it is open-source now)
mismanaging the community - actually I am reading books now on how to facilitate an open-source project community better
making it too hard to get started and not giving enough documentation
(presently we are writing manual chapters at our github repo) and preparing a website update
trying to provide all library bindings ourselves - TypeScript is definitely a great idea and we will add support for it (have a working prototype for 0.8 but need to upgrade to latest TS - same as FunScript
attempting to solve too many problems at once, and not focusing on important problems first. This leaving us with a large and somewhat difficult to change codebase
some engineering mistakes in organizing code or designing APIs

If you feel like making more suggestions, please do..

People also typically bring up licensing and how the output code looks like as problems. I do not think those are show-stopper issues:

License (AGPL) - it is actually free for open-source use. If you need to close your app, but cannot afford the listed license fees, just talk to us and we can work out a deal. Note that having this license might be annoying but it is essential for securing funding and commercial support for the project - for it to have a future.
Code output - we are working out a better optimizer with Andras Janko that will help a lot, in both shrinking output and improving performance, but really, how the code looks - this is not an issue. In months of programming with WebSharper I never remember looking at the JavaScript output. Look at Emscripten, can you read its output? Yet Emscripten is extremely useful. Once you get past a certain stability level, this does not matter anymore.

I also privately think both WebSharper and FunScript make this fundamental mistake - using F# quotations. I think the future is simply not there. There is exciting stuff going on with projects such as ASM.js and Emscripten, and I strongly suspect the future successful project in this area will provide a much more compatible CLR implementation on top of JS, plus perhaps some specialized optimizations for functional code as F# produces. I have seen a few CLR-to-JS compilers, but am not sure which is the best today. I tried writing one myself and hope to come back to it, it certainly was challenging and interesting. The key is to stop thinking of such projects as source-to-source compilers and start treating JS for what it is, portable assembly.

Thursday, June 6, 2013

Generic programming in F# - another take

Playing a bit more with generics, I stumbled upon some fairly compact combinators that can derive n-ary conjunctions from a binary conjunction. Here is a self-explanatory F# example (disregard the ugliness involved in F#'s lack of higher kinds, Haskell or OCaml code would not need this):

In case you are wondering about the Coq file in the gist: a good objection to this approach is efficiency. Instances computed in this way are not efficient at all. Just think of all the allocated tuples! It turns out, we can remedy the situation with automated program transformation. In F# it would involve computing over quotations or similar quoted forms. It is hard to get started since little support for optimizing those is available out of the box. In Coq there is more batteries. As you can see in the gist, Coq seems to tackle the above example trivially with the "compute" tactic, and derive an instance equivalent to a hand-written one.

Wednesday, April 3, 2013

Introducing FAKE boot workflow

FAKE is a tool to automate your builds with F# scripts. Use the newly available FAKE boot workflow to get started quickly, automatically reference packages, and plug in your own build logic from NuGet.

FAKE with NuGet support

UPDATE: the proposal made it into pre-release FAKE, see the more recent article: http://t0yv0.blogspot.com/2013/04/introducing-fake-boot-workflow.html

Intended F# build workflow: start from an empty folder, write build.fsx, and run fake - and get anything building with software from NuGet.

Draft Implementation Discussion

Generalizing records combinators a bit

Towards generic programming in F#: thoughts on generalizing the earlier combinators over records and unions...

WebSharper, PhoneGap, and Ripple: easier native HTML5 apps

We are experimenting with PhoneGap, PhoneGap Build, Ripple emulator and WebSharper. If successful, this will let you write truly cross-platform native mobile apps in F#, quickly pre-testing them in Chrome, and then generating various installers (Android, Windows, iOS - yes, iOS!) without even having to install the SDK.

TypeScript: initial impressions

There are several things under the TypeScript umbrella: a type system, language spec, JavaScript-targeting compiler and tooling that provides interactive code completion. I have tried TypeScript on a few small projects, the latest one being TypedPhoneGap. I also worked extensively with the language spec, wrote a parser and analyzer for the `.d.ts` contract specification fragment, pondered the semantics, and engaged in little flamewars on the language forum. My conclusion, in short, is this: the type system and language design are terrible, the compiler is OK, and tooling is excellent. Overall, it is an improvement over writing bare JavaScript, and the definition fragment is helpful (though not ideal) for communicating JavaScript API formally.

Using Coq as a program optimization tool

Update: - a slightly more compelling example of where program computation shines is at Another take on F# generics.

I tend to complain a lot about the programming tools I work with. In one sentence, the world of software is just not a sane place. If left unchecked, this complaining attitude can grow into despair - so I enjoy a little escape from time to time. I need to dream of a better world, where programming is meaningful, fun, beautiful, efficient, mostly automated, and where programs are verified. This is why my escape activity is playing with Coq.

One fascinating trick that Coq does really well is computing and manipulating programs. Have you ever found yourself writing beautiful, general functional code and then wondering how to make it run faster? Spending lots of time inlining definitions, special-casing, performing simplifications and hoping you do not introduce bugs in the process?

This activity can be automated in a way that verifies meaning preservation.

Here is a rather silly example (I will try to think of a better and more convincing one):

Lines 20-24 should in theory be expressible as a single-word tactic. This is where program simplification happens.

Imagine a better world.. You write functional code, focus on clarity of specification, then drop into interactive theorem proving to compute (!) the equivalent huge, ugly, optimized, specialized program that runs a lot faster but is guaranteed to produce the same result.

EDIT: a reader has asked how manipulating programs in Coq is different from using an optimizing compiler such as GHC. The short answer (and I again realize just how lame my example is since it does not demonstrate it) is control.

With GHC, you get a black-box optimizer that does some transformations and gives you a result. The resulting program is usually better (faster, uses less space), but once in a while worse. It also may or may not do the essential optimization for your domain. I am not using GHC on a daily basis - so feel free to correct me, but I believe the amount of control you exert is fairly limited, even with REWRITE pragmas. In Coq, on the other hand, you get "proof mode" - an interactive environment where you can inspect and explore the intermediate versions of your program as it is being optimized, and guide the process. There is a spectrum of possibilities from fully manual to fully automated transformations, programmable in the Ltac tactic language.

Another advantage (which makes this attractive to me) is that if you are targeting ML (OCaml, F#), you are working with a compiler that refuses to do too much optimization. So GHC-like magic is simply not available. Here extracting from Coq may come handy.

Yet another advantage is that you are working in a more expressive logic. This opens the door to writing more general programs than your target compiler would admit. I have not yet explored this too much, but it seems to be an easy way to bypass the lack of, say, higher kind polymorphism in F#.

Finally, I should also mention that as a proof assistant, Coq makes it practical to construct proofs that your programs (and transformations) are correct with respect to a given semantics. I find this appealing, even though this is a bit beside the point. As a beginner my experience has been that proving any non-trivial lemma takes unbelievably more time than constructing a program, so for now I am just exploring the possibility of using Coq as a programming language without doing much proof development.

Tuesday, March 19, 2013

Automate, automate, automate..

Latest snapshot of the WebSharper repositories (Bitbucket and at GitHub) showcases some build automation I finally managed to get working. It starts from MSBuild that uses NuGet to pull dependencies, including build dependencies, and then jumps to FAKE - using the alpha pre-release FAKE version has solved some F# version problems I faced earlier. It builds a lot of code, creates an output NuGet package, and also creates several Visual Studio templates that it packages up into a VSIX extensibility package - using the free to use API we are releasing under IntelliFactory.Build.

Things are looking a bit better since the build succeeds from scratch inside the AppHarbor environment that is currently hooked up - as a simple build server.

Despite the modest progress, build automation is still a nightmare. I think we need a much simpler story, one that would definitely involve the NuGet repository as a global binaries library and F# as the language for all build logic, with MSBuild in support role (generating MSBuild scripts for Visual Studio users).

See also the discussion on ScriptCS - this project attempts to do something very similar based on C#. I like their authors' emphasis on Node Package Manager as the model.

A thought that occurred to me frequently when debugging the builds was the simple question - why do all the parts have to work with process isolation barriers? It seems like we are trying to play the UNIX game on Windows, where the filesystem simply does not keep up, and processes have an unbearably slow cold start. For example, in principle, why invoke NuGet or MSBuild for that matter as a process through the command line, when there is .NET API for both? It seems that PowerShell at least allows Cmdlet's to keep some state in-memory, so reusing the same Cmdlet is vastly faster than invoking the same functionality in a separate process. I might be working on the tool along these lines as time permits.

A big win would be to have sane F# compiler (without the famous toxic nuclear waste that Haskell avoids by controlling side effects in the type system).. Then we would not have to invoke it in a separate process for each project in the solution. This could drastically reduce build times.

Thursday, March 14, 2013

Multi-targeting .NET projects with F# and FAKE

I am currently working on simplifying build configurations for a bunch of projects including WebSharper, IntelliFactory.Build, IntelliFactory.FastInvoke and eventually multiple WebSharper extensions. I am now using F# and FAKE when possible instead of MSBuild, and relying more heavily on the public NuGet repository.

The good news is that abstracting things in F# and sharing common build logic in a library via NuGet really works well, and feels a lot more natural than MSBuild. Consider this Build.fsx file from FastInvoke:

The FAKE-based build (well, together with some MSBuild boilerplate that I generate) accomplishes quite a few chores:

Bootstraps NuGet.exe without any binaries in the source repo
Resolves packages specified in solution packages.config, including pulling in FAKE and build logic such as IntelliFactory.Build
Determines current Mercurial hash or tag
Constructs AutoAssemblyInfo.fs with company metadata and mercurial tag
Constructs MSBuild boilerplate to help projects find NuGet-installed dependencies without specifying the version, for easy dependency version updates
Builds specified projects in multiple framework configurations

And you can, of course, do more inside the FAKE file.

The bad news is that I expected quite a bit more from FAKE, and I end up fighting it more than using it - note that these can be either legit problems with FAKE or else my limited understanding of it.

~~Running FAKE.exe drops your code into the 2.0 runtime, even if the host process was in 4.0 - not acceptable for me, had to replace FAKE.exe invocation with FSI.exe invocation~~ - UPDATE: when using pre-release alpha version of FAKE, the scripts default to the 4.0 runtime and use FSharp.Core 4.3.0.0 - problem solved
FAKE had no easy support for dependency tracking, such as not overwriting a file unless necessary (useful to prevent say the MSBuild it invokes from doing work twice) - had to roll a few or my own helpers
Surprisingly FAKE MSBuild helpers are calling MSBuild process instead of using the in-process MSBuild API. By using MSBuild API myself, I am able to speed things up a bit.
On a similar note, I toyed with invoking either FAKE or Fsi.exe in a slave AppDomain (should be faster than a separate process, right?) from the host MSBuild process that my build starts with. The approach failed miserably. Fsi.exe is reading System.Environment.GetCommandLineArgs() instead of reading the EntryPoint args, so that it does not see the args I pass to the slave AppDomain, but instead sees the args that MSBuild receives. And FAKE, again, drops me into the 2.0 runtime, probably starting another system process too.

In the end my impression is that there is tremendous value in automating build logic in F# instead of MSBuild. As to FAKE, it seems most of its value comes from the helper library - some shared direct-style imperative recipes. I would like to see that released separately, say as FAKE.Lib NuGet package, to make it easier to use standalone. Also, it seems that FAKE could really benefit from some extra standard features for dependency management such as comparing target input and output files by checksum or date, to be on par with rake.

ha4