ha4: haskell

Showing posts with label haskell. Show all posts

Tuesday, July 22, 2014

The fatal attraction of FRP

For about a year or so, I have made Functional Reactive Programming (FRP) taboo. I cringed at every mention of it. I think this was a mental self-defense reaction, my extended immune system was sending me a signal that I spent too much time thinking about this subject without any tangible results.

The fatal attraction of FRP is its simplicity. Semantics are beautiful and clear. There are Behaviors, functions of time, Events, timed occurrences, and they all dance together. You write causal transformations (future values depend on past values, not vise versa), and it works. You use equational reasoning to transform your program. You get inspired by Conal Elliott's papers.

However, how do we implement it? Many a ship has foundered in these waters.

Perhaps you do not even have to. People do different things to really "get," obtain an operational understanding of something. Mathematicians like denotational semantics. They understand object A by explaining it in terms of object B we already know. Only programmers like to implement things, but then they can use an existing implementation to play with. Once I play with Impl(A), I "understand" A.

At the Budapest functional programming beer night I used to go to, we had the fortune of having Gergely Patai and Csaba Hurska. These guys have built simply amazing Haskell projects, the Elerea FRP library, and the LambdaCube purely functional 3d engine. Gergely is a mathematician. He explained FRP to me by reducing things to Cartesian closed categories. As you guess, it did not help me - to this day my understanding of those things is shaky. Csaba is a little more of a programmer, he told be Haskell is a hacker's language. Both, now that I think of it, made fun of me for talking about dependent types being the future of programming, with good reason, as I now know. But I digress.

At a very high-level, problems with implementing FRP in a language like Haskell or F# are these:

It is quite hard to enforce causality of transformations with a vanilla type system. AFAIK most practical systems either design a new language and type system, or just leave this invariant unchecked.
If event streams are first class values, this can easily create memory leaks in a higher-order program, if the entire unfolding history is retained. Solutions here are: do not give the user first-class event streams (but only their transformers, or something like that), create a new language and type system that rules out those nasty higher-order programs (Elm.js), or use some kind of combination of convention, types, and clever implementation (weak pointers etc), to make it work.

Much as I despair getting FRP right, I am coming back to the topic as we are working on WebSharper UI.Next with Simon Fowler. While trying to implement a sub-FRP system, something that is not quite as general but hopefully much simpler to implement, I also realize I start slowly thinking about general FRP again.

Briefly, the compromises we are making in UI.Next are:

No first-class event streams. You want to transform event occurrences? Do it with imperative state and callbacks. That stops working? Do it with agents and communicating micro-processes. That stops working too? Use Concurrent ML (or Hopac). We are not quite there yet with providing CML primitives in WebSharper/JS but this is planned, probably as a proxy for Hopac.
We do, however, provide a dataflow abstraction that is almost, but not quite, approaching FRP behaviors. A View<'T> is something that varies over time, in discrete steps. It is now computed from zero or more Var<'T>, which are a reactive form of ref cells.
Using the mental model of communicating processes, we designed a protocol for View<'T> processes to synchronize in a way that is friendly to GC for most programs, even without relying on weak pointers (not at this point, at least). The protocol also does not preserve occurrences, it synchronizes to the "latest" value, which is nice.
Views can be observed imperatively or by constructing reactive DOM documents.

It is tempting to add Behaviors too, perhaps as View'T>.

As I am playing with this, I have one rebellious thought. Are FRP event streams or their transformers even worth the trouble? Behaviors are clearly wonderful. But do you really want first-class "mouseClicks"? Or a fold on those? In the context of an ML type system such as F#.

It remains to be seen. So far I like that our system is working well with embedding typical higher-order, stateful widget code, as found left and right in the JavaScript ecosystem. It is also simple enough so we could implement it.. Reasoning laws are NOT the ones for pure FP, but things are mostly tractable if you stick to the paradigm of communicating processes. CML might just fill the spot, but some cross-pollination might also happen. That is, we might end up adding process combinators that look a lot like FRP event and event/behavior combinators.

Wednesday, March 27, 2013

Generalizing records combinators a bit

Towards generic programming in F#: thoughts on generalizing the earlier combinators over records and unions...

Earley Parsing in Haskell

After a bit of a struggle I can run simple parsing examples using my Earley algorithm implementation [1] in Haskell. At this stage the implementation is likely to be flawed, with respect to correctness and likely performance. Regardless, the point I wanted to prove to myself is that it is perfectly plausible to define context-free grammars and their interpretation in plain Haskell, without resorting to code generation, or giving up full grammar analysis. I suspect it is a well-known method but it was a pleasure to discover. See for yourself:

Aycock and Horspool's paper [2] was my source for the algorithm. I admit I did not get very far, pretty much ignoring (for now) the bulk of their paper and the automaton construction, and focusing on their review of the basics. I also had to adjust things a bit to fit Haskell's purity, and devised a slightly modified (and possibly faulty) interaction of completer and predictor.

Earley's algorithm is beautiful. Very simple, fully general (all CFGs!), cubic in worst-case but close to linear on practical grammars, and, perhaps most importantly, incremental (think completion in an IDE). A highly recommended exercise.

[1] https://github.com/toyvo/haskell-earley
[2] http://webhome.cs.uvic.ca/~nigelh/Publications/PracticalEarleyParsing.pdf

Thursday, July 14, 2011

Combinatory Regular Expressions in Haskell

Combinatory parsing fascinates me. It is an area that has open problems, but also a lot of solved problems that make great exercises. A while ago I tried to define a combinatory regular expression matcher in F#, and was puzzled at how complex it turned out to be. As usual, marginal understanding was to blame.

Now I turned to Haskell for a change. I do think it helped, to learn more both about the problem and about Haskell. I hope to backport this exercise to the ML world, in the meanwhile here is the gist:

Sunday, March 8, 2009

Pandoc. Can we use it ouside of Haskell?

Pandoc is great. It a Markdown implementation on steroids, supporting a number of different input/output formats besides HTML, such as TeX. And yet it is Markdown: the source still looks great, just like plain email! As an extra bonus, Pandoc, being written in Haskell and compiled to native code, works a lot faster than many alternative markup language processors.

The problem is.. It is a Haskell library. If it is indeed the great software it seems to be, one would want to use it from scripting languages (PHP, Python, Perl), OCaml, Schemes, Java, .NET and so on.

The first step towards that is to bind it to C.

Good news: yes, it is possible, and Pandoc DLL or Shared Object builds on both Windows and Linux. It is callable from other languages (currently C and PLT Scheme).

Bad news: the binary is large, as GHC compiles everything, including its runtime, into it. The binding is untested, poorly configured and very incomplete. But, well, it is open-source and will get better with time. You can see for yourself: libpandoc

Saturday, March 7, 2009

learning to coax haskell with phantom types

The problem: write a function maxBits that will return 8 for Word8, 16 for Word16, 32 for Word32, and so on. Actually this is a function from types to values. The best I came up with is below:

maxBits :: (Bits a, Bounded a) => a -> Int 
maxBits x = bitSize m where 
    m = maxBound
    _ = x == m

Note the last line. It constrains x and m to be of the same type. More generally, it could be done for example thus:

sameType :: a -> a -> ()
sameType = undefined

maxBits :: (Bits a, Bounded a) => a -> Int 
maxBits x = bitSize m where 
    m = maxBound
    _ = sameType x m

With the maxBits we can do the following, which was what I was after.

toBits :: (Bounded a, Bits a) => a -> [Bool]
toBits x = [testBit x i | i <- [0 .. maxBits x - 1]] 

ofBits :: Bits a => [Bool] -> a
ofBits bits = foldr f 0 bits where
    f True n  = shiftL n 1 + 1
    f False n = shiftL n 1

propWord8Bits :: Int -> Bool
propWord8Bits x = y == ofBits (toBits y) where
    y :: Word8
    y = fromIntegral (x `mod` 256)

propWord16Bits :: Int -> Bool
propWord16Bits x = y == ofBits (toBits y) where
    y :: Word16
    y = fromIntegral (x `mod` 256)

propWord32Bits :: Int -> Bool
propWord32Bits x = y == ofBits (toBits y) where
    y :: Word16
    y = fromIntegral (x `mod` 256)

However, from the theoretical viewpoint, I am rather uncertain I understand the picture completely. Suppose there are uninhabited types:

data A
data B

Let's give them some properties:

class TShow a where
  showType :: string

But this is rejected by the compiler as showType does not mention a... Is there a way around? Yes. For example:

data Shown a = Shown String

class TShow a where
  showType :: Shown a

instance TShow A where
  showType = Shown "A"

instance TShow B where
  showType = Shown "B"

But when we want to write a function tshow that accepts a type as an argument, we have to either use showType and pass that argument in the return type (yes, lovely Haskell permits passing information through the return type), or make believe the type is inhabited, but to do it well we need to constrain the type of showType, which is tricky:

tshow :: TShow a => a -> String
tshow a = s where
    shown   = showType
    Shown s = shown                           
    constrain :: a -> Shown a -> ()
    constrain = undefined
    _ = constrain a shown

Then, we can use tshow like this:

tshow (undefined :: A)

Does it makes sense? Hm, not really. Passing the type through the return type makes more sense:

let (Shown x) = (showType :: Shown A) in x

is a better design.. So something like this should do instead:

data MaxBits a = MaxBits Int

maxBits :: (Bits a, Bounded a) => MaxBits a 
maxBits = r m where 
    m  = maxBound
    r :: Bits a => a -> MaxBits a
    r m = MaxBits (bitSize m)

I am still having this cognitive difficulty coaxing GHC to type expressions like maxBits. It does not seem natural. Maybe I am missing something beautiful right under my nose?

Friday, February 27, 2009

hdaemonize on hackage

Having got a green light from Andre, I uploaded hdaemonize to Hackage.

I also cleaned up the library a bit. Reading up on POSIX signals I decided that it is wrong for the daemonize library to mess with them beyond blocking HUP. Also, now it has two functions, daemonize :: IO () -> IO () which does the bare minimum, and serviced :: Program -> IO () that does that and more, for example writing a PID file, handling start/stop/restart, catching and logging exceptions to syslog.

Time to try it out in production.

Thursday, February 26, 2009

ANN: VIDEO SCREENING, A taste of Haskell (tutorial), Sat, Feb 28, 13:00-15:00, UEC, Kiev

To re-post a message from the LtU-Kiev maililng list, there will be a screening of Simon P. Jones' video tutorial A Taste of Haskell together with slides. Free entrance for up to 20 people, RSVP at the LtU-Kiev mailing list. Saturday, Feb 28, 2009. 13:00-15:00. Ukrainian Education Center, Kyiv, Ukraine. More about the venue here (eng).

Sunday, January 11, 2009

Haskell memoization with unsafePerformIO

Searching for Haskell memoization gives a lot of interesting material.. It is summarized at the Haskell wiki (Memoization). But unsafe/impure memoize functions are usually burried deep enough to make them hard to find. By impure I mean functions with this signature:


memoize :: (a -> b) -> (a -> b)

Or at least:


memoize :: Ord a => (a -> b) -> (a -> b)

where Ord allows to use log-n collections for storage, such as maps. Like this:


module Memoize (memoize, memofix) where

import qualified Data.Map as Data.Map
import System.IO.Unsafe
import Control.Concurrent.MVar

memoize :: Ord a => (a -> b) -> (a -> b) 
memoize f =
    unsafePerformIO $
    do cacheRef <- newMVar Data.Map.empty
       return (\x -> unsafePerformIO (g cacheRef x))
    where
      g cacheRef x = 
          do cache <- readMVar cacheRef
             case Data.Map.lookup x cache of
               Just y  -> return y
               Nothing -> 
                   do let y     = f x 
                      let cache = Data.Map.insert x y cache
                      swapMVar cacheRef cache
                      return y

memofix f = let g = memoize (f g) in g

I think the reason such memoize functions are shunned upon is because they run against the spirit of Haskell (use global state), and are usually unsafe in consuming unbounded memory (unless somehow cut off in the implementation). But they are still useful and much easier to use than pure versions.


import Memoize
let fib = memofix f where
  f rec 0 = 1
  f rec 1 = 1
  f rec n = rec (n - 1) + rec (n - 2)