- Subject matter: life, the universe, haskell and everything
- Intended audience: you
- Author: me
- Platform: http://www.steve.org.uk/Software/chronicle/
Testing HsColour. Here you go: fibonacci with iso-recursive types
> data Mu f = Mu {unMu :: f (Mu f)} > type List a = Mu ((,) a) > tl = snd . unMu > hd = fst . unMu > cns = (Mu .) . (,) > ones = Mu (1,ones) > fibs :: List Integer > fibs = cns 1 . cns 1 $ go fibs (tl fibs) > where go (Mu (x,xs)) (Mu (y,ys)) = cns (x+y) $ go xs ys
(NB. planet readers, you can't see the colours)
Irssi is a powerful but sometimes baroque irc client that seems to be the de facto standard nowadays. One of it's downsides is that the default theme, binds and settings are pretty awful.
By default, meta-0..9qwer..io
(the number row and the qwerty top row)
access the first 20 windows. However for most IRC junkies this won't
do, they need some way to access window numbers, say, up to one
hundred. Nike has one
solution for this but
this is what I prefer:
for i in $(seq 0 9); do for j in $(seq 0 9); do
echo "/bind meta-g-$i-$j change_window $i$j"; done; done
This shell snippet generates commands to bind the sequence alt+g
x y
to window 10x+y. This doesn't intrude on the default binds
so I can still access the first 20 windows with just one modified
keypress, but still keeps windows up to 100 nicely in reach.
To handle other shortcomings of irssi I recommend nospambot (sadly offline, it seems) for ignoring noisy bouncing, nickserv for authenticating with NickServ, act for removing noise from the activity. I've also designed two themes for irssi: trax (.theme file) (I know it's kinda tacky, I was young and 1337 back then) and simplicity (.theme file) (Works well on both black-on-white and white-on-black terminals).
I attended a compilers course this spring. The final excercise was to implement from scratch a compiler for what was basically a very restricted subset of C. The compiler was to generate LLVM assembly that could then be optimized and compiled into native binaries with the (very high quality) LLVM tools. I recommend having a look at the various LLVM benchmarks: it's going to be a major player in programming language implementation in the future.
All this however is but a backdrop to what I'm going to write about. I implemented the compiler in Haskell and bumped into very interesting design questions. First of all, I decided to implement the AST (abstract syntax tree, generated by the parser) as a generalized algebraic datatype (GADT) because I had been itching to find some real use for GADTs and ASTs were the canonical example of their use. The following is a condensed version of what the finished compiler uses (preceded by some boilerplate to make this post work as literate haskell):
> {-# LANGUAGE KindSignatures, EmptyDataDecls, MultiParamTypeClasses, > FlexibleInstances, GADTs #-} > > import Control.Monad > import Control.Monad.Fix > import Control.Monad.Error > > data Expr > data Stmt > data Location > > data NoAnn a = NoAnn > type Identifier = String > type Op = String > > data AST (ann :: * -> *) t where > If :: ann Stmt -> AST ann Expr -> AST ann Stmt -> AST ann Stmt > Block :: ann Stmt -> [AST ann Stmt] -> AST ann Stmt > Assign :: ann Stmt -> AST ann Location -> AST ann Expr -> AST ann Stmt > > Var :: ann Location -> Identifier -> AST ann Location > Ref :: ann Expr -> AST ann Location -> AST ann Expr > > Binary :: ann Expr -> Op -> AST ann Expr -> AST ann Expr -> AST ann Expr > NumLit :: ann Expr -> Integer -> AST ann Expr
In addition to this there were declarations, array types and more
control structures, but I'm trying to keep things simple. As you can
see the GADT is parametrised over an annotation type ann
and a
phantom type that indicates the type of the node (statement,
expression, location). The annotation type, however, is of kind
*->*
: it takes the node type as an argument. This is useful because
different node types need to have different annotations, for example
expressions have a type while statements do not. All this could also
be formulated as vanilla Haskell types as follows (with a simple
annotation type for simplicity):
> data Stmt' ann = If' ann (Expr' ann) (Stmt' ann) > | Block' ann [Stmt' ann] > | Assign' ann (Location' ann) (Expr' ann) > > data Location' ann = Var' ann Identifier > > data Expr' ann = Binary' ann Op (Expr' ann) (Expr' ann) > | Ref' ann (Location' ann) > | NumLit' ann Integer
However getting the different annotation types to flow as smoothly would be a bit bothersome. (The primes are added in order to keep this blog post a working .lhs file.)
I had decided to implement the compiler with attribute grammars, a very elegant and powerful formalism for tree traversals (see Why attribute grammars matter for an introduction). There exist attribute grammar systems that generate Haskell code from a description of the grammar (UUAG being the most noteworthy), but I decided to go on with a hand-coded approach. UUAG compiles the attribute grammar into a bunch of haskell functions that mutually recurse to calculate the attributes. This however mixes up all the attributes into one traversal and is very obfuscated. A more straightforward way would be to actually annotate the tree with attribute fields.
More specifically, syntetic attributes (ones which can be computed from a node's childs' attributes) would be stored in the node's annotation field while inherited attributes (ones which propagate from a node down to it's leaves) would be just passed down as arguments to the traversal function. The basic structure of the traversal would be something like
> class Traverse1 a inherited synthetic where > traverse1 :: AST NoAnn a -> inherited -> AST synthetic a > -- and instances for Stmt, Expr and Location
This solution however has the problem that makes the traversal monolithic. I'd like to have the different phases of the compiler (e.g typecheck, code generation) define their own simple traversals and then simply compose these in the driver. One other solution I toyed with was:
> type M = Either String -- for example, actually it was RWST on top of Error > > -- an action that we can mfix over (as long as M is an instance of MonadFix) > type Result a = a -> M a > > data AG a = AG { > block :: M [a Stmt] -> Result (a Stmt), > ifthen :: M (a Expr) -> M (a Stmt) -> Result (a Stmt), > assign :: M (a Location) -> M (a Expr) -> Result (a Stmt), > > var :: Identifier -> Result (a Location), > > ref :: M (a Location) -> Result (a Expr), > binary :: Op -> M (a Expr) -> M (a Expr) -> Result (a Expr) > } > > compose :: AG a -> AG a -> AG a > compose (AG b' i' a' v' r' bin') > (AG b i a v r bin ) = > AG (f1 b' b) (f2 i' i) (f2 a' a) (f1 v' v) (f1 r' r) (f3 bin' bin) > where f1 m n = \x -> m x >=> n x > f2 m n = \x y -> m x y >=> n x y > f3 m n = \x y z -> m x y z >=> n x y z > > class TraverseFix attr t where > traverseFix :: AG attr -> AST NoAnn t -> M (attr t) > > -- All we do in these instances is define the recursion scheme > instance TraverseFix attr Stmt where > traverseFix ag (Block _ ss) = mfix $ (block ag) (mapM (traverseFix ag) ss) > traverseFix ag (If _ e s) = mfix $ > (ifthen ag) (traverseFix ag e) (traverseFix ag s) > traverseFix ag (Assign _ l e) = mfix $ > (assign ag) (traverseFix ag l) (traverseFix ag e) > instance TraverseFix attr Location where > traverseFix ag (Var _ i) = mfix $ (var ag) i > instance TraverseFix attr Expr where > traverseFix ag (Ref _ l) = mfix $ (ref ag) (traverseFix ag l) > traverseFix ag (Binary _ o e1 e2) = mfix $ > (binary ag) o (traverseFix ag e1) (traverseFix ag e2)
This lets one compose multiple attribute grammars (AG
s) and
take their fixpoint over the AST: this makes the order of composition
irrelevant and actually allows for more expressive AGs. Also, it
evades actually storing the intermediate attributes in the tree quite
elegantly. All this is very true to the spirit of attribute grammars
but the requirement of knowing all attributes beforehand violates the
modularity I needed.
Stand by for part 2...
I saw Dante 01 with Kebbish yesterday. The film was mind-blowing, something in the vein of Riddick meeting Muad'dib under the gentle guidance of Kubrick. The script would've made a top-notch scifi novella on it's own but director Marc Caro pushed it even further. Kudos for this. Additionally, the multiple references to classical literature and the detached narration from Perséphone really made my day.
As a downside the narrative flow stumbles somewhat towards the end, resorting to a slight deus ex machina by a very flat character. Also, the film is divided into four segments, labeled "Cercle 1" to "Cercle 4", obviously alluding to Dante's Divine Comedy, but the Divine Comedy relates of nine circles of hell. This and the Space Odyssey 2001-esque ending might point towards truncating the script? My sentiment here is echoed by http://www.imdb.com/title/tt0487928/board/nest/94052306, though I don't agree with the writer's view of the ending. The film is a very compelling whole however, so the non-classical story arc might very well be intentional.
Spoiler warning!
The setting is space station Dante 01 in orbit around a fiery planet named Dante. The station is a mental asylum/prison/test center that houses criminals who have agreed to submit to medical experiments in order to evade a death sentence. The movie begins with a shuttle bringing a new doctor sent by the company running the station and a new inmate (the protagonist, soon christened Saint George by the other inmates).
The movie seems to be an allegory for a new scientific and social awakening of humankind. The medical company running the station injects an experimental DNA-altering nanovirus into the inmates and seemingly expects everybody to die (weapons research?). It also seems that the company is aware of the strange gift of healing that the protagonist has, and is seeking to commercially exploit it. As usual, the good prevail and the new doctor and an opportunistic inmate are the only ones to perish. Even the inmate who dies while trying to save the station from crashing into Dante is resuscitated by the protagonist. This lack of death nicely contrasts the harsh dystopian future and the grim visual language of the film.
In a psychedelic ending Saint George exits the station in a space suit and decomposes into a double helix (DNA anyone?) that rushes towards Dante. The planet turns green and Saint George has "beaten the dragon". Technology has been turned towards general progress instead of commercial gain. The collaboration of the crew and inmates points towards the need for a different kind of society, one that doesn't force human beings into cold robots with the hand of "established procedure".
Updated 15:50: minor changes, grammar
Updated 19:10: more on patchiness
As some irc acquaintances of mine already know, choosing a blog platform turned out pretty difficult for me. Actually, let me retake that, finding a blog platform that didn't suck turned out a veritable odyssey. Now it's been about a week and one national holiday since my struggles so I can hopefully recount my tale w/o resorting to tuomov-style rhetorics.
My ideal platform would be a blog compiler with some cgi to handle comments. I want to keep my posts in a darcs repo and serve static html when it's possible. Markdown support and code syntax hilighting would be bonuses. I'd like to plug in HsColour. Doesn't sound so hard right?
When discussing my plans I received two platform suggestions: blosxom and chronicle. Originally I had nanoblogger in mind but it soon turned out it just wouldn't do. I set up chronicle but started experimenting with other engines. Here's the rundown:
features
problems
Text::Template
features
problems
features
problems
(no, I didn't try it, I have enough experience with this monster already)
problems
So here I am, back using chronicle and not that pleased with it. Ikiwiki would be perfect for the job were it not for the need to write my own plugin (I already have) and some ideological problems. Maybe I'll get around to switching to it some time in the future.
Update 2008-06-23: language
A handy (and quite well-known) trick in haskell is to turn list concatenation (a O(n) operation) into function composition that avoids copying and is O(1):
> let list1 = [1,2,3] in list1 ++ list1 > let list2 = (1:).(2:).(3:) in list2 . list2 $ [] -- ta-dah! no copying
I benchmarked this ages ago when I first heard about the trick, but
morrow
publishing a a
library that
uses this made me rerun and publish my observations. Here's the test
code, the classical quicksort:
> module Main > where > > import Random > import Control.Monad > import Data.List > > qsort1 :: [Int] -> [Int] > qsort1 [] = [] > qsort1 (x:xs) = qsort1 [a | a<-xs, a<x] ++ [x] ++ qsort1 [a | a<-xs, a>=x] > > qsort2 :: [Int] -> [Int] > qsort2 xs = qsort xs [] > where qsort [] = id > qsort (x:xs) = qsort [a | a<-xs, a<x] . (x:) . qsort [a | a<-xs, a>=x] > > main = do > l <- replicateM 500000 $ randomRIO (1,500000) > let l1 = qsort1 l > let l2 = qsort2 l > print (and $ zipWith (==) l1 l2)
Results with ghc -O0
:
% for i in 1 2 3; do time ./qsort +RTS -P -K12000000; done
True
./qsort +RTS -P -K12000000 13.09s user 0.28s system 99% cpu 13.420 total
True
./qsort +RTS -P -K12000000 12.81s user 0.28s system 99% cpu 13.135 total
True
./qsort +RTS -P -K12000000 13.38s user 0.31s system 99% cpu 13.709 total
COST CENTRE MODULE %time %alloc ticks bytes
qsort1 Main 35.4 46.2 109 143631183
qsort2 Main 32.5 34.3 100 106768654
main Main 26.9 17.2 83 53500026
CAF Main 5.2 2.3 16 7001237
and with ghc -O3
:
% for i in 1 2 3; do time ./qsort +RTS -P -K12000000; done
True
./qsort +RTS -P -K12000000 9.40s user 0.30s system 99% cpu 9.720 total
True
./qsort +RTS -P -K12000000 10.58s user 0.40s system 99% cpu 11.021 total
True
./qsort +RTS -P -K12000000 10.10s user 0.36s system 99% cpu 10.501 total
COST CENTRE MODULE %time %alloc ticks bytes
qsort1 Main 35.1 46.8 74 127559904
main Main 33.6 19.1 71 52001182
qsort2 Main 31.3 34.1 66 93115220
So it seems the compositive version is a tad faster and notably less
memory-hungry. Also, ghc optimizes the compositive version more
efficiently, presumably because (.)
enables all sorts of rewrites
(++)
doesn't.
As a sidenote, strangely enough replacing the list comprehensions in the above code with something like
> let (small,big) = partition (<x) xs in qsort small ++ [x] ++ qsort big
ended up in a performance decrease of some 3s with -O3
(didn't test
w/o optimizations)...
Update: As folks on #haskell
pointed out I should mention that
this isn't a real quicksort. Quicksort refers to the exact in-place
partitioning algorithm that the imperative version uses. This is of
course immaterial to the benchmark.
Update 15:01: quicksilver
noted that the profiling overheads for
the two implementations might be different. I ran a quick test that
showed the overheads to be roughly equal: unoptimized qsort2
was
about 3% faster than unoptimized qsort1
both with and without
profiling.
Today's small haskell satori: the least fixpoint operator
> fix f = let x = f x in x
(which seemed like dark magic when I first saw it) could be written a bit more intuitively as
> fix' f = let f' = f . f' in f' undefined
This captures the intuition that fix
just applies a function
iteratively to undefined
(i.e. bottom, _|_
). Actually we could use
any value in the place of undefined
but that would restrict the type
of fix'
. You see, the only reason f'
really needs to be applied to
anything is to appease the typechecker.
Of course the original (Haskell98) definition is more elegant and probably has better sharing characteristics too.
Update 2008-07-11: found some related musings on haskell-cafe. See also the reply
Paycheck came in today so I could finally pay off my library bills. While at it, borrowed some music:
Great stuff, gotta love the libraries in Helsinki!
Update 2008-07-05: Gah, forgot all about Autechre - Chiastic Slide, probably the best in this pile. Not quite up to the aggressive-disassociative madness of Autechre's Untitled, Chiastic Slide presents a calmer and warmer side of the artist w/o compromising the weirdness and smooth execution that are the cornerstone of their music.
I've also found a weird, shy love that hides behind the samples in Everything Ecstatic, but I can't shake the forlorn and melancholic vibes it sends down my spine. Definitely recommended.
Also, added juno.co.uk
links on the albums I could find,
enjoy the free samples!