Posted on Sat 12 May 2012

Scalisp - now with Compilation

It's been a while since my last update - I've been quite busy with course work. Part of that was a complete write of Scalisp - now it's truely List-based and it also includes a Compiler. Since it grew quite a bit, I split it up in several modules, which should be self-explaining:

1
2
3
4
5
6
7
8
+--------+   +--------------+    +----------+
| Parser +-->| Preprocessor +--->| Compiler |
+--------+   +------+-------+    +----------+
                   |
                   v
            +--------------+
            | Interpreter  |
            +--------------+

The Parser is very simple and quite short now, since it doesn't care about special forms anymore, it just makes sure that each expression has correct syntax:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
// grammar
def program: Parser[List[Any]] = rep(exp)
def list: Parser[List[Any]] = "("~>rep(exp)<~")"
def exp: Parser[Any] = ( 
    real
  | hexInteger
  | integer
  | quote
  | literal
  | list
  | replacement
  | token
  )
def integer: Parser[Long] = wholeNumber ^^ (n => n.toLong)
def real: Parser[Double] = ( """\-?\d+\.\d*([eE]\-?\d+)?""".r ^^ (d => d.toDouble)
  | """\-?\d+[eE]\-?\d+""".r ^^ (d => d.toDouble) )
def hexInteger: Parser[Long] = """\-?0x[\da-fA-F]+""".r ^^ (n => java.lang.Long.parseLong(n.substring(2), 16))
def token: Parser[String] = """[^() ]+""".r ^^ (n => n.toString)
def literal: Parser[Literal] = stringLiteral ^^ (l => Literal(l.tail.init))
def quote = "'"~>exp ^^ (e => List("quote", e))
def replacement = ","~>token ^^ (t => Replace(t))

And that's it. The Preprocessor just expands macros, so the Interpreter and Compiler don't have to know about them. Most of the grunt work is now done in the Interpreter and the Compiler, depending on what you want. The interpreter just evals the lists, using nested dictionaries as environments.

I think the Compiler is the most interesting, because instead of compiling to byte- or machinecode, I decided to compile to Scala code instead. Therefore, if you tell it to compile, say, this Lisp code:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
; note that you can also use comments
; and split functions over multiple lines for readability
(define msort 
  (lambda (list) 
    (if (<= (length list) 1) 
      list 
      (begin 
        (define split (/ (length list) 2)) 
        (merge 
          (msort (subseq list 0 split)) 
          (msort (subseq list split)) 
        ) 
      ) 
    ) 
  )
)

(define merge
  (lambda (a b)
    (if (< (length a) 1)
      b
      (if (< (length b) 1)
        a
        (if (< (car a) (car b))
          (cons (car a) (merge (cdr a) b))
          (cons (car b) (merge a (cdr b)))
        )
      )
    )
  )
)

Scalisp will happily spit out the corresponding Scala code:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
def merge(a: Any, b: Any): Any = {
  if(length(a) < 1l) {
    b
  } else {
    if(length(b) < 1l) {
      a
    } else {
      if(car(a) < car(b)) {
        car(a) :: merge(cdr(a), b)
      } else {
        car(b) :: merge(a, cdr(b))
      }
    }
  }
}

def msort(list: Any): Any = {
  if(length(list) <= 1l) {
    list
  } else {
    {
      var split = (length(list) / 2l)
      merge(msort(subseq(list, 0l, split)), msort(subseq(list, split)))
    }
  }
}

As I previsouly mentioned, both the Compiler and the Interpreter support macros, as those are handled by the Preprocessor. You can use them to introduce your own "special forms" - suppose you want an unless statement. Without macros, you'd write a function like this:

1
2
(defun unless-bad (cond exp)
  (if cond unit exp))

The problem, of course, is that both arguments will be evaluated before they are passed to the function - while this may not matter for simple expressions, it's a big problem for recursive function calls and functions with side effects! If you try it out, you see the problem immediately:

1
(unless-bad (= 5 5) (print "executed branch"))

will print executed branch even though it shouldn't do anything at all. Macros help solve this problem (the strange commas , tell the Preprocessor what he should replace):

1
2
3
(defmacro unless (COND EXP)
  (if ,COND unit ,EXP))
(unless (= 5 5) (print "executed branch"))

Now, nothing is printed, because the macro is expanded before interpretation / compilation takes place, so what the interpreter really sees is this:

1
(if (= 5 5) unit (print "executed branch"))

Which works correctly now, since if is a special form which only evaluates its arguments if necessary.

If you want to try it for yourself, you can download a precompiled .jar from the [cached]GitHub repo, or you can just clone it and use sbt to run the code.

Tags: programming, university

© Julian Schrittwieser. Built using Pelican. Theme by Giulio Fidente on github. .