Phillip Trelford's Array

POKE 36879,255

Basic Tuples & Pattern Matching

Over the last couple of weeks I’ve been building my own parser, interpreter and compiler for Small Basic, a dialect of BASIC with only 14 keywords aimed at beginners. Despite, or perhaps because of, Small Basic’s simplicity some really fun programs have been developed, from games like Tetris and 3D Maze to a parser for the language itself.

Small Basic provides primitive types for numbers, strings and associative arrays. There is no syntax provided for structures, but these can be easily modelled with the associative arrays. For example a 3D point can be constructed with named items or ordinals:

Named items Ordinals
Point["X"] = 1.0
Point["Y"] = 2.0
Point["Z"] = 3.0
Point[0] = 1.0
Point[1] = 2.0
Point[2] = 3.0

In languages like Erlang and Python this could be more concisely expressed as a tuple:

Erlang Python
Point = {1.0, 2.0, 3.0}
point = (1.0, 2.0, 3.0)

In fact sophisticated Erlang programs are built entirely from tuples and lists, there is no explicit class or inheritance syntax in the language. Messages can be easily expressed with tuples and behaviour via pattern matching.

Alan Kay, inventor of the Smalltalk language has said:

The notion of object oriented programming is completely misunderstood. It's not about objects and classes, it's all about messages.

In Erlang a hierarchy of shapes can simply be modelled using tuples with atoms for names:

Circle = { circle, 5.0 }
Square = { square, 7.0 }
Rectangle = { rectangle, 10.0, 5.0 }

The area of a shape can be expressed using pattern matching:

area(Shape) ->
  case Shape of
    { circle, R } -> pi() * R * R;
    { square, W } -> W * W;
    { rect, W, H } -> W * H
  end.

Select Case

The Visual Basic family’s Select Case functionality is quite rich. More so than the switch/case statements of the mainstream C dialects: Java, C# and C++, which only match literals.

In Visual Basic it is already possible to match values with literals, conditions or ranges:

Select Case agerange
  Case Is < 16
    MsgBox("Minor")
  Case 16 To 21
    MsgBox("Still Young")
  Case 50 To 64
    MsgBox("Start Lying")
  Case Is > 65
    MsgBox("Be Yourself") 
  Case Else
    MsgBox("Inbetweeners")
End Select

Given that Select Case in VB is already quite expressive, it feels support for tuples and pattern matching over them would feel quite natural in the language.

Extended Small Basic

To this end I have extended my Small Basic parser and compiler implementation with tuple and pattern matching support.

Tuples

Inspiration for construction and deconstruction was taken from F# and Python:

F# Python
let person = ("Phil", 27)
let (name, age) = person
person = ("Phil", 27)
name, age = person

So that tuples use explicit parentheses in the extended Small Basic implementation:

Person = ("Phil", 27)
(Name, Age) = Person

Internally tuples are represented using Small Basic’s built-in associative arrays.

Pattern Matching

First I implemented VB’s Select Case statements, which is not hugely dissimilar to parsing and compiling Small Basic’s If/ElseIf/Else statements.

Then I extended Select Case to support matching tuples with similar functionality to F#:

F# Extended Small Basic
let xor x y =
  match (x,y) with
  | (1,1) -> 0
  | (1,0) -> 1
  | (0,1) -> 1
  | (0,0) -> 0
Function Xor(a,b)
  Select Case (a,b)
    Case (1,1)
      Xor = 0
    Case (1,0)
      Xor = 1
    Case (0,1)
      Xor = 1
    Case (0,0)
      Xor = 0
  EndSelect
EndFunction

Constructing, deconstructing and matching nested tuples is also supported.

Example

Putting it altogether, FizzBuzz can now be expressed in my extended Small Basic implementation with functions, tuples and pattern matching:

Function Mod(Dividend,Divisor)
  Mod = Dividend
  While Mod >= Divisor
    Mod = Mod - Divisor
  EndWhile
EndFunction

Sub Echo(s)
  TextWindow.WriteLine(s)
EndSub

For A = 1 To 100 ' Iterate from 1 to 100
  Select Case (Mod(A,3),Mod(A,5))
    Case (0,0)
      Echo("FizzBuzz")
    Case (0,_)
      Echo("Fizz")
    Case (_,0)
      Echo("Buzz")
    Case Else
      Echo(A)
  EndSelect
EndFor

Conclusions

Extending Small Basic with first class support for tuples was relatively easy, and I feel quite natural in the language. It provides object orientated programming without the need for a verbose class syntax. I think this is something that would probably work pretty well in other BASIC dialects including Visual Basic.

Source code is available on BitBucket: https://bitbucket.org/ptrelford/smallbasiccompiler

Extending Small Basic with Function Procedures

Microsoft Small Basic is a minimal implementation of the BASIC language aimed at beginners, with only 14 keywords.

A few years back I used Small Basic as an early introduction to programming for my 2 young children (at the time 9 and 5). Small Basic’s library has a nice simple API for drawing shapes and moving a turtle around the screen which the kids loved. However I found the lack of function arguments and return values was quite limiting from a teaching perspective. After creating shapes we wanted to refactor the code so that the drawing procedures could be parameterized, which can only be achieved with global state. Not wanting to get my kids in to bad habits early we swiftly moved on to F#.

Coincidentally Small Basic’s library is a .Net assembly that can be easily consumed from C#, F# and VB.Net, which I’ve used once while introducing C# programming to adult beginners.

Inspired by Small Basic’s simplicity, I’ve also built an open source library along similar lines called SmallSharp which I’ve used on occasion to introduce programming in F#. I feel the ability to start drawing shapes on the screen with just a few commands gives a real sense of excitement and achievement. The Processing language is another good option in this space. In contrast large frameworks like WinForms and WPF, although highly customizable, typically require a lot of work up front before you see any results.

In the previous three articles I’ve described steps to building a Small Basic compiler. First creating an abstract syntax tree (AST) with F# discriminated unions, then parsing with the FParsec parser combinator library and finally transforming to CIL with reflection emit.

In this article I’ll give some details on how I’ve extended the compiler with function procedures, bringing the functionality of Small Basic closer to that of VBScript.

The source code including the function procedure extension is available on BitBucket.

Extending the AST

First off the AST must be extended to support function declarations:

    | Sub of identifier * string list
    | EndSub
    // Language extensions
    | Function of identifier * string list
    | EndFunction

Next a way is needed to call functions:

and invoke =
    | Method of string * string * expr list
    | PropertyGet of string * string
    | Call of string * expr list // Language extension

Extending the Parser

Now the parser needs to recognise the new syntax:

let pparams = 
    between (str_ws "(") (str_ws ")") (sepBy pidentifier_ws (str_ws ","))
let pmethod = 
    pidentifier_ws .>>. opt pparams
    |>> (fun (name,ps) -> name, match ps with Some ps -> ps | None -> [])
let pfunction = 
    str_ws1 "Function" >>. pmethod |>> (fun (name,ps) -> Function(name,ps))
let pendfunction = 
    str_ws "EndFunction" |>> (fun _ -> EndFunction)

The Function keyword expects a method declaration which is composed of an identifier and optional parameters between parentheses.

Calls in expressions are recognized as a single identifier with a list of arguments:

let pcall = pidentifier_ws .>>. ptuple 
            |>> (fun (name,args) -> Call(name, args))


Code Generation

The code generator now needs to generate methods for both Sub and Function statements, with void and Primitive return values respectively. The generated methods may also have named arguments, passed by value. When an identifier is referenced in a statement the generator checks the method arguments in precedence to global fields. Finally return values are taken from the field with the same name as the method, as is the convention with the Visual Basic series of languages.

Sample

Here’s FizzBuzz in Small Basic utilizing the new extension:

' Returns Modulus
Function Modulus(Dividend,Divisor)
  Modulus = Dividend
  While Modulus >= Divisor
    Modulus = Modulus - Divisor
  EndWhile
EndFunction

For A = 1 To 100 ' Iterate from 1 to 100
  Mod3 = Modulus(A,3) ' A % 3
  Mod5 = Modulus(A,5) ' A % 5
  If Mod3 = 0 And Mod5 = 0 Then
    TextWindow.WriteLine("FizzBuzz")
  ElseIf Mod3 = 0 Then
    TextWindow.WriteLine("Fizz")
  ElseIf Mod5 = 0 Then
    TextWindow.WriteLine("Buzz")
  Else
    TextWindow.WriteLine(A)
  EndIf
EndFor

Conclusions

Extending the language touched small parts of the AST, parser and code generation steps. The whole process took only a couple of hours.

With the simple addition of function procedures with arguments and return values, Small Basic starts to approach the functionality of VBScript, and feel more like a grown up language for scripting.

It’s also starting to feel like the Small Basic AST and parser could be easily extended to support any of the Visual Basic family of languages, from VBScript to VBA to VB.Net.

Future

Just as Small Basic currently has an export to VB.Net option, another interesting future direction might be to transform Small Basic programs to JavaScript allowing truly cross platform deployment.

Small Basic Compiler

Small Basic, from Microsoft Dev Labs, is a minimal implementation of BASIC, employing only 14 keywords, aimed at beginners. In the last 2 articles I described building an internal DSL and parser to build an abstract syntax tree (AST) representation of the program with an interpreter for execution. In this article I’ll describe how to transform the AST to executable .Net IL code, thus building a compiler.

Small Basic is an imperative programming language where programs are built up of procedures containing statements. All variables are defined in global scope and are of variant type, as per classic VB

A Small Basic program can be modelled in .Net IL as a static type with a static method entry point for the main procedure and further static methods for subroutines. Variables can be defined as static fields holding variants of Primitive type (from the Small Basic library) . The control flow statements: If, For, While and Goto can be implemented as branches and GoSub as a method call.

Compiling a program involves passing over the statements in the AST and emitting IL code using .Net’s Reflection.Emit API to produce a .Net assembly.

Creating an assembly

The Small Basic program will be compiled to an executable .Net assembly, containing a module and underneath a type to hold the fields and methods.

Defining a dynamic assembly:

let assemblyBuilder =
    AppDomain.CurrentDomain.DefineDynamicAssembly(
        AssemblyName(name),
        AssemblyBuilderAccess.RunAndSave)  

Defining a module:

let moduleBuilder = assemblyBuilder.DefineDynamicModule(name+".exe")

Defining a type:

let typeBuilder = moduleBuilder.DefineType("Program", TypeAttributes.Public)

Saving the assembly to disk:

assemblyBuilder.Save(name+".exe")

Variables

The first pass of the transform finds all the variables that are explicitly set in the Small Basic program and then defines static fields for them.

There are 2 statements in the AST with explicit assignment:

type assign = Set of identifier * expr
/// Small Basic instruction
type instruction =
    | Assign of assign
    | For of assign * expr * expr

The variables can be collected by pattern matching over the instructions to yield the names:

[for instruction in instructions do
    match instruction with
    | Assign(Set(name,_)) -> yield name        
    | For(Set(name,_),_,_) -> yield name
    | _ -> ()
]

Static fields are generated of type Primitive:

let generateField name = 
    typeBuilder.DefineField(name, typeof<Primitive>, FieldAttributes.Static)

Procedures

The next pass finds all subroutines, Sub statements, and defines methods for them:

[for instruction in instructions do
    match instruction with
    | Sub(name) -> yield name
    | _ -> ()
]

Subroutines have no return value so they are defined as void:

let generateMethod name = 
    typeBuilder.DefineMethod(
        name, 
        MethodAttributes.Static ||| MethodAttributes.Public,
        typeof<Void>,
        [||])

Statements

The final pass iterates over the instructions and emits the corresponding IL codes:

let emitInstruction (il:ILGenerator) = function       
    | Assign(Set(name,expr)) ->
        let ty = emitExpression il expr
        il.Emit(OpCodes.Stsfld, fieldLookup name)
    // ... other statement types

In the code above the right hand side expression is evaluated and placed at the top of the stack. Then the Stsfld opcode is used to store the value in the named static field.

Expressions can be composed of literal values, variable lookups, function calls and arithmetic, comparison or logical operators:

let rec emitExpression (il:ILGenerator) = function
    | Literal(x) -> emitLiteral il x
    | Var(name) -> il.Emit(OpCodes.Ldsfld, fieldLookup name)
    | Arithmetic(lhs,Add,rhs) -> emitOp il lhs rhs "op_Addition" 
    | Arithmetic(lhs,Subtract,rhs) -> emitOp il lhs rhs "op_Subtraction"
    // ... other expression types

For literals the value is again put on top of the stack and passed to the Primitive constructor:

let emitPrimitive (il:ILGenerator) t =
    let ci = typeof<Primitive>.GetConstructor([|t|])
    il.Emit(OpCodes.Newobj, ci) 
let emitLiteral (il:ILGenerator) = function
    | Int(n) -> 
        il.Emit(OpCodes.Ldc_I4, n)
        emitPrimitive il typeof<int>
    | Double(n) -> 
        il.Emit(OpCodes.Ldc_R8, n)
        emitPrimitive il typeof<double>
    | String(s) -> 
        il.Emit(OpCodes.Ldstr, s)
        emitPrimitive il typeof<string>
    // ... other literal types

That completes transformation of the AST to IL code.

Source code

The full source code for the Small Basic compiler is available on BitBucket, the compiler is just a few hundred lines of code. The code parses and compiles a simple FizzBuzz program that can be run at the command prompt.

Conclusions

In the previous 2 articles we’ve seen that F#’s discriminated unions let you easily describe the abstract syntax tree (AST) of a programming language. Source code can then be easily transformed into AST form by incrementally building a parser in F# interactive using the FParsec parser combinator library. Finally pattern matching makes light work of passing over the AST to interpret the code or transform it to an executable assembly via .Net’s reflection emit API.

The parser, compiler or interpreter could be used for simple runtime scripting of an application and may be easily extended with new keywords and operators.

The technique described can be applied for a variety of DSLs or general purpose programming languages. Here’s the simple steps:

  1. Define an AST
  2. Write an internal DSL or a parser for an external DSL
  3. Build an interpreter or a compiler to execute programs

Further reading: Peter Sestoft’s Programming Language Concepts book, aimed at undergraduates, gives a great introduction to parser, compiler and interpreter technology, including a mini C like language implementation, with examples provided in F#.