Phillip Trelford's Array

POKE 36879,255

C# Records & Pattern Matching Proposal

Following on from VB.Net’s new basic pattern matching support, the C# team has recently put forward a proposal for record types and pattern matching in C# which was posted in the Roslyn discussion area on CodePlex, quote:

Pattern matching extensions for C# enable many of the benefits of algebraic data types and pattern matching from functional languages, but in a way that smoothly integrates with the feel of the underlying language. The basic features are: records, which are types whose semantic meaning is described by the shape of the data; and pattern matching, which is a new expression form that enables extremely concise multilevel decomposition of these data types. Elements of this approach are inspired by related features in the programming languages F# and Scala.

There has been a very active discussion on the forum ever since, particularly around syntax.

UPDATE: the discussion appears to have dried up in mid-September 2014 and the pattern matching in C# proposal doesn’t appear to have progressed,

Background

Algebraic types and pattern matching have been a core language feature in functional-first languages like ML (early 70s), Miranda (mid 80s), Haskell (early 90s) and F# (mid 00s).

I like to think of records as part of a succession of data types in a language:

Name Example (F#) Description
Scalar
let width = 1.0
let height = 2.0
Single values
Tuple
// Tuple of float * float
let rect = (1.0, 2.0)
Multiple values
Record
type Rect = {Width:float; Height:float}
let rect = {Width=1.0; Height=2.0}
Multiple named fields
Sum type(single case)
type Rect = Rect of float * float
let rect = Rect(1.0,2.0)
Tagged tuple
Sum type(named fields)
type Rect = Rect of width:float*height:float
let rect = Rect(width=1.0,height=2.0)
Tagged tuple with named fields
Sum type(multi case)
type Shape=
   | Circle of radius:float
   | Rect of width:float * height:float
Union of tagged tuples

Note: in F# sum types are also often referred to as discriminated unions or union types, and in functional programming circles algebraic data types tend to refer to tuples, records and sum types.

Thus in the ML family of languages records are like tuples with named fields. That is, where you use a tuple you could equally use a record instead to add clarity, but at the cost of defining a type. C#’s anonymous types fit a similar lightweight data type space, but as there is no type definition their scope is limited (pun intended).

For the most part I find myself pattern matching over tuples and sum types in F# (or in Erlang simply using tuples where the first element is the tag to give a similar effect).

Sum Types

The combination of sum types and pattern matching is for me one of the most compelling features of functional programming languages.

Sum types allow complex data structures to be succinctly modelled in just a few lines of code, for example here’s a concise definition for a generic tree:

type 'a Tree =
    | Tip
    | Node of 'a * 'a Tree * 'a Tree

Using pattern matching the values in a tree can be easily summed:

let rec sumTree tree =
    match tree with
    | Tip -> 0
    | Node(value, left, right) ->
        value + sumTree(left) + sumTree(right)

The technique scales up easily to domain models, for example here’s a concise definition for a retail store:

/// For every product, we store code, name and price
type Product = Product of Code * Name * Price

/// Different options of payment
type TenderType = Cash | Card | Voucher

/// Represents scanned entries at checkout
type LineItem = 
  | Sale of Product * Quantity
  | Cancel of int
  | Tender of Amount * TenderType

Class Hierarchies versus Pattern Matching

In class-based programming languages like C# and Java, classes are the primary data type  where (frequently mutable) data and related methods are intertwined. Hierarchies of related types are typically described via inheritance. Inheritance makes it relatively easy to add new types, but adding new methods or behaviour usually requires visiting the entire hierarchy. That said the compiler can help here by emitting an error if a required method is not implemented.

Sum types also describe related types, but data is typically separated from functions, where functions employ pattern matching to handle separate cases. This pattern matching based approach makes it easier to add new functions, but adding a new case may require visiting all existing functions. Again the compiler helps here by emitting a warning if a case is not covered.

Another subtle advantage of using sum types is being able to see the behaviour for all cases in a single place, which can be helpful for readability. This may also help when attempting to separate concerns, for example if we want to add a method to print to a device to a hierarchy of classes in C# we could end up adding printer related dependencies to all related classes. With a sum type the printer functionality and related dependencies are more naturally encapsulated in a single module

In F# you have the choice of class-based inheritance or sum types and can choose in-situ. In practice most people appear to use sum types most of the time.

C# Case Classes

The C# proposal starts with a simple “record” type definition:

public record class Cartesian(double x: X, double y: Y);

Which is not too dissimilar to an F# record definition, i.e.:

type Cartesian = { X: double, Y: double }

However from there it then starts to differ quite radically. The C# proposal allows a “record” to inherit from another class, in effect allowing sum types to be defined, i.e:

abstract class Expr; 
record class X() : Expr; 
record class Const(double Value) : Expr; 
record class Add(Expr Left, Expr Right) : Expr; 
record class Mult(Expr Left, Expr Right) : Expr; 
record class Neg(Expr Value) : Expr;

which allows pattern matching to be performed using an extended switch case statement:

switch (e) 
{ 
  case X(): return Const(1); 
  case Const(*): return Const(0); 
  case Add(var Left, var Right): 
    return Add(Deriv(Left), Deriv(Right)); 
  case Mult(var Left, var Right): 
    return Add(Mult(Deriv(Left), Right), Mult(Left, Deriv(Right))); 
  case Neg(var Value): 
    return Neg(Deriv(Value)); 
}

This is very similar to Scala case classes, in fact change “record” to case, drop semicolons and voilà:

abstract class Term
case class Var(name: String) extends Term
case class Fun(arg: String, body: Term) extends Term
case class App(f: Term, v: Term) extends Term

To sum up, the proposed C# “record” classes appear to be case classes which support both single and multi case sum types.

Language Design

As someone who has to spend some of their time working in C# and who feels more productive having concise types and pattern matching in their toolbox, overall I welcome our new overlords this proposal.

From my years of experience using F#, I feel it would be nice to see a simple safety feature included, to what is in effect a sum type representation, so that sum types can be exhaustive. This would allow compile time checks to ensure that all cases have been covered in a switch/case statement, and a warning given otherwise.

Then again, I feel this is quite a radical departure from the style of implementation I’ve seen in C# codebases in the wild, to the point where it’s starting to look like an entirely different language… and so this may be a feature that if it does see the light of day is likely to get more exposure in C# shops working on greenfield projects.

Running TAP

The Test Anything Protocol (TAP) is a text-based protocol for test results:

 1..4
 ok 1 - Input file opened
 not ok 2 - First line of the input valid
 ok 3 - Read the rest of the file
 not ok 4 - Summarized correctly # TODO Not written yet

 

I think the idea is an good one, a simple cross-platform human readable standard for formatting test results. There are TAP producers and consumers for Perl, Java, JavaScript etc. allowing you to join up tests for cross-platform projects.

NUnit runner

Over the last week or so I’ve created a TAP runner F# script for NUnit test methods. It supports the majority of NUnit’s attributes including ExpectedException, TimeOut and test generation with TestCase, TestCaseSource, Values, etc., .

The runner can be used in a console app to produce TAP output to the console or directly in F# interactive for running tests embedded in a script.

TAP run

Tests can be organized in classes:

type TAPExample () =
    [<Test>] member __.``input file opened`` () = Assert.Pass()
    [<Test>] member __.``First line of the input valid`` () = Assert.Fail()
    [<Test>] member __.``Read the rest of the file`` () = Assert.Pass()
    [<Test>] member __.``Summarized correctly`` () = Assert.Fail()

Tap.Run typeof<TAPExample>

or in modules:

let [<Test>] ``input file opened`` () = Assert.Pass()
let [<Test>] ``First line of the input valid`` () = Assert.Fail()
let [<Test>] ``Read the rest of the file`` () = Assert.Pass()
let [<Test>] ``Summarized correctly`` () = Assert.Fail()
type Marker = interface end
Tap.Run typeof<Marker>.DeclaringType

Note: the marker interface is used to reflect the module’s type.

Console output

In the console test cases are marked in red or green:

image


Debugging

If you create an F# Tutorial project you get both, an F# script file that runs as a console application allowing you to set breakpoints in your script with the Visual Studio debugger.

image


Prototyping

When I’m prototyping a new feature I typically use the F# interactive environment for quick feedback and to do exploratory testing. The TAP runner lets you create and run NUnit  formatted tests directly in the script file before later promoting to a full fat project for use in a continuous build environment.

F# Scripting

Interested in learning more about F# scripting, pop along to the Phil Nash’s talk at the F#unctional Londoners tonight.

Ocean Revival

This weekend I had the pleasure of sitting on the Ocean Q&A panel at Revival 2014 in Wolverhampton. I worked at Ocean in Manchester in my early 20s on titles like Jurassic Park (PC & Amiga) and Addams Family Values (SNES & Megadrive). It was fun to reminisce about the good old days with other former Ocean employees and people who’d played the games.

Ocean panel

The panel closely coincided with the public release of The History of Ocean Software book by Chris Wilkins which was funded as a Kickstarter:

Ocean the history

There were plenty of old games to play at the event too. I particularly enjoyed Rez on a PS2, Omega Race on a Vic-20 and a Flappy Bird clone on a Commodore 64.

Flappy Bird C64

When we got home, my 7yo and I pulled the Vic-20 out of the garage, and played some more Omega Race with the joystick we’d just picked up:



My 7yo has been picking up Python recently, with a Coding Club - Python Basics book.

One of the tasks is to print out the 5 times table:

number = 1
while i <= 12:
    print(number,"x 5 =",number*5)
    number = number + 1

Funnily enough Vic-20 Basic (circa 1981) was easily up to the challenge too:

5 times table

And good old FizzBuzz was no bother either:

FizzBuzz Vic-20

Then my son had a go at Magic 8-ball, but sadly lost the code he’d spent a while typing in when it is closed, so we re-wrote it again in F# so there was less to type:

let answers =[
   "Go for it!"
   "No way Jose!"
   "I'm not sure. Ask me again."
   "Fear of the unknown is what imprisons us."
   "It would be madness to do that!"
   "Only you can save mankind!"
   "Makes no difference to me, do or don't - whatever."
   "Yes, I think on balance that is the right choice"
   ]

printfn "Welcome to Magic 8-Ball."

printfn "Ask me for advice and then press enter to shake me"
System.Console.ReadLine() |> ignore

for i = 1 to 4 do printfn "Shaking..."

let rand = System.Random()
let choice = rand.Next(answers.Length)

printfn "%s" (answers.[choice])

Why not dig out your old computers and have some programming and games fun! :)