Phil Trelford's Array
POKE 36879, 255

C# Records & Pattern Matching Proposal

August 25, 2014 13:27 by phil

Following on from VB.Net’s new basic pattern matching support, the C# team has recently put forward a proposal for record types and pattern matching in C# which was posted in the Roslyn discussion area on CodePlex:

Pattern matching extensions for C# enable many of the benefits of algebraic data types and pattern matching from functional languages, but in a way that smoothly integrates with the feel of the underlying language. The basic features are: records, which are types whose semantic meaning is described by the shape of the data; and pattern matching, which is a new expression form that enables extremely concise multilevel decomposition of these data types. Elements of this approach are inspired by related features in the programming languages F# and Scala.

There has been a very active discussion on the forum ever since, particularly around syntax.

Background

Algebraic types and pattern matching have been a core language feature in functional-first languages like ML (early 70s), Miranda (mid 80s), Haskell (early 90s) and F# (mid 00s).

I like to think of records as part of a succession of data types in a language:

Name (description), each followed by an F# example:

Scalar (single values):
let width = 1.0
let height = 2.0

Tuple (multiple values):
// Tuple of float * float
let rect = (1.0, 2.0)

Record (multiple named fields):
type Rect = {Width:float; Height:float}
let rect = {Width=1.0; Height=2.0}

Sum type, single case (tagged tuple):
type Rect = Rect of float * float
let rect = Rect(1.0,2.0)

Sum type, named fields (tagged tuple with named fields):
type Rect = Rect of width:float*height:float
let rect = Rect(width=1.0,height=2.0)

Sum type, multi case (union of tagged tuples):
type Shape=
   | Circle of radius:float
   | Rect of width:float * height:float

Note: in F# sum types are also often referred to as discriminated unions or union types, and in functional programming circles algebraic data types tend to refer to tuples, records and sum types.

Thus in the ML family of languages records are like tuples with named fields. That is, where you use a tuple you could equally use a record instead to add clarity, but at the cost of defining a type. C#’s anonymous types fit a similar lightweight data type space, but as there is no type definition their scope is limited (pun intended).

For the most part I find myself pattern matching over tuples and sum types in F# (or in Erlang simply using tuples where the first element is the tag to give a similar effect).

Sum Types

The combination of sum types and pattern matching is for me one of the most compelling features of functional programming languages.

Sum types allow complex data structures to be succinctly modelled in just a few lines of code, for example here’s a concise definition for a generic tree:

type 'a Tree =
    | Tip
    | Node of 'a * 'a Tree * 'a Tree

Using pattern matching the values in a tree can be easily summed:

let rec sumTree tree =
    match tree with
    | Tip -> 0
    | Node(value, left, right) ->
        value + sumTree(left) + sumTree(right)

The technique scales up easily to domain models, for example here’s a concise definition for a retail store:

/// For every product, we store code, name and price
type Product = Product of Code * Name * Price

/// Different options of payment
type TenderType = Cash | Card | Voucher

/// Represents scanned entries at checkout
type LineItem = 
  | Sale of Product * Quantity
  | Cancel of int
  | Tender of Amount * TenderType

Class Hierarchies versus Pattern Matching

In class-based programming languages like C# and Java, classes are the primary data type, where (frequently mutable) data and related methods are intertwined. Hierarchies of related types are typically described via inheritance. Inheritance makes it relatively easy to add new types, but adding new methods or behaviour usually requires visiting the entire hierarchy. That said, the compiler can help here by emitting an error if a required method is not implemented.

Sum types also describe related types, but data is typically separated from functions, where functions employ pattern matching to handle separate cases. This pattern matching based approach makes it easier to add new functions, but adding a new case may require visiting all existing functions. Again the compiler helps here by emitting a warning if a case is not covered.

Another subtle advantage of using sum types is being able to see the behaviour for all cases in a single place, which can be helpful for readability. This may also help when attempting to separate concerns. For example, if we want to add a method that prints to a device to a hierarchy of classes in C#, we could end up adding printer-related dependencies to all related classes. With a sum type, the printer functionality and related dependencies are more naturally encapsulated in a single module.

In F# you have the choice of class-based inheritance or sum types and can choose in-situ. In practice most people appear to use sum types most of the time.

C# Case Classes

The C# proposal starts with a simple “record” type definition:

public record class Cartesian(double x: X, double y: Y);

Which is not too dissimilar to an F# record definition, i.e.:

type Cartesian = { X: double; Y: double }

However, from there it starts to differ quite radically. The C# proposal allows a “record” to inherit from another class, in effect allowing sum types to be defined, i.e.:

abstract class Expr; 
record class X() : Expr; 
record class Const(double Value) : Expr; 
record class Add(Expr Left, Expr Right) : Expr; 
record class Mult(Expr Left, Expr Right) : Expr; 
record class Neg(Expr Value) : Expr;

which allows pattern matching to be performed using an extended switch case statement:

switch (e) 
{ 
  case X(): return Const(1); 
  case Const(*): return Const(0); 
  case Add(var Left, var Right): 
    return Add(Deriv(Left), Deriv(Right)); 
  case Mult(var Left, var Right): 
    return Add(Mult(Deriv(Left), Right), Mult(Left, Deriv(Right))); 
  case Neg(var Value): 
    return Neg(Deriv(Value)); 
}

This is very similar to Scala case classes, in fact change “record” to case, drop semicolons and voilà:

abstract class Term
case class Var(name: String) extends Term
case class Fun(arg: String, body: Term) extends Term
case class App(f: Term, v: Term) extends Term

To sum up, the proposed C# “record” classes appear to be case classes which support both single and multi case sum types.

Language Design

As someone who has to spend some of their time working in C# and who feels more productive having concise types and pattern matching in their toolbox, overall I welcome this proposal.

From my years of experience using F#, I feel it would be nice to see a simple safety feature added to what is in effect a sum type representation, so that sum types can be declared exhaustive. This would allow compile time checks to ensure that all cases have been covered in a switch/case statement, with a warning given otherwise.

Then again, I feel this is quite a radical departure from the style of implementation I’ve seen in C# codebases in the wild, to the point where it’s starting to look like an entirely different language… and so this may be a feature that if it does see the light of day is likely to get more exposure in C# shops working on greenfield projects.


Categories: .Net | C# | F# | Scala | Haskell | Erlang

Orleans

April 3, 2014 10:28 by phil

Microsoft’s Build 2014 conference is currently in full flow, and one of the new products announced is Orleans, an Actor framework with a focus on Azure.

There’s an MSDN blog article with some details; apparently it was used on Halo 4.

Demis Bellot, of ServiceStack fame, tweeted his reaction:

I retweeted, as it wasn’t far off my initial impression and the next thing I know my phone is going crazy with replies and opinions from the .Net community and Microsoft employees. From what I can make out the .Net peeps weren’t overly impressed, and the Microsoft peeps weren’t overly impressed that they weren’t overly impressed.

So what’s the deal?

Actors

Erlang has distributed actors via OTP, the technology behind WhatsApp, which was recently acquired for $19BN!

The JVM has the ever popular Akka which is based heavily on Erlang and OTP.

An industrial strength distributed actor model for .Net should be a good thing. In fact Microsoft are currently also working on another actor framework called ActorFX.

The .Net open source community have projects in the pipeline too including:

There are also in-memory .Net actor implementations in F#’s MailboxProcessor and TPL Dataflow, not to mention the departed Axum and Retlang projects.

Orleans

From what I can tell, Orleans appears to be focused on Azure, making use of its proprietary APIs, so there's probably still a big space for the community's open source projects to fill.

Like Demis I’m not a huge fan of WCF XML configuration and code generation. From the Orleans MSDN post, XML and code-gen seem to be front and centre.

You write an interface, derive from an interface, add attributes and then implement methods, which must return Task<T>. Then you do some XML configuration and Orleans does some code generation magic for hydration/dehydration of your objects (called grains).

Smells like WCF; that is, methods are king, although clearly I could be reading it wrong.

From my limited experience with actors in F# and Erlang, messages and message passing are king, with pattern matching baked into the languages to make things seamless.

Initial impressions are that Orleans is a long way from Erlang Kansas…

The Good Parts

Building a fault-tolerant enterprise distributed actor model for .Net is significant, and could keep people on the platform where they may have turned with Erik Meijer to the JVM, Scala and Akka otherwise.

Putting async front and centre is also significant as it simplifies the programming model.

C# 5’s async is based on F#’s asynchronous workflows, which were originally developed to support actors via the MailboxProcessor.

Coroutines

Underneath, Erlang’s processes, F#’s async workflows and C#’s async/await are simply implementations of coroutines.

Coroutines are subroutines that allow multiple entry points for suspending and resuming execution at certain locations. They’ve been used in video games for as long as I can remember (which only goes back as far as the 80s).

Coroutines help make light work of implementing state machines and workflows.
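
As a small sketch of that last point, here is a coin-operated turnstile written as a Python generator, which is one of the simpler forms of coroutine. The turnstile function and its event names are made up for illustration. Each yield is a suspension point, and the current state is simply where the code is paused.

```python
def turnstile():
    """A two-state machine: locked until a coin, unlocked until a push."""
    while True:
        # Locked: wait for a coin, ignoring pushes.
        event = yield "locked"
        while event != "coin":
            event = yield "locked"
        # Unlocked: wait for a push, ignoring further coins.
        event = yield "unlocked"
        while event != "push":
            event = yield "unlocked"


machine = turnstile()
next(machine)  # run to the first suspension point (state: locked)

# Each send() resumes the coroutine with an event and returns the new state.
trace = [machine.send(event) for event in ["push", "coin", "coin", "push"]]
print(trace)  # ['locked', 'unlocked', 'unlocked', 'locked']
```

There is no explicit state variable or transition table; the control flow is the state machine.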

Methods

In Erlang messages are typically described as named tuples (an atom is used as the name), and in F# discriminated unions are typically employed.

Orleans appears to use methods as the message type, where the method name is analogous to an Erlang atom name, or an F# union case name and the parameters are analogous to a tuple. So far so good.

Where they differ is that return values are first-class for methods, and return values feel more like an RPC approach. In fact this is the example given in the article:

public class HelloGrain : Orleans.GrainBase, IHello
{
  Task<string> IHello.SayHelloAsync(string greeting)
  {
    return Task.FromResult("You said: '" + greeting + "', I say: Hello!");
  }
}

Also current wisdom for C# async is to avoid async void... which is why I guess they’ve plumped for Task as the convention for methods with no return value.

Messages

.Net’s built-in binary serialization is bottom of the league for size and performance, so hopefully alternative serialization libraries like Google Protocol Buffers will be supported.

Judge for yourself

But these are just my initial impressions; try out the samples and judge for yourself.


Categories: .Net | C# | F# | Erlang | Scala | Architecture

Basic Tuples & Pattern Matching

January 18, 2014 06:42 by phil

Over the last couple of weeks I’ve been building my own parser, interpreter and compiler for Small Basic, a dialect of BASIC with only 14 keywords aimed at beginners. Despite, or perhaps because of, Small Basic’s simplicity some really fun programs have been developed, from games like Tetris and 3D Maze to a parser for the language itself.

Small Basic provides primitive types for numbers, strings and associative arrays. There is no syntax provided for structures, but these can be easily modelled with the associative arrays. For example a 3D point can be constructed with named items or ordinals:

Named items:
Point["X"] = 1.0
Point["Y"] = 2.0
Point["Z"] = 3.0

Ordinals:
Point[0] = 1.0
Point[1] = 2.0
Point[2] = 3.0

In languages like Erlang and Python this could be more concisely expressed as a tuple:

Erlang:
Point = {1.0, 2.0, 3.0}

Python:
point = (1.0, 2.0, 3.0)

In fact sophisticated Erlang programs are built entirely from tuples and lists; there is no explicit class or inheritance syntax in the language. Messages can be easily expressed with tuples and behaviour via pattern matching.

Alan Kay, inventor of the Smalltalk language, has said:

The notion of object oriented programming is completely misunderstood. It's not about objects and classes, it's all about messages.

In Erlang a hierarchy of shapes can simply be modelled using tuples with atoms for names:

Circle = { circle, 5.0 }
Square = { square, 7.0 }
Rectangle = { rectangle, 10.0, 5.0 }

The area of a shape can be expressed using pattern matching:

area(Shape) ->
  case Shape of
    { circle, R } -> math:pi() * R * R;
    { square, W } -> W * W;
    { rectangle, W, H } -> W * H
  end.

Select Case

The Visual Basic family’s Select Case functionality is quite rich, more so than the switch/case statements of the mainstream C dialects (Java, C# and C++), which only match literals.

In Visual Basic it is already possible to match values with literals, conditions or ranges:

Select Case agerange
  Case Is < 16
    MsgBox("Minor")
  Case 16 To 21
    MsgBox("Still Young")
  Case 50 To 64
    MsgBox("Start Lying")
  Case Is > 65
    MsgBox("Be Yourself") 
  Case Else
    MsgBox("Inbetweeners")
End Select

Given that Select Case in VB is already quite expressive, I feel that support for tuples and pattern matching over them would be quite natural in the language.

Extended Small Basic

To this end I have extended my Small Basic parser and compiler implementation with tuple and pattern matching support.

Tuples

Inspiration for construction and deconstruction was taken from F# and Python:

F#:
let person = ("Phil", 27)
let (name, age) = person

Python:
person = ("Phil", 27)
name, age = person

So tuples use explicit parentheses in the extended Small Basic implementation:

Person = ("Phil", 27)
(Name, Age) = Person

Internally tuples are represented using Small Basic’s built-in associative arrays.
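
One way to picture this, sketched in Python with a dict standing in for an associative array. The key scheme here is purely an assumption for illustration (ordinal keys starting at 0, as in the Point example earlier), and to_assoc and from_assoc are hypothetical helper names; the compiler's actual encoding may well differ.

```python
def to_assoc(value):
    """Encode a (possibly nested) tuple as a dict keyed by ordinal position."""
    if isinstance(value, tuple):
        return {index: to_assoc(item) for index, item in enumerate(value)}
    return value


def from_assoc(value):
    """Decode a dict produced by to_assoc back into a tuple."""
    if isinstance(value, dict):
        return tuple(from_assoc(value[index]) for index in range(len(value)))
    return value


person = ("Phil", 27)
encoded = to_assoc(person)
print(encoded)  # {0: 'Phil', 1: 27}
```

Nested tuples simply become nested associative arrays under this scheme.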

Pattern Matching

First I implemented VB’s Select Case statements, which are not hugely dissimilar to Small Basic’s If/ElseIf/Else statements when it comes to parsing and compiling.

Then I extended Select Case to support matching tuples with similar functionality to F#:

F#:
let xor x y =
  match (x,y) with
  | (1,1) -> 0
  | (1,0) -> 1
  | (0,1) -> 1
  | (0,0) -> 0

Extended Small Basic:
Function Xor(a,b)
  Select Case (a,b)
    Case (1,1)
      Xor = 0
    Case (1,0)
      Xor = 1
    Case (0,1)
      Xor = 1
    Case (0,0)
      Xor = 0
  EndSelect
EndFunction

Constructing, deconstructing and matching nested tuples is also supported.

Example

Putting it all together, FizzBuzz can now be expressed in my extended Small Basic implementation with functions, tuples and pattern matching:

Function Mod(Dividend,Divisor)
  Mod = Dividend
  While Mod >= Divisor
    Mod = Mod - Divisor
  EndWhile
EndFunction

Sub Echo(s)
  TextWindow.WriteLine(s)
EndSub

For A = 1 To 100 ' Iterate from 1 to 100
  Select Case (Mod(A,3),Mod(A,5))
    Case (0,0)
      Echo("FizzBuzz")
    Case (0,_)
      Echo("Fizz")
    Case (_,0)
      Echo("Buzz")
    Case Else
      Echo(A)
  EndSelect
EndFor

Conclusions

Extending Small Basic with first class support for tuples was relatively easy, and I feel it sits quite naturally in the language. It provides object orientated programming without the need for a verbose class syntax. I think this is something that would probably work pretty well in other BASIC dialects, including Visual Basic.

Source code is available on BitBucket: https://bitbucket.org/ptrelford/smallbasiccompiler


Categories: F# | Erlang | Python