Phil Trelford's Array
POKE 36879, 255

C# Records & Pattern Matching Proposal

August 25, 2014 13:27 by phil

Following on from VB.Net’s new basic pattern matching support, the C# team has recently put forward a proposal for record types and pattern matching in C# which was posted in the Roslyn discussion area on CodePlex:

Pattern matching extensions for C# enable many of the benefits of algebraic data types and pattern matching from functional languages, but in a way that smoothly integrates with the feel of the underlying language. The basic features are: records, which are types whose semantic meaning is described by the shape of the data; and pattern matching, which is a new expression form that enables extremely concise multilevel decomposition of these data types. Elements of this approach are inspired by related features in the programming languages F# and Scala.

There has been a very active discussion on the forum ever since, particularly around syntax.

Background

Algebraic types and pattern matching have been a core language feature in functional-first languages like ML (early 70s), Miranda (mid 80s), Haskell (early 90s) and F# (mid 00s).

I like to think of records as part of a succession of data types in a language:

Name Example (F#) Description
Scalar
let width = 1.0
let height = 2.0
Single values
Tuple
// Tuple of float * float
let rect = (1.0, 2.0)
Multiple values
Record
type Rect = {Width:float; Height:float}
let rect = {Width=1.0; Height=2.0}
Multiple named fields
Sum type(single case)
type Rect = Rect of float * float
let rect = Rect(1.0,2.0)
Tagged tuple
Sum type(named fields)
type Rect = Rect of width:float*height:float
let rect = Rect(width=1.0,height=2.0)
Tagged tuple with named fields
Sum type(multi case)
type Shape=
   | Circle of radius:float
   | Rect of width:float * height:float
Union of tagged tuples

Note: in F# sum types are also often referred to as discriminated unions or union types, and in functional programming circles algebraic data types tend to refer to tuples, records and sum types.

Thus in the ML family of languages records are like tuples with named fields. That is, where you use a tuple you could equally use a record instead to add clarity, but at the cost of defining a type. C#’s anonymous types fit a similar lightweight data type space, but as there is no type definition their scope is limited (pun intended).

For the most part I find myself pattern matching over tuples and sum types in F# (or in Erlang simply using tuples where the first element is the tag to give a similar effect).

Sum Types

The combination of sum types and pattern matching is for me one of the most compelling features of functional programming languages.

Sum types allow complex data structures to be succinctly modelled in just a few lines of code, for example here’s a concise definition for a generic tree:

type 'a Tree =
    | Tip
    | Node of 'a * 'a Tree * 'a Tree

Using pattern matching the values in a tree can be easily summed:

let rec sumTree tree =
    match tree with
    | Tip -> 0
    | Node(value, left, right) ->
        value + sumTree(left) + sumTree(right)

The technique scales up easily to domain models, for example here’s a concise definition for a retail store:

/// For every product, we store code, name and price
type Product = Product of Code * Name * Price

/// Different options of payment
type TenderType = Cash | Card | Voucher

/// Represents scanned entries at checkout
type LineItem = 
  | Sale of Product * Quantity
  | Cancel of int
  | Tender of Amount * TenderType

Class Hierarchies versus Pattern Matching

In class-based programming languages like C# and Java, classes are the primary data type  where (frequently mutable) data and related methods are intertwined. Hierarchies of related types are typically described via inheritance. Inheritance makes it relatively easy to add new types, but adding new methods or behaviour usually requires visiting the entire hierarchy. That said the compiler can help here by emitting an error if a required method is not implemented.

Sum types also describe related types, but data is typically separated from functions, where functions employ pattern matching to handle separate cases. This pattern matching based approach makes it easier to add new functions, but adding a new case may require visiting all existing functions. Again the compiler helps here by emitting a warning if a case is not covered.

Another subtle advantage of using sum types is being able to see the behaviour for all cases in a single place, which can be helpful for readability. This may also help when attempting to separate concerns, for example if we want to add a method to print to a device to a hierarchy of classes in C# we could end up adding printer related dependencies to all related classes. With a sum type the printer functionality and related dependencies are more naturally encapsulated in a single module

In F# you have the choice of class-based inheritance or sum types and can choose in-situ. In practice most people appear to use sum types most of the time.

C# Case Classes

The C# proposal starts with a simple “record” type definition:

public record class Cartesian(double x: X, double y: Y);

Which is not too dissimilar to an F# record definition, i.e.:

type Cartesian = { X: double, Y: double }

However from there it then starts to differ quite radically. The C# proposal allows a “record” to inherit from another class, in effect allowing sum types to be defined, i.e:

abstract class Expr; 
record class X() : Expr; 
record class Const(double Value) : Expr; 
record class Add(Expr Left, Expr Right) : Expr; 
record class Mult(Expr Left, Expr Right) : Expr; 
record class Neg(Expr Value) : Expr;

which allows pattern matching to be performed using an extended switch case statement:

switch (e) 
{ 
  case X(): return Const(1); 
  case Const(*): return Const(0); 
  case Add(var Left, var Right): 
    return Add(Deriv(Left), Deriv(Right)); 
  case Mult(var Left, var Right): 
    return Add(Mult(Deriv(Left), Right), Mult(Left, Deriv(Right))); 
  case Neg(var Value): 
    return Neg(Deriv(Value)); 
}

This is very similar to Scala case classes, in fact change “record” to case, drop semicolons and voilà:

abstract class Term
case class Var(name: String) extends Term
case class Fun(arg: String, body: Term) extends Term
case class App(f: Term, v: Term) extends Term

To sum up, the proposed C# “record” classes appear to be case classes which support both single and multi case sum types.

Language Design

As someone who has to spend some of their time working in C# and who feels more productive having concise types and pattern matching in their toolbox, overall I welcome our new overlords this proposal.

From my years of experience using F#, I feel it would be nice to see a simple safety feature included, to what is in effect a sum type representation, so that sum types can be exhaustive. This would allow compile time checks to ensure that all cases have been covered in a switch/case statement, and a warning given otherwise.

Then again, I feel this is quite a radical departure from the style of implementation I’ve seen in C# codebases in the wild, to the point where it’s starting to look like an entirely different language… and so this may be a feature that if it does see the light of day is likely to get more exposure in C# shops working on greenfield projects.


Tags:
Categories: .Net | C# | F# | Scala | Haskell | Erlang
Actions: E-mail | Permalink | Comments (4) | Comment RSSRSS comment feed

Code Golf

August 2, 2014 11:22 by phil

Last month Grant Crofton popped down from Leeds to the F#unctional Londoners meetup at Skills Matter to run a fun hands on code golf session. The idea of code golf is to implement a specific algorithm in the fewest possible characters. This is not the kind of thing you should be doing in enterprise code, but it is fun, and an interesting way of exploring features in a programming language.

On the night we attempted condensed versions of FizzBuzz and 99 bottles of beer, with Ross and I winning the first challenge and Simon & Adrian the second.

FizzBuzz Score99 Bottles Score

Thanks again to Grant for a really fun session.

F# FizzBuzz

A while back I strived to squeeze an F# implementation of FizzBuzz into a tweet, and with white space removed it weighs in at 104 characters (excluding line breaks):

for n=1 to 100 do 
 printfn"%s"
  (match n%3,n%5 with 0,0->"FizzBuzz"|0,_->"Fizz"|_,0->"Buzz"|_,_->string n)

The implementation, although quite clear, requires a fair number of characters for the pattern matching portion.

After some head scratching we came up with the idea of using a lookup to select the string to display:

N %3 N % 5 Index Output
0 0 0 N
>0 0 1 “Fizz”
0 >0 2 “Buzz”
>0 >0 3 “FizzBuzz”

This took the implementation down to 89 characters (without line breaks):

for i=1 to 100 do
 printfn"%s"["FizzBuzz";"Buzz";"Fizz";string i].[sign(i%5)*2+sign(i%3)]

Another trick is to abuse the sign function, to get 1 if the result of the modulus is above 0 and 0 otherwise.

The lookup trick can be used in other languages, and here’s a few examples, just for fun.

VB.Net FizzBuzz

VB.Net has a reputation for being a little verbose, but using the lookup trick it was possible to it get down to 96 characters (excluding line breaks):

For i=1 To 100:
Console.WriteLine({i,"Fizz","Buzz","FizzBuzz"}(-(i Mod 3=0)-(i Mod 5=0)*2))
:Next

In VB.Net true values translate to –1 and false to 0. This allowed me to simply negate the result of i % N = 0 to compute an index.

Python FizzBuzz

Using a similar trick in Python, where true translates to 1 and 0 to false, I was able to get to a very respectable 79 characters (excluding line breaks):

for x in range(1,101):
 print([x,"Fizz","Buzz","FizzBuzz"][(x%3==0)+2*(x%5==0)])

Python’s simple print function also helped to keep the character count down.

Amanda FizzBuzz

Amanda is a variant of David Turner’s quintessential functional programming language Miranda. Amanda runs on Windows, and is used for teaching FP at some universities.

Using a list comprehension it was possible to squeeze in to a mere 67 characters:

[["FizzBuzz","Buzz","Fizz",itoa x]!(min[1,x%3]+min[1,x%5]*2)|x<-[1..100]]

Note: this is cheating a little as we are not explicitly writing to the console.

APL FizzBuzz

APL is a very mature language, dating back to the 1960s, and is still used commercially today. It also wins hands down in code golf with just 54 characters:

⍪{'FizzBuzz' 'Buzz' 'Fizz'⍵[1+(2×1⌊5|⍵)+1⌊3|⍵]}¨⍳100

APL is particularly strong at matrix processing and provides single character representations for common operations:

Notation Name Meaning
B Index generator Creates a list from 1 to B
¨ Each for each loop
Table Produces a one column matrix
B Floor Greatest integer less than or equal to B

Try APL in your browser now.

Challenge

Can you beat the APL implementation of FizzBuzz?

Have fun!


Tags:
Categories: F# | Basic | Python | Haskell | APL
Actions: E-mail | Permalink | Comments (1) | Comment RSSRSS comment feed

Try 10 Programming Languages in 10 minutes

August 31, 2013 10:47 by phil

There are a lot of interesting programming languages out there, but downloading and setting up the environment can be very time consuming when you just want to try one out. The good news is that you can try out many languages in your browser straight away, often with tutorials which guide you through the basics.

Following the pattern of 7 languages in 7 weeks book, here’s a somewhat abridged version.

Dynamic Languages

Fed up of long compile times, want a lightweight environment for scripting? Dynamic languages could be your new friend.

Try Lua

Lua is a lightweight dynamic language with excellent coroutine support and a simple C API making it hugely popular in video gaming for scripting. Have fun with game engines like LÖVE and Marmalade Quick.

Try Clojure

Clojure is the brainchild of the hugely charismatic speaker Rich Hickey, it is a descendant of one of the earliest programming languages LISP. There’s a really rich community around Clojure, one of my favourite projects is Sam Aaron’s Overtone live coding audio environment.

Try R (quick registration required)

R is a free environment for statistical computing and graphics, with a huge range of user-submitted packages. Ever wondered how to draw an egg?

Functional Languages

Aspects of functional programming have permeated most mainstream languages from C++ to VB. However to really appreciate the expressiveness of the functional approach a functional-first language is required.

Try Erlang

Erlang is a really interesting language for building fault tolerant concurrent systems. It also has great pattern matching capabilities. It has many industrial applications and tools including the RabbitMQ messaging system and the distribute database Riak.

Try Haskell

Haskell is heavily based on the Miranda programming language which was taught in British universities in the 80s and 90s. Haskell added Monads and Type Classes, and is still taught in a few universities, it is also still quite popular in academic research.

Try OCaml

OCaml like Miranda is based on the ML programming language adding object-oriented constructs. F# is based on OCaml, there is even a compatibility mode. OCaml still has industrial application, for example at Jane Street Capital and XenSource.

Web Languages

There’s a plethora of languages that compile to JavaScript languages out there. Also worth a look are the new features in JavaScript itself, see Brendan Eich’s talk at Strangeloop last year on the The State of JavaScript. Here’s 3 *Script languages I find particularly interesting:

LiveScript

LiveScript is an indirect descendant of CoffeeScript with features to assist functional programming like pattern matching and function composition. Check out 10 LiveScript one liners to impress your friends.

Try Elm

Elm is a functional reactive language for creating highly interactive programs, including games. Reactive programming is an interesting direction and I think languages designed specifically for this are worth investigating.

PogoScript

Unfortunately there’s currently no online editor for this one, but there is a command line REPL. PogoScript is DSL friendly allowing white space in function names.

Esoteric Languages

Esoteric languages tend to be write-only, a bit like Perl but just for fun.

Try Brainfuck

Brainfuck is the Rubik’s cube of programming languages. I built the site last year with the interpreter written in plain old JavaScript, check out the fib sample.

Browser IDEs

With so many programming language experimentation environments available online, the next logical step is to host the IDE there. Imagine not having to wait 4 hours for Visual Studio to install.

Cloud 9 is an online environment for creating Node.js apps, pulling together sets of relevant packages. Tools like Sploder let you build games online.

The Try F# site offers arguably the most extensive online learning features of any language. Cloud Tsunami IDE also offers a rich online development experience for F#. In the near future CloudSharper will offer an online IDE experience for developing web applications with F# using WebSharper,

Scaling up

Once you’ve completed some basic tasks in a new language you’ll want to move on to slightly larger tasks. I like to use exercises from the coding Kata Catalogue like FizzBuzz, the Game of Life and Minesweeper.

Some people enjoy going through the Project Euler problems, others have their own hello world applications. For Martin Trojer it’s a Scheme interpreter and Luke Hoban often writes a Ray Tracer.

I’d also recommend joining a local meetup group. The London Scala meetup have a coding dojo every month and the F#unctional Londoners meetup have hands on session in the middle the month, the next one is on Machine Learning.

Programming language books that include questions at the end of sections are a good way to practice what you’ve learned but are few and far between. The recent Functional Programming with F# book is an excellent example of what can be done with questions at the end of each chapter.

While the basics of a language can be picked up in a few hours, expect it to take a few weeks before you’re productive and at least a few months before you start to gain mastery.

Want to write your own language? Pete Sestoft’s Programming Language Concepts book offers a good introduction to the subject.


Tags:
Categories: F# | Lua | JavaScript | Clojure | Erlang | Haskell | Scala
Actions: E-mail | Permalink | Comments (3) | Comment RSSRSS comment feed