Phillip Trelford's Array

POKE 36879,255

f(by) Minsk 2014

This weekend Evelina, Yan an I had the pleasure of speaking at f(by) the first dedicated functional conference in Belarus. It was a short hop by train from Vilnius to Minsk, where we had been attending Build Stuff. Sergey Tihon, of F# Weekly fame, was waiting for us at the train station to guide us to the hotel with a short tour of the city.

The venue was a large converted loft space, by the river and not far from the central station, with great views over the city. The event attracted over 100 developers from across the region, and we were treated to tea and tasty local cakes during the breaks.

FuncBy Group Photo

Evelina was the first of us to speak and got a great response to her talk on Understanding Social Networks with F#.

Evelina at f(by)

The slides and samples are available on Evelina’s github repository.

Next up Yan presented Learn you to tame complex APIs with F# powered DSLs:


My talk was another instalment of F# Eye for the C# Guy.

Unicorns

The talk introduces F# from the perspective of a C# developer using live samples covering syntax, F#/C# interop, unit testing, data access via F# Type Providers and F# to JS with FunScript.

In one example we looked at CO2 emissions using World Bank data (using FSharp.Data) in a line chart (using FSharp.Charting):

[gb;uk;by] => (fun i -> i.``CO2 emissions (kg per 2005 PPP $ of GDP)``)

 

CO2 emissions[3]

Many thanks to Alina for inviting us, and the Minsk F# community for making us feel very welcome.

C# Records & Pattern Matching Proposal

Following on from VB.Net’s new basic pattern matching support, the C# team has recently put forward a proposal for record types and pattern matching in C# which was posted in the Roslyn discussion area on CodePlex:

Pattern matching extensions for C# enable many of the benefits of algebraic data types and pattern matching from functional languages, but in a way that smoothly integrates with the feel of the underlying language. The basic features are: records, which are types whose semantic meaning is described by the shape of the data; and pattern matching, which is a new expression form that enables extremely concise multilevel decomposition of these data types. Elements of this approach are inspired by related features in the programming languages F# and Scala.

There has been a very active discussion on the forum ever since, particularly around syntax.

Background

Algebraic types and pattern matching have been a core language feature in functional-first languages like ML (early 70s), Miranda (mid 80s), Haskell (early 90s) and F# (mid 00s).

I like to think of records as part of a succession of data types in a language:

Name Example (F#) Description
Scalar
let width = 1.0
let height = 2.0
Single values
Tuple
// Tuple of float * float
let rect = (1.0, 2.0)
Multiple values
Record
type Rect = {Width:float; Height:float}
let rect = {Width=1.0; Height=2.0}
Multiple named fields
Sum type(single case)
type Rect = Rect of float * float
let rect = Rect(1.0,2.0)
Tagged tuple
Sum type(named fields)
type Rect = Rect of width:float*height:float
let rect = Rect(width=1.0,height=2.0)
Tagged tuple with named fields
Sum type(multi case)
type Shape=
   | Circle of radius:float
   | Rect of width:float * height:float
Union of tagged tuples

Note: in F# sum types are also often referred to as discriminated unions or union types, and in functional programming circles algebraic data types tend to refer to tuples, records and sum types.

Thus in the ML family of languages records are like tuples with named fields. That is, where you use a tuple you could equally use a record instead to add clarity, but at the cost of defining a type. C#’s anonymous types fit a similar lightweight data type space, but as there is no type definition their scope is limited (pun intended).

For the most part I find myself pattern matching over tuples and sum types in F# (or in Erlang simply using tuples where the first element is the tag to give a similar effect).

Sum Types

The combination of sum types and pattern matching is for me one of the most compelling features of functional programming languages.

Sum types allow complex data structures to be succinctly modelled in just a few lines of code, for example here’s a concise definition for a generic tree:

type 'a Tree =
    | Tip
    | Node of 'a * 'a Tree * 'a Tree

Using pattern matching the values in a tree can be easily summed:

let rec sumTree tree =
    match tree with
    | Tip -> 0
    | Node(value, left, right) ->
        value + sumTree(left) + sumTree(right)

The technique scales up easily to domain models, for example here’s a concise definition for a retail store:

/// For every product, we store code, name and price
type Product = Product of Code * Name * Price

/// Different options of payment
type TenderType = Cash | Card | Voucher

/// Represents scanned entries at checkout
type LineItem = 
  | Sale of Product * Quantity
  | Cancel of int
  | Tender of Amount * TenderType

Class Hierarchies versus Pattern Matching

In class-based programming languages like C# and Java, classes are the primary data type  where (frequently mutable) data and related methods are intertwined. Hierarchies of related types are typically described via inheritance. Inheritance makes it relatively easy to add new types, but adding new methods or behaviour usually requires visiting the entire hierarchy. That said the compiler can help here by emitting an error if a required method is not implemented.

Sum types also describe related types, but data is typically separated from functions, where functions employ pattern matching to handle separate cases. This pattern matching based approach makes it easier to add new functions, but adding a new case may require visiting all existing functions. Again the compiler helps here by emitting a warning if a case is not covered.

Another subtle advantage of using sum types is being able to see the behaviour for all cases in a single place, which can be helpful for readability. This may also help when attempting to separate concerns, for example if we want to add a method to print to a device to a hierarchy of classes in C# we could end up adding printer related dependencies to all related classes. With a sum type the printer functionality and related dependencies are more naturally encapsulated in a single module

In F# you have the choice of class-based inheritance or sum types and can choose in-situ. In practice most people appear to use sum types most of the time.

C# Case Classes

The C# proposal starts with a simple “record” type definition:

public record class Cartesian(double x: X, double y: Y);

Which is not too dissimilar to an F# record definition, i.e.:

type Cartesian = { X: double, Y: double }

However from there it then starts to differ quite radically. The C# proposal allows a “record” to inherit from another class, in effect allowing sum types to be defined, i.e:

abstract class Expr; 
record class X() : Expr; 
record class Const(double Value) : Expr; 
record class Add(Expr Left, Expr Right) : Expr; 
record class Mult(Expr Left, Expr Right) : Expr; 
record class Neg(Expr Value) : Expr;

which allows pattern matching to be performed using an extended switch case statement:

switch (e) 
{ 
  case X(): return Const(1); 
  case Const(*): return Const(0); 
  case Add(var Left, var Right): 
    return Add(Deriv(Left), Deriv(Right)); 
  case Mult(var Left, var Right): 
    return Add(Mult(Deriv(Left), Right), Mult(Left, Deriv(Right))); 
  case Neg(var Value): 
    return Neg(Deriv(Value)); 
}

This is very similar to Scala case classes, in fact change “record” to case, drop semicolons and voilà:

abstract class Term
case class Var(name: String) extends Term
case class Fun(arg: String, body: Term) extends Term
case class App(f: Term, v: Term) extends Term

To sum up, the proposed C# “record” classes appear to be case classes which support both single and multi case sum types.

Language Design

As someone who has to spend some of their time working in C# and who feels more productive having concise types and pattern matching in their toolbox, overall I welcome our new overlords this proposal.

From my years of experience using F#, I feel it would be nice to see a simple safety feature included, to what is in effect a sum type representation, so that sum types can be exhaustive. This would allow compile time checks to ensure that all cases have been covered in a switch/case statement, and a warning given otherwise.

Then again, I feel this is quite a radical departure from the style of implementation I’ve seen in C# codebases in the wild, to the point where it’s starting to look like an entirely different language… and so this may be a feature that if it does see the light of day is likely to get more exposure in C# shops working on greenfield projects.

Orleans

Microsoft’s Build 2014 conference is currently in full flow, one of the new products announced is Orleans, an Actor framework with a focus on Azure.

There’s an MSDN blog article with some details, apparently it was used on Halo 4.

Demis Bellot of ServiceStack fame, tweeted his reaction:

I retweeted, as it wasn’t far off my initial impression and the next thing I know my phone is going crazy with replies and opinions from the .Net community and Microsoft employees. From what I can make out the .Net peeps weren’t overly impressed, and the Microsoft peeps weren’t overly impressed that they weren’t overly impressed.

So what’s the deal.

Actors

Erlang has distributed actors via OTP, this is the technology behind WhatsApp, recently acquired for $19BN!

The JVM has the ever popular Akka which is based heavily on Erlang and OTP.

An industrial strength distributed actor model for .Net should be a good thing. In fact Microsoft are currently also working on another actor framework called ActorFX,

The .Net open source community have projects in the pipeline too including:

There’s also in-memory .Net actor implementations with F#’s MailboxProcessor and TPL Dataflow. Not to mention the departed Axum and Retlang projects.

Orleans

From what I can tell, Orleans appears to be focused on Azure, making use of it’s proprietary APIs, so there's probably still a big space for the community's open source projects to fill.

Like Demis I’m not a huge fan of WCF XML configuration and code generation. From the Orleans MSDN post, XML and code-gen seem to be front and centre.

You write an interface, derive from an interface, add attributes and then implement methods, which must return Task<T>. Then you do some XML configuration and Orleans does some code generation magic for hydration/dehydration of your objects (called grains).

Smells like teen spirit WCF, that is methods are king, although clearly I could be reading it wrong.

From my limited experience with actors in F# and Erlang, messages and message passing are king, with pattern matching baked into the languages to make things seamless.

Initial impressions are that Orleans is a long way from Erlang Kansas…

The Good Parts

Building a fault-tolerant enterprise distributed actor model for .Net is significant, and could keep people on the platform where they may have turned with Erik Meijer to the JVM, Scala and Akka otherwise.

Putting async front and centre is also significant as it simplifies the programming model.

C# 5’s async is based on F#’s asynchronous workflows, which was originally developed to support actors via the MailBoxProcessor.

Coroutines

Underneath, Erlang’s processes, F#’s async workflows and C#’s async/await are simply implementations of coroutines.

Coroutines are subroutines that allow multiple entry points for suspending and resuming execution at certain locations. They’ve been used in video games for as long as I can remember (which only goes back as far as the 80s).

Coroutines help make light work of implementing state machines and workflows.

Methods

In Erlang messages are typically described as named tuples (an atom is used as the name), and in F# discriminated unions are typically employed.

Orleans appears to use methods as the message type, where the method name is analogous to an Erlang atom name, or an F# union case name and the parameters are analogous to a tuple. So far so good.

Where they differ is that return values are first-class for methods, and return values feel more like an RPC approach. In fact this is the example given in the article:

public class HelloGrain : Orleans.GrainBase, IHello
{
  Task<string> IHello.SayHelloAsync(string greeting)
  {
    return Task.FromResult("You said: '" + greeting + "', I say: Hello!");
  }
}

Also current wisdom for C# async is to avoid async void... which is why I guess they’ve plumped for Task as the convention for methods with no return value.

Messages

.Net’s built-in binary serialization is bottom of the league for size and performance, hopefully alternative serialization libraries like Google Protocol Buffers will be supported.

Judge for yourself

But these are just my initial impressions, try out the samples and judge for yourself.