Yesterday I noticed a tweet recommending an “Uncle” Bob Martin video that is intended to “demystify compilers”. I haven’t seen the video (Episode 29, SMC Parser) as it’s behind a pay wall, but I did find a link on the page to a github repository with a hand rolled parser written in vanilla Java, plus a transformation step that compiles it out to either Java or C code.
The parser implementation is quite involved with a number of files covering both lexing and parsing:
I found the Java code clean but a little hard to follow so tried to reverse engineer the state machine syntax off of a BNF (found in comments) and some examples (found in unit tests):
Actions: Turnstile
FSM: OneCoinTurnstile
Initial: Locked
{
Locked Coin Unlocked {alarmOff unlock}
Locked Pass Locked alarmOn
Unlocked Coin Unlocked thankyou
Unlocked Pass Locked lock
}
It reminded me a little of the state machine sample in Martin Fowler’s Domain-Specific Language book, which I recreated in F# a few years back as both internal and external DSLs.
On the train home I thought it would be fun to knock up a parser for the state machine in F# using FParsec, a parser combinator library. Parser combinators mix the lexing and parsing stages, and make light work of parsing. One of the many things I like about FParsec is that you get pretty good error messages for free from the library.
Finite State Machine Parser
I used F#’s discriminated unions to describe the AST which quite closely resembles the core of Uncle Bob’s BNF:
type Name = string
type Event = Name
type State = Name
type Action = Name
type Transition = { OldState:State; Event:Event; NewState:State; Actions:Action list }
type Header = Header of Name * Name
The parser, using FParsec, turned out to be pretty short (less than 40 loc):
open FParsec
let pname = many1SatisfyL isLetter "name"
let pheader = pname .>> pstring ":" .>> spaces1 .>>. pname .>> spaces |>> Header
let pstate = pname .>> spaces1
let pevent = pname .>> spaces1
let paction = pname
let pactions =
paction |>> fun action -> [action]
<|> between (pstring "{") (pstring "}") (many (paction .>> spaces))
let psubtransition =
pipe3 pevent pstate pactions (fun ev ns act -> ev,ns,act)
let ptransition1 =
pstate .>>. psubtransition
|>> fun (os,(ev,ns,act)) -> [{OldState=os;Event=ev;NewState=ns;Actions=act}]
let ptransitionGroup =
let psub = spaces >>. psubtransition .>> spaces
pstate .>>. (between (pstring "{") (pstring "}") (many1 psub))
|>> fun (os,subs) ->
[for (ev,ns,act) in subs -> {OldState=os;Event=ev;NewState=ns;Actions=act}]
let ptransitions =
let ptrans = attempt ptransition1 <|> ptransitionGroup
between (pstring "{") (pstring "}") (many (spaces >>. ptrans .>> spaces))
|>> fun trans -> List.collect id trans
let pfsm =
spaces >>. many pheader .>>. ptransitions .>> spaces
let parse code =
match run pfsm code with
| Success(result,_,_) -> result
| Failure(msg,_,_) -> failwith msg
You can try the parser snippet out directly in F#’s interactive window.
Finite State Machine Compiler
I also found code for compiling out to Java and C, and ported the former:
let compile (headers,transitions) =
let header name =
headers |> List.pick (function Header(key,value) when key = name -> Some value | _ -> None)
let states =
transitions |> List.collect (fun trans -> [trans.OldState;trans.NewState]) |> Seq.distinct
let events =
transitions |> List.map (fun trans -> trans.Event) |> Seq.distinct
"package thePackage;\n" +
(sprintf "public abstract class %s implements %s {\n" (header "FSM") (header "Actions")) +
"\tpublic abstract void unhandledTransition(String state, String event);\n" +
(sprintf "\tprivate enum State {%s}\n" (String.concat "," states)) +
(sprintf "\tprivate enum Event {%s}\n" (String.concat "," events)) +
(sprintf "\tprivate State state = State.%s;\n" (header "Initial")) +
"\tprivate void setState(State s) {state = s;}\n" +
"\tprivate void handleEvent(Event event) {\n" +
"\t\tswitch(state) {\n" +
(String.concat ""
[for (oldState,ts) in transitions |> Seq.groupBy (fun t -> t.OldState) ->
(sprintf "\t\t\tcase %s:\n" oldState) +
"\t\t\t\tswitch(event) {\n" +
(String.concat ""
[for t in ts ->
(sprintf "\t\t\t\t\tcase %s:\n" t.Event) +
(sprintf "\t\t\t\t\t\tsetState(State.%s);\n" t.NewState)+
(String.concat ""
[for a in t.Actions -> sprintf "\t\t\t\t\t\t%s();\n" a]
) +
"\t\t\t\t\t\tbreak;\n"
]
) +
"\t\t\t\t\tdefault: unhandledTransition(state.name(), event.name()); break;\n" +
"\t\t\t\t}\n" +
"\t\t\t\tbreak;\n"
]
)+
"\t\t}\n" +
"\t}\n" +
"}\n"
Again, and probably not surprisingly, this is shorter than Uncle Bob’s Java implementation.
And again, you can try the compiler snippet out in the F# interactive window.
Conclusion
Writing a parser using a parser combinator library (and F#) appears to be significantly easier and more succinct than writing a hand rolled parser (in Java).