Phil Trelford's Array
POKE 36879, 255

Machine Learning from Disaster

July 12, 2013 00:03 by phil

Off the back of the popular Machine Learning hands-on session at Skills Matter last month, where we created a digit recognizer, last night we tackled a new data set. Again we took a task from Kaggle’s online predictive modelling competitions. This time the data set was passenger details from the Titanic, and the task was to analyse who was likely to survive.


Guided Task: http://trelford.com/titanic.zip (unblock the file, unzip to C:\titanic, load in VS2012 and run through the tasks in the titanic.fsx interactive F# script).

Kaggle provide a CSV file with the passenger details. We loaded this using FSharp.Data’s CSV provider, which infers the fields and types of the data for you:

open FSharp.Data // assumes the FSharp.Data assembly is already referenced by the script

let [<Literal>] path = "C:/titanic/train.csv"
type Train = CsvProvider<path,InferRows=0>
type Passenger = Train.Row

let passengers : Passenger[] = 
    Train.Load(path).Take(600).Data 
    |> Seq.toArray

We then ran some preliminary data analysis tasks, looking at how well specific features predicted survival:

let females = passengers |> where female
let femaleSurvivors = females |> tally survived
let femaleSurvivorsPc = females |> percentage survived
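
The helpers used above (where, female, survived, tally and percentage) come from the guided script and are not shown in this post. As a rough idea of what they might look like, here is a minimal sketch. The definitions and the Passenger record are assumptions for illustration (the record stands in for the CSV provider's row type so the snippet runs on its own), not the code from titanic.fsx:

```fsharp
// Hypothetical stand-in for the CSV provider's row type (an assumption for this sketch).
type Passenger = { Sex : string; Pclass : int; Survived : int }

// Predicates over a passenger (assumed definitions).
let female (p:Passenger) = p.Sex = "female"
let survived (p:Passenger) = p.Survived = 1

// Keep only the passengers matching a predicate.
let where f (ps:Passenger[]) = ps |> Array.filter f

// Count the passengers matching a predicate.
let tally f (ps:Passenger[]) = ps |> Array.filter f |> Array.length

// Percentage (0.0 to 100.0) of passengers matching a predicate.
let percentage f (ps:Passenger[]) =
    if Array.isEmpty ps then 0.0
    else 100.0 * float (tally f ps) / float ps.Length

// Made-up sample rows, not taken from the Kaggle file.
let sample =
    [| { Sex = "female"; Pclass = 1; Survived = 1 }
       { Sex = "female"; Pclass = 3; Survived = 0 }
       { Sex = "male";   Pclass = 3; Survived = 0 }
       { Sex = "male";   Pclass = 1; Survived = 1 } |]

let females = sample |> where female
let femaleSurvivors = females |> tally survived          // 1
let femaleSurvivorsPc = females |> percentage survived   // 50.0
```

Keeping the predicates separate from where, tally and percentage means the same three functions can be reused for any other feature, such as passenger class.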

Finally we used a provided decision tree learning algorithm for prediction:

let labels = [|"sex"; "class"|]

let features (p:Passenger) : obj[] = [|p.Sex; p.Pclass|]

let dataSet : obj[][] =
    [|for passenger in passengers ->
        [|yield! features passenger; 
          yield box (passenger.Survived = 1)|] |]

let tree = createTree(dataSet, labels)

I used the decision tree code from the Machine Learning in Action book, porting the Python implementation to F#; here’s the gist of it. The Python Tools for Visual Studio (PTVS) came in handy for checking that the outputs were the same in both implementations. Mathias Brandewinder has a great article on Decision Tree classification and also Random Forest classification in F# using the same Titanic data set.
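
To give a flavour of what a decision tree learner does with rows shaped like dataSet (boxed feature values followed by a boxed label), here is a minimal ID3-style sketch that picks splits by Shannon entropy. It is not the port used on the night and not the book's code; the Tree type, the helper names and the sample rows are all assumptions made for illustration:

```fsharp
// A minimal ID3-style sketch (an illustration, not the code from the post or the book).
type Tree =
    | Leaf of obj                          // predicted label
    | Node of string * (obj * Tree) list   // feature name, then one branch per value

// The label is assumed to be the last element of each row.
let labelOf (row:obj[]) = row.[row.Length-1]

// Shannon entropy (in bits) of the labels in a set of rows.
let entropy (rows:obj[][]) =
    let n = float rows.Length
    rows
    |> Seq.countBy labelOf
    |> Seq.sumBy (fun (_, count) ->
        let p = float count / n
        -(p * System.Math.Log(p, 2.0)))

// Drop the element at index i from an array.
let without i (xs:'a[]) =
    xs |> Array.mapi (fun j x -> j, x) |> Array.filter (fun (j, _) -> j <> i) |> Array.map snd

// Rows where feature i equals value, with column i removed.
let split (rows:obj[][]) i (value:obj) =
    rows |> Array.filter (fun row -> row.[i] = value) |> Array.map (without i)

// Most common label, used when no features are left to split on.
let majority (rows:obj[][]) =
    rows |> Seq.countBy labelOf |> Seq.maxBy snd |> fst

let rec createTree (rows:obj[][], labels:string[]) : Tree =
    let classes = rows |> Seq.map labelOf |> Seq.distinct |> Seq.toArray
    if classes.Length = 1 then Leaf classes.[0]
    elif labels.Length = 0 then Leaf (majority rows)
    else
        // Weighted entropy left after splitting on feature i;
        // the lowest remainder is the highest information gain.
        let remainder i =
            rows
            |> Seq.groupBy (fun row -> row.[i])
            |> Seq.sumBy (fun (_, group) ->
                let subset = Seq.toArray group
                float subset.Length / float rows.Length * entropy subset)
        let best = [0 .. labels.Length-1] |> List.minBy remainder
        let branches =
            rows
            |> Seq.map (fun row -> row.[best])
            |> Seq.distinct
            |> Seq.map (fun value -> value, createTree (split rows best value, without best labels))
            |> Seq.toList
        Node (labels.[best], branches)

// Made-up rows in which sex alone decides the label.
let sampleLabels = [|"sex"; "class"|]
let sampleData : obj[][] =
    [| [| box "female"; box 1; box true  |]
       [| box "female"; box 3; box true  |]
       [| box "male";   box 1; box false |]
       [| box "male";   box 3; box false |] |]

let sampleTree = createTree (sampleData, sampleLabels)
```

On these sample rows the tree splits on "sex" first, because that split leaves no uncertainty about the label, and never needs to look at "class".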

Again it was great to see a full house for the event, with over 50 members in attendance.

There are a few more pictures from the event over on the Skills Matter Facebook page :)

Check out the F#unctional Londoners meetup page for upcoming meetings; the next one is in 2 weeks, on F# Mobile Apps. If you’re interested in more hands-on sessions with F#, I’d also highly recommend the Progressive F# Tutorials in New York this September and London in October, as there is still a great early bird rate.


Categories: F# | Python | .Net

Comments

July 12. 2013 02:29

Dom Fin

Thanks for posting this! Can't get down to London but enjoyed running through this with the slides etc... :-) :-)

September 25. 2013 23:27

trackback

Progressive F# Tutorials NYC 2013
