Phillip Trelford's Array

POKE 36879,255

Machine Learning from Disaster

Off the back of last month’s popular Machine Learning hands-on session at Skills Matter, where we created a digit recognizer, last night we tackled a new dataset. Again we took a task from Kaggle’s online predictive modelling competitions. This time the dataset was passenger details from the Titanic, and the task was to analyse who was likely to survive.

Slides: Machine learning from disaster, from ptrelford

Guided Task: http://trelford.com/titanic.zip (unblock the file, unzip to C:\titanic, load in VS2012 and run through the tasks in the titanic.fsx interactive F# script).

Kaggle provide a CSV file with the passenger details. We loaded this using FSharp.Data’s CSV provider, which infers the fields and types of the data for you:

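open FSharp.Data

// The path must be a literal so the type provider can read the file at compile time
let [<Literal>] path = "C:/titanic/train.csv"
// InferRows=0 infers the column types from every row rather than a sample
type Train = CsvProvider<path,InferRows=0>
type Passenger = Train.Row

// Work with the first 600 rows
let passengers : Passenger[] = 
    Train.Load(path).Take(600).Data 
    |> Seq.toArray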

We then ran some preliminary data analysis tasks, looking at how well specific features predicted survival:

let females = passengers |> where female
let femaleSurvivors = females |> tally survived
let femaleSurvivorsPc = females |> percentage survived
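
The helpers used above (where, tally, percentage, female and survived) come from the guided script and are not shown in this post. As a rough idea of what they do, here is a minimal sketch of how such helpers could be defined. It assumes a simplified stand-in record, SamplePassenger, rather than the CSV provider's row type, and the definitions are illustrative guesses, not the script's actual code:

```fsharp
// Hypothetical stand-in for the CSV provider's row type (an assumption for this sketch).
type SamplePassenger = { Sex : string; Pclass : int; Survived : int }

// Predicates over a single passenger.
let female (p : SamplePassenger) = p.Sex = "female"
let survived (p : SamplePassenger) = p.Survived = 1

// Keep only the items that satisfy the predicate.
let where predicate items = items |> Array.filter predicate

// Count the items that satisfy the predicate.
let tally predicate items = items |> where predicate |> Array.length

// Percentage (0.0 to 100.0) of items that satisfy the predicate.
let percentage predicate items =
    if Array.isEmpty items then 0.0
    else 100.0 * float (tally predicate items) / float (Array.length items)

// A tiny made-up sample, just to exercise the helpers.
let sample =
    [| { Sex = "female"; Pclass = 1; Survived = 1 }
       { Sex = "female"; Pclass = 3; Survived = 0 }
       { Sex = "male";   Pclass = 3; Survived = 0 }
       { Sex = "male";   Pclass = 1; Survived = 1 } |]

let sampleFemales = sample |> where female
let sampleFemaleSurvivors = sampleFemales |> tally survived
let sampleFemaleSurvivorsPc = sampleFemales |> percentage survived
```

Keeping the predicates separate from the counting functions means the same three helpers can be reused for any feature, for example a predicate on passenger class.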

Finally we used a provided decision tree learning algorithm for prediction:

let labels = [|"sex"; "class"|]

let features (p:Passenger) : obj[] = [|p.Sex; p.Pclass|]

let dataSet : obj[][] =
    [|for passenger in passengers ->
        [|yield! features passenger; 
          yield box (passenger.Survived = 1)|] |]

let tree = createTree(dataSet, labels)

I used the decision tree code from the Machine Learning in Action book, porting the Python implementation to F#; here’s the gist of it. The Python Tools for Visual Studio (PTVS) came in handy for checking that the outputs were the same in both implementations. Mathias Brandewinder has a great article on Decision Tree classification, and another on Random Forest classification in F#, using the same Titanic data set.
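
To give a feel for what a decision tree learner of this kind does, here is a minimal ID3-style sketch. It is not the code from the gist or the book: the names Tree, buildTree and classify are made up for illustration, and the only thing it shares with the script is the row layout, where each row holds the feature values followed by the class label in the last column. At each step it splits on the feature that leaves the lowest weighted entropy, which is the same as picking the highest information gain:

```fsharp
// A tree is either a predicted class, or a split on a named feature
// with one branch per observed value of that feature.
type Tree =
    | Leaf of obj
    | Node of string * (obj * Tree) list

// Copy of an array with the element at the given index removed.
let without index (xs : 'a[]) =
    xs
    |> Array.mapi (fun i x -> i, x)
    |> Array.filter (fun (i, _) -> i <> index)
    |> Array.map snd

// Shannon entropy, in bits, of the class labels (last column).
let entropy (rows : obj[][]) =
    let total = float rows.Length
    rows
    |> Seq.countBy (fun row -> row.[row.Length - 1])
    |> Seq.sumBy (fun (_, count) ->
        let p = float count / total
        -p * System.Math.Log(p, 2.0))

// Rows whose feature at index equals value, with that column dropped.
let split (rows : obj[][]) index (value : obj) =
    rows
    |> Array.filter (fun row -> row.[index] = value)
    |> Array.map (without index)

// Most common class label, used when no features are left to split on.
let majority (rows : obj[][]) =
    rows
    |> Seq.countBy (fun row -> row.[row.Length - 1])
    |> Seq.maxBy snd
    |> fst

let rec buildTree (rows : obj[][]) (labels : string[]) : Tree =
    let classes =
        rows
        |> Seq.map (fun row -> row.[row.Length - 1])
        |> Seq.distinct
        |> Seq.toArray
    if classes.Length = 1 then Leaf classes.[0]
    elif labels.Length = 0 then Leaf (majority rows)
    else
        // Weighted entropy left after splitting on feature i.
        let entropyAfter i =
            rows
            |> Seq.groupBy (fun row -> row.[i])
            |> Seq.sumBy (fun (_, group) ->
                let subset = Seq.toArray group
                float subset.Length / float rows.Length * entropy subset)
        let best = [0 .. labels.Length - 1] |> List.minBy entropyAfter
        let remaining = labels |> without best
        let branches =
            rows
            |> Seq.map (fun row -> row.[best])
            |> Seq.distinct
            |> Seq.map (fun value -> value, buildTree (split rows best value) remaining)
            |> Seq.toList
        Node (labels.[best], branches)

// Walk the tree; returns None if a feature value was never seen in training.
let rec classify (tree : Tree) (labels : string[]) (values : obj[]) : obj option =
    match tree with
    | Leaf label -> Some label
    | Node (name, branches) ->
        let index = Array.findIndex ((=) name) labels
        branches
        |> List.tryFind (fun (value, _) -> value = values.[index])
        |> Option.bind (fun (_, subtree) -> classify subtree labels values)

// Made-up rows in the same shape as dataSet: sex, class, survived.
let sampleLabels = [| "sex"; "class" |]
let sampleRows : obj[][] =
    [| [| box "female"; box 1; box true  |]
       [| box "female"; box 3; box true  |]
       [| box "male";   box 1; box false |]
       [| box "male";   box 3; box false |] |]

let sampleTree = buildTree sampleRows sampleLabels
let prediction = classify sampleTree sampleLabels [| box "female"; box 2 |]
```

In the made-up sample, sex alone separates the two classes, so the tree splits once on "sex" and never needs to look at "class". Returning an option from classify is a design choice for this sketch, so that an unseen feature value does not throw.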

Again it was great to see a full house for the event, with over 50 members in attendance.

There are a few more pictures from the event over on the Skills Matter Facebook page :)

Check out the F#unctional Londoners meetup page for upcoming meetings; the next one is in 2 weeks, on F# Mobile Apps. If you’re interested in more hands-on sessions with F#, I’d also highly recommend the Progressive F# Tutorials in New York this September and London in October, as there is still a great early bird rate.

Comments (1)

  • Dom Fin

    7/12/2013 2:29:23 AM

    Thanks for posting this! Can't get down to London but enjoyed running through this with the slides etc... :) :)
