Posterior samples – SEB113 edition

I’m going to be frank: a lot of this relates to SEB113 – Quantitative Methods for Science, a subject I tutored last semester.

One of the students from SEB113 last semester, Daniel Franks, is live-tweeting his Bachelor of Science degree. Daniel was in my workshop group last semester and I recognise some of the events he talks about in his timeline. It’s interesting to see his perspective not just on SEB113 but on the other three units that form the first semester of QUT’s new Bachelor of Science course.

SEB113 is getting a small makeover for Semester 2. One of the things we’re considering is using ggplot2 instead of a combination of the base graphics package, heatmaps from dendrograms with colorbars from yet another package, etc. Lattice doesn’t have the nicest interface, and it’s nigh on impossible to add elements to a plot afterwards (I hate you, levelplot). Small multiples, by contrast, are fairly easy to do in ggplot2.
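As a minimal sketch of what small multiples look like in ggplot2 (this uses the `mpg` dataset that ships with ggplot2 purely for illustration; it is not SEB113 material, and the choice of variables is mine):

```r
# Assumes the ggplot2 package is installed.
library(ggplot2)

# One scatter plot of highway mileage against engine displacement,
# split into a separate panel for each vehicle class.
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  facet_wrap(~ class)

# Layers can be added to the stored plot afterwards,
# here a straight-line fit within each panel.
p <- p + geom_smooth(method = "lm", se = FALSE)

print(p)
```

Because the plot is an object that layers are added to, adding elements after the fact is just another `+`.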

We ought to stick to the same steps in data analysis as we did last semester. Daniel’s tweets refer to a class last semester where we discussed drawing the analysis method out of exploratory plots of the data, rather than trying to pick the “best” model a priori and making the data fit the model. Roger Peng has a good five-step technique for analysing data:

  1. Exploratory analysis
  2. Model fitting
  3. Model building
  4. Sensitivity analysis
  5. Reporting

Via Luis Apiolaza at Quantum Forest I’ve stumbled across Thomas Lumley’s Tumblr, where he’s doing some personal blogging about statistics. An interesting post of his is on the role of Bayesian stats in introductory classes. I would love to turn SEB113 into a Bayesian statistics-based class, but for the time being I will have to settle for it dealing with modelling over tests (which is still a big win, pedagogically).

Teaching Bayesian statistics generally relies on a good grounding in calculus; otherwise, writing down full conditionals is going to be quite difficult. When people tell me that statistics is so different to mathematics I like to point out that it’s just a combination of calculus, linear algebra and some discrete mathematics. Daniel Kaplan writes at AMSTATNews about ditching mathematical formalism to make statistics more accessible. The American undergraduate model is very different to what we have in Australia, but I take his point that a first year calculus class is not as relevant to graduates as a first year statistics course that teaches statistical thinking over statistical calculation. I really like the focus in SEB113 on modelling using R rather than statistical tests by hand with pages of tables (as MAB101 was when I did my Bachelor of Science). If people finish SEB113 knowing how to read their data into R and fit a Generalised Linear Model, I think we’ll have done our job.
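To make that last goal concrete, here is a rough sketch of reading data into R and fitting a GLM. The data are simulated and the file, column names and Poisson model are hypothetical choices of mine, not taken from the SEB113 materials:

```r
# Simulate hypothetical field data: counts at sites along a temperature gradient.
set.seed(113)
sites <- data.frame(temperature = seq(10, 30, length.out = 40))
sites$count <- rpois(nrow(sites), lambda = exp(0.5 + 0.08 * sites$temperature))

# Write the data to a temporary CSV, standing in for a student's own data file,
# then read it back in the way a student would.
csv_path <- tempfile(fileext = ".csv")
write.csv(sites, csv_path, row.names = FALSE)
field_data <- read.csv(csv_path)

# Look at the data before choosing a model.
plot(count ~ temperature, data = field_data)

# Counts that rise with temperature suggest a Poisson GLM with a log link.
fit <- glm(count ~ temperature, family = poisson(link = "log"), data = field_data)
summary(fit)
```

The exploratory plot comes before the call to `glm()` on purpose: the choice of family and link is meant to come out of what the data look like, not the other way around.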

If they want to go on to further statistics from there, the statistics units in the School of Mathematical Sciences work from a calculus perspective. They do have a calculus prerequisite (MAB121 or MAB122 for those QUT students reading), but you could do a lot worse than taking MAB210 and MAB314. I hope a follow-up data analysis course will be offered to Bachelor of Science students, one that builds on SEB113, covers some more advanced topics and introduces enough mathematics to make those topics worthwhile. We’ll have to see how it all unfolds as this first cohort make their way through.

