Just got back from China

I’ve spent the last few days travelling to and from Beijing, China for the launch of the new Australia-China Centre for Air Quality Science and Management. This is a huge thing for us at ILAQH because it sets up an international collaborative agreement between a bunch of institutions in Australia and China who each have a different set of expertise that they can bring to the table, allowing us to undertake more ambitious projects than before and seek funding from a wider range of sources.

There was a lot of prep and behind the scenes meetings to take place on the first day, so Mandana (a colleague of mine) and I took a trip downtown to the Forbidden City and Tiananmen Square and did some shopping in Wangfujing. It would have been cold enough without the wind but it was such a clear day and everything was wonderful.

Day 14 - Tian Anmen Square Beijing

The first day of the launch, held at CRAES, featured the constituent groups giving a short presentation about their work. There are a lot of great people working on some really interesting stuff and I’m very excited about the prospect of working with some of them over the coming years.

Day 15 - Lina

The second day had us split into three groups to propose various objectives and projects and put names on paper for who might be good leaders or key players in these fields as part of our centre. I joined the Transport Emissions group to discuss the control of emissions at their source, the investigation of atmospheric transformation processes and the development and uptake of new technologies. A lot of the ground work had already been laid at a planning meeting earlier in the year but it was good to put together some more concrete research topics.

After this, we went out to the National Jade Hotel for dinner, where we got to try another of the varied styles of Chinese cuisine; this time from a coastal region in North-Eastern China. I wish I had paid more attention to the names of the various styles, but I enjoyed trying everything over the course of the trip, even the tripe.

Day 16 - The most important decision

The final day saw us tidying up the proposals, and an early finish meant that I got the afternoon off with Mandana, Felipe and Dion. After stumbling our way through a menu with pictures but no English translations, we had a big lunch and set off on the subway to the Temple of Heaven. It was certainly warmer than the Forbidden City (less stone, more trees) and was a very peaceful and pleasant end to the trip as we sat down at a bakery café and discussed If You Are The One while eating cream buns and drinking coffee (or in my case, peach black tea with milk). After heading back to the hotel, Mandana and I took a walk around the Bird’s Nest stadium which was only a block from our hotel. It looks like Beijing is putting effort into maintaining the area as a public plaza rather than just the grounds of a sports stadium, so even late in the cold evening it was full of families and groups of friends walking, talking and taking photographs.

Day 17 - Bird's Nest

An early morning taxi to the airport saw the start of 18 hours of travel. It’s nice to be back in one’s own bed, but I’m off to the airport again tomorrow for a 5:30am flight to Sydney for a workshop on exposure assessment with colleagues from the Centre for Air quality and health Research and evaluation. After a week of disastrously bad coffee, I’m glad that I booked accommodation which advertises itself as being 15m from the Toby’s Estate café.


A few things that warrant an update

The Finnish paper is pretty much ready for submission to Annals of Applied Stats. I’ve updated my publications page to include a link to the preprint on the arXiv. I will update my CV soon and I’ll add my posters and slides to the publications page.

The Higgs boson – what’s the deal with that? NPR has a good article on it.

I spent the evening watching the Higgs announcements at uni and even though I thought the CMS slides were pretty cluttered and didn’t like the layout, nothing had prepared me for the ATLAS slides. Bad colour schemes (if you can call them that) and Comic Sans MS? Yuk. You don’t make science accessible by making people think “Hey. That looks like I could have designed that. Scientists aren’t so different to me after all!” Some clearly presented slides that weren’t stuffed full of text and images are, in my opinion, the key to a good scientific presentation. 1 slide per minute, no more than 5-6 lines with no more than 5-6 words per line. The slides should touch on the key points so people can, at a glance, get a good idea of what’s going on. The talk that accompanies the slides is what conveys the rest of the information in more natural language. There was a lot of great science presented tonight, but it wasn’t presented well.

David Spiegelhalter explains the five sigma significance of the ATLAS/CMS results. P-values and confidence intervals are two things where I think frequentist probability stops being conceptually simpler than Bayesian statistics and becomes about questions like “What is the probability of observing the data I have seen given that I have this model and these parameter estimates?” and “If I did an infinite number of trials how many times would I expect this interval for my sample mean to cover the true mean?” and “Something something agriculture”.
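
As a rough illustration of what "five sigma" corresponds to, here is a minimal sketch that converts a number of standard deviations into a tail probability under a standard normal distribution. The function name `one_sided_p_value` is made up for this example, and the one-sided convention is an assumption about how the threshold is usually quoted in particle physics.

```python
import math


def one_sided_p_value(n_sigma: float) -> float:
    """Upper-tail probability P(Z >= n_sigma) for a standard normal Z."""
    # erfc is the complementary error function; the factor 0.5 and the
    # sqrt(2) rescaling turn it into the normal upper-tail probability.
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


p = one_sided_p_value(5.0)
print(f"5 sigma, one-sided p-value: {p:.3e}")
print(f"roughly 1 in {1.0 / p:,.0f}")
```

The number that comes out is the probability of seeing a fluctuation at least this large if there were no signal at all. It is not the probability that the null hypothesis is true, which is exactly the distinction that makes p-values awkward to explain.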

Healthy Buildings is coming up next week. It’s all hands on deck at ILAQH while we put the finishing touches on the program and sort out the behind the scenes stuff. It’s going to be great. I’ll be giving two talks; the first is about how we can use the Dirichlet process for clustering in health survey data and the second is about the need for better statistics in science.

ISBA 2012 was a heap of fun and there were lots of good talks. I find meeting other statisticians very inspiring. I will try to write a wrap-up when HB 2012 is over. For now, you can enjoy my preliminary thoughts on Xian’s ‘Og.

ISBA 2012 – A few thoughts

Christian Robert asked for some guest bloggers for ISBA 2012 and today his ‘og features my thoughts as of this morning’s coffee break. There have been some really amazing talks in the sessions I’ve gone to, mostly in the NP Bayes talks.

My poster went well; I had a good discussion with Daniel Williamson about some of the shortfalls of P-spline models when smoothing temporal data. Hopefully I convinced him that my use of AR residuals means I’m not modelling noise with a highly oscillatory spline. I don’t think I can convince him of the validity of using an informative Gamma(1,b) prior for the smoothing parameter as he’s quite firmly in the subjective priors camp. Perhaps he and Sama should have a meeting.

I still haven’t been able to find Jukka Corander; he didn’t seem to be at the poster session where three of his students were presenting. Perhaps I just haven’t spotted him because we’ve only met once before and that was a year ago.

A few observations on Kyoto so far

I’ve been here roughly 24 hours and thought I might jot down a few things I’ve noticed about Kyoto, at least the parts I’ve been to.

It’s a very human scale city. A lot of the photos I’ve seen of Tokyo made me think that urban Japan is this sprawling metropolis of high rise buildings, neon lights, etc. Kyoto is not like that at all. Most of the buildings I’ve seen in the downtown area are perhaps four storeys tall. You can see the mountains by looking down the street. There are no multi-lane highways running through the CBD; most streets near the central JR station are one lane in each direction, maybe two at the intersection. A few of the bigger streets may be three or four lanes in each direction but they aren’t clogged full of traffic, nor are they speedways. Contrast this with Brisbane, where almost every street downtown is a four lane one way street with on-street parking on each side, except in peak hour where all four lanes are on-street parking because traffic slows to about 10 km/h.

There is a huge bike culture. It seems to be the easiest way to get around when making short trips. No one locks their bikes up. You just pop up the kick stand and go into the store for a few minutes. There are bike lanes at intersections for crossing and people ride on the road and the footpaths, and down the arcades in some places. There appear to be no helmet laws.

Bins are as rare as hen’s teeth yet the city doesn’t have a litter problem. Every vending machine has recycling bins next to it and people eat their food at the shops where they buy it, even if it’s a pokey little hole in the wall takeaway dumpling place in an arcade. People don’t walk around eating and drinking like they do in Australia. You sit down, eat your takeaway meal, dump the rubbish in the bin and then off you go. I must have spent ten minutes walking around with an empty takoyaki tray before deciding to walk into a Starbucks, dump my garbage, buy an iced tea, drink it quickly, dump the rubbish and leave.

The markets are wonderful. I spent the late morning walking around downtown with two other QUT ISBA attendees. We went to a few arcades where almost every shop seemed to be selling fresh fish, fermented vegetables, tofu, fried takeaway, dried beans, etc. The smells! Like nothing you get in Brisbane markets (I’m thinking Brisbane Square). We stopped for some takoyaki (octopus croquettes, basically) and I got a tempura lotus root on a stick. Not my usual fare but it’s not the sort of thing you see all that often in Brisbane so I decided I’d have a go. I couldn’t bring myself to eat the boiled baby octopus on a skewer, though; too Lovecraftian.

All in all, Kyoto appears to be a great little town with a fantastic public transport system, good food and friendly people, all at a scale that’s not completely overwhelming. I would consider living here if there was a good postdoc position. I’d have to learn much more Japanese than “Arigato” and “Ohayou gozaimasu”.

Just a few quick thoughts

I’m setting up a laptop to take to ISBA with me as I have lots of thesis work to do. I must say, I’m really impressed with how simple GitHub for Windows is to set up. It’s a matter of installing the program itself, then entering your GitHub details. Cloning your GitHub repositories to your local machine is as simple as pressing a button. I haven’t had to faff about with ssh, pageant, etc.

Now I just have to finish setting up remote INLA (which will require faffing about with ssh), installing LaTeX and figuring out if I can use X forwarding without X-Win.

I also have to finish my ISBA poster and organise for it to be printed. Then there are the two talks I am giving at Healthy Buildings 2012 which need writing and the Student Program work. I leave for Japan on Sunday. I should probably look at train travel from Osaka to Kyoto, find my travel money card, passport, etc.

I uploaded a paper to arXiv yesterday. I’ll post about it here when it appears.

Working on this Finnish paper

I figured I might as well describe how git made it possible to write the code and paper for the work I’ve been doing with Bjarke, Tareq, Kaarle and Jukka. Without git, we’d probably have been emailing code back and forth to each other or using something like Dropbox which would freak out over all the little changes we make, making it impossible to both be working on the same file at once.

Git is a distributed version control system that allows you to track revisions to your code and invite multiple collaborators to the project. I’ve talked about it previously but basically it’s this great system where you can work on a project with multiple people, making your changes, committing them on your local machine to save them. Once you are happy with the changes you’ve made and they don’t break anything, you can push the changes to the shared repository where all the other members of the project have access to them. If there’s a conflict, git lets you know and you can fix it up then re-commit and push. There are tools for reverting changes, making new branches, merging branches, etc.
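
The commit-then-push cycle described above can be sketched as a short list of git commands. This is a minimal sketch, not the exact commands used on the project: the remote name `origin`, the branch name `master`, the commit message and the helper names `commit_and_share` and `run` are all assumptions made for illustration. By default it only prints the commands (a dry run) so it is safe to try outside a repository.

```python
import shlex
import subprocess
from typing import List


def commit_and_share(message: str, remote: str = "origin",
                     branch: str = "master") -> List[List[str]]:
    """Return the git commands for one commit-then-push cycle."""
    return [
        ["git", "add", "--all"],           # stage every change in the working tree
        ["git", "commit", "-m", message],  # record the changes in the local repository
        ["git", "pull", remote, branch],   # merge collaborators' work; conflicts show up here
        ["git", "push", remote, branch],   # publish the result to the shared repository
    ]


def run(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command, and execute it as well unless dry_run is set."""
    for cmd in commands:
        print("$ " + shlex.join(cmd))
        if not dry_run:
            # check=True stops at the first failing command, such as a
            # pull that ends in a merge conflict.
            subprocess.run(cmd, check=True)


# Hypothetical commit message, printed only (dry run).
run(commit_and_share("Add autoregressive residuals to the spline model"))
```

If the pull reports a conflict, the usual fix is to edit the conflicting files, commit again and then push, which matches the fix, re-commit and push loop described above.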

June 13 2011. It’s still three weeks before I’m due to arrive in Finland. I upload the code from the book chapter on Bayesian Splines that I’ve been writing for BRAG. Bjarke and I spend a bit of time emailing back and forth about how splines work, as he hasn’t used them in a regression framework before. Bjarke has sent me a copy of the draft of his paper on a GLM with autoregressive residuals. I’ve still got the 8BNP workshop to attend before arriving in Finland.

July 5 2011. I arrive in Finland and meet Tareq and Bjarke for a meeting. We take a copious amount of notes during a long discussion where we set out what we want to achieve long term and what we want to have finished by the time I leave. The aim is to at least have some working code that combines my splines with Bjarke’s code that does autoregressive residuals.

July 6 2011. Bjarke’s code is added to the git repository and we get to work understanding what the other person has written. We’re both still getting to grips with how git works and end up accidentally making new branches. I spend most of my time annotating code so that I know where to look when things inevitably go wrong. Time is spent ensuring we have ways of visualising our results so we know if things are going totally wrong.

July 7-8 2011. We spend the next few days attempting to stitch the code together. Bjarke doesn’t use Google Chat or Facebook so there’s a little email correspondence at this time but it’s mostly office conversations.

July 9-10 2011. No work happens here as Bjarke and I are holidaying with his in-laws for the weekend at a summer cottage near Lappeenranta (near the Russian border).

July 11-16 2011. This is the most creative and chaotic period of working on the paper. Notes are made on A4 paper, transcribed as notes in a text file on git when they are worth following up and abandoned when they don’t lead anywhere. We start really getting to grips with multivariate splines, Metropolis-within-Gibbs, testing out new ideas, making new branches, merging them when they work, deleting them when they don’t, scribbling maths out on pieces of paper and running up and down the corridors whenever there’s a breakthrough.

July 19-31 2011. I return to Australia and we spend some time writing about what we managed to get done while I was overseas. We’re back to one branch and are largely discussing the methodology and making sure plotting works.

August, September 2011. I continue making changes to the way autoregressive residuals are handled, Bjarke codes up some diagnostics and begins examining a wide range of model specifications for the air quality data we’re working with in order to come up with a way of illustrating how what we’ve done is so cool.

October, November 2011. Some changes are made to the way the penalties are handled, the code becomes more functional and most of the focus is on plotting, diagnostics and model choice. Plots are saved as PDF files using export_fig.m within our script and are brought under the control of git so that we can replace one set of results with another in a single commit.

December 2011. Some radical changes are made to the way the autoregressive error structure is passed to the model, making it more flexible. These changes are contained in a separate branch so that Bjarke can continue working on his model comparison knowing that his code will continue to run. He checks it out and offers feedback.

January 2012. A lot of work is done on making sure the paper explains what’s going on. A few more features are introduced and the code is commented heavily.

February-April 2012. Bjarke spends a lot of time making sure the scripts to call the model fitting, forecasting and diagnostics work properly.

May 2012. A draft paper is sent around for feedback; some changes to the description of the method are recommended, as are a few different model specifications. Development on the code itself has stopped but the diagnostics, plotting and inference continue. Much of the work is now happening on QUT’s supercomputer as competing models are tested. Writing about the autoregressive errors is filled out a bit to ensure that the forecasting is highlighted.

June 2012. The paper is almost finished. We’re waiting on feedback from a co-author who has been quite sick. There have been some large rewrites based on Kerrie’s feedback, mostly to change the order so that it’s a punchier article which highlights the novelty of the method rather than me just talking about how cool splines are. Support is being canvassed among the authors for uploading the draft to arXiv and releasing the code once the paper is published.

And that’s where we stand at the moment. Hopefully I can make the git repository public and you can have a look at what’s happened and where we’ve come from with this. It might need a bit of pruning first to make sure that no data that shouldn’t be publicly available is accidentally made public. There’s a minimal working example in the code where we simulate some data, so hopefully that’s enough to demonstrate what we’ve done. There are some really neat ways of visualising the work done on GitHub, including a network diagram of the committed changes and branches, contributions of each person over time, when commits occur most frequently, what (programming) languages the project uses and how frequently additions and deletions occur (and therefore the growth rate of the project).

I hope this sheds some light on the process that’s been used. GitHub was basically a way for the QUT and Helsinki groups to collaborate, with Bjarke and I acting as the conduits for reviews and comments. Git allowed us to write a whole bunch of code together, following up all sorts of crazy ideas without getting in each other’s way. The paper was written as we went and is subject to the same version control (after all, LaTeX is code too). I have found it a really great way of working. I’d like to see how it goes with a few more people programming and whether I can work with a few other people to try to make the changes to the paper directly via git rather than me making the changes based on notes scribbled on a printed copy.
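
The pattern of trying a risky idea on its own branch and merging it back only once it works might look something like the sketch below. It is an illustration under stated assumptions rather than a record of what was actually typed: the branch name `flexible-ar-errors`, the commit message and the helper `branch_and_merge` are hypothetical, and `master` is assumed to be the main branch. The code only prints the commands and does not run git.

```python
import shlex
from typing import List


def branch_and_merge(feature: str, main: str = "master") -> List[str]:
    """Return shell lines for working on a side branch and merging it back."""
    steps = [
        ["git", "checkout", "-b", feature],                   # create the side branch and switch to it
        ["git", "commit", "-a", "-m", f"Work on {feature}"],  # commit changes to tracked files there
        ["git", "checkout", main],                            # switch back to the main branch
        ["git", "merge", feature],                            # bring the side branch's commits in
        ["git", "branch", "-d", feature],                     # delete the side branch once merged
    ]
    return [shlex.join(step) for step in steps]


# Hypothetical branch name, used only for illustration.
for line in branch_and_merge("flexible-ar-errors"):
    print("$ " + line)
```

While the side branch exists, anyone working on the main branch keeps running code that has not been touched by the experiment, which is what lets two people pursue different ideas without getting in each other's way.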

P.S. Wow, I can’t believe it’s been nearly a year since we started working on this. Well, I can, as we had a few delays where it turned out we needed to rewrite large chunks of code and the paper.

P.P.S. I just managed to merge the development branch with the modified way of dealing with the residuals back into the master branch without there being any conflicts. I didn’t expect conflicts but it’s nice to know that everything’s back in the master branch. Below is an image of the commit history. It doesn’t show the number of changes in each commit, but given that commits occur when an idea has been tested or a section written, it’s a good indication of a parcel of work being done.

Committed changes for the Finnish paper

For interest’s sake here’s a map of my time in Finland. I haven’t got the exact location of the summer cottage but it’s near Taipalsaari. Here’s my collection of photos from my time in Finland. I had originally uploaded them to Facebook and given detailed captions but the move to Google Plus ended up removing the captions. Leave a comment on them asking a question if you want to know more.

ISBA update – logistics

This morning I booked my flights to Osaka ($670 return, a very good deal) and am in the process of organising accommodation with Tamara Broderick. I’m getting excited about catching up with the various people that I met at the BNP conference last year and it’s even more exciting to see their names on the list of people giving talks. Tamara will be talking about Beta processes and Sergio Bacallado is giving a talk alongside Sonia Petrone and Yee Whye Teh, two of the big names in Bayesian non-parametrics.

A few QUT people are talking as well. Nicole White and Susanna Cramb will be talking about spatio-temporal disease modelling. One of my supervisors, Sama Low Choy, will be talking about combining subjective priors (expert elicitation) and another, Kerrie Mengersen, is organising that session and is a co-author on a few papers that are being presented by people in her BRAG group, as well as posters (such as mine).

I’m also looking forward to seeing presentations on INLA, spatial modelling, non-parametric estimation and environmental modelling.