API promised for 2011 Census data

Chances are, you missed last month’s publication of the Cabinet Office’s white paper on the 2011 Census. ‘Modern times demand modern approaches,’ declares Sir Michael Scholar, chair of the UK Statistics Authority: you’ll be able to complete your census form online, and ‘all standard outputs will be publicly accessible online, and free of charge, from the National Statistics website’ (whatever that is – as I understand it, the name disappeared in the UKSA rebranding).

The Census represents a marvellous opportunity. We’re many years into the post-web world, and online is now the main distribution channel for data. We’ve got several years to learn from the best practice of others, be they fellow statistical organisations around the world, or heavy-duty data disseminators like the financial markets. There’s no issue as regards a business model: the commitment to free availability has already been made. It’s an open goal.

Unfortunately, I probably wrote something almost identical to the preceding paragraph seven years ago, when I started working for ONS as Web Editor in Chief, full of optimism at what magic we could weave with the 2001 census data. It didn’t last; there was virtually zero consideration of public usage in the output plans, and I couldn’t persuade the key people of the cultural shift happening outside. There were some blazing rows. I left ONS in 2004; it says something that the website I built as a six-month stopgap in 2002 is still their main web presence – reskin aside, almost exactly as I left it.

A quick skim through the white paper provides little reason to restore my optimism. It has more to say about printed books of preformatted tables than it does about electronic methods – there’s no fleshing-out of what ‘online dissemination’ might mean. Instead, there’s a commitment to produce CDs and DVDs… seriously? in 2012?

But there may yet be hope. Back in December, ONS quietly launched a 2011 UK Census Output consultation – based, remarkably, on a Wiki platform. They’ve published initial survey findings from 500+ respondents, half of whom were in government; it’s a bit disappointing to see so little input from potential new customers (only 2%), as opposed to the ‘usual suspects’. Yet a clear majority of this normally conservative (small ‘c’) audience said they would be happy with electronic output alone.

And hallelujah! – elsewhere on the wiki there’s even a statement that the ‘intention is to support a variety of electronic dissemination options through the use of an internet-based API [said on another page to be ‘publicly-available’] that can access the full range of aggregated Census statistics.’ There’s also a link to a list of the API calls to be offered – but it ‘does not (yet) exist’. Many a slip twixt cup and lip, as they say… but they’re undoubtedly talking the right talk here, and perhaps that’s all we can ask at this stage.

My only plea is that they remember the huge potential value for new users. Things have moved on dramatically since 2001; I can think of countless websites which would adore a system they could hook into, with fantastic potential benefits to ordinary web users. The wiki’s list of planned response formats betrays the ‘insiders first’ instinct again: nothing your average masher will be familiar with. Consult your community by all means, guys; but recognise there’s an even wider potential community these days.

  • PS: It’s not the Census group’s first venture into social media: two years ago, they took part in the Hansard Society’s Digital Dialogues initiative, with a blog centred on consultation on small area geography policy. Ten blog posts in three months (over Christmas) isn’t great, and the Hansard Soc was politely critical of the blogger’s failure to engage with the readership, and the organisation’s failure to take the initiative forward. Interestingly, the site has been wiped from the record books: the Hansard Soc’s graphics have been replaced by Flickr errors, and the onsgeography.net domain name appears to have lapsed. There’s always web.archive.org though… :)

Set the Census data free

One particularly difficult phase of my career was my time with National Statistics, in the aftermath of the 2001 Census. I tried, and ultimately failed, to persuade the organisation to recognise the tremendous asset they held in Census data, and to make wide public access a priority. I’m proud of some of the (relatively modest) things we managed to put out, but overall I’m disappointed at the many opportunities that were missed.

I remember my frustration at how everything was driven by very narrow ‘stakeholder consultation’, which ultimately resulted in the same old people asking for the same old things. The potential for civic engagement ranked well down the list of organisational priorities; the possibilities for data mashing didn’t even register. Despite the huge sums of money spent on countless consultancies, the end product was – ahem – somewhat underwhelming.

So when I discover that the 2011 Census outputs are the subject of the latest blog-based consultation, part of the Hansard Society’s Digital Dialogues programme, of course I’m interested. And I think we all should be.

Two dates to bear in mind here. It’s nearly a year since the publication of the Mayo-Steinberg Power Of Information report, which called for ‘a strategy in which government … supplies innovators that are re-using government-held information with the information they need, when they need it, in a way that maximises the long-term benefits for all citizens.’ And just as importantly, we’re probably five years away from the first publication of census data.

This must be the first Census to take a truly web-first, and arguably even an API-first, approach to publication. Several reasons:

  • Because it’s a one-off event, for which we have several years to prepare.
  • Because if you think the world is web-first in 2008, just you wait and see what 2013 looks like.
  • Because outsiders – from Experian to MySociety – will almost certainly do a better job than the Civil Service (sorry).
  • Because it doesn’t actually prevent government doing the ‘old school’ thing itself, if it wants. In fact, if you think ‘API first’, it’ll probably result in the ‘old school’ outputs coming together more easily and quickly too. Be your own client.
  • Because to have any validity, the Census requires the goodwill and engagement of every person in the country. It’s one of the rare occasions where every resident puts something into a national kitty. Even if it’s only symbolic, this should be the prime example of the state giving something back to them in return.
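
To make the ‘be your own client’ point a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the function names (get_population, as_json, as_printed_table), the area codes and the figures are all hypothetical placeholders, not anything ONS has published or promised. The idea is simply that one data call can feed both the outside developer’s response and the traditional preformatted table.

```python
import json

# Placeholder figures, invented purely for illustration: not real Census data.
_POPULATION = {
    "AREA-A": {"name": "Example Area A", "population": 1200},
    "AREA-B": {"name": "Example Area B", "population": 850},
}


def get_population(area_codes):
    """The single hypothetical 'API' call: one record per known area code."""
    return [
        {"code": code, **_POPULATION[code]}
        for code in area_codes
        if code in _POPULATION
    ]


def as_json(area_codes):
    """What an outside developer might receive from the API."""
    return json.dumps({"results": get_population(area_codes)}, indent=2)


def as_printed_table(area_codes):
    """The 'old school' preformatted table, built from the very same call."""
    lines = [f"{'Area':<20}{'Population':>12}"]
    for row in get_population(area_codes):
        lines.append(f"{row['name']:<20}{row['population']:>12,}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(as_json(["AREA-A", "AREA-B"]))
    print(as_printed_table(["AREA-A", "AREA-B"]))
```

Both outputs go through get_population, so the in-house table is just another client of the same interface the mashers would use; fix the data once and every output benefits.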

This is one government consultation where the geek community (by which I mean us, sadly) should bring its influence to bear. We all know it’s the right thing to do; but they won’t do it unless there’s a sizeable, quantifiable demand. This would be a huge symbolic victory for openness and democratisation. This is our chance.