taxonomy – Puffbox

9 Apr 2010

company, e-government

defra, taxonomy, wordpress

Our new Defra site plays nicely with PDFs

In the late March dash to complete jobs before the end of the financial year, not to mention the imminent election declaration, I pushed a couple of websites live without having the time to blog about them.

One of these was another little job for Defra: a WordPress-based satellite site for their Science Advisory Council. It’s a fairly modest site, sticking very closely to Defra’s house style, and based to a significant extent on previous work I’ve done for them. But it does have one innovation worth flagging.

The site’s main objective is transparency, making documents from its quarterly meetings publicly available. Inevitably, like it or not, that means a lot of PDF files. WordPress’s media library function makes it easy to upload these as attachments to posts (or indeed, pages); but we had the idea to take things a step further.

One underused feature of WordPress is the ability to add ‘custom taxonomies’: distinct classifications based on either the (hierarchical) category or (freeform) tagging model. But if you’re smart about it, you can also hook this same functionality into the media library – allowing you to add ‘tags’ to the documents you upload. And once you’re doing that, you can benefit from all the other features you’re used to with tags and categories.

So there’s a page on the SAC site which presents this custom taxonomy (unimaginatively labelled ‘Upload tags’) as a tag cloud. Click on a tag, and it takes you to an archive page of all uploads (ie PDFs) relevant to that tag – eg meeting minutes – which can be based on a special page template. (And you can also get RSS feeds of each element in the custom taxonomy, by adding ‘/feed’ to the URL in the usual way.)
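For anyone who wants to see the moving parts, here is a minimal, language-neutral sketch in Python of what a tag cloud and per-tag archive boil down to. It is not the WordPress code behind the site (WordPress does this in PHP, via its own taxonomy functions); the file names, tags, font sizes and example URL are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical uploads: (file name, tags). Illustrative data only.
UPLOADS = [
    ("meeting-1-minutes.pdf", ["minutes"]),
    ("meeting-1-agenda.pdf", ["agenda"]),
    ("meeting-2-minutes.pdf", ["minutes"]),
    ("annual-report.pdf", ["reports"]),
]


def build_archives(uploads):
    """Map each tag to the list of uploads carrying it (the 'archive page')."""
    archives = defaultdict(list)
    for filename, tags in uploads:
        for tag in tags:
            archives[tag].append(filename)
    return dict(archives)


def tag_cloud(archives, min_size=10, max_size=22):
    """Scale each tag's font size linearly between min_size and max_size.

    The size range is an assumption; any range works the same way.
    """
    counts = {tag: len(files) for tag, files in archives.items()}
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    spread = (high - low) or 1  # avoid dividing by zero when all counts match
    return {
        tag: min_size + (count - low) * (max_size - min_size) / spread
        for tag, count in counts.items()
    }


def feed_url(archive_url):
    """Append '/feed' to an archive URL, as described above."""
    return archive_url.rstrip("/") + "/feed"
```

Calling `build_archives(UPLOADS)` gives you the list of files behind each tag, `tag_cloud(...)` turns the counts into font sizes, and `feed_url(...)` shows how little there is to the RSS trick: the feed address is just the archive address with ‘/feed’ on the end.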

We’ve also written a routine into the standard post template to extract details of attached files, and present them automatically in a nice table, with a bit of automated metadata too – see this page as an example.
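The gist of that kind of routine can be sketched in a few lines. Again, this is Python rather than the PHP template code on the site, and the choice of metadata (file type and size) is my assumption for the example; the `Attachment` class, its fields and the helper names are hypothetical.

```python
from dataclasses import dataclass
from html import escape
from posixpath import splitext
from urllib.parse import urlparse


@dataclass
class Attachment:
    """Hypothetical record for one file attached to a post."""
    title: str
    url: str
    size_bytes: int


def file_type(url):
    """Derive a label such as 'PDF' from the file extension in the URL."""
    extension = splitext(urlparse(url).path)[1]
    return extension.lstrip(".").upper()


def human_size(size_bytes):
    """Format a byte count as B, KB, MB or GB."""
    size = float(size_bytes)
    for unit in ("B", "KB", "MB"):
        if size < 1024:
            return f"{size:.0f} {unit}" if unit == "B" else f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} GB"


def attachments_table(attachments):
    """Render the attachments as a simple HTML table, one row per file."""
    lines = ["<table>", "<tr><th>Document</th><th>Type</th><th>Size</th></tr>"]
    for item in attachments:
        lines.append(
            '<tr><td><a href="{}">{}</a></td><td>{}</td><td>{}</td></tr>'.format(
                escape(item.url, quote=True),
                escape(item.title),
                escape(file_type(item.url)),
                human_size(item.size_bytes),
            )
        )
    lines.append("</table>")
    return "\n".join(lines)
```

The point of the design is that nobody has to type the table by hand: attach the files to the post, and the template works out the rest from the files themselves.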

There are plenty of reasons why you shouldn’t rely on PDF as your primary publishing channel; but sometimes, you have to accept it’s the only practical solution. And in this case, I hope we’ve shown it’s possible to make something of a virtue of it.

4 Jun 2008

news, technology

bbc, taxonomy, wikipedia

BBC's new /topics pages

See, this is what you can do when you’ve got lots of information, all properly tagged and structured. The BBC’s new /topics pages are entirely automated, and pull together content from across their online offerings – iPlayer, the News site, weather, /programmes – into a nicely presented ‘everything we know about X’ page. A modest 66 topics to start with, by my calculation, but the promise of many more. Try these examples: NHS, Gordon Brown, Liechtenstein. (And check out the pretty addressing, too.)

Over on the BBC Internet Blog, Matthew McDonnell explains how it uses ‘a variety of search techniques to create feeds of the latest BBC content’. I’m guessing a lot of it is down to a subject taxonomy, or free-text search for certain keywords. However it works, its beauty is encapsulated by this section:

Because the overhead involved in maintaining these pages is so low, we can cover many more subjects than we could using traditionally edited pages which had to be manually updated by a human being. As the feeds used in /topics are automatic, we can be confident that all the pages are bang up-to-date.

In many respects, this is the ‘holy grail’ of every taxonomy project. Well done to the BBC for actually making it happen; although it’s likely to encourage others to attempt to follow suit. And most will fail. (Yes they will.) And for the future?

We want to include high quality content from outside the BBC to enhance our pages. We’ll be working on providing feeds of news and blogs from sources other than the BBC. Yes, feeds [for you to build into your own website] will be available soon.

It’s genuinely brilliant: can we call it a hybrid of Wikipedia and Wikinews, with the added benefit of trusted editorial oversight? Just please, don’t try it at home.