Information vs. understanding: Wikileakising the news

So I fire up the computer again after my statutory two weeks, only to catch the tail end of a diplomatic crisis. Apparently Wikileaks have released a load of stolen cables too, and it’s this latter (clearly less significant) scandal that I’d like to comment on. I know it may border on heresy to express this opinion online, but personally I am deeply concerned about the wikileakisation of the news (to coin an appropriately ugly term). It’s been going on a lot recently, and the fact that the latest leak happens to embarrass people whom I rather dislike should not disguise the similarities with so-called climategate last year. I wrote then about the right of scientists to be indiscreet in their private communications, which in fact boils down to the simpler statement that I believe that lines of private, written communications should exist. Diplomats perhaps shouldn’t have the same license for indiscretion (the clue’s in the name), but quoting selected morsels of communications which are removed from all context is no way to understand the world.

Let me expand. I rely for my research on data collected by other people – often very large cooperatives of other scientists. And the first tenet of data-driven science is that data without metadata is useless. ‘Metadata’ is simply data about data: so for instance, it might be a document describing the column headings in a spreadsheet, or the field protocols used to collect the data (with information on known biases and so on). Presented only with the raw data, I can run fancy statistical models to my heart’s content, but the results will be completely devoid of meaning. I would never get such analyses published unless I could demonstrate a clear understanding of the scope and limitations of the data. To the extent that I could ever be considered an ‘expert’, this is my job: not simply to provide titbits (or even torrents) of information out of context, but to perform sensible analyses on the data and to interpret these in the context of what is known about the system. Of course, I am in favour of supporting any conclusion I might make with full access to the data and methods I used to arrive there, but that is somewhat different. Wikileakisation places the cart of data in front of the horses of professional competence. Were I a political journalist I would be ashamed of having my expertise undermined, but most seem content to report gossip as fact or insight.

As far as the clamour for complete disclosure, for absolute freedom of information goes, I would argue that simply spewing out data in an uncontrolled torrent is not complete disclosure. Quoting what diplomat x said about minor royal y to ambassador z does not fully inform us about the reality of the situation, not unless we understand the personal relationships involved, and the political context behind the communication.

In much the same way, in order to interpret a (hypothetical) email I sent along the lines of ‘this work is awful, we should try to make sure it is never published’ it would be essential to know whether I was addressing a good friend – and so to take it as a throwaway comment – or a senior academic editor with whom I had never shared a pint, in which case the implications would be somewhat different.

I think we could all agree that actively concealing information is rarely defensible (although more often so in diplomatic than in scientific circles). But at the same time, information is not understanding, and we should not confuse the two.

And I say this as a parent, so it must be true ;-)

Random Nature

I hope you’ll forgive my short attention span at the moment – if you read my last post, you’ll know why. What I thought I’d do, then, is to pluck a few random things from Nature that have interested me in the last few weeks.

Publish your computer code

First, I really enjoyed Publish your computer code: it is good enough by Nick Barnes, who begins with the reassuring:

I am a professional software engineer, and I want to share a trade secret with scientists: most professional computer software isn’t very good

He goes on,

…you scientists generally think the code you write is poor. It doesn’t contain good comments, have sensible variable names or proper indentation. It breaks if you introduce badly formatted data, and you need to edit the output by hand to get the columns to add up. It includes a routine written by a graduate student which you never completely understood, and so on. Sound familiar? Well, those things don’t matter.

The point being that software is written to the point at which it is good enough – if your code does the job, it does the job, and that makes it good enough to do the job for someone else too. I couldn’t agree more. For instance, I’ve used some pretty awful R code from various supplementary appendices – but, with a bit of fiddling with names and formatting, I’ve got it to work, and with substantially less effort than had I started to program the routine from scratch. Some of my own code is doubtless equally clunky – I try to make it legible, reasonably well commented, and with a certain degree of robustness to catch errors, but as soon as it works, I tend to stop. Writing code, though, is a big part of what I do, and I like the fact that my efforts might be useful to others too. Then there’s the selfish side: if people use my code, they cite my work.

The origins of genius

Robert J. Sternberg’s review of Sudden Genius: The Gradual Path to Creative Breakthroughs by Andrew Robinson and Where Good Ideas Come From: The Natural History of Innovation by Steven Johnson, was fascinating. I’ve not read either book, but the review itself was insightful.

First, just to get this off my chest – Sternberg quotes Robinson as stating that ‘determination, practice and coaching’ are more important than innate talent in the creative process. This is a bugbear of mine: that somehow, the ability to practice determinedly is available to all, regardless of nature or nurture. So failure is a consequence of not trying hard enough, rather than being unable to try hard enough. I see no reason why there should not be as much of a genetic component to being driven to practice as there is in any other talent. Sternberg also points out that those lacking aptitude are more likely to give up – after several years as a kid torturing the dog with my violin, I tend to agree. I would have been much more inclined to put in the hours of practice if I’d seen some kind of concomitant increase in skill (as I did when I picked up a guitar instead).

I do agree with Robinson’s conclusion (as paraphrased by Sternberg) that genius is on the wane due to the need for increased specialisation, particularly in the sciences, which precludes the development of the kind of broad thinking necessary for genius. Sternberg continues,

And the astonishing amounts of complex knowledge that must be mastered today prevent most researchers from making deep connections between disciplines. Cross-disciplinary work, more and more, requires teams.

Interestingly, a few weeks before, Nature carried a letter decrying the specialisation of teaching in university, and calling for curriculum reform based around interdisciplinary themes – their manifesto makes for very interesting reading.

A defence of universities comes from Johnson’s book: (according to Sternberg) they

…encourage the free interchange of ideas. To maximise creativity, you need both the availability of a network [of minds] and the random collision of ideas within it, and universities offer both.

Well, they should do, anyway – I’m not sure that’s always the case, which is why I’ve got so much out of interdisciplinary programmes like NESTA Crucible, and Frontiers of Science – which provide the network and the collisions in a very concentrated form.

Sternberg’s own view is that inventive people are ‘crowd-defiers’, prepared to be combative in their challenge of received wisdom. Of course, this is also a characteristic of various other, less positive groups (conspiracy theorists, for example), but I do like this:

…the more creative an idea is, the harder it will be to sell. Reviewers of grant proposals and journal articles must recognise that highly creative research may be less developed than that which furthers established paradigms, and should make more allowances for originality.

Amen to that!

Don’t Fly Me To The Moon

Finally, mentioned almost as an aside in the editorial Space hitch-hiker was a study by Martin Ross in press at Geophysical Research Letters on the potential environmental impact of commercial space flight. Or rather, the editorial focused on this study, but what I felt was the striking conclusion of the study was barely mentioned – that 1000 commercial space flights a year (a plausible number for the not-too-distant future) would “…add as much to climate change as current emissions from the global aviation industry.”

Now, I’m as disappointed as anyone who grew up wanting to be Han Solo that spaceflight isn’t yet commonplace. But, to double the climate impact of the aviation industry, for the momentary pleasure of a handful of the world’s richest people? Sorry, but isn’t that a tad… well… irresponsible?

Antinatal

I’m approaching impending dadhood (very impending, in fact – D-Day Thursday (although see below for the difference between ‘due date’ and ‘expected date’…)) with what I assume is about the usual mix of excitement and fear. One thing I definitely can’t wait to be rid of, though, is the torrent of antenatal advice that floods in from all directions. Oh, I know, advice on every aspect of parenting is sure to follow, but I figure I will at least be in a better position to sort the wheat from the chaff once Jr. enters the real world and ceases to be (for me – his/her mum, of course, has already become reasonably intimately acquainted with him/her as an actual physical being!) somewhat hypothetical. Now, as Mike wrote the other day, parenthood comes with no manual, so you would think I’d be grateful for advice. And that’s absolutely right – there are things we didn’t know, that we needed to know, and that we do now, thanks to a few NHS antenatal classes, as well as those run by the NCT (all of which I missed – Bad Dad!) and some yoga (which aimed to relax us, but had the opposite effect on me through use of the term ‘energy’ in a non-thermodynamic sense). The problem is not that there is no good, sensible advice. It’s that it is more or less impossible to sift this out from the rubbish. You just know that some of what you hear or read comes straight from anti-vaccine nutjobs, or at least people who would rather place their trust in mystical gobbledegook than the nasty medical establishment. But, without spending an age tracking down sources for each bit of information, it’s very hard to tell these points of view apart.

Part of the problem is the general aversion to numbers in any of the leaflets, classes, or whatever. Now, I know there are reasons for this. For instance, the midwife refused to give me any kind of ballpark figure about what, in degrees C, constitutes ‘warm’ (for a baby’s bath), largely I think because she doesn’t want people becoming absolutely neurotic about such things. (I asked, incidentally, out of interest – I used to make a lot of bread, so have a reasonable feel for different temperatures of water, and was curious what ‘warm’ is!)

Other times, it is to do with the nationalised nature of our health service. Now, I will not have a bad word said about the NHS – it’s a wonderful institution that, despite its faults, works amazingly well, and I fear for it under our current government. But, the NHS has to issue advice that will do the maximum good to the maximum number of people (the National Institute for Health and Clinical Excellence has the tough job of making such calls, and does so very well). The calculation has clearly been done, for example, that the potential risk to the developing foetus of too much vitamin A outweighs, on average, the benefits of certain foods to pregnant mums: hence, no liver when pregnant. An individual-level assessment may have concluded that, in our case, liver would have been good – and would have maybe avoided a recent bout of anaemia.

Of course, as a society we are terrible at assessing risk, so it’s perhaps not surprising that there are so few details given about these things that thou shalt not do. So for instance, with an excess of vitamin A, what does the risk of foetal abnormality increase from and to? If we hadn’t bought a new mattress for the second-hand cot, I know from the FSID (who, by the way, do fantastic work, which you all should support) this would apparently increase the risk of Sudden Infant Death, but again, from what, and to what? For some issues, some of us might accept even a doubling of risk in some circumstances, if it was from, say, a minuscule risk to a tiny risk.
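
To make the ‘from what, and to what?’ question concrete, here is a minimal Python sketch of the arithmetic. The baseline risk and the doubling are invented purely for illustration; they are not real figures for vitamin A, cot mattresses or anything else, and the function name is my own.

```python
def absolute_risk_change(baseline_risk, relative_risk):
    """Return (new_risk, absolute_increase) for a baseline and a relative risk."""
    new_risk = baseline_risk * relative_risk
    return new_risk, new_risk - baseline_risk


# Hypothetical numbers only: a 1-in-100,000 baseline risk that doubles.
baseline = 1 / 100_000
new_risk, increase = absolute_risk_change(baseline, relative_risk=2.0)

# Report everything per 100,000 so the absolute change is easy to read.
print(f"baseline risk: {baseline * 100_000:.1f} per 100,000")
print(f"new risk:      {new_risk * 100_000:.1f} per 100,000")
print(f"extra cases:   {increase * 100_000:.1f} per 100,000")
```

The same ‘doubling’ applied to a baseline of 1 in 100 would mean a thousand extra cases per 100,000 rather than one, which is why the relative figure on its own tells you so little.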

Now, I know this level of detail is going to be of interest to a very small proportion of expectant parents. And if I really wanted, I could spend hours on the internet hunting down sources (you would think, anyway – although a colleague, having absent-mindedly made some mayonnaise for his partner, failed in a flurry of panicked research to find any evidence of a raw egg ever having hurt anyone during pregnancy!) But for those of us who take an interest in such things, without obsessing over them, it would be nice if routine leaflets and so on had some kind of… appendix, I suppose, with some numbers there.

One final bugbear – as I said, our ‘due date’ is this Thursday, 18th. But, as everyone knows, first children are always late (or, about 90% of them are). So, the ‘due date’ is clearly not the ‘expected date’, in terms of the day that Jr. is most likely to appear. Wouldn’t it make more sense to give us a due date that reflects the expected arrival? Then I have a day or two more, anyway, of not jumping out of my skin every time the phone goes.

Right, off to take Mike’s advice and get some pre-emptive sleep…