Evidence in Science and Policy

Ben Goldacre’s Bad Science column has recently been moved to the inside back page of the Saturday Guardian, which means I read it over breakfast (like most people with a passing interest in sport, and old enough still to read news on paper, I always read newspapers back to front, even when the sport is in a separate section…). Last week, he wrote about the dearth of evidence in politics – specifically, about the resistance to actually finding out (collecting and analysing evidence, in other words) whether policies do what they were intended to do. I couldn’t agree more, and it’s an issue that has been frustrating me for a while. Those of us involved in environmental science (and I suspect it’s the same in other areas) are constantly bombarded with calls to feed into the ‘evidence-based policy’ process. Now, basing policy on evidence seems to me a very good idea (although I suspect that ‘evidence-based policy’ is usually just a stalling mechanism – a way to avoid making difficult decisions by constantly calling for more evidence before acting), and one result of this push is that the evidence base for phenomena like climate change and biodiversity loss is fast becoming exceptional.

But I’ve often felt that whereas ‘scientific’ policy is held to very high standards of evidence, the same is not true of ‘social’ policy (nor of economic policy, which may be the subject of a future blog…).

Rather, when considering which policy to advocate, politicians seem as likely to be swayed by a snappily-titled book as by any substantive body of evidence. A title like Blink (‘the power of thinking without thinking’), Nudge (‘improving decisions about health, wealth and happiness’), or Sneeze (‘harnessing the creative power of hayfever’) is ideal (OK, so I may have made one of those up): a (sometimes good) idea is stretched well beyond its limits, and a hodge-podge of facts is crammed into this shaky framework. The Big Society beloved of Mr Cameron falls into this category: a scheme which nobody has tested, but on the basis of which incredibly important decisions are now being made. (For my money, the ‘big’ is redundant anyway; all that’s being described is what we used to call society (when such a thing still existed…).)

So, yes, Ben Goldacre is absolutely right: let’s get evidence into the policy process, and put some numbers behind big decisions (such as the voting system). If, say, we make wholesale changes to the NHS, triple university tuition fees, or whatever, we must carefully record the outcomes of these interventions so that in years to come, we have a fighting chance of deciding whether or not they succeeded.

Where I depart slightly from Goldacre is in how we do this. He (like most medics) is a firm believer in the randomised controlled trial, a tremendously powerful way to assess the efficacy of a medical procedure. In some cases, it may be feasible to perform analogous trials in social policy, but this will rarely be the case – you can’t, for instance, change the whole governance structure of one hospital in a region without changing others; and if you then end up comparing regions, the randomisation is lost, as regions will differ in all sorts of other ways.

I should add that Goldacre’s column is predicated on two books about randomised trials in social policy, which I haven’t yet read. My scepticism derives more from my experience in applied ecology, where there has recently been a move to adopt medical methods – specifically, systematic reviews – to assess the outcomes of conservation interventions. The problem is, ecosystem manipulations are not clinical trials. Often there is no standard intervention, and even if there is, it may be applied to very different systems (differing in species composition and in all kinds of physical characteristics, not least spatial extent). And often, too, there is no agreed-upon outcome – I could increase the species richness of my garden, for instance (at least for a while), by introducing Japanese knotweed, but few would argue that that would be a ‘good’ conservation outcome. In medicine, you treat a patient and they get better or they don’t, making comparisons between trials much more straightforward.

The solutions that environmental scientists have generally come up with are highly sophisticated statistical methods, allowing us to draw powerful inferences from nasty, heterogeneous data. Similar methods have of course been applied to social systems, but somehow they don’t seem to feed through to policy, at least not as often as they should; and even when they do, they risk being ignored if their message is politically undesirable.

To return to the original point, improving social and environmental policy requires that we know what has worked, and what has not, in the past and elsewhere. So solving this evidence problem (i.e. gathering the evidence, and communicating it) should be a top priority for both the natural and the social sciences.

Tense

I am in paper-writing mode, which means (among other trials and tribulations) I am wrestling with the issue of tense. Specifically, did I apply my methods in the past, or am I implementing them now? Did my results show something, or are they still showing them? Which tenses need to match (and when…)? Primarily, this is a matter of style, and there seems to be a consensus that the present tense is somehow indicative of more exciting, more relevant results. Start your paper ‘Here, we show for the first time…’ and you are right on the bleeding tip of the cutting edge. ‘Here, we showed for the first time…’, on the other hand, and you are already yesterday’s news.

Now, I’ll plead guilty to sprucing up my dry academic prose with liberal sprinklings of immediacy, probably more often than is strictly healthy. But this kind of perky present tense can really grate if used to excess, and can lead one into tricky little linguistic culs-de-sac to boot. A special bugbear of mine, for instance, is the insistence of all TV and radio historians on relating long-past events exclusively in the present tense. “In 1066, William invades England. After a bloody battle, he is victorious…”, and suchlike. I assume they have all been told that it somehow brings history alive to talk in this way. It didn’t, it doesn’t, and it won’t. It just annoys me. (If you’ve never noticed this before, you will now, and I’m afraid I may have ruined your enjoyment of the otherwise wonderful Simon Schama, for which I apologise!)

To return to the matter in hand, the particular problem with writing, say, a description of your statistical methods in the present tense is what happens when one thing leads to another. For example, suppose you fit(ted) a linear model to some data, but on inspection of the model output, decide to remove a non-significant interaction in order to make interpretation easier. You could find yourself writing:

“We model y as a function of x1, x2, and their interaction. Because the interaction is not significant, we exclude it and re-run the model”, which doesn’t seem right to me – you have described a sequence of events, but only one point in time. But if you start with “We modelled y as a function of x1, x2, and their interaction…” you are then committed to the past tense throughout.
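
Incidentally, the analysis itself is easy enough to describe in code, whatever tense you narrate it in. Here’s a minimal sketch of the two-step dance above, in Python with statsmodels – the data and variable names are invented purely for illustration, and any stats package would do just as well:

```python
# Minimal sketch: fit a model with an interaction, then drop the
# interaction and re-fit if it turns out non-significant.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
df = pd.DataFrame({"x1": rng.normal(size=100), "x2": rng.normal(size=100)})
df["y"] = 2 * df["x1"] - df["x2"] + rng.normal(size=100)  # no true interaction

full = smf.ols("y ~ x1 * x2", data=df).fit()  # interaction included

# The sequence described in the text: inspect, then simplify
model = (full if full.pvalues["x1:x2"] < 0.05
         else smf.ols("y ~ x1 + x2", data=df).fit())
print(model.summary())
```

The code, of course, is tenseless; the problem only arises when you have to narrate it in prose.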

Similar choices have to be made in the results. Is y significantly related to x, or was it? I tend to prefer the present in this case, because to use the past tense implies that the results were somehow contingent on something specific I did on the single occasion I ran the models for the paper, rather than being a constant property of the data (analysed according to my protocol).

But consistency is the key. And with a large pool of co-authors, and sufficient iterations of a manuscript through multiple drafts, tenses do tend to drift. So you can find yourself performing an experiment which produced certain results, and committing other such logical slips.

None of this really matters, perhaps. But if you want reading your paper to be a pleasant experience (as well as a necessary one, naturally) for your peers, then maybe it is worth plotting the timeline of your sentences to make sure no wormholes have appeared.

(For more on scientific writing, by the way, Tine Janssens has collected a load of good links in her latest interesting post, so rather than replicate them here, I’d encourage you to read them there.)

To AV or to AV not?

We go to the polls in the UK tomorrow, in a national referendum to decide whether we should change the voting system we use in parliamentary elections. The choice is between our current system, First Past the Post (FPTP), and the Alternative Vote (AV) system. There’s a good, non-partisan explanation of the alternatives here, but briefly: under FPTP each voter gets a single choice, and the candidate with the most votes wins – even if they receive rather a low percentage of all votes cast. Under AV, voters rank candidates in order of preference; the last-placed candidate is eliminated and their ballots transferred to the next preference, repeatedly, until one candidate has at least 50% of the votes. Proponents of the AV system argue that it will result in fewer ‘wasted’ votes – so I could vote, say, Green, but have Labour as a second preference to indicate that I would rather they got in than the Conservatives. Opponents balk at the idea that someone can win the first round (i.e., get the most votes) but not get elected. I can see pros and cons in both systems, but the debate has descended into pettiness and misinformation, particularly on the part of the ‘No’ (No to AV, that is) campaign.
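
In case that counting rule sounds convoluted, it is actually a very simple algorithm. Here’s a toy instant-runoff counter in Python – entirely my own illustration, not anyone’s official counting code:

```python
from collections import Counter

def av_winner(ballots):
    """Toy AV (instant-runoff) count. Each ballot is a list of
    candidates in order of preference; the last-placed candidate is
    eliminated and their ballots transferred until someone holds a
    majority of the votes still in play."""
    remaining = {c for ballot in ballots for c in ballot}
    while True:
        # Count each ballot for its highest-ranked surviving candidate
        tally = Counter(
            next(c for c in ballot if c in remaining)
            for ballot in ballots
            if any(c in remaining for c in ballot)
        )
        leader, votes = tally.most_common(1)[0]
        if 2 * votes > sum(tally.values()):
            return leader
        remaining.remove(min(tally, key=tally.get))

# Rigged to show the very scenario the No campaign objects to:
# Conservative leads on first preferences (4 of 9 votes), but Labour
# wins once the two Green ballots transfer to their second preference.
ballots = ([["Conservative"]] * 4 + [["Labour"]] * 3
           + [["Green", "Labour"]] * 2)
print(av_winner(ballots))  # -> Labour
```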

In particular, the No campaign have used a range of sporting analogies to suggest that it would be ridiculous in a race for someone to cross the line in first position, yet not be declared the winner. My gut feeling was that this is actually unlikely to happen very often in practice, but I hadn’t seen any data from countries which use AV to back this up.

The Yes campaign have, however, used Australia as an example of a country which has used AV for years without any general trend towards less stable government (coalitions, hung parliaments, etc.) than we’ve had in the UK under FPTP. But what I wanted to know was: how often does a candidate finish second or lower on first preferences, yet end up getting elected?

Turns out, there’s a ton of data easily available to look at this. I found information on the 1998 federal election which stated that “99 of the 148 electorates in the House of Representatives required the distribution of preferences. In 7 of these seats… the candidate who led on primary votes lost after the distribution of preferences.” So the caricature of the No campaign – a ‘loser’ winning – actually happened in only 7 of 148 seats, about 5% (and only 7% even of those seats that went to a distribution of preferences).

I had a look at the 2010 election too. The definitive data are available to download here, for someone to do a thorough job of this – someone employed on a campaign in favour of AV, say. I have neither the time nor the inclination to do a proper job myself, so I relied on the ABC report of the election results.

Looking through each of the 150 seats, I counted 139 (93%) in which the candidate who led on first preferences ended up winning the seat. Of the 11 seats where the first-preference leader lost after the distribution of preferences, 10 were won by the candidate who came second on first preferences; the remaining seat was won by the originally third-placed candidate. Excluding this 11th seat, the eventual winner was on average about 3.1% behind after the first round (range 0.1–9.5%); including all 11 seats, this changes to 3.7% (0.1–14.5%).
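
For anyone minded to redo this properly from the full download, the tallying itself would be trivial – something like the following sketch, where the file and column names are entirely my invention (the real AEC data would need reshaping into this form first):

```python
import pandas as pd

# Hypothetical tidy table, one row per seat: the first-preference
# leader, the eventual winner, and the eventual winner's first-round
# deficit in percentage points. File and columns are assumptions.
seats = pd.read_csv("aus_2010_seats.csv")

overtakes = seats[seats["fp_leader"] != seats["winner"]]
pct = 100 * len(overtakes) / len(seats)
print(f"{len(overtakes)} of {len(seats)} seats ({pct:.0f}%) "
      "were won from behind after preferences")
print(f"mean first-round deficit: {overtakes['deficit_pct'].mean():.1f}% "
      f"(range {overtakes['deficit_pct'].min():.1f}-"
      f"{overtakes['deficit_pct'].max():.1f}%)")
```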

So, in general, Australia offers little evidence that candidates who don’t come first on first preferences will often end up winning seats, except in a (very) few pretty close-run races. Of course, AV may change voter (and campaigning) behaviour in all kinds of ways, some of which may not be desirable. But to base a campaign on such a straw man as ‘losers’ winning is pretty disingenuous.

In the end, it looks like I will be making my decision on this issue based on my distaste for a misleading and negative campaign, rather than through any great enthusiasm engendered by a campaign for positive change.