
Wild extrapolation and the value of marine protected areas

Last week, the UK National Ecosystem Assessment published a follow-on report on the value of proposed marine protected areas (rMPAs) to sea anglers and divers in the UK. This report gained a fair bit of coverage, likely because the headline numbers it proclaimed are quite astonishing: “The baseline, one-off non-use value of protecting the sites to divers and anglers alone would be worth £730-1,310 million… this is the minimum amount that designation of 127 sites is worth to divers and anglers”. Furthermore, they claim an annual recreational value of the rMPAs, for England alone, of £1.87-3.39 billion, just for these two user groups (divers and anglers). These numbers are so astonishing, in fact, that my bullshit klaxon went off loud enough to knock me off my chair. See, I’ve been thinking recently about sea angling as an ecosystem service, and so know that there are estimated to be somewhere around 1-2 million sea anglers in the UK. The number of divers is, I reckoned, likely to be considerably lower (there’s a higher barrier to entry in terms of equipment, qualifications, etc.). So these headline figures imply an annual spend - purely on their hobby - somewhere in the order of £1,000 for every single self-declared sea angler or diver. Which seems rather on the high side, given that one would expect a very long tail of ‘occasional’ dabblers in each activity (e.g. people who spin for a few mackerel on holiday).
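
Here’s that mental arithmetic spelled out in a few lines of R, using the report’s headline range and rough population estimates (the 200,000 diver figure comes from later in the report; the rest are my ballpark numbers):

# Implied annual per-head figure, from the report's headline range and
# rough estimates of the two user populations
annual_value <- c(1.87e9, 3.39e9)      # £ per year, England, both groups
population   <- c(1.1e6, 2e6) + 2e5    # 1.1-2 million anglers plus ~200,000 divers
range(outer(annual_value, population, "/"))
# roughly £850 to £2,600 per person, every year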

So, I knuckled down and read the 125-page report, to find that the authors had done some things really nicely. Their valuations are based on online questionnaires featuring a combination of neat choice experiments, willingness to pay (WTP) exercises, and a valuable attempt to characterise the non-monetary value of the sea-angling or diving experience (things like ‘sense of place’, ‘spiritual wellbeing’, etc.). But the headline numbers are highly dubious (worthless, in fact), because they did a few things very, very badly indeed. Unfortunately, they did a different bad thing for each of their two major monetary valuation methods, so the numbers emerging from each are equally dodgy, as a modicum of mental arithmetic, common sense, and ground-truthing will show.

First, the annual recreational value models are nicely done, using a choice experiment based on travel distances to hypothetical sites with different features to assess which of those features are most valuable. Mapping these features onto the rMPAs leads to a ranking of these sites in terms of how attractive they are to anglers and divers. One could quibble with details here - perhaps the major quibble would be that there is no ‘control’, i.e. no assessment of the value of sites which are not proposed for protection. But in general, I think this analysis gives a decent estimate of how the survey respondents value the different sites.

They then attempt to get an overall annual value for each site by multiplying its value to individuals by the number of visits it receives in a year. This is where the problem arises: attempting to generalise from these respondents to the entire population of anglers (estimated at 1.1-2 million) or divers (estimated at 200,000). I’m going to concentrate on the anglers because the issue is most extreme here: their models are based on 273 responses, a self-selected group of anglers acknowledged within the report to be especially committed (averaging 3-4 excursions a month) and interested in marine conservation, and representing between 0.01 and 0.02% of the total population, i.e. 1 or 2 responses per 10,000 anglers (they also used a self-selected sample of highly experienced divers, representing around 0.5% of all divers, i.e. 5 per 1,000). Extrapolating from this sample to the entire UK angling population produces some interesting results.

For example, using this methodology the Chesil Beach & Stennis Ledges rMPA on the Dorset coast has an estimated 1.4-2.7 million visits by sea anglers annually. That translates to 3,800-7,400 visits every single day of the year. Compare this to a (highly seasonally variable) average of around 3,000 visits per month to the Chesil Beach Visitors Centre. Or you could look at the Celtic Deep rMPA, a site located some 70km offshore, where they estimate between 145,000 and 263,000 angling visits per year. That’s 400-720 visits a day, which translates to approximately 40-70 typical sea angling boats, each full to the gunwales, every single day of the year. Of course, this is simply because the tiny sample is uncritically extrapolated. In the case of the Celtic Deep, it is straightforward to calculate that there were actually 36 observed visits, which (when divided by 273 and multiplied by 1.1 or 2 million) gives you the 145,000-263,000 estimated visits. By this logic, the minimum number of visits any site could receive is (1/273) × 1.1 million, or just over 4,000. Diving numbers are similarly unrealistic, with estimates of 123,000-205,000 visits a year (340-560 per day) by divers to Whitsand & Looe Bay, or 26,000-44,000 a year (70-120 per day) at Offshore Brighton, which is around 40km offshore.
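
The whole back-calculation fits in a few lines of R (my reconstruction of the logic, not the authors’ code):

sample_size <- 273               # angler respondents in the survey
population  <- c(1.1e6, 2e6)     # estimated UK sea angling population

# Celtic Deep: 36 visits observed among the respondents, scaled up
observed_visits <- 36
observed_visits / sample_size * population
# 145,055 and 263,736 - i.e. the report's 145,000-263,000

# and the floor: a single visit by a single respondent becomes...
1 / sample_size * population[1]  # ~4,029 estimated visits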

This kind of wild, uncritical extrapolation is staggering, akin to using the opinions of a focus group of LibDem party activists to predict a landslide in the next election. It’s a textbook example of the utility of a bit of simple guesstimation (e.g. a million visits a year means 10,000 visits/d for 100 days, or ballpark 2,700/d over the whole year), allied to some common sense (have you ever experienced those kinds of numbers when you’ve visited the UK coast?).

So, we can discount the big annual recreational value figures. What about the WTP exercise? WTP has its fans and its critics. My view is that it’s a useful way of ranking scenarios according to preference, but I don’t give a lot of credence to the £ figures generated, simply because by increasing the number of scenarios you can quickly get people to commit more cash than they intended. But regardless of that, the authors of this report appear to have made a very strange decision in aggregating the WTP estimates arising from their questionnaire. They worded the questions very carefully, presenting each respondent with a single site, outlining its features, and asking how much they would be willing to pay as a one-off fee for its protection - with respondents urged to think of this amount as a real sum of money, in the context of their household budget. These numbers are then used to give an average WTP for each of the rMPAs, which seems reasonable, and a useful way to rank the sites.

But they then simply multiply these site-level averages by the whole UK angling (or diving) population to get a total WTP for the whole set of rMPAs.

Think about what they’ve done there.

They’ve asked people how much they would be willing to pay to protect a single site, and have then assumed that the same person will pay a similar amount for every site in the network. So if you agreed that you’d be prepared to pay a one-off sum of £10 to protect a site, you could find yourself with a bill for over £1,000 to protect the whole network of 127 sites. (This is a slight oversimplification, as the values are site-specific, but it is essentially what they’ve done.) You simply cannot aggregate WTP like this. I mean, I’m not an economist, but if economists think you can do this, they are deluded.
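
To make the problem concrete, here is the aggregation in R, with an illustrative £10 response (my number, not the report’s):

n_sites <- 127        # proposed sites in the network
wtp     <- 10         # £, one respondent's one-off WTP for ONE site
anglers <- 1.1e6      # lower estimate of the UK angling population

wtp * n_sites                   # implicit commitment per angler: £1,270
wtp * n_sites * anglers / 1e6   # headline network total: ~£1,397 million

A £10 answer to a question about one site thus becomes a four-figure personal commitment, and a billion-pound headline, without the respondent ever being asked about the network as a whole.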

Again, a bit of common sense would have helped here. The authors compare this WTP to an insurance premium, which is a useful analogy. But how many anglers or divers are really, when it comes down to it, prepared (or even able) to shell out a £1,000+ insurance premium to prevent damage to the marine environment which may or may not occur in the future?

Anyway, that’s what’s been bugging me these last few days. I could go on (for instance, on a more philosophical level, is replacing strictly regulated commercial fishing with unregulated recreational angling necessarily a good thing for the marine environment? Will diving or - especially - angling actually be allowed in these rMPAs?). And there are some useful things in the report. It confirms that people do value the marine environment, really quite highly, and that different features are valued differently by different groups - a useful starting point for some more focused research, and helpful in placing relative values on different rMPAs. But unfortunately - inevitably - media attention has focused on the ludicrous headline numbers, something the authors have actively encouraged in their framing of the report.

A final positive point to end on: my bullshit klaxon seems to be in fine working order.

The big blue bit in the middle: still big, still blue

Last week, I had the dubious pleasure of revisiting some work I did over three years ago. Back then, as the Census of Marine Life was in its final stages, I got together with Edward Vanden Berghe, then managing the Ocean Biogeographic Information System (OBIS), to investigate the suspicion of CoML senior scientist Ron O’Dor that surveys of marine biodiversity largely overlooked ‘the big blue bit in the middle’ – the deep pelagic ocean, by far the largest habitat on Earth. The idea that Edward and I hit on was to use OBIS to produce a plot that would show whether Ron was right. OBIS contained at that time around 20 million records, each representing the occurrence of a specific species in a particular location. Only around 7 million of these also included the depth at which the species had been observed, but by comparing the depths of these 7 million samples with a global map of ocean depth, we were able to place each of them at a position in the water column. As you can see (and as we showed more rigorously in the resulting paper), Ron was right: in all regions of the ocean, biodiversity records from midwater are far less common than those from the sea bed, or those from surface waters.

Fig 1. Global distribution within the water column of recorded marine biodiversity, using approximately 7 million occurrence records extracted from OBIS in 2009. The horizontal axis splits the oceans into five zones on the basis of depth, with the width of each zone on this axis proportional to its global surface area. The vertical axis is ocean depth, on a linear scale. The inset shows in greater detail the continental shelf and slope, where the majority of records are found. Note this is slightly different from the version previously published, as it is scaled to the 2013 range of data.
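
The placement step itself is conceptually simple. Here is a toy version in R - note that the three zones and the 200 m / 50 m thresholds are illustrative assumptions for this sketch, not the scheme used in the paper:

# Place a record in the water column, given its sample depth and the
# sea-floor depth at its location (both in metres)
classify_record <- function(sample_depth, floor_depth,
                            surface = 200, bottom_buffer = 50) {
  if (sample_depth <= surface) {
    "surface"      # the sunlit upper layers
  } else if (sample_depth >= floor_depth - bottom_buffer) {
    "near-bottom"  # records at or close to the sea bed
  } else {
    "midwater"     # the big blue bit in the middle
  }
}

classify_record(sample_depth = 1500, floor_depth = 4000)  # "midwater"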

We discussed the implications of this chronic under-sampling of the world’s biggest ecosystem in our paper, but when talking about this work I prefer to quote from another paper, by Bruce Robison:

The largest living space on Earth lies between the ocean’s sunlit upper layers and the dark floor of the deep sea… Within this vast midwater habitat are the planet’s largest animal communities, composed of creatures adapted to a… world without solid boundaries. These animals probably outnumber all others on Earth, but they are so little known that their biodiversity has yet to be even estimated.

Since our paper came out, I have continued to use OBIS data in my research attempting to describe and explain the distribution of diversity in our oceans. At the same time, OBIS has changed too, both structurally - it’s moved from Rutgers in NJ to the IOC Project office for IODE in Ostend - and in terms of its content, now housing over 35 million records, including almost 19 million which recorded sample depth.

So back to last week, and an email from the current manager of OBIS, Ward Appeltans, asking if I might be able to update the figure from our 2010 paper with new OBIS data.

With some trepidation, I opened up the file of R code I’d used for the original analysis. And got a pleasant surprise: it was readable! Largely this was because I had submitted it as an appendix to our paper, and so had taken more care than usual to annotate it carefully. I think this demonstrates an under-appreciated virtue of sharing data and code: in preparing them so that they are comprehensible to others, you make them much more useful to your future self. This point is nicely made in a new paper by Ethan White and colleagues on making it easier to use and reuse data.

So, rather than days of fiddling, I was able to get the code up and running with the new data really quite quickly. Of course, there were a few minor bugs to sort out. One thing I always do with R code now, but didn’t at the time, is to insert the command rm(list = ls()) at the top, to clear my workspace (illustrated below). The fact that my old code didn’t work immediately was, I think, down to my failure to do this: the code required an object that was evidently hanging around in my workspace at the time. But it was simply a matter of correcting a name inside a function, and it all worked fine. (Actually, one thing still doesn’t work well, which is getting the figure from R into a satisfactory, scalable vector format that looks nice in other packages - the PDF looks OK (but not great) in Preview but awful viewed in Acrobat, for example - but that’s another story…)
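
For the uninitiated, that habit amounts to a couple of lines of boilerplate at the head of every script:

# Start with a clean slate, so the script cannot silently depend on
# objects left over in the workspace from a previous session
rm(list = ls())

# ...then make every input explicit (file name illustrative only):
records <- read.csv("obis_depth_records.csv")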

What happens, then, to our view of the depth distribution of marine biodiversity knowledge when we increase the number of observations from 7 million to 19 million?

Fig 2. Figure 1 updated to use the c. 19 million suitable occurrence records available in OBIS in 2013.

Actually, rather little: the overall pattern is pretty much the same, with far more records from shallow than deep seas, and a paucity of midwater records at all depths. The big blue bit in the middle remains both big and blue.

Postscript: Ward at OBIS emailed me to suggest that this post comes across a bit on the negative side, which was certainly not my intention. Even back in 2009 OBIS was a phenomenal resource for marine biodiversity research; the fact that in under 4 years, the number of useful records for my analysis has increased >2.5x is amazing. My view is still that the big blue bit in the middle remains both big and blue, but it's very heartening to see the fingers of yellow extending further and further into the colossal deep pelagic ocean. It would be nice to think that our data visualisation exercise has had something to do with this!

Zombie stats and hair-trigger outrage: reflections of a Twitter addict

It seems somehow odd to come over all reflective about Twitter, that most impulsive of online communication channels. But over the year or so that I’ve been using it as a kind of super-effective personalised newsfeed, several cautionary tales have played out in my Twitter feed, which I have here distilled into two key lessons. First: distrust numbers, even – especially – those whose implications sit well with your worldview. And second, rein in your outrage: issues are almost certainly more complicated than 140 characters allow. Twitter, by its very nature, gives you soundbites. If you’re lucky, you’ll get a link to something more substantial, but it is so very easy to retweet something that appeals to your sense of how the world works without scrutinising the numbers. My favourite example (not least because it got me onto BBC R4’s More or Less!) is the ludicrous ‘100 Cod in the North Sea’ story that I blogged about last year. Now it takes just a moment’s thought, if that, to realise that this number is very, very wrong (by a factor of at least several hundred thousand, in fact). But it played so nicely into the ‘overfishing is devastating our seas’ narrative that many people declined to give it that moment, and unthinkingly retweeted.

This is just one example, but my Twitter feed is full of them. I’m interested in the natural environment, and the impacts that we are having upon it, so I follow a range of environmental groups who tend, for instance, to jump immediately on numbers making renewable energy look especially attractive. Now, climate change terrifies me, and I am fully behind the idea that we need to decarbonise the economy as a matter of some urgency. But I also agree wholeheartedly with David MacKay who, discussing the favourable carbon footprint of nuclear power in his (essential, free) Without the Hot Air, states “I’m not trying to be pro-nuclear. I’m just pro-arithmetic”. Fortunately, the vigilance of people such as Robert Wilson (@CarbonCounter_) provides a corrective to some of these numbers (see for example his dissection of an awfully inaccurate Guardian report on the costs to consumers of gas vs. renewables). But such arguments rarely translate well to Twitter soundbites, and so the zombie stats – numbers that we know are wrong, but which are appealing – refuse to die.

What’s the harm in all this? In my post on the cod story I mentioned the fragile trust that now exists between the fishing, scientific and conservation communities, which has led to promising progress in the recovery of North Sea cod stocks, but which can easily be shattered by the promulgation of laughable statistics. More generally, dubious numbers muddy whatever water they fall into. In his excellent critique of mainstream economics The Skeptical Economist, which I reviewed previously, Jonathan Aldred warns that “dubious numbers are infectious: adding a dubious number to a reliable one yields an equally dubious number”. Which leads me to propose Mola mola’s second law¹:

An argument advances with the rigour of its most dubious number

So it doesn’t matter how watertight the ethical case for regulating cod fisheries, or for moving away from fossil fuels, may be; if you use farcical numbers to advance that case, the argument will fail to progress.

Another consequence of these kinds of zombie stats is the hair-trigger outrage for which Twitter is (in)famous. This applies just as much to the niche worlds of the practice and administration of science as it does to Westminster or celebrity gossip. In particular, it is rare for a day to pass without some call appearing in my timeline to sign a superficially worthy-looking petition. I am extremely wary of doing so, for a couple of reasons.

For instance, a while back I signed something against some kind of reforms (I didn’t read the details) in a European marine institute which I have visited a few times and where I have friends and colleagues. Surely I should support them in their hour of need? Hmmm. Well. Next time I went, my friend – an extremely conscientious and committed member of the institute – said words to the effect of ‘Please, nobody sign that petition. These reforms are exactly what we need and the people fighting against them do nothing here.’ Duly chastised, I resolved to be more discerning in future.

Then there was the curious incident of Nerc’s planned merger of the British Antarctic Survey with the National Oceanography Centre. Now, it is my view that Nerc handled this pretty poorly, but although there were some pretty convincing arguments – both scientific and geopolitical – against this merger, there were good arguments for it too, and chatting to people I know at both NOC and BAS (as well as reading things like this excellent coverage from Mark Brandon) just confirmed to me that this was very far from the black-and-white issue painted by many environmental journalists and pressure groups who backed a petition against it. And the joy which met the announcement that the merger (which is what was proposed, although it was typically presented as the ‘dismantling’ or ‘abolition’ of BAS) had been abandoned took no account of the budgetary constraints at Nerc which had prompted the proposal, which still exist, and which will now require that savings be made in some other area of environmental science.

To reiterate: I don’t know whether the correct decision was made. But that’s precisely the point. I know the issues and institutions involved pretty well, yet didn’t feel sufficiently well informed to decide one way or the other. In such a case, it would be hugely irresponsible of me to sign any petition. Yet many thousands of people did. Call me an old cynic, but I doubt all that many of them had read widely on the rationale for the merger. Sadly, it is now far easier to create a petition – let alone sign one – than it is to inform yourself about an issue. Of course, the UK government has made a rod for its own back here with its e-petitions initiative and its commitment to debate in parliament any issue that gains 100,000 signatures. But it is our responsibility as thoughtful citizens to take the issue of signing a petition seriously. Which usually means basing our opinions on more than 140 characters of research.

Bearing these caveats in mind, and keeping critical faculties engaged at (almost) all times, Twitter remains for me an essential source of information, conversation and debate, and an invaluable means to publicise work and opportunities, and I encourage non-tweeters to have a go - good guides for sciencey-minded beginners here, here, here and here.

¹ First law here