Confessions of an Open Access Agnostic

The office that I worked in a few years ago had a window that opened onto the main University of Sheffield concourse. Every so often, lunchtimes would be enlivened by a student protest (typically over fees), during which someone with a megaphone would shout a lot. I remember clearly being struck with the thought, “I wonder if anyone has ever changed their mind about anything as a result of something they heard through a megaphone?” It certainly doesn’t work on me. Even if I broadly agree with the shouty person, the louder they shout the more inclined I am to pick holes in their argument. It is a character trait of mine, I’m not sure if you would call it a flaw or a virtue, that I hate being told what to do and, especially, what to think.

All of this explains, perhaps, my ambivalence towards Open Access (OA) publishing. I don’t like being told where I can and can’t publish. I distrust zealots, including well-resourced single-issue campaign groups which will hear no alternative views, which present shades of grey as simple black/white dichotomies, and which (a pet hate) bandy around variants on that tabloid favourite ‘tax-payers’ money’ (when they mean ‘public money’). I worry about people being pressurised into publishing in inappropriate journals, or – if they decide to stick with a non-OA journal, for whatever reason – not receiving the quality of review they deserve because of misguided boycotts. I don’t appreciate non-scientists in the media wading in with their ‘aren’t you silly, you’ve been doing this all wrong for decades’ line. And I’m wary of the creeping sense – by no means restricted to science – that content should always be free, regardless of the costs involved in producing it. I’m not comfortable with the big publishers making huge profits from the outputs of science, but I also recognise that good publishers (and their employees) have done, and continue to do, a terrific job to ensure the effective communication of science.

Of course, there is a more nuanced debate going on underneath the bluster. From what I see on Twitter, today’s debate at Imperial seems to be a good example (#OAdebate). Some very clever and thoughtful people have weighed things up and come down on the side of OA. And I’m not even sure that I don’t agree with them. Certainly, I am all in favour of the broader Open Science agenda – opening up the data we produce, and the tools we use to access and analyse it. But I remain to be convinced that access to primary research papers is such a big issue that it should be pushed above all else (partly because, with a bit of effort or an email or two, it’s usually possible to access most recent papers), and that all of this energy should be focused on it (whilst overlooking the interesting and potentially profound financial and sociological implications for scientists and their institutions).

My beef is not at all with OA, but rather with the way that the debate has been framed in terms of good and evil, right and wrong (not a million miles from the ongoing GM debate). Subscription-based (reader pays) publication of publicly-funded research costs public money, and has pros and cons. OA (author pays) publication of publicly-funded research costs public money, and has pros and cons. A shift to OA will not (I’m pretty certain) be accompanied by an injection of new cash, but will rather see a shift from funding infrastructure (especially libraries) to funding individuals (e.g. through research grants). And the debate should be on how best we spend limited public money to communicate the outputs of research in the most effective way. It could be that making all primary research available to everyone is the way to do this (although I don’t think accessing papers is quite so difficult as some would have us believe; and in any case, the readership for the vast majority of papers is tiny). It could be that we’d be better advised to concentrate on more effective communication of key results in other formats, or in making other products of our research (especially data) more widely available. Even if we hold OA as something to aspire to, I feel that blindly pushing it as a top priority risks sidelining more important debates about opening up science.

So thanks, but I won’t be signing any petitions just now.

The National Biodiversity Network and Biodiversity Research

Yesterday Nature reported on the launch of the Map of Life project, a new initiative to collate biodiversity records, which allows users to map these, to extract species lists for any area of the planet, and (ultimately) to upload their own data. Limited initially to terrestrial vertebrates and North American freshwater fish, the demo website still looks like a lot of fun. But it also reminded me somewhat of a UK-based project which has been running for a number of years, the National Biodiversity Network (and specifically the data service at the NBN Gateway). This gives me a good excuse to comment on the NBN, which I’ve been meaning to do for a while. Specifically, why hasn’t the NBN Gateway been used more by the research community? Let me first declare some interests. This question was raised around 18 months ago by the British Ecological Society, which convened a scientific working group chaired by Tim Blackburn, of which I was part. And the NBN Trust (@NBNTrust, if you like to Tweet) is actively trying to promote its potential as a research resource, and I’m writing this post partly in response to a request from Mandy Henshall, NBN Trust Information and Communications officer, to spread the word and to find out what it would take to get the data used.

The NBN grew out of the need to standardise and coordinate the many thousands of local, regional or national surveys to provide a national picture of the UK’s biodiversity. The NBN Gateway is simply the portal through which these data can be accessed. And it’s become an extremely impressive dataset: currently >75M records from >700 individual datasets. The Gateway itself is really nicely designed for the general user. You can search on an interactive map, or by site name, or by taxon, and quickly get a list of everything that’s been recorded – fantastic if you’re planning a trip to an RSPB reserve, say, and want to know what birds you’re likely to see; equally good if you’re leading a field trip and want to prime your students about what might be there. (Worth noting too that the NBN encompasses all taxa and habitats, including some limited coverage of marine systems.) As a citizen science / public engagement project, the NBN is absolutely superb, and I urge you to go and have a play.

But does it work as a tool for academic biodiversity research? Some things it does well, for instance the (nontrivial) task of standardising taxonomy across multiple datasets. But we identified several potential shortcomings, most obviously the fact that not all data are publicly available – it can be incredibly frustrating to see a great dataset identified by your search, but not to be able to access it. Of course, the problem of data access is not restricted to the NBN, and they clearly had to make a choice – include everything with restricted access, or include only a subset of available data which can be provided completely open. Other initiatives, for instance the Ocean Biogeographic Information System (OBIS), went this second route, the idea being that if sufficient people can be convinced to make their data available, peer pressure will mount on those who won’t. But this discussion of open data is best left for another day.

Other barriers we identified concerned the different ways that scientists like to access and download data, compared to the public. For instance, we often want to be able to access data programmatically, or at least to have an audit trail of specific queries, rather than working through nice friendly GUIs. And often we want to download data as a simple text file for further analysis, with no whistles and bells.
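To make the "audit trail" point concrete, here is a minimal sketch of what a scripted query might look like. It runs against a small in-memory table rather than the NBN Gateway itself; the records, the column names and the helper functions `query_records` and `to_text` are all hypothetical, invented purely for illustration.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical records: the column names are assumptions for illustration,
# not the schema of the NBN Gateway or any other real service.
RECORDS = [
    {"taxon": "limpet", "site": "Filey Brigg", "year": 2010},
    {"taxon": "limpet", "site": "Filey Brigg", "year": 2011},
    {"taxon": "barnacle", "site": "Site B", "year": 2011},
]


def query_records(records, audit_log, **criteria):
    """Return the records matching every criterion, and log the query."""
    matches = [
        r for r in records
        if all(r.get(key) == value for key, value in criteria.items())
    ]
    # The audit trail: what was asked, when, and how much came back.
    audit_log.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "criteria": dict(criteria),
        "n_records": len(matches),
    })
    return matches


def to_text(records, fieldnames):
    """Serialise records as plain tab-separated text, header row first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


audit_log = []
hits = query_records(RECORDS, audit_log, site="Filey Brigg")
print(to_text(hits, ["taxon", "site", "year"]))
print(audit_log)
```

The point of the design is that the query lives in a script and the log entry is written as a side effect, so the same extraction can be rerun and reported later, which is hard to do after clicking through a GUI.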

Finally, there is the matter of the data itself (and pedants: yes, data is a singular noun). The NBN contains some fantastic systematic scientific survey data, but also a lot of more haphazard observational data, which may be reliable in terms of recording the presence of a given species at a particular site, but which tells us little about absences. Suppose Mr Patel has a fascination with limpets, and has been counting them on Filey Brigg every week for years. His data would give us a fantastic picture of the limpet population, but the absence of records for barnacles or periwinkles doesn’t mean that they’re not there – crucial if you’re interested in the whole community.
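The presence-versus-absence problem can be sketched in a few lines. The example below is a toy illustration only: the observation log, the `surveyed_taxa` lookup and the `status` function are all hypothetical, and the rule it encodes (only call a species absent where a survey actually looked for it) is a simplifying assumption rather than an established method.

```python
# Hypothetical observation log: (recorder, taxon, site).
# Mr Patel only ever records limpets.
observations = [
    ("Mr Patel", "limpet", "Filey Brigg"),
    ("Mr Patel", "limpet", "Filey Brigg"),
    ("Mr Patel", "limpet", "Filey Brigg"),
]


def status(taxon, site, observations, surveyed_taxa):
    """Classify a taxon at a site as 'present', 'absent' or 'unknown'.

    surveyed_taxa maps a site to the set of taxa that some systematic
    survey actually looked for there (an assumed input for this sketch).
    """
    recorded = {t for (_, t, s) in observations if s == site}
    if taxon in recorded:
        return "present"
    if taxon in surveyed_taxa.get(site, set()):
        # Somebody looked for it and did not find it.
        return "absent"
    # Nobody looked, so the lack of records tells us nothing.
    return "unknown"


# With only Mr Patel's limpet counts, barnacles are unknown, not absent.
print(status("limpet", "Filey Brigg", observations, {}))
print(status("barnacle", "Filey Brigg", observations, {}))

# Only if a survey targeted barnacles can a missing record mean absence.
print(status("barnacle", "Filey Brigg", observations,
             {"Filey Brigg": {"limpet", "barnacle"}}))
```

Treating "unknown" as "absent" is exactly the mistake a whole-community analysis would make if it took haphazard observational records at face value.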

Such limitations suggest that the researcher proceed with caution through the NBN Gateway; but the advantages of such a huge dataset mean that simply to ignore it may be to miss out on a terrific resource. There are already various examples of NBN being used by students for research projects. The question is, what would it take for wider uptake by the research community?

Pure vs. Applied Research: Two outdated conceptions of science

Earlier this morning, the British Ecological Society tweeted:

Now, I knew they were doing this – my department is represented at the meeting, and I was involved in our preliminary discussions regarding what we thought were important questions. But I was always uneasy about the exercise, partly because important questions don’t always come in nice even numbers; but mainly because of the single word ‘pure’. Especially as this is a centenary exercise, and if we’ve learnt one thing in 100 years of ecology, surely it’s that you can’t separate ecology from people. Or, as another ecological tweeter had put it a couple of days before:

Turns out @lusseau is in good company. Back in 1965, Sir Peter Medawar delivered his Henry Tizard Memorial Lecture, Two Conceptions of Science. Although the conceptions he set out were the romantic or poetic, and the rational or analytical – “the one speaking for imaginative insight and the other for the evidence of the senses” [1] – and his lecture delves deeper into the philosophy of this division than I intend to go here (subject, perhaps, for a future post) – he certainly has plenty to say on the basic versus applied division, with ‘romantic’ science “finding in scientific research its own reward” whereas ‘rational’ science calls “for a valuation in the currency of practical use.”

Medawar’s particular beef is with the class distinction which he saw as having grown up around the difference between ‘pure’ and ‘applied’ science. He characterises (or rather, consciously caricatures) his two sciences in terms of their practical use, as follows (evidently Medawar’s conceptions of science didn’t extend to female scientists, but perhaps we’ll forgive him, this was nearly 50 years ago!):

[Romantic] science can flourish only in an atmosphere of complete freedom, protected from the nagging importunities of need and use, because the scientist must travel where his imagination leads him. Even if a man should spend five years getting nowhere, that might represent an honourable and perhaps even a noble endeavour. The patrons of science – today the Research Councils and the great Foundations – should support men, not projects, and individual men rather than teams, for the history of science is for the most part a history of men of genius.

The alternative conception runs something like this… Scientific research is intended to enlarge human understanding, and its usefulness is the only objective measure of the degree to which it does so; as to freedom in science, two world wars have shown us how very well science can flourish under the pressures of necessity. Patrons of science who really know their business will support projects, not people, and most of these projects will be carried out by teams rather than by individuals, because modern science calls for a consortium of the talents and the day of the individual is almost done. If any scientist should spend five years getting nowhere, his ambitions should be turned in some other direction without delay.

Medawar was well aware that these two visions of science both have elements of truth, as science involves both “having an idea and testing it or trying it out”:

a scientist must indeed be freely imaginative and yet sceptical, creative and yet a critic. There is a sense in which he must be free, but another in which his thought must be very precisely regimented; there is poetry in science, but also a lot of bookkeeping.

Tempting though it is to continue quoting Medawar at length, we’re in danger of losing the thread here: let’s return to the division between ‘pure’ and ‘applied’ science. Medawar points out that in the early days of the Royal Society, “the idea of science pursued for its own sake was regarded as frivolous or even comic”, when “the opposite of useful was not pure but – idle.” And yet somehow, during the Romantic period, this perception got turned on its head: the notion of ‘purity’ arose (despite, as Medawar notes, no scientist ever expressing admiration of a piece of work as ‘how pure!’); applied science became rather vulgar, somehow morally inferior to pure research, “and with it the dire equation Useless = Good.”

It seems to me that we’ve come full circle, and now ‘pure’, ‘basic’ or ‘blue skies’ research is considered to be under attack from the dreaded impact agenda (although having claimed that science is vital to the economy, of course we are going to be asked constantly to affirm this!). But why should we be worried about the future of ‘pure’ research? “Invest in applied science for quick returns (the spiritual message runs), but in pure science for capital appreciation.”

In other words, the idea of some future, unspecified ‘use’ is often implicit in any defence of ‘pure’ research. For example, we make a virtue (says Medawar) of encouraging pure research in medical institutes – and I could say the same about the environmental sciences – in the hope that this basic research will someday bear fruit (i.e. result in a cure, or feed the world). “But there is nothing virtuous about it! We encourage pure research in these situations because we know no other way to go about it. If we knew a direct pathway leading to the clinical problem of rheumatoid arthritis, can anyone seriously believe that we should not take it?”

Medawar’s suggestion to break down this artificial division between basic and applied research is to reverse the usual process of university research, where:

…it is believed and hoped that something practically useful may be come upon in the course of free-ranging enquiry, whereupon research which has hitherto shed diffuse light will now come sharply into focus. This procedure… works sometimes, and it may be the best we can do, but… [m]ight not the converse approach be equally effective, given equal opportunity and equal talent? – to start with a concrete problem, but then to allow the research to open out in the direction of greater generality, so that the more particular and special discoveries can be made to rank as theorems derived from statements of higher explanatory value… Research done in this style is always in focus, and those who carry it out, if temporarily baffled, can always retreat from the general into the particular.

This, for me, is where the next century of ecological research lies: focus on the problem, use whatever means are necessary (from fundamental understanding to engineering solutions) to tackle it. Listing 100 questions in ‘pure’ ecology does not seem a sensible first step. But I’ll leave the last word to Medawar:

When I speak of our endeavours to make the world a better place to live in I neither say nor imply that this melioration can be achieved by purely material means, though I am quite sure it can’t be achieved without them… I believe, as many others do, that material progress is necessary for our improvement, but I do not know, and have never heard of, anybody who said that it could be sufficient… [My] general tone… may give the impression that I am an ‘optimist’, but indeed I am no such thing, though I admit to a sanguine temperament. I prefer to describe myself as a ‘meliorist’ – one who believes that the world can be improved by finding out what is wrong with it and then taking steps to put it right.

[1] All quotes are from the text of Medawar’s 1965 Henry Tizard Memorial Lecture, and from the Introduction to his collection of writings Pluto’s Republic.