On endlings and singletons

There can be few words as poignant as ‘endling’, the name given to the last surviving individual of a species. Tell me you don’t find this image of Benjamin, the last Thylacine, heartbreaking? Or that you weren’t moved by the plight of Lonesome George? And what about Martha, the passenger pigeon? Doesn't her story make you weep at our limitless environmental profligacy? But what links all of these endlings is that we know they were once members of a thriving population - in Martha’s case, one of the most abundant vertebrate species on the planet. There are other species which are known only from a single individual. These species, perhaps lacking the poignancy of endlings but arguably more significant to students of biodiversity, are known as 'singletons'.

Singletons of course are to be expected when your survey area is small, or if you look for only a short period of time. If I counted the birds in my garden for an hour tomorrow morning, I’d expect to see multiple individuals of a few species - magpies, wood pigeons, house sparrows - but I’d be very surprised to see more than one sparrowhawk or wren. However, if I extended my search to my whole street, or to the whole of Sheffield, the number of singletons would drop off precipitously. And at the scale of the UK, over the course of an entire summer, any breeding bird species by definition must be represented by at least two individuals, so there should be no singletons at all in our core avifauna. Any that remain can be dismissed as shivering vagrants blown across the Atlantic, ornithological curiosities but ecologically insignificant.

Enter the sea, though, and the number of singletons remains stubbornly high, even when we enormously expand our study region. For instance, in an analysis of European benthic invertebrates I did a few years ago, I found that about 10% of the species in our very large database (2,300 species sampled from >15,000 locations throughout European seas) were singletons. Similar patterns appear when interrogating the Ocean Biogeographic Information System database I blogged about recently. For instance, OBIS contains records for >11,500 species occurring in the seas around Britain, yet over 45% of these are represented by a single record. At the global scale, 20% of the almost 80,000 marine animal species which occur in OBIS are singletons.
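For readers who want to see what 'counting singletons' amounts to in practice, here is a minimal sketch in Python. It is not the code behind the figures quoted above: it assumes only that the data can be reduced to one species name per occurrence record, and the function name `singleton_fraction` and the toy species names are invented for illustration.

```python
from collections import Counter


def singleton_fraction(species_per_record):
    """Summarise how many species appear in exactly one record.

    `species_per_record` is an iterable holding one (accepted) species
    name per occurrence record. Returns a tuple of
    (number of species, number of singletons, fraction of singletons).
    """
    counts = Counter(species_per_record)
    n_species = len(counts)
    n_singletons = sum(1 for n in counts.values() if n == 1)
    fraction = n_singletons / n_species if n_species else 0.0
    return n_species, n_singletons, fraction


# Hypothetical toy data: seven records covering four species.
toy_records = [
    "Species A", "Species A",
    "Species B",
    "Species C", "Species C", "Species C",
    "Species D",
]

n_species, n_singletons, fraction = singleton_fraction(toy_records)
print(f"{n_singletons} of {n_species} species are singletons ({fraction:.0%})")
```

Note that this counts species with a single record, which is how singletons are identified in a database like OBIS; a single record is not necessarily a single individual.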

What's happening here? Are these singletons simply very rare species? We expect most species to be rare, but do our surveys of marine habitats really cover so small an area that we never pick up their conspecifics? And if this is so, what does this mean for marine ecosystem functioning? Do these rare species play a role? Individually, maybe not - Kevin Gaston has written on the dominance of common species in terms of numbers, biomass, and probably ecosystem functioning, in most communities. But collectively the singletons in a sample can be abundant, and if there were particular biological characteristics associated with being a singleton, then this could be significant. Unfortunately, about the one generalisation we can make about rarely observed marine species is that we know little about their biology, so we’re not yet in a position to answer this question.

An alternative explanation is that many singletons are mistakes. When analysing diversity surveys, we can take steps to ensure that taxonomic names are consistent, for instance by using the World Register of Marine Species to ensure that we use the accepted name for each species and not one of its (often many) synonyms (I've done that in the cases mentioned above). But what if the person who sorted the sample simply got their identification wrong? There’s not much we can do about that kind of mistake, although one would hope that errors of identification are not so frequent as to explain the very high prevalence of singletons.

Probably we won’t know the answers to such questions until sampling of a few large marine ecosystems reaches a sufficient intensity that we can have confidence that surveys accurately reflect the composition and relative abundance of the species present. For now, we can at least use the presence of singletons to tell us something about how far away from such complete knowledge we are. As I suggested in my last post, in certain marine systems the answer to this is: a very long way indeed. In the meantime, we are becoming more and more aware of the threats facing many marine species. We must hope that the singletons we find in our surveys are only statistical loners, the first observed rather than the last remaining individual. If they do in fact represent the Benjamins, Marthas and Lonesome Georges of their kind, then marine biodiversity is in more trouble than we thought.

The big blue bit in the middle: still big, still blue

Last week, I had the dubious pleasure of revisiting some work I did over three years ago. Back then, as the Census of Marine Life was in its final stages, I got together with Edward Vanden Berghe, then managing the Ocean Biogeographic Information System (OBIS), to investigate the suspicion of CoML senior scientist Ron O’Dor that surveys of marine biodiversity largely overlooked ‘the big blue bit in the middle’ – the deep pelagic ocean, by far the largest habitat on Earth. The idea that Edward and I hit on was to use OBIS to produce a plot that would show if Ron was right. OBIS contained at that time around 20 million records, with each record representing the occurrence of a specific species in a particular location. Only around 7 million of these also recorded the depth at which the species had been recorded, but by comparing the depths of these 7 million samples with a global map of ocean depth, we were able to place each of them at a position in the water column. As you can see (and as we showed more rigorously in the resulting paper), Ron was right: in all regions of the ocean, biodiversity records from midwater are far less common than those from the sea bed, or those from surface waters.

Fig 1. Global distribution within the water column of recorded marine biodiversity, using approximately 7 million occurrence records extracted from OBIS in 2009. The horizontal axis splits the oceans into five zones on the basis of depth, with the width of each zone on this axis proportional to its global surface area. The vertical axis is ocean depth, on a linear scale. The inset shows in greater detail the continental shelf and slope, where the majority of records are found. Note this is slightly different from the version previously published, as it is scaled to the 2013 range of data.
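The core step, comparing each record's sample depth with the ocean depth at the same location, can be sketched roughly as follows. This is an illustrative Python sketch and not the R code from the paper: the function name, the three category labels and the 200 m margin are all assumptions made for the example, and the bottom depths are supplied by hand rather than looked up from a real bathymetry map.

```python
from collections import Counter


def water_column_position(sample_depth_m, bottom_depth_m, margin_m=200.0):
    """Classify one record by where it sits in the water column.

    sample_depth_m: depth at which the organism was recorded (metres).
    bottom_depth_m: ocean depth at that location, e.g. from a bathymetry map.
    margin_m: assumed thickness of the 'near surface' and 'near seabed'
        layers; 200 m is an arbitrary illustrative value.
    """
    if bottom_depth_m <= 0:
        raise ValueError("bottom depth must be positive")
    # Clamp the sample depth into the water column, since a recorded
    # depth can exceed the mapped seabed depth when the two sources disagree.
    depth = min(max(sample_depth_m, 0.0), bottom_depth_m)
    if bottom_depth_m - depth <= margin_m:
        return "near seabed"
    if depth <= margin_m:
        return "near surface"
    return "midwater"


# Hypothetical records as (sample depth, bottom depth) pairs in metres.
toy_records = [(10, 4000), (2000, 4000), (3950, 4000), (50, 100)]
tally = Counter(water_column_position(s, b) for s, b in toy_records)
print(dict(tally))
```

In shallow water, where the whole water column is thinner than the margin, every record counts as 'near seabed' under this scheme; a real analysis would need to decide how to treat such cases.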

We discussed the implications of this chronic under-sampling of the world’s biggest ecosystem in our paper, but when talking about this work I prefer to quote from another paper, by Bruce Robison:

The largest living space on Earth lies between the ocean’s sunlit upper layers and the dark floor of the deep sea… Within this vast midwater habitat are the planet’s largest animal communities, composed of creatures adapted to a… world without solid boundaries. These animals probably outnumber all others on Earth, but they are so little known that their biodiversity has yet to be even estimated

Since our paper came out, I have continued to use OBIS data in my research attempting to describe and explain the distribution of diversity in our oceans. At the same time, OBIS has changed too, both structurally - it’s moved from Rutgers in NJ to the IOC Project office for IODE in Ostend - and in terms of its content, now housing over 35 million records, including almost 19 million which recorded sample depth.

So back to last week, and an email from the current manager of OBIS, Ward Appeltans, asking if I might be able to update the figure from our 2010 paper with new OBIS data.

With some trepidation, I opened up the file of R code I’d used for the original analysis. And got a pleasant surprise: it was readable! Largely this was because I submitted it as an appendix to our paper, and so had taken more care than usual to annotate it carefully. I think this demonstrates an under-appreciated virtue of sharing data and code: in preparing it such that it is comprehensible to others, it becomes much more useful to your future self. This point is nicely made in a new paper by Ethan White and colleagues on making data easier to use and reuse.

So, rather than days of fiddling, I was able to get the code up and running with new data really quite quickly. Of course, there were a few minor bugs to sort out - one thing I always do with R code now, but didn’t at the time, is to insert the command rm(list = ls()) at the top, to clear my workspace. The fact my old code didn’t work immediately was, I think, down to my failure to do this - the code required an object that was clearly hanging around in my workspace at the time. But it was simply a matter of correcting a name inside a function and it all worked fine. (Actually, one thing still doesn’t work well, which is getting the figure from R into a satisfactory, scaleable vector format which looks nice in other packages - the PDF looks OK (but not great) in Preview but awful viewed in Acrobat, for example - but that’s another story…)

What happens, then, to our view of the depth distribution of marine biodiversity knowledge when we increase the number of observations from 7 million to 19 million?

Fig 2. Figure 1 updated to use the c. 19 million suitable occurrence records available in OBIS in 2013.

Actually, rather little: the overall pattern is pretty much the same, with far more records from shallow than deep seas, and a paucity of midwater records at all depths. The big blue bit in the middle remains both big and blue.

Postscript: Ward at OBIS emailed me to suggest that this post comes across a bit on the negative side, which was certainly not my intention. Even back in 2009 OBIS was a phenomenal resource for marine biodiversity research; the fact that in under 4 years, the number of useful records for my analysis has increased >2.5x is amazing. My view is still that the big blue bit in the middle remains both big and blue, but it's very heartening to see the fingers of yellow extending further and further into the colossal deep pelagic ocean. It would be nice to think that our data visualisation exercise has had something to do with this!

Precision Phrased

Following on from my last post, which contained some pretty precise phrases (albeit not ones of my own), I’ve been reading with interest some good recent posts on the need for precision in scientific writing. On this network Matt Shipman (@ShipLives) wrote about the importance of defining technical terms, using examples like ‘extinction’ which is often used more loosely than its specific and precise meaning allows. Elsewhere, Lewis Spurgin (@LewisSpurgin) channels Orwell in an entertaining (and, I may add, spot-on) tirade against poor writing in science. Simon Leather (@EntoProf) has a more specific complaint regarding lazy referencing and ignoring precedence when citing the work of others. Although these three posts take aim at different targets, they are united by their abhorrence of sloppiness. Writing properly matters, whether that be choosing the right word, avoiding lazy phrasing, or applying rigorous standards of scholarship. On two occasions recently I’ve had cause to reconsider the precision of my own writing. First, I received an email about one of my papers (the one I discussed here). It was a nice email, but it contained this:

There was one sentence that I wanted to ask you about: ‘More generally, the environmental conditions (as measured by the typical spatial and temporal scales of environmental variation) that are likely to select for or against the evolution of specialisation are not restricted to either marine or terrestrial systems…’ I just want to make sure that I understand this correctly – are you suggesting that both marine and terrestrial systems have relatively variable environments and also relatively stable environments?

‘No, no, no!’ was my initial thought, ‘you’re not meant to actually read what I write, and certainly not to take it seriously!’ This partly was just an expression of my underlying anxiety about being rumbled and hounded out of the academy; but it also instigated a minor panic about my writing methods. Writing is something that comes quite easily to me - I’m not making any claims as to quality, but when I get going the words flow without major obstruction. A consequence of this is that sometimes I will write something that feels right, stylistically, without necessarily considering every nuance of meaning. My defence against this is to proof-read and edit mercilessly and repeatedly, and usually this works. On the occasion in question, for example, a second reading was reassuring, and I was able to respond that yes, that was exactly what I had meant; furthermore, I could happily stand by it as I was confident that it was a valid statement (even if not expressed with perfect precision).

The second instance is rather different. I was part of a group which drafted a call for a large funding programme in marine ecology. The resulting document suffered from all the pitfalls of writing-by-committee, and contains various ambiguities (is this a ‘work package’ or an ‘objective’?), lazy citations, important-sounding but essentially empty phrases, and instances of imprecision that didn’t seem to matter at the time as we were aiming for ‘big picture’ stuff.

But now I’m part of a consortium that is actually trying to get at some of this funding, and every hollow phrase, every ambiguity or lazy shortcut, is coming back to haunt me. The document we wrote is now a sacred text. Sticking to the text is seen as crucial to securing the (multi-million pound) prize. Hushed and urgent conversations include phrases like, “We must follow exactly what the text says here”; “When they give examples of the kinds of things we might address, this means we must address exactly those things, and nothing else” [perish the thought that they were simply the first examples to occur to us]; “What do you think they meant by this?” [just sort of sounded right…]

The moral? Be careful what you write. Someday, somebody may read it carefully and quiz you on it. Worse, they may not read it carefully, and simply take what you wrote at face value. You’d better hope then that your phrasing is sufficiently precise to bear the weight of responsibility.