The big blue bit in the middle: still big, still blue

Last week, I had the dubious pleasure of revisiting some work I did over three years ago. Back then, as the Census of Marine Life was in its final stages, I got together with Edward Vanden Berghe, then managing the Ocean Biogeographic Information System (OBIS), to investigate the suspicion of CoML senior scientist Ron O’Dor that surveys of marine biodiversity largely overlooked ‘the big blue bit in the middle’ – the deep pelagic ocean, by far the largest habitat on Earth. The idea that Edward and I hit on was to use OBIS to produce a plot that would show if Ron was right. OBIS contained at that time around 20 million records, with each record representing the occurrence of a specific species in a particular location. Only around 7 million of these also recorded the depth at which the species had been recorded, but by comparing the depths of these 7 million samples with a global map of ocean depth, we were able to place each of them at a position in the water column. As you can see (and as we showed more rigorously in the resulting paper), Ron was right: in all regions of the ocean, biodiversity records from midwater are far less common than those from the sea bed, or those from surface waters.

Fig 1. Global distribution within the water column of recorded marine biodiversity, using approximately 7 million occurrence records extracted from OBIS in 2009. The horizontal axis splits the oceans into five zones on the basis of depth, with the width of each zone on this axis proportional to its global surface area. The vertical axis is ocean depth, on a linear scale. The inset shows in greater detail the continental shelf and slope, where the majority of records are found. Note this is slightly different from the version previously published, as it is scaled to the 2013 range of data.
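
For readers who want a feel for the placement step described above, here is a minimal sketch of the idea: divide each record's sample depth by the sea-bed depth at its location to get a relative position in the water column. This is not the R code from the paper's appendix. It is written in Python, and the function names, the 10% and 90% cut-offs, and the example records are all hypothetical, chosen only to illustrate the logic.

```python
from collections import Counter


def relative_position(sample_depth_m, bottom_depth_m):
    """Position in the water column as a fraction: 0 = surface, 1 = sea bed."""
    if bottom_depth_m <= 0:
        raise ValueError("bottom depth must be positive")
    # Clamp to [0, 1]: a recorded sample depth can exceed the mapped
    # bottom depth when the depth map is coarser than the sampling.
    return min(max(sample_depth_m / bottom_depth_m, 0.0), 1.0)


def classify(sample_depth_m, bottom_depth_m, surface_frac=0.1, bed_frac=0.9):
    """Label a record as surface, midwater or sea bed.

    The cut-offs are illustrative assumptions, not values from the paper.
    """
    pos = relative_position(sample_depth_m, bottom_depth_m)
    if pos <= surface_frac:
        return "surface"
    if pos >= bed_frac:
        return "sea bed"
    return "midwater"


# Made-up records: (sample depth in m, sea-bed depth at that location in m)
records = [
    (2, 50),       # shallow shelf, near the surface
    (49, 50),      # shallow shelf, near the bottom
    (10, 4000),    # open ocean, surface waters
    (2000, 4000),  # open ocean, midwater
    (3990, 4000),  # open ocean, near the sea bed
    (4100, 4000),  # sample deeper than the mapped bottom: clamped
]

counts = Counter(classify(s, b) for s, b in records)
print(counts)
```

With real data the sea-bed depth would come from looking up each record's coordinates in a global bathymetry grid; that lookup is left out here.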

We discussed the implications of this chronic under-sampling of the world’s biggest ecosystem in our paper, but when talking about this work I prefer to quote from another paper, by Bruce Robison:

The largest living space on Earth lies between the ocean’s sunlit upper layers and the dark floor of the deep sea… Within this vast midwater habitat are the planet’s largest animal communities, composed of creatures adapted to a… world without solid boundaries. These animals probably outnumber all others on Earth, but they are so little known that their biodiversity has yet to be even estimated

Since our paper came out, I have continued to use OBIS data in my research attempting to describe and explain the distribution of diversity in our oceans. At the same time, OBIS has changed too, both structurally - it’s moved from Rutgers in NJ to the IOC Project office for IODE in Ostend - and in terms of its content, now housing over 35 million records, including almost 19 million which recorded sample depth.

So back to last week, and an email from the current manager of OBIS, Ward Appeltans, asking if I might be able to update the figure from our 2010 paper with new OBIS data.

With some trepidation, I opened up the file of R code I’d used for the original analysis. And got a pleasant surprise: it was readable! Largely this was because I submitted it as an appendix to our paper, and so had taken more care than usual to annotate it carefully. I think this demonstrates an under-appreciated virtue of sharing data and code: in preparing it such that it is comprehensible to others, it becomes much more useful to your future self. This point is nicely made in a new paper by Ethan White and colleagues on making using and reusing data easier.

So, rather than days of fiddling, I was able to get the code up and running with new data really quite quickly. Of course, there were a few minor bugs to sort out - one thing I always do with R code now, but didn’t at the time, is to insert the command rm(list = ls()) at the top, to clear my workspace. The fact my old code didn’t work immediately was, I think, down to my failure to do this - the code required an object that was clearly hanging around in my workspace at the time. But it was simply a matter of correcting a name inside a function and it all worked fine. (Actually, one thing still doesn’t work well, which is getting the figure from R into a satisfactory, scalable vector format which looks nice in other packages - the PDF looks OK (but not great) in Preview but awful viewed in Acrobat, for example - but that’s another story…)

What happens, then, to our view of the depth distribution of marine biodiversity knowledge when we increase the number of observations from 7 million to 19 million?

Fig 2. Figure 1 updated to use the c. 19 million suitable occurrence records available in OBIS in 2013.

Actually, rather little: the overall pattern is pretty much the same, with far more records from shallow than deep seas, and a paucity of midwater records at all depths. The big blue bit in the middle remains both big and blue.

Postscript: Ward at OBIS emailed me to suggest that this post comes across a bit on the negative side, which was certainly not my intention. Even back in 2009 OBIS was a phenomenal resource for marine biodiversity research; the fact that in under 4 years, the number of useful records for my analysis has increased >2.5x is amazing. My view is still that the big blue bit in the middle remains both big and blue, but it's very heartening to see the fingers of yellow extending further and further into the colossal deep pelagic ocean. It would be nice to think that our data visualisation exercise has had something to do with this!

Precision Phrased

Following on from my last post, which contained some pretty precise phrases (albeit not ones of my own), I’ve been reading with interest some good recent posts on the need for precision in scientific writing. On this network Matt Shipman (@ShipLives) wrote about the importance of defining technical terms, using examples like ‘extinction’ which is often used more loosely than its specific and precise meaning allows. Elsewhere, Lewis Spurgin (@LewisSpurgin) channels Orwell in an entertaining (and, I may add, spot-on) tirade against poor writing in science. Simon Leather (@EntoProf) has a more specific complaint regarding lazy referencing and ignoring precedence when citing the work of others. Although these three posts take aim at different targets, they are united by their abhorrence of sloppiness. Writing properly matters, whether that be choosing the right word, avoiding lazy phrasing, or applying rigorous standards of scholarship. On two occasions recently I’ve had cause to reconsider the precision of my own writing. First, I received an email about one of my papers (the one I discussed here). It was a nice email, but it contained this:

There was one sentence that I wanted to ask you about: ‘More generally, the environmental conditions (as measured by the typical spatial and temporal scales of environmental variation) that are likely to select for or against the evolution of specialisation are not restricted to either marine or terrestrial systems…’ I just want to make sure that I understand this correctly – are you suggesting that both marine and terrestrial systems have relatively variable environments and also relatively stable environments?

‘No, no, no!’ was my initial thought, ‘you’re not meant to actually read what I write, and certainly not to take it seriously!’ This partly was just an expression of my underlying anxiety about being rumbled and hounded out of the academy; but it also instigated a minor panic about my writing methods. Writing is something that comes quite easily to me - I’m not making any claims as to quality, but when I get going the words flow without major obstruction. A consequence of this is that sometimes I will write something that feels right, stylistically, without necessarily considering every nuance of meaning. My defence against this is to proof-read and edit mercilessly and repeatedly, and usually this works. On the occasion in question, for example, a second reading was reassuring, and I was able to respond that yes, that was exactly what I had meant; furthermore, I could happily stand by it as I was confident that it was a valid statement (even if not expressed with perfect precision).

The second instance is rather different. I was part of a group which drafted a call for a large funding programme in marine ecology. The resulting document suffered from all the pitfalls of writing-by-committee, and contains various ambiguities (is this a ‘work package’ or an ‘objective’?), lazy citations, important-sounding but essentially empty phrases, and instances of imprecision that didn’t seem to matter at the time as we were aiming for ‘big picture’ stuff.

But now I’m part of a consortium that is actually trying to get at some of this funding, and every hollow phrase, every ambiguity or lazy shortcut, is coming back to haunt me. The document we wrote is now a sacred text. Sticking to the text is seen as crucial to securing the (multi-million pound) prize. Hushed and urgent conversations include phrases like, “We must follow exactly what the text says here”; “When they give examples of the kinds of things we might address, this means we must address exactly those things, and nothing else” [perish the thought that they were simply the first examples to occur to us]; “What do you think they meant by this?” [just sort of sounded right…]

The moral? Be careful what you write. Someday, somebody may read it carefully and quiz you on it. Worse, they may not, and take what you wrote at face value. You’d better hope then that your phrasing is sufficiently precise to bear the weight of responsibility.

Pretentious, moi? Literary quotes in science

The most important thing to consider as a PhD student writing up is, of course – I’m sure we’d all agree – what quotes you plan to use in order to show off to your examiners just how cultured and well-read you are. A decade and more after submitting my thesis, I’m still proud of my selections, feeling they tick both boxes. (I will leave it to you to decide whether they also tick a third, ‘pretentious git’.) Having finally, reluctantly come around to the fact that the total number of people ever to have read my masterwork is unlikely to increase any time soon, I thought I’d share them with you here. First thing to note: I took this quote selection process very seriously (as is right and proper) and started noting down potential candidates fairly early in my PhD. I was determined to avoid anything commonplace, and in particular steered well clear of quotation dictionaries. Also – I only now realise – it never really occurred to me to quote a scientist, still less a scientific paper. I guess I thought that side of me would be well represented throughout the rest of my work, and I wanted these choice quotes to reflect instead my more arty, sophisticated, fancy-cocktail-and-complicated-music sensibilities.

I also need to give some context. I spent my PhD studying the phenomenon of rarity. Rarity is common: most species are extremely restricted both in terms of numbers of individuals and spatial distribution. What are the causes and consequences of this? In particular, I was interested in whether rare species are in any sense special – for instance, do their biological characteristics differ consistently from those of common species? So throughout my studies I was on red alert for any interesting use of the word ‘rare’, and especially anything that carried connotations of oddity arising as a function of being rare.

The perfect quote finally arrived in the cinema, as I was watching Terry Gilliam’s masterful interpretation of the great Hunter S. Thompson’s Fear and Loathing in Las Vegas. I had no notebook, no pen; however, I knew I had the novel at home so simply had to re-read it (always a pleasure) to find the quote, no? No. Turns out it’s not in the book; so I bought the VHS (OK, OK: I'm old) when it came out and watched it, finger poised over the pause button (and rewinding several times to make sure I’d interpreted Johnny Depp’s drawl perfectly) until I grabbed the quote:

There he goes, one of God’s own prototypes – a high-powered mutant of some kind never even considered for mass production. Too weird to live, too rare to die. Raoul Duke, Doctor of Journalism, of his Attorney

The rather odd attribution was because I was unsure if it was a Thompson original, or directly from Gilliam’s screenplay, so I stuck with the character names. Only later did I find the original source, in The Great Shark Hunt, a collection of Thompson’s writing, where he uses it to describe his (HST, Doctor of Journalism, alter-ego: Raoul Duke) real-(if larger-than)-life attorney, Oscar Zeta Acosta.

So that was all nice and relevant to the topic of my thesis, but how should I demonstrate the true depth of my intellectual faculties? Being a bit of a francophile, I thought I should have something in French; and who better to quote than Enlightenment poster-boy Voltaire? But I didn’t want anything run-of-the-mill – nothing from Candide, say. Fortunately, I’d read a collection of Voltaire’s work, and came across this from Memnon to start my introduction:

Memnon conçut un jour le projet insensé d’être parfaitement sage. Il n’y a guère d’hommes à qui cette folie n’ait quelquefois passé par la tête. Voltaire, Memnon (ou la sagesse humaine), 1747

My French is far rustier these days, but a (very) loose translation is something like, “One day Memnon came up with the ludicrous plan of becoming perfectly wise. There are few men to whom this mad idea has not occurred, from time to time.” Seemed somehow apt.

Finally, I needed something to start the general discussion. My thesis was rather a rambling affair (the first comment of my external examiner was, “Tell me, why did you decide to write two theses…?”), and I found a gem in Francis Wheen’s terrific biography of Karl Marx. I was not trying to make a political point – although it’s hard to disagree with the sentiment of ‘from each according to his ability, to each according to his needs’ – but through Wheen’s book I had become quite fond of Marx the fallible man, especially the contradictions between his socialist ideas and his own rather upwardly-mobile social pretensions. He was quite the procrastinator too, and as a writer nearing the end of this major project, my PhD thesis – and freshly out of funding and relying on benefits and the generosity of friends – I certainly empathised with the sentiment expressed here:

The material I am working on is so damnably involved… but for all that, for all that, the thing is rapidly approaching completion. There comes a time when one has forcibly to break off. Marx, letter to Joseph Weydemeyer, 1851

I have never really stopped struggling with this. (Neither did Marx: it took a further 16 years after he wrote the above for the first volume of Das Kapital actually to appear…) Knowing when to finish something, to submit and move on, is not my greatest strength. Perhaps this is the place.