Tom Webb

June 10, 2015

Science, Gender, and the Social Network

Some while ago, preparing a piece for the British Ecological Society’s Bulletin on the general scarcity of female ecology professors, we had the pleasure of interviewing Professor Anne Glover. (Shortly afterwards Anne went on to become EU Chief Scientist. Coincidence? You decide…) One of the things that Anne talked to us about was the importance of informal social networks in career progression within science. Business conducted after hours, over drinks. Basically Bigwig A asking Bigwig B if he (inevitably) could think of anyone suitable for this new high level committee, or that new editorial board; Bigwig B responding that he knew just the chap. That kind of thing. In some ways this is one of the less tractable parts of the whole gender in science thing. Much harder to confront, in many ways, than the outright and unashamed misogyny of the likes of Tim Hunt, simply because it is so much harder to pin down. We know that all male panels in conferences, for instance, are rarely the result of conscious discrimination, more often stemming from thoughtlessness, laziness, or more implicit bias.

With something as public as a conference, of course, we can easily point out such imbalances, and smart conference organisers can take steps to avoid them. (My strategy, by the way, is to identify the top names in your field, and invite members of their research groups. It has worked wonders for workshops I have run.) But how to get more diversity out of those agreements made over a pint (or post-pint, at the urinals)?

One way is to take steps to help a wide range of early career scientists to raise their profile. Be nice to them online, invite them to give talks, promote their papers, and so on. But another way into prominence is through publishing. Not your own papers (though that helps, of course), but the process of publishing other people’s. Get a reputation for reviewing manuscripts well, and invitations onto editorial boards will follow. From there, editorial board meetings and socials, and your name starts to gain currency among influential people.

All of which is fine, but peer review is an invitation-only club. If you’re not invited, you’re not coming in.

Which brings me to the point of this post. I’m on a couple of editorial boards - Journal of Animal Ecology and Biology Letters. As a handling editor, I am responsible, among other things, for inviting referees to review manuscripts. And when I do this, you can bet your life that I will be calling on those potential reviewers nominated by the authors. Not exclusively, but certainly they will figure.

And I started to wonder what kind of gender balance there might be among these suggestions. 34 papers in, here’s your answer. (I should stress that the identity of the journals has no bearing on the following; all statistics are purely the result of choices made by submitting authors.) Over 40% of submitting authors did not suggest any female referees, with female suggested referees exceeding males on only 2 occasions, and a median proportion of 15% female suggestions. The number of suggested female referees does not increase with the total number of referees suggested, nor is there any relationship between the proportion of female authors (median in this sample of 1/3) and the proportion of female suggested referees (correlation of 0.05, if you want numbers). Here’s a couple of figures:

What’s the message here? Maybe we need to start thinking more carefully about the lists of names we come up with, not just when these choices will be public (speakers at a conference, for example) but also - perhaps especially - when they will not. And not just because of benefits that reviewers may or may not eventually receive in terms of board membership and so on. We get quickly jaded about the whole process of reviewing manuscripts, and forget too soon what a confidence boost it can be to be asked.

And just a coda: I’ve been thinking about this blog post for some time, a year at least. What is depressing is the number of occasions over that year - Hunt’s ridiculous outburst merely the most recent - when I have thought ‘I must get that post written, it’s so topical right now.’ How many years since Anne Glover outlined all these issues to us? (Eight, and counting.) How much has actually changed?

Well, one thing has, at least - the rise of new social networks, the online community that can be cruel but can also be incredibly supportive, providing a voice for those whom certain public figures would prefer to remain mute. These networks are open, no longer dependent - thank goodness - on 1950s values, beer-fuelled patronage, and old school ties.

Tom Webb

April 10, 2015

Cricket averages: what do you mean?

Easter has always seemed a nothing sort of a holiday to me. Partly it’s because I never know when it will be (I would vote for a party that pledged to standardise Easter, but that’s another matter…). There is - of course - an R function, timeDate::Easter(), but Easter’s date will never be ingrained in the way that Christmas is, and thus anticipation will never build to the same extent. There’s not much to look forward to, either. Don’t get me wrong, I quite like chocolate; which is why I eat it whenever I feel like it, regardless of the time of year. And even when I was a pious little church-going boy, I could never actually get excited about Easter. But the end of the Easter holidays? Well, that was a different matter. Summer term meant many things - ties became optional, blazers were off most of the time, and the daily school bus ride was less of a trudge when the sun was out. The main thing, however, was the neat, flat, freshly mowed square of grass waiting for us in the middle of the playing field which meant one thing: cricket. For a few years I lived for cricket, and would play at every opportunity. And when I couldn’t play - when it was raining, or dark, or winter - I would pore over back issues of Wisden Cricket Monthly, soaking up the hallowed stats.

I guess many kids - I don’t want to fall into gender stereotypes, but I could probably have written ‘many boys’ there without too much controversy - are introduced to quantitative thinking through a fixation with sports statistics. And cricket is great for stats - I’m not sure we have a dedicated R book yet in the way baseball does, but a game so slow and intricate, with so many things to measure and count, has spawned a wealth of stats, now fully searchable through interfaces such as cricinfo’s statsguru. Thus, more or less any notable feat in a cricket match is some kind of record - the highest score by an English wicketkeeper batting at number 7 in the 3rd innings of a test match against Pakistan at Headingley, and so on. As a kid I lapped all this up, and most numbers up to 501 (Brian Lara’s record for the highest first class score) have some cricketing resonance for me.

As my quantitative skills became more sophisticated, however, I began to realise that what are called ‘stats’ in sport are usually just data, there to be arranged, cherry picked, or otherwise massaged to tell whichever story suits a particular commentator’s overriding narrative. Furthermore, I started to question the gold standard by which cricketers are remembered - their ‘average’. For batsmen, this is the mean number of runs they have scored per completed innings; for bowlers the mean number of runs conceded per wicket. And these are the numbers most keenly studied by students of the game, used to judge one player against another, or to assess the vagaries of form of an individual player over the course of his career.
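To make those two definitions concrete, here is a minimal Python sketch. The function names and the scores are invented for illustration; the one thing to notice is that a batting average divides by dismissals, not by innings played, so a not-out score adds runs without adding to the denominator.

```python
from typing import List, Optional, Tuple

# One innings is (runs scored, whether the batsman was dismissed).
Innings = Tuple[int, bool]


def batting_average(innings: List[Innings]) -> Optional[float]:
    """Total runs divided by the number of completed (dismissed) innings."""
    total_runs = sum(runs for runs, _ in innings)
    dismissals = sum(1 for _, dismissed in innings if dismissed)
    if dismissals == 0:
        return None  # undefined until the batsman has been out at least once
    return total_runs / dismissals


def bowling_average(runs_conceded: int, wickets: int) -> Optional[float]:
    """Runs conceded divided by wickets taken."""
    if wickets == 0:
        return None  # undefined for a bowler without a wicket
    return runs_conceded / wickets


# Made-up scores: the 45 was not out, so it counts towards runs only.
example = [(12, True), (45, False), (0, True), (103, True)]
print(batting_average(example))   # 160 runs over 3 dismissals
print(bowling_average(250, 10))   # 250 runs over 10 wickets
```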

There are a number of reasons to dislike the naive arithmetic mean, even in situations where it is a good measure of central tendency. For instance, designing public transport to be comfortable for people of average height leaves half of the population (the half into which I fit…) generally uncomfortable. But how useful is it in judging a player’s performance? Well, it depends what you want to know.

Let’s take the most famous average in cricket, 99.94 (you’ll note the precision; cricket nerds love precision). That was the average that Don Bradman ended his test career with - famously finishing in his 80th innings with a duck (0) against England at The Oval when a score of just 4 would have secured a career average of 100. Bradman’s average is the most freakish of outliers - no other batsman who has batted 20 or more times has averaged higher than 65, with 50 typically considered the hallmark of an exceptional player - but a look at his figures still serves to illustrate some points.
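The arithmetic behind that near miss is easy to check. The sketch below uses Bradman's widely published career totals (6996 runs and 10 not-outs across those 80 innings), which are assumed here and do not appear in the post itself.

```python
# Career totals as widely published; assumed for this check, not taken from the post.
RUNS = 6996
INNINGS = 80
NOT_OUTS = 10

dismissals = INNINGS - NOT_OUTS                # completed innings, final duck included
average = RUNS / dismissals                    # the career batting average
runs_short_of_100 = 100 * dismissals - RUNS    # extra runs needed to average exactly 100

print(round(average, 2), runs_short_of_100)
```

Because he was dismissed in that last innings, the denominator would have been the same had he been out for 4 instead of 0, which is why four more runs would have been enough.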

First, you can see that the distribution of Bradman’s scores is highly skewed. This makes complete intuitive sense - batsmen are always vulnerable early in their innings (lots of low scores, including seven scores of 0), but once they get ‘in’ the best batsmen capitalise with a big score. Few if any did this better than Bradman - he passed fifty 42 times, converting 29 of these to scores of 100 or more, 18 of which were what the kids these days call ‘daddy hundreds’ (>150), two thirds of these eventually ending over 200 (ten double hundreds and two triples).

But what is also clear is that Bradman hardly ever scored anything close to his average. Only three times did he finish with a score within 5 runs of 99.94 - two scores of 103, one of 102. He was not dismissed in two of these innings, so in my plot they are added to the next completed innings - which, as it happens, was in one case the third of these innings. So, there’s a noticeable hole in the frequency distribution of completed innings between scores of 89 and 111, exactly where the average lies. Bradman’s average, then, is a really poor indicator of his likely score in any particular innings - he was far more likely to score 0, or 225 (±10), than 100.
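The treatment of not-out innings described above can be sketched as follows. This is one reading of that description rather than the code behind the plot, and the function name and scores are invented: runs from a not-out innings are carried forward and added to the next innings in which the batsman is dismissed.

```python
from typing import List, Tuple


def completed_innings_totals(innings: List[Tuple[int, bool]]) -> List[int]:
    """Fold each not-out score into the next innings that ends in a dismissal.

    Each item is (runs, dismissed). Not-out runs at the very end of a career,
    with no later dismissal to attach them to, are left out of the result.
    """
    totals = []
    carried = 0
    for runs, dismissed in innings:
        carried += runs
        if dismissed:
            totals.append(carried)
            carried = 0
    return totals


# Made-up sequence: 103 not out, then out for 102, out for 0, then 30 not out.
example = [(103, False), (102, True), (0, True), (30, False)]
print(completed_innings_totals(example))
```

A side effect of this choice is that, when no not-out runs are left over at the end, the plain mean of the merged totals equals the conventional batting average, since both divide all runs by the number of dismissals.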

What might we do as an alternative? Bradman’s median score is the far less romantic 67, something he scored close to (±10 runs) about 10% of the time. His geometric mean score (which requires, problematically, removing the 0s) is 45.23, which again he was close to once every ten innings. Maybe we should cite too a measure of variability - the standard deviation, say, which is 94.17, or the median absolute deviation of 80.06.
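For anyone wanting to compute the same alternatives for another player, here is a rough Python sketch using only the standard library. The scores are invented, and the scaling constant for the median absolute deviation is an assumption: 1.4826 is the default in R's mad(), and the 80.06 quoted above is consistent with an unscaled value of 54 multiplied by that constant, though the post does not say how it was calculated.

```python
import math
import statistics
from typing import Dict, List

MAD_SCALE = 1.4826  # assumption: the default scaling constant in R's mad()


def geometric_mean_nonzero(scores: List[int]) -> float:
    """Geometric mean after dropping zeros, which have no logarithm."""
    positive = [s for s in scores if s > 0]
    return math.exp(sum(math.log(s) for s in positive) / len(positive))


def median_absolute_deviation(scores: List[int], scale: float = MAD_SCALE) -> float:
    """Scaled median of the absolute deviations from the median."""
    centre = statistics.median(scores)
    return scale * statistics.median([abs(s - centre) for s in scores])


def summarise(scores: List[int]) -> Dict[str, float]:
    """Summaries of a list of scores (plain mean, not the batting average)."""
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "geometric_mean_nonzero": geometric_mean_nonzero(scores),
        "sd": statistics.stdev(scores),  # sample standard deviation
        "mad": median_absolute_deviation(scores),
    }


toy_scores = [0, 4, 16, 67, 100]  # invented scores, not Bradman's
for name, value in summarise(toy_scores).items():
    print(f"{name}: {value:.2f}")
```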

All of this though misses the point, which is that Bradman’s average tells us one thing loud and clear: he was an astonishingly good batsman. And while we might want to make some distinctions between players from different eras, or in different forms of the game, for broad comparisons the average serves pretty well. It seems silly to read too much into the decimal places - was Allan Border, with a career average of 50.56, demonstrably better than my childhood hero Viv Richards (50.23)? Of course not. Occasionally, too, you’ll get a Jason Gillespie event - a player with a career average of 18 scoring a double hundred - just as Bradman got his ducks. So on an innings by innings basis, the average might not be useful, but over the course of a year or two scores will tend to, well, average out. Does an average of 42.35 then indicate a stronger batsman, likely to score more heavily than one averaging 10.74? Even when applying the arithmetic mean to a horribly skewed distribution? Well yes, I think it does.

(Oh, and if you wondered which players have averages of 42.35 and 10.74, well, they’re on the same team, but the data aren’t from cricinfo…)

Tom Webb

February 27, 2015

Diversity and extinction of tongues and species

Some years ago, at a rather posh function in a swanky London venue, I got talking to a peer of the realm. By this point I had been drinking my endless glass of wine for some time (they have stealthy waiters at these kinds of dos), and didn’t quite catch his name, but he had been, apparently, head of a large supermarket chain. And his response to me mentioning the word ‘biodiversity’ has stuck with me. “When I took over at M&S”, he said - or was it Morrisons, or maybe Sainsbury’s? - “I noticed that we stocked loads of different kinds of tomatoes. I said that we should just stock one kind, but make sure it was a fucking good tomato. I sometimes think the same about biodiversity: focus on just a few species, but make sure they are fucking good species.”

Well, an interesting take I suppose, and perhaps the logical outcome of a purely utilitarian approach to nature. But not, I submit, a view that would go down well with many conservation groups. No place in this world for God’s own prototypes, the weird and the rare never considered for mass production. No place for a grass-powered bear reluctant even to reproduce, or a fish content to spend its entire life in a tiny pool.

So anyway I filed away the anecdote, to be dusted off from time to time when the occasion arises. But I got to thinking about it again just recently, after reading the excellent Lingo: A language-spotter’s guide to Europe by Gaston Dorren. Over 60 brief chapters, this book provides pen portraits of dozens of European languages, from the behemoths of English, German, and French to tiddlers like Manx, Monegasque and Sorbian. It is full of fascinating nuggets, such as the plural for the Welsh word cwm being (naturally) nghymoedd. There are also examples of useful words that English might consider - the German Gönnen, for example, “the exact opposite of ‘to envy’: to be gladdened by someone else’s fortune.” Interesting that we happily adopted Schadenfreude but not this… Other favourites include the Dutch Uitwaaien, to relax by visiting a windy, chilly, rainy place; the Sorbian Swjatok for the enjoyable hours that follow the end of the working day; the wonderful Greek Krebatomourmoúra, “similar in meaning to ‘pillow talk’ but with a greater element of discord”; and the Slovene Vrtičkar, “strictly speaking no more than a hobby gardener with an allotment, but the word also suggests that the person is more interested in spending time with other vrtičkars than in growing vegetables and flowers.”

More than these fun pieces of trivia, however, the book gives a valuable overview of the languages and people of my home continent, including useful tips - tricks to identify written languages, a primer in the Cyrillic alphabet - as well as a potted history of conquest and subjugation. But it is also a study of loss: of the extinction and near extinction (and, more positively, occasional resurrection) of our continent’s linguistic diversity.

The parallels with biological diversity are striking, and of course I am not the first to make them. Indeed this lovely paper by Tatsuya Amano and colleagues actually presents a full macroecological analysis of the world’s 6909 languages, formally assessing extinction risk based on the same criteria that the IUCN uses to assess species. They show that around a quarter of all languages are threatened based on a small ‘range’ or population size (spoken in an area of less than 20 square kilometres, or by fewer than 1000 people), or an alarming rate of decline. Their maps showing hotspots of diversity and threats, and their analyses of drivers of change, also have a familiar look to those of us more used to examining spatial patterns in biodiversity.

Of course this seems sad, just as the loss of diversity within languages is also troubling, as we lose the ability to express uniqueness of place and of our connection with the landscape. But the thing with language is that it is so personal - especially for me, now, watching my kids go through the endlessly fascinating process of acquiring it. And so whereas I unequivocally want to prevent the extinction of species, as far as languages go - well, a little part of me agrees with the good Lord above. Diversity is great in theory, but in practice…? Basically, I want my kids to learn a fucking good language.

Happily, at this point in time, I have no conflict to resolve: English, for better or worse, is just such a language. But what if I’d got that job in north Wales a few years back? Not only might I have had to contend with the frankly unthinkable proposition of children of mine shouting for Wales in the Six Nations, what about the possibilities for mischief opened up by kids speaking a language I can’t understand? And while bilingualism has many advantages, wouldn’t it be kinder to your kids to have them fill the ‘second language’ part of their brain with something more ‘useful’? Spanish or Mandarin or something else that opens up new parts of the world to them?

No doubt this attitude arises in part from my monoglot culture, beautifully captured in the Eddie Izzard quote with which Dorren begins his book, “Two languages in one head? No one can live at that speed! Good Lord, man, you’re asking the impossible!” On the contrary, learning two, three, four languages seems perfectly possible in many parts of the world. But for those seriously threatened languages, well, keeping them alive - truly alive, not simply remembered - means that some people’s children have to learn them. And I can’t help wondering: is that really fair?