Bet that title’s got you itching to read on! Feel free to skip this post if you think stats are boring. You’d be wrong, but I won’t judge you… Anyway, getting straight down to the nitty gritty, and assuming that if you’ve hung in until now then you’re not afraid of words like ‘linear’ and ‘model’, here’s the thing: when I learnt statistics, one thing that was drummed into us was that, if you’re fitting a linear model which includes interactions, you can’t sensibly interpret the main effects. I’ve been telling people the same ever since. But in almost every paper that I review or edit that uses such models, people do just that. My purpose here then is to explain why I think that’s wrong. And hopefully, to find out if I am wrong to think that way.
Let’s start with a contrived example. Suppose we’ve measured activity levels at different times of the day across 100 individual birds, 50 of which are larks and 50 are owls, and we get the following:
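(For anyone who wants to play along at home, here’s a sketch in R of data with roughly that structure. The numbers – slopes, intercepts, the range of hours – are invented for illustration; they’re not the values behind the figure.)

```r
## Purely illustrative numbers, not the data behind the figure:
## 50 larks whose activity rises with hour from dawn, 50 owls whose activity falls
set.seed(1)
birds <- data.frame(
  species = factor(rep(c("lark", "owl"), each = 50)),
  hour    = runif(100, 0, 12)   # 'hour from dawn'; the range is made up
)
birds$activity <- with(birds,
  ifelse(species == "lark", 5 + 1.0 * hour, 20 - 1.0 * hour)) +
  rnorm(100, sd = 1)

## Quick look, with one fitted line per species
library(ggplot2)
ggplot(birds, aes(hour, activity, colour = species)) +
  geom_point() +
  geom_smooth(method = "lm")
```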
Of course, in analysing these data the obvious thing to do would be to fit a linear model with activity modelled as a function of hour, species, and their interaction. The fitted lines on the figure above illustrate this model. And the explanation is straightforward: activity increases towards dawn in larks, and decreases towards dawn in owls.
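In R that model is a one-liner (using the made-up `birds` data frame sketched above):

```r
## Activity as a function of hour, species, and their interaction;
## 'hour * species' expands to hour + species + hour:species
fit <- lm(activity ~ hour * species, data = birds)
```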
What doesn’t make sense is to make any statements about general differences between larks and owls, or about any general trend in activity with time from dawn, because these ‘main effects’ are completely entangled with the interaction.
Unfortunately, most statistical software packages will give an output which includes significance levels for both main effects and interactions, for instance:
Or in Anova form (R guys – I know this is wrong, but its particular flavour of wrongness is not important for this point!):
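For the record, output like that comes from calls along these lines – a sketch using the `fit` model above, not the exact commands behind the tables:

```r
## Coefficient table, with a t-test (and p-value) printed for every term
summary(fit)

## Sequential (Type I) ANOVA table; car::Anova() gives Type II/III versions,
## but every flavour happily prints 'significance' for the main effects
anova(fit)
# car::Anova(fit, type = 3)   # if the car package is installed
```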
Both of these outputs make it look like the main effects are ‘significant’, and the coefficients even seem to tell you the direction of those effects. So the kind of interpretation I read again and again would look something like this: “There is a significant interaction between hour and species (p < 0.0001). In addition, activity increases with hour from dawn (p < 0.0001), and is higher in owls than in larks (p < 0.0001).” Which, as we’ve just demonstrated, is nonsense. And although this example has been contrived to emphasise the point, the same problem applies whenever there is an interaction; sensible plotting of your data will usually reveal it.
What should you do instead? Report the significant interaction, and then describe it, using the table of coefficients (and your knowledge of how your stats package of choice codes its contrasts) to calculate the slope (and intercept if you like) for each level in the interaction. Here, for example, the slope for larks is 1.04, and for owls is -1.00 (1.045 – 2.048) – and you can get confidence intervals for both easily enough. And if the interaction doesn’t seem to be important, take it out, and interpret the main effects to your heart’s content.
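Here is roughly what that looks like in R, assuming the `fit` model from earlier, R’s default treatment contrasts with ‘lark’ as the reference level, and a normal approximation for the interval; `emmeans` is one package that will do the bookkeeping for you:

```r
## With treatment contrasts and 'lark' as the reference level, the lark slope
## is the 'hour' coefficient, and the owl slope is 'hour' plus the
## 'hour:speciesowl' interaction coefficient
cf <- coef(fit)
slope_lark <- cf["hour"]
slope_owl  <- cf["hour"] + cf["hour:speciesowl"]

## Approximate 95% CI for the owl slope from the variance-covariance matrix
V      <- vcov(fit)
se_owl <- sqrt(V["hour", "hour"] + V["hour:speciesowl", "hour:speciesowl"] +
                 2 * V["hour", "hour:speciesowl"])
slope_owl + c(-1.96, 1.96) * se_owl

## confint(fit)["hour", ] gives the lark slope's CI directly, and emmeans
## will report one slope and CI per species in a single call:
# library(emmeans)
# emtrends(fit, ~ species, var = "hour")
```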
That’s my take on it anyway. The very concept of the ‘significance’ of a main effect is meaningless in the presence of an interaction involving it. And so I keep telling people. If I’m missing something, please let me know!