Unnecessary Cladistics

There is one more thing about ‘Nature’ style descriptions that I want to address here, one that affects more than just the highest ranked journals (though they are the hardest hit / most culpable depending on your interpretation), and that is the pressure for descriptions of new taxa to be accompanied by cladistic analyses. Now let us get one thing clear right off: I love cladistics and I consider myself a cladist. As a tool for biological (and beyond, as it happens) research, it is extremely useful and powerful; its correct use can add enormously to a description (by placing the organism in its correct phylogenetic context) and it is generally useful for all kinds of evolutionary things.

However, it is a complex and difficult method to get to grips with. On the face of it, it is simple: assemble anatomical character descriptions, code them, run the program, print the tree. But that simple description masks a huge number of subtleties, complexities and details that take a long time to master, and many simple mistakes can wreck an otherwise good analysis and render the results incorrect and effectively meaningless (by the way, if you are horribly lost at this point, cladistics is the method used to create those evolutionary trees you see everywhere that show how taxa are related to one another). I consider myself a competent cladist only: I took more than one course on the topic as an undergraduate and masters student, cladistics formed a large part of my PhD research, I have published several papers that include cladistic analyses, and I have experimented with methods a little and taught the odd lecture on the subject. I would be happy to provide a phylogenetic analysis with a new taxon I was describing, provided I was familiar with the anatomy of the clade in question and recent phylogenies based on it (I did this for Fodonyx), but I am in the minority. Cladistics, despite its apparent simplicity, is a tricky bugger and is rarely taught beyond the most basic principles (there are not many cladists out there), yet poorly conceived and poorly executed phylogenies are regularly and needlessly tacked onto descriptions, and this seems to be more common with the higher ranking journals, which seem to pressure unwilling authors into including them.

There are several problems with this approach. First of all, of course, is people messing around with things they don’t understand. If you do not know how to do a cladistic analysis or are messing around with a clade from well outside your research area, get some help, or just don’t do it. A cladistic analysis is not essential to a description, and if it is that important you can always publish one later as a separate paper. Secondly, if pressure is applied by an editor to include one, then not only is the author more likely to add one (skilled or otherwise) but the constraints of getting a high-impact publication will make them want to avoid bringing in extra co-authors. Added to that, of course, cladistic analyses take time, and a rushed analysis tacked on the end of a paper is unlikely to be especially good, no matter how expert the researcher. Worst of all, of course, is the ‘extended error’ that comes from a poor analysis.

It is one thing to produce a poor analysis of one’s own, but another for it to be used by someone else. Phylogenetic trees are so important now for so many kinds of analyses (like biogeographical studies, looking at evolutionary rates, patterns of evolution, extinctions etc.) that a single tree will often be used many times by many other people for their own research. The problem is though, that if that original tree was poorly constructed and did not accurately reflect the data that went into it, then any analysis based upon it will be flawed. Not all trees are the same in terms of quality – even two apparently very similar datasets can produce wildly different results in terms of the tree depending on which methods are applied to the data and why this was done. The tree might not even be suitable for further analyses such as those mentioned above, but this may not be realised by the researcher. Almost every mistake of this kind is entirely innocent – not everyone can be a specialist in every methodological field and so it is perhaps inevitable that inappropriate trees will be co-opted for these kinds of analyses, but it certainly does not help the literature when they are. If you do not even know that trees can be made in different ways or represent different things, then you are not going to even think about comparing it to other trees, or asking the original author about its origins etc.

However, there is a difference between innocent error as a result of ignorance, and knowing or suspecting that you are playing around with methods well outside of your normal realm of research without the necessary experience or assistance and producing work you know, or at least suspect, might not be very good. I simply wish non-cladists would stop thinking it is so simple and making bad mistakes about how cladistics works, how it is used, and what the results mean. If they are not confident about it, fine, but why are they being pressured by themselves or by editors into adding unsuitable or poor analyses? The scarcity of experienced cladists only compounds the problem: when the thrust of a new paper is on the organism and its interesting features, the priority for an editor will be to find reviewers who can check the main parts of the author’s arguments. Those reviewers are probably not cladists themselves, which allows these analyses to slip through the net.

In short, the practice of tacking on cladistic analyses to the ends of descriptive papers really should be stopped, or at least significantly tightened up. Even those simple ones where a new line for the new taxon is added to an old analysis are frequently not as reliable as one might expect, and often the original analysis itself had problems. Authors should be more aware of what they do, and do not, understand (which is true of any study, admittedly) and editors should be more aware of the fact that these issues are important. Phylogenies are integral parts of papers and need to be checked by cladists and not just published on the assumption they are correct without a proper review.

8 Responses to “Unnecessary Cladistics”

  1. 1 Zach Miller 15/07/2008 at 2:53 am

    I’ve never liked seeing “we added Tomosaurus to Dick, et al.’s phylogeny (1998), which resulted in Bobosaurus being the most basal member of the Harrysauridae.” I understand that original cladograms might take a while, but why do one at all in a short, barely-there description in the first place? I mean, I’ve seen cladograms tacked onto the end of 3-page papers which describe a jaw from a new animal. A JAW!

    “Based on these two, maybe three, derived features, we think that Mandiblesaurus is the most basal member of the Craniosauridae.”


  2. 2 Brad McFeeters 15/07/2008 at 3:47 am

    If you want to see a really strange little cladogram in a Nature paper, take a look at Xu et al 2004. The sister taxon of Tyrannosauroidea is… Citipati?!

  3. 3 David Hone 15/07/2008 at 7:11 pm

    Brad, that is just a function of the taxa that are in the matrix, not one of bad practice. With just a couple of genera more derived than tyrannosaurs and a limited pool of characters, that is no surprise. The problem would be people taking that result and thinking the relationship between T and C is important.

    Similarly, Zach, there is nothing fundamentally wrong with adding one line to an existing matrix, *provided* you understand exactly what the characters are, how they are scored and how the matrix was analysed. It would be silly to expect each new species description to be accompanied by a new independent analysis with hundreds of characters and dozens of OTUs, but it *is* reasonable to expect them to do this competently, and that (in my opinion) happens too rarely.

  4. 4 Dave Godfrey 16/07/2008 at 11:09 pm

    Given the prevalence of cladistic analyses to accompany every new find I wonder if there’s an attitude that if you haven’t done one your paper won’t get published. Stick another line in an existing matrix, and you can say exactly where your new animal fits- even if the evidence is really somewhat lacking.

  5. 5 Jaime A. Headden 21/07/2008 at 5:35 pm

    I cannot add anything meaningful to the post. It is concise and descriptive. I would only say that in my own analyses, a single bone, depending on the constraints under which it evolves, can be phylogenetically informative. An analysis of bone series or complexes, such as a mandible-only analysis, is possible, and adding a mandible to a matrix of more complex material was a consideration of several papers in the 2003 JVP (23:2) that attempted to analyze the effect of missing data. The journal Cladistics is regularly filled with more. The key is taxon sampling, not specimen sampling, though sampling specimens sufficiently is also important.

    Something interesting we are seeing is the growth of specimen-specific matrix samples, which can help clarify the cohesiveness and effect of long-branch attraction in close-knit taxa towards stems or the base or crown of the tree.

    I myself am working (and have been for some time) on a matrix analyzing theropod dinosaur humeri and phylogeny-independent features (I will likely fail, but should I fail or succeed, this is useful info).

  6. 6 David Hone 21/07/2008 at 9:24 pm

    Dave, I really do think that is genuinely part of the problem, but people should avoid the temptation unless they understand how and why the initial matrix was designed, coded and analysed. In my opinion, too few people do.
