My colleagues Mike Taylor and Andy Farke, among others, have done an admirable job of promoting the concept of open access in palaeontology, both for data and for the actual research papers that academics produce. However, while this is on the whole a very good thing, it has, I believe (in conjunction with other phenomena), produced problems for the frontline scientists whom it is supposed to help.

While what I am about to write may be seen as a complaint, it should not be – it is an observation. It is for me currently problematic, but that does not mean that I do not support open access (I do) or that this is a huge issue (it isn’t) or that on balance open access is a bad thing (not true either). With change come problems, some foreseen and others not, and most if not all are ultimately overcome or sidestepped to the general satisfaction of most, so this is not something I expect to be a long-term issue. Here I simply want to illustrate a couple of problems that I have not seen commented on or discussed before. So with this in mind, what’s the problem?

The issue is one of the critical mass of literature that is now landing on the desks (or these days, hard-drives) of palaeontologists. Now I admit that since I dabble in quite a range of areas (theropods, sauropods, pterosaurs, body size, flight, evolution, ecology and so on) I’m probably at the wrong end of this, and if I just stuck to reading, say, only the pterosaur and cladistics literature to keep up with systematics and new pterosaurs, it’d be fine. However, I don’t, and in any case there are still subjects just as ‘narrow’ as pterosaur phylogenies that produce far more papers (we’ve quite possibly had more tyrannosaurs alone this year than pterosaurs, certainly more phylogenies) so it’s not necessarily a fair comparison.

When I was finishing up my BSc, which was only just 10 years ago, I would go along to the library every week or two and flick through the issues of a couple of dozen journals that the library had that were relevant to my interests and the courses I had taken. It took a few minutes (and in the Biology department at Bristol where I was based we had a bigger selection than many others I knew of in other universities). If I was starting to read research on a new field then I could get to a pretty advanced (for its time) online catalogue system and search the catalogues of both my own university and a number of others in the area and try to identify specific papers (from the title and keywords alone, and of papers only going back to around 1980) that might be worth a look, and then try to get them via an expensive and not especially fast loan.

In other words, if I wanted, say, to see what literature was out there on lion behaviour I might be lucky and find a few papers in the library and then be able to loan out a couple more that may or may not have been useful. It could take days to get them all and I might be left with half a dozen papers. It would not be great, but it might well be sufficient for the purpose intended. If I wanted more than this, then I could read what I had and start trying to track down the records of those papers cited in the ones I had that looked important. I could also trace some keywords and titles in vast volumes that listed them and find new papers from there. Even an expert with a big collection of papers would probably struggle to have an especially complete collection of papers on so specific a subject, especially rare and historical ones, or those published outside of western Europe and North America.

Now I can go straight to Google Scholar and access thousands, if not tens of thousands, of articles of interest almost instantly. Sure, you have to hunt through those you don’t want or need and download a bunch of rubbish or find critical papers are not accessible, but the time saved (all that photocopying!) and range of material is enormous. Even things you can’t access at once are easy enough to ask for. You can track down almost any researcher in minutes and e-mail them asking for their papers when you might have had to send a transatlantic letter to a researcher (you hoped was not in the field and had not moved) and wait for a response. I’d guess that from scratch I could get 100 papers on tyrannosaurs today if I tried, and a fair few more if I e-mailed around and asked, when it might have taken me weeks or months before and cost a fair bit of money in loans and paper to get 20. I can also get historical manuscripts, whole books, extra notes and commentary, translated articles of non-English journals, whole masters and doctorate theses and more. Tons of foreign language papers (and even English language ones) I never knew even existed 10 years ago are now available online and with them, thousands of new articles. All of this is, of course, good. But it’s hard to look at this wealth of information and not see a few problems with it.

First of all, for the uninitiated or inexperienced, it’s far harder to find the really good stuff. It takes time to work out which are good papers and which are not, good and bad journals and good and bad authors. If you only had access to 20 papers, it was easy enough to skim them and pick the half dozen that should form the basis of your next generation of reading and research – when you start with 200 that instantly becomes much, much harder. Similarly, it’s easy to get blinded into thinking that what you have is enough – you might be able to download and read 100 papers, but without the half dozen key ones that really stand as vital in the field, all that superficial stuff may just not be good enough. Now of course this need not be too bad – the experienced researcher knows a good paper when he sees one, even for a field he has never looked at before, and journals like Nature and TREE have always been great reads, and you should soon spot an obvious gap in your collection if people keep referring to a paper you don’t have. However, I do think this is still an issue for some whereby quantity is confused with quality or quality can be masked by quantity.

Secondly, and more importantly, this availability seems increasingly to be viewed as a necessity and not a bonus. I see authors trying to cram a reference to every damned paper on a subject into a manuscript in either some unnecessary one-upmanship contest with the rest of the world, or desperately trying to give themselves credibility for simply having read 25 papers on a subject. Perhaps there are other reasons, but I can’t figure them out. Worse, this is sometimes used as a stick by some referees and editors (and others) to beat authors with. You can cite four or five papers to support an argument and then get criticised for leaving out one 1964 Brazilian article on the subject, or accused of plagiarising an idea because you had not read / cited that 1964 Brazilian paper.

This is silly.

Yes, I well appreciate that as a researcher you have a duty to know the field in which you are writing. You must read and understand and take in a significant fraction of the most critical papers on a subject and to do otherwise is a disservice to your own work and that of others. But that does not mean you should try to track down every reference that has ever related to a field and read it and cite it. There has to be a balance. I simply cannot stop to read *every* paper on theropod morphology on the off chance that I have been pre-empted on a point I wish to make, or to note the two exceptions that occur, or document every time someone else has said it before. It could take literally months in places to read the literature on a subject that might refer to one otherwise minor point in a paper. No one would ever get any work done, or we would all be forced to specialise so enormously that each of us had just one incredibly minor area to work on and could keep on top of the new papers.

In short, we need to strike some kind of balance, or perhaps rather come to accept a new status quo. There is an absolute shedload of data out there that is now accessible and did not used to be and this is a very good thing indeed. However, it is beyond the practical means of researchers to be expected to read every single paper that has ever been published in their field, or even to read everything published each year (depending of course, on quite what your field is). Things will be missed and mistakes will be made, just as they always have, but I think the sheer volume of material now available and increased communication between researchers makes it appear much worse than it is. It used to be seen as a faux pas if you missed a 70’s Nature paper on something, now it’s seen as problematic if you miss a foreign language paper published 70 years ago and the two are not the same. Good researchers and good referees will I’m sure help make this situation easier, but in the meantime I suspect things will be testing for some.

12 Responses to “Keeping up with the literature”

  1. 1 David 02/11/2009 at 1:23 pm

    When I start to look at a new area in my field (environmental and resource economics) I look for recent survey papers and the most cited papers on the topic and read those first. I also somewhat rely on known (highly cited) authors. It will then be clear if I’m missing something important. Reading everything is impossible and an inefficient strategy. But I don’t see much downside to the electronic era.

    • 2 David Hone 02/11/2009 at 2:10 pm

      As I say David, it depends on how you treat it and I have seen the alleged under-reading of the literature (i.e. missing old and / or obscure papers) used as a stick with which to beat researchers. Similarly, while I’d agree your strategy is sound, it relies on there being an up-to-date and accurate survey paper(s) of the field in question. In my experience in vert palaeo these things are all too often very few and far between for a great many subjects. Perhaps we simply don’t produce many of them, but I’m struggling to think of many good ones – they seem to be the exception and not the rule.

  2. 3 David 02/11/2009 at 2:39 pm

    I suppose economics is a much bigger discipline where there are just as many or more applied workers in private business and government as academicians, so there are journals that more or less just publish surveys, like Journal of Economic Literature, Journal of Economic Perspectives, and Journal of Economic Surveys, and other journals that also publish some surveys/reviews.

    • 4 David Hone 02/11/2009 at 3:16 pm

      We do get reviews, but the only really good review journal I can think of for the whole of biology (that does many short punchy ones, as opposed to the occasional mammoth one) is TREE (Trends in Ecology and Evolution) which is superb but again is just one journal covering the whole of biology. I’m sure other commenters might well chip in with other journals, but I’d stand by the ‘reviews are just not that common in vert pal’ aspect of my comment. Which is a shame as they are *so* useful and many journals really don’t seem to like them.

      • 5 Jerry D. Harris 03/11/2009 at 10:00 am

        Well, there’s also Biological Reviews, but paleo-relevant stuff there is rare at best. Most reviews, when they come out at all, are in books (both the edited, collection-of-papers kind and the all-chapters-by-the-same-authors kind).

      • 6 David Hone 03/11/2009 at 1:40 pm

        Right, which aren’t always the most easily accessible of tomes (a lot of libraries won’t carry the Indiana series on dinosaurs for example, and some universities may not have a copy of the Dinosauria lying around).

  3. 7 Andy Farke 03/11/2009 at 10:43 pm

    Interesting post, and I tend to agree for the most part. There is indeed a trend to over-cite a little bit, without genuinely meaningful citations.

    To extend the discussion in a slightly different direction, I would contend that perhaps an equally large problem is the tendency by paleontologists to completely ignore the highly relevant neontological literature. Without giving away any details (it doesn’t involve any of the regulars here, regardless), I recently reviewed a paper on the functional morphology of an extinct animal. This area of functional morphology has been an active zone of research by a number of very well-known biomedically- and zoologically-leaning researchers. . .and yet the authors of the paper I reviewed ignored this work in favor of citing one or two papers on dinosaurs (and a mostly irrelevant, but oft-cited in the paleo literature, paper on extant animals). I know at least one of the authors of the manuscript in question, a recognized “expert” in the eyes of pretty much everyone, should have known better, too! It’s frustrating, and such (very common) situations don’t do much to elevate paleontology’s reputation as a relevant discipline.

    • 8 David Hone 04/11/2009 at 8:13 am

      Well that is something I have complained about before (and mention above): people have a duty to read the relevant stuff, but they also need to avoid getting bogged down in the irrelevant or minor stuff (next time you write the sentence ‘tyrannosaurs are monophyletic’ try to guess how many papers you could cite to support that contention!).

