12 January 2011

Names versus numbers: The great referencing battle

Academic writing is set apart from almost all other writing by its obsession with citations.

Mike Taylor hates numbered references. He notes that numbered referencing saves space in print journals, where space is at a premium. For journals on the web, there are no space constraints, and thus no reason to use numbered references.

I want to dig down into reference styles a little. I think I can show why professional scientists are more likely to prefer references using author and year (also known as Harvard referencing) than anybody else.

Several people say that the (Author, Year) reference format allows them to evaluate some of the paper’s claims on the fly. “Wait a minute, why are they citing Jones and Peabody 1988? They didn’t show that!”

But this only works if you know the literature extremely well. I’m skeptical of how often people are able to do this. Personally, I constantly have to look up which paper some particular claim was in. Heck, I can barely tell you the years some of my own papers were published. “Hm. Did that one come out in 2005 or 2006?”

But never mind my own shortcomings. The point is:

The more deeply a scientist specializes, the more he or she can recognize papers by the author and year alone.

Another reason professional scientists like the (Author, Year) format is the personal branding. If someone cites your paper multiple times, it builds name recognition – particularly if you have a slightly distinctive name in your field. And that can help facilitate networking at conferences and general career development.

Writers of scientific journal articles also have reasons to like the (Author, Year) style, because it’s easier to prepare a reference list. With numbered references, if you insert one reference in the middle of a manuscript, all the references after it have to change. That's why I use name and year references in blog posts; they're quick and easy to do by hand.
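To make the renumbering problem concrete, here is a minimal Python sketch. The function names and the sample citation keys are hypothetical, chosen only for illustration, and it assumes references are numbered in order of first appearance, which is one common numbered style but not the only one.

```python
def number_citations(keys):
    """Assign 1-based numbers to citation keys in order of first appearance."""
    numbers = {}
    for key in keys:
        if key not in numbers:
            numbers[key] = len(numbers) + 1
    return numbers


def changed_labels(before, after):
    """Return the keys whose number differs between two drafts."""
    return sorted(k for k in before if before[k] != after[k])


# Hypothetical manuscript: citation keys in the order they are cited.
draft = ["Jones 1988", "Gregory 1992", "Smith 2003", "Oakley 2003"]
# Insert one new (made-up) citation near the start of the manuscript.
revised = draft[:1] + ["Peabody 1990"] + draft[1:]

old = number_citations(draft)
new = number_citations(revised)

# Every numbered reference after the insertion point shifts by one...
print(changed_labels(old, new))
# ...while author-year labels are the keys themselves, so none of them change.
```

By hand, each of those shifted numbers has to be found and corrected everywhere it appears in the text.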

I think this also explains why having the full references at the end of the paper is more common than footnotes. I shudder to think of doing footnotes on a typewriter.

Nobody doubts that numbered references are shorter than the Harvard style. But the advantage goes beyond saving paper.

Gregory (1992) gives an example where replacing “author and year” references with numbered references reduced 149 words to a mere 22. More than that, he argued that this tremendously improved the readability:

The original passage is unspeakable and unreadable, but neither the author nor the editor is interested in whether anyone reads the paper. Indeed, they prefer nobody reads beyond the summary, or better still, beyond the authors’ names.

I don’t think it’s an accident that most books describing research for a general audience treat references with a light touch. Typically, you’ll find only a list of key references at the end, with very few (Author, Year) intrusions into the text.

The more rigorously documented and referenced a work is, the more difficult it is to read.

Numbered references are fairer to scientific authors than the (Author, Year) format. The tradition is that if you have three or more authors on a paper, the reference in the text is given by the name of the first author only.

If you’re author #2 on a three-author paper, you’re screwed. Why should one person on a large team get such a disproportionate amount of personal advertising by having their name printed throughout the main body of a paper, while the name of author 2 of 3 is only printed once in the literature cited? It’s no coincidence that order of authorship is a common cause of fights and discontent between researchers. More and more papers are being written by larger and larger groups of authors, meaning this problem is only going to get worse.
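The first-author rule is simple enough to sketch in a few lines of Python. This is an illustration under assumptions, not any journal's actual style rules: the "et al." wording, the two-author "and" form, and the function name are my own choices, and the author names are made up.

```python
def in_text_citation(authors, year):
    """Format an (Author, Year) citation.

    Assumed convention: three or more authors collapse to the
    first author plus "et al."; two authors are joined with "and".
    """
    if len(authors) >= 3:
        names = f"{authors[0]} et al."
    elif len(authors) == 2:
        names = f"{authors[0]} and {authors[1]}"
    else:
        names = authors[0]
    return f"({names}, {year})"


# Hypothetical three-author paper: authors 2 and 3 vanish from the text.
print(in_text_citation(["Jones", "Peabody", "Smith"], 1988))
# With only two authors, both names survive.
print(in_text_citation(["Jones", "Peabody"], 1988))
```

Under this rule, going from two authors to three is exactly the point where the second author's name drops out of the body of the paper.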

As for the ease of preparing a manuscript, this is why specialty reference manager software exists. We are not living in the age of typewriters any more.

For online journals, technology should be able to provide a “best of both worlds” solution. It should be possible to have numbered references where you could pull up a complete citation in a little pop-up window by hovering the cursor over the number, similar to clicking the magnifying glass icon next to search results on Google.
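One low-tech way to approximate that pop-up is the HTML title attribute, which most browsers show as a tooltip on hover. The Python sketch below generates such a link. The function name and the "#ref-N" anchor scheme are hypothetical, and a real journal site would likely use something richer than a plain tooltip.

```python
from html import escape


def numbered_citation(number, full_reference):
    """Return an HTML link whose title attribute holds the full reference.

    Browsers typically display the title text as a tooltip on hover.
    The "#ref-N" target is an assumed anchor in the reference list.
    """
    tooltip = escape(full_reference, quote=True)
    return f'<a href="#ref-{number}" title="{tooltip}">[{number}]</a>'


print(numbered_citation(
    1,
    "Gregory M. 1992. The infectiousness of pompous prose. "
    "Nature 360 (6399): 11-12.",
))
```

The reader sees only "[1]" in the running text, but the complete citation is one hover away, and clicking still jumps to the reference list.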

So the style of referencing actually cuts to a very deep question: Who is a scientific journal article for? One format favours the professionals: the people who are deeply embedded in the literature, and who actively contribute to it. The other favours the readers: those just entering the field and the curious bystanders.

“Open science” is not just about who pays for research and reprints and journal subscriptions; it’s about access in the broadest sense. And something as simple as the reference could raise or lower the barrier to someone reading a research paper, even if just by a smidgen.

I’d much rather the barrier be a smidgen lower than a smidgen higher.

P.S.—I wondered if anyone had ever done any research on the pros and cons of the different reference styles. The only paper I was able to find bore the title, “How the name date (Harvard) reference style in papers shows an underlying interpretivist paradigm whilst numeric references show a functional paradigm.”

That I wasn’t able to find a copy was a relief. Paradigms. I would have been way out of my depth.


Gregory M. 1992. The infectiousness of pompous prose. Nature 360 (6399): 11-12. DOI: 10.1038/360011a0

Oakley P. 2003. How the name date (Harvard) reference style in papers shows an underlying interpretivist paradigm whilst numeric references show a functional paradigm. Systemist 25: 25-30.

Additional: An independent analysis at The Open Source Paleontologist arrives at some similar conclusions.


Anonymous said...

I <3 the Harvard style. ESPECIALLY as a young grad student when I didn't know the literature well and was mining a lot of papers for references, it was much easier to get an idea of a field when you can see who "said" what (of course then you have to go back and check yourself to see if it's true), rather than constantly flipping back and forth to the references. And of course now that I know the field better, I like the Harvard style because reading along and seeing the references I expect makes me feel like the author and I are on the same "page" with the state of the field.

Andy said...

You bring up some interesting points about why we cite author names in the first place. I have decidedly mixed feelings on the issue - I wholeheartedly agree that numbered citations improve ease of reading for most readers (this is an area crying out for a carefully controlled experiment!), but am not sure that "lowering barriers" (presumably to non-specialists in a field, or the lay public) is the most universal argument. It almost reads like "write for the lowest common denominator," which I'm sure is not the message intended here. I've long ago accepted that although I always try to write papers of broad appeal and accessibility, quite frankly some of my work may be read only by a handful of specialists (at best).

(at risk of blog-whoring, I had a post of my own on this topic a few days back)

Genomic Repairman said...

I like numbers; it's simple and easy. Having multiple papers by the same first author becomes a pain because you have to remember what year it was published on the citation (Am I looking for Jones 03 or 04?)

Anonymous said...

I love (author, date) and personally find it much more readable (but I'm a soc scientist and footnote style is really rare in the journals I usually read). Thanks for pointing out the importance of experience and knowledge of the literature in that preference though. In saying loudly and often how much better I think (author, date) is, I wasn't really considering readability for anyone else. Thanks for the reminder!

Nico said...

Hi there, interesting post, I thought I'd share my comments.
I work in Sci publishing, so I deal with large amounts of citations every day. Both styles have their uses, Harvard is great if you have a short ref list and helps readers to remember "who said what" as you put it. On the other hand, with reviews or theses it can get clumsy, with lists of names/dates and unhelpful citations like "Smith, 2003d". For online stuff there is the trick we use at Nature: hover your cursor over the reference number. Tadaaaa!

AK said...

Personally, I don't really care in papers I'm reading: as an outsider in the field (whichever it is), I'm not going to worry about who said what, I'm going to actually go to the paper referred (or the abstract if it's behind a paywall) to see what it actually says. In preparing posts I've often gone 2-3 ref's deep tracking the actual research behind a claim.

As for preparing my own posts, I prefer numbered footnotes because they're easily recognized and distinguished from links to Wiki, other blog posts, and non-peer-reviewed web sites.

Note that a reader can easily click on an internal link to see the footnote (this is true of almost every on-line journal I've read in), then click the "back" key to return to the previous location in the article. Highlighting a few words of the text can mark your place exactly (although you have to be careful not to highlight something in the footnote which will erase the highlighting in the text), and if your browser is properly configured you shouldn't have to wait on load time in either direction.

I'm of two minds about the hover function, especially if it brings up a block of text with its own links. I find it often interferes with using the cursor to highlight text to mark my place, or copy/paste into a search box in another browser window. I normally use it only to identify links to Wiki, but it wouldn't be much trouble to add it to footnotes, if it made it easier for readers.