11 March 2013

How much is that crayfish in the window?

I could not have written my latest paper without Google Alerts.

I can’t remember when I set up Google Alerts for “Marmorkrebs” and “marbled crayfish,” but it goes back to at least 2008. Given that I set up the Marmorkrebs home page in late 2007, that was pretty clever of me, though I say it myself.

In 2009, I set up a survey on the Marmorkrebs home page that ran throughout the year (I blogged about this process here). I was very familiar with the problems of surveys. I learned a lot about them during my undergraduate degree, where I majored in psychology. People have to be willing to come to you, and those people, being volunteers, are not necessarily representative of the general population.

As with so many ideas, it’s difficult to pinpoint or remember the origin of this one. But late in 2009, a paper published in Nature showed that you could predict flu outbreaks fairly well by tracking things like how often “flu symptoms” was typed into Google. (This led to this online resource.) I think that led me to the idea of using the information delivered through the alerts to keep track of the locations where people were keeping Marmorkrebs as pets.

This project changed a lot as it went along. Initially, almost all I was concerned about was the locations where people were keeping Marmorkrebs. I had the date, the city, the American state or Canadian province.

As the project went on, and I kept collecting data, I kept expanding what I was recording. What were people calling these crayfish? (This may have been prompted by seeing made-up species names.) What were people doing with them? Were the same people cropping up over and over again? And how much was a crayfish worth on the open market?

The problem was, when I went back... there was a lot of link rot. Craigslist entries came down fast, and I often couldn’t find them. It was only late in the day that I started the practice of turning every web page I could into a PDF so I could archive the entry. In retrospect, I could have done all this so much better.

Creating this paper was a little bit like making repairs on a plane while it was still in flight.

That makes the other thing I’m trying with this paper a little nerve-wracking. I decided to put the entire spreadsheet (slightly cleaned) I used in this paper up on figshare. Now, for reasons I have just described, this is not the cleanest data set out there. All the more reason to be transparent about how the set was put together, and make it available to check. I’m also happy to share PDFs of particular entries in the spreadsheet, as I expect link rot will continue to take its toll on the entries in the list.

I did some more analysis before presenting this work at the Society for Integrative and Comparative Biology meeting in January, and I found a few more interesting things that came too late for the paper, and are probably not substantive enough for the next pet paper I’m working on. To continue the metaphor above, I am still not sure the plane has landed, and I’m still trying to fix it.


