21 September 2012

Academic astrology

Last week, I published a little post about my predicted h-index scores over the next 10 years. There has been enough discussion since then that I think a follow-up is warranted.

First, I put no more faith in this predictor than I would in astrology. The h-index predictor ought to come with the sort of disclaimer that skeptics asked newspapers to add to horoscopes: “purely for entertainment purposes.”

Justin Kiggins made the best critique of the h-index predictor:

according to the H-Index predictor, my cat will have an H-Index of 9 in 2022


So in ten years, Justin’s cat will have a higher h-index than I do after 20 years of publishing science. In fairness, though, this is Justin’s cat:

[Photo of Justin’s cat, rendered as a lolcat]
More seriously, I was a little surprised by some of the discussions I saw around the Nature paper (Acuna et al. 2012). For instance, this news article botches the basic idea of the paper.

But the new formula is more than twice as accurate as the h index for predicting future success for researchers in the life sciences.

It’s not “more accurate” than the h-index; it is the h-index. The new formula is just a way to project h-index out into the future, whereas the h-index itself is a calculation based on what you’ve already done.

There are a few other things that are worth pointing out about the h-index.

First, the whole point of h-index is that it is supposed to give some indication of scientific quality or reputation, and not reward people for publishing rubbish papers. Someone who publishes a lot of papers that nobody cares about or cites should have a low h-index. The theory is sound, but a recent paper found that the number of papers you’ve written explains 87% of the variation in h-index (Gaster and Gaster 2012).
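For what it’s worth, the calculation behind the number is simple: sort your papers by citation count, and your h-index is the largest h for which you have h papers with at least h citations each. A minimal sketch in Python (with invented citation counts) shows why a stack of uncited papers does nothing for the number:

    # Minimal sketch of an h-index calculation. Citation counts below are
    # invented purely for illustration.
    def h_index(citations):
        ranked = sorted(citations, reverse=True)
        h = 0
        for rank, cites in enumerate(ranked, start=1):
            if cites >= rank:
                h = rank
            else:
                break
        return h

    well_cited = [25, 18, 12, 9, 7, 6, 3, 1]
    padded = well_cited + [0] * 20   # tack on 20 papers nobody cites

    print(h_index(well_cited))  # 6
    print(h_index(padded))      # still 6: uncited papers don't move the number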

In a way, this is reassuring: it suggests that most scientists are competent, and most papers are worth citing. But it also suggests that paper quality is not very important to this measure.

Another issue is reproducibility. For instance, my h-index according to Google Scholar is 8. According to Web of Science, it’s 6. There is no way that h-indices are going to be calculated by hand by human beings; they must be done by computers running algorithms. And if different datasets give different results, it’s limited in its usefulness.
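To make that concrete, here is a hypothetical version of the same calculation (the citation counts are invented; only the resulting h-indices match my real ones): the same ten papers, indexed differently by two databases, give two different h-indices.

    # Hypothetical citation counts for the same ten papers, as two databases
    # that index different sources might report them (numbers invented).
    def h_index(citations):
        ranked = sorted(citations, reverse=True)
        return max([0] + [rank for rank, c in enumerate(ranked, start=1) if c >= rank])

    google_scholar = [30, 22, 15, 12, 10, 9, 8, 8, 4, 2]
    web_of_science = [21, 15, 11, 8, 6, 6, 5, 3, 2, 1]

    print(h_index(google_scholar))  # 8
    print(h_index(web_of_science))  # 6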

Measures like the h-index are entertaining, and as I’ve said before, they are expedient. But the moment you hear any administrator seriously suggesting using them in isolation to make important decisions like tenure and promotion, run. Or fight. But don’t just accept it.

References

Acuna DE, Allesina S, Kording KP. 2012. Future impact: Predicting scientific success. Nature 489(7415): 202. DOI: 10.1038/489201a

Gaster N, Gaster M. 2012. A critical assessment of the h-index. BioEssays 34(10): 832. DOI: 10.1002/bies.201200036

Related posts

Expedience
Gazing into the crystal ball of h-index

External links

The Genius Index: One Scientist's Crusade to Rewrite Reputation Rules

7 comments:

klab said...

"First, I put no more faith in this predictor than I would in astrology. The h-index predictor ought to come with the sort of disclaimer that skeptics asked newspapers to add to horoscopes: “purely for entertainment purposes.”

I am not sure I understand your statement. Prediction is a statistical technique with well-quantifiable precision; it has nothing to do with faith.

In practice the h-index is used for prediction by many entities, implicitly assuming that "the h-index predicts future scientific impact". In fact, Hirsch argued and showed statistically that in some domains the h-index is predictive (http://arxiv.org/abs/0708.0646). The compound score we introduced is more precise and arguably more fair. After all, it gives people higher prediction scores for publishing in competitive journals and for switching fields to some degree, and it gives older people less of an advantage.

Are you simply making the point that metrics never capture all aspects of a scientist's work? That we should not rely exclusively on metrics when deciding hiring or funding? That metrics can be gamed? I guess we all agree on these points.

Zen Faulkes said...

klab: Yes, I understand that predictions can be made statistically, with some degree of confidence.

But its limits are made clear by the "Justin’s cat" example. If a prediction reaches an obviously illogical conclusion, I have to question its value in helping people make serious decisions.

There's also the question of whether a prediction is interesting. Does it generate new questions and insights? If 87% of the h-index variation is captured by the number of publications alone, the conclusion - "publish a lot of papers" - is not exactly a bolt from the blue.

Thanks for the comment!

klab said...

I love the cat example and your lolcat is a great addition! And it so nicely pointed out that the scope of the calculator (starting h>5, 5-12 years since first paper) needs to be stated in the calculator.

I think predictions can be interesting if they are non-obvious. In our simple 5 factor predictor, I thought that the # of distinct journals was interesting - I have never seen a search committee mention this as a major positive factor, and yet it somehow makes sense that people who invest in breadth ultimately reap rewards (as measured by a higher h-index later in life). There must be some features that good scientists share! Which factors do you think predict actually relevant research?

But beyond interesting, predictions can be useful if they are good enough, which is actually an empirical question. Do universities that use metrics (along with committees) for evaluations show signs of more important science over the long term? Would you presume that it's always a mistake? What if you are a dean and you do not trust your committee? Or if you cannot afford one?

But I want to end with a positive example of (technically not overly interesting) predictions. Take Google as an example. It predicts which pages I am actually looking for. Is it perfect? No, it's not. But I use it. Because it beats reading the entire internet.

neuromusic said...

My issues have much less to do with the paper and the predictive model that the authors (klab et al?) propose and everything to do with the silly widget.

The predictive model they employ only works under certain constraints... they only analyzed authors who had their first paper at least 5 years ago, authors with an h-index of >4, and authors who were already listed in Neurotree. And under these constraints, it does pretty well. And I can respect that & I can respect the goal of developing a compound prediction. And I think that the insights gained from the way the model weights different inputs are really interesting.

But then to promote their work they also publish a little widget titled "Predicting scientific success: your future h-index" which will let ANY SCIENTIST OR CAT (not just neuroscientists) enter ANY VALUES THEY WANT (unconstrained by their model's assumptions) and it will spit out an estimate of their future h-index WITHOUT ANY ERROR BARS.

This isn't "statistics"... it's a horoscope.

Above, klab defends the model saying "Prediction is a statistical technique, with well quantifiable precision" and yet they do not offer this quantification in their widget. So, yes, though the widget's predictions might be statistically accurate under certain constraints, it is a horoscope because we are given no indication of the confidence or accuracy of the individual predictions.

klab said...

@neuromusic. Trying to replicate the issue.
The h-index calculator webpage says:
" Note: The equations and the calculator model people that are in Neurotree, have an h-index 5 or more, and are between 5 to 12 years after publishing first article."
but you mentioned that when you call it, it says nothing.

Are people sharing a direct link to the java code that cuts away the surrounding text? We can try and prevent that if this is happening. Would adding a description of the size of the error bar help? Like the rest of the description, it is reported in the paper.

neuromusic said...

@klab - no, I understand that you've now added the "note", which is commendable. But it should not be that hard to enforce the constraint in the code itself (e.g., don't let people set parameters that are outside the scope of the predictor... throw up a warning if people try to go outside the 5-12 range and clear the graph).

neuromusic said...

making sure I check the box to email follow up comments...