16 August 2018

How to present statistics: a gap in journals’ instructions to authors

I recently reviewed a couple of papers, and was reminded of how bad the presentation of statistics in many papers is. This is true even of veterans with lengthy publication records who you might think would know better. Here are a few tips on presenting statistics that I’ve gleaned over the years.

Averages


One paper wrote “(7.8 ± 0.8).” I guess this was supposed to be a mean and standard deviation. But I had to guess, because the paper didn’t say. And people often report other measures of dispersion around an average (standard errors, coefficients of variation) the exact same way.

Curran-Everett and Benos (2004) write, “The ± symbol is superfluous: a standard deviation is a single positive number.” When I tweeted this yesterday, a few people wrote that this was silly, because Curran-Everett and Benos’s recommended format is a few characters longer, and they worried about it being repetitive and hard to read. This reminds me of fiction writers who try to avoid repeating “He said,” usually with unreadable results. My concern is not about the ± symbol so much as the numbers having no description at all.

Regardless, my usual strategy is similar to Curran-Everett and Benos’s. I usually write something like, “(mean = 34.5, SD = 33.0, n = 49).” Yes, it’s longer, but it’s explicit.

That strategy isn’t the only way, of course. I have no problem with a line saying, “averages are reported as (mean ± S.D.) throughout.” That’s explicit, too.
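For what it’s worth, here is a minimal sketch of how that explicit format can be generated straight from raw data, so the description never gets separated from the numbers. The values are made up, and the snippet uses Python’s built-in statistics module; it’s an illustration, not any particular paper’s analysis.

```python
import statistics

# Made-up measurements, purely for illustration.
values = [12.1, 45.3, 88.0, 7.9, 19.4, 34.2]

summary = "(mean = {:.1f}, SD = {:.1f}, n = {})".format(
    statistics.mean(values),   # arithmetic mean
    statistics.stdev(values),  # sample standard deviation
    len(values),               # sample size
)
print(summary)  # prints: (mean = 34.5, SD = 29.7, n = 6)
```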

Common statistical tests


Another manuscript repeatedly just said, “p < 0.05.” It didn’t tell me what test was used, nor give any other information that could be used to check that the result was correct.

For reporting statistical tests, I write something like, “(Kruskal-Wallis = 70.76, df = 2, p < 0.01).” That makes it explicit:

  • What test I am using.
  • The exact test statistic (i.e., the result of the statistical calculations).
  • The degrees of freedom, sample size, or any other value that is relevant to checking and interpreting the test’s calculated value.
  • The exact p value, and never “greater than” or “less than” 0.05. Again, anyone who wants to confirm that the calculations have been done correctly needs an exact p value. People’s interpretations of the data change depending on the reported p value. People don’t interpret a p value of 0.06 the same way as one of 0.87, even though both are “greater than 0.05.” Yes, I know that people probably should not put much stake in that exact value, and that p values are less reproducible than people expect, but there it is.

My understanding is that these values particularly matter for people doing meta-analyses.
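As an aside, here is a minimal sketch of how all of those numbers can be pulled out of an analysis in one go. The data are invented and it uses Python’s scipy.stats; note that scipy returns the test statistic and the p value but not the degrees of freedom, so the sketch derives those from the number of groups.

```python
from scipy import stats

# Three invented groups of measurements, purely for illustration.
group_a = [2.9, 3.0, 2.5, 2.6, 3.2]
group_b = [3.8, 2.7, 4.0, 2.4]
group_c = [2.8, 3.4, 3.7, 2.2, 2.0]

groups = [group_a, group_b, group_c]
statistic, p_value = stats.kruskal(*groups)  # Kruskal-Wallis H test
df = len(groups) - 1  # degrees of freedom: number of groups minus one

# Report the test, the exact statistic, the degrees of freedom, and the exact p value.
print(f"(Kruskal-Wallis H = {statistic:.2f}, df = {df}, p = {p_value:.3f})")
```

The library doesn’t matter; the point is that every value a reader would need to check the test gets reported explicitly.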

Journals aren’t helping (much)


I wondered why I keep seeing stats presented in ways that are either uninterpretable or unverifiable. I checked the instructions to authors of PeerJ, PLOS ONE, The Journal of Neuroscience, and The Journal of Experimental Biology. As far as I could find, only The Journal of Neuroscience provided guidance on its standards for reporting statistics. The Journal of Experimental Biology’s checklist says, “For small sample sizes (n<5), descriptive statistics are not appropriate, and instead individual data points should be plotted.”

(Additional: PeerJ does have some guidelines. They are under “Policies and procedures” rather than “Instructions for authors,” so I missed them in my quick search.)

This stands in stark contrast to the notoriously detailed instructions practically every journal has for reference formatting. This is even true for PeerJ, which has a famously relaxed attitude towards reference formatting.

In the long haul, proper reporting of statistical tests probably matters more to a paper’s value in the scientific record than the exact reference format.

Judging from how often I see minimal to inadequate presentation of statistics in manuscripts that I’m asked to review, authors need help. Sure, most authors should “know better,” but journals should provide reminders even for authors who should know this stuff.

How about it, editors?

Additional: One editor took me up on this challenge. Alejandro Montenegro gets it. Hooray!

More additional: And Joerg Heber gets it, too. Double hooray!

References

Curran-Everett D, Benos DJ. 2004. Guidelines for reporting statistics in journals published by the American Physiological Society. American Journal of Physiology - Gastrointestinal and Liver Physiology 287(2): G307-G309. http://ajpgi.physiology.org/content/287/2/G307.short

06 August 2018

Maybe we can't fix “fake news” with facts

There have been a few recent moves in the war on “fake news.” For instance, several platforms stopped hosting a certain conspiracy-laden podcast today. (You can still get all that conspiratorial stuff; the original website is untouched.) But the discussion about “fake news” seems to be focusing on one thing: its content.

Lately, I’ve been thinking about this diagram I made about communication, based on Daniel Kahneman’s work. Kahneman argues you need three things for successful communication:

  1. Evidence
  2. Likeability
  3. Trust

I feel like most of the talk about “fake news” is very focused on “evidence.” This article, for instance, describes some very interesting research about how people view news articles. It’s concerned with how people are very prone to valuing opposing sources, but are very poor at evaluating the credibility of those sources.

All good as far as it goes. But, as I mentioned before, it feels a lot like science communicators who, for years and years, tried to beat creationists, flat Earthers, anti-vaccine folks, and climate change deniers by bringing forward evidence. They were using the deficit model: “People must think this because they don't know the facts. We must get the facts to them.”

That didn’t work.

I’m kind of seeing the same trend in fighting fake news. “Remove the falsehoods, and all will fix itself.”

But where I see the truly big gap between where we were and where we are isn’t about facts. It’s about trust.

When you bring evidence to a fight that isn’t about facts, you will lose. Every time. Facts mean nothing without trust. “Check the facts” means nothing when you are convinced everyone is lying to you. This is why conspiratorial thinking is so powerful and dangerous: it destroys trust.

You see the results in how someone who buys into one conspiracy theory often buys into several others. If you believe Obama wasn’t born in the US because conspiracy (say), it’s not that big a leap to thinking the moon landings were fake and the Earth is a flat square.

I have some hypotheses about how America in particular got to this point. I suspect the erosion of trust was slow, gradual, and – importantly – started long before social media. Maybe more like, I don’t know, let’s say 1996.

I don’t know how to reverse a long-term trend of distrust and paranoia. I’m not saying, “We need to understand and sympathize with fascists,” either. But you can’t cure a disease when you misdiagnose it. I just don’t see focusing on the factual content of social media getting very far.

Related posts

The Zen of Presentations, Part 59: The Venn of Presentations
Post fact politics catches up to science communication

External links

Fake news exploits our obliviousness to proper sourcing
Looking for life on a flat Earth
How can I convince someone the Earth is round?
Why do people believe conspiracy theories?