Saturday, 25 July 2009

In Science, Popularity Means Inaccuracy

Who's more likely to start digging prematurely: one guy with a metal-detector looking for an old nail, or a field full of people with metal-detectors searching for buried treasure?

In any area of science, there will be some things which are more popular than others - maybe a certain gene, a protein, or a part of the brain. It's only natural and proper that some things get a lot of attention if they seem to be scientifically important. But Thomas Pfeiffer and Robert Hoffmann warn in a PLoS One paper that popularity can lead to inaccuracy - Large-Scale Assessment of the Effect of Popularity on the Reliability of Research.

They note two reasons for this. Firstly, popular topics tend to attract interest and money. This means that scientists have much to gain by publishing "positive results" as this allows them to get in on the action -
In highly competitive fields there might be stronger incentives to “manufacture” positive results by, for example, modifying data or statistical tests until formal statistical significance is obtained. This leads to inflated error rates for individual findings... We refer to this mechanism as “inflated error effect”.
Secondly, in fields where there is a lot of research being done, the chance that someone will, just by chance, come up with a positive finding increases -
The second effect results from multiple independent testing of the same hypotheses by competing research groups. The more often a hypothesis is tested, the more likely a positive result is obtained and published even if the hypothesis is false. ... We refer to this mechanism as “multiple testing effect”.
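
To get a feel for how fast the multiple testing effect bites, here is a back-of-the-envelope sketch (my own illustration in Python, not taken from the paper): with the conventional 5% significance threshold, the chance that at least one of k independent tests of a false hypothesis comes out "positive" is 1 - (1 - 0.05)^k.

# Illustrative only - not the authors' code or data.
# P(at least one nominally significant result among k independent tests
# of a hypothesis that is actually false), at alpha = 0.05.
alpha = 0.05
for k in (1, 5, 10, 20, 50):
    p_some_positive = 1 - (1 - alpha) ** k
    print(f"{k:3d} tests -> P(at least one 'positive') = {p_some_positive:.2f}")
# Prints roughly 0.05, 0.23, 0.40, 0.64 and 0.92.

Since null results are far less likely to be written up, a heavily tested false hypothesis can easily end up looking well supported in the literature. The inflated error effect works in much the same way, except that the repeated "tests" are analysis variants tried within a single lab rather than independent attempts by competing groups.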
But does this happen in real life? The authors say yes, based on a review of research into protein-protein interactions in yeast. (Happily, you don't need to be a yeast expert to follow the argument.)

There are two ways of trying to find out whether two proteins interact with each other inside cells. You could do a small-scale experiment specifically looking for one particular interaction: say, Protein B with Protein X. Or you could do "high-throughput" screening of lots of proteins to see which ones interact: does Protein A interact with B, C, D, E...? Does Protein B interact with A, C, D, E...? And so on.

There have been tens of thousands of small-scale experiments into yeast proteins, and more recently, a few high-throughput studies. The authors looked at the small-scale studies and found that the more popular a certain protein was, the less likely it was that reported interactions involving it would be confirmed by high-throughput experiments.

The second and third of the above graphs show the effect. Increasing popularity leads to a falling % of confirmed results. The first graph shows that interactions which were replicated by lots of small-scale experiments tended to be confirmed, which is what you'd expect.
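
For the curious, here is a hypothetical sketch of the kind of comparison involved (again my own toy illustration, with invented protein names and data - not the authors' code): tally how often each protein turns up in small-scale interaction reports, and work out what fraction of those reported interactions also appear in the high-throughput screens.

# Toy sketch, not the authors' analysis. Each pair is a reported
# (protein, partner) interaction from a small-scale experiment.
from collections import defaultdict

def confirmation_by_popularity(small_scale, high_throughput):
    """For each protein: (number of small-scale reports, fraction confirmed)."""
    hits = defaultdict(list)
    for a, b in small_scale:
        confirmed = (a, b) in high_throughput or (b, a) in high_throughput
        hits[a].append(confirmed)
        hits[b].append(confirmed)
    return {p: (len(h), sum(h) / len(h)) for p, h in hits.items()}

# Invented example data, purely to show the shape of the calculation:
small = [("YFG1", "YFG2"), ("YFG1", "YFG3"), ("YFG1", "YFG4"), ("ABC1", "ABC2")]
big = {("YFG1", "YFG2"), ("ABC1", "ABC2")}
for protein, (n, rate) in sorted(confirmation_by_popularity(small, big).items()):
    print(f"{protein}: {n} small-scale reports, {rate:.0%} confirmed by high-throughput")

The real analysis is obviously more careful about what counts as confirmation, but the basic logic - popularity on one axis, confirmation rate on the other - is the same.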

Pfeiffer and Hoffmann note that high-throughput studies have issues of their own, so using them as a yardstick to judge the truth of other results is a little problematic. However, they say that the overall trend remains valid.

This is an interesting paper which provides some welcome empirical support for the theoretical argument that popularity could lead to unreliability. Unfortunately, the problem is by no means confined to yeast. Any area of science in which researchers engage in a search for publishable "positive results" is vulnerable to the dangers of publication bias, data cherry-picking, and so forth. Even obscure topics are vulnerable, but when researchers are falling over themselves to jump on the latest scientific bandwagon, the problems multiply exponentially.

A recent example may be the "depression gene", 5HTTLPR. Since a landmark paper in 2003 linked it to clinical depression, there has been an explosion of research into this genetic variant. Literally hundreds of papers appeared - it is by far the most studied gene in psychiatric genetics. But a lot of this research came from scientists with little experience or interest in genes. It's easy and cheap to collect a DNA sample and genotype it. People started routinely looking at 5HTTLPR whenever they did any research on depression - or anything related.

But wait - a recent meta-analysis reported that the gene is not in fact linked to depression at all. If that's true (it could well be), how did so many hundreds of papers appear which did find an effect? Pfeiffer and Hoffmann's paper provides a convincing explanation.

Link - Orac also blogged this paper and put a characteristic CAM angle on it.

Pfeiffer, T., & Hoffmann, R. (2009). Large-Scale Assessment of the Effect of Popularity on the Reliability of Research. PLoS ONE, 4(6). DOI: 10.1371/journal.pone.0005996

Comments:

Kristen said...

Thank you for highlighting this article; I think, as you say, it applies to any discipline. This makes sense from the standpoint of Type-I error. We control for our multiple comparisons within studies, but if lots of people are studying something, it is likely that one team will get a result by chance. That's why replication is so important, but there isn't a lot of money in replicating what someone else has done. There are lessons here for both grant-giving and publication.

Anonymous said...

Today's "hot topic" is autism. I see many, many "positive" papers about autism nowadays, so I am obviously suspicious. I find topics such as "autistic intelligence" and "Neurodiversity" particularly amusing.

"Neurodiversity" is incredibly popular and I am afraid that it is inaccurate in its philosophy.

Anonymous said...

The URL attached to the title of the paper is wrong; it points to a nonexistent page within this blog.

Michelle Dawson said...

Responding to Anonymous, there is relatively little published research about intelligence in autism. Major issues like, say, neural efficiency have barely been mentioned. There are only two inspection time papers (from the same group). There is the same problem with learning in autism.

Hans Asperger referred to "autistic intelligence" in 1944, and Leo Kanner noted areas in which autistics performed particularly well in 1943 (then there was Scheerer et al., 1945). But in the ensuing decades there has been very little systematic investigation into what autistics do well.

This is in contrast to the vast bulk of research seeking to investigate ways in which autistics are inferior, defective, dysfunctional, etc.

Neurodiversity is part of the general idea that disabled people should have human rights. While areas that could be considered related to this, like stigma and stereotype threat, are recognized as important and are well-studied in non-autism areas, they have not been studied at all in autism.

Neuroskeptic said...

Anonymous - Oh well spotted. Fixed.

Neuroskeptic said...

Kristen - Right. In an ideal world there would be researchers whose whole job was to try to replicate important findings.

That's a pipe dream, but what might work would be if grant funders made grants for new research conditional on also doing some replication ("We'll give you $5 million to do that but only if you take this $1 million and replicate Study X as well").

another skeptic said...

Neuroskeptic, another thing grant funders could do is to make the hypotheses in grant proposals available to the public, thereby discouraging post-hoc analyses and conclusions. Ideally, the grant proposal should be required to be cited in the published paper. Scientists like to fish their data for "publishable" bits, and not surprisingly, the result is that 90% of scientific reports do not replicate.

Neuroskeptic said...

Nice, I like that.

Bradley Ansell said...

When it comes to a lot of these scientific "hot topics", greed and politics play heavily into the "results" as well.

For example: cannabis, and every "study" that's tried to tarnish its name. From hotboxing rhesus monkeys with more than 200x the psychoactive amount in the 1930s to "prove" brain damage, to using genetically-mutated, metal-contaminated, heavily-dried samples wrapped in bleached, commercial cigarette paper in carcinogen studies in the mid-2000s, scientists have been bribed and pressured to publish left, right and centre in order to keep a harmless plant out of the hands of those who need it or simply just want to enjoy it. The industries involved in keeping cannabis down are pretty obvious, since cannabis could complement or replace the following things: pharmaceuticals, psychotherapy, textiles, fuel... all very large, well-funded and influential industries.

Another example is the "obesity" "pandemic". First of all, enough with this "pandemic" word, media. Until I see Cambodians as big as Marlon Brando, I refuse to believe obesity is a worldwide issue. Then take into account how many obese people are still very active, eat well and are in good shape. Their blood pressure is normal, their cholesterol is level, they're just simply larger than some other people. Well, these pesky facts just won't do, so of course the billion-dollar fashion and weight-loss industries, with their dangerous pills, ab-flexing machines and coked-up size-OOO runway models, hire as many doctors and researchers as they can afford (which is a lot) to claim being fat is just as dangerous as the government would have you believe marijuana is, if not worse. Irrelevant, antiquated barometers like BMI are slapped onto this junk science, insurance companies happily jump aboard the fat-discrimination bandwagon to save some money, and yet another innocent group of people is subjugated and socially condemned for the almighty dollar.

Neuroskeptic said...

Politics and money lie behind some bad science, but scientists are perfectly capable of screwing things up on their own.

Taking obesity as an example, once the idea that a BMI of >25 is bad took hold, which must have been a few decades ago, research supporting that theory has had a much easier time getting published than research contradicting it. And scientists want to get published, so they will do their best to find results that "fit".

Bradley Ansell said...

The thing about BMI is that it's not an accurate barometer of health at all. It was never intended to be used for this purpose in the first place. BMI was invented in the 19th century by a Belgian mathematician as a tool for measuring social metrics. It literally had nothing to do with health, fitness, weight loss, bariatrics, whatever you want to call it. The only reason people put so much faith in it today is because it's a simple graph, a pretty visual to make laymen feel all smart, educated and scientific when they look at it.

Neuroskeptic said...

Right, but however it started, a lot of experts do use BMI. Most of them admit it has problems, but they still use it, because it's convenient, and because everyone else uses it, so in order to get published you have to use it.

It's the same with standardized rating scales for depression such as the "HAMD", which are widely accepted as being deeply flawed. I have never heard anyone defending them, but they are still used by everyone who researches depression, simply because it's the "done thing".

Laika said...

Thanks for reviewing this article. Neuroskeptic is becoming one of my favorite blogs! Keep them going!

I suppose things are worse for preclinical/basic science compared to clinical trials (although here pharma can have (too) much influence).

The suggestion of another skeptic is already realized in clinical science (protocols are submitted to clinical trial registers). This is harder to realize in basic science (ideas could be stolen). Plus, one does not want to register each single PCR (time, bureaucracy).

Furthermore, systematic reviews are quite uncommon in animal/basic science compared to clinical science. Systematic reviews serve to objectively gather the evidence. Funnel plots (in meta-analyses) may help to find skewed results. Larger trials (~high-throughput) may have a higher impact in such analyses.

auto insurance quotes said...

Another reason could be that many people jumped into a popular field because it was popular, rather than because they had a passion for the subject. For example, if computer programming is a popular subject to study at university, many people would choose that department because it seems cool. This of course results in many results-driven individuals and higher competition.
