Teddy's Rat Lab: COMMENT: The Pros and Cons of Scientific Peer Review

Wednesday, May 9, 2012

COMMENT: The Pros and Cons of Scientific Peer Review

http://teddysratlab.blogspot.com [Full link to blog for email clients.]

First, my disclaimer: This is my *opinion*. It is an opinion shaped by 30 years in science, both participating in, and at the mercy of, peer review. Since 1998, I have reviewed research grant applications for NIH, NSF and several private funding foundations. I have *officially* reviewed manuscripts for various scientific journals since I became an Assistant Professor in 1995 (I have assisted others with reviews since the 80's). My first publication subject to peer review was in 1985; the first in which I was the primary author ("First Authorship") was in 1989. My first peer-reviewed research grant was awarded in 1995; I've been a co-investigator on peer-reviewed research grants since 1989.

While this is what my field would consider *extensive* experience with peer-review, it is also fairly limited in that it is only within my field, and only with respect to research papers and grants. The other take-home message is that my field requires you to be the *victim* (excuse me – the *recipient*) of peer-review before becoming a reviewer.

What is driving the doubts about the effectiveness of peer-review?

1) Recent evidence shows that the second generation anti-psychotics and antidepressants such as Abilify are not as effective as they were shown to be in initial research and clinical trials...

http://www.newyorker.com/reporting/2010/12/13/101213fa_fact_lehrer?currentPage=all

2) A 2005 article in a Public Library of Science journal claiming that most published research findings are false due to statistical inadequacies...

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182327/

3) The article that was sent to me a while ago states that pre-publication peer review does not provide any guarantee of lasting importance of scientific results...

http://breast-cancer-research.com/content/12/S4/S13

4) There is a field of opinion that pre-publication peer review serves to limit publications to a level that meets the print capacity of the available scientific journals. The advent of on-line internet publication eases the space restrictions, so why not publish everything and let the broader scientific community sort it out?

http://scholarlykitchen.sspnet.org/2008/07/07/bulk-publishing-keeps-plos-afloat/

http://quod.lib.umich.edu/cgi/t/text/text-idx?c=jep;view=text;rgn=main;idno=3336451.0011.203

5) The highly public "ClimateGate" scandal has exposed abuse of pre-publication peer-review to publish articles by the AGW crowd and block publication by the AGW skeptics.

http://wattsupwiththat.com/2009/12/01/lord-moncktons-summary-of-climategate-and-its-issues/

6) A (very) recent article in the New York Times demonstrates the value of "crowd-sourcing" review of science.

http://www.nytimes.com/2012/05/06/opinion/sunday/science-and-truth-were-all-in-it-together.html?_r=1&ref=opinion

--------

I have a number of responses to the above. I don't believe that peer review is "broken" per se, but I do agree that the scientific community as a whole needs to police it better. I will take on several of the issues above, but not necessarily in that order. I'll refer to the topic numbers as needed.

#5 is the most egregious violation of peer review. Sadly, it is not unusual for a "good-old-boy" network to operate in science. Would *I* tend to look more favorably at submissions by people I know and generally agree with? Probably. However, I also try to ensure that someone I know professionally does not get a "pass" on sloppy science; I can actually be *harder* on my colleagues than on strangers. That said, I *like* new ideas, but I detest articles that are little more than "I got this great idea and I think it works like this even if my results don't support it." I was asked to review an article in which the author decided that he could apply the principles of scalable physics models to biology, and that all biology should be able to be treated the same, just on different scales. The manuscript, however, proved that the author really had only the vaguest grasp of either biology or physics, and had never considered the fact that in physics there is such a thing as a singularity, beyond which the models *don't* scale. Thus, with biology we get emergent properties and behaviors which are not explainable by a scaled model (which is also where I differ with a certain Baen author and his quantum theories of the mind). That article was eventually published in another journal, likely because the reviewers were selected to address the modeling aspects and were not as well versed in areas outside their discipline. Because I am a nerdy geek who is fascinated by physics, I was able to see fundamental flaws in the exemplars that would not have been obvious to someone less well versed in physics.

On the other hand, I have been the victim of the "not invented here" syndrome. I am senior author on an article that has been rejected by three different journals on the basis of "not appropriate (or too complicated) for the readership of this journal." (see link under #4) The problem is that what I have is a really neat effect that is relevant to brain-to-machine interfacing. When read and reviewed by engineers and neuroscientists in the neuroprosthetics subfield, it is very well received. When reviewed by mainstream neuroscientists, it is subject to highly critical reviews. The weak point is the "cause" of the cause-and-effect function. My colleagues in the mainstream would prefer that I run a number of controls to prove that my effects are not coming from a number of direct or alternate sources. Part of my response is "so what if it is?" In some cases it *doesn't* matter if the effect is due to one neuron or two, or occurs via three relay connections or none at all. What is important to publish is that an effect *exists* (#3) and then move on to more complete characterization later. Of course, the desire to get into print first and lay claim to a result drives this need, and it is very frustrating to know that a neuroengineering journal would publish in a heartbeat, but I'd *still* not get appropriate recognition from the neuroscience community.

This is where #2-4 interact as listed above. PLoS One is an online publication that does not make value judgments on the "appropriateness" of an article, but *will* subject it to review by 2-5 peers. Typical review is 2 outside reviewers, plus the editor. Reviewers are *supposed* to know the field. As a submitting author, I can suggest people who *should* know the field (and the editor should know who are my cronies!), as well as provide names that would have a conflict of interest with my work (either they hate my guts or my work disproves their work). The editor is under no obligation to use my suggestions, but hopefully will use *one* of my suggestions and I'll get one good review. If the two reviewers are diametrically opposed in opinion, the editor will call in a third. I have seen a case where 5 reviewers were used (aside from PLoS One) and the editor basically used the reviews as a vote. The biggest problem in publication and grant proposal review is that it can be difficult to find reviewers that fully understand *all* of the material in the article or grant proposal. Multiple reviewers are better (#2 & #3) – NIH uses 3 main reviewers plus a committee of 12-15 additional discussants for grant proposal reviews.

The philosophy of PLoS One is to let the scientific community sort it all out post-publication (#3, #4 & #6). With unlimited space, there can be publication of *every* article that passes basic peer-review; however, the scientific community will decide for itself what is worth keeping. The problems with this are two-fold: the supply of reviewers is limited, and once released, there is no good way to "retract" the worthless publications. The first point is exemplified by NIH. Beginning in 1992, the NIH budget was increased each year with the goal of doubling the available research funding within 10 years. In that same time, the number of new grant applications increased ten-fold. By 2002, NIH was funding more than twice as many *new* grants per year, and overall, funding 3x as many grants as in 1992. However, starting in 2003-2004, there was no longer any *additional* funding, and the percentage of new and renewed grant applications funded each year dropped from 20% to 7%. With further belt tightening, altered focus of the grant programs, and a mandate to continue to fund *new* grants (or more importantly, new investigators) at a minimum 5-7%, the funding percentage dropped to 3%, largely at the cost of grant renewals (typical grants must be renewed every 5 years) which were only 20% likely to be renewed - even though a renewal would not cost the NIH any more than was already budgeted!

Meanwhile, researchers have continued writing *more* grant proposals in hopes that they would have a better chance of being funded. [Note, since the mid 90's, Medicare/Medicaid reimbursement to Medical Schools and Teaching Hospitals has been locked to the same rate as *all* medical care. Previously Med Schools could charge more on the basis of the higher levels of care, and that extra helped subsidize teaching and research. Since 1990, tuition has increased, and researchers must bring in >80% of their salary on grants to keep their jobs.] In 2004, when I rotated off of a "Study Section", a standing review committee for NIH, I typically reviewed 6-8 grants every 3-4 months. Each grant was 25 pages long with up to 10 appendices (articles and supplemental data) to help explain the research. I considered going back on a Study Section last year, but the average review load was 10-12 grants every 3-4 months. NIH shortened the grant proposal length to 12 pages last year, and virtually all appendices are now merely online links to published articles. The goal was to reduce the *volume* of the grants so that they could increase the *number* of grants each reviewer was assigned (up to 15) – yet the reviewer is expected to do more reviews with less description on which to base judgment, while making the reviewer look up *more* material on their own!

So, publishing *more* while maintaining the peer-review process is not necessarily a winning game. Now, what if we were to publish more while *reducing* the peer-review burden? Nice idea, but who *really* decides what results are worthwhile? What if all scientific publishing were done on the internet, and anyone wanting to find a particular result just had to search for it? Which "one" result should our hypothetical researcher choose? The most recent? The one with the most links? Consider the Wikipedia model: Who judges worth? Is Science a popularity contest? Can *anybody* - with or without formal training - edit WikiScience? And who is to say that the person giving my result a "thumbs up" has any idea that my research is valid other than thinking I must be alright just because I write cute stories about LabRats?

By far, my strongest criticism of #4 is that if there are *no* gatekeepers, then there is no way to weed out the junk science. ClimateGate (#5) serves as a cautionary tale to #2 and #4 as well. If *all* scientific findings are available online, then *somebody* - whether politician, publisher or progeny - will pick up the least-worthy result and decide to base law or policy on it!

On the other hand, the NY Times opinion piece cited in #6 illustrates just such a case in which the scientific and governmental regulatory communities have stuck by the validity of a peer-reviewed article while "the public" has thoroughly debunked it. I think I can be honest enough to admit that there is validity in the idea of opening up scientific results to commentary and public scrutiny as in the case of the so-called "ivory-billed woodpecker" cited by the article referenced in #6. In my defense, however, I would also point out that those who have debunked the peer-reviewed claims are hardly unskilled in the field, and consist of ornithologists and birdwatchers who *do* follow the science of the field rather closely.

Finally, to #2 and #1. Human studies and drug studies are particularly prone to errors of statistical power. In many cases it is just not possible to take into account individual variability in reaction to a drug. Hence, a carefully selected population may *randomly* not represent the true population mean, or it may simply not have enough subjects to be reliable. It is one reason why I tell a particular friend to look at the clinical trials and be sure to read the side-effects and reactions of people who *dropped out* of the trial, because sure enough *that* is the reaction she will have to the drug.
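The small-sample point can be illustrated with a quick simulation. This is only a minimal sketch, not an analysis of any real trial: the function name `trial_means`, the "true" effect of 5.0, the individual variability of 10.0, and the trial sizes of 20 and 500 subjects are all made-up assumptions chosen for illustration. It draws many hypothetical trials from the same population and compares how much the trial averages wander around the true value.

```python
import random
import statistics

def trial_means(true_effect, sd, n_subjects, n_trials, seed=0):
    """Simulate n_trials independent trials; return each trial's mean response.

    Individual responses are drawn from a normal distribution, which is
    itself an assumption made purely for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(n_trials):
        responses = [rng.gauss(true_effect, sd) for _ in range(n_subjects)]
        means.append(sum(responses) / n_subjects)
    return means

# Hypothetical numbers: same drug, same population, different trial sizes.
small = trial_means(true_effect=5.0, sd=10.0, n_subjects=20, n_trials=500)
large = trial_means(true_effect=5.0, sd=10.0, n_subjects=500, n_trials=500)

# How far the per-trial averages scatter around the true effect.
spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)

# Fraction of small trials that, by chance alone, show no benefit at all.
no_benefit_small = sum(m <= 0 for m in small) / len(small)

print(f"spread of trial means, 20 subjects:  {spread_small:.2f}")
print(f"spread of trial means, 500 subjects: {spread_large:.2f}")
print(f"small trials showing no benefit:     {no_benefit_small:.1%}")
```

Under these assumptions the drug really does work in every simulated trial, yet the small trials scatter much more widely around the true effect than the large ones, which is the sense in which a selected group may *randomly* fail to represent the population.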

However, there is another factor that contributes to diminished efficacy of drugs on the market (#1):

A. Before a drug is in clinical trial, the initial "pre-clinical" testing is on animals. FDA requires tests on at least two different species from different taxonomic orders (e.g. rats - dogs, cats - monkeys, mice - pigs, etc.). Pharmaceutical companies have been caught too often by drugs that work in animals, but not humans; thus one of the *preferred* preclinical trials is in nonhuman primates, but their use is quite often restricted by animal rights movements.

B. Phase I clinical trials are strictly to generate toxicology results, and are in animals and humans.

C. If a drug passes Phase I (i.e. is not toxic in proposed doses) it is tested in Phase II to find out if it has an effect at the proposed dose, and if not, what dose is required.

D. Phase III is a larger population test to find out whether the drug has the desired effect at the prescribed dose, and whether the effect is robust and lasting.

E. The drug is approved by the FDA and can be prescribed for a specific disease or treatment.

F. The drug becomes widely accepted, and is prescribed for patients with similar or marginal conditions.

G. The drug becomes generic or there are "off-label" uses - such as the antihypertensive that became a hair restorative (minoxidil).

With each subsequent step, the patient base becomes broader, the symptomatology becomes looser, and the appropriateness of the use becomes less assured.

There is a much simpler explanation for #1 than that proposed in the article: appropriateness and compliance. As a drug becomes accepted, it gets prescribed to individuals who would never have been included in the initial clinical trials. Given that Rogaine (minoxidil) is used by so many people that do *not* require a vasodilator, is it appropriate to judge its effectiveness as a vasodilator on the *entire* population that uses it? Abilify (aripiprazole) was found to be highly effective in Phase II and Phase III trials in patients with moderate to severe schizophrenia. Lehrer (#1) cites the "startling news" that these drugs were no longer as effective as the initial studies had indicated. But! Abilify is no longer prescribed just for schizophrenia – but has been FDA approved for treatment of bipolar depression (since 2006), unipolar depression (since 2007) and autism (since 2009). Now, what has changed? The original research? The drug? Or the patient base? The flip side of appropriateness is compliance: The drug may be entirely appropriate - but do the patients *take* the drug as directed? The biggest problem with antibiotic-resistant bacteria is not the effectiveness of the drugs, but the fact that so many patients do not take the antibiotics as frequently or as long as prescribed by the doctor. Drug dosing schedules have been carefully worked out in preclinical and Phase I-III testing, but if a patient stops taking the drug because of financial, emotional or other reasons, the effectiveness is reduced and it has *nothing* to do with whether the preclinical results were effectively peer-reviewed prior to release.
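The appropriateness-and-compliance argument is really just a weighted average, and a few lines of arithmetic make it concrete. This is a hedged sketch with invented numbers: the function `observed_effect`, the patient shares, the per-group effect sizes, and the compliance rates are all hypothetical, and treating compliance as a simple multiplier on the effect is a deliberate simplification rather than a pharmacological model.

```python
def observed_effect(groups):
    """Average effect across a patient base.

    Each group is (share_of_patients, effect_if_taken_as_directed, compliance).
    Compliance is treated as a simple multiplier, which is an assumption.
    """
    total_share = sum(share for share, _, _ in groups)
    weighted = sum(share * effect * compliance
                   for share, effect, compliance in groups)
    return weighted / total_share

# Hypothetical trial: only the intended patients, closely supervised.
trial_population = [
    (1.00, 0.60, 0.95),
]

# Hypothetical market: the intended patients are now a minority, the broader
# group responds less, and nobody is supervising the dosing schedule.
market_population = [
    (0.40, 0.60, 0.70),  # originally intended patients
    (0.60, 0.20, 0.70),  # similar or marginal conditions
]

trial = observed_effect(trial_population)
market = observed_effect(market_population)

print(f"average effect in the trial population:  {trial:.3f}")
print(f"average effect in the market population: {market:.3f}")
```

In this toy setup the drug's effect on the intended patients never changes between the two calculations. Only the mix of patients and how faithfully they take the drug differ, and that alone is enough to pull the measured average down.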

---

So, do I think peer-review is broken? Well, #5 tells me that it can be, and has been, warped. I do *not* think it should be scrapped, but I think it needs better watchdogs. Who are those watchdogs? We all are. If a scientist witnesses abuse of the system, they should be able to speak out and not get shut out because of political whim. The public needs to be better educated so that they do not get told what to do by manipulative media, politicians and, yes, scientists. I would be all for fully open access to science if the public were educated enough to understand the basics to be able to tell what is and is not good science. Unfortunately the reality is that there exists a high level in science where only a very few people worldwide understand or even care. Only time can judge the worth of such research; the rest requires an educated populace. As long as there is *any* stratification within the populace based on education, there will be those who must translate science to the masses, and become gatekeepers. The gatekeeper position can all too easily be corrupted, as we have seen.

Knowledge is power. Be powerful.

11 comments:

Ric Locke, May 10, 2012 at 5:57 PM
Peer review, as such, needs to go away.

It originated as a coping strategy for limited bandwidth. Print publication was (and remains) so expensive, complicated, and narrow that it became vitally necessary to find ways to exclude the weirdos, incompetent amateurs, and poseurs, lest the relatively few pages of the journals be filled with dreck. Thus peer review -- originally, a relatively quick "sanity check" to see that the proposed paper at least made sense and wasn't a waste of time, paper, and ink.

But the egos of the reviewers intruded. In all too many cases, the "review" tried to actually grade the science, to make sure that it not only made enough sense to be interesting but was in some ways "correct". That got worse and worse over the years, to the point where not just journalists but actual scientists are able to cite "peer reviewed" as being something like "checked and approved". That's nonsense. The only way to check a scientific result is to repeat the experiment (and/or data collection), and reviewers don't and can't do that.

It's also a direct, in-your-face violation of the original founding principles of science, as elucidated by Hooke, Newton, et al. At the meetings of the Royal Society any member could get up and present, needing only to schedule it with the secretary. There are many contemporary complaints that this resulted in much waste of time, but the founders of the Society held fast.

As we should, now. Given the unlimited (for most purposes) bandwidth of the Internet, there is no reason for using "peer review" or any other system for restricting publication -- unless you are of the mindset that says "I can't release my data/analysis because you'll just find something wrong with it", in which case your claim to the sobriquet "scientist" is utterly fraudulent. The whole point of science is to reveal everything to the slings and arrows of outrageous critics, because the stuff that survives that treatment is useful (possibly even "correct") and the stuff that doesn't so survive doesn't deserve to.

It might well be useful to have, not gatekeepers, but guides -- people who take it upon themselves to say "this looks interesting and useful; somebody else should try it and see" or "this seems laughably ridiculous to me"; such recommendations could, over time, become useful. We're starting to see a bit of that. More would be handy.

But, in the end, it's all about the data and interpretation thereof, and the only real test of that is to expose it fully and let the vultures attack. That's science. Anything else will eventually morph into religion-equivalent, just as much of today's "scientific debate" already has.

Eric Rasmusen, May 11, 2012 at 8:43 PM
A lot of what you identify is about the FDA being broken, not peer review.

On peer review: in my field (economics), everything comes out as a working paper first, with no attempt at secrecy. That functions like the publish-everything strategy you discuss. There is more publication delay in economics than in most fields, though, because it is harder (I think) to get published and the referees are as much for "fix this up" as "this is good enough". The peer review process substantially improves papers, especially for younger scholars. It's hard to get anyone to seriously read a technical paper and find mistakes, especially if you're not at a top 20 school, but a referee will do it.
Also, the winnowing function is very important. A well-known scholar doesn't really need it, since people will read his work even if it's just a working paper. Most scholars, though, won't get read unless a journal endorses their work as good. And tenure committees often have no idea of quality except through journal quality (that's in part because of the publication lag--an assistant professor may have enough papers accepted for publication, but his cite count isn't big enough yet to use instead of journal quality).
It also helps that in economics we don't have the kind of publish-your-allies attitude that ClimateGate showed, and grants aren't important enough for intimidation to work. It really is important to get the right scholarly norms in place.
We do have the problem you mention of parochialism: referees in one subarea of economics rejecting papers because they don't cite the subarea literature enough or aren't written in the accustomed style or aren't in the subarea's narrow range of interest. The only way to get around that is to figure out some way to broaden narrow minds like the leech-brain scholar Nietzsche makes fun of in Thus Spoke Zarathustra.

Anonymous, May 11, 2012 at 9:50 PM
When I was studying for my PhD, journals were eager to post their articles online as soon as possible so the authors could get as many references as possible. It made doing background research so much easier for us students. Then the scientific publishing industry put a big brake on the process and put all journals behind paywalls. Even libraries have put information behind paywalls and are starting to ban the general public from entering. How is the general public supposed to get educated about an issue so they can adequately evaluate it if they are not permitted to read the source articles? This movement of scientific knowledge behind walls guarded by gatekeepers is, in my opinion, the reason that all the problems listed in the original post occur. The solution is to go back to free knowledge on the internet. All science is ultimately paid for by taxpayers, so it should be available to them; the fact that gatekeepers are charging for access is one of the great crimes of the modern age and will ultimately limit the growth of our civilization.