How to Read a Paper: The Basics of Evidence-Based Medicine
Authors: Trisha Greenhalgh
Chapter 3
Getting your bearings: what is this paper about?
The science of ‘trashing’ papers
It usually comes as a surprise to students to learn that some (the purists would say up to 99% of) published articles belong in the bin, and should certainly not be used to inform practice. In 1979, the editor of the British Medical Journal, Dr Stephen Lock, wrote ‘Few things are more dispiriting to a medical editor than having to reject a paper based on a good idea but with irremediable flaws in the methods used’. Fifteen years later, Altman was still claiming that only 1% of medical research was free of methodological flaws [1]; and more recently he confirmed that serious and fundamental flaws commonly occur even in papers published in ‘quality’ journals [2]. Box 3.1 shows the main flaws that lead to papers being rejected (and which are present to some degree in many that end up published).
Most papers appearing in medical journals these days are presented more or less in the standard Introduction, Methods, Results and Discussion (IMRAD) format: Introduction (why the authors decided to do this particular piece of research), Methods (how they did it, and how they chose to analyse their results), Results (what they found) and Discussion (what they think the results mean). If you are deciding whether a paper is worth reading, you should do so on the design of the methods section, and not on the interest value of the hypothesis, the nature or potential impact of the results or the speculation in the discussion.
Conversely, bad science is bad science regardless of whether the study addressed an important clinical issue, whether the results are ‘statistically significant’ (see section ‘Probability and confidence’), whether things changed in the direction you would have liked them to and whether the findings promise immeasurable benefits for patients or savings for the health service. Strictly speaking, if you are going to trash a paper, you should do so before you even look at the results.
Box 3.1 Common reasons why papers are rejected for publication
1. The study did not address an important scientific issue (see section ‘Three preliminary questions to get your bearings’).
2. The study was not original, that is, someone else has already performed the same or a similar study (see section ‘Was the study original?’).
3. The study did not actually test the authors' hypothesis (see section ‘Three preliminary questions to get your bearings’).
4. A different study design should have been used (see section ‘Randomised controlled trials’).
5. Practical difficulties (e.g. in recruiting participants) led the authors to compromise on the original study protocol (see section ‘Was the design of the study sensible?’).
6. The sample size was too small (see section ‘Were preliminary statistical questions addressed?’).
7. The study was uncontrolled or inadequately controlled (see section ‘Was systematic bias avoided or minimised?’).
8. The statistical analysis was incorrect or inappropriate (see Chapter 5).
9. The authors have drawn unjustified conclusions from their data.
10. There is a significant conflict of interest (e.g. one of the authors, or a sponsor, might benefit financially from the publication of the paper and insufficient safeguards were seen to be in place to guard against bias).
11. The paper is so badly written that it is incomprehensible.
It is much easier to pick holes in other people's work than to do a methodologically perfect piece of research oneself. When I teach critical appraisal, there is usually someone in the group who finds it profoundly discourteous to criticise research projects into which dedicated scientists have put the best years of their lives. On a more pragmatic note, there may be good practical reasons why the authors of the study have not performed a perfect study, and they know as well as you do that their work would have been more scientifically valid if this or that (anticipated or unanticipated) difficulty had not arisen during the course of the study.
Most good scientific journals send papers out to a referee for comments on their scientific validity, originality and importance before deciding whether to publish them. This process is known as peer review, and much has been written about it [3]. Common defects picked up by referees are listed in Box 3.1.
The assessment of methodological quality (critical appraisal) has been covered in detail in the widely cited series led by Gordon Guyatt, ‘Users' Guides to the Medical Literature’ (for the full list and links to the free full text of most of them, see JAMA Evidence http://www.cche.net/usersguides/main.asp). The structured guides produced by these authors on how to read papers on therapy, diagnosis, screening, prognosis, causation, quality of care, economic analysis, systematic review, qualitative research and so on are regarded by many as the definitive checklists for critical appraisal. Appendix 1 lists some simpler checklists I have derived from the Users' Guides and the other sources cited at the end of this chapter, together with some ideas of my own. If you are an experienced journal reader, these checklists will be largely self-explanatory. But if you still have difficulty getting started when looking at a medical paper, try asking the preliminary questions in the next section.
Three preliminary questions to get your bearings
Question One: What was the research question—and why was the study needed?
The introductory sentence of a research paper should state, in a nutshell, what the background to the research is. For example, ‘Grommet insertion is a common procedure in children, and it has been suggested that not all operations are clinically necessary’. This statement should be followed by a brief review of the published literature, for example, ‘Gupta and Brown's prospective survey of grommet insertions demonstrated that…’. It is irritatingly common for authors to forget to place their research in context, as the background to the problem is usually clear as daylight to them by the time they reach the writing-up stage.
Unless it has already been covered in the introduction, the methods section of the paper should state clearly the research question and/or the hypothesis that the authors have decided to test. For example: ‘This study aimed to determine whether day case hernia surgery was safer and more acceptable to patients than the standard inpatient procedure’.
You may find that the research question has inadvertently been omitted, or, more commonly, that the information is buried somewhere mid-paragraph. If the main research hypothesis is presented in the negative (which it usually is), such as ‘The addition of metformin to maximal dose sulphonylurea therapy will not improve the control of Type 2 diabetes’, it is known as a null hypothesis. The authors of a study rarely actually believe their null hypothesis when they embark on their research. Being human, they have usually set out to demonstrate a difference between the two arms of their study. But the way scientists do this is to say ‘let's assume there's no difference; now let's try to disprove that theory’. If you adhere to the teachings of Popper, this hypotheticodeductive approach (setting up falsifiable hypotheses that you then proceed to test) is the very essence of the scientific method [4].
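The ‘assume there's no difference, then try to disprove it’ logic can be made concrete with a small sketch. The example below is illustrative only: the function name, the choice of an exact permutation test and the HbA1c figures are all assumptions made for the illustration, not anything taken from a real trial or from the papers cited in this chapter. If the null hypothesis holds, the arm labels are interchangeable, so the code relabels the pooled values in every possible way and asks how often a difference at least as large as the observed one turns up.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p_value(arm_a, arm_b):
    """Two-sided exact permutation test for a difference in means.

    Under the null hypothesis the arm labels are interchangeable, so every
    way of splitting the pooled values into two arms is equally likely.
    """
    pooled = list(arm_a) + list(arm_b)
    n_a = len(arm_a)
    observed = abs(mean(arm_a) - mean(arm_b))

    at_least_as_extreme = 0
    n_splits = 0
    for chosen in combinations(range(len(pooled)), n_a):
        chosen_set = set(chosen)
        relabelled_a = [pooled[i] for i in chosen]
        relabelled_b = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        difference = abs(mean(relabelled_a) - mean(relabelled_b))
        # Small tolerance so floating-point noise does not hide exact ties.
        if difference >= observed - 1e-12:
            at_least_as_extreme += 1
        n_splits += 1
    return at_least_as_extreme / n_splits


# Hypothetical HbA1c changes (percentage points); not data from any real trial.
sulphonylurea_alone = [-0.1, 0.2, -0.3, 0.0, 0.1]
with_metformin = [-0.9, -0.4, -1.1, -0.6, -0.2]
p_value = exact_permutation_p_value(sulphonylurea_alone, with_metformin)
print(f"p = {p_value:.3f}")
```

A small p-value here means only that the observed difference would be unusual if the null hypothesis were true. It says nothing about whether the study was well designed, which is why the methods still have to be appraised first.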
If you have not discovered what the authors' research question was by the time you are halfway through the methods section, you may find it in the first paragraph of the discussion. Remember, however, that not all research studies (even good ones) are set up to test a single definitive hypothesis. Qualitative research studies, which (so long as they are well-designed and well-conducted) are as valid and as necessary as the more conventional quantitative studies, aim to look at particular issues in a broad, open-ended way in order to illuminate issues, generate or modify hypotheses and prioritise areas to investigate. This type of research is discussed further in Chapter 12. Even quantitative research (which most of the rest of this book is about) is now seen as more than hypothesis-testing. As section ‘Probability and confidence’ argues, it is strictly preferable to talk about evaluating the strength of evidence around a particular issue than about proving or disproving hypotheses.
Question Two: What was the research design?
First, decide whether the paper describes a primary or secondary study. Primary studies report research first-hand, while secondary studies attempt to summarise and draw conclusions from primary studies. Primary studies (sometimes known as empirical studies) are the stuff of most published research in medical journals, and usually fall into one of three categories:
The commoner types of clinical trials and surveys are discussed in the later sections of this chapter. Make sure you understand any jargon used in describing the study design (see Table 3.1).
Table 3.1 Terms used to describe design features of clinical research studies
Term | Meaning |
Parallel group comparison | Each group receives a different treatment, with both groups being entered at the same time. In this case, results are analysed by comparing groups. |
Paired (or matched) comparison | Participants receiving different treatments are matched to balance potential confounding variables such as age and sex. Results are analysed in terms of differences between participant pairs. |
Within-participant comparison | Participants are assessed before and after an intervention and results analysed in terms of within-participant changes. |
Single-blind | Participants did not know which treatment they were receiving. |
Double-blind | Neither the participants nor the investigators knew which treatment each participant was receiving. |
Crossover | Each participant received both the intervention and control treatments (in random order), often separated by a washout period on no treatment. |
Placebo-controlled | Control participants receive a placebo (inactive pill) that should look and taste the same as the active pill. Placebo (sham) operations may also be used in trials of surgery. |
Factorial design | A study that permits investigation of the effects (both separately and combined) of more than one independent variable on a given outcome (e.g. a 2 × 2 factorial design tested the effects of placebo, aspirin alone, streptokinase alone or aspirin + streptokinase in acute heart attack [5]). |
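The design terms in Table 3.1 matter partly because each one implies a different unit of analysis. The sketch below is a hedged illustration of that point, not a statistical recipe: the function names and all numbers are hypothetical, and it computes point estimates only (no confidence intervals or significance tests). The factorial function assumes a 2 × 2 design summarised as one mean outcome per cell.

```python
from statistics import mean


def parallel_group_effect(treated, control):
    """Parallel group comparison: compare the two group means."""
    return mean(treated) - mean(control)


def paired_effect(pairs):
    """Paired (matched) comparison: average the differences within each pair.

    Each item in `pairs` is (treated_value, control_value) for one matched pair.
    """
    return mean(t - c for t, c in pairs)


def within_participant_effect(before, after):
    """Within-participant comparison: average each participant's own change."""
    return mean(a - b for b, a in zip(before, after))


def factorial_effects(cell_means):
    """2 x 2 factorial design: main effect of each factor and their interaction.

    `cell_means` maps (got_a, got_b) to the mean outcome in that cell, e.g.
    (False, False) is the double-placebo cell and (True, True) received both.
    """
    neither = cell_means[(False, False)]
    a_only = cell_means[(True, False)]
    b_only = cell_means[(False, True)]
    both = cell_means[(True, True)]
    effect_a = ((a_only - neither) + (both - b_only)) / 2
    effect_b = ((b_only - neither) + (both - a_only)) / 2
    # Non-zero interaction: the effect of A depends on whether B is also given.
    interaction = (both - b_only) - (a_only - neither)
    return effect_a, effect_b, interaction


# Hypothetical outcome means (e.g. events per 100 participants) in each cell.
example_cells = {
    (False, False): 10.0,
    (True, False): 8.0,
    (False, True): 7.0,
    (True, True): 5.0,
}
print(factorial_effects(example_cells))
```

With equal-sized groups, the parallel-group and paired point estimates can coincide. What changes is the unit being analysed (groups versus pairs), and therefore how much of the between-participant variation is removed before any statistical test is applied.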