Section 6: Systematic literature review
Guidelines based on a consensus of expert opinion or on unsystematic literature surveys have been criticised as not reflecting current medical knowledge and being liable to bias.1, 2 SIGN guidelines are therefore produced using a considered judgment process informed by systematic reviews of evidence. Systematic review is defined as "an efficient scientific technique to identify and summarise evidence on the effectiveness of interventions and to allow the generalisability and consistency of research findings to be assessed and data inconsistencies to be explored".3
The SIGN approach is to produce a systematic review of the evidence for each key question (KQ) to be addressed in the guideline. Evidence tables (and summaries of findings where possible) are produced as supporting documents and the essential elements of systematic review are met in that the literature is:
All the stages of the review process are thoroughly documented (see below).
The benefits of the SIGN approach derive from the close involvement of guideline developers with the synthesis of the evidence base, allowing them to apply their considered judgment when deriving recommendations (see Sections 3 and 4), and from encouraging a sense of ownership of the guideline amongst all those involved in the process.
Incorporating the patient’s perspective from the beginning of the development process is essential if it is to influence the coverage of the final guideline. One of the measures used to achieve this is to conduct a specific search on patient issues in advance of the first meeting of the guideline development group.
This search is designed to cover both quantitative and qualitative evidence, and is not limited to specific study designs. It is carried out over the same range of databases and sources as the main literature review, but will normally include both nursing and psychological literature even where these are not seen as particularly relevant to the later searches of the medical literature. Whereas other literature searches carried out for the guideline attempt to answer focused key questions by filtering out the volume of irrelevant evidence, the patient search is deliberately as broad and inclusive as possible. It focuses entirely on the health condition that is being considered, and makes no attempt to concentrate on any social group or class. As the reviewer develops themes from the literature, (s)he will pay particular attention to anything that suggests there are population groups that are disadvantaged and ensure their interests are specifically considered by the guideline development group.
The use of this literature search is discussed in more detail in SIGN 100.4
As more good quality guidelines are being produced by other agencies, SIGN is making use of the evidence base underlying guidelines produced elsewhere for use in NHSScotland.
The guidelines identified in the scoping search carried out for the original guideline proposal will be presented to an early meeting of the guideline development group to allow it to consider what has been done already.
All guidelines must either be NHS Evidence accredited or be evaluated using the AGREE II instrument and be shown to have followed an acceptable methodology before they can be considered for use by SIGN guideline developers.
There is a range of possible ways in which existing guidelines can be used in relation to SIGN guidelines.
In all cases the guideline development group must decide on the best way forward that will address clinical need while avoiding duplication and waste of resources.
SIGN guideline development groups are encouraged to break down the guideline remit into a series of structured key questions using the PICO format as shown below.5, 6
Patients or population to which the question applies
Intervention (or diagnostic test, exposure, risk factor, etc.) being considered in relation to these patients
Comparison(s) to be made between those receiving the intervention and another group who do not receive the intervention
Outcome(s) to be used to establish the size of any effect caused by the intervention.
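As an illustration only, the four PICO elements above could be recorded as a small data structure so that every key question is captured in the same shape. This is a minimal sketch, not part of the SIGN process: the class name, field names and the example question are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class KeyQuestion:
    """A structured key question in PICO form (field names are illustrative)."""

    population: str               # P: patients or population, including age groups
    intervention: str             # I: intervention, diagnostic test, exposure, etc.
    comparisons: Tuple[str, ...]  # C: what the intervention is compared against
    outcomes: Tuple[str, ...]     # O: outcomes used to establish the size of effect

    def as_sentence(self) -> str:
        """Render the question as a single readable sentence."""
        return (
            f"In {self.population}, what is the effect of {self.intervention} "
            f"compared with {' or '.join(self.comparisons)} "
            f"on {', '.join(self.outcomes)}?"
        )


# Hypothetical example question; none of these values come from a SIGN guideline.
example = KeyQuestion(
    population="adults aged 50 and over with osteoporosis",
    intervention="a named drug class",
    comparisons=("other drug therapies",),
    outcomes=("fracture", "adverse events"),
)
print(example.as_sentence())
```

Keeping the comparisons and outcomes as separate fields makes it easy to see at a glance whether a question has left either of them unspecified.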
The Patients or population to be covered by the literature searches is largely defined by the presence of the particular condition that the guideline will cover. It should, however, be made clear which age groups are to be included.
Consideration should also be given to issues of equity - ensuring that any particular subgroup of the patient population that has particular needs in relation to the topic under review has those needs specifically addressed. This should take account not just of the needs of that population, but any evidence of differences in effectiveness of interventions between equality groups.
The factors defining the population subgroups/protected characteristics that are normally considered are:
It is worth emphasising here that, where clinically important, questions should be addressed even if it is not thought there will be any good evidence. If there is in fact no good evidence, then highlighting it as an area for research is a useful outcome in itself. Dealing with uncertainties of this kind will be addressed in the section of this manual covering the later stages of the considered judgment process.
The Interventions (which in this context includes diagnostic tests, risk factors, risk exposure) must be specified clearly and precisely. The only exception is in drug therapy where drug classes should be used in preference to specific agents unless there is a clear reason for focusing on a named agent.
The decision on Comparisons is mostly between placebo/no treatment, or comparison with other therapies. It should be borne in mind that, where there is an existing treatment, comparison with placebo or no treatment is not ethically acceptable.
Outcomes must be clearly specified, ideally at the stage of setting the key questions but certainly before making judgments about the quality of evidence. For some questions there will be a wide range of outcomes used in the literature, and, if useful comparisons are to be made across studies, it must be made clear which of these outcomes are expected to influence decisions about healthcare.
Outcomes should be discussed by the guideline development group and rated in terms of their importance.7 Where existing reviews have been identified in the scoping search, the outcomes used in those reviews should be presented to the GDG as an aid to completing this part of the process.
Critical outcomes are those on which the overall quality of evidence for a KQ is based. These are the key outcomes on which a health care professional would be expected to base a treatment decision. In osteoporosis, for example, prevention of fracture is likely to be seen as a critical outcome. The number of critical outcomes should be kept low, preferably fewer than seven per KQ.
Important outcomes are those that a healthcare professional is likely to take into account when making treatment decisions, but which are not the ultimate aim of the intervention under consideration. Often, these will be surrogates for the critical outcome. In the osteoporosis example, improved bone mineral density may be seen as an important outcome as it is a widely accepted surrogate for reduced fracture risk.
In some areas there are a large number of reported outcomes. Some of these are likely to be peripheral to treatment decisions and can be largely ignored in the process of developing guideline recommendations.
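The rating of outcomes described above can be sketched as a simple classification with a check on the number of critical outcomes. This is an illustrative sketch under stated assumptions: the label `PERIPHERAL`, the function name and the example ratings are hypothetical, and the limit of six reflects the preference for fewer than seven critical outcomes per KQ.

```python
import warnings
from enum import Enum
from typing import Dict, List


class Importance(Enum):
    CRITICAL = "critical"      # overall quality of evidence for the KQ rests on these
    IMPORTANT = "important"    # taken into account, but not the ultimate aim
    PERIPHERAL = "peripheral"  # reported in the literature, largely ignored


MAX_CRITICAL = 6  # "preferably fewer than seven per KQ"


def critical_outcomes(ratings: Dict[str, Importance]) -> List[str]:
    """Return the outcomes rated critical, warning if there are too many."""
    critical = [name for name, level in ratings.items() if level is Importance.CRITICAL]
    if len(critical) > MAX_CRITICAL:
        warnings.warn(
            f"{len(critical)} critical outcomes; the group may wish to reduce this."
        )
    return critical


# Ratings loosely follow the osteoporosis example; the third entry is hypothetical.
ratings = {
    "fracture": Importance.CRITICAL,
    "bone mineral density": Importance.IMPORTANT,
    "serious adverse events": Importance.CRITICAL,
}
print(critical_outcomes(ratings))
```

A warning rather than an error is used because the limit is expressed as a preference, not a rule.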
As far as possible outcomes should be objective and directly related to patient outcomes (eg length of time to next cardiovascular incident or survival time, rather than just reductions in blood pressure). Patient important outcomes should be explicitly considered along with more narrowly defined clinically important outcomes. It is particularly important to include any potential harm associated with the intervention under review so that a balanced view can be taken at the considered judgment stage.
A pro-forma to help formulate questions is included as Annex A to this document.
As part of the question setting process, a set of inclusion and exclusion criteria should be drawn up and saved as part of the record of the review. This will provide guidance at a later stage when studies are being selected for review.
Inclusion criteria will include definition of the topic and may include such factors as duration of therapy, drug dosage, and frequency of treatment. Other factors include any geographic or language limits, the types of trials that will be accepted, and date range to be covered. Any equality groups that are expected to have specific needs in relation to the question being addressed should be specified.
Exclusion criteria are likely to be more variable. They are, however, essential in that they help sift out irrelevant studies from the (often very large) initial search result.
Equality groups should never be specifically excluded from searches unless a clear justification is provided (eg issues for specific equality groups addressed in a separate question).
Results are sifted in two stages. Initially the Information Officer will review the abstract list and remove any duplicate studies, plus those that are clearly irrelevant or not the type of study being considered (eg reject observational studies when the focus is on controlled trials). Subsequently, at least one member of the GDG will look at the sifted search results and reject any remaining abstracts that they do not consider relevant. Decisions by the GDG member should be based on an explicit set of criteria agreed by the group. These will include clinical criteria, but may also consider issues such as size of the study, relevance to practice in the UK, etc.
Studies left after this second sift will normally be obtained for full review.
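The two-stage sift can be pictured as two filters applied in sequence. The sketch below is only an illustration: the record fields, the use of a normalised title to spot duplicates, and the single "relevant to UK practice" criterion are assumptions made for the example, not a description of the tools SIGN staff actually use.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Set


@dataclass(frozen=True)
class Abstract:
    record_id: str
    title: str
    study_type: str    # e.g. "rct" or "observational" (hypothetical labels)
    uk_relevant: bool  # hypothetical flag set by the reviewer


def first_sift(abstracts: Iterable[Abstract], accepted_types: Set[str]) -> List[Abstract]:
    """Stage 1 (Information Officer): drop duplicates and unwanted study types."""
    seen_titles: Set[str] = set()
    kept: List[Abstract] = []
    for abstract in abstracts:
        key = abstract.title.strip().lower()  # crude duplicate key; an assumption
        if key in seen_titles:
            continue
        seen_titles.add(key)
        if abstract.study_type in accepted_types:
            kept.append(abstract)
    return kept


def second_sift(
    abstracts: Iterable[Abstract],
    criteria: List[Callable[[Abstract], bool]],
) -> List[Abstract]:
    """Stage 2 (GDG member): keep abstracts that meet every agreed criterion."""
    return [a for a in abstracts if all(rule(a) for rule in criteria)]


search_results = [
    Abstract("r1", "Trial of drug A", "rct", True),
    Abstract("r2", "trial of drug a ", "rct", True),  # duplicate of r1
    Abstract("r3", "Cohort study of drug A", "observational", True),
    Abstract("r4", "Trial of drug B", "rct", False),
]
after_first = first_sift(search_results, accepted_types={"rct"})
for_full_review = second_sift(after_first, criteria=[lambda a: a.uk_relevant])
print([a.record_id for a in for_full_review])
```

Passing the agreed criteria in as an explicit list mirrors the requirement that the GDG member's decisions rest on a stated set of criteria rather than ad hoc judgment.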
Once the questions have been agreed they form the basis of the literature searches to be undertaken by an Information Officer. These searches will focus on the patients, interventions, and (sometimes) comparison parts of the question. Additional references provided by group members or other interested parties may be included for consideration in the evidence base but must be evaluated on the same basis as all other studies (see Section 6.5).
Definition of a set of clear and focused clinical questions is fundamental to the successful completion of a guideline development project. It is also important to be realistic about the number of questions that can be addressed in a single guideline if the final product is not to be too large to be useable. A large number of key questions will incur a very high workload for the developers. Care must be taken to ensure this is kept within limits that will allow the guideline to be completed within the agreed timescale. Keeping the number of questions to a minimum is especially important in areas that are rich in literature, as each question will require review of a large number of studies.
Deciding the key questions is entirely the responsibility of the guideline development group, which must apply its knowledge and experience to ensuring the questions address the key issues in the area to be covered by the guideline. The Information Officer working with the group will provide guidance on question formatting, and ensure the questions are likely to produce useable results. The Information Officer will also work with the Patient Involvement Officer to ensure that the key questions address appropriately the issues identified through the patient consultation exercise.8
The literature search must focus on the best available evidence to address each key question. SIGN uses a set of standard search filters that identify:
In order to minimise bias and to ensure adequate coverage of the relevant literature, the literature search must cover a range of sources. As a minimum, SIGN requires searches to cover the following sources:
For questions covering drug treatments, searches will also cover:
Specialised databases such as CINAHL or PsycINFO should only be searched for questions specific to their area of coverage.
Following a review in 2012 SIGN decided not to include Embase in the standard set of databases covered. Evidence suggests that the benefit of searching both Embase and Medline is variable between topics,9, 10 but can be very low. Inclusion of CCTR in the main search picks up those unique Embase records added by Cochrane and reduces duplication of effort.11, 12
SIGN does not undertake hand searching of key journals as part of the literature review. It is accepted that this means some relevant trials may be missed, and introduces the possibility of a degree of bias in the process. However, given time and resource constraints, it is not feasible for this to form part of the process.
The period that the search should cover will depend on the nature of the clinical topic under consideration, and will be discussed with the guideline group. For a rapidly developing field a 5 year limit to the search may be appropriate, whereas in other areas a much longer time frame might be necessary.
All the main search strategies are available to members of guideline development groups if they want to review them.
A listing of the Medline search strategies used for the guideline, plus notes of any significant variation on other databases, is published on the SIGN website at the time of publication of the guideline.
Before any studies are acquired for evaluation, the search output is sifted to eliminate irrelevant material. A preliminary sift of each search result is carried out by SIGN staff, normally by the individual who carried out the search. Studies that are clearly not relevant to the key questions are eliminated. Abstracts of remaining studies are then examined and any that clearly do not meet the agreed inclusion and exclusion criteria will also be eliminated at this stage. In cases of doubt, the Information Officer will leave abstracts in the output file at this stage.
A final sift is carried out by one or two individuals from the guideline development group, who will apply clinical judgment to reject any other studies that do not meet the pre-agreed criteria. Only when all stages of search result sifting have been completed will the remaining studies be acquired for evaluation.
Once studies have been selected as potential sources of evidence, the methodology used in each study is assessed to ensure its validity.
The methodological assessment is based on a number of criteria that focus on those aspects of the study design that research has shown to have a significant effect on the risk of bias in the results reported and conclusions drawn. These criteria differ between study types, and a range of checklists is used to bring a degree of consistency to the assessment process. The SIGN checklist for systematic reviews is based on the AMSTAR tool,13, 14 while that for RCTs is based on an internal project carried out in 1997.15 Checklists for observational studies are based on the MERGE (Method for Evaluating Research and Guideline Evidence) checklists developed by the New South Wales Department of Health,16 which have been subjected to wide consultation and evaluation. The checklist for diagnostic accuracy studies is based on the QUADAS programme.17
These checklists were subjected to detailed evaluation and adaptation to meet SIGN’s requirements for a balance between methodological rigour and practicality of use. Copies of these checklists and accompanying notes on their use are included in Annex B.
The assessment process inevitably involves a degree of subjectivity. The extent to which a study meets a particular criterion – eg an acceptable level of loss to follow up – and, more importantly, the likely impact of this on the reported results from the study will depend on the clinical context and inevitably the judgment of the individual reviewers.
The methodology of studies selected for full consideration will be appraised by at least two people with experience in carrying out such appraisals. The subjective nature of critical appraisal makes double checking essential to minimise the chance of bias and to ensure consistency. Where reviewers cannot agree on the overall quality of a study the Programme Manager and (if required) the Lead Methodologist will arbitrate before a study goes forward for inclusion in the evidence base. (Note that this only applies to studies being actively considered as evidence. There is no need to seek agreement for studies that are not to be included.) Any study that has not been included in this process cannot be used as evidence to support a recommendation in the guideline.
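The double appraisal rule amounts to a small decision procedure: accept a rating when both reviewers agree, otherwise refer the study for arbitration. The following is a hedged sketch only. The rating labels, the function name and the stand-in arbitrator are hypothetical and do not represent SIGN's actual quality ratings or arbitration practice.

```python
from typing import Callable, Optional

Rating = str  # hypothetical labels such as "high", "acceptable", "low"


def agreed_rating(
    first: Rating,
    second: Rating,
    arbitrate: Optional[Callable[[Rating, Rating], Rating]] = None,
) -> Optional[Rating]:
    """Return the overall quality rating for a study appraised by two reviewers.

    If the reviewers agree, their shared rating stands. If they disagree and no
    arbitrator is supplied, None is returned to signal that the study cannot yet
    go forward into the evidence base.
    """
    if first == second:
        return first
    if arbitrate is None:
        return None
    return arbitrate(first, second)


def programme_manager(first: Rating, second: Rating) -> Rating:
    """Stand-in for a human decision; always returns a fixed label here."""
    return "acceptable"


print(agreed_rating("high", "high"))
print(agreed_rating("high", "low"))
print(agreed_rating("high", "low", arbitrate=programme_manager))
```

Returning `None` for an unresolved disagreement keeps such studies visibly out of the evidence base until a decision has been recorded.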
The first part of carrying out a well conducted systematic review for a guideline or any other purpose is to systematically identify existing sources of evidence, including other relevant systematic reviews. As with any source of evidence, systematic reviews will vary in the degree to which they help to address the key question under consideration.
In some cases a good quality systematic review will be identified that addresses the same question as that set by the guideline group. If the review is more than one year old, the Information Officer will update the literature search (using the same strategy as the original authors wherever possible). Any studies identified during that search will be evaluated using SIGN checklists, as described above, and presented as supplementary data. (Note: It can also mean there will be no recommendation because the review found no convincing evidence. In this case the group should make one or more research recommendations.)
The output from this process that is passed to the guideline group should include:
The group is now ready to proceed to the first considered judgment stage for this question.
Directly relevant evidence tables or summaries of findings from other guideline developers should be treated as systematic reviews.
In some areas of work, there is a comparatively large number of existing reviews that will either cover a key question or parts of it. It is sensible to make best use of this material, but we must also guard against multiple citations of the same literature. Ideally, the studies from all the identified reviews should be compiled into a single analysis but this is likely to be a complex and time consuming task that will not be achievable in the timescale of normal guideline development. There is however a need for a formal process for a review of reviews.18
In order to make the task manageable in situations where there are multiple reviews relating to one key question, it is acceptable to concentrate on Cochrane Reviews and the most recent reviews that have received a high quality rating.
Where a review of reviews already exists it should be evaluated and considered on the same basis as single reviews. If the review is recent (<2 years old) no further searching should be required. If it is >2 years old, the individual reviews should be checked to see if they have had a recent update, and if so how (if at all) the conclusions have changed.
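The age thresholds used in this section (one year for a single systematic review, two years for a review of reviews) can be expressed as a date comparison. This is a minimal sketch under stated assumptions: the function names are hypothetical, the publication date is taken as the reference point, and a review that is exactly at the threshold is treated as not yet older than it, since the text does not say how that case should be handled.

```python
from datetime import date


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later, mapping 29 Feb to 28 Feb."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:  # 29 February with a non-leap target year
        return d.replace(year=d.year + years, day=28)


def is_older_than(published: date, years: int, today: date) -> bool:
    """True if more than `years` years have passed since publication."""
    return today > add_years(published, years)


today = date(2012, 5, 29)  # arbitrary reference date for the example

# Single systematic review: update the search if more than one year old.
print(is_older_than(date(2010, 1, 1), years=1, today=today))

# Review of reviews: check for updates only if more than two years old.
print(is_older_than(date(2011, 6, 1), years=2, today=today))
```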
In rare cases the clinical question being addressed will be of such importance to the guideline that a particularly thorough review of the evidence will be essential. If there is no existing review of reviews in these cases, one will have to be prepared regardless of any impact on the timescale of guideline production.
In all other cases the guideline group should be presented with an evidence table summarising the findings of each review by outcome, along with comments from group members who have reviewed them. They should be accompanied by an evidence table summarising any more recent studies that have been identified addressing the key question. Considered judgment can then be applied to the whole body of evidence.
Where a review or reviews address only some aspects of a question, the processes described above should be followed in relation to the outcomes addressed in the reviews. As well as updating the literature search, a further search should be carried out looking for studies addressing outcomes not covered in the existing review. These can then be considered along with the existing review to produce a complete answer.
Where it has been established that there are no existing reviews, searches should be extended to individual studies. Types of study to be included should have been agreed as part of the inclusion criteria.
All reviews and studies that are identified as being of possible relevance and are acquired in full copy should be accounted for at the end of the process. If they are rejected for any reason, they should be listed in a report on rejected studies with a brief reason for rejection given against each one.
For individual studies, an evidence table should be produced for each key question. These tables will list bibliographic details of each study, and as far as possible a standard set of data including (but not limited to) sample size, population characteristics, intervention, primary and secondary outcome measures, length of follow up, primary and secondary outcome data, and comments from guideline group members.
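To show what a standard set of data per study might look like in practice, the sketch below defines one evidence table row and writes a table as CSV. It is illustrative only: the column names paraphrase the items listed above, the CSV layout is an assumption, and the example row is entirely made up.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Iterable


@dataclass
class EvidenceRow:
    citation: str          # bibliographic details
    sample_size: int
    population: str        # population characteristics
    intervention: str
    outcome_measures: str  # primary and secondary outcome measures
    follow_up: str         # length of follow up
    outcome_data: str      # primary and secondary outcome data
    comments: str = ""     # comments from guideline group members


def evidence_table_csv(rows: Iterable[EvidenceRow]) -> str:
    """Serialise evidence table rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(EvidenceRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


# Entirely hypothetical row, included only to show the shape of the output.
table = evidence_table_csv([
    EvidenceRow(
        citation="Hypothetical study 2010",
        sample_size=200,
        population="adults with the condition",
        intervention="drug class X",
        outcome_measures="fracture; bone mineral density",
        follow_up="24 months",
        outcome_data="not reported here",
        comments="example only",
    )
])
print(table)
```

Using one fixed set of columns for every study within a key question makes gaps in reporting visible and simplifies comparison across studies.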
Once all reviews and studies have been collected for each question, the process moves on to considering the quality of the evidence that has been found.
1. Antman E, Lau J, Kupelnick B, Mosteller F, Chalmers T. A comparison of results of meta-analyses of randomized controlled trials and recommendations of clinical experts. Treatments for myocardial infarction. JAMA 1992;268:240-8.
2. Mulrow C. Rationale for systematic reviews. BMJ 1994;309:597-9.
3. Woolf S. Practice guidelines, a new reality in medicine. II: Methods of developing guidelines. Arch Intern Med 1992;152:946-52.
4. Scottish Intercollegiate Guidelines Network. SIGN 100: a handbook for patient and carer representatives. Edinburgh: SIGN; 2008. [cited 29 May 2012] Available from http://www.sign.ac.uk/patients/publications/100/index.html
5. Counsell C. Formulating questions and locating primary studies for inclusion in systematic reviews. Ann Int Med 1997;127(5):380-7.
6. Schardt C, Adams MB, Owens T, Keitz S, Fontelo P. Utilization of the PICO framework to improve searching PubMed for clinical questions. BMC Med Inform Decis Mak 2007;7:16. [cited 29 May 2012] Available from http://www.biomedcentral.com/1472-6947/7/16
7. Guyatt GH, Oxman AD, Kunz R, Atkins D, Brozek J, Vist G, et al. GRADE guidelines: 2. Framing the question and deciding on important outcomes. J Clin Epidemiol 2011;64(4):395-400.
8. Scottish Intercollegiate Guidelines Network. SIGN 100: a handbook for patient and carer representatives. Edinburgh: SIGN; 2008. (SIGN Guideline 100)
9. Falck-Ytter Y, Blümle A, Motschall E, Antes G. Searching for intervention studies in gastroenterology: surprisingly low incremental benefit of Embase. Cochrane Colloquium Abs J 2004. [cited 29 May 2012] Available from http://www.imbi.uni-freiburg.de/OJS/cca/index.php?journal=cca&page=article&op=view&path%5B%5D=2653
10. Bara AI, Milan S, Jones PW. Identifying asthma RCTs with Medline and Embase. Cochrane Colloquium Abs J 1995. [cited 29 May 2012] Available from http://www.imbi.uni-freiburg.de/OJS/cca/index.php?journal=cca&page=article&op=view&path%5B%5D=4280
11. Eisinga A, Lefebvre C. Closing the gap - identifying reports of randomized trials in EMBASE for inclusion in CENTRAL. Cochrane Colloquium Abs J 2004. [cited 29 May 2012] Available from http://www.imbi.uni-freiburg.de/OJS/cca/index.php?journal=cca&page=article&op=view&path%5B%5D=2689
12. Paul N, Lefebvre C. Reports of controlled trials from EMBASE: an important contribution to The Cochrane Controlled Trials Register. Cochrane Colloquium Abs J 1998. [cited 29 May 2012] Available from http://www.imbi.uni-freiburg.de/OJS/cca/index.php?journal=cca&page=article&op=view&path%5B%5D=4368
13. Shea B, Bouter L, Peterson J, Boers M, Andersson N, Ortiz Z, et al. External validation of a measurement tool to assess systematic reviews (AMSTAR). PloS One 2007;2(12):e1350.
14. Shea B, Grimshaw J, Wells G, Boers M, Andersson N, Hamel C, et al. Development of AMSTAR: a measurement tool to assess the methodological quality of systematic reviews. BMC Med Res Methodol. 2007; 7: [cited: url:
15. Scottish Intercollegiate Guidelines Network. Methodology Review Group. Report on the review of the method of grading guideline recommendations. Edinburgh: SIGN; 1999.
16. Liddle J, Williamson M, Irwig L. Method for evaluating research and guideline evidence. Sydney: New South Wales Department of Health; 1996.
17. Whiting P, Rutjes A, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, et al. QUADAS-2: A Revised Tool for the Quality Assessment of Diagnostic Accuracy Studies. Ann Intern Med 2011;155(8):529-36.
18. Smith V, Devane D, Begley C, Clarke M. Methodology in conducting a systematic review of systematic reviews of healthcare interventions. BMC Med Res Methodol 2011;11(1):15.