How many studies are needed for a meta-analysis?
Figure 5. Model parameter estimation with the Monte-Carlo error propagation method. (A) Study-level data taken from the ATP release meta-analysis. (B) Assuming a sigmoidal model, parameters were estimated using the Fit Model MetaLab module by randomly sampling data from distributions defined by the study-level data. Model parameters were estimated for each set of sampled data. (C) Final model using parameters estimated from simulations. (D) Distributions of parameters estimated for a given dataset are unimodal and symmetrical.
It is critical for reviewers to ensure the data are consistent with the model, such that the estimated parameters sufficiently capture the information conveyed in the underlying study-level data. In general, reliable model fittings are characterized by normal parameter distributions (Figure 5D) and a high goodness of fit, as quantified by R².
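To make the procedure concrete, the following is a minimal sketch of Monte-Carlo error propagation for model fitting. It is written in Python for illustration (MetaLab itself is a MATLAB toolbox), and the sigmoid form, data values, and starting parameters are hypothetical stand-ins rather than the actual ATP release dataset.

```python
# Monte-Carlo error propagation for model fitting: a minimal sketch.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)

def sigmoid(t, top, t50, slope):
    """Sigmoidal (logistic) model of outcome vs. time (assumed form)."""
    return top / (1.0 + np.exp(-(t - t50) / slope))

# Study-level data: time point, mean outcome, standard error (hypothetical).
t  = np.array([1, 2, 5, 10, 20, 40])
mu = np.array([0.8, 2.1, 5.0, 8.2, 9.6, 9.9])
se = np.array([0.3, 0.5, 0.9, 1.0, 0.8, 0.7])

n_sim, fits = 1000, []
for _ in range(n_sim):
    # Sample a synthetic dataset from the distributions defined by the
    # study-level means and errors, then re-fit the model.
    y = rng.normal(mu, se)
    try:
        p, _ = curve_fit(sigmoid, t, y, p0=[10, 5, 2], maxfev=5000)
        fits.append(p)
    except RuntimeError:
        pass  # discard simulations that fail to converge

fits = np.array(fits)
for name, col in zip(["top", "t50", "slope"], fits.T):
    print(f"{name}: {col.mean():.2f} +/- {col.std(ddof=1):.2f}")
# Unimodal, symmetric parameter distributions (cf. Figure 5D) indicate a
# reliable fit; skewed or multimodal distributions warrant caution.
```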
The advantage of using the Monte-Carlo approach is that it works as a black box procedure that does not require complex error propagation formulas, thus allowing correlated and independent parameters to be handled without additional consideration.
The absolute effect size, computed as a mean outcome or absolute difference from baseline, is the simplest measure, is independent of variance, and retains information about the context of the data (Baguley, 2009). However, the use of absolute effect size requires authors to report on a common scale or provide conversion parameters.
In cases where a common scale is difficult to establish, a scale-free measure, such as a standardized, normalized, or relative measure, can be used.
Standardized mean differences, such as Hedges' g or Cohen's d, report the outcome as the size of the difference between the means of the experimental and control groups relative to the overall variance (the pooled and weighted standard deviation of the combined experimental and control groups).
The standardized mean difference, in addition to odds or risk ratios, is widely used in meta-analysis of clinical studies (Vesterinen et al., 2014). However, the standardized measure is rarely used in basic science, since study outcomes are commonly a defined measure, sample sizes are small, and variances are highly influenced by experimental and biological factors.
Other measures that are better suited for basic science are the normalized mean difference, which expresses the difference between the outcome and baseline as a proportion of the baseline (alternatively called the percentage difference), and the response ratio, which reports the outcome as a proportion of the baseline.
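These effect size measures are simple to compute directly. The following sketch implements them, assuming reported two-group summary statistics; the numerical values are hypothetical.

```python
# Common effect-size measures: a sketch assuming two-group summary
# statistics (means, SDs, sample sizes) from a single study.
import numpy as np

def hedges_g(m1, s1, n1, m2, s2, n2):
    """Standardized mean difference with small-sample (Hedges) correction."""
    sp = np.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp               # Cohen's d
    j = 1 - 3 / (4 * (n1 + n2) - 9)  # small-sample correction factor
    return j * d

def normalized_mean_difference(outcome, baseline):
    """Difference from baseline as a proportion of baseline (percentage difference)."""
    return (outcome - baseline) / baseline

def response_ratio(outcome, baseline):
    """Outcome as a proportion of baseline."""
    return outcome / baseline

# Example: treated vs. control group from one (hypothetical) study.
print(hedges_g(8.2, 1.5, 6, 5.1, 1.2, 6))    # ~2.1
print(normalized_mean_difference(8.2, 5.1))  # ~0.61, i.e., a 61% increase
print(response_ratio(8.2, 5.1))              # ~1.61
```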
All discussed measures have been included in MetaLab (Table 2). The goal of any meta-analysis is to provide an outcome estimate that is representative of all study-level findings. One important feature of meta-analysis is its ability to incorporate information about the quality and reliability of the primary studies by weighting larger, better-reported studies more heavily. The two quantities of interest are the overall estimate and the measure of the variability in this estimate.
The choice of a weighting scheme dictates how study-level variances are pooled to estimate the variance of the weighted mean. The weighting scheme thus significantly influences the outcome of a meta-analysis and, if poorly chosen, risks over-weighting less precise studies and generating a less valid, non-generalizable outcome. Thus, the notion of defining an a priori analysis protocol has to be balanced with the need to ensure that the dataset is compatible with the chosen analytic strategy, which may be uncertain prior to data extraction.
We provide strategies to compute and compare different study-level and global outcomes and their variances.
To generate valid estimates of cumulative knowledge, studies are weighted according to their reliability. This conceptual framework, however, deteriorates if the reported measures of precision are themselves flawed. The most commonly used measure of precision is the inverse variance, which is a composite measure of total variance and sample size, such that studies with larger sample sizes and lower experimental errors are deemed more reliable and weighted more heavily.
Inverse variance weighting schemes are valid when (i) the sampling error is random and (ii) the reported effects are homoscedastic, i.e., have comparable variances across the range of effect magnitudes. When assumptions (i) or (ii) are violated, sample-size weighting can be used as an alternative. Despite sample size and sample variance being such critical parameters in the estimation of the global outcome, they are often prone to deficient reporting practices. Additionally, many assays used in basic research have uneven error distributions, such that the variance component arising from experimental error depends on the magnitude of the effect (Bittker and Ross). Such uneven error distributions will lead to biased weighting that does not reflect true precision in measurement.
Fortunately, the standard error and standard deviation have characteristic properties that can be assessed by the reviewer to determine whether inverse variance weights are appropriate for a given dataset.
Since the standard error is estimated as SE = SD/√n, the reported standard error is expected to be approximately inversely proportional to the square root of the study-level sample size n_i. In contrast, since the total observed study-level sample variance is the sum of natural variability (assumed to be constant for a phenomenon) and random error, no relationship is expected between reported standard deviations and sample sizes.
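These expectations translate directly into a screening step. Below is a minimal sketch, using hypothetical reported values, of how standard errors and standard deviations can be checked against sample sizes before committing to inverse-variance weights.

```python
# Diagnostics for study-level error measures: standard errors should scale
# as SD/sqrt(n), while standard deviations should show no systematic
# relationship with n. All values here are hypothetical.
import numpy as np
from scipy.stats import spearmanr

n  = np.array([3, 4, 5, 6, 8, 10, 12, 16])
sd = np.array([2.1, 1.8, 2.4, 2.0, 2.2, 1.9, 2.3, 2.1])  # ~constant
se = sd / np.sqrt(n)

rho_se, _ = spearmanr(se, 1 / np.sqrt(n))
rho_sd, _ = spearmanr(sd, n)
print(f"SE vs 1/sqrt(n): rho = {rho_se:.2f} (expected strongly positive)")
print(f"SD vs n:         rho = {rho_sd:.2f} (expected ~0)")
# A weak SE vs 1/sqrt(n) relationship suggests that inverse-variance
# weights would not reflect true precision in the dataset.
```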
These assumptions can be tested by correlation analysis, as in the sketch above, and used to inform the reviewer about the reliability of the study-level uncertainty measures. In the case of the OB [ATP]ic dataset, for example, lower variances are not associated with higher precision, and inverse variance weighting is therefore not appropriate. Sample sizes are also frequently misrepresented in the basic sciences, as experimental replicates and repeated experiments are often incorrectly reported interchangeably as sample sizes (Vaux et al., 2012).
Repeated independent experiments refer to the number of randomly sampled observations, while replicates refer to repeated measurements of a sample from one experiment, taken to improve measurement precision. Statistical inference theory assumes random sampling, which is satisfied by independent experiments but not by replicate measurements. Misrepresenting replicates as the sample size may artificially inflate the apparent reliability of results.
While this is difficult to identify, poor reporting may be reflected in the overall quality score of a study.
Figure 6. Assessment of study-level outcomes. (A,B) Reliability of study-level error measures. (C,D) Distributions of study-level outcomes. Heterogeneity was quantified by the Q, I², and H² heterogeneity statistics.
The inverse variance is the most common measure of precision, representing a composite measure of total variance and sample size. Widely used weighting schemes based on the inverse variance are the fixed effect and random effects meta-analytic models.
Study-level estimates for a fixed effect or random effects model are weighted using the inverse variance:

w_i = 1/σ_i² (fixed effect) or w_i* = 1/(σ_i² + τ²) (random effects),

where σ_i² is the study-level variance and τ² is the between-study variance. In practice, random effects models are favored over the fixed effect model, due to the prevalence of heterogeneity in experimental methods and biological outcomes. Sample-size weighting is preferred in cases where variance estimates are unavailable or unreliable. Under this weighting scheme, study-level sample sizes are used in place of inverse variances as weights.
The sampling error is then unaccounted for; however, since sampling error is random, larger sample sizes will effectively average out the error and produce more dependable results.
This is contingent on reliable reporting of sample sizes, which is difficult to assess and can be erroneous, as detailed above. While sample-size weighting is less affected by sampling variance, the performance of this estimator depends on the availability of studies (Marin-Martinez and Sanchez-Meca, 2010). When variances are reliably reported, sample-size weights should roughly correlate with inverse variance weights under the fixed effect model.
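A compact sketch of the three weighting schemes follows. The study-level effects, variances, and sample sizes are hypothetical, and the between-study variance is estimated with the standard DerSimonian-Laird moment estimator.

```python
# Fixed-effect, random-effects, and sample-size weighted estimates: a sketch.
import numpy as np

y = np.array([1.2, 0.8, 1.5, 2.0, 1.1])       # study-level effects
v = np.array([0.10, 0.05, 0.20, 0.40, 0.08])  # study-level variances
n = np.array([10, 18, 6, 4, 14])              # study-level sample sizes

# Fixed effect: w_i = 1/v_i
w_fe = 1 / v
mu_fe = np.sum(w_fe * y) / np.sum(w_fe)

# Random effects (DerSimonian-Laird): w_i* = 1/(v_i + tau2)
q = np.sum(w_fe * (y - mu_fe) ** 2)
c = np.sum(w_fe) - np.sum(w_fe ** 2) / np.sum(w_fe)
tau2 = max(0.0, (q - (len(y) - 1)) / c)
w_re = 1 / (v + tau2)
mu_re = np.sum(w_re * y) / np.sum(w_re)

# Sample-size weighting: w_i = n_i (used when variances are unreliable)
mu_n = np.sum(n * y) / np.sum(n)

print(f"fixed effect:   {mu_fe:.3f} (SE {np.sqrt(1 / np.sum(w_fe)):.3f})")
print(f"random effects: {mu_re:.3f} (SE {np.sqrt(1 / np.sum(w_re)):.3f}, tau2={tau2:.3f})")
print(f"sample-size weighted: {mu_n:.3f}")
```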
One important consideration the reviewer should attend to is the normality of the study-level effect distributions assumed by most meta-analytic methods. Non-parametric methods that do not assume normality are available but are more computationally intensive and less accessible to non-statisticians (Karabatsos et al.).
The performance of parametric meta-analytic methods has been shown to be robust to non-normally distributed effects (Kontopantelis and Reeves, 2012). However, this robustness is achieved by deriving artificially high estimates of heterogeneity for non-normally distributed data, resulting in conservatively wide confidence intervals and severely underpowered results (Jackson and Turner, 2017). Therefore, it is prudent to characterize the underlying distribution of study-level effects and perform transformations to normalize distributions, to preserve the inferential integrity of the meta-analysis.
Graphical approaches, such as the histogram, are commonly used to assess the distribution of data; in a meta-analysis, however, they can misrepresent the true distribution of effect sizes, which may differ due to the unequal weights assigned to each study. To address this, we can use a weighted histogram to evaluate effect size distributions (Figure 6). A weighted histogram can be constructed by first binning studies according to their effect sizes. Each bin is then assigned a weighted frequency, calculated as the sum of study-level weights within the given bin. The sums of weights in each bin are then normalized by the sum of all weights across all bins. If the distribution is found to deviate from normality, the most common explanations are that (i) the distribution is skewed due to inconsistencies between studies, (ii) subpopulations exist within the dataset, giving rise to multimodal distributions, or (iii) the studied phenomenon is not normally distributed. The sources of inconsistencies and multimodality can be explored during the analysis of heterogeneity (e.g., through subgroup and meta-regression analyses).
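A minimal sketch of the weighted histogram construction described above, with hypothetical effects and weights:

```python
# Weighted histogram of effect sizes: bin studies by effect size, sum the
# study-level weights in each bin, and normalize by the total weight.
import numpy as np

effects = np.array([0.9, 1.1, 1.3, 1.2, 3.1, 0.8, 1.0, 2.9])  # hypothetical
weights = np.array([5.0, 8.0, 2.0, 4.0, 1.0, 6.0, 7.0, 1.5])  # hypothetical

counts, edges = np.histogram(effects, bins=5, weights=weights)
freq = counts / weights.sum()

for lo, hi, f in zip(edges[:-1], edges[1:], freq):
    print(f"[{lo:4.2f}, {hi:4.2f}): {'#' * int(round(40 * f))} {f:.2f}")
# A second mode carried by low-weight studies is visible here but would be
# exaggerated in an unweighted histogram.
```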
Skewness may, however, be inherent to the data when values are small, variances are large, and values cannot be negative (Limpert et al., 2001). For sufficiently large sample sizes, the central limit theorem holds that the means of skewed data are approximately normally distributed.
However, due to the common limitation in the number of studies available for meta-analyses, meta-analytic global estimates of skewed distributions are often sensitive to extreme values. In these cases, data transformation can be used to achieve a normal distribution on the logarithmic scale (i.e., a log-normal distribution of the raw data). Since meta-analytic methods typically assume normality, the log transformation is a useful tool for normalizing skewed distributions (Figures 6C–F). In the ATP release dataset, we found that log transformation normalized the data distribution.
However, in the case of the OB [ATP]ic dataset, log transformation revealed a bimodal distribution that was otherwise not obvious on the raw scale.
Data normalization by log transformation allows meta-analytic techniques to maintain their inferential properties. The outcomes synthesized on the logarithmic scale can then be transformed to the original raw scale to obtain asymmetrical confidence intervals which further accommodate the skew in the data.
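The following sketch illustrates synthesis on the log10 scale with back-transformation to the raw scale; the outcomes and log-scale variances are hypothetical.

```python
# Synthesis on the log scale with back-transformation: exponentiating the
# CI bounds yields asymmetric intervals on the raw scale that accommodate skew.
import numpy as np

y_raw = np.array([1.5, 2.0, 3.5, 8.0, 12.0])       # hypothetical skewed outcomes
v_log = np.array([0.01, 0.02, 0.015, 0.03, 0.02])  # variances on log10 scale

y_log = np.log10(y_raw)
w = 1 / v_log
mu = np.sum(w * y_log) / np.sum(w)
se = np.sqrt(1 / np.sum(w))

lo, hi = mu - 1.96 * se, mu + 1.96 * se
print(f"log10 scale: {mu:.2f} [{lo:.2f}, {hi:.2f}]")
print(f"raw scale:   {10**mu:.2f} [{10**lo:.2f}, {10**hi:.2f}] (asymmetric)")
```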
Once the meta-analytic global estimate and its standard error have been computed, reviewers may proceed to construct the confidence intervals (CI). In meta-analyses, the CI conveys information about the significance, magnitude, and direction of an effect, and is used for inference and generalization of an outcome.
Values that do not fall in the range of the CI may be interpreted as significantly different. A theoretical distribution describes the probability of any given possible outcome occurrence for a phenomenon. Extreme outcomes that lie furthest from the mean are known as the tails.
Heavier tails result in larger critical values, which translate into wider confidence intervals, and vice versa. The tails of the z-distribution are independent of the sample size and reflect those expected for a normal distribution, whereas the tails of the t-distribution are heavier when the number of studies is small. In that setting, the t-distribution produces more conservative (wider) CIs, which help ensure that the data are not misleading or misrepresentative when there is limited evidence available.
Importantly, the t-distribution is asymptotically normal and will thus converge to a z-distribution for a sufficiently large number of studies, resulting in similar critical values.
We have implemented the z-distribution and t-distribution CI estimators in MetaLab. The coverage is a performance measure used to determine whether inference made at the study level is consistent with inference made at the meta-analytic level. Coverage that is less than expected for a specified significance level (e.g., below 95% for α = 0.05) indicates that the CIs are too narrow and inference is anti-conservative.
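A short sketch comparing z- and t-based CIs for a hypothetical global estimate:

```python
# z- vs. t-based confidence intervals for a global estimate. With few
# studies, the heavier t tails give wider, more conservative CIs; as the
# number of studies grows, the two converge.
import numpy as np
from scipy.stats import norm, t

mu, se, k = 1.25, 0.20, 6  # global estimate, its SE, number of studies (hypothetical)
z_crit = norm.ppf(0.975)
t_crit = t.ppf(0.975, df=k - 1)

print(f"z CI: [{mu - z_crit * se:.2f}, {mu + z_crit * se:.2f}] (critical value {z_crit:.2f})")
print(f"t CI: [{mu - t_crit * se:.2f}, {mu + t_crit * se:.2f}] (critical value {t_crit:.2f})")
```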
Overall, the performance of a meta-analysis is heavily influenced by the choice of weighting scheme and data transformation (Figure 7). This is especially evident in smaller datasets, such as our OB [ATP]ic example, where both the global estimates and the confidence intervals are dramatically different under different weighting schemes (Figure 7A). Working with larger datasets, such as ATP release kinetics, somewhat reduces the influence of the assumed model (Figure 7B).
However, normalizing the data distribution by log transformation produces much more consistent outcomes under different weighting schemes for both datasets, regardless of the number of available studies (Figures 7A,B, log10 synthesis).
Figure 7. Comparison of global effect estimates using different weighting schemes.
Heterogeneity refers to inconsistency between studies. A large part of conducting a meta-analysis involves quantifying and accounting for sources of heterogeneity that may compromise the validity of the meta-analysis. Basic research meta-analytic datasets are expected to be heterogeneous because (i) basic research literature searches tend to retrieve more studies than clinical literature searches and (ii) experimental methodologies used in basic research are more diverse and less standardized than in clinical research.
The presence of heterogeneity may limit the generalizability of an outcome due to the lack of study-level consensus. Nonetheless, exploration of heterogeneity sources can be insightful for the field in general, as it can identify biological or methodological factors that influence the outcome.
Higgins and Thompson emphasized that a heterogeneity metric should be (i) dependent on the magnitude of heterogeneity, (ii) independent of the measurement scale, (iii) independent of sample size, and (iv) easily interpretable (Higgins and Thompson, 2002). Regrettably, the most commonly used test of heterogeneity is Cochran's Q test (Borenstein et al., 2009), which has been repeatedly shown to have undesirable statistical properties (Higgins et al., 2003).
Nonetheless, we will introduce it here, not because of its widespread use, but because it is an intermediary statistic used to obtain more useful measures of heterogeneity, H² and I². Additionally, the Q_total statistic is not a measure of the magnitude of heterogeneity, due to its inherent dependence on the number of studies. To address this limitation, the H² heterogeneity statistic was developed as the relative excess of Q_total over its degrees of freedom (df):

H² = Q_total/df, where df = k − 1 for k studies.
H² is independent of the number of studies in the meta-analysis and is indicative of the magnitude of heterogeneity (Higgins and Thompson, 2002). The corresponding confidence intervals for H² are constructed on the log scale as

exp{2[ln H ± z_(1−α/2) SE(ln H)]},

where SE(ln H) is computed as described in Higgins and Thompson (2002). Intervals that do not overlap with 1 indicate significant heterogeneity. A more easily interpretable measure of heterogeneity is the I² statistic, which is a transformation of H²:

I² = (H² − 1)/H² × 100%.
Like H², I² provides a measure of the magnitude of heterogeneity. However, several limitations have been noted for the I² statistic. In cases of excessive heterogeneity, even if heterogeneity is partially explained through subgroup analysis or meta-regression, the residual unexplained heterogeneity may still be sufficient to keep I² near saturation (100%). I² will then fail to convey the decline in overall heterogeneity, while the H² statistic, which has no upper limit, allows changes in heterogeneity to be tracked more meaningfully.
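The heterogeneity statistics defined above can be computed in a few lines. A sketch with hypothetical inputs:

```python
# Q, H^2 and I^2 heterogeneity statistics, following the definitions above.
import numpy as np
from scipy.stats import chi2

y = np.array([1.2, 0.8, 1.5, 2.0, 1.1, 0.4])        # hypothetical effects
v = np.array([0.10, 0.05, 0.20, 0.40, 0.08, 0.12])  # hypothetical variances

w = 1 / v
mu = np.sum(w * y) / np.sum(w)
q = np.sum(w * (y - mu) ** 2)        # Q_total
df = len(y) - 1
h2 = q / df                          # H^2 = Q_total / df
i2 = max(0.0, (h2 - 1) / h2) * 100   # I^2 = (H^2 - 1)/H^2, floored at 0

print(f"Q = {q:.2f}, p = {chi2.sf(q, df):.3f}")
print(f"H2 = {h2:.2f}, I2 = {i2:.1f}%")
```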
Of the three heterogeneity statistics described (Q_total, H², and I²), we recommend H², as it best satisfies the criteria for a heterogeneity statistic defined by Higgins and Thompson (2002).
Bias refers to distortions in the data that may result in misleading meta-analytic outcomes.
In the presence of bias, meta-analytic outcomes are often contradicted by higher-quality, large sample-sized studies (Egger et al., 1997). Sources of observed bias include publication bias; methodological inconsistencies and quality; data irregularities due to poor quality design, inadequate analysis, or fraud; and availability or selection bias (Egger et al., 1997).
At the level of study identification and inclusion for meta-analysis, systematic searches are preferred over rapid review search strategies, as narrow search strategies may omit relevant studies. Withholding negative results is also a common source of publication bias, which is further exacerbated by the small-study effect (the phenomenon by which smaller studies produce results with larger effect sizes than larger studies) (Schwarzer et al., 2015).
By extension, smaller studies that produce negative results are less likely to be published than larger studies that produce negative results. Identifying all sources of bias is unfeasible; however, tools are available to estimate the extent of bias present.
Funnel plots. Funnel plots have been widely used to assess the risk of bias and examine meta-analysis validity (Light and Pillemer, 1984; Borenstein et al., 2009). The logic underlying the funnel plot is that, in the absence of bias, studies are symmetrically distributed around the fixed effect size estimate, because sampling error is random. Moreover, precise study-level estimates are expected to be more consistent with the global effect size than less precise studies, where precision is inversely related to the study-level standard error.
When bias is present, study-level effects will be asymmetrically distributed around the global fixed-effect estimate. In the past, funnel plot asymmetries were attributed solely to publication bias; however, they should be interpreted more broadly as indicating a general presence of bias or heterogeneity (Sterne et al., 2011). It should be noted that rapid reviews (Figure 8A, left) are far more subject to bias than systematic reviews (Figure 8A, right), due to the increased likelihood of relevant study omission.
Figure 8. Analysis of heterogeneity and identification of influential studies. (B) OB [ATP]ic data were evaluated using a Baujat plot, and inconsistent and influential studies were identified in the top right corner of the plot (arrows). (C,D) Effect of single-study exclusion (C) and cumulative sequential exclusion of the most inconsistent studies (D). Left: heterogeneity statistics, H² (red line) and I² (black line). Arrows: influential studies contributing to heterogeneity (same as those identified on the Baujat plot).
Inconsistencies between studies can arise for a number of reasons, including methodological or biological heterogeneity (Patsopoulos et al., 2008). Since accounting for heterogeneity is an essential part of any meta-analysis, it is of interest to identify influential studies that may contribute to the observed heterogeneity.
Baujat plot. The Baujat plot was proposed as a diagnostic tool to identify the studies that contribute most to heterogeneity and influence the global outcome (Baujat et al., 2002). The graph illustrates the contribution of each study to heterogeneity (Q_i^inf) on the x-axis and its influence on the global outcome on the y-axis. Studies that strongly influence the global outcome and contribute to heterogeneity appear in the upper right corner of the plot (Figure 8B).
This approach has been used to identify outlying studies in the past (Anzures-Cabrera and Higgins, 2010). Single-study exclusion sensitivity.
Single-study exclusion analysis assesses the sensitivity of the global outcome and heterogeneity to exclusion of single studies. The global outcomes and heterogeneity statistics are computed for a dataset with a single omitted study; single study exclusion is iterated for all studies; and influential outlying studies are identified by observing substantial declines in observed heterogeneity, as determined by Q total , H 2 , or I 2 , and by significant differences in the global outcome Figure 8C.
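A minimal sketch of single-study (leave-one-out) exclusion under fixed-effect pooling, with hypothetical data in which the last study is deliberately inconsistent:

```python
# Single-study exclusion sensitivity: recompute the pooled estimate and Q
# with each study omitted in turn.
import numpy as np

y = np.array([1.2, 0.8, 1.5, 2.0, 1.1, 3.5])        # last study inconsistent
v = np.array([0.10, 0.05, 0.20, 0.40, 0.08, 0.15])

def pool(y, v):
    w = 1 / v
    mu = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - mu) ** 2)
    return mu, q

mu_all, q_all = pool(y, v)
print(f"all studies: mu = {mu_all:.2f}, Q = {q_all:.1f}")
for i in range(len(y)):
    keep = np.arange(len(y)) != i
    mu_i, q_i = pool(y[keep], v[keep])
    print(f"omit study {i}: mu = {mu_i:.2f}, Q = {q_i:.1f}")
# A large drop in Q (and a shifted mu) when a study is omitted flags it as
# influential, as on a Baujat plot; examine such studies before excluding them.
```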
Influential studies should not be blindly discarded, but rather carefully examined to determine the reason for inconsistency. If a cause for heterogeneity can be identified, such as experimental design flaw, it is appropriate to omit the study from the analysis.
All reasons for omission must be justified and made transparent by reviewers. Cumulative-study exclusion sensitivity. Cumulative-study exclusion sequentially removes studies so as to maximize the decrease in total heterogeneity (Q_total), such that a more homogeneous set of studies, with updated heterogeneity statistics, is obtained with each iteration of exclusion (Figure 8D).
This method was proposed by Patsopoulos et al. (2008). We propose the homogeneity threshold (T_H) as a measure of heterogeneity that can be derived from cumulative-study exclusion sensitivity analysis.
The homogeneity threshold describes the percentage of studies that need to be removed by the maximal Q-reduction criterion before a homogeneous set of studies is achieved. After homogeneity is attained by cumulative exclusion, the global effect generally stabilizes with respect to subsequent study removal. This metric provides information about the extent of inconsistency present in the set of studies, is scale invariant (independent of the number of studies), and is easily interpretable.
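A sketch of cumulative exclusion by the maximal Q-reduction criterion and the resulting homogeneity threshold follows. The stopping rule (Q no longer significant at α = 0.05) and the data are illustrative assumptions.

```python
# Cumulative-study exclusion and the homogeneity threshold T_H.
import numpy as np
from scipy.stats import chi2

y = np.array([1.2, 0.8, 1.5, 2.0, 1.1, 3.5, 0.2])         # hypothetical
v = np.array([0.10, 0.05, 0.20, 0.40, 0.08, 0.15, 0.05])  # hypothetical

def q_stat(y, v):
    w = 1 / v
    mu = np.sum(w * y) / np.sum(w)
    return np.sum(w * (y - mu) ** 2)

idx, removed = list(range(len(y))), 0
while len(idx) > 2 and chi2.sf(q_stat(y[idx], v[idx]), len(idx) - 1) < 0.05:
    # Remove the study whose exclusion maximally reduces Q.
    drop = min(idx, key=lambda i: q_stat(y[[j for j in idx if j != i]],
                                         v[[j for j in idx if j != i]]))
    idx.remove(drop)
    removed += 1
    print(f"removed study {drop}; Q = {q_stat(y[idx], v[idx]):.1f}")

print(f"homogeneity threshold T_H = {100 * removed / len(y):.0f}% of studies")
```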
The purpose of an exploratory analysis is to understand the data in ways that may not be represented by a pooled global estimate. This involves identifying sources of observed heterogeneity related to biological and experimental factors.
Subgroup and meta-regression analyses are techniques used to explore known data groupings defined by study-level characteristics (i.e., covariates). Additionally, we introduce the cluster-covariate dependence analysis, an unsupervised exploratory technique used to identify covariates that coincide well with natural groupings within the data, and the intrastudy regression analysis, which is used to validate meta-regression outcomes.
Natural groupings within the data can be informative and serve as a basis to guide further analysis. Using an unsupervised k-means clustering approach (Lloyd, 1982), we can identify natural groupings within the study-level data and assign cluster memberships to these data (Figure 9A).
Reviewers then have two choices: either proceed directly to subgroup analysis (Figure 9B) or look for covariates that co-cluster with the cluster memberships (Figure 9C). In the latter case, dependencies between cluster memberships and known data covariates can be tested using Pearson's chi-squared test for independence.
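A sketch of this cluster-covariate dependence test follows; the effects, the covariate labels, and the choice of two clusters are hypothetical.

```python
# Cluster-covariate dependence: k-means on study-level effects, then
# Pearson's chi-squared test of cluster membership vs. a categorical covariate.
import numpy as np
from scipy.cluster.vq import kmeans2
from scipy.stats import chi2_contingency

rng = np.random.default_rng(1)
effects = np.concatenate([rng.normal(1.0, 0.2, 12), rng.normal(3.0, 0.3, 10)])
method  = np.array(["A"] * 12 + ["B"] * 10)  # hypothetical covariate

_, labels = kmeans2(effects.reshape(-1, 1), 2, minit="++", seed=2)

# Contingency table: cluster membership vs. covariate level.
table = np.array([[np.sum((labels == c) & (method == m)) for m in ("A", "B")]
                  for c in (0, 1)])
chi2_val, p, _, _ = chi2_contingency(table)
print(table)
print(f"chi2 = {chi2_val:.1f}, p = {p:.3g} (small p: covariate coincides with clusters)")
```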
Covariates that coincide with clusters can be verified by subgroup analysis (Figure 9D). Clustering results should be considered exploratory and warrant further investigation due to several limitations. If subpopulations identified through clustering do not depend on extracted covariates, reviewers risk assigning misrepresentative meaning to these clusters. Moreover, conventional clustering methods always converge to a result, so the data will be partitioned even in the absence of natural data groupings.
Future adaptations of this method might involve using different clustering algorithms (e.g., hierarchical clustering) or independence tests (e.g., the G-test for independence), as well as introducing weighting terms to bias clustering toward reflecting study-level precisions.
Figure 9. Exploratory subgroup analysis. Arrow: most influential covariate. (D) Subgroup analysis of ATP release by recording method.
Subgroup analyses attempt to explain heterogeneity and explore differences in effects by partitioning studies into characteristic groups defined by study-level categorical covariates (Figures 9B,D; Table 5). Subgroup effects are estimated along with corresponding heterogeneity statistics.
To evaluate the extent to which subgroup covariates contribute to the observed inconsistencies, the explained heterogeneity (Q_between) and unexplained heterogeneity (Q_within) can be calculated. Q_within is obtained by summing the heterogeneity within each subgroup, Q_within = Σ_j Q_j, and the explained heterogeneity Q_between is then the difference between total and subgroup heterogeneity:

Q_between = Q_total − Q_within.

The Q_within statistic can be used to test whether any residual heterogeneity is present within the subgroups. The related R²_explained statistic describes the percentage of total heterogeneity that was explained by the covariate and is estimated as

R²_explained = (Q_between/Q_total) × 100%.

To explore the remaining heterogeneity, additional subgroup analysis can be conducted by further stratifying the method A and method B subgroups by other covariates.
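A sketch of this decomposition with hypothetical effects, variances, and subgroup labels:

```python
# Partitioning heterogeneity in subgroup analysis: Q_total, Q_within,
# Q_between, and R^2_explained.
import numpy as np

y = np.array([1.1, 0.9, 1.3, 2.8, 3.2, 3.0])        # hypothetical effects
v = np.array([0.05, 0.08, 0.10, 0.07, 0.06, 0.09])  # hypothetical variances
g = np.array([0, 0, 0, 1, 1, 1])                    # subgroup labels (e.g., method A vs. B)

def q_stat(y, v):
    w = 1 / v
    mu = np.sum(w * y) / np.sum(w)
    return np.sum(w * (y - mu) ** 2)

q_total = q_stat(y, v)
q_within = sum(q_stat(y[g == j], v[g == j]) for j in np.unique(g))
q_between = q_total - q_within
r2 = 100 * q_between / q_total

print(f"Q_total = {q_total:.1f}, Q_within = {q_within:.1f}, Q_between = {q_between:.1f}")
print(f"R2_explained = {r2:.0f}% of heterogeneity explained by the covariate")
```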
However, in many meta-analyses multi-level data stratification may be unfeasible if covariates are unavailable or if the number of studies within subgroups is low. Multiple comparisons. When multiple subgroups are present for a given covariate and the reviewer wishes to investigate the statistical differences between the subgroups, the problem of multiple comparisons should be addressed.
Family-wise error rates compound as the number of subgroup comparisons increases. The Bonferroni correction has been advocated to control for false positive findings in meta-analyses (Hedges and Olkin, 1985), and involves adjusting the significance threshold:

α_adjusted = α/m, where m is the number of comparisons.

Meta-regression attempts to explain heterogeneity by examining the relationship between study-level outcomes and continuous covariates, while incorporating the influence of categorical covariates (Figure 10A).
The main differences between conventional linear regression and meta-regression are that (i) weights are incorporated and (ii) covariates are at the level of the study rather than the individual sample. The generalized meta-regression model is specified as

y_i = β_0 + Σ_j β_j x_ij + ε_i,

where y_i is the outcome of study i, x_ij are study-level covariates, and the β coefficients are estimated using the study-level weights w_i. The residual Q statistic, which captures the dispersion of the studies about the regression line, is calculated as

Q_residual = Σ_i w_i (y_i − ŷ_i)²,

where ŷ_i is the value predicted at x_i by the meta-regression model. Q_residual is analogous to the Q_within computed during subgroup analysis and is used to test the degree of remaining unaccounted heterogeneity. The explained component, Q_model = Q_total − Q_residual, can be used to calculate R²_explained, estimated as

R²_explained = (Q_model/Q_total) × 100%.

Q_model quantifies the amount of heterogeneity explained by the regression model and is analogous to the Q_between computed during subgroup analysis.
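A sketch of weighted meta-regression via weighted least squares, computing Q_residual, Q_model, and R²_explained as defined above; the covariate and outcomes are hypothetical.

```python
# Weighted meta-regression y = b0 + b1*x with inverse-variance weights.
import numpy as np

x = np.array([0, 3, 7, 14, 21, 28])                 # covariate (e.g., day)
y = np.array([0.5, 0.9, 1.6, 2.4, 3.1, 3.8])        # study-level outcomes
v = np.array([0.04, 0.05, 0.06, 0.05, 0.08, 0.07])  # study-level variances
w = 1 / v

X = np.column_stack([np.ones_like(x, dtype=float), x])
W = np.diag(w)
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)  # WLS estimates
y_hat = X @ beta

mu = np.sum(w * y) / np.sum(w)
q_total = np.sum(w * (y - mu) ** 2)
q_residual = np.sum(w * (y - y_hat) ** 2)
q_model = q_total - q_residual

print(f"b0 = {beta[0]:.2f}, b1 = {beta[1]:.3f}")
print(f"Q_model = {q_model:.1f}, Q_residual = {q_residual:.1f}, "
      f"R2_explained = {100 * q_model / q_total:.0f}%")
```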
Figure 10. Meta-regression analysis and validation. (A) Relationship between the osteoblast differentiation day covariate and the intracellular ATP content outcome, investigated by meta-regression analysis. Outcomes are on the log10 scale; meta-regression marker sizes are proportional to weights. Solid red lines: intrastudy regression.
Intrastudy regression analysis. The challenge of interpreting results from a meta-regression is that relationships that exist within studies may not necessarily exist across studies, and vice versa.
Such inconsistencies are known as aggregation bias and in the context of meta-analyses can arise from excess heterogeneity or from confounding factors at the level of the study.
This problem has been acknowledged in clinical meta-analyses (Thompson and Higgins, 2002); however, it cannot be corrected without access to individual patient data. Fortunately, basic research studies often report outcomes at varying predictor levels (for example, outcomes measured at several covariate levels within a single study), which allows within-study (intrastudy) regression estimates to be computed and compared with the interstudy meta-regression estimate.
Similarity in both the magnitude and the sign validates the existence of the relationship and characterizes its strength, while similarity in sign but not magnitude still supports the presence of the relationship but calls for additional experiments to characterize it further. For the OB [ATP]ic dataset, the magnitude of the relationship between osteoblast differentiation day and intracellular ATP concentration was inconsistent between intrastudy and interstudy estimates; however, the estimates were of consistent sign (Figure 10B).
When performed with knowledge and care, exploratory analysis of meta-analytic data has enormous potential for hypothesis generation, cataloging current practices and trends, and identifying gaps in the literature. Nonetheless, we emphasize the inherent limitations of exploratory analyses:
Data dredging. A major pitfall in meta-analyses is data dredging (also known as p-hacking), which refers to searching for significant outcomes only to assign meaning later. While exploring the dataset for potential patterns can identify outcomes of interest, reviewers must be wary of random patterns that can arise in any dataset. Therefore, if a relationship is observed, it should be used to generate hypotheses, which can then be tested on new datasets.
Steps to avoid data dredging involve defining an a priori analysis plan for study-level covariates, limiting exploratory analysis of rapid review meta-analyses and correcting for multiple comparisons. Statistical power. The statistical power reflects the probability of rejecting the null hypothesis when the alternative is true.
Meta-analyses are believed to have higher statistical power than the underlying primary studies; however, this is not always true (Hedges and Pigott, 2001; Jackson and Turner, 2017). Random effects meta-analyses handle data heterogeneity by accounting for between-study variance, but this weakens the inferential properties of the model.
To maintain statistical power that exceeds that of the contributing studies in a random effects meta-analysis, at least five studies are required (Jackson and Turner, 2017). This consequently limits subgroup analyses, which partition studies into smaller groups to isolate covariate-dependent effects.
Thus, reviewers should ensure that groups are not under-represented, to maintain statistical power. Another determinant of statistical power is the expected effect size: a small effect is much more difficult to support with existing evidence than a large one. Thus, if reviewers find that there is insufficient evidence to conclude that a small effect exists, this should not be interpreted as evidence of no effect.
Causal inference. Meta-analyses are not a tool for establishing causal inference. However, several criteria for causality can be investigated through exploratory analyses, including consistency, strength of association, dose-dependence, and plausibility (Weed). For example, consistency, strength of association, and dose-dependence can help establish that the outcome is dependent on the exposure.
However, reviewers are still posed with the challenge of accounting for confounding factors and bias. Therefore, while meta-analyses can explore various criteria for causality, causal claims are inappropriate, and outcomes should remain associative. Meta-analyses of basic research can offer critical insights into the current state of knowledge. In this manuscript, we have adapted meta-analytic methods to basic science applications and provided a theoretical foundation, using OB [ATP] i and ATP release datasets, to illustrate the workflow.
Since the generalizability of any meta-analysis relies on transparent, unbiased, and accurate methodology, the implications of deficient reporting practices and the limitations of meta-analytic methods were discussed. Emphasis was placed on the analysis and exploration of heterogeneity. Additionally, several alternative and supporting methods have been proposed, including a method for validating meta-regression outcomes (intrastudy regression analysis) and a novel measure of heterogeneity (the homogeneity threshold).
The methods we have described here serve as a general framework for comprehensive data consolidation, knowledge-gap identification, evidence-driven hypothesis generation, and informed parameter estimation in computational modeling, which we hope will contribute to meta-analytic outcomes that better inform translational studies, thereby minimizing current failures in translational research.
Both authors contributed to the study conception and design, data acquisition and interpretation and drafting and critical revision of the manuscript.
NM developed MetaLab. Both authors approved the final version to be published. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Ahmed, I. Assessment of publication bias, selection bias, and unavailable data in meta-analyses using individual participant data: a database survey.
Altman, D. Standard deviations and standard errors.
Anzures-Cabrera, J. Graphical displays for meta-analysis: an overview with suggestions for practice. Res. Synth. Methods 1, 66–80.
Baguley, T. Standardized or simple effect size: what should be reported?
Barendregt, J.
Baujat, B. A graphical method for exploring heterogeneity in meta-analyses: application to a meta-analysis of 65 trials.
Bax, L. MIX 2.0, Version 2.
Bittker, J. Cambridge: Royal Society of Chemistry.
Bodin, P. Chronic hypoxia changes the ratio of endothelin to ATP release from rat aortic endothelial cells exposed to high flow.
Borenstein, M. Introduction to Meta-Analysis.
Borenstein, M. Comprehensive Meta-Analysis Version 2. Englewood, CO.
Bramer, W. De-duplication of database search results for systematic reviews in EndNote.
Chowdhry, A. Meta-analysis with missing study-level sample variance data.
Cochrane Collaboration.
Cox, M. Evaluation of measurement uncertainty based on the propagation of distributions using Monte Carlo simulation.
DeLuca, J. Developing a comprehensive search strategy for evidence based systematic reviews. Evid. Based Libr. Inf. Pract.
DerSimonian, R. Meta-analysis in clinical trials. Control. Clin. Trials 7, 177–188.
Ecker, E. Conducting a winning literature search.