Replicating a challenging study: it's all about sharing the details.

June 27th, 2017

Working on the replication attempt of "The common feature of leukemia-associated IDH1 and IDH2 mutations is a neomorphic enzyme activity converting alpha-ketoglutarate to 2-hydroxyglutarate" by Ward et al. was a challenging and rewarding experience. It forced us as a group to think about the differences between replication and validation, and about how often methods sections lack details that can affect replication. Our Replication Study is published in eLife.

The most challenging section to replicate was the metabolomics section, due to the inherent complexity of metabolomics studies. Methodology, data processing and normalization are often specific to each lab. In our case, the authors cited additional method papers, a practice we found helpful and often use ourselves. This way the method section doesn’t need to include every detail, since the published method paper includes all the information necessary for replication. Another helpful practice is to include detailed methods in supplementary method files. These workarounds are only needed because authors often feel that word limits prevent them from including everything in the main text.

For our replication attempt we deposited all data into the NIH Metabolomics Workbench, a public metabolomics data repository that also includes detailed study design and methods sections. We hope that this, in addition to the Registered Report and Replication Study, will provide all details needed for others to build upon our work.

The field of metabolomics presents unique challenges for reproducibility. It is a young field and the reporting standards are still evolving. The community has published guidelines for reporting metabolomic findings [1], but unfortunately, they have still not been adopted by many members of the community.

Metabolomics reproducibility faces four main challenges: compound identification, methodology, data processing and statistical analysis. The first three are unique to the field of metabolomics. Compound identification is the paramount issue for reproducibility. Many labs report compound identifications that have not been rigorously verified. Each lab is ultimately responsible for its own annotations, but lab size and experience with metabolomics can affect the quality of those annotations. Large labs have extensive chemical standard inventories and in-house libraries of MS/MS fragmentation spectra, retention times and m/z values that are often not available to smaller labs. To help bridge the gap, smaller labs can use publicly available MS/MS spectra databases as well as the numerous published m/z and retention time methods and libraries.

Yet even with these tools available, many published reports lack basic information about the annotations that are central to their findings. Studies should be required to provide, at a minimum, MS/MS spectra of their experimental compound compared with those of a chemical standard, or at least with publicly available spectra, to confirm their annotation. Instead, many studies rely on accurate mass annotations alone, often choosing the compound they prefer over others with the same accurate mass, and reporting experimental findings as that compound.
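To make the accurate-mass problem concrete, here is a minimal Python sketch of annotation by m/z alone. It is an illustration, not the workflow used in the original study or in our replication: the candidate list, the 5 ppm tolerance and the choice of the negative-ion [M-H]- adduct are all assumptions made for the example. It searches a small hypothetical library for an observed m/z and shows that 2-hydroxyglutarate cannot be told apart from other compounds with the formula C5H8O5.

```python
# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727646  # proton mass (Da), removed to form the [M-H]- ion

# Hypothetical candidate library (an assumption for this example).
# The first three entries are isomers sharing the formula C5H8O5.
CANDIDATES = {
    "2-hydroxyglutarate": {"C": 5, "H": 8, "O": 5},
    "3-hydroxyglutarate": {"C": 5, "H": 8, "O": 5},
    "citramalate": {"C": 5, "H": 8, "O": 5},
    "alpha-ketoglutarate": {"C": 5, "H": 6, "O": 5},
}


def neutral_mass(formula):
    """Monoisotopic mass of the neutral molecule."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def mz_negative(formula):
    """Theoretical m/z of the singly charged [M-H]- ion."""
    return neutral_mass(formula) - PROTON


def match_by_accurate_mass(observed_mz, tolerance_ppm=5.0):
    """Return every candidate whose theoretical m/z lies within the tolerance."""
    hits = []
    for name, formula in CANDIDATES.items():
        theoretical = mz_negative(formula)
        error_ppm = (observed_mz - theoretical) / theoretical * 1e6
        if abs(error_ppm) <= tolerance_ppm:
            hits.append(name)
    return sorted(hits)


if __name__ == "__main__":
    # An observed m/z close to the [M-H]- ion of C5H8O5.
    print(match_by_accurate_mass(147.0299))
```

With this library, an observed m/z of 147.0299 matches all three C5H8O5 isomers, and tightening the tolerance does not help because their theoretical masses are identical. Choosing among them requires orthogonal evidence such as retention time or MS/MS fragmentation compared against a chemical standard.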

Metabolomics data acquisition and processing methods also present challenges to reproducibility because they are inherently complex and vary by lab. In addition, authors are often not required to deposit metabolomics data in public repositories, or are unsure where to do so, even though data deposition is standard practice in other omics fields. Without these data, it is nearly impossible for others to replicate the original authors' findings.

While many of the issues around the reproducibility of these methods can be solved by detailed method sections, this raises interesting questions. Are the biological phenotypes of interest strong enough to be reproducible regardless of the mass spectrometry method or data processing used? Should they be? For this replication attempt we used the same methods as the authors, but from the numerous papers published on 2-hydroxyglutarate (2HG), it is clear that the biological phenotype has been reproduced using many different methods. This shows that, when done correctly, metabolomic findings are reproducible across numerous labs using original or adapted methods.


1. Sumner, L. W. et al. Proposed minimum reporting standards for chemical analysis: Chemical Analysis Working Group (CAWG) Metabolomics Standards Initiative (MSI). Metabolomics 3, 211-221, doi:10.1007/s11306-007-0082-2 (2007).
