The Benefits of Data-Independent Acquisition in Metabolomics


Spectroscopy Supplements, Special Issues, 03-01-2019
Volume 34
Issue 3
Pages: 26–34

Data-independent acquisition (DIA) makes it possible to re-interrogate data from earlier analyses to determine if new compounds have appeared in a sample previously analyzed. In this interview, Craig Wheelock of the Karolinska Institute discusses the use of DIA in metabolomics.

Interest in data-independent acquisition (DIA) in mass spectrometry is growing in many fields, including in metabolomics. The approach makes it possible to re-interrogate data from earlier analyses, such as when a researcher wishes to determine if newly identified compounds appeared in a sample analyzed previously. In this interview, Craig Wheelock, the head of the Integrative Molecular Phenotyping Laboratory at the Karolinska Institute (Sweden), discusses the use of DIA in metabolomics.

Q. What aspects of metabolomics is your group focusing on?

A: The analytical work in our group has two distinct focuses. We initially developed targeted methods for metabolic profiling of lipid mediators, such as eicosanoids in respiratory disease. A few years ago, we naively decided to expand into metabolomics, but quickly realized that there are numerous analytical challenges in performing a metabolomics experiment. Therefore, like many other laboratories, we decided to start from scratch and construct our own workflow. The primary aim of our method was to acquire as much high-quality data as possible in a single experiment.

The goal was to then use these metabolite data for omics integration models in systems medicine studies. We wanted a long list of accurately identified metabolites, while simultaneously reducing the use of putative identifications. We were concerned that incorrect metabolite identifications would be problematic for our integrative modeling efforts, and lead to inaccurate interpretation of the observed biology. The last few years have been spent on developing our metabolomics method, which was finally published last year (1). Our focus on metabolite annotation has led us to be strong proponents of the efforts of the Metabolomics Standards Initiative to provide clear criteria on the accuracy of metabolite identification (2).

Q. What are the main challenges for chromatographers engaged in metabolomics at the moment?

A: We are currently using liquid chromatography (LC) for our metabolomics work in biofluids, primarily urine and blood. When developing an LC method for metabolomics, there is clearly a trade-off between separation and throughput. Ideally, we would have three-hour-long gradients using two-dimensional (2D) nanoflow methods. This would optimize our metabolite separation, which would greatly help in unequivocal metabolite identification. Run times of this length are not feasible for our applications, and there is a question of the long-term robustness and retention time stability of such an approach. As with most methods, our current approach represents a compromise. The field appears to have currently coalesced around 15–20 min gradients. However, there is a strong case to be made for short, fast methods, on the order of 2–3 min, using microbore technologies.

An additional obstacle is the variability associated with hydrophilic-interaction chromatography (HILIC). It has been challenging to obtain reproducible retention times with HILIC, especially for larger studies. However, while this has historically been an issue, new HILIC columns have greatly improved in performance. We have successfully used HILIC for the analysis of more than one thousand samples in a single study.

In many ways, gas chromatography (GC) offers several advantages over LC approaches, including extremely consistent retention times, robustness, reduced ion suppression, and excellent spectral libraries. Conversely, analysis of samples using GC generally requires a derivatization step, and the compounds need to be thermally stable. In order to capture the structural diversity of a metabolome, multiple methods are necessary. There is, unfortunately, no single analytical platform capable of simultaneously acquiring a metabolome. I therefore see the current optimal solution as involving multiple methods, including both LC- and GC-based approaches.

Q. You recently developed a method using liquid chromatography–high-resolution mass spectrometry (LC–HRMS) for metabolite identification using a data-independent acquisition (DIA) approach. What was the aim of this research?

A: The aim of this research (1,3) was to develop a method suitable for our laboratory goals, as I noted previously. We wanted to be able to accurately identify as many metabolites as possible in a single acquisition. Using DIA provided us with data acquired at three different collision energies (0, 10, and 30 eV), which greatly helps with compound identification.

We are able to identify metabolites based upon our in-house library of standards. In addition, the DIA approach enables the simultaneous acquisition of targeted and nontargeted data. We can then go through our nontargeted data, and mine it for interesting metabolites that are not included in our database. If we want to conclusively identify a new metabolite, we can acquire the standard, add it to our database, and reprocess the data to characterize the compound. One of the primary challenges for us with the analysis of the nontargeted data is performing metabolite identification and annotation across thousands of samples in a single study.

Q. What is DIA? What benefits does it offer?

A: To improve metabolite identification, and reduce the requirement for multiple analytical runs for structural confirmation, two different tandem mass spectrometry (MS/MS) strategies have been implemented: with selection of the precursor ion (data-dependent acquisition [DDA]), and without selection of the precursor ion (DIA). DIA-based MS generates MS/MS spectra containing a mixed population of product ions together with their precursor ions, and the extracted ion chromatogram (EIC) of each product ion needs to be mapped to its parent compound.

DIA approaches have been successfully used to conduct multiple fragmentation experiments in a single acquisition. One useful application of the DIA approach is to identify coeluting isobaric compounds, where DIA data are combined with software deconvolution algorithms that merge precursor ions from low-energy experiments and product ions from high-energy experiments. An advantage of the current DIA approach is the concurrent collection of full scan data, enabling identification of metabolites not included in the database. Our data acquisition strategy enables a simultaneous mixture of database-dependent targeted and nontargeted metabolomics in combination with improved accuracy in metabolite identification, increasing the quality of the biological information acquired in a metabolomics experiment.
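As a rough illustration of what extracting an EIC from DIA data involves, the following Python sketch pulls the chromatogram of one m/z value out of one collision-energy channel. It is a minimal sketch under stated assumptions, not the published workflow: the `Scan` structure, the `extract_eic` function, the 10 ppm tolerance, and all intensities are hypothetical. The m/z values are those quoted later in this interview for the homoserine and threonine pair.

```python
from dataclasses import dataclass


@dataclass
class Scan:
    """One full scan (hypothetical structure, for illustration only)."""
    rt_min: float          # retention time in minutes
    collision_energy: int  # eV; 0 is the low-energy full scan
    peaks: list            # list of (m/z, intensity) tuples


def ppm_error(observed_mz, target_mz):
    """Mass error of an observed m/z relative to a target, in parts per million."""
    return (observed_mz - target_mz) / target_mz * 1e6


def extract_eic(scans, target_mz, collision_energy, tol_ppm=10.0):
    """Return [(rt, summed intensity)] for target_mz in one collision-energy channel.

    tol_ppm is an assumed tolerance, not a value from the interview.
    """
    eic = []
    for scan in scans:
        if scan.collision_energy != collision_energy:
            continue  # keep only scans from the requested energy channel
        intensity = sum(
            i for mz, i in scan.peaks if abs(ppm_error(mz, target_mz)) <= tol_ppm
        )
        eic.append((scan.rt_min, intensity))
    return eic


# Made-up scans cycling through the three collision energies (0, 10, and 30 eV).
scans = [
    Scan(1.00, 0, [(118.0509, 1000.0), (200.0000, 50.0)]),
    Scan(1.00, 10, [(74.0248, 400.0), (118.0509, 300.0)]),
    Scan(1.00, 30, [(55.0189, 150.0)]),
    Scan(1.05, 0, [(118.0510, 1200.0)]),
    Scan(1.05, 10, [(74.0249, 500.0)]),
    Scan(1.05, 30, [(55.0300, 90.0)]),  # outside the tolerance, so not counted
]

precursor_eic = extract_eic(scans, 118.0509, 0)
threonine_eic = extract_eic(scans, 74.0248, 10)
homoserine_eic = extract_eic(scans, 55.0189, 30)
```

Because every scan is a full scan, the same stored data can be queried again later for an m/z value that was not of interest at acquisition time, which is the re-interrogation benefit described above.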


Q. What were the main analytical challenges you had to overcome in this project, and how did you overcome them?

A: A lot of the work performed for this study was time-consuming, and rather repetitive. The characterization of more than four hundred standards to create the in-house library and measure the ion ratios took a very long time. The development of custom standard libraries is not efficient, and it does not make sense for laboratories to do this independently. Happily, there are now multiple metabolite libraries available that have the advantage of being well organized, and all metabolites have known solubilities.

Q. What is novel about your approach?

A: The aim of our work was to establish a comprehensive analytical workflow for the application of LC–HRMS to nontargeted metabolomics with a high level of accuracy in metabolite identification. Our application of DIA mode includes three sequential full scans, at 0, 10, and 30 eV collision energies. In the subsequent data analysis, the EIC of any precursor or associated product ion of interest can be extracted from the low- or high-energy scan data. One EIC is chosen for relative quantification of the metabolite (the quantifier ion), and further product ions from the same compound are used as qualifier ions. This approach enables us to potentially distinguish coeluting isobaric pairs (provided that unique fragments can be identified). In addition, we have added an ion ratio approach: the ratios of qualifier-to-quantifier ion intensities are established from analytical standards, and should therefore be preserved when measured in a biological sample, which increases the accuracy of the identification. The same acquired data (0 eV) can be used in parallel for a global metabolite profiling workflow, enabling a combined database-dependent targeted and nontargeted metabolomics experiment. The combination of DIA-based data acquisition with ion ratio confirmation and deconvolution of coeluting isobaric pairs provides a useful method for increasing the accuracy of metabolite identification in a metabolomics experiment.
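The ion ratio idea described above can be sketched in a few lines of Python. This is only an illustration of the principle: the function names, the peak areas, and the ±20% relative tolerance are assumptions made for the example, not values from the interview or from the published method.

```python
def ion_ratio(qualifier_area, quantifier_area):
    """Ratio of qualifier to quantifier peak area for one compound."""
    if quantifier_area <= 0:
        raise ValueError("quantifier area must be positive")
    return qualifier_area / quantifier_area


def ratio_matches(sample_ratio, standard_ratio, rel_tol=0.20):
    """True if the sample ratio lies within +/- rel_tol of the standard's ratio.

    rel_tol=0.20 is an assumed acceptance window, chosen only for illustration.
    """
    return abs(sample_ratio - standard_ratio) <= rel_tol * standard_ratio


# Ratio established from an analytical standard (made-up peak areas).
standard_ratio = ion_ratio(qualifier_area=250.0, quantifier_area=1000.0)

# Two hypothetical biological samples: one consistent with the standard,
# one whose qualifier ion is inflated, for example by a coeluting interference.
consistent = ratio_matches(ion_ratio(540.0, 2000.0), standard_ratio)
inconsistent = ratio_matches(ion_ratio(900.0, 2000.0), standard_ratio)
```

A sample whose ratio falls outside the window would be flagged for closer inspection rather than accepted as a confident identification.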

Q. Can you describe a practical example to illustrate how this approach would benefit the analyst in practice?

A: The strength of the current method was demonstrated in urine, using the homoserine and threonine isobaric pair as an example. We analyzed a clinical cohort of urine samples from asthmatics using our published metabolomics methods. The homoserine and threonine peak areas were then each integrated separately using their specific product ion as a quantifier ion (homoserine, m/z 55.0189 at 30 eV; threonine, m/z 74.0248 at 10 eV), and their combined peak area (homoserine + threonine, m/z 118.0509 at 0 eV) was also integrated. The samples were then stratified by the abundance of the threonine value (obtained from threonine-specific peak integration) and the homoserine value (obtained from homoserine-specific peak integration). Following stratification, the 25% and 75% quantiles of each sample set were selected in an extreme value approach to test for significance. The combined homoserine + threonine integrated peak was not significant (p = 0.2), but the threonine-specific peak was significantly different between the 25% and 75% quantiles (p < 0.0001). In addition, the percentage relative standard deviations (RSDs) decreased in the homoserine- and threonine-specific integrated peaks between the two quantiles relative to the combined peak integration, increasing the precision of the measurement. This metabolite-dependent deconvolution example demonstrates the strength of the current method for increasing the accuracy of metabolite annotation by targeted ion selection, which can have a significant effect upon the observed biological shifts.
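The quantile selection and %RSD calculation mentioned above can be illustrated with the Python standard library. This is one possible reading of the extreme value approach, taking samples at or below the 25% quantile and at or above the 75% quantile. The peak areas are made up, the function names are hypothetical, and the significance test itself is left out because the interview does not say which test was used.

```python
import statistics


def extreme_groups(values):
    """Split values into those at or below Q1 and those at or above Q3."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    low = [v for v in values if v <= q1]
    high = [v for v in values if v >= q3]
    return low, high


def percent_rsd(values):
    """Percentage relative standard deviation: sample SD / mean * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


# Made-up peak areas for one quantifier ion across nine samples.
areas = [10.0, 12.0, 15.0, 18.0, 20.0, 25.0, 30.0, 40.0, 55.0]
low, high = extreme_groups(areas)

rsd_low = percent_rsd(low)
rsd_high = percent_rsd(high)
```

The two groups could then be compared with whichever statistical test suits the study design, and the %RSD values give a simple measure of precision within each group.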

Q. What is the aim of the Metabolomics Standards Initiative?

A: The field of metabolomics is growing rapidly, and it is exciting to see the recognition of the importance of metabolomics in integrative omics studies. However, with this recognition comes a need for increased standardization in the field. The work by the Metabolomics Standards Initiative is an important step in this direction. In addition, repositories, such as MetaboLights (4), are vital for the science. I think that the field would benefit from a set of standardized recommendations on how to perform a metabolomics experiment. An example can be taken from the development of microarray methods. It is now considered essentially obligatory that the Minimum Information About a Microarray Experiment (MIAME) (5,6) standards are followed in an experiment, and that the data are deposited in a database such as the Gene Expression Omnibus (GEO). It would be preferable if journal editors and reviewers insisted that metabolomics experiments follow this same format.

Another helpful movement in metabolomics is the need for defined reference materials. The National Institute of Standards and Technology (NIST) group has been active in developing these materials, characterizing them in ring trials, and making them available to the research community. This can be quite helpful in benchmarking methods against a known metabolic profile in a well-described biofluid. Defined parameters for metabolite identification, public availability of acquired datasets, and clear experimental descriptions will help to expand the field and the utility of metabolomics as an informative platform for understanding metabolism.

Q. What is your group working on next?

A: We are currently expanding upon the metabolomics method we published last year (1). One of the primary aims is to automate the method as much as possible. We have, therefore, formatted the sample preparation for both urine and plasma using an automated liquid handling platform. We can then prepare all samples in 96-well plates for analysis, which reduces both sample handling and preparation time. As part of these efforts, we are focusing on fully annotating the observed urinary metabolome with our current methodology. These efforts include evaluating shifts in observed metabolites associated with glucuronidase and urease treatment, as well as concentration and fractionation steps.

We envision developing a series of methods enabling us to capture different metabolic fractions of the urinary metabolome depending upon the biological question. One of the weaknesses of metabolomics is that it is less sensitive than targeted approaches. There are multiple metabolites, such as lipid mediators and halogenated tyrosine derivatives, that are of extreme interest in respiratory biology, but cannot be detected by our metabolomics methods because of their low endogenous concentrations. We would like to develop a concentration and clean-up step that would provide us with a urinary metabolomics platform to detect these low-abundance compounds. The presence of high-abundance metabolites, such as creatinine, will be a challenge. We are also working on developing high-throughput metabolomics applications. There are multiple challenges associated with these efforts, but there is the potential to offer extremely fast analysis times. We are not there yet, but the field is making major advances, and it will be exciting to see where we are in another 5–10 years.


(1) S. Naz, H. Gallart-Ayala, S.N. Reinke, C. Mathon, R. Blankley, R. Chaleckis, and C.E. Wheelock, Anal. Chem. 89(15), 7933–7942 (2017).

(2) L.W. Sumner et al., Metabolomics 3(3), 211–221 (2007).

(3) R. Chaleckis, S. Naz, I. Meister, and C.E. Wheelock, Methods Mol. Biol. 1730, 45–58 (2018).



(6) A. Brazma, P. Hingamp, J. Quackenbush et al., Nat. Genet. 29(4), 365–371 (2001). doi:10.1038/ng1201-365.

Craig Wheelock is head of the Integrative Molecular Phenotyping Laboratory at the Karolinska Institute (Sweden) and an associate professor within the Centre for Allergy Research (CfA). He is also a distinguished visiting professor of metabolomics at the Gunma Institute for Advanced Research (GIAR) at Gunma University (Japan). He is currently a member of the European Respiratory Society Scientific Events Working Group and board member of the International Metabolomics Society.
