Protein Identification in Complex Mixtures: A Comparison of Accurate-Mass Q-TOF and Ion-Trap LC–MS

March 1, 2008
Special Issues

Biomarker discovery and identification by liquid chromatography–mass spectrometry (LC–MS), using both ion-trap and triple-quadrupole instruments, is well established, largely because the technique is extremely rapid. Complex samples may need to be fractionated before LC–MS-MS analysis for their proteins to be identified, which greatly increases the number of analyses required. Three questions then remain open: how to know whether a protein has been identified correctly, how much prior fractionation is needed to reduce complexity to the point where low-abundance proteins can be detected reliably, and how to balance specificity with sensitivity.

This article will explore these questions using two LC–MS approaches: ion trap and quadrupole time-of-flight (Q-TOF). Ion trap, a mainstay in protein identification, has excellent sensitivity but suffers from a lack of specificity, thus resulting in high false-positive rates. Q-TOF is known to have a wider mass range and higher resolution and therefore lower false-positive rates. However, it is only recently that the high mass accuracy, wide dynamic range, enhanced sensitivity, and fast data acquisition capabilities of the latest Q-TOF LC–MS systems have been utilized fully to obtain more confident protein identifications. Because Q-TOF LC–MS has emerged recently as a new technology for biomarker discovery, this article also explores optimization of a Q-TOF LC–MS system to achieve best results.

Experimental

Overview

In this study, human plasma samples were depleted of the major protein components. The samples were digested and fractionated using OFFGEL electrophoresis. LC–MS-MS analyses were then performed by microfluidic-based nanoflow LC coupled to either a high-performance ion-trap or Q-TOF system. Results were processed using Agilent's Spectrum Mill database searching software (Agilent, Palo Alto, California).

Sample Preparation

A human plasma sample from a patient with rheumatoid arthritis was used for this study. The sample was depleted of 14 highly abundant proteins using Agilent's HU-14 immunoaffinity column. After depletion, the sample was buffer-exchanged into an ammonium bicarbonate solution and the protein concentration was measured using a bicinchoninic acid assay (BioRad). After reduction and alkylation (iodoacetamide), the sample was digested with trypsin under denaturing conditions.

Electrophoretic Fractionation

OFFGEL protein fractionation separates proteins or peptides according to their isoelectric point (pI). In contrast to traditional isoelectric focusing, the fractionated sample is delivered in liquid phase, thus facilitating sample analysis by LC–MS. The trypsin-digested sample was separated into 24 fractions over the pH range from 3 to 10. The current was limited to 50 μA, and fractionation was stopped after 50 kVh, which took about 24 h.
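As a rough guide to what each fraction contains, the 24 fractions over pH 3 to 10 can be mapped to pI windows. The sketch below assumes an idealized linear pH gradient with equal-width wells, which the article does not state; the function name and the example pI values are hypothetical.

```python
def offgel_fraction(pi, ph_min=3.0, ph_max=10.0, n_fractions=24):
    """Return the 1-based fraction expected to collect a peptide of the given pI.

    Assumes a perfectly linear pH gradient split into equal-width wells,
    which is an idealization of a real OFFGEL strip.
    """
    if not ph_min <= pi <= ph_max:
        raise ValueError("pI lies outside the pH range of the strip")
    index = int((pi - ph_min) * n_fractions / (ph_max - ph_min)) + 1
    # A pI exactly at ph_max would otherwise fall one past the last well.
    return min(index, n_fractions)


# Width of one fraction in pH units under the linear-gradient assumption.
fraction_width = (10.0 - 3.0) / 24

acidic_fraction = offgel_fraction(5.0)   # hypothetical acidic peptide
basic_fraction = offgel_fraction(7.4)    # hypothetical near-neutral peptide
```

Under these assumptions each fraction spans a little under 0.3 pH units, so peptides whose pI values differ by more than that would be expected to land in different fractions.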

LC–MS-MS Analysis

Approximately 5% of each fraction (8 μL) was analyzed using Agilent's microfluidics-based HPLC-Chip for nanoelectrospray LC–MS connected to Agilent's Q-TOF MS or ion trap MS system. Nanoelectrospray LC–MS is well established as a state-of-the-art technique because of its high sensitivity and low sample consumption. It is most often used for applications in which sample amounts are limited or when there is a need for the analysis of trace-level components in complex mixtures. The HPLC-Chip integrates the sample enrichment columns of a nanoflow LC system with the intricate connections and nanoelectrospray tip on a reusable biocompatible polymer chip, thereby eliminating the traditional fittings, valves, and connections typically required in a nanoelectrospray LC–MS system. Leaks, postcolumn dead volumes, and peak dispersion are nearly eliminated, resulting in better-defined peaks and improved separations (1).

Database Searches

Database searches help to speed proteomics data interpretation and review. In this study, protein database searches were carried out with Agilent's Spectrum Mill Proteomics Workbench software. The software automatically assesses MS-MS spectral quality based upon sequence tag length and signal-to-noise criteria. Poor-quality spectra that do not meet the criteria are excluded to reduce the number of false positives. An especially efficient approach is to search first for unmodified proteins and save the results, and then re-search this subset for modifications. This gives faster results and produces fewer false positives than a single search of the whole database with modifications. As with any mass-based database search software, the higher the mass accuracy of the MS-MS data supplied, the more confident the protein identifications, with better search scores and fewer false positives.

All searches used the IPI Human Version 3.06 database with trypsin specificity, two missed cleavages, 50% minimum scored peak intensity, and dynamic peak thresholding. The forward database search of the ion trap data was performed using 2.5-Da precursor and 0.7-Da fragment ion tolerance, while the Q-TOF search used 20-ppm precursor and 50-ppm fragment ion tolerance. Protein identifications were validated automatically after searching protein sequences without modifications. A subsequent round of searching against the proteins already identified looked for variable modifications such as oxidized Met, pyroGlu, and deamidation. For the Q-TOF data, this was followed by an "unassigned single mass gap" search, which looks for unnamed modifications using highly accurate mass information.
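To put the two sets of search tolerances on a common scale, a ppm tolerance can be converted to an absolute window at a given m/z. The following is a minimal sketch, not part of Spectrum Mill; the function names and the example value of m/z 800 are illustrative assumptions.

```python
def ppm_to_da(mz, ppm):
    """Convert a relative tolerance in ppm to an absolute window at a given m/z."""
    return mz * ppm * 1e-6


def within_tolerance(observed, theoretical, tol, unit="ppm"):
    """Return True if the observed value matches the theoretical one within tol."""
    delta = abs(observed - theoretical)
    if unit == "ppm":
        return delta <= ppm_to_da(theoretical, tol)
    if unit == "Da":
        return delta <= tol
    raise ValueError("unit must be 'ppm' or 'Da'")


# Illustrative precursor at m/z 800 (assumed value, not taken from the study).
qtof_window = ppm_to_da(800.0, 20)        # Q-TOF precursor tolerance, 20 ppm
ion_trap_window = 2.5                     # ion-trap precursor tolerance, Da
window_ratio = ion_trap_window / qtof_window

# A candidate 0.5 units away passes the ion-trap filter but not the Q-TOF one.
passes_ion_trap = within_tolerance(800.5, 800.0, 2.5, unit="Da")
passes_qtof = within_tolerance(800.5, 800.0, 20, unit="ppm")
```

At this example m/z the 20-ppm window is 0.016 wide, more than a hundred times narrower than the 2.5-Da window, which is why far fewer incorrect candidate peptides survive the Q-TOF precursor filter.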

Figure 1

Q-TOF Optimization

In the Q-TOF LC–MS system, the MS-MS collision energy parameters can be optimized to provide maximum signal-to-noise and quality of MS-MS spectra, thereby maximizing peptide matches and protein identification when using the Spectrum Mill software. A relatively simple sample, a bovine serum albumin (BSA) digest, was used for this purpose. Based on sequence coverage, number of unique peptides, and protein score, method 23 provided the optimal collision energy (Figure 1). As shown in Figure 2, method 1 resulted in overfragmentation and lower signal-to-noise than method 23, which reduced both the information content of the spectra and the confidence in the database match. The Q-TOF system allows the collision energy to be set as a slope–intercept relationship: collision energy = slope × (observed m/z ÷ 100) + intercept. This offers a robust, easy approach to optimizing the collision energy for the analysis of peptide digests.
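The slope–intercept relationship is simple to evaluate for any precursor. In the sketch below the slope and intercept are hypothetical placeholders, not the settings of method 23 or of any other method tested in this study, and the result is in whatever units the instrument uses for collision energy.

```python
def collision_energy(mz, slope, intercept):
    """Collision energy from the relationship: slope * (m/z / 100) + intercept."""
    return slope * (mz / 100.0) + intercept


# Hypothetical settings chosen only to illustrate the arithmetic.
EXAMPLE_SLOPE = 3.0
EXAMPLE_INTERCEPT = 2.0

energy_low = collision_energy(400.0, EXAMPLE_SLOPE, EXAMPLE_INTERCEPT)
energy_high = collision_energy(600.0, EXAMPLE_SLOPE, EXAMPLE_INTERCEPT)
```

With these placeholder values, a precursor at m/z 600 receives a collision energy of 20 and one at m/z 400 receives 14, so heavier precursors are fragmented more energetically without any per-peptide tuning.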

Figure 2

The Q-TOF system's acquisition parameters such as the MS-MS spectral acquisition rate also can be optimized to enhance protein identification. A complex mixture, such as E. coli lysate, best approximates the human sera samples. The results in Table I show the acquisition parameters tested and the protein identification results for a complex mixture of E. coli lysate. Method H was shown to provide the greatest number of unique peptides identified, spectra assigned, and greatest sequence coverage.

Table I: Methods A through J show the optimization of acquisition parameters using an E. coli lysate. Red values indicate the values that the user can change from those listed in method A.

Results and Discussion

Comparison of Q-TOF and Ion-Trap Results

The results of the Q-TOF and ion-trap LC–MS analyses of the trypsinized, depleted sera samples are shown in Table II. For the unfractionated sample, the optimized Q-TOF LC–MS system provides significantly more protein and peptide matches and over 50% more MS-MS spectra identified compared with the results provided by the ion-trap system. While the results are less dramatic for the fractionated samples, the Q-TOF system outperforms the ion-trap system. The numbers in the table represent only the confident matches and are therefore conservative. These results indicate that regardless of whether a Q-TOF or ion-trap system is used, fractionation of complex samples before LC–MS-MS analysis might be necessary to identify the greatest number of proteins and achieve the highest confidence in results. If fractionation is not used to reduce sample complexity, Q-TOF LC–MS is likely to produce the best results.

Table II: Number of confident matches; a comparison of Q-TOF and ion trap results

Table III: Mass accuracy of the QTOF MS-MS spectra of IFFESVYGOCK

Figure 3

The MS-MS spectra in Figure 3 show the spectral quality of the lowest scoring match validated for both the ion-trap and Q-TOF systems. The MS-MS spectra produced by the Q-TOF system provide a greater number of more abundant peptide fragments, and thus better spectral quality, than those from the ion-trap system. Table III shows the exceptional mass accuracy achieved for the Q-TOF MS-MS spectra of IFFESVYGOCK.

It is a combination of exceptional MS and MS-MS mass accuracy, wide dynamic range, sensitivity, and fast electronics that enables the enhanced performance of new Q-TOF systems. Figure 4 demonstrates that accurate-mass MS-MS data greatly reduce the rate of false-positive matches found after protein database searches. The latest accurate-mass Q-TOF LC–MS systems employ a number of technologies to achieve consistent, repeatable 1–2 ppm MS and <5-ppm MS-MS mass accuracy: analog-to-digital converter (ADC) detector technology, thermal protection, automated internal reference correction, and optimized collision cell design.

Figure 4

An ADC detector is needed to achieve accurate mass assignments and wide dynamic range. Unlike older time-to-digital converters (TDCs), which only register an ion arrival above a certain intensity and give the same response regardless of whether the signal is the result of one or many ions, an ADC creates a continuous digital representation of the ion detector's signal. When multiple ions of a given mass arrive at the detector within a very short time, an ADC with a fast sampling rate can translate this rising and falling signal into a very accurate digital profile of the mass peak. The detector output is represented accurately regardless of whether it is from a small ion current or a large ion current. A wide in-spectrum dynamic range is also important because it allows lower-abundance proteins and peptides to be found in the presence of higher-abundance components. Unlike older TDC instruments, new ADC detector designs maintain mass accuracy across up to five orders of magnitude of dynamic range.
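The difference between the two detector types can be illustrated with a deliberately simplified counting model. The sketch assumes that a TDC records at most one event per transient for a given mass, however many ions arrive together, while an ADC records a signal proportional to the number of ions. It ignores thresholds, dead time, noise, and pulse-height variation, and the ion counts are made up for illustration.

```python
def tdc_counts(ions_per_transient):
    """Simplified TDC: one event per transient if at least one ion arrived."""
    return sum(1 for n in ions_per_transient if n >= 1)


def adc_signal(ions_per_transient, gain=1.0):
    """Simplified ADC: summed signal proportional to the number of ions."""
    return sum(gain * n for n in ions_per_transient)


# Made-up ion arrivals over eight transients for a weak and a strong peak.
weak_peak = [0, 1, 0, 1, 1, 0, 0, 1]
strong_peak = [40, 55, 38, 61, 47, 52, 44, 63]

true_ratio = sum(strong_peak) / sum(weak_peak)
tdc_ratio = tdc_counts(strong_peak) / tdc_counts(weak_peak)
adc_ratio = adc_signal(strong_peak) / adc_signal(weak_peak)
```

In this toy model the strong peak carries 100 times as many ions as the weak one. The ADC response preserves that ratio, whereas the TDC response saturates and reports only a twofold difference.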

Figure 5

Because mass measurement depends on the length of the TOF flight tube, the tube must have a low coefficient of thermal expansion and must be protected from temperature fluctuations if mass measurements with 1–2 ppm errors are to be achieved. This level of mass accuracy can be attained by constructing the flight tube from a metal alloy with an extremely low coefficient of thermal expansion. An insulated outer shell with an evacuated air compartment protects the inner components from temperature changes.
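The size of the thermal effect can be estimated from first principles. In an idealized field-free drift tube, flight time is proportional to tube length and the assigned m/z is proportional to flight time squared, so a tube that grows by a fraction αΔT shifts the apparent mass by roughly 2αΔT. The sketch below makes that assumption explicit. The expansion coefficients are approximate textbook values for a common stainless steel and an Invar-type alloy, used only for illustration; the article does not name the alloy used.

```python
def thermal_mass_error_ppm(alpha_per_kelvin, delta_t_kelvin):
    """Apparent mass error in ppm for an uncorrected change in tube temperature.

    Idealized model: flight time scales with tube length and assigned m/z
    scales with flight time squared, so m/z scales with (L / L0) ** 2.
    """
    length_ratio = 1.0 + alpha_per_kelvin * delta_t_kelvin
    return (length_ratio ** 2 - 1.0) * 1e6


# Approximate textbook coefficients (per kelvin), illustrative only.
ALPHA_STAINLESS_STEEL = 17e-6
ALPHA_LOW_EXPANSION_ALLOY = 1.2e-6

steel_error = thermal_mass_error_ppm(ALPHA_STAINLESS_STEEL, 1.0)
alloy_error = thermal_mass_error_ppm(ALPHA_LOW_EXPANSION_ALLOY, 1.0)
```

Under these assumptions a 1 K drift costs about 34 ppm with the steel tube and about 2.4 ppm with the low-expansion alloy, which shows why both the alloy choice and thermal insulation matter at the 1–2 ppm level.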

Because it is impossible to eliminate every minuscule instrument variation that could affect mass assignments and cause a noticeable mass shift, the latest TOF-based instruments employ automated two-point internal reference mass correction. In this technique, two compounds of known mass are introduced continuously into the ion source. In each spectrum acquired, the masses of the known low- and high-mass ions from the reference solution are measured and used to correct the calibration curve. This curve is used to calculate the mass assignments of all other ions in the spectrum. In this way, the control software constantly and automatically corrects the measured masses of the samples using the known masses as reference. Because the latest Q-TOF systems have wide in-spectrum dynamic ranges, the reference mass compound can be introduced at low concentrations. This helps eliminate interference between the reference compound and samples.
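The idea behind two-point correction can be sketched with the common textbook TOF calibration form t = t0 + a·sqrt(m/z), in which two reference ions are enough to solve for both constants in every spectrum. This is an illustration of the principle, not the vendor's algorithm. The reference m/z values, the drift figures, and all function names are hypothetical, and flight times are in arbitrary units.

```python
import math


def fit_two_point(ref_low, ref_high):
    """Solve t = t0 + a * sqrt(mz) from two (flight_time, known_mz) pairs."""
    (t1, mz1), (t2, mz2) = ref_low, ref_high
    a = (t2 - t1) / (math.sqrt(mz2) - math.sqrt(mz1))
    t0 = t1 - a * math.sqrt(mz1)
    return a, t0


def time_to_mz(t, a, t0):
    """Convert a flight time to m/z with the given calibration constants."""
    return ((t - t0) / a) ** 2


def simulated_flight_time(mz, a, t0):
    """Flight time of an ion in the simulated (drifted) instrument."""
    return t0 + a * math.sqrt(mz)


# Stale calibration versus the slightly drifted state of the instrument.
STALE_A, STALE_T0 = 1.0, 0.5
DRIFTED_A, DRIFTED_T0 = 1.00002, 0.5003

# Hypothetical low- and high-mass reference ions and one analyte ion.
REF_LOW_MZ, REF_HIGH_MZ, ANALYTE_MZ = 121.05, 922.01, 500.0

ref_low = (simulated_flight_time(REF_LOW_MZ, DRIFTED_A, DRIFTED_T0), REF_LOW_MZ)
ref_high = (simulated_flight_time(REF_HIGH_MZ, DRIFTED_A, DRIFTED_T0), REF_HIGH_MZ)
analyte_time = simulated_flight_time(ANALYTE_MZ, DRIFTED_A, DRIFTED_T0)

# Without correction the stale constants misassign the analyte mass.
stale_mz = time_to_mz(analyte_time, STALE_A, STALE_T0)
stale_error_ppm = (stale_mz - ANALYTE_MZ) / ANALYTE_MZ * 1e6

# With correction the constants are refitted from the two reference ions.
a_fit, t0_fit = fit_two_point(ref_low, ref_high)
corrected_mz = time_to_mz(analyte_time, a_fit, t0_fit)
corrected_error_ppm = (corrected_mz - ANALYTE_MZ) / ANALYTE_MZ * 1e6
```

In this simulation the stale calibration misassigns the analyte by tens of ppm, while refitting from the two reference ions recovers its mass to within numerical precision. Real instruments add measurement noise, so the residual error is never exactly zero.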

MS-MS mass accuracy and consistency of mass assignments between MS and MS-MS measurements are both critical to accurate protein identification. Q-TOF mass accuracy is inherently poorer for MS-MS measurements due to the random energies imparted by the process of collision and fragmentation. New Q-TOF collision cell designs, however, minimize this by removing the kinetic energy of precursor and product ions and then using linear axial acceleration to impart nearly uniform energy to the ions exiting the collision cell. This allows the same correction factors to be applied to MS and MS-MS mass assignments and allows the Q-TOF to achieve better than 5-ppm MS-MS mass accuracy.

Maximizing sensitivity of a Q-TOF system requires optimizing the generation, transmission, and detection of ions while simultaneously minimizing the generation, transmission, and detection of chemical and electronic noise. In new Q-TOF systems, the sensitivity needed to enhance protein identification is the result of the combined design optimization of all Q-TOF components along the ion path, including ion generation in the ion source, ion fragmentation and mass filtering, and ion detection and signal processing. Data acquisition fast enough to extract critical MS-MS information from complex chromatographic peaks is the result of advancements in ADC acquisition electronics and high-speed, embedded PC digital acquisition systems.

Conclusions

This study shows deeper and more specific protein identification from Q-TOF designs with high mass accuracy, wide dynamic range, enhanced sensitivity, and fast acquisition in both the MS and MS-MS domains. Q-TOF LC–MS performance can exceed that of ion-trap systems for protein and peptide identification in both fractionated and unfractionated samples. Regardless of whether Q-TOF or ion trap is used, fractionation of complex samples before LC–MS-MS analysis might be necessary to identify the greatest number of proteins and achieve the highest confidence in results. If fractionation is not used to reduce the sample complexity, Q-TOF LC–MS is likely to produce the best results. Optimization of Q-TOF collision cell and acquisition parameters increases the identification of proteins and peptides.

Christine Miller and Ning Tang are with Agilent Technologies, Inc., Santa Clara, California.

Reference

(1) H. Yin, K. Killeen, R. Brennen, D. Sobek, M. Werlich, and T. van de Goor, Anal. Chem. 77(2), 527–533 (2005).