Identifying "Known Unknowns" in Commercial Products by Mass Spectrometry

May 01, 2014
Volume 12, Issue 2

The identification of nontargeted species in environmental and commercial samples by mass spectrometry can be very difficult. In this article, authors from Eastman Chemical Company describe their systematic approach for the identification of nontargeted species using nominal and accurate mass data, searching both mass spectral and "spectra-less" databases.

Organic mass spectrometry (MS) has witnessed an extraordinary increase in capabilities this past decade because of major advances in ionization sources, analyzers, detectors, chromatography, and computer technology. Many of these technological advances focus on biological applications, a fact plainly evident to attendees of the American Society for Mass Spectrometry's (ASMS) annual conferences. Yet the significance of this increasingly sophisticated technology has not been lost on industrial, environmental, and forensic mass spectrometrists, whose work involves characterizing commercial chemical products.

Eastman Chemical Company is a global manufacturer of polymers, fibers, coatings, additives, solvents, adhesives, and many other products. Gas chromatography–mass spectrometry (GC–MS) and liquid chromatography–mass spectrometry (LC–MS) have proven to be essential for characterizing our company's products and those of other companies. With reasonable effort, we routinely and reliably obtain mass spectral data from these highly sensitive yet robust techniques. However, unless the data can be converted into structural information, they are of little use in resolving the analytical problem at hand.

Figure 1: Simplified flowchart for identifying "known unknowns." MF = molecular formula and MW = molecular weight.
Over the last 34 years, we have developed and refined a systematic process (1,2) for the identification of nontargeted species using GC–MS and LC–MS analyses. We refer to these types of species as "known unknowns": species that are known in the chemical literature or MS reference databases, but unknown to the investigator. The essence of the process is finding candidate structures by searching mass spectral databases, Chemical Abstracts Service (CAS) databases, and ChemSpider databases. Figure 1 presents a simplified flowchart of the overall process; the subsequent sections discuss individual steps and present three examples of identifying known unknowns.

Computer-Searchable Mass Spectral Databases

Table I: Spectra with associated structures searched with NIST search software
The first step in the process is computer searching of spectra against mass spectral databases. This approach (3) is very powerful and efficient for the identification of unknowns, typically requiring 3–5 s for each component in a mixture. Electron ionization (EI) databases are used for identifying compounds in GC–MS analyses, and collision-induced dissociation (CID) databases are used for LC–MS analyses. The databases are purchased from commercial sources or are created from compounds characterized at our company (see Table I).
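
To make the idea of a library search concrete, the following is a minimal sketch of spectral matching by cosine (normalized dot-product) similarity on nominal-mass peak lists. It is an illustration only and is not the algorithm used by the NIST search software; the function names (`normalize`, `cosine_score`, `search_library`), the compound labels, and the peak lists are all hypothetical and invented for this example.

```python
import math


def normalize(peaks):
    """Scale a {nominal m/z: intensity} spectrum to unit vector length."""
    norm = math.sqrt(sum(i * i for i in peaks.values()))
    return {mz: i / norm for mz, i in peaks.items()} if norm else {}


def cosine_score(query, reference):
    """Cosine similarity between two spectra (0 = no shared peaks, 1 = identical)."""
    q, r = normalize(query), normalize(reference)
    # Only m/z values present in both spectra contribute to the dot product.
    return sum(q[mz] * r[mz] for mz in q.keys() & r.keys())


def search_library(query, library, top_n=3):
    """Rank library entries by similarity to the query spectrum."""
    hits = [(name, cosine_score(query, ref)) for name, ref in library.items()]
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits[:top_n]


# Hypothetical library: made-up peak lists, not real reference spectra.
LIBRARY = {
    "compound_A": {43: 100, 58: 30, 71: 10},
    "compound_B": {51: 40, 77: 100, 105: 80},
    "compound_C": {43: 90, 58: 35, 85: 20},
}

# Hypothetical spectrum of an unknown component.
query = {43: 100, 58: 28, 71: 12}

for name, score in search_library(query, LIBRARY):
    print(f"{name}: {score:.3f}")
```

In this toy case the query ranks closest to `compound_A`, while `compound_B` shares no peaks with it and scores zero. Practical search software typically applies additional intensity and mass weighting and reports several candidate hits for the analyst to review.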

The results of the EI mass spectral searches are normally more successful than CID searches for two reasons. First, the number of entries in EI databases for GC–MS is approximately 10 times larger than that for CID databases for LC–MS. Second, 70-eV EI spectra are much more reproducible than CID spectra, which can vary significantly depending on instrument design and user-specified variables (3).
