Connecting Chemometrics to Statistics - Part I: The Chemometrics Side

Spectroscopy
Volume 21, Issue 5, pp. 34-38

This series of columns has been running for a long time. Long-time readers will recall that it has even changed its name since its inception. The original name was "Statistics in Spectroscopy." This was a multiple pun, as it referred to the science of Statistics in the journal Spectroscopy and the science of Statistics in the science of Spectroscopy as well as statistics (the subject of the science of Statistics) in the journal Spectroscopy. [See our third column ever (1) for a discussion of the double meaning of the word "Statistics." The same discussion is found in the book based upon those first 38 columns (2).]

Our goal then, as now, was to bring the study of chemometrics and the study of statistics closer together. While there are isolated points of light, it seems that many people who study chemometrics have no interest in and do not appreciate the statistical background upon which many of our chemometric techniques are based, nor do they appreciate the usefulness of the techniques that we could learn from that discipline. Worse, there are some who actively denigrate and oppose the use of statistical concepts and techniques in the chemometric analysis of data. The first group can, perhaps, claim unfamiliarity (ignorance?) with statistical concepts. It is difficult, however, to find excuses for the second group.

Nevertheless, at the most fundamental level, there is a deep and close connection between the two disciplines. How could it be otherwise? Chemometric concepts and techniques are based upon principles that were formulated by mathematicians hundreds of years ago, even before the label "statistics" was applied to the subfield of mathematics that deals with the behavior and effect of random numbers on data. Even so, recognition of statistics as a distinct subdiscipline of mathematics also goes back a long way, certainly long before the term "chemometrics" was coined to describe a subfield of that subfield.

Before we discuss the relationship between these two disciplines, it is, perhaps, useful to consider what they are. We have already defined "statistics" as ". . . the study of the properties of random numbers . . ." (3).

A definition of "chemometrics" is a little trickier to come by. The term originally was coined by Kowalski, but currently, many chemometricians use the definition by Massart (4). On the other hand, one compilation presents nine different definitions for "chemometrics" (5,6) (including "what chemometricians do," a definition that apparently was suggested only half humorously). But our goal here is not to get into the argument over the definition of the term, so for our current purposes, it is convenient to consider a somewhat simplified definition of "chemometrics" as meaning "multivariate methods of data analysis applied to data of chemical interest."

This definition is convenient because it allows us to then jump directly to what is arguably the simplest chemometric technique in use, and consider that as the prototype for all chemometric methods; that technique is multiple regression analysis. Written out in matrix notation, multiple regression analysis takes the form of a relatively simple matrix equation:

$$B = [A^{T}A]^{-1}A^{T}C$$

where B represents the vector of coefficients, A represents the matrix of independent variables, and C represents the vector of dependent variables.
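
To make the roles of B, A, and C concrete, here is a minimal sketch that takes the matrix equation to be the usual least-squares solution, B = (AᵀA)⁻¹AᵀC, and computes it by solving (AᵀA)B = AᵀC. Everything specific in it is an assumption made for illustration and is not taken from the column: the function names, the leading column of ones (an intercept term), and the small noise-free data set. It uses exact rational arithmetic from the Python standard library so that each step is easy to follow; real spectroscopic data would normally be handled with a numerical library's least-squares routine instead.

```python
from fractions import Fraction


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    yt = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in yt] for row in x]


def solve(m, rhs):
    """Solve m @ x = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(m)
    aug = [list(m[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0:
            raise ValueError("A^T A is singular: columns of A are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def regression_coefficients(A, C):
    """Return B such that (A^T A) B = A^T C, i.e. B = (A^T A)^-1 A^T C."""
    A = [[Fraction(v) for v in row] for row in A]
    C = [Fraction(v) for v in C]
    At = transpose(A)
    AtA = matmul(At, A)
    AtC = [sum(a * c for a, c in zip(row, C)) for row in At]
    return solve(AtA, AtC)


# Hypothetical data: each row of A is [1, x1, x2], where the leading 1
# supplies an intercept; C was generated as 1 + 2*x1 + 3*x2 with no noise.
A = [
    [1, 1, 2],
    [1, 2, 1],
    [1, 3, 4],
    [1, 4, 3],
    [1, 5, 5],
]
C = [9, 8, 19, 18, 26]

B = regression_coefficients(A, C)
print([float(b) for b in B])
```

Because the made-up C values contain no noise, the computed B reproduces the generating coefficients exactly. With real measurements the same equation gives the coefficients that minimize the sum of squared residuals, and that residual variation is where the statistical side of the problem enters.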

