Chemometrics in Spectroscopy


This “Chemometrics in Spectroscopy” column traces the historical and technical development of these methods, emphasizing their application in calibrating spectrophotometers to predict the chemical or physical properties of measured samples, particularly in near-infrared (NIR), infrared (IR), Raman, and atomic spectroscopy, and explores how AI and deep learning are reshaping the spectroscopic landscape.


This column continues our previous column, which described and explained some algorithms and data transforms beyond those most commonly used. We present and discuss algorithms that are rarely, if ever, seen or used in practice, despite having been proposed and described in the literature.


In this column and its successor, we describe and explain some algorithms and data transforms beyond those commonly used. We present and discuss algorithms that are rarely, if ever, used in practice, despite having been described in the literature. These comprise algorithms used in conjunction with continuous spectra, as well as those used with discrete spectra.

A newly discovered effect can introduce large errors into many multivariate spectroscopic calibration results. The classical least squares (CLS) algorithm can be used to explain this effect. Having identified it, we examine its consequences for calibrations built with principal component regression (PCR) and partial least squares (PLS).
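
As a concrete, hedged illustration of what a CLS calculation does, the short Python sketch below fits a synthetic mixture spectrum as a linear combination of two made-up pure-component spectra and recovers the mixture concentrations by least squares. All of the data, shapes, and variable names are invented for illustration; this is not the data or code from the column itself.

```python
import numpy as np

# Minimal classical least squares (CLS) sketch: a mixture spectrum is modeled
# as a linear combination of pure-component spectra, x ≈ c @ S, and the
# concentrations c are recovered by least squares. All data are synthetic.

rng = np.random.default_rng(0)
wl = np.linspace(0.0, 1.0, 200)               # arbitrary wavelength axis

# Two hypothetical pure-component spectra (Gaussian-shaped bands).
S = np.vstack([
    np.exp(-((wl - 0.3) ** 2) / 0.002),
    np.exp(-((wl - 0.7) ** 2) / 0.004),
])

# Synthetic mixture spectrum with known concentrations plus a little noise.
c_true = np.array([0.6, 0.4])
x_mix = c_true @ S + rng.normal(scale=0.01, size=wl.size)

# CLS estimate: solve min_c ||x_mix - S.T @ c||^2 by least squares.
c_hat, *_ = np.linalg.lstsq(S.T, x_mix, rcond=None)
print("true:", c_true, "estimated:", np.round(c_hat, 3))
```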

As we have previously discussed, the most time-consuming and bothersome issue associated with calibration modeling and the routine use of multivariate models for quantitative analysis in spectroscopy is the constant need for intercept (bias) or slope adjustments. These adjustments must be performed routinely for every product and every constituent model. For transfer and maintenance of multivariate calibrations, this procedure must be implemented continuously to maintain calibration prediction accuracy over time. Sample composition, reference values, within- and between-instrument drift, and operator differences may all contribute to variation over time. The problem is amplified when calibration transfer is attempted between instruments of somewhat different vintage or design. In this discussion we continue to delve into the issues that cause prediction error and bias and slope changes in quantitative spectroscopic calibrations.
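
To make the routine slope and bias adjustment concrete, the sketch below shows one common way such a correction can be done: predictions from an existing multivariate model are regressed against reference values for a small check set, and the fitted slope and intercept are then applied to subsequent predictions. The function names and numbers are hypothetical; they illustrate the idea rather than reproduce any particular instrument vendor's procedure.

```python
import numpy as np

# Hypothetical illustration of a slope/bias adjustment: model predictions
# are regressed against reference values for a check set, and the fitted
# slope and intercept are used to correct later predictions.

def fit_slope_bias(y_ref, y_pred):
    """Fit y_ref ≈ slope * y_pred + bias on a standardization/check set."""
    slope, bias = np.polyfit(y_pred, y_ref, deg=1)
    return slope, bias

def apply_slope_bias(y_pred_new, slope, bias):
    """Correct new predictions with the previously fitted slope and bias."""
    return slope * y_pred_new + bias

# Synthetic example: model output that has drifted by a constant offset
# and a mild slope error relative to the reference method.
y_ref = np.array([10.2, 11.5, 12.9, 14.1, 15.6])
y_pred = 0.95 * y_ref + 0.8          # simulated drifted model output

slope, bias = fit_slope_bias(y_ref, y_pred)
print("corrected:", np.round(apply_slope_bias(y_pred, slope, bias), 2))
```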

This column addresses the issue of degrees of freedom (df) for regression models. Using the larger df (e.g., n or n − 1) underestimates the standard error, while the smaller df (e.g., n − k − 1, with k factors or terms in the model) may overestimate it. It would seem one should use the same df for both the standard error of estimate (SEE) and the standard error of cross-validation (SECV), but what is a clear statistical explanation for selecting the appropriate df? It is a good time to raise this question once again, because there appears to be some confusion, even among experts, about the df appropriate to the various calibration and prediction situations: the standard error parameters should be comparable, and they depend on the number of independent samples, the data channels containing information (i.e., wavelengths or wavenumbers), and the number of factors or terms in the regression. By convention everyone could simply agree on a definition, but is there a more correct one that should be verified and discussed for each case? The problem is that the standard deviation is computed with different df and without a rigorous explanation, and too much emphasis is then placed on the actual numbers obtained for SEE and SECV rather than on properly computed confidence intervals. Note that confidence limit computations for the standard error have been discussed previously and are routinely derived in standard statistical texts (4).
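
As a small numerical illustration of why the choice of df matters, the sketch below computes the standard error from one fixed set of calibration residuals using divisors of n, n − 1, and n − k − 1. The residuals and the value of k are made up; the point is only the arithmetic, namely that the larger the df used, the smaller the reported standard error.

```python
import numpy as np

# The same residuals yield different "standard error" values depending on
# the divisor (n, n - 1, or n - k - 1, with k model terms/factors).
# Residuals and k below are invented numbers for illustration only.

residuals = np.array([0.12, -0.08, 0.05, -0.15, 0.09, -0.03, 0.11, -0.06])
n = residuals.size   # number of calibration samples
k = 3                # number of factors/terms in the regression (assumed)

ss = np.sum(residuals ** 2)
for label, df in [("n", n), ("n - 1", n - 1), ("n - k - 1", n - k - 1)]:
    print(f"df = {label:9s} -> standard error = {np.sqrt(ss / df):.4f}")
```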

We present the first of a short set of columns dealing with the subject of statistics. The current series is organized as a “top-down” view of the subject, as opposed to the usual approach in the literature (and in our own previous columns) of giving a “bottom-up” description of the multitude of equations that are encountered. We hope this different approach will give our readers a more coherent view of the subject and persuade them to undertake further study of the field.