News | December 8, 2025

Chemometrics in the AI Age: Bridging Tradition and Machine Intelligence

Author(s): John Chasse
Fact checked by: Caroline Hroncich

Key Takeaways

  • Chemometrics has long used tools now labeled as AI and ML, emphasizing the importance of interpretability in models.
  • Balancing AI enthusiasm with chemometric traditions is crucial for understanding experimental systems and data quality.

Paolo Oliveri of the University of Genoa (Italy) sat down with Spectroscopy to discuss the growing influence of artificial intelligence (AI) in chemistry and his view that the chemometrics field has long employed tools now labeled as AI and machine learning (ML).

A recent article in The Analytical Scientist (1) by Paolo Oliveri of the University of Genoa (Italy) examines artificial intelligence’s (AI) growing influence in chemistry and maintains that the chemometrics field has long employed tools now labeled as AI and machine learning (ML). The real challenge, in his opinion, is balancing enthusiasm for new ML methods with the chemometric tradition of understanding experimental systems, data quality, and preprocessing. By merging rigorous chemical insight with emerging AI tools, Oliveri believes, the chemometrics field is poised for significant growth in modern analytical laboratories.

Oliveri sat down with Spectroscopy for a deeper dive into the subject.

How do you think the early chemometricians, like Svante Wold and Bruce Kowalski, whom you mention in your piece, would view today’s artificial intelligence (AI) and machine learning (ML) revolution in analytical chemistry?

Given their positive attitude towards the implementation of powerful computational methods and innovation in general, I believe they would view the current AI revolution as a valuable opportunity to further promote the use of chemometrics in chemical laboratories. They would likely focus on improving and streamlining chemical processes, from method development to validation, monitoring, and quality control. And they would certainly acknowledge the central role of chemistry and chemists in this context. As Svante Wold wrote in his editorial paper celebrating the first 20 years of chemometrics: “As chemists, we must realize that we must continue to keep the power over our own theory, data analysis, model interpretation, and most important of all, our problem formulation. [...] And chemometrics must continue to be motivated by chemical problems solving, not by method development.” (2).

A key distinction you raise is that chemometrics prioritizes interpretability, while modern AI often accepts “black box” models. How can chemometricians leverage the predictive power of deep learning while preserving chemical interpretability?

Undeniably, the ability to interpret models is key, as it enables experimenters to understand which experimental variables are responsible for a given result, whether positive or negative. Interpretability, which is typically missing from deep learning models, is often fundamental for chemists. When processing a set of spectra, for example, we not only want to obtain qualitative or quantitative analytical results, but also to understand which spectral features are associated with relevant information. This interpretative step is not only fundamental to the validation process of data-driven modelling, ensuring that data overfitting has been avoided, but is also a key part of knowledge acquisition itself. Chemometricians who wish to reap the benefits of implementing deep learning models should design modifications to such algorithms to extract the values of model coefficients and convert them into a parameter that indicates the contribution of each variable, thereby achieving model interpretability.
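As a rough illustration of what a per-variable contribution score can look like, the sketch below uses permutation importance. This is a model-agnostic technique swapped in for illustration; it is not the coefficient-based modification Oliveri describes. The synthetic data, the choice of channel 7 as the only informative variable, and the least-squares model standing in for a black-box predictor are all assumptions made for the example.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in mean squared error when each column of X is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = np.mean((predict(X) - y) ** 2)
    importance = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling one column breaks its link to y but keeps its distribution
            X_perm[:, j] = rng.permutation(X[:, j])
            importance[j] += np.mean((predict(X_perm) - y) ** 2) - baseline
    return importance / n_repeats

# Synthetic "spectra" (assumed data): 200 samples x 20 channels,
# where only channel 7 carries information about the response
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 20))
y = 3.0 * X[:, 7] + 0.1 * rng.normal(size=200)

# Stand-in for a black-box model: an ordinary least-squares fit
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

def predict(X_new):
    return X_new @ coef

scores = permutation_importance(predict, X, y)
top_channel = int(np.argmax(scores))
print("Most influential channel:", top_channel)
```

Because the function only needs a `predict` callable, the same scoring could in principle be applied to a deep learning model. With real spectra, neighboring channels are strongly correlated, so scores of this kind would need careful chemical interpretation.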

You emphasize that deep learning’s hunger for massive datasets doesn’t align well with typical chemical data collection. What strategies or innovations do you see emerging to make AI methods more viable for smaller, high-quality chemical datasets?

Yes, the other major problem with implementing deep learning methods is the need for large, highly representative training datasets to create efficient models. One approach to overcoming this hurdle is artificial data augmentation. While this approach is interesting, I have not yet seen any fully satisfactory applications. However, this is certainly a direction that can be further investigated and improved.
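For readers unfamiliar with the idea, a minimal sketch of artificial data augmentation for a single spectrum might look like the following. The perturbation types (a small shift along the channel axis, intensity scaling, a sloped baseline, and additive noise), their magnitudes, and the function name `augment_spectra` are assumptions chosen for illustration, not a validated procedure; whether such perturbations are realistic for a given instrument is exactly the open question.

```python
import numpy as np

def augment_spectra(spectrum, n_copies=5, noise_sd=0.01,
                    max_shift=2.0, max_slope=0.05, seed=0):
    """Return n_copies perturbed versions of a 1-D spectrum (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n = spectrum.size
    axis = np.arange(n)
    ramp = np.linspace(-0.5, 0.5, n)  # unit-width ramp used for the sloped baseline
    out = np.empty((n_copies, n))
    for i in range(n_copies):
        # Small sub-channel shift via linear interpolation (edges take the end values)
        shift = rng.uniform(-max_shift, max_shift)
        shifted = np.interp(axis + shift, axis, spectrum)
        # Multiplicative intensity change of up to 5 percent
        scale = rng.uniform(0.95, 1.05)
        # Sloped baseline plus Gaussian noise
        baseline = rng.uniform(-max_slope, max_slope) * ramp
        noise = rng.normal(0.0, noise_sd, size=n)
        out[i] = scale * shifted + baseline + noise
    return out

# Toy spectrum (assumed): one Gaussian band centered at channel 50
channels = np.arange(100)
spectrum = np.exp(-0.5 * ((channels - 50) / 5.0) ** 2)
augmented = augment_spectra(spectrum, n_copies=5)
print(augmented.shape)
```

Augmented copies like these add variation but no new chemical information, which is consistent with Oliveri's caution that the approach still needs investigation.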

You mention the growing number of chemometrics courses at all academic levels. How should we be training the next generation of chemometricians to navigate a world increasingly dominated by AI and data science?

We should distinguish between two levels of education: one for users and one for developers of chemometrics. Users need to learn the key concepts necessary for successfully applying chemometrics, such as the importance of data quality and representativity, how to avoid overfitting, how to properly validate models, and how to correctly interpret model outcomes. Developers, on the other hand, need to delve deeper into statistical and ML theory without forgetting chemical constraints and the aforementioned points. However, just as we don't need to be electronics engineers to use a smartphone, we don't need to be chemometrics developers to use chemometric methods.

You highlight the need for strong collaboration between chemometricians and general data scientists. What does an ideal partnership between these two communities look like, and what challenges must be overcome to make it work?

This type of collaboration can be profitable for chemometricians, who can borrow advanced methods and adapt them to the nature of chemical data and problems. In turn, data scientists may be inspired by real-world problems to develop targeted methods and optimised algorithms that address specific needs. Difficulties related to slightly different technical languages used by different communities can easily be overcome by working together.

You describe data pre-processing as a crucial yet underrated part of analysis. In an era of automated AI pipelines, how do we ensure that domain-specific knowledge, like appropriate pre-processing, remains central to data analysis?

Once again, we must start with a thorough understanding of the nature of chemical data and its characteristics. For this reason, the education of new generations of chemometricians must focus firmly on the chemical aspects and implications. Otherwise, there is a risk of paying too much attention to the mathematical and algorithmic aspects, which are also important but come second.

You quote Wold’s advice: “We must remain chemists and adapt statistics to chemistry, instead of vice versa.” How relevant do you think that principle remains today, given the computational and algorithmic dominance in current research?

This principle is essential for achieving profitable results from machine learning implementations in the chemical field. The suggestion made earlier to modify deep learning algorithms to extract a variable importance parameter from model coefficients is a concrete example of this concept.

You contrast “mature” chemometricians who resist AI terminology with younger, more enthusiastic researchers. How do you think the dialogue between these generations could shape the future direction of chemometrics?

The key lies in combining the skills and concrete approach of experienced specialists with the passionate impulse of younger generations. That combination allows us to exploit the most advanced data processing tools without losing focus on real-world chemical problems, starting from a deep knowledge of data features and considering all practical constraints.

You conclude with optimism about merging AI with chemometrics. What specific developments or breakthroughs do you foresee in the next decade that could define the “AI-chemometrics” era?

Actually, chemometrics is AI. We will certainly see its integration with emerging AI frameworks such as generative tools, language models, multimodal approaches, and agentic implementations. Over the next decade, we can expect to see a significant increase in the automation and speed of data processing, as well as the widespread availability of advanced software tools. Consequently, it will become even more important to invest time in developing a robust understanding of experimental systems and the properties of the data in order to build truly powerful applications.

References

  1. Oliveri, P. Chemometrics: A Bridge to the AI Age. The Analytical Scientist 2025. https://url.us.m.mimecastprotect.com/s/IzpjCJ6R2Ri0vB8ktkH5CyZoE5?domain=urlsand.esvalabs.com (accessed 2025-11-06)
  2. Wold, S. Chemometrics; What Do We Mean With It, and What Do We Want From It? Chemom. Intell. Lab. Syst. 1995, 30, 109-115. DOI: 10.1016/0169-7439(95)00042-9
