This installment and the next list four key explanatory or tutorial references for each of the 29 chemometric topics described in a previous article, plus the programming platforms often used for chemometrics. Wherever possible, the references were selected for being particularly helpful in explaining the use of each technique with spectroscopic data. Space limitations keep this reference list from being exhaustive, but it is extensive. Included is a series of tables listing the key reference numbers for each chemometric technique.
The August 2020 installment of the “Chemometrics in Spectroscopy” column was titled “Survey of Chemometric Methods Used in Spectroscopy” (1). In that article, we delineated 29 common chemometric methods (or techniques) in use today by spectroscopists, and selected a single literature reference for each method. This installment and the one that follows continue that theme as a two-part series, in which the chemometric methods, with their corresponding literature reference numbers, are given in Tables I through V. In order of appearance, the tables cover signal preprocessing (Table I), component analysis (Table II), quantitative (calibration) methods (Table III), qualitative (classification) methods (Table IV), and programming languages or platforms (Table V); in all, 30 chemometrics topics will be covered.
Part I includes Tables I and II with references. Part II will include Tables III through V with corresponding literature references.
The science of chemometrics has advanced rapidly and is now included in the broader field of data analytics. There are now degree programs in data analytics and in all types of data processing related to computer science or to specific data fields, including econometrics, biometrics, biomedical statistics, data mining, and chemometrics, as well as in software design and architecture for design engineering, control systems, manufacturing engineering, robotics, instrument control, predictive modeling and learning, and many other fields. This includes artificial intelligence and its subfields of machine learning algorithms (used in data filtering and computer imaging applications), computational statistics and optimization, supervised, semi-supervised, and unsupervised learning, and various forms of predictive modeling. Other members of the machine learning family include deep learning (deep structured learning), artificial neural networks, and specialized learning algorithms. A menagerie of techniques, and combinations of techniques, is continuously being introduced under ever-changing names. However, the basic mathematical concepts behind these changing names are much the same; the advances are mostly in the names, in the computer processing power used, and in the combinations and applications of the data analysis algorithms. At some future point, we hope to at least summarize the nomenclature of these fields in this column.
Description of References
General references 1–12 are reviews of chemometrics specific to spectroscopy applications. A set of four references is then given in Tables I through IV for each chemometric method named, and five references are given for each computer software platform introduced in Table V. The references for the two-part series are numbered sequentially from 1 to 148 so that the series can be viewed as a single body of work.
Introduction to the Tables
Many excellent videos, technical notes, online sources, and published articles exist for instruction in, and understanding of, algorithms and chemometrics topics. Here, we have selected a set of papers from the technical literature that includes chemometrics reviews for spectroscopy (1–12) as well as a set of articles for each of 30 selected chemometrics topics (including software platforms). We included four references for each topic that we considered most applicable to Spectroscopy readers, and we also tried to include those references that might be considered “classic” or tutorial papers. As we delve into each topic in future installments, we will include additional references that will help the reader understand and use these chemometric methods.
Here in Part I of this series, Table I presents the references for various signal preprocessing techniques. These data processing methods are often applied before data exploration, or before qualitative or quantitative methods. Table II lists references for component analysis techniques used mostly for data exploration and discovery. In Part II of the series, Table III will give references for the quantitative (calibration) methods used to take raw or preprocessed data and compute predictive calibration models for quantitative determination of physical or chemical parameters in a dataset. Table IV will provide references for the qualitative (classification) methods used to take raw or preprocessed data and compute predictive models for classifying different groups or types of samples, or physical or chemical parameters, in a dataset. Table V will include references for the most common programming languages or platforms used for general data interpretation with chemometrics or other statistical analysis.
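As a rough illustration of how a preprocessing step can precede data exploration, the following minimal Python sketch applies a standard normal variate (SNV) transform to a set of synthetic spectra and then computes principal component scores from the mean-centered result. The synthetic data, the function names, and the use of NumPy are assumptions made for illustration only; they are not taken from the cited references, and real spectra would normally call for the more careful treatment those references describe.

```python
import numpy as np


def mean_center(X):
    """Subtract the mean spectrum (column means) from every spectrum (row)."""
    return X - X.mean(axis=0)


def snv(X):
    """Standard normal variate: give each spectrum (row) zero mean and unit
    standard deviation. The sample standard deviation (ddof=1) is assumed."""
    mu = X.mean(axis=1, keepdims=True)
    sd = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mu) / sd


def pca_scores(X, n_components=2):
    """Principal component scores from an SVD of the mean-centered data."""
    U, s, _ = np.linalg.svd(mean_center(X), full_matrices=False)
    return U[:, :n_components] * s[:n_components]


# Hypothetical data: 20 spectra of 200 points, built from two Gaussian bands.
rng = np.random.default_rng(0)
axis = np.linspace(0.0, 1.0, 200)
band_a = np.exp(-((axis - 0.3) ** 2) / 0.005)  # constant band
band_b = np.exp(-((axis - 0.7) ** 2) / 0.005)  # band that varies by sample
conc = rng.uniform(0.5, 1.5, size=(20, 1))     # varying amount of band_b
scale = rng.uniform(0.8, 1.2, size=(20, 1))    # multiplicative scatter effect
offset = rng.uniform(-0.1, 0.1, size=(20, 1))  # additive baseline offset
noise = rng.normal(0.0, 0.001, size=(20, 200))
spectra = scale * (band_a + conc * band_b) + offset + noise

preprocessed = snv(spectra)                       # preprocessing step
scores = pca_scores(preprocessed, n_components=2)  # exploration step
```

Here each row of `spectra` is one spectrum, so SNV operates along rows while mean centering operates along columns; keeping the two directions distinct is the main point of the sketch.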
References
Chemometrics Reviews for Spectroscopy
(1) J. Workman and H. Mark, Spectroscopy 35(8), 9–14 (2020).
(2) J.J. Workman Jr., P.R. Mobley, B.R. Kowalski, and R. Bro, Appl. Spectrosc. Rev. 31(1–2), 73–124 (1996).
(3) P.R. Mobley, B.R. Kowalski, J.J. Workman Jr., and R. Bro, Appl. Spectrosc. Rev. 31(4), 347–368 (1996).
(4) R. Bro, J.J. Workman Jr., P.R. Mobley, and B.R. Kowalski, Appl. Spectrosc. Rev. 32(3), 237–261 (1997).
(5) P. Geladi, Spectrochim. Acta Part B At. Spectrosc. 58(5), 767–782 (2003).
(6) P. Geladi, B. Sethson, J. Nyström, T. Lillhonga, T. Lestander, and J. Burger, Spectrochim. Acta Part B At. Spectrosc. 59(9), 1347–1357 (2004).
(7) B. Lavine and J. Workman, Anal. Chem. 80(12), 4519–4531 (2008).
(8) T. Rajalahti and O.M. Kvalheim, Int. J. Pharm. 417(1–2), 280–290 (2011).
(9) R.G. Brereton, J. Jansen, J. Lopes, F. Marini, A. Pomerantsev, O. Rodionova, J.M. Roger, B. Walczak, and R. Tauler, Anal. Bioanal. Chem. 409(25), 5891–5899 (2017).
(10) R.G. Brereton, J. Jansen, J. Lopes, F. Marini, A. Pomerantsev, O. Rodionova, J.M. Roger, B. Walczak, and R. Tauler, Anal. Bioanal. Chem. 410(26), 6691–6704 (2018).
(11) H. Yang, Spectroscopy 34(11), 40–42 (2019).
(12) H. Mark and J. Workman Jr., Chemometrics in Spectroscopy (Elsevier, Academic Press, New York, New York, 2nd ed., 2018).
Signal Preprocessing
1. Baseline Subtraction
(13) C. Rowlands and S. Elliott, J. Raman Spectrosc. 42(3), 363–369 (2011).
(14) A.T. Weakley, P.R. Griffiths, and D.E. Aston, Appl. Spectrosc. 66(5), 519–529 (2012).
(15) A. Jirasek, G. Schulze, M.M.L. Yu, M.W. Blades, and R.F.B. Turner, Appl. Spectrosc. 58(12), 1488–1499 (2004).
(16) J.R. Powell, F.M. Wasacz, and R.J. Jakobsen, Appl. Spectrosc. 40(3), 339–344 (1986).
2. Derivative Preprocessing
(17) M.N. Leger and A.G. Ryder, Appl. Spectrosc. 60(2), 182–193 (2006).
(18) Y.L. Loethen, D. Zhang, R.N. Favors, S.B. Basiaga, and D. Ben-Amotz, Appl. Spectrosc. 58(3), 272–278 (2004).
(19) A.C. Dotto, R.S.D. Dalmolin, A. ten Caten, and S. Grunwald, Geoderma 314, 262–274 (2018).
(20) B. Zimmermann and A. Kohler, Appl. Spectrosc. 67(8), 892–902 (2013).
3. Detrending
(21) K.E. Jang, S. Tak, J. Jung, J. Jang, Y. Jeong, and Y.C. Ye, J. Biomed. Opt. 14(3), 034004 (2009).
(22) B.K. Alsberg, W.G. Wade, and R. Goodacre, Appl. Spectrosc. 52(6), 823–832 (1998).
(23) D. Cozzolino and A. Moron, Anim. Feed Sci. Technol. 111(1–4), 161–173 (2004).
(24) A. Fassio and D. Cozzolino, Ind. Crops Prod. 20(3), 321–329 (2004).
4. Mean Centering
(25) A. Afkhami and M. Bahram, Talanta 66(3), 712–720 (2005).
(26) J.B. Cooper, Chemometr. Intell. Lab. Syst. 46(2), 231–247 (1999).
(27) M.P. Gómez-Carracedo, J.M. Andrade, D.N. Rutledge, and N.M. Faber, Anal. Chim. Acta 585(2), 253–265 (2007).
(28) A. Lorber, K. Faber, and B.R. Kowalski, J. Chemom. 10(3), 215–220 (1996).
5. Multiplicative Signal Correction
(29) Y.P. Du, S. Kasemsumran, K. Maruo, T. Nakagawa, and Y. Ozaki, Anal. Sci. 21(8), 979–984 (2005).
(30) G.E. Fodor, R.A. Mason, and S.A. Hutzler, Appl. Spectrosc. 53(10), 1292–1298 (1999).
(31) H. Martens and E. Stark, J. Pharm. Biomed. Anal. 9(8), 625–635 (1991).
(32) A. Kohler, J. Sulé-Suso, G.D. Sockalingum, M. Tobin, F. Bahrami, Y. Yang, J. Pijanka, P. Dumas, M. Cotte, D.G. Van Pittius, and G. Parkes, Appl. Spectrosc. 62(3), 259–266 (2008).
6. Normalization
(33) J. Palacký, P. Mojzeš, and J. Bok, J. Raman Spectrosc. 42(7), 1528–1539 (2011).
(34) Å. Rinnan, F. Van Den Berg, and S.B. Engelsen, Trends Analyt. Chem. 28(10), 1201–1222 (2009).
(35) M.A. Czarnecki, Appl. Spectrosc. 53(11), 1392–1397 (1999).
(36) N.B. Zorov, A.A. Gorbatenko, T.A. Labutin, and A.M. Popov, Spectrochim. Acta Part B At. Spectrosc. 65(8), 642–657 (2010).
7. Standard Normal Variate
(37) R.J. Barnes, M.S. Dhanoa, and S.J. Lister, Appl. Spectrosc. 43(5), 772–777 (1989).
(38) M.S. Dhanoa, S.J. Lister, R. Sanderson, and R.J. Barnes, J. Near Infrared Spectrosc. 2(1), 43–47 (1994).
(39) Q. Hai-bin, O. Dan-lin, and C. Yi-yu, J. Zhejiang Univ. Sci. B 6(8), 838–843 (2005).
(40) S. Romero-Torres, J.D. Pérez-Ramos, K.R. Morris, and E.R. Grant, J. Pharm. Biomed. Anal. 38(2), 270–274 (2005).
8. Successive Projections Algorithm (SPA)
(41) M.C.U. Araújo, T.C.B. Saldanha, R.K.H. Galvão, T. Yoneyama, H.C. Chame, and V. Visani, Chemometr. Intell. Lab. Syst. 57(2), 65–73 (2001).
(42) S.F.C. Soares, A.A. Gomes, M.C.U. Araújo, A.R. Galvão Filho, and R.K.H. Galvão, Trends Analyt. Chem. 42, 84–98 (2013).
(43) R.K.H. Galvão, M.C.U. Araújo, W.D. Fragoso, E.C. Silva, G.E. Jose, S.F.C. Soares, and H.M. Paiva, Chemometr. Intell. Lab. Syst. 92(1), 83–91 (2008).
(44) M.J.C. Pontes, R.K.H. Galvão, M.C.U. Araújo, P.N.T. Moreira, O.D.P. Neto, G.E. Jose, and T.C.B. Saldanha, Chemometr. Intell. Lab. Syst. 78(1–2), 11–18 (2005).
9. Wavelets
(45) B.K. Alsberg, A.M. Woodward, M.K. Winson, J. Rowland, and D.B. Kell, Analyst 122(7), 645–652 (1997).
(46) B. Walczak, E. Bouveresse, and D.L. Massart, Chemometr. Intell. Lab. Syst. 36(1), 41–51 (1997).
(47) J. Trygg and S. Wold, Chemometr. Intell. Lab. Syst. 42(1–2), 209–220 (1998).
(48) P.J. Brown, T. Fearn, and M. Vannucci, J. Am. Stat. Assoc. 96(454), 398–408 (2001).
Component Analysis
10. Classical Least Squares (CLS)
(49) D.M. Haaland and D.K. Melgaard, Vib. Spectrosc. 29(1–2), 171–175 (2002).
(50) D.M. Haaland and D.K. Melgaard, Appl. Spectrosc. 55(1), 1–8 (2001).
(51) D.K. Melgaard, D.M. Haaland, and C.M. Wehlburg, Appl. Spectrosc. 56(5), 615–624 (2002).
(52) T.G. Diaz, A. Guiberteau, J.O. Burguillos, and F. Salinas, Analyst 122(6), 513–517 (1997).
11. Independent Component Analysis (ICA)
(53) J.D. Bayliss, J.A. Gualtieri, and R.F. Cromp, “Analyzing Hyperspectral Data with Independent Component Analysis,” in 26th AIPR Workshop: Exploiting New Image Sources and Sensors, 3240, 133–143, International Society for Optics and Photonics (1998).
(54) J. Chen and X.Z. Wang, J. Chem. Inform. Comput. Sci. 41(4), 992–1001 (2001).
(55) J.M. Nascimento and J.M. Dias, IEEE Trans. Geosci. Remote Sens. 43(1), 175–187 (2005).
(56) N. Pasadakis and A.A. Kardamakis, Anal. Chim. Acta 578(2), 250–255 (2006).
12. Inverse Adding Doubling (IAD)
(57) S. Prahl, “Everything I Think You Should Know About Inverse Adding-Doubling,” Oregon Medical Laser Center, St. Vincent Hospital, 1–74 (2011).
(58) J. Yao, “Inverse Adding-Doubling Method for the Determination of Optical Properties of Thermotropic Material,” in 2010 International Conference on Display and Photonics, 7749, 77490V, International Society for Optics and Photonics (2010).
(59) S. Bellini, R. Bendoula, E. Latrille, and J.M. Roger, Appl. Spectrosc. 68(10), 1154–1167 (2014).
(60) W. Wang, C. Li, and R.D. Gitaitis, Trans. ASABE 57(6), 1771–1782 (2014).
13. Multivariate Curve Resolution (MCR)
(61) A. de Juan and R. Tauler, Crit. Rev. Anal. Chem. 36(3–4), 163–176 (2006).
(62) A. de Juan, J. Jaumot, and R. Tauler, Anal. Meth. 6(14), 4964–4976 (2014).
(63) Y. Xie, W. Cao, S. Krishnan, H. Lin, and N. Cauchon, Pharm. Res. 25(10), 2292 (2008).
(64) M. Garrido, F.X. Rius, and M.S. Larrechi, Anal. Bioanal. Chem. 390(8), 2059–2066 (2008).
14. Principal Components Analysis (PCA)
(65) E.K. Kemsley, Chemometr. Intell. Lab. Syst. 33(1), 47–61 (1996).
(66) E.J. Hasenoehrl and P.R. Griffiths, Appl. Spectrosc. 47(5), 643–650 (1993).
(67) R.C. Pereira, V.L. Skrobot, E.V. Castro, I.C. Fortes, and V.M. Pasa, Energy Fuels 20(3), 1097–1102 (2006).
(68) C.W. Chang, D.A. Laird, M.J. Mausbach, and C.R. Hurburgh, Soil Sci. Soc. Am. J. 65(2), 480–490 (2001).