Light and AI Unite: Raman Breakthrough in Noninvasive Lung Cancer Detection


Harun Hano, Charles H. Lawrie, Beatriz Suarez, and colleagues from the Department of Physics at the University of the Basque Country (UPV/EHU) and IKERBASQUE, the Basque Foundation for Science, both in Spain, have published a research paper in the journal ACS Omega describing the use of Raman spectroscopy with specialized data treatment for the diagnosis of lung cancer.

Raman light and AI technology unite: © dejanns -

Lung cancer is the leading cause of cancer-related deaths worldwide, which underscores the urgent need for reliable and efficient diagnostic methods. Conventional approaches often involve invasive procedures that are time-consuming and costly and that carry a risk of infection, potentially delaying effective treatment. The current study, published in ACS Omega by scientists from the University of the Basque Country and IKERBASQUE, the Basque Foundation for Science, in Spain, explores the potential of Raman spectroscopy as a promising noninvasive technique (1).

In the study, Raman spectroscopy was evaluated as a diagnostic technique by analyzing human blood plasma samples from lung cancer patients and healthy controls. In a benchmark comparison, 16 machine learning (ML) models were evaluated using four strategies: combining dimensionality reduction with classifiers, applying feature selection before classification, using stand-alone classifiers, and using a unified predictive model. Because of the inherent complexity of the data, the models performed differently, with accuracies from 0.77 to 0.85 and areas under the receiver operating characteristic curve (AUC-ROC) from 0.85 to 0.94. Hybrid methods incorporating dimensionality reduction and feature selection algorithms achieved the highest figures of merit. Nevertheless, all ML models delivered creditable scores, demonstrating that Raman spectroscopy is potentially a powerful method for future in vitro diagnostics of lung cancer (1).
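
For readers less familiar with the two figures of merit quoted above, the following is a minimal sketch of how accuracy and AUC-ROC can be computed for a two-class problem. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all illustrative assumptions, and the AUC is computed in its pairwise-ranking form (the fraction of patient/control pairs in which the patient receives the higher score).

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC-ROC as the probability that a random positive sample
    outscores a random negative sample (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = lung cancer, 0 = healthy control.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]          # made-up classifier outputs
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("accuracy:", accuracy(y_true, y_pred))
print("AUC-ROC:", roc_auc(y_true, scores))
```

Accuracy depends on the chosen threshold, whereas AUC-ROC summarizes ranking quality over all thresholds, which is one reason a model's AUC can be noticeably higher than its accuracy.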

Early detection of diseases has become increasingly important. Timely and accurate diagnosis of lung cancer is crucial for effective treatment and better survival rates. However, conventional methods are often expensive, time-consuming, and have limited sensitivity in the early stages. In contrast, Raman spectroscopy has emerged as a promising diagnostic technique that enables noninvasive, label-free, and real-time analysis (1).

Raman Spectroscopy and Its Significance

Raman spectroscopy is based on the inelastic scattering of light: a small fraction of the incident photons exchange energy with the molecular vibrations of the sample, gaining or losing energy and therefore emerging at a shifted wavelength. This shift, known as the Raman shift and usually expressed in wavenumbers (cm⁻¹), corresponds to the frequency of the molecular vibration involved. This highly effective and nondestructive approach can provide insight into the molecular composition of biological fluids. Human blood plasma, a complex biological fluid composed of proteins, lipids, nucleic acids, and carbohydrates, is an excellent source for identifying biochemical changes. Raman spectroscopy can therefore be used to analyze the spectral signatures of blood plasma and provide valuable diagnostic information (1).
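
The relationship between wavelength and Raman shift can be made concrete with a short calculation. The sketch below uses the standard conversion, shift (cm⁻¹) = 10⁷/λ_excitation − 10⁷/λ_scattered with wavelengths in nanometers. The function names and the 785 nm and 850 nm example wavelengths are illustrative assumptions; the article does not state which laser the study used.

```python
def raman_shift_cm1(excitation_nm: float, scattered_nm: float) -> float:
    """Raman shift in cm^-1 from excitation and scattered wavelengths in nm.

    Positive values correspond to Stokes scattering (photon loses energy),
    negative values to anti-Stokes scattering (photon gains energy).
    """
    if excitation_nm <= 0 or scattered_nm <= 0:
        raise ValueError("wavelengths must be positive")
    # 1 nm = 1e-7 cm, so the wavenumber in cm^-1 is 1e7 / wavelength_nm.
    return 1e7 / excitation_nm - 1e7 / scattered_nm


def scattered_wavelength_nm(excitation_nm: float, shift_cm1: float) -> float:
    """Inverse conversion: wavelength at which a given shift is observed."""
    wavenumber = 1e7 / excitation_nm - shift_cm1
    if wavenumber <= 0:
        raise ValueError("shift is too large for this excitation wavelength")
    return 1e7 / wavenumber


# Hypothetical example: 785 nm excitation, light detected at 850 nm.
shift = raman_shift_cm1(785.0, 850.0)
print(f"Raman shift: {shift:.1f} cm^-1")
print(f"Round trip:  {scattered_wavelength_nm(785.0, shift):.1f} nm")
```

Because the shift is a difference of wavenumbers, it characterizes the molecular vibration rather than the laser, which is why spectra recorded with different excitation wavelengths can be compared on the same axis.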

Research Focus and Methodology

This research primarily focused on the performance of several ML models in discriminating between the spectral signatures of human blood plasma samples from lung cancer patients and those from healthy controls. A set of 16 different machine learning models was examined, covering various combinations of a feature selection method, transformation techniques, and classifiers. Principal component analysis (PCA), a commonly used technique for dimensionality reduction, was applied along with classifiers such as linear discriminant analysis (LDA), support vector machine (SVM), Naïve Bayes (NB), logistic regression (LR), and random forest (RF). The models were further extended with the Fisher score (FS) feature selection method in various configurations. Standalone classifiers and partial least-squares discriminant analysis (PLS-DA) were studied independently (1).
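
To illustrate the feature selection step, the following minimal sketch ranks spectral features by a Fisher score. It assumes one common definition (between-class scatter of the feature means divided by the pooled within-class variance); the paper's exact formulation may differ. The function names and the tiny three-feature data set are hypothetical and stand in for Raman intensities at three wavenumbers.

```python
from statistics import fmean, pvariance


def fisher_scores(X, y):
    """Fisher score of each feature (column of X) for class labels y.

    score_j = sum_k n_k * (mean_kj - mean_j)^2 / sum_k n_k * var_kj
    A higher score means the classes are better separated along feature j.
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = fmean(column)
        between, within = 0.0, 0.0
        for c in classes:
            vals = [v for v, label in zip(column, y) if label == c]
            between += len(vals) * (fmean(vals) - overall_mean) ** 2
            within += len(vals) * pvariance(vals)
        if within > 0:
            scores.append(between / within)
        else:
            # No within-class spread: perfectly separated or constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def top_k_features(scores, k):
    """Indices of the k highest-scoring features, best first."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


# Hypothetical intensities at three wavenumbers for six plasma samples.
X = [
    [1.0, 5.0, 2.0],
    [1.2, 3.0, 2.5],
    [0.8, 4.0, 1.5],
    [3.0, 4.0, 2.5],
    [3.2, 5.0, 3.0],
    [2.8, 3.0, 2.0],
]
y = [0, 0, 0, 1, 1, 1]  # 0 = healthy control, 1 = lung cancer

scores = fisher_scores(X, y)
print("Fisher scores:", [round(s, 3) for s in scores])
print("Top 2 features:", top_k_features(scores, 2))
```

In a full workflow, the retained features would then be passed to a transformation such as PCA or directly to a classifier such as LDA or SVM, and the selection would be repeated inside each cross-validation fold so that test samples do not influence the ranking.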

Key Findings

The study's comprehensive analysis reaffirms the potential of Raman spectroscopy as a promising tool for lung cancer detection. Comparing the Raman spectra of lung cancer patients and healthy controls revealed significant differences in spectral features, highlighting the technique's considerable potential to provide insight into the molecular alterations associated with lung cancer. The entire spectral range of the Raman spectra from human blood plasma is crucial for identifying these changes and for elucidating the compositional and structural modifications that occur in proteins, carbohydrates, lipids, nucleic acids, and other biomolecules (1).

In the presented analysis, the model combining PCA with LDA and the model combining PCA with FS and SVM exhibited the highest accuracy, both falling within 0.85 ± 0.14 and featuring area under the curve (AUC) scores above 0.93. LDA also demonstrated robust performance metrics, even without feature extraction methods. PLS-DA, though slightly behind in accuracy, held a respectable AUC score of 0.90, signaling its reliability. Among standalone classifiers, NB distinguished itself with a competitive accuracy of 0.82 ± 0.13. Overall, the findings indicate that while PCA-enhanced models offer the highest accuracy and AUC scores, simpler models such as LDA and PLS-DA remain robust choices, depending on the specific requirements of a given application (1).


This study highlights the potential of Raman spectroscopy as a diagnostic tool for lung cancer detection and emphasizes the benefits of employing machine learning models to classify spectral data (1,2). The research underlines the role of model selection and the importance of multivariate analysis methods in attaining superior performance. Because different models can be matched to the specific needs of a task, the approach could yield more accurate and effective diagnostic tools, and in turn earlier detection, improved treatment, and better patient outcomes.

Raman spectroscopy data supported by artificial intelligence offers a rapid and low-cost technology for in vitro diagnostics. Once validated and calibrated for specific disease patterns, the proposed technology could replace complex chemical analyses and provide detailed insight into biochemical changes in physiology in real time. The technology is not limited to lung cancer and has the potential to bring about a paradigm shift in medical diagnostics, representing a significant step forward in medical research and offering new hope to millions worldwide (1,2).


(1) Hano, H.; Lawrie, C. H.; Suarez, B.; Paredes Lario, A.; Elejoste Echeverría, I.; Gómez Mediavilla, J.; Crespo Cruz, M. I.; Lopez, E.; Seifert, A. Power of Light: Raman Spectroscopy and Machine Learning for the Detection of Lung Cancer. ACS Omega 2024, 9 (12), 14084–14091. DOI: 10.1021/acsomega.3c09537

(2) Santos, I. P.; Barroso, E. M.; Schut, T. C. B.; Caspers, P. J.; van Lanschot, C. G.; Choi, D. H.; Van Der Kamp, M. F.; Smits, R. W.; Van Doorn, R.; Verdijk, R. M.; Hegt, V. N. Raman Spectroscopy for Cancer Detection and Cancer Surgery Guidance: Translation to the Clinics. Analyst 2017, 142 (17), 3025–3047. DOI: 10.1039/C7AN00957G
