Raman Spectroscopy and Machine Learning-Based Optical Probe for Tuberculosis Diagnosis via Sputum


In the treatment of tuberculosis (TB), a contagious disease that causes 1.5 million deaths per year globally, early diagnosis is critical to controlling its spread. Unfortunately, standard TB diagnostic tests, such as sputum culture, can take days to weeks to yield results. In a recent paper, Ubaid Ullah of the Syed Babar Ali School of Science and Engineering in Pakistan and his colleagues demonstrate a quick, portable, easy-to-use, and non-invasive optical sensor that uses Raman spectroscopy to detect TB in a patient’s sputum supernatant. Ullah spoke to Spectroscopy about this sensor and its development.

Your paper (1) demonstrates a quick, portable, easy-to-use, and non-invasive optical sensor for analysis of sputum samples for tuberculosis (TB) detection. What inspired the development of this sensor?

The current gold standard and most common test for TB diagnosis is the sputum culture, particularly in developing countries where TB has a high impact. However, this test takes days to weeks to produce results. Meanwhile, the potential TB patient is out in the general public, probably infecting more and more people. Therefore, we devised an alternative approach that still makes use of the standard TB lab equipment and testing procedures, and yields results within a couple of minutes. The key is that we replace the time-consuming incubation step (where bacteria in TB liquid samples are nourished to form visible colonies) by exploiting spectroscopic information from the freshly prepared TB sample.

The probe uses Raman spectroscopy to evaluate detection of TB in a patient’s sputum supernatant. What advantages did Raman spectroscopy offer that other techniques available may have lacked?

Multiple techniques, including immunochromatography, polymerase chain reaction (PCR), and piezoelectric, electrochemical, and surface plasmon resonance methods, have been implemented for TB detection. These methods, however, suffer from poor sensitivity and specificity, high operational cost, lack of portability, invasiveness, or harmful byproducts. Raman spectroscopy, on the other hand, does not suffer from these limitations and has been successfully applied to the diagnosis of various diseases, as evident from its use in clinical trials involving hundreds of patients to detect various types of cancer. In fact, companies now exist that provide commercial Raman systems for various disease diagnoses. For example, Kaiser Optical Systems, Inc. develops Raman systems for multiple applications, including cancer, skin, bone, eye, and dental diagnosis.

How did you select the calibration set of samples and the validation set of samples? Were different samples used to develop the discriminant model versus samples used to test the model?

We employed principal component analysis (PCA) for the classification. Since PCA relies on dimensionality reduction, it bypasses the usual classification procedures of training and validation. We ran PCA on 112 patients’ data and chose the first two orthogonal principal components, PC1 and PC2, for segregation.

Your paper states that the red and black dots are underneath the blue line—did you mean to say green and black dots, indicating the TB + samples are separated out from the red TB – samples?

Yes, the blue line serves as the classifier. As shown in Figure 3, the black and green dots below the blue line represent TB + patients on medication and recently diagnosed TB + patients, respectively, while the red dots above the blue line denote TB – patients.
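As a rough illustration of how a straight line in the PC1-PC2 plane can act as a classifier, the sketch below labels a sample by the side of the line it falls on. The function name, slope, and intercept are hypothetical; the interview does not give the parameters of the blue line in Figure 3.

```python
def classify_by_line(pc1, pc2, slope, intercept):
    """Label a sample by its position relative to the line pc2 = slope * pc1 + intercept.

    Following the description above, points below the line are labeled TB+
    and points on or above it are labeled TB-.
    """
    boundary = slope * pc1 + intercept
    return "TB+" if pc2 < boundary else "TB-"


# Hypothetical line parameters, chosen only for illustration.
slope, intercept = 0.5, 0.0

print(classify_by_line(1.0, -2.0, slope, intercept))  # point below the line
print(classify_by_line(1.0, 3.0, slope, intercept))   # point above the line
```

In practice the line would be chosen from the observed separation of the groups in the score plot, not set by hand as it is here.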

Briefly describe your discovery and development process.

Modern, sophisticated machine learning algorithms make it possible to detect very minute variations that were previously impossible to discern with conventional methods. It is well established that the sputum of a person with TB has a range of biomarkers, including esters and proteins. We hypothesize that the presence of these compounds minutely changes the Raman signature of the sputum’s supernatant sample, even after filtering out Mycobacterium tuberculosis (Mtb) bacteria cells. The change in the Raman signature is too weak to identify individual TB biomarkers, but is sufficient to probe TB. However, its weak nature makes it hard to differentiate the probe signal visually. Therefore, we utilized a machine learning algorithm, principal component analysis (PCA), for the classification.

The following is a brief description of the development process. We collected sputum from a potential TB patient and prepared the sample following the standard TB culture protocol. Samples were then filtered into a clean cuvette to remove Mtb bacteria. Following this, we recorded Raman data of the cell-free sample supernatant using a wavelength- and intensity-calibrated, diffraction-limited spectrometer by averaging five frames of the Raman spectrum and subtracting the reference background spectrum. Each Raman spectrum was integrated for 1.5 min. The obtained Raman data were then regularized, a covariance matrix was computed, principal component analysis was performed, and the sample data were plotted in the plane of the highest-variance orthogonal components (PC1, PC2).
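The processing steps described above (frame averaging, background subtraction, covariance matrix, PCA, projection onto PC1 and PC2) can be sketched in a few lines of NumPy. This is a minimal sketch on synthetic data, not the authors' code: the function names, array sizes, and the synthetic band are hypothetical, and "regularized" is assumed here to mean per-channel standardization.

```python
import numpy as np


def preprocess(frames, background):
    """Average the repeated frames of one sample and subtract the reference background."""
    return np.mean(frames, axis=0) - background


def pca_scores(spectra, n_components=2):
    """Project spectra (n_samples x n_channels) onto the leading principal components."""
    # Assumption: "regularized" is read as zero-mean, unit-variance scaling per channel.
    centered = spectra - spectra.mean(axis=0)
    std = centered.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # guard against constant channels
    z = centered / std

    # Covariance matrix of the standardized data, then its eigen-decomposition.
    cov = np.cov(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]
    return z @ components, eigvals[order], components


# Synthetic stand-in for measured data: 20 samples, 5 frames each, 50 spectral channels.
rng = np.random.default_rng(0)
n_samples, n_frames, n_channels = 20, 5, 50
background = rng.normal(1.0, 0.01, n_channels)
frames = rng.normal(0.0, 0.05, (n_samples, n_frames, n_channels)) + background
frames[:10, :, 20:25] += 0.2  # weak synthetic band present in the first 10 samples only

spectra = np.array([preprocess(f, background) for f in frames])
scores, variances, components = pca_scores(spectra)
print(scores.shape)  # one (PC1, PC2) pair per sample
```

Each row of `scores` is one sample's position in the PC1-PC2 plane, which is the kind of scatter plot the interview refers to.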

How does this work differ from what has been previously done by yourself or others?

As stated earlier, Raman spectroscopy has been successfully applied to detect various diseases, including cancer and skin diseases, and companies now exist that provide commercial Raman systems for various disease diagnoses. However, Raman spectroscopy is not widely exploited for TB detection, since TB is predominantly a problem of developing countries. Researchers have utilized Raman spectroscopy to detect TB in blood and TB meningitis in cerebrospinal fluid (2,3). Even though these approaches accelerate TB diagnosis, their invasive nature limits their applications. A subsequent study that employed Raman spectroscopy of individual mycobacterium cells for TB detection offered even species-level information (4) and achieved an accuracy of 94%. However, the complex sample preparation process requires a specialist. Furthermore, human sampling was missing in the study. Here, for the first time, we utilize Raman spectroscopy and a machine learning algorithm to detect TB in sputum supernatant. The probe produces quick results, does not require trained laboratory personnel, is non-invasive in nature, and uses the already established state-of-the-art TB test facilities.

Please summarize your findings.

We demonstrate for the first time the application of Raman spectroscopy toward rapid TB detection using sputum samples. The sensor offers 100% true-positive and 93.3% true-negative accuracy after testing 112 TB patients. Significantly, our probe can provide a quick TB test result (within a few minutes) from a sputum sample already processed for the culture test. The probe can be easily integrated into conventional TB diagnostic labs where smear microscopy and culture tests are routinely conducted.
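For readers less familiar with these metrics, the true-positive rate (sensitivity) and true-negative rate (specificity) are computed from the confusion counts as TP / (TP + FN) and TN / (TN + FP). The sketch below shows the arithmetic on a small made-up set of labels; the function name and the example labels are hypothetical and are not the study's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for boolean labels, where True means TB+."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference label")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up example: 4 TB+ and 6 TB- reference labels, with one false positive.
y_true = [True] * 4 + [False] * 6
y_pred = [True] * 4 + [True] + [False] * 5

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```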

Were there any particular limitations or challenges you encountered in your work?

One particular limitation of our setup is its high capital cost, mainly due to the Raman spectrometer. In addition, working with and gathering TB samples during the COVID-19 pandemic was a huge challenge for us.

Can you please summarize the feedback that you have received from others regarding this work?

In addition to applause and greetings, we were advised to apply this technique to other infectious diseases such as malaria and pneumonia. Furthermore, it was suggested that we make a stand-alone commercial device out of this work.

What are the next steps in this research?

We are looking for further funding to make this probe more robust, reduce its capital cost, and make it feasible to replace the standard TB culture test with a low-cost setup.

Do you think this work can be taken into clinical trials for use in diagnostics?

Yes, this work has the capability to be taken into clinical trials for use in diagnostics.


  1. U. Ullah, Z. Tahir, O. Qazi, S. Mirza, and M.I. Cheema, Tuberculosis 136, 102251 (2022).
  2. S. Khan, R. Ullah, S. Shahzad, N. Anbreen, M. Bilal, and A. Khan, Photodiagn. Photodyn. Ther. 24, 286-291 (2018).
  3. R. Sathyavathi, N. C. Dingari, I. Barman, P. S. R. Prasad, S. Prabhakar, D. N. Rao, R. R. Dasari, and J. Undamatla, J. Biophotonics 6(8), 567-572 (2013).
  4. S. Stöckel, S. Meisel, B. Lorenz, S. Kloß, S. Henk, S. Dees, E. Richter, S. Andres, M. Merker, I. Labugger, and P. Rösch, J. Biophotonics 10(5), 727-734 (2017).

Ubaid Ullah

Ubaid Ullah received his BS degree in Electronics from the University of Peshawar in 2014, and his M.Phil. and PhD degrees in Electronics and Electrical Engineering from Quaid-i-Azam University (Islamabad, Pakistan) and Lahore University of Management Sciences (Lahore, Pakistan) in 2016 and 2021, respectively. He is currently working as a Postdoctoral Researcher in the Bio-Agri-Photonics Lab at Lahore University of Management Sciences. His research interests include the design and development of optical sensors for biomedical and chemical applications.
