May 20, 2025

Describing Their Two-Step Neural Model: An Interview with Ayanjeet Ghosh and Rohit Bhargava

In the second part of this three-part interview, Ayanjeet Ghosh of the University of Alabama and Rohit Bhargava of the University of Illinois Urbana-Champaign discuss how machine learning (ML) is used in data analysis and go into more detail about the model they developed in their study.

Key Takeaways

A new two-step regressive neural network significantly improves DFIR imaging by reconstructing high-resolution spectra using just seven wavenumbers.

Unlike traditional methods like PCA, this model eliminates the need for dimensionality reduction while offering greater interpretability and scalability for biomedical applications.

The model uses two artificial neural networks: one to reconstruct full spectra from sparse data, and a second to quantify protein secondary structures by predicting Gaussian peak areas.

Although trained on simulated data representative of common protein structures, the model has been validated on real tissue samples and shows promise for generalizability across tissue types.

A recent study in Applied Spectroscopy introduced a two-step regressive neural network model that improves discrete frequency infrared (IR) imaging for biomedical use, especially in studying protein structures in tissues affected by neurodegenerative diseases (1). Unlike traditional methods like principal component analysis (PCA), which are less interpretable and require dense spectral data, this model uses only seven wavenumbers to accurately reconstruct high-resolution spectra and predict structural features (1). As a result, it significantly accelerates both data acquisition and analysis, offering a more efficient and scalable solution for IR imaging in biomedical research (1).

Two of the authors of this study, Ayanjeet Ghosh, who is a professor in the Department of Chemistry and Biochemistry at the University of Alabama, and Rohit Bhargava, who is a professor in the Department of Bioengineering at the University of Illinois, Urbana-Champaign, recently sat down with Spectroscopy to discuss their findings (2,3).

How is machine learning (ML) used for data analysis, and why is principal component analysis (PCA) insufficient for analyzing sparsely sampled discrete frequency IR data, especially in biomedical applications?

ML approaches have been widely used for chemical imaging, specifically for both Fourier transform infrared (FT-IR) and discrete frequency IR (DFIR) microscopies. In these applications, spectral parameters such as intensities and frequencies are leveraged to distinguish between disease states, such as early-stage vs. metastatic cancer, or to identify chemical signatures underlying pathological markers, such as the composition of protein aggregates in neurodegenerative diseases. PCA is a dimensionality reduction technique typically used in conjunction with FT-IR imaging. DFIR does not require tools like PCA because it already provides only the spectral data relevant to the chemical characterization of a given specimen; the sparse spectral sampling in DFIR makes dimensionality reduction unnecessary.

Could you walk us through the architecture and design of the two-step regressive neural network model you developed? How does it address the challenges of curve fitting at scale?

Our two-step neural network is designed to perform the two key steps necessary to quantify protein secondary structures from discrete frequency data. It first reconstructs the full spectrum from seven wavenumbers, and then predicts the areas under the curve (AUCs) of the underlying spectral components for structural quantification, a task typically done using band fitting.

Step 1 (ANN1): Upscales 7-point sparse spectra to 41-point full spectra of the amide I region, using three hidden layers (16, 32, and 64 neurons) with ReLU activation.
Step 2 (ANN2): Predicts three Gaussian AUCs from the upsampled spectra, using two hidden layers (16 and 9 neurons) with SELU activation.
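As a rough illustration of the two steps described above, the layer sizes can be chained into a plain forward pass. This is only a sketch under stated assumptions: the weights are random placeholders rather than trained values, the output layers are assumed to be linear (the interview gives only the hidden-layer sizes and activations), and the names `build_mlp`, `forward`, and the sample absorbances are hypothetical.

```python
import math
import random

# SELU constants (standard published values).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def relu(x):
    return x if x > 0.0 else 0.0


def selu(x):
    return SELU_SCALE * (x if x > 0.0 else SELU_ALPHA * (math.exp(x) - 1.0))


def identity(x):
    return x


def init_layer(n_in, n_out, rng):
    """Random placeholder weights; a trained model would load fitted values."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def build_mlp(sizes, rng):
    """Create one (weights, biases) pair per consecutive pair of layer sizes."""
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(layers, x, hidden_activation):
    """Hidden layers use hidden_activation; the output layer is assumed linear."""
    for i, (weights, biases) in enumerate(layers):
        act = hidden_activation if i < len(layers) - 1 else identity
        x = [act(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x


rng = random.Random(0)
ann1 = build_mlp([7, 16, 32, 64, 41], rng)  # step 1: sparse -> full spectrum
ann2 = build_mlp([41, 16, 9, 3], rng)       # step 2: full spectrum -> 3 AUCs

sparse_spectrum = [0.2, 0.5, 0.9, 1.0, 0.7, 0.4, 0.1]  # made-up absorbances
full_spectrum = forward(ann1, sparse_spectrum, relu)
band_areas = forward(ann2, full_spectrum, selu)
print(len(full_spectrum), len(band_areas))  # 41 3
```

With untrained weights the numbers themselves are meaningless; the point is only the shape of the pipeline, where one inexpensive pass per pixel replaces an iterative fit.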

Our approach is ~3000x faster than Gaussian fitting (a figure based exclusively on the computational resources available to us at the time), which is particularly relevant for large images with more than 1 million pixels.
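For context, the Gaussian band fitting mentioned here is commonly written as a sum of Gaussian components, each with a closed-form area. The notation below (amplitude $a_k$, center $\tilde{\nu}_k$, width $\sigma_k$) is assumed for illustration and is not taken from the study:

```latex
S(\tilde{\nu}) = \sum_{k=1}^{3} a_k \exp\!\left(-\frac{(\tilde{\nu}-\tilde{\nu}_k)^2}{2\sigma_k^2}\right),
\qquad
\mathrm{AUC}_k = \int_{-\infty}^{\infty} a_k \exp\!\left(-\frac{(\tilde{\nu}-\tilde{\nu}_k)^2}{2\sigma_k^2}\right) d\tilde{\nu}
= a_k\,\sigma_k\sqrt{2\pi}.
```

A conventional fit has to estimate $a_k$, $\tilde{\nu}_k$, and $\sigma_k$ iteratively for every pixel before the areas can be computed, which is the per-pixel cost the network avoids.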

Your model requires only seven wavenumbers to generate high-resolution spectral predictions—how did you determine the optimal spectral frequencies to sample, and how generalizable is this selection across tissue types or applications?

Our goal was to use the minimum number of spectral bands to reconstruct full-resolution spectra. We trained our models on largely simulated spectral data composed of three components, representative of the most common secondary structural elements in proteins. The number of bands was chosen empirically based on performance tests comparing mean absolute error (MAE) of the model vs. the band count. The specific wavenumbers chosen were not necessarily tied to specific structures but were selected to best reconstruct the spectrum. We found that model performance was slightly better with hand-picked bands than with bands uniformly spaced across the amide I range.
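The kind of MAE-versus-band-count test described above can be sketched on simulated three-component spectra. Everything specific here is an assumption made for illustration: the 1600 to 1700 cm^-1 grid, the band centers, the amplitude and width ranges, the uniformly spaced sampling, and the use of linear interpolation as a simple stand-in for the trained reconstruction network.

```python
import math
import random

# Hypothetical 41-point grid over an assumed amide I window (1600-1700 cm^-1).
GRID = [1600.0 + 2.5 * i for i in range(41)]
# Assumed band centers (cm^-1) for three components; illustrative only.
CENTERS = (1630.0, 1655.0, 1680.0)


def simulate_spectrum(rng):
    """Sum of three Gaussians with random amplitudes and widths."""
    amps = [rng.uniform(0.1, 1.0) for _ in CENTERS]
    sigmas = [rng.uniform(6.0, 12.0) for _ in CENTERS]
    return [
        sum(a * math.exp(-((nu - c) ** 2) / (2.0 * s ** 2))
            for a, c, s in zip(amps, CENTERS, sigmas))
        for nu in GRID
    ]


def uniform_band_indices(n_bands):
    """Grid indices of n_bands evenly spaced points, endpoints included."""
    last = len(GRID) - 1
    return [round(i * last / (n_bands - 1)) for i in range(n_bands)]


def interpolate(indices, values):
    """Piecewise-linear reconstruction of the full grid from sampled points."""
    out = []
    for j in range(len(GRID)):
        k = 0
        # Advance to the segment whose right endpoint is at or beyond j.
        while k < len(indices) - 2 and j > indices[k + 1]:
            k += 1
        lo, hi = indices[k], indices[k + 1]
        t = (j - lo) / (hi - lo)
        out.append(values[k] + t * (values[k + 1] - values[k]))
    return out


def mean_absolute_error(truth, estimate):
    return sum(abs(t - e) for t, e in zip(truth, estimate)) / len(truth)


def mae_for_band_count(n_bands, spectra):
    """Average reconstruction MAE over all spectra for a given band count."""
    idx = uniform_band_indices(n_bands)
    errors = []
    for spectrum in spectra:
        sampled = [spectrum[i] for i in idx]
        errors.append(mean_absolute_error(spectrum, interpolate(idx, sampled)))
    return sum(errors) / len(errors)


rng = random.Random(0)
spectra = [simulate_spectrum(rng) for _ in range(200)]
for n_bands in (3, 5, 7, 9, 11):
    print(n_bands, round(mae_for_band_count(n_bands, spectra), 4))
```

In a setup like this, the band count would be chosen where the error curve flattens. Swapping `uniform_band_indices` for a hand-picked index list is the natural way to compare the two sampling strategies mentioned in the answer.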

The data chosen to train the models was designed to capture the possible variations of the amide I IR spectra as typically observed in biological specimens. Hence, this model should be generalizable across different tissue types. We have recently verified this by comparing the model output with band fitting for breast cancer tissue biopsies. However, retraining of the models may be necessary for specific applications where the spectra are known to be composed of additional structural components.

References

(1) Edmonds, H.; Mukherjee, S. S.; Holcombe, B.; et al. Quantification of Protein Secondary Structures from Discrete Frequency Infrared Images Using Machine Learning. Appl. Spectrosc. 2025, ASAP. DOI: 10.1177/00037028251325553
(2) The University of Alabama, Ayanjeet Ghosh. UA.edu. Available at: https://chemistry.ua.edu/people/dr-ayanjeet-ghosh/ (accessed 2025-05-01).
(3) University of Illinois, Urbana-Champaign, Rohit Bhargava. Illinois.edu. Available at: https://bioengineering.illinois.edu/people/rxb (accessed 2025-05-01).
