Near-infrared (NIR) Raman spectroscopy has become widespread in different areas of the analytical study of materials, including
forensic science. The nondestructive character, high selectivity, and possibility to explore a variety of materials make NIR
Raman a valuable tool for the identification of body-fluid traces. In this study, we combined regression and classification
chemometric algorithms to achieve effective discrimination of pure body fluids from their binary mixtures. Raman spectra
of dried blood, semen, and their mixtures in different ratios, collected in an automatic mapping manner, were used as a model
system. The established detection limit for minor contributors is as low as a few percent. The proposed methodology takes
into account the intrinsic heterogeneity of blood and semen and their variation between donors, and it could potentially be
applied to other mixtures, including those of interest to forensic specialists.
In this article, we outline a recently published study on differentiating semen and blood mixtures from individual body fluids
(1). The ultimate goal of this project is to develop a nondestructive, confirmatory method for the identification of trace
amounts of body fluids for forensic purposes (2,3). Modern forensic tools, such as DNA- and RNA-based methods, provide confirmation
of body-fluid type, race, and gender, and genetic analysis of biological evidence can unambiguously identify a person involved
in a crime. In practice, however, these techniques are labor-intensive and expensive, and the laboratories equipped to perform
them cannot keep pace with the amount of biological evidence associated with crimes. As a result, thousands of body-fluid
samples are waiting to be analyzed. Such analysis can also be difficult or impossible to perform when the amount of evidence
is minute or the evidence is a complex mixture.
Raman spectroscopy is a nondestructive, time-efficient, and easy-to-use technique suitable for the simultaneous characterization
of multiple fluids. Our earlier studies demonstrated the exceptional potential of near-infrared (NIR) Raman spectroscopy
for identifying pure body-fluid stains and encouraged us to address more complicated problems, such as differentiating pure
body fluids from their mixtures (1,3–11). This practical task calls for a simplified formulation of the general statistical
problem of spectroscopic signal demixing: a forensic expert is much more interested in whether particular body fluids are
present in a stain than in their exact contributions (percentages), which are largely incidental.
We studied this problem using mixtures of blood and semen as a model system because of their practical relevance.
Blood and semen mixtures are often found at crime scenes related to sexual assaults. The detection of even minor remnants
of these types of forensic evidence could be crucial for investigating a crime. Several methods for body-fluid identification
were recently proposed and have been used efficiently in forensic laboratories. It is possible to characterize such biological
evidence by fluorescence (12–14), immunological tests (15), electrophoretic separation (16), RNA and DNA profiling (17), and
several other methods (18–22).
Despite the undeniable advantages of the methods listed above, there is still a need for a uniform approach that preserves
the sample for further analysis and provides fast and accurate results. The Raman effect arises from the inelastic scattering
of an incident photon by a vibrating molecule. The frequency shifts of the scattered photons can be plotted as a Raman
spectrum, which provides unique information about the biochemical composition of the analyzed sample (23). The amount
of analyzed matter can be as small as a few femtoliters in volume or a few picograms in mass.
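The relationship between the wavelength of a scattered photon and its position on the Raman shift axis can be sketched in
a few lines of Python. This is a minimal illustration, not part of the published method: the function names are hypothetical,
and the 785 nm excitation wavelength in the example is an assumed value for a typical NIR laser, not one taken from the study.

```python
def raman_shift_cm1(excitation_nm: float, scattered_nm: float) -> float:
    """Raman shift in cm^-1 from the excitation and scattered wavelengths in nm."""
    # 1 nm = 1e-7 cm, so 1e7 / wavelength_nm is the wavenumber in cm^-1.
    return 1e7 / excitation_nm - 1e7 / scattered_nm


def scattered_wavelength_nm(excitation_nm: float, shift_cm1: float) -> float:
    """Wavelength in nm of a Stokes-scattered photon for a given Raman shift."""
    return 1e7 / (1e7 / excitation_nm - shift_cm1)


# Assumed example values: 785 nm excitation and a band at 1000 cm^-1.
example_wavelength = scattered_wavelength_nm(785.0, 1000.0)
example_shift = raman_shift_cm1(785.0, example_wavelength)
```

With these assumed values, a band at 1000 cm^-1 corresponds to scattered light near 852 nm, which is why the detected signal
stays in the near-infrared region.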
Recent developments in analytical instrumentation allow samples to be examined comprehensively using techniques
such as automatic imaging and mapping. In automatic mapping, the instrument collects a spectrum each time the sample moves
under the laser beam by a specified step, until the entire area has been scanned. This approach yields large data sets that
can be preprocessed and analyzed with a variety of statistical software packages. Combining Raman spectroscopy with multivariate
statistics helps address the heterogeneity of body-fluid stains, whose composition can vary between donors and within
a sample. The method we propose can handle samples with varying fluorescence backgrounds and overlapping spectral bands,
which are common features of biological samples.
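The mapping procedure described above amounts to a raster scan. The sketch below shows the idea in Python under stated
assumptions: the names `raster_positions`, `collect_map`, and `fake_acquire` are hypothetical, the step size and grid
dimensions are arbitrary, and `fake_acquire` is a placeholder for whatever call a real instrument would provide.

```python
from typing import Callable, Dict, List, Tuple

Position = Tuple[float, float]


def raster_positions(x0: float, y0: float, step: float, nx: int, ny: int) -> List[Position]:
    """Stage positions for an nx-by-ny rectangular map, visited row by row."""
    return [(x0 + ix * step, y0 + iy * step) for iy in range(ny) for ix in range(nx)]


def collect_map(
    positions: List[Position],
    acquire: Callable[[Position], List[float]],
) -> Dict[Position, List[float]]:
    """Acquire one spectrum at every stage position and store it by position."""
    return {pos: acquire(pos) for pos in positions}


def fake_acquire(pos: Position) -> List[float]:
    # Hypothetical placeholder for an instrument call; returns a dummy 3-point "spectrum".
    x, y = pos
    return [x, y, x + y]


# Assumed example: a 4 x 3 grid with a 2.5-unit step starting at the origin.
positions = raster_positions(0.0, 0.0, 2.5, nx=4, ny=3)
spectra = collect_map(positions, fake_acquire)
```

Each map therefore contributes one spectrum per grid point, which is what produces the large, spatially resolved data sets
that the statistical treatment relies on.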
Our method is based on an effective combination of regression and classification analyses in a multistep discrimination
procedure. First, support vector machine (SVM) regression was used to select the mixture spectra that can be easily distinguished
from those of pure fluids. The final classification model was then built on the selected data using support vector machine
discriminant analysis (SVMDA) and applied to the entire data set. As a result, we were able to distinguish spectra
with minor contributions of blood or semen. The lowest concentrations that we were able to detect were 5% blood in semen
stains and approximately 1% semen in blood stains. Lower concentrations could be detected, but the accuracy of detection
decreased significantly. The proposed approach could potentially be built into portable instruments as a discriminative
algorithm, giving forensic investigators a valuable tool for analysis directly at a crime scene.
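The two-step logic can be illustrated with a small Python sketch. It is one possible reading of the procedure, not the
implementation used in the study. It assumes scikit-learn's `SVR` and `SVC` as stand-ins for the regression and SVMDA
steps. The component spectra are synthetic Gaussian bands rather than measured data, and the noise level, the 0.2 and 0.8
selection cutoffs, and the model parameters are all arbitrary choices made for illustration.

```python
import numpy as np
from sklearn.svm import SVC, SVR

rng = np.random.default_rng(0)
shifts = np.linspace(400, 1800, 200)  # synthetic Raman shift axis, cm^-1


def band(center, width):
    """Gaussian band on the shift axis (a toy spectral feature)."""
    return np.exp(-0.5 * ((shifts - center) / width) ** 2)


# Toy component spectra: hypothetical stand-ins, NOT measured blood or semen spectra.
blood = band(750, 20) + band(1550, 30)
semen = band(1000, 20) + band(1450, 30)


def simulate(blood_fraction):
    """One noisy spectrum per entry of blood_fraction (0 = semen, 1 = blood)."""
    f = np.asarray(blood_fraction, dtype=float)[:, None]
    clean = f * blood + (1.0 - f) * semen
    return clean + rng.normal(0.0, 0.02, size=clean.shape)


X_blood = simulate(np.ones(40))
X_semen = simulate(np.zeros(40))
# A heterogeneous mixture map: the local blood fraction varies from point to point.
X_mix = simulate(rng.uniform(0.0, 1.0, size=120))

# Step 1: SVM regression trained on pure spectra gives each spectrum a composition score.
X_pure = np.vstack([X_blood, X_semen])
score_model = SVR(kernel="linear", C=10.0, epsilon=0.01)
score_model.fit(X_pure, np.r_[np.ones(40), np.zeros(40)])
mix_score = score_model.predict(X_mix)

# Keep only mixture spectra that differ clearly from both pure fluids (assumed cutoffs).
selected = (mix_score > 0.2) & (mix_score < 0.8)

# Step 2: SVM classifier trained on pure spectra plus the selected mixture spectra.
X_train = np.vstack([X_pure, X_mix[selected]])
y_train = np.array(["blood"] * 40 + ["semen"] * 40 + ["mixture"] * int(selected.sum()))
classifier = SVC(kernel="rbf", C=100.0, gamma="scale").fit(X_train, y_train)

# Finally, classify the entire data set, including the mixture spectra set aside in step 1.
pred_blood = classifier.predict(X_blood)
pred_semen = classifier.predict(X_semen)
pred_mix = classifier.predict(X_mix)
```

Training the classifier only on mixture spectra that the regression marks as clearly mixed keeps ambiguous map points,
which may look almost like a pure fluid because of stain heterogeneity, from blurring the class boundaries. Those points
are still classified in the final pass over the whole data set.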