Research on Coal Classification Method Based on Terahertz Time-Domain Spectroscopy


The traditional method of coal classification requires the measurement of various parameters of coal samples to yield accurate results. However, this detection process is time-consuming and laborious, making the rapid classification of coal species unfeasible. A proposed solution is a coal species classification method that combines terahertz time-domain spectroscopy with machine learning, specifically principal component analysis (PCA) and cluster analysis (CA). Using terahertz (THz) time-domain spectroscopy (TDS), the absorption coefficient, dielectric constant, and refractive index were obtained for lignite, bituminous coal, and anthracite samples. A coal classification model was then established by integrating PCA and CA. The results indicate that the PCA-CA classification model based on the refractive index spectrum identifies the coal species of all seven samples correctly (100% classification accuracy). The PCA-CA classification model based on the absorption coefficient spectrum identifies lignite accurately but does not separate bituminous coal from anthracite as effectively. These findings suggest that THz-TDS technology can accurately identify different kinds of coal, providing a novel approach for coal type classification.

Coal remains one of the most dominant energy sources in China. Despite a year-on-year decrease in coal consumption in recent years, China’s energy structure, characterized by abundant coal and scarce oil, ensures that coal’s primary status as an energy source will not fundamentally change in the short term. Coal will continue to serve as the stabilizer and ballast of China’s energy security (1).

Coal can be broadly categorized into three types based on the degree of coalification: lignite, bituminous coal, and anthracite coal. The price, usage, and environmental pollution vary among these different coal types. In order to use coal resources rationally and efficiently, and to achieve the vision of carbon neutrality as early as possible (2), it is crucial to quickly and accurately identify coal types.

Currently, most rapid coal classification methods (3), such as nuclear detection technology (4) and dual-energy γ-ray measurement (5), can obtain only a single indicator of the coal. However, accurate identification of coal types requires multiple indicators for comprehensive research and judgment, rendering these methods insufficient for rapid coal type classification. Terahertz technology has developed rapidly and shows great application prospects in material identification (6–9) and nondestructive testing (10–13). Terahertz waves are electromagnetic waves situated between the microwave and infrared bands, with frequencies of 0.1–10 THz and wavelengths of 0.03–3 mm. Because of this position in the electromagnetic spectrum, they possess many unique properties compared with other bands of electromagnetic radiation (14,15).
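The quoted wavelength range follows from the frequency range through λ = c/f. A minimal Python check, assuming free-space propagation (the helper name `wavelength_mm` is illustrative and not from the article):

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def wavelength_mm(freq_thz: float) -> float:
    """Free-space wavelength in millimetres for a frequency given in THz."""
    return C / (freq_thz * 1e12) * 1e3

# Band edges quoted in the text, plus 1 THz for orientation.
for f in (0.1, 1.0, 10.0):
    print(f"{f:>4} THz -> {wavelength_mm(f):.3f} mm")
```

The band edges of 0.1 THz and 10 THz map to roughly 3 mm and 0.03 mm, which matches the stated range.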

This study combines terahertz time-domain spectroscopy with two machine learning algorithms to qualitatively identify lignite, bituminous coal, and anthracite coal. The goal is to explore an accurate and fast method for identifying coal species.

Experimental Setup

Sample Preparation

The experiments involved the use of three types of coal powders: lignite (ZBM006), anthracite (ZBM093, ZBM097A), and bituminous coal (ZBM104, ZBM100D, ZBM100E, ZBM124). The lignite sample was sourced from the Xilinhot Shengli coal mine in Inner Mongolia. Following air-drying, crushing, grinding, and sieving, the physical properties and chemical composition of the coal sample were analyzed through industrial and X-ray diffraction (XRD) analysis. Both the anthracite and bituminous coal sample powders are coal-standard substances obtained from the National Standard Material Network. Their physical properties and chemical composition are provided by the National Material Center. The physical properties and chemical composition of the three types of coal samples are presented in Table I.

A certain mass of sample powder was weighed before the test. To prevent the pressed tablet from breaking because of its small thickness, high-density polyethylene (HDPE) powder was added and mixed thoroughly with the sample at a ratio of 1:5 (sample powder:polyethylene powder). A certain mass of the mixed powder was then weighed. The pressure of the tablet press was set to 10 MPa and the pressing time to 5 min, yielding a circular tablet about 1.1 mm thick with no cracks on its upper or lower surface.

Terahertz Spectroscopic Detection

In this experiment, the TAS 7400SU transmissive THz-TDS system from Advantest was utilized for the detection of coal samples. This system comprises a femtosecond laser, a terahertz emitter, a terahertz detector, and a time-delay system. The system’s frequency resolution is 7.6 GHz, and its spectrum range is 0.5–7 THz. To minimize experimental error, the measurements were conducted in an environment with an ambient temperature of 24 °C. Each sample’s signal was averaged after three repeated measurements. The air humidity in the optical path part was maintained below 1% RH. The general structure of the system is depicted in Figure 1.

Figure 1: Terahertz time-domain spectral system.


Optical Parameter Extraction

The refractive index, extinction coefficient, absorption coefficient, and dielectric constant of the coal samples were derived after the terahertz spectral data of various samples were examined and calculated according to the optical parameter extraction model proposed by Dorney and associates. The calculation equations (16,17) are:

where ρ(ω) is the ratio of the sample signal amplitude to the reference signal amplitude; φ(ω) is the phase difference between the sample signal and the reference signal; d is the sample thickness (m); c is the speed of light (m/s); ω is the angular frequency (rad/s); and εr and εi are the real and imaginary parts of the complex dielectric constant ε.
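A commonly used thin-sample form of these relations is sketched below, writing ρ(ω) for the amplitude ratio, φ(ω) for the phase difference, n for the refractive index, κ for the extinction coefficient, and α for the absorption coefficient. That this matches the authors' exact notation and approximations is an assumption; references 16 and 17 give the full model.

```latex
\begin{aligned}
n(\omega) &= 1 + \frac{c\,\varphi(\omega)}{\omega d}, \\
\kappa(\omega) &= \frac{c}{\omega d}\,
  \ln\!\left[\frac{4\,n(\omega)}{\rho(\omega)\,\bigl(n(\omega)+1\bigr)^{2}}\right], \\
\alpha(\omega) &= \frac{2\,\omega\,\kappa(\omega)}{c}
  = \frac{2}{d}\,
  \ln\!\left[\frac{4\,n(\omega)}{\rho(\omega)\,\bigl(n(\omega)+1\bigr)^{2}}\right], \\
\varepsilon_r(\omega) &= n(\omega)^{2} - \kappa(\omega)^{2}, \qquad
\varepsilon_i(\omega) = 2\,n(\omega)\,\kappa(\omega).
\end{aligned}
```

In this form the refractive index comes from the phase delay alone, while the absorption coefficient comes from the amplitude ratio after correcting for the reflection losses at the two tablet surfaces.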

Algorithms Introduction

Principal component analysis (PCA) (18,19) is a commonly used feature extraction method, often employed in spectral analysis. PCA takes the original dataset as input and, through an orthogonal linear transformation, converts multiple potentially correlated variables into a smaller number of mutually uncorrelated new variables, known as the principal components. The principal components are orthogonal to one another and are ordered by the variance they explain, so the first principal component carries the largest share of the information in the original data. This allows the large amount of data in the original dataset to be presented more intuitively.
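The transformation can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the authors' implementation. The spectra are synthetic (a per-sample offset taken from the mean refractive indices reported later in the article, plus an invented slope and noise level), and the function name `pca_scores` is hypothetical.

```python
import numpy as np

def pca_scores(spectra, n_components=2):
    """Project row-wise spectra onto their leading principal components.

    Returns (scores, contribution), where contribution[k] is the fraction
    of the total variance explained by component k.
    """
    centered = spectra - spectra.mean(axis=0)        # mean-centre every frequency point
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]  # sample coordinates on PC1, PC2, ...
    contribution = s**2 / np.sum(s**2)               # variance share of each component
    return scores, contribution[:n_components]

# Synthetic stand-in for seven refractive index spectra (NOT measured data).
rng = np.random.default_rng(0)
freqs_thz = np.linspace(0.7, 3.0, 50)
offsets = np.array([1.423, 1.462, 1.456, 1.446, 1.444, 1.441, 1.440])
spectra = offsets[:, None] - 0.002 * freqs_thz[None, :] + rng.normal(0.0, 2e-4, (7, 50))

scores, contribution = pca_scores(spectra)
print("PC1/PC2 contribution:", np.round(contribution, 4))
print("PC1 scores:", np.round(scores[:, 0], 4))
```

Because the synthetic spectra differ mainly by a constant offset, almost all of the variance falls on the first component, which is the situation in which a PC1 score plot alone is enough to separate the samples.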

Cluster analysis (CA) is an unsupervised classification method that groups the samples of an unlabeled dataset by measuring the similarity between them; no prior labeling of the dataset is required. Before the calculation begins, each sample is treated as an individual class. The Euclidean distance between every pair of samples is calculated, and the two closest samples are merged into a new class. The Euclidean distance between this new class and the remaining samples is then calculated, and the closest pair of classes is merged again. This calculation is repeated until all samples are clustered into one class, resulting in a clustering tree diagram that reflects the similarity and dissimilarity between samples (20).
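The merging procedure described above can be sketched as follows. This is a minimal illustration with two loudly stated assumptions. First, the distance between two classes is taken as the mean pairwise Euclidean distance (average linkage), because the article does not name its linkage rule. Second, each sample is represented only by the mean refractive index reported later in the article, not by its full spectrum, so the distances are not comparable with those in the published dendrograms. The function names `agglomerate` and `cut` are hypothetical.

```python
import numpy as np

def agglomerate(points):
    """Merge the two closest classes repeatedly; return the merge history.

    Each entry is (members_a, members_b, distance), using average linkage.
    """
    clusters = [[i] for i in range(len(points))]
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    history = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        history.append((clusters[a], clusters[b], float(d)))
        merged = clusters[a] + clusters[b]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return history

def cut(history, n_points, threshold):
    """Class label per sample when only merges below the threshold are applied."""
    labels = list(range(n_points))
    for left, right, d in history:
        if d >= threshold:
            break
        for i in right:
            labels[i] = labels[left[0]]
    return labels

names = ["ZBM006", "ZBM093", "ZBM097A", "ZBM100D", "ZBM124", "ZBM100E", "ZBM104"]
mean_n = np.array([1.423, 1.462, 1.456, 1.446, 1.444, 1.441, 1.440])[:, None]

history = agglomerate(mean_n)
labels = cut(history, len(names), threshold=0.01)  # threshold chosen for this toy data
for name, label in zip(names, labels):
    print(name, label)
```

Seven samples always need six merges to reach a single class, and cutting the tree at a chosen distance yields the class assignment.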

Results and Discussion

Terahertz Spectroscopic Analysis

The terahertz time-domain spectra of the seven coal samples, as measured experimentally, are depicted in Figure 2a. The time delays and peak intensities vary among the coal samples, with the time delays concentrated in the range of 17.5–20 ps. The time-domain spectra of ZBM100D, ZBM100E, and ZBM104 are very similar. The main peak of ZBM100E appears earliest, at 18.13 ps, with an amplitude of 0.106 V; the main peak of ZBM100D appears at 18.19 ps with an amplitude of 0.105 V; and the main peak of ZBM104 appears at 18.196 ps with an amplitude of 0.1044 V. ZBM093 has the smallest main peak amplitude, 0.0725 V, at 18.26 ps; ZBM097A has the largest main peak amplitude, 0.149 V, and appears latest, at 18.456 ps. The differences in time delay and peak intensity among the coal samples can be attributed to the varying refraction and absorption of terahertz waves by the samples. Figure 2b displays the power spectra of the seven samples in the frequency range of 0.7–3 THz. For all seven samples, the energy consumption first increases and then decreases as the frequency increases. In the range of 0.7–2.15 THz, the energy consumption of ZBM097A is the largest and that of ZBM093 is the smallest. In the range of 2.15–3 THz, however, the energy consumption of ZBM006 is the largest and that of ZBM093 is the smallest. Figures 2a and 2b show significant differences in the terahertz time-domain spectra and power spectra of the samples, which indicates the feasibility of qualitatively identifying coal species using terahertz time-domain spectroscopy.

Figure 2: (a) Terahertz time-domain spectrum, and (b) terahertz power spectrum.


Figure 3 presents the refractive index spectra, dielectric constant spectra, and absorption coefficient spectra of the seven coal samples in the 0.7–3 THz band. The seven coal samples exhibit some separability in the terahertz refractive index and dielectric constant spectra. In Figure 3a, ZBM006 has the lowest refractive index, with an average of 1.423. The refractive indices of the two anthracites, ZBM093 and ZBM097A, are the highest, with averages of 1.462 and 1.456, respectively. These values are significantly higher than those of bituminous coal and lignite. The average refractive indices of the four bituminous coals, ZBM100D, ZBM124, ZBM100E, and ZBM104, decrease in that order, with values of 1.446, 1.444, 1.441, and 1.440, respectively, all falling between anthracite and lignite. In Figure 3b, the dielectric constant spectrum shows a similar pattern to the refractive index spectrum. ZBM006 has the lowest dielectric constant, ZBM093 and ZBM097A have the highest dielectric constants, and ZBM100D, ZBM100E, ZBM104, and ZBM124 have approximately the same dielectric constants, falling between anthracite and lignite. Significant differences can be observed in the refractive index and dielectric constant spectra of the seven coal samples, suggesting that the terahertz refractive index spectrum or dielectric constant spectrum can be used for preliminary coal type calibration. In Figure 3c, the absorption coefficients of all seven samples increase with frequency, consistent with the classical theory of electromagnetic wave propagation in lossy media. The absorption coefficient spectra do not have obvious characteristic absorption peaks, likely because of the complex chemical composition of coal and overlapping absorption peak positions. Among the samples, ZBM006 has the smallest absorption coefficient and ZBM093 has the largest.

The analysis of refractive index, absorption coefficient, and dielectric constant spectra further illustrates the feasibility of using terahertz time-domain spectroscopy for coal species identification.
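How such spectra can be obtained from a reference pulse and a sample pulse is sketched below in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' processing chain. It uses the common thin-sample approximation (a single pass through the tablet, no Fabry-Perot echoes, absorption ignored in the surface reflection losses), and every number in the synthetic check is invented. The function name `extract_optical_parameters` is hypothetical.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum, m/s

def extract_optical_parameters(t, e_ref, e_sam, thickness, f_min, f_max):
    """Thin-sample estimate of n, alpha, eps_r, eps_i between f_min and f_max (Hz)."""
    freqs = np.fft.rfftfreq(t.size, d=t[1] - t[0])
    keep = (freqs > 0) & (freqs <= f_max)          # drop DC and bins above the band
    freqs = freqs[keep]
    h = np.fft.rfft(e_sam)[keep] / np.fft.rfft(e_ref)[keep]

    rho = np.abs(h)                                # amplitude ratio, sample / reference
    phi = -np.unwrap(np.angle(h))                  # phase delay (sign follows NumPy's FFT)
    omega = 2 * np.pi * freqs

    n = 1 + C * phi / (omega * thickness)
    alpha = (2 / thickness) * np.log(4 * n / (rho * (n + 1) ** 2))
    kappa = alpha * C / (2 * omega)                # extinction coefficient
    eps_r, eps_i = n**2 - kappa**2, 2 * n * kappa

    band = freqs >= f_min
    return freqs[band], n[band], alpha[band], eps_r[band], eps_i[band]

# Synthetic check with made-up values (not the article's data).
n0, alpha0, d = 1.44, 500.0, 1.1e-3                # index, absorption (1/m), thickness (m)
t = np.arange(5000) * 0.02e-12                     # 100 ps window, 0.02 ps step
e_ref = np.exp(-((t - 10e-12) / 0.1e-12) ** 2 / 2) # Gaussian reference pulse
f = np.fft.rfftfreq(t.size, d=t[1] - t[0])
transfer = (4 * n0 / (n0 + 1) ** 2) * np.exp(-alpha0 * d / 2) \
    * np.exp(-2j * np.pi * f * (n0 - 1) * d / C)
e_sam = np.fft.irfft(np.fft.rfft(e_ref) * transfer, n=t.size)

freqs, n, alpha, eps_r, eps_i = extract_optical_parameters(
    t, e_ref, e_sam, d, f_min=0.7e12, f_max=3.0e12)
print("mean n:", n.mean(), "mean alpha (1/m):", alpha.mean())
```

Because the real part of the dielectric constant is n² − κ² and κ is small for weakly absorbing tablets, the dielectric constant spectrum is expected to follow the refractive index spectrum closely, as observed in Figure 3b.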

Figure 3: (a) Terahertz refractive index spectrum; (b) dielectric constant spectrum; and (c) absorption coefficient spectrum.


Coal Classification

Principal component analysis and cluster analysis were applied to the original spectra of the collected samples to classify the seven coal samples. With the refractive index spectra in the frequency range of 0.7–3 THz as the input set, the seven samples were classified using cluster analysis. The resulting cluster tree diagram, obtained after six steps of stepwise clustering, reflects the similarity and dissimilarity of the seven samples and is shown in Figure 4a. At a Euclidean distance of 0.15, the seven samples are divided into three categories. ZBM100D, ZBM100E, ZBM104, and ZBM124 form the first category. The smallest Euclidean distance within this category is 0.0317, between ZBM100E and ZBM104, followed by 0.03959, between ZBM100D and ZBM124. ZBM093 and ZBM097A form the second category, with a Euclidean distance of 0.1016 between them. ZBM006 alone forms the third category; its Euclidean distance to the other two categories, 0.424, is the largest.

Figure 4: (a) Clustering tree diagram; (b) two-dimensional principal component scatter point diagram; and (c) PC1 score diagram based on refractive index spectrum.


The refractive index spectra in the frequency range of 0.7–3 THz were used as the input set, and the first two principal components (PC1, PC2) were extracted using principal component analysis. The contribution of the first principal component was 99.1%, and the contribution of the second principal component was 0.4%. The cumulative contribution of the first two principal components reached 99.5%, which essentially contained most of the original information of the samples. The two-dimensional principal component scatter plots and PC1 score histograms for various samples are shown in Figures 4b and 4c, respectively. The two-dimensional scatter plot shows a clear clustering effect of the seven samples on PC1. The closer the distance between the samples, the higher the similarity; the further the distance, the greater the difference. ZBM104 is the closest to ZBM100E, followed by ZBM100D and ZBM124. ZBM006 is far from the other six samples and has a negative maximum, showing great dissimilarity with the other samples.

This conclusion can also be drawn more visually from the PC1 score histogram in Figure 4c. Comparison with the clustering tree diagram in Figure 4a shows that the conclusion drawn by PCA is consistent with the conclusion drawn by CA. The PCA-CA classification model based on refractive index spectra therefore identifies the three types of coal accurately, with all seven samples assigned correctly (100% identification accuracy). This also verifies that preliminary calibration of coal types can be performed using terahertz refractive index spectra.

Similarly, the absorption coefficient spectrum in the frequency range of 0.7–3 THz was used as the input set. The cluster tree diagram obtained by cluster analysis is shown in Figure 5a. As can be seen from the figure, when the Euclidean distance is 80, the seven samples are classified into three categories: ZBM093 is classified into one category. ZBM006 is classified into another category. The remaining five samples (ZBM100D, ZBM100E, ZBM104, ZBM097A, and ZBM124) are classified into a third category. The Euclidean distance between ZBM093 and the other two classes is the largest at 170; the Euclidean distance between ZBM006 and the third class is the second largest at 113.7. Among the third class, the Euclidean distance between ZBM100D and ZBM104 is the smallest at 27.88. The Euclidean distance between ZBM100E and the new class formed by ZBM100D and ZBM104 increases slightly to 38.76. The Euclidean distances for ZBM097A and ZBM124 gradually increase, and these five samples eventually form one class.

Figure 5: (a) Clustering tree diagram; (b) two-dimensional principal component scatter point diagram; and (c) PC1 score diagram based on absorption coefficient spectrum.


The absorption coefficient spectra in the frequency range of 0.7–3 THz were subjected to principal component analysis, and the first two principal components were extracted. The cumulative contribution of the first two principal components was 97.3%, with the first principal component contributing 90.4%. Figures 5b and 5c show the two-dimensional principal component scatter plots and PC1 score histograms of various samples, respectively. A clear clustering effect of various samples on PC1 can be seen on the two-dimensional scatter plot. ZBM100D and ZBM104 are closest to each other, while ZBM100E, ZBM097A, and ZBM124 are gradually increasing in distance from the first two. ZBM006 and ZBM093 are farther away from all five samples, with ZBM093 having a positive maximum and ZBM006 having a negative maximum.

This conclusion can also be reached by the PC1 score histogram in Figure 5c, which is consistent with the CA analysis when comparing the clustering tree in Figure 5a. Six of the seven coal samples were well classified, with only ZBM097A showing deviations, resulting in an accuracy of 85.7%. It can be concluded that the PCA-CA classification model based on the absorption coefficient spectrum can achieve an accurate classification of ZBM006 (lignite). However, it mistakenly classifies ZBM097A (anthracite) into one category with the four bituminous coals, failing to achieve an accurate classification of bituminous and anthracite coals.
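The two accuracy figures can be reproduced from the class assignments reported in the text. The sketch below scores each sample as correct when its coal type matches the majority type of its class. That scoring rule is an assumption, since the article does not state how accuracy was computed, and the class labels "A", "B", and "C" are arbitrary placeholders.

```python
from collections import Counter

COAL_TYPE = {
    "ZBM006": "lignite",
    "ZBM093": "anthracite", "ZBM097A": "anthracite",
    "ZBM100D": "bituminous", "ZBM100E": "bituminous",
    "ZBM104": "bituminous", "ZBM124": "bituminous",
}

def cluster_accuracy(assignment):
    """Fraction of samples whose coal type matches the majority type of their class."""
    members = {}
    for sample, cluster in assignment.items():
        members.setdefault(cluster, []).append(sample)
    correct = 0
    for samples in members.values():
        majority, _ = Counter(COAL_TYPE[s] for s in samples).most_common(1)[0]
        correct += sum(COAL_TYPE[s] == majority for s in samples)
    return correct / len(assignment)

# Class assignments as described for the two dendrograms.
refractive = {"ZBM100D": "A", "ZBM100E": "A", "ZBM104": "A", "ZBM124": "A",
              "ZBM093": "B", "ZBM097A": "B", "ZBM006": "C"}
absorption = {"ZBM093": "A", "ZBM006": "B", "ZBM100D": "C", "ZBM100E": "C",
              "ZBM104": "C", "ZBM097A": "C", "ZBM124": "C"}

print(f"refractive index: {100 * cluster_accuracy(refractive):.1f}%")
print(f"absorption coefficient: {100 * cluster_accuracy(absorption):.1f}%")
```

With seven samples, a single misassigned sample (ZBM097A) gives 6/7, or about 85.7%, so each sample moves the figure by roughly 14 percentage points.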

Conclusion

In this experiment, seven samples of lignite, bituminous coal, and anthracite coal were used to identify coal types using terahertz time-domain spectroscopy combined with machine learning algorithms. The results show that there is variability in the terahertz time-domain spectra of the seven coal samples, indicating that it is feasible to identify coal species using terahertz time-domain spectroscopy. It is also concluded that the use of terahertz refractive index spectra or dielectric constant spectra can perform preliminary calibration of coal species. In the frequency range of 0.7–3 THz, PCA-CA classification models were established for the refractive index spectra and absorption coefficient spectra of the seven samples, respectively. By comparing the classification effects of both, it can be concluded that the PCA-CA classification model established based on terahertz refractive index spectra is more effective than that established based on absorption coefficient spectra. The PCA-CA classification model established based on refractive index spectra can achieve accurate identification of the three types of coal with 100% recognition accuracy. This study is significant for the identification of coal species and also provides a theoretical reference for the identification of other special coal species by terahertz time-domain spectroscopy in the future.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (52074273), the Key Natural Science Research Project for Colleges and Universities of Anhui Province (2023AH050343), the Pollutant Sensitivity Monitoring and Application Innovation Team of Anhui Province (2023AH010043), Anhui Provincial Department of Education Quality Engineering Project (2022jyxm1405), Anhui Province Graduate Education Quality Project (2024jyjxggyjY204), and Huaibei Normal University Bit and Graduate Education Quality Project (2024jgxm003).

References

  1. Wang, G.; Li, S.; Zhang, J.; et al. Ensuring the Safety of Coal Industry to Lay the Cornerstone of Energy Security. China Coal 2022, 48 (7), 1–9. DOI: 10.19880/j.cnki.ccm.2022.07.001
  2. Chen, F.; Yu, H.; Bian, Z.; Yin, D. How to Handle the Crisis of Coal Industry in China Under the Vision of Carbon Neutrality. J. China Coal Soc. 2021, 46 (6), 1808–1820. DOI: 0253-9993(2021)06-1808-13
  3. Zhang, Y.; Shan, Q.; Zhang, X.; et al. Rapid Measurement System for Sulfur Content in Coal. J. Nanjing Univ. Aeronaut. Astronaut. 2015, 47 (5), 767. DOI: 10.16356/j.1005-2615.2015.05.022
  4. Qiao, Y. Research Progress and Application of Neutron Tube. Nucl. Electron. Detect. Technol. 2008, 28 (6), 1134–1139. DOI: 0258-0934(2008)06-1134-06
  5. Cheng, D.; Wen, H.; Teng, Z.; Li, F. Study on Soft-Sensing of Coal Ash Content Based on Dual-Energy γ-Ray. Chin. J. Sci. Instrum. 2014, 35 (10), 2263–2270. DOI: 10.19650/j.cnki.cjsi.2014.10.014
  6. Wang, F.; Zhang, C.; Zhao, J.; Ha, S.; Zhang, Y. Identification of a Grass Species Using a Terahertz Wave Based on Hybrid Machine Learning Method. Laser Optoelectron. Prog. 2021, 58 (03), 318–324. DOI: 10.3788/LOP202158.0330001
  7. Liu, Y.; Xu, Z.; Hu, J.; Li, M.; Cui, H. Research on Variety Identification of Fritillaria Based on Terahertz Spectroscopy. Spectrosc. Spectral Anal. 2021, 41 (11), 3357–3362. DOI: 10.3964/j.issn.1000-0593(2021)11-3357-06
  8. Yang, Y.; Zhang, C.; Liu, H.; Zhang, Z. Identification of Two Types of Safflower and Bezoar by Terahertz Spectroscopy. Spectrosc. Spectral Anal. 2019, 39 (1), 45–49. DOI: 10.3964/j.issn.1000-0593(2019)01-0045-05
  9. Wang, Y.; She, S.; Zhou, N.; Jia, P.; Zhang, J. Classification of Terahertz Rosewood Based on Continuous Projection Algorithm and Random Forest. Spectrosc. Spectral Anal. 2019, 39 (9), 2719–2724. DOI: 10.3964/j.issn.1000-0593(2019)09-2719-06
  10. Zheng, L.; Liu, C.; Ren, J.; et al. Debonding Defect Identification Method for Multi-layer Bonded Structures Based on LDA-CPSO-SVM Optimization. Acta Photonica Sin. 2021, 50 (12), 114–121. DOI: 10.3788/gzxb20215012.1212004
  11. Jiang, X.; Xu, Y. Nondestructive Testing of Corrosion Thickness of Steel Plates Under Coatings by Terahertz Time-Domain Spectroscopy. Acta Opt. Sin. 2022, 42 (13), 76–83. DOI: 10.3788/AOS202242.1312001
  12. Wang, Q.; Mu, D.; Zhou, T.; et al. Terahertz Nondestructive Test of Delamination Defects in Glass-Fiber-Reinforced Composite Materials. Acta Opt. Sin. 2021, 41 (17), 90–98. DOI: 10.3788/AOS202141.1712003
  13. He, P.; Zhao, J. High-Precision Thermal Barrier Coating Thickness Measurement Method Using Terahertz Time-Domain Spectroscopy Technology. J. Xi’an Jiaotong Univ. 2022, 56 (6), 112–119. DOI: 0253-987X(2022)06-0112-08
  14. Mu, K.; Zhang, Z.; Zhang, C. Terahertz Science and Technology. J. China Acad. Electronics Inf. Technol. 2009, 4 (3), 221–230, 237. DOI: 1673-5692(2009)03-221-10
  15. Liu, L.; Chang, T.; Li, K.; et al. Spectral Analysis and Quantitative Detection of Baicalin Based on Terahertz Radiation. Chin. J. Lasers 2020, 47 (3), 313–319. DOI: 10.3788/CJL202047.0314001
  16. Dorney, T. D.; Baraniuk, R. G.; Mittleman, D. M. Material Parameter Estimation with Terahertz Time-Domain Spectroscopy. J. Opt. Soc. Am. A 2001, 18 (7), 1562–1571. DOI: 10.1364/JOSAA.18.001562
  17. Liu, L.; Yang, C.; Zhang, X.; et al. Relationship Between Moisture and Dielectric Properties of Coal at Terahertz Band Electromagnetic Radiation. J. China Coal Soc. 2016, 41 (02), 497–501. DOI: 10.13225/j.cnki.jccs.2015.0431
  18. Zhan, H.; Zhao, K.; Xiao, L. Spectral Characterization of the Key Parameters and Elements in Coal Using Terahertz Spectroscopy. Energy 2015, 93, 1140–1145. DOI: 10.1016/j.energy.2015.09.116
  19. Esteki, M.; Memarbashi, N.; Simal-Gandara, J. Classification and Authentication of Tea According to their Harvest Season Based on FT-IR Fingerprinting Using Pattern Recognition Methods. J. Food Compos. Anal. 2023, 115, 104995. DOI: 10.1016/j.jfca.2022.104995
  20. Zhan, H.; Bao, R.; Ge, L.; Zhao, K. Discerning of Swill-Cooked Dirty Oil by Terahertz Technology and Statistical Method. China Oils Fats 2015, 40 (4), 52–54. DOI: 1003-7969(2015)04-0052-03

Xiang Liu, Shuguang Miao, Yue Zhang, Chenchen Li, and Shujuan Xia are with the School of Physics and Electronic Information, Huaibei Normal University, in Huaibei, China. Enjie Ding is with the IOT Perception Mine Research Center, China University of Mining and Technology, in Xuzhou, China. Direct correspondence to Shuguang Miao at msgmcu@126.com.
