Chemometrics in Spectroscopy
Linearity in Calibration: Quantifying Non-linearity

This column presents results from some computer experiments designed to assess a method of quantifying the amount of non-linearity present in a data set, assuming that a test for the presence of non-linearity has already been applied and has shown that a measurable, statistically significant degree of non-linearity exists.
Dec 01, 2005


Jerome Workman Jr. & Howard Mark
In our last few columns (1–4), we discussed shortcomings of current methods used to assess the presence of non-linearity in data, and presented a new method that addresses those shortcomings. This new method is statistically sound, provides an objective means to determine if non-linearity is present in the relationship between two sets of data, and is inherently suitable for implementation as a computer program.

The method shares a shortcoming with virtually all statistical tests: it provides an unambiguous, objective means of determining whether non-linearity is present, but when non-linearity is found, it does not address the question of how much is present. This column, therefore, presents results from some computer experiments designed to assess a method of quantifying the amount of non-linearity present in a data set, assuming that the test for the presence of non-linearity has already been applied and has shown that a measurable, statistically significant degree of non-linearity does indeed exist.

The spectroscopic community, and indeed the chemical community at large, is not the only group of scientists concerned with these issues. Other scientific disciplines also are concerned with ways to evaluate methods of chemical analysis. Notable among them are the pharmaceutical and clinical chemistry communities. In those communities, considerations of the sort we are addressing are even more important, for at least two reasons:

  • These disciplines are regulated by governmental agencies, especially the Food and Drug Administration. In fact, it was considerations of the requirements of a regulatory agency that created the impetus for this series of columns in the first place (1).
  • The second reason is what drives the whole effort of ensuring that everything that is done is done "right": an error in an analytical result can, quite literally, cause illness or even death.

Thus, the clinical chemistry community also has investigated issues such as the linearity of the relationship between test results and actual chemical composition, and an interesting article provides the impetus for creating a method of assessing the degree of non-linearity present in the relationship between two sets of data (5).

Degree of Non-linearity

The basis for this calculation of the amount of non-linearity is illustrated in Figure 1. Figure 1a shows a set of data subject to both random error and non-linearity between the test results and the actual values, together with the different ways a straight line and a quadratic polynomial fit those data. When both functions are fit to the data, the difference between their predicted values gives a measure of the amount of non-linearity: at any given X-value, the two fitted functions yield different Y-values, and it is this difference that the method uses.

Figure 1b shows that, irrespective of the random error of the data, the difference between the two functions depends only upon the nature of the functions and can be calculated from the difference between the Y-values corresponding to each X-value. If there is no non-linearity at all, then the two functions will coincide, and all the differences will be zero. Increasing amounts of non-linearity will cause increasingly large differences between the values of the two functions at each X-value, and these differences can be used to quantify the non-linearity.
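
The idea of comparing the two fitted functions can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the calculation used in the computer experiments: the function names `fit_differences` and `rms`, the synthetic data, the noise level, and the choice of the root-mean-square difference as a single summary number are all assumptions made for the example.

```python
import numpy as np


def fit_differences(x, y):
    """Fit a straight line and a quadratic to (x, y) by least squares and
    return the quadratic-minus-linear fitted value at each x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    linear = np.polynomial.Polynomial.fit(x, y, deg=1)
    quadratic = np.polynomial.Polynomial.fit(x, y, deg=2)
    return quadratic(x) - linear(x)


def rms(values):
    """Root-mean-square of the differences (an assumed summary statistic)."""
    return float(np.sqrt(np.mean(np.square(values))))


# Synthetic data (hypothetical): the same random error is added to a
# strictly linear response and to a response with some curvature.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 51)
noise = rng.normal(scale=0.02, size=x.size)
y_linear = 2.0 * x + noise
y_curved = 2.0 * x - 0.5 * x**2 + noise

print("linear data:", rms(fit_differences(x, y_linear)))
print("curved data:", rms(fit_differences(x, y_curved)))
```

For the strictly linear data, the two fitted functions nearly coincide and the summary value is small; for the curved data, it is noticeably larger, which is the behavior described above.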