Sep 01, 2008

Spectroscopy

Volume 23, Issue 9

Even when a straight-line calibration (measured response versus sample concentration) appears satisfactory to the eye, and even when the correlation coefficient for that line closely approaches the ideal value of 1, an F-test and a residuals analysis should be used to assess the quality of the data and to uncover properties not apparent in the straight-line plot. Both can be completed easily with standard tools in software packages; the residuals analysis in particular provides visual plots that alert the analyst to hidden properties of the data set.
The F-test compares the variances (the squares of the standard deviations) calculated for two different data sets. These data sets commonly are chosen to be replicate measurements of instrument response taken at the high and low sample-concentration ends of a putative calibration range. For the confidence level specified for the regression, the F-test indicates whether the ratio of the two variances falls within an "allowable" range. The F-test value falls within the allowable range when the data are homoscedastic, that is, when the standard deviations of the data sets measured at the different sample concentrations are the same within the chosen confidence level. In such a situation, use of a weighted regression would not be appropriate and would not be supported within the analytical guidelines (2); a simple, unweighted straight-line model will describe the data accurately.
Commonly, however, especially across a broader range of sample concentrations, the standard deviations vary with sample concentration, with larger absolute standard deviations seen at higher sample concentrations. Because an unweighted least-squares fit minimizes the sum of squared absolute residuals, the influence of these larger standard deviations on the regression line will be substantial. An F-test value outside the accepted range indicates that the data sets are heteroscedastic and that a weighted regression treatment of the data is appropriate.
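The F-test described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the replicate values are invented for the example, the function name f_test_variances is hypothetical, and the ratio is assumed to be formed as the high-concentration variance over the low-concentration variance (a one-tailed test). The critical value 5.05 is the tabulated one-tailed F value at the 95% confidence level for 5 and 5 degrees of freedom (six replicates at each end); a different number of replicates or confidence level requires a different table entry.

```python
from statistics import variance

# Tabulated one-tailed critical F value, 95% confidence, 5 and 5 degrees
# of freedom (assumes six replicates at each end of the range).
F_CRIT_ONE_TAILED_95_DF5_DF5 = 5.05


def f_test_variances(low_replicates, high_replicates, f_critical):
    """Compare replicate variances at the two ends of a calibration range.

    Returns (f_value, heteroscedastic), where f_value is the ratio of the
    high-concentration sample variance to the low-concentration sample
    variance, and heteroscedastic is True when f_value exceeds f_critical.
    """
    var_low = variance(low_replicates)    # sample variance (n - 1 denominator)
    var_high = variance(high_replicates)
    if var_low == 0:
        raise ValueError("low-concentration replicates show zero variance")
    f_value = var_high / var_low
    return f_value, f_value > f_critical


# Hypothetical instrument responses (arbitrary units), six replicates each.
low = [1.02, 0.98, 1.01, 0.99, 1.00, 1.03]
high = [100.5, 98.7, 101.9, 99.2, 102.3, 97.8]

f_value, heteroscedastic = f_test_variances(
    low, high, F_CRIT_ONE_TAILED_95_DF5_DF5
)
print(f"F = {f_value:.1f}; heteroscedastic: {heteroscedastic}")
```

With these invented values the relative standard deviation is similar at both ends of the range (about 2%), yet the variance ratio far exceeds the critical value, so the sketch flags the data as heteroscedastic and a weighted regression would be indicated.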
For such heteroscedastic data, the relative standard deviation might be roughly constant across the concentration range of interest. When a simple plot of the measured values at each sample concentration of interest is created (as a first step toward constructing a linear calibration plot), the heteroscedastic nature of the data might be obscured: variations in the data, especially at lower sample concentrations, might be smaller than the symbol used to mark the values on the plot. In a residuals analysis, a value such as percentage deviation from the mean ( |