Where Perception Meets Reality: The Science of Measuring Color

Published on: 
Spectroscopy, October 2022, Volume 37, Issue 10
Pages: 21–24,35

Color is something that most people take for granted. A key assumption in color science is that our perceptions are similar and individual differences are small. Predictable rules, such as additive color mixing, make color modeling possible so that we can describe the richness of color in relatively low-dimensional spaces like red, green, and blue (RGB). Here, we look at how scientists define and calibrate color, various color measurement methods, and issues that arise when attempting to accurately measure and quantify color.

Spectroscopy can be defined as “the interaction of light with matter,” and typically involves judgments about the matter based on light absorption or emission. Hence, spectroscopists deal with the electromagnetic spectrum regularly, and we’re particularly interested in emission or absorption of particular wavelengths of light. The initial modern study of matter with light may be traced to the flame spectroscope devised by Bunsen and Kirchhoff in the 1800s in Heidelberg, which allowed them to identify elements by their emission spectra when samples were introduced into the flame of the eponymous Bunsen burner. The motivation for this work was almost certainly the addition of a common salt such as table salt, which generated the intense emission of the sodium “D” lines at 589.0 and 589.6 nm. The pair went on to discover two new elements, cesium and rubidium, with their spectroscope.

Although spectroscopists work continuously with light of various wavelengths, we are typically thinking of discrete wavelengths corresponding with emission or absorption. Few of us have had a formal introduction to color science, or to how the perception of “color” is derived from viewing either continuous sources (like a tungsten filament or other hot source emitting blackbody radiation) or a combination of discrete sources with narrower wavelength emissions.

Here, we cover some of the basics of color science with my colleague, Ethan Montag, who is an expert in color science. I had the opportunity to ask him a number of questions about color. Hopefully, the discussion of how color is parameterized and how we make accurate color measurements serves as a good introduction that piques your interest!

There are lots of ways to describe color, known as color spaces. Most readers of Spectroscopy are probably familiar with the red, green, blue (RGB) color system, whereby the three primary colors can be additively combined to make any other color. Could you describe any other color systems? For example, what are the purpose and origin of the International Commission on Illumination (CIE) L*a*b* color space (CIELAB)?

Color is a perception. It is not a property of an object, but the effect of light emitted or reflected from objects impinging on the eye and interpreted by the nervous system. The ability to describe color using different color spaces derives from the fact that there are three independent channels for conveying color information from the eye to the brain. The process of distinguishing color begins in the retina, where there are three types of cone receptors, each sensitive to a different, but overlapping, region of the visible spectrum. My thesis advisor, Robert M. Boynton, liked to point out that the Young-Helmholtz Theory of Trichromacy is the only theory in psychology to be experimentally confirmed.

There are many different types of color spaces that are used for different purposes. There are device-dependent color spaces (for example, cyan, magenta, yellow, key [CMYK] and RGB) that are used to specify the production of color by subtractive (dyes and pigments) or additive (displays) mixing of primary colors. Then, there are device-independent color spaces (for example, standard RGB [sRGB] and Pantone) used for color communication and specifying colors between devices. For example, profiles can be made that can transform the RGB of a particular calibrated display to sRGB and then back to the RGB of a different, calibrated display so that the two displays match.

In color metrology, we use the device-independent color spaces adopted by the CIE based on color matching experiments performed in the last century. These include the CIE 1931 2° XYZ color space and the CIE 1964 10° supplementary XYZ color space. These color spaces are based on linear transforms of specific RGB primaries used in color-matching experiments. Because of trichromacy, these XYZ primaries are linear transforms of the actual cone response in the eye. Because the curves describing the sensitivity of cones to the visible spectrum overlap, it was impossible to measure them directly when the system of colorimetry was being developed. Vision researchers now have excellent values for the long-, medium-, and short-wavelength sensitive cones (abbreviated as LWS, MWS, and SWS, respectively), but the color science community has been slow to adopt these physiologically based primaries.

These primary-based color spaces are actually “color blind.” That is, they tell you whether two color patches match (to the ideal observer) under specific geometric illumination and viewing conditions, but they do not tell you what the color looks like. An empirically derived color space, CIELAB, converts XYZ color coordinates to a three-dimensional (3D) color space where Euclidean distances (ΔE*ab) are meant to be correlated with perceived color differences. A unit distance between the [L*, a*, b*] coordinates of two colors represents a noticeable difference between the colors. This space allows for the setting of color tolerances for color reproduction. Over the years, the use of this space has been refined by replacing the Euclidean distance with more complicated color difference equations using the same coordinate system. Although often represented as a color appearance space, CIELAB is more of an intermediate stage between the CIE XYZ and appearance.
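
For reference, the standard CIE 1976 definitions behind these coordinates can be summarized as follows. The symbols $X_n$, $Y_n$, and $Z_n$ (the tristimulus values of the reference white) and the helper function $f$ do not appear in the discussion above; they are the usual textbook notation, so this is a summary sketch and not a full statement of the standard.

```latex
\begin{aligned}
L^* &= 116\, f(Y/Y_n) - 16 \\
a^* &= 500\,\bigl[f(X/X_n) - f(Y/Y_n)\bigr] \\
b^* &= 200\,\bigl[f(Y/Y_n) - f(Z/Z_n)\bigr] \\
f(t) &=
  \begin{cases}
    t^{1/3}, & t > (6/29)^3 \\[4pt]
    \dfrac{t}{3\,(6/29)^2} + \dfrac{4}{29}, & \text{otherwise}
  \end{cases} \\
\Delta E^*_{ab} &= \sqrt{(\Delta L^*)^2 + (\Delta a^*)^2 + (\Delta b^*)^2}
\end{aligned}
```

The cube root compresses the linear tristimulus values so that equal steps in $L^*$ are closer to equal steps in perceived lightness, and the two difference terms give the red–green ($a^*$) and blue–yellow ($b^*$) axes.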

There are color spaces, such as the physical Munsell Book of Colors and various color appearance models (CAMs), that attempt to organize colors based on perceptual attributes such as lightness, hue, and chroma (colorfulness). The opponent processing of color is built into these spaces. The appearance of color is based on two opponent channels, red–green and blue–yellow. These dimensions are typically part of an appearance space. However, the appearance of a color is difficult to quantify because it depends on the observer’s state of adaptation, the context of the colored area being viewed (its complex surroundings), and both the conscious and subconscious expectations of the observer.

What are some of the challenges in making standardized color measurements? How does one “normalize” illumination in the computation of color values?

As mentioned in the last answer, colorimetry is based on an ideal (theoretical) observer’s response derived from a limited set of human observers making real color-matching judgments of isolated color patches. Briefly, the color of a reflective surface is calculated by integrating over the visible spectrum the product of the illuminant spectral power, the reflectance factor of the surface, and each of the standard observer’s color matching functions (x̄, ȳ, and z̄) to get tristimulus values (X, Y, and Z). The system was designed so that the color matching functions are all positive and ȳ is the luminous efficiency function, V(λ), so that Y is directly proportional to luminance. Therefore, the corresponding X, Y, and Z primaries do not look like any real primaries you might see in a display. A non-spectroscopic form of color measurement uses a colorimeter that has three sensors, each with a sensitivity close to the color matching functions (or a linear transform of them). You might notice that when we do this calculation, the spectral reflectance of the object is lost, because the full spectrum is reduced to three numbers. This observation might be surprising to spectroscopists unfamiliar with color measurement.
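
The integration described above can be sketched in a few lines of Python. This is a minimal illustration, assuming all spectra are sampled at the same equally spaced wavelengths so that the integral becomes a sum. The function name `tristimulus` and the five-band numbers are hypothetical placeholders and are not CIE data; a real calculation would use the tabulated standard observer and illuminant.

```python
def tristimulus(illuminant, reflectance, xbar, ybar, zbar):
    """Approximate X, Y, Z by summation over equally spaced wavelengths.

    All five arguments are equal-length sequences sampled at the same
    wavelengths. Y is normalized so that a perfect diffuse reflector
    (reflectance factor 1 at every wavelength) gives Y = 100.
    """
    n = len(illuminant)
    if not all(len(seq) == n for seq in (reflectance, xbar, ybar, zbar)):
        raise ValueError("all spectra must share the same wavelength sampling")

    # Normalizing constant: k = 100 / sum(S * ybar)
    k = 100.0 / sum(s * y for s, y in zip(illuminant, ybar))

    def weighted_sum(cmf):
        # k * sum(S * R * cmf) over the sampled wavelengths
        return k * sum(s * r * c for s, r, c in zip(illuminant, reflectance, cmf))

    return weighted_sum(xbar), weighted_sum(ybar), weighted_sum(zbar)


# Toy five-band inputs, made up for illustration only (NOT CIE data).
S = [0.8, 1.0, 1.1, 1.0, 0.9]      # illuminant spectral power
XBAR = [0.3, 0.1, 0.4, 1.0, 0.3]   # placeholder color matching functions
YBAR = [0.0, 0.3, 1.0, 0.6, 0.1]
ZBAR = [1.5, 0.8, 0.1, 0.0, 0.0]

white = [1.0] * 5                  # perfect diffuse reflector
gray = [0.5] * 5                   # spectrally flat 50% reflector

X_w, Y_w, Z_w = tristimulus(S, white, XBAR, YBAR, ZBAR)
X_g, Y_g, Z_g = tristimulus(S, gray, XBAR, YBAR, ZBAR)
print(round(Y_w, 6), round(Y_g, 6))
```

Whatever the shape of the input spectrum, the function returns only three numbers, which is the loss of spectral information noted above.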

A consequence of this loss of spectral information is color metamerism. Illuminant metamerism is the change in the color of an object because of a change in the illumination. Two surfaces might match under one illuminant but appear different under another. The ramification of this for color inspection is that the desired illuminant must be specified. The quantification and evaluation of the effects of illuminant metamerism are part of the portfolio of the field of illumination engineering.
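
A toy numerical example may make illuminant metamerism more concrete. The sketch below uses an invented four-band model: the three sensor curves, both illuminants, and both surfaces are hypothetical values chosen so the arithmetic is exact, and none of them represent real observer or illuminant data.

```python
def band_sums(illuminant, reflectance, sensors):
    """Return one response per sensor: sum over bands of S * R * sensitivity."""
    return tuple(
        sum(s * r * c for s, r, c in zip(illuminant, reflectance, sensor))
        for sensor in sensors
    )


# Toy four-band model, invented for illustration (NOT real observer data).
SENSORS = (
    (0.0, 0.0, 1.0, 1.0),  # "long" channel
    (0.0, 1.0, 1.0, 0.0),  # "medium" channel
    (1.0, 1.0, 0.0, 0.0),  # "short" channel
)

ILLUM_1 = (1.0, 1.0, 1.0, 1.0)   # spectrally flat illuminant
ILLUM_2 = (2.0, 1.0, 1.0, 0.5)   # illuminant weighted toward the first band

SURFACE_A = (0.5, 0.5, 0.5, 0.5)
# SURFACE_B = SURFACE_A + 0.25 * (1, -1, 1, -1). The added spectrum gives
# zero response in all three channels under ILLUM_1, so the match holds there.
SURFACE_B = (0.75, 0.25, 0.75, 0.25)

match_1 = band_sums(ILLUM_1, SURFACE_A, SENSORS) == band_sums(ILLUM_1, SURFACE_B, SENSORS)
match_2 = band_sums(ILLUM_2, SURFACE_A, SENSORS) == band_sums(ILLUM_2, SURFACE_B, SENSORS)
print(match_1, match_2)  # prints: True False
```

The two surfaces have clearly different reflectance spectra, yet they produce identical three-channel responses under the first illuminant and different responses under the second, which is why the illuminant has to be stated alongside any color match.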

Observer metamerism is the mismatch between colors because of changes in the observer. Two people may have slightly different spectral sensitivities in their LWS cones, so a match for one of them does not hold for the other. In the extreme, this issue is manifested by color blindness: dichromacy, in which a cone type is missing, or anomalous trichromacy, in which the LWS or MWS cones have sensitivities shifted closer to each other. In the normal color population, there are differences between people because of slight wavelength shifts in the LWS and MWS cone types. Pre-retinal pigments can also contribute to observer metamerism. As one ages, changes in the densities of optical structures in the eye can cause spectrally selective changes as well, so that you can make different color matches than your younger self did.

A controversial topic in color vision is tetrachromacy. It has been observed that multiple MWS and LWS cone pigments with slightly different peaks can be expressed in the cones. A small percentage of biological females have four cone types where the additional cone has a shifted MWS or LWS cone pigment. Although this trait is rare in biological females, it influences color matching behavior and discrimination (they’re better than us!). The consensus understanding is that the neural pathways to the brain are still trichromatic, meaning that these individuals do not have any additional dimensions of color sensations. An analogous effect occurs under mesopic conditions, which are illumination levels where both the rod photoreceptors and the cones can respond. There, the color-matching behavior is different from that under photopic illumination, where only the cones contribute because the rod response is saturated. We are all monochromats under scotopic (low-light) illumination where only the rods are functional.

This information leads to the questions that people often ask about color: Do colors look alike to everyone? Is it possible that my “red” is perceived as “green” to you? Because the physiology, color-matching behavior, and color-opponency of color-normal humans are, to within close limits, the same, I tend to believe, based on transitive relationships, that colors “look alike” to normal trichromats.

However, it is important to note that it doesn’t mean that cultural and personal aspects of the “meaning” of color do not affect how we see color in a more expansive definition of what perception is (as opposed to sensation). The association of pink for girls and blue for boys is a cultural imposition, which actually flipped the convention from the early 1900s. It is very difficult to control for the cultural influence of color meaning when trying to study whether different colors have emotional or physiological effects. I refer you to the literature on Baker-Miller Pink, or the question of why we don’t have yellow fire engines if you want to enter the morass I try to avoid.

Why do we measure color? Are there particular benefits from using one or another of the structured color systems? How are color measurements calibrated?

Why do we measure color? Before I attempt to answer this, we should address what color is used for. I can list three purposes of color: identification; quality assessment; and aesthetics. Color is perhaps the first tool we use to help us identify objects in the world. Think of a bookshelf in your home. If you want to find a particular book, you might typically think of the color of its cover and narrow your search by focusing only on the books with that color. Likewise, when you try to find your car in a crowded parking lot, color is what you use to search efficiently. Camouflage is a nice example of how color helps us in identification.

There is a theory that color vision may have evolved to aid in the identification of ripe fruit in a background of leaves. Although it may be problematic to apply a teleologic explanation to the development of a biological characteristic, it is hardly controversial to say that color vision aids us in functions that help us maintain our species. Although we can see the shape of a fruit without color, we use color to assess its quality. Likewise, we use color to assess other aspects of fitness and health.


Therefore, it is conceivable that aesthetics plays the biggest role in our use of color. Because of this, color measurement becomes important so that we can ensure the faithful and consistent reproduction of objects we use in the world. Unlike other aspects of spectroscopy, color measurement is rarely used to assess the physical characteristics of objects. Surface reflectance in the visible spectrum does not give us much information about material properties. We leave this to other regions of the spectrum or high energy applications such as Raman or laser-induced breakdown spectroscopy (LIBS).

Therefore, colorimetry is primarily used to ensure visual consistency of the color of an object. For example, if your company produces widgets, the process in which they are colored may be susceptible to changes over time because of changes in the formulation or other process changes. You may want to measure a few of your widgets coming from a new batch to make sure they are the right color. If your widgets are made up of multiple parts joined together, you may want to bin your parts based on their color so that you join matched parts together. In some industries, there is a need for 100% inspection of color for quality control (QC). If you buy paint, you want the color to match from one can to the next. If you buy floor tiles, you want all the tiles to match.

However, even the measurement of surface reflectance is complicated. To ensure that our color measurements are not misinterpreted, the method by which they are made must also be specified.

To fully describe the reflectance of an object, one would need to measure the bidirectional reflectance distribution function (BRDF). The BRDF describes the ratio of reflected light to incident light at every angle of incidence and reflection. A perfectly diffuse surface reflects light equally in all directions. A perfect mirror reflects light only in the specular direction. In between these extremes are materials with various degrees of gloss. There may also be a change in reflectance that is wavelength-dependent as seen in iridescent and pearlescent materials.
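
In the usual radiometric notation, which is assumed here and not taken from the discussion above, the BRDF is written as the ratio of the reflected radiance $L_r$ in one direction to the irradiance $E_i$ arriving from another, with $\theta$ and $\phi$ the polar and azimuthal angles:

```latex
f_r(\theta_i, \phi_i;\, \theta_r, \phi_r)
  = \frac{\mathrm{d}L_r(\theta_r, \phi_r)}{\mathrm{d}E_i(\theta_i, \phi_i)}
  \quad [\mathrm{sr}^{-1}],
\qquad
f_r^{\text{diffuse}} = \frac{\rho}{\pi}
```

The second expression is the special case of a perfectly diffuse (Lambertian) surface with total reflectance $\rho$, for which the BRDF is the same constant at every pair of angles.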

Generally, it is too expensive and time-consuming to make goniometric measurements of a surface. Instead, there are standard geometries that are typically used with spectrophotometers for color measurement. For example, we use integrating spheres to gather light at all angles to get an estimate of the overall reflectance factor. We describe the illumination as diffuse if it enters the side of the sphere and then specify the angle from the normal at which the spectrometer measures the sample, as shown in Figure 1. We define this geometry as d/8°. Because of Helmholtz reciprocity, this geometry is equivalent to 8°/d where we switch the positions of the detector and the illumination. In this sphere configuration, all the light reflected from the surface is measured, including the specular reflectance that is reflected off the surface at -8°. We call this configuration specular component included (SCI). We can put a hole (a light trap) in the sphere at -8° so that the specular component of the reflectance leaves the sphere, which we call the specular component excluded (SCE).

Another common configuration is to illuminate the surface of the sample at 45° and measure at 0°. Typically, a ring of lights is positioned at the 45° angle for this 45°/0° configuration (in this scenario, there is no specular component contributing to the measurement). Custom configurations can be used, but the measurement geometry must be specified in the reporting of the measurement. Different instruments have been developed for multi-angle measurement. For example, multi-angle instruments are used in the automotive paint industry because of the complex nature of the coatings used on cars. In addition to the illumination and measurement geometry, the size of the measured area must also be considered. For example, if you are measuring a surface with texture or a printed surface with a halftone pattern, you must ensure that you are measuring a large enough area to integrate over the local nonuniformities.

Let’s get into the weeds a bit to describe color measurement. What we are measuring with these different geometries is the spectral reflectance factor. This is the fraction of light reflected off the sample relative to a perfectly diffuse reflector. We calibrate the spectrophotometer with a calibrated plaque with a known reflectance factor. In its simplest form, the reflectance factor is calculated using this equation:

$$R_i = \frac{D_i}{D_w}\, R_w$$

where $R_i$ is the calculated reflectance factor of sample $i$; $D_i$ is the spectrometer response from the sample; $D_w$ is the spectrometer response from the calibration plaque; and $R_w$ is the reflectance factor of the calibration plaque. In practice, correction factors may need to be applied.
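
As a minimal sketch, the simple ratio form implied by these definitions, R_i = (D_i / D_w) × R_w, can be applied wavelength by wavelength. The function name `reflectance_factor` and the three-band readings below are hypothetical, and the sketch deliberately leaves out the correction factors mentioned above.

```python
def reflectance_factor(sample_response, white_response, white_reflectance):
    """Per-wavelength reflectance factor: R_i = (D_i / D_w) * R_w.

    sample_response   -- spectrometer response from the sample (D_i)
    white_response    -- spectrometer response from the calibration plaque (D_w)
    white_reflectance -- known reflectance factor of the plaque (R_w)
    """
    if not (len(sample_response) == len(white_response) == len(white_reflectance)):
        raise ValueError("all inputs must share the same wavelength sampling")
    return [
        d_i / d_w * r_w
        for d_i, d_w, r_w in zip(sample_response, white_response, white_reflectance)
    ]


# Hypothetical detector readings at three wavelengths (illustrative only).
D_sample = [200.0, 450.0, 800.0]
D_white = [1000.0, 1000.0, 1000.0]
R_white = [0.98, 0.99, 0.99]

R_sample = reflectance_factor(D_sample, D_white, R_white)
print([round(r, 4) for r in R_sample])
```

Because the plaque is measured on the same instrument and geometry as the sample, the instrument's own spectral response cancels in the ratio.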

Once we have measured the reflectance, we can use CIE colorimetry to calculate the [X, Y, Z] tristimulus values for any illuminant, which can then be mapped to RGB coordinates. The CIE has tabulated a variety of standard illuminants, and the American Society for Testing and Materials (ASTM) has standards for calculating tristimulus values as well.

Now that we have calculated the tristimulus values of a sample, we may want to find out how different they are from the color we want to achieve. As mentioned above, this is where CIELAB may be used. The CIELAB calculation uses a white point to normalize its coordinate system. The white point is the set of tristimulus values of the perfect diffuse reflector (a reflectance value of one at all wavelengths) under the chosen illuminant. Now we can calculate the color coordinates in CIELAB space of the sample and the standard and use a color difference equation to determine if the sample is within the desired tolerance.
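
The workflow in this paragraph can be sketched as follows. The example assumes the commonly quoted D65 white point for the 2° observer and the simple Euclidean ΔE*ab difference; the tolerance of 1.0, the function names, and the sample and standard tristimulus values are all hypothetical choices for illustration, and the more refined color difference equations are not shown.

```python
import math

# Assumed white point: commonly quoted D65 values for the 2 degree observer.
D65_WHITE = (95.047, 100.0, 108.883)


def _f(t):
    """CIELAB compression function: cube root with a linear toe near zero."""
    delta = 6.0 / 29.0
    if t > delta ** 3:
        return t ** (1.0 / 3.0)
    return t / (3.0 * delta ** 2) + 4.0 / 29.0


def xyz_to_lab(xyz, white=D65_WHITE):
    """Convert tristimulus values to (L*, a*, b*) relative to a white point."""
    fx, fy, fz = (_f(v / w) for v, w in zip(xyz, white))
    return (116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz))


def delta_e_ab(lab1, lab2):
    """Euclidean color difference between two CIELAB coordinates."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab1, lab2)))


def within_tolerance(sample_xyz, standard_xyz, tolerance=1.0, white=D65_WHITE):
    """True if the sample is within the (hypothetical) tolerance of the standard."""
    difference = delta_e_ab(xyz_to_lab(sample_xyz, white), xyz_to_lab(standard_xyz, white))
    return difference <= tolerance


# Hypothetical tristimulus values for a standard and a production sample.
standard = (41.0, 35.0, 20.0)
sample = (41.5, 35.2, 20.1)

print(xyz_to_lab(standard))
print(delta_e_ab(xyz_to_lab(sample), xyz_to_lab(standard)))
print(within_tolerance(sample, standard))
```

Passing the white point as a parameter keeps the illuminant choice explicit, so the same functions can be reused when a different illuminant is specified for inspection.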

Setting these tolerance limits is a science in itself. There are different techniques that can be used, including performing visual experiments to find tolerance limits and adjusting coefficients in color difference equations. The goal is to choose the proper geometry that produces measurements that correlate with the perceptual assessments for the samples you are measuring.

Are there any other interesting facts about color and color science you would like to share?

I focused mainly on the practical aspect of simple color measurement and inspection in the last answer. But this is only a small part of color science and the study of color vision. There are too many interesting topics to choose from. It was fun when “the dress” picture (1) went viral in 2015, and I still get asked about this. Even regarding basic color measurement, we encounter interesting phenomena such as the changes in surface color that can be caused by changes in temperature and humidity. As a multidisciplinary field, there are surprises around every corner.


Thanks to Ethan for an interesting and entertaining introduction to color science and perception. I certainly learned a lot of new vocabulary and gained some new understanding through working with him on this article, which is entirely the point of the “Lasers and Optics” column in Spectroscopy magazine. I will note here that this article concludes nearly 7 years of my contributions to this column—allowing you, dear readers, some new and fresh perspectives. It has been a pleasure to write for the wonderful staff at Spectroscopy, and you will no doubt hear more from me occasionally in this magazine and on other channels. I hope that these columns have been useful to you, and I wish each of you continued fun in your explorations of the interactions of light and matter!


(1) Wikipedia, The dress (accessed September 2022).